Foundations for a theory of mind for a humanoid robot
Abstract
Human social dynamics rely upon the ability to correctly attribute beliefs, goals, and percepts to other people. The set of abilities that allow an individual to infer these hidden mental states based on observed actions and behavior has been called a “theory of mind” (Premack & Woodruff, 1978). Existing models of theory of mind have sought to identify a developmental progression of social skills that serve as the basis for more complex cognitive abilities. These skills include detecting eye contact, identifying self-propelled stimuli, and attributing intent to moving objects.
If we are to build machines that interact naturally with people, our machines must both interpret the behavior of others according to these social rules and display the social cues that will allow people to naturally interpret the machine's behavior.
Drawing from the models of Baron-Cohen (1995) and Leslie (1994), a novel architecture called embodied theory of mind was developed to link high-level cognitive skills to the low-level perceptual abilities of a humanoid robot. The implemented system determines visual saliency based on inherent object attributes, high-level task constraints, and the attentional states of others. Objects of interest are tracked in real-time to produce motion trajectories which are analyzed by a set of naive physical laws designed to discriminate animate from inanimate movement. Animate objects can be the source of attentional states (detected by finding faces and head orientation) as well as intentional states (determined by motion trajectories between objects). Individual components are evaluated by comparisons to human performance on similar tasks, and the complete system is evaluated in the context of a basic social learning mechanism that allows the robot to mimic observed movements.
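The animate/inanimate discrimination described above can be illustrated with a minimal sketch. This is not the thesis implementation; it assumes a simple "self-propelled motion" heuristic (an object that accelerates or reverses course without apparent external cause is animate), and all function names and thresholds are hypothetical.

```python
# Hypothetical sketch of one naive physical law: a trajectory is labeled
# animate if it exhibits self-propelled acceleration. Names and the
# threshold value are illustrative assumptions, not the thesis's code.

def accelerations(trajectory, dt=1.0):
    """Finite-difference accelerations from a list of (x, y) positions."""
    velocities = [((x2 - x1) / dt, (y2 - y1) / dt)
                  for (x1, y1), (x2, y2) in zip(trajectory, trajectory[1:])]
    return [((vx2 - vx1) / dt, (vy2 - vy1) / dt)
            for (vx1, vy1), (vx2, vy2) in zip(velocities, velocities[1:])]

def is_animate(trajectory, accel_threshold=0.5, dt=1.0):
    """Classify a trajectory as animate if any acceleration exceeds the
    threshold, i.e., the object changes speed or direction on its own."""
    return any((ax * ax + ay * ay) ** 0.5 > accel_threshold
               for ax, ay in accelerations(trajectory, dt))

# Constant-velocity motion (e.g., a rolling ball): no acceleration, inanimate.
rolling = [(float(t), 0.0) for t in range(10)]

# Motion that abruptly reverses mid-course: self-propelled, animate.
darting = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
```

A real system would also account for external causes (gravity, collisions, contact with other tracked objects) before attributing animacy, as the naive physical laws in the thesis do.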