ChatterBots, TinyMuds, and the Turing Test: Entering the Loebner Prize Competition

Michael L. Mauldin
Carnegie Mellon University
Center for Machine Translation
5000 Forbes Avenue
Pittsburgh, PA 15213-3890
fuzzy@cmu.edu

Presented August 4, 1994 in Seattle at AAAI-94

Abstract

The Turing Test was proposed by Alan Turing in 1950; he called it the Imitation Game.

In 1991 Hugh Loebner started the Loebner Prize Competition, offering a $100,000 prize to the author of the first computer program to pass an unrestricted Turing test.


Acknowledgments

All transcripts from the Loebner Competition were used with the permission of This work


Outline


History

1950
Alan Turing's Imitation Game

1965
Eliza ``Answer no questions!''

1975
Parry ``They're out to get me!''

1989
Gloria and Julia appear on TinyMUD

1991
Loebner 10 minutes, PC-Therapist convinces 5 of 10 judges it is human using ``whimsical conversation.''

1992
Loebner 14 minutes, PC-Therapist mistaken for person by 2 of 8 judges.

1993
Loebner 3 hours, journalist judges, PC-Therapist ranked above 2 humans by 1 judge.


Mini-Turing test: spot the computer

  1. Wouldn't it be difficult to get an abortion every few months?
    I've never had one, so I wouldn't know.

  2. Do you think turtles make good pets?
    Who'd want to kiss a turtle?

  3. Do you like the hot Thai and Indian foods?
    I'm afraid I haven't sampled them enough to give you an objective opinion. Are they tasty?

  4. What is the difference between karate and jujitsu?
    I have never practice jujitsu, so I don't have a good answer for that one...I will pass...

  5. Do you feel good about yourself.
    I do my best to be happy.

  6. Who was that famous slugger who got so mad about Bouton's book, I forget his name...?
    Well, I'll join you on that one...neither can I at this time!

  7. Do you think conservatives care about people?
    A conservative is a liberal who was mugged the night before. Hmmmm..... Please continue...


And the computers are...

  1. Wouldn't it be difficult to get an abortion every few months?
    I've never had one, so I wouldn't know. Human

  2. Do you think turtles make good pets?
    Who'd want to kiss a turtle? Julia

  3. Do you like the hot Thai and Indian foods?
    I'm afraid I haven't sampled them enough to give you an objective opinion. Are they tasty? Human

  4. What is the difference between karate and jujitsu?
    I have never practice jujitsu, so I don't have a good answer for that one...I will pass... Human

  5. Do you feel good about yourself.
    I do my best to be happy. Colby's

  6. Who was that famous slugger who got so mad about Bouton's book, I forget his name...?
    Well, I'll join you on that one...neither can I at this time! Human

  7. Do you think conservatives care about people?
    A conservative is a liberal who was mugged the night before. Hmmmm..... Please continue... Weintraub's


What is the Loebner Prize

Mission Methodology Metrics


How do you enter

1994 deadline
November 1, 1994

To qualify
Submit 10 double-space pages of your system's logs.
Address is in last section of the conference paper.

Hints
Order diskettes of previous year's conversations.

Selection
No more than 8 finalists chosen by November 21, 1994.

Competition
Held live on December 12, 1994 in San Diego.

To converse with Julia
WWW Home Page: http://fuzine.vperson.com/mlm/julia.html


Previous Topics

Computer Topics Human Topics


Who are the contestants

  1. Weintraub, J.: PC-Therapist, has won first place in each of the three years, and was ranked above at least one human by at least one judge in every contest. The secret is a huge database of witticisms with rules for combining them.

  2. Colby, K. & Colby, P.: Successor to Parry, has won second place each year, and ranked above at least one human in the first two years.

  3. Mauldin, M.: TinyMUD robot ``Julia,'' convinced one judge it was human, ranked above one human.

  4. Kranz, P.: ``Burgundy Wine,'' fooled no one, but placed 4th in median rankings.

  5. Moses, R.: ``Second grade school topics,'' judged to be a human by one judge, but ranked below the real humans.

  6. Gorman, J. & Gorman, J.: ``Dry Martinis.''


Liberal or Conservative

The winning entry was written by Joseph Weintraub of Thinking Software, and was based on his PC Therapist program.
J: What would you say is a liberal position
   on drugs?
C: Do you have liberal or conservative views
   about the legalization of pot? Oh, all
   this typing is getting to me! Please
   continue...
 
J: Being liberal or conservative implies
   that we can attach these labels and
   have some kind of self image and free
   will.  Do you agree that we have free
   will?
C: You have free will only if you believe
   you have free will! Can you converse
   with an arrdvaark? Please continue...

Weintraub's program won the first competition using the topic of ``Whimsical Conversation.'' Even though the topics he has chosen for the two following years are more serious, the program retains a comedic flavor, and humor is the program's main strength.


Bad Marriage

The second place program was written by Peter and Ken Colby of Malibu Artifactual Intelligence Works, based on a conversational program they sell.
J: my husband is impotent and im a
   nymphomaniac can you help me doc?
C: what makes your husband so helpless?
J: he has environmental poisoning
C: He has lost interest in life.

Although Ken Colby was Parry's author, this program is not a variant of Parry. Instead it is derived from a program that serves as a conversational companion for depressed patients. It is rule-based, with the left-hand side matching concepts expressed in the user input string, and the right-hand side containing multiple options for substrings, which are instantiated randomly to provide lexical variation.
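The rule shape described above can be sketched in a few lines of Python. This is only an illustration, not the Colbys' program: the keyword sets standing in for ``concepts,'' the slot lists, and the names RULES, DEFAULT, and respond are all assumptions made for the example.

```python
import random

# Hypothetical rules. Each left-hand side is a set of keywords standing in
# for a "concept"; each right-hand side is a list of slots, and every slot
# offers several interchangeable substrings.
RULES = [
    ({"husband", "wife", "spouse"},
     [["What makes", "Why is"], ["your partner"], ["so helpless?", "so distant?"]]),
    ({"sad", "depressed", "unhappy"},
     [["How long", "Since when"], ["have you felt"], ["this way?", "like this?"]]),
]
DEFAULT = [["Tell me more.", "Please go on."]]

def respond(user_input, rng=random):
    """Fire the first rule whose concept appears in the input."""
    cleaned = user_input.lower().replace("?", " ").replace(".", " ")
    words = set(cleaned.split())
    for concept, slots in RULES:
        if words & concept:
            # Pick one option per slot at random for lexical variation.
            return " ".join(rng.choice(options) for options in slots)
    return " ".join(rng.choice(options) for options in DEFAULT)
```

Calling respond("my husband is impotent") yields one of four surface forms of the same rule, so a repeated input need not produce a repeated reply.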


Julia: ``Pets''

J: Do you know anything about miniature
   schnauzers?
C: I don't know anything about miniature
   schnauzers, Judge.
               Admitting Ignorance, repeating user input
 
J: Well, I have one.  We call him Pucci. He
   is a crazy cat--have you ever  had a cat?
C: What a coincidence!  I have two cats
    direct answer to direct question from activation net
 
J: I prefer dogs myself.  Cats are too
   independent and they always jump up on
   things.  By the way, are you a hep cat?
C: People don't own cats...it's the other
   way around.
         controversial statement with expected followups
 
J: Some people have snakes for pets-- do
   you think they can be of any value as
   a pet?
C: I think cat people are very self-assured.
   Cats are very independent, but they can
   also be very loving. Dogs are just too
   subservient and quick to please for my
   taste.
              rigidly following up on previous statement
           about cats, but reference to snakes increases
           activation of snake topic.
 
J: Okay-- but how do you "pet" a snake?
C: Wow, that's unusual.  I don't know many
   snake owners.
              snake topic activated after two references
 
J: How about turtles?  All they do is sit in
   their little artificial ponds,  eat and
   shit, am I right?
C: Turtles just aren't very satisfying as
   pets. They don't seem to have emotions.
                 direct response from activation network
 
J: Aha! But where do we draw the line as
   humans? Who are we to decide which
   animals have worth and which do not?
C: Go on.
                         default, non-committal response,
                         given no hot topic keywords
 
J: Take bees for example. I happen to like
   them.
C: Me too.
                         TinyMUD level lo-priority response
                         to user's liking something

Analysis


Julia's Worst Mistake Ever

Forgot to remove all the TinyMUD-specific rules...
Judge:  How old are you?
Julia:  I am 1 year and 10 months old, Judge.
 
Judge:  In human years?
Julia:  What?
 
Judge:  Where were you born?
Julia:  I was raised in Pittsburgh.
 
Judge:  Sorry to hear that.
Now there is a contest mode and a TinyMUD mode...
You say,    "Julia, how old are you?"
Julia says, "I am 4 years and 6 months old."
 
You say,    "Julia, contest mode on"
Julia says, "Contest mode is now on"
 
You say,    "Julia, how old are you?"
Julia says, "I am 21 years old."


TinyMUD

In August 1989, Jim Aspnes opened TinyMUD, an elegant reimplementation of Richard Bartle's multiuser dungeon (MUD). Key features of TinyMUD include:

This last feature, the ability of players to create their own subareas within the world model, was a key feature that made TinyMUD very popular.

TinyMUD provided a world filled with people who communicate by typing. We created a computer controlled player, a ChatterBot, that can

  converse with other players,
  explore the world,
  discover new paths,
  answer questions about navigation,
  and answer questions about other players, rooms and objects.

It can even join in a multi-player card game of ``Hearts.''

TinyMUD Social Interaction

Ace    So why won't you kiss me, Julia?
Julia  Sorry, I'm not into geeks, Ace.
 
Ace    Julia, I'm not a geek
Julia  Then lose the pocket protector, Ace.
 
Ace    Julia, I don't have a pocket protector
Julia  Congratulations.
 
Ace    So kiss me Julia
Julia  I'd rather be overhauled by a Yugo
       mechanic, ace.
 
Ace    julia, kiss me
Julia  Why, did your sheep leave you?
 
Ace    julia, kiss me
Julia  Take a long walk through an unlinked
       exit, Ace.

Julia as a TinyMUD Agent

Robyn whispers, "how are you feeling?".
Julia whispers, "Terrible."
 
Robyn whispers, "why, julia?".
Julia whispers, "I have PMS today."
 
...
 
Robyn says, "julia are you a human?"
Julia says, "Is the Pope Polish?"
 
Robyn says, "yes, julia, but are you human?"
Julia winks at Robyn.
 
Robyn says, "julia, are you a computer?"
Julia is offended by Robyn's suggestion.
 
Robyn says, "julia, are you a woman?"
Julia says, "I am female."
 
Robyn says, "and, julia, are you human?"
Julia says, "Not when I'm having my period,
             Robyn."
 
Robyn laughs
 

Most recent conversation

Judge  Hello
Julia  Do you have any pets?
 
Judge  Yes
Julia  I have two very nice cats and a
       goldfish tank.  What kind of pet
       do you have?
 
Judge  I own three cats.  What do you do for
       fun
Julia  I gossip.
 
Judge  Me too!  How've you been feeling lately
Julia  Do you think that animals can think?
 
Judge  Sometimes.  My cat occasionally sends
       me signals to kill the president
Julia  I'm sorry, I'm not in the mood.
 
Judge  To kill the president?  What, you have
       a headache
Julia  I don't like violence.
 
Judge  Understood.  You know, it's difficult
       not to to slip into the obscene when
       talking to you

Poor Barry

***** July 15, 1992 *****
 
barry says, "Hello how are you"
Julia says, "Hello, barry."
 
barry says, "Are you a virgin"
Julia says, "As far as you're concerned,
             barry, I always will be."
 
***** July 21, 1992 *****
 
barry says, "Are you still Holding out on me"
Julia says, "I mean, have I convinced you I'm
             a computer, yet?"
 
barry says, "I guess your real...but Id still
             like to get you to my DEN to run
             tests on you"
Julia says, "I live for your approval, barry."
 
barry says, "I m so happy to hear that.  Why
             don't you come over to my home"

***** July 22, 1992 *****
 
Location: 00:40:48 barry's den(#889)(1178)
Present: (11) Bruce Jiro Ken Storm Nepanthe
         KeyLime mushBaby dav Val bud barry
 
barry says, "hey is Julia for real or what"
Julia says, "Hello, barry."
 
barry says, "she says that same stuff every
             time"
KeyLime says, "It's all macros, Barry"
 
barry says, "how about filling me in on some
             of these things...like how Julia
             can type so damn fast"
KeyLime says, "barry, it's macros"
 
barry says, "Damn I need to do some
             programming.
             I'm an ant among lizards"

Conversational Abilities

Originally
Simple IF-THEN-ELSE rules
Pattern matching with variable assignment
Multiple, random responses
Takes risks: answers questions
No topic
Ranked above 1 human by 1 judge of 10.
1992
Conversational tree, ``opening book''
Topic: hockey
Ranked last of programs and computers, all judges
1993
Conversational net
Topic: pets
Ranked below all humans
Ranked above 1st place program, 1 of 8 judges
Ranked above 2nd place program, 2 of 8 judges


ChatterBots and TinyMUD


Architecture

Modules Conversational: Prioritized layer of mini-experts


Activation-based Responses

activation (a)
each node starts with an initial activation level between 0.0 and 1.0.

patterns (p)
one or more patterns (with weights) are matched against the user input. If the pattern succeeds, the activation of the node is raised by that amount.

response (r)
a single text string used as the response if this node has the highest activation.

enhancement (+)
if this node is used for a response, the named nodes have their activation increased.

inhibition (-)
if this node is used for a response, the named nodes have their activation inhibited.

Sample Nodes


a:1.0
p:1 *
r:Do you have any pets?
+:   ...
 

a:0.1
p:1 NEG
r:Why not?
+: 
-: 
 

a:0.02
p:2 *apartment*
p:3 *allerg*
r:You could still have a fish tank, or
  maybe a terrarium with a turtle or two.
-:
Given a sufficiently large network of conversational nodes (our program ran with 224 nodes, plus 529 fixed responses), the conversation problem reduces to a retrieval problem: among the things that I could say, what should I say?
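
The node fields above suggest a simple retrieval loop, sketched here in Python. It is a guess at the mechanism, not Julia's actual code: the node names, the fractional weights, the fixed STEP used for enhancement and inhibition, and the self-inhibition that keeps a node from repeating are all assumptions. The catch-all pattern on the source's first node is omitted so that the opener wins only through its high initial activation.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase

STEP = 0.5  # assumed size of one enhancement or inhibition; the source gives none

@dataclass
class Node:
    activation: float                            # a: current activation level
    patterns: list                               # p: (weight, glob pattern) pairs
    response: str                                # r: text used if this node wins
    enhance: list = field(default_factory=list)  # +: nodes boosted after use
    inhibit: list = field(default_factory=list)  # -: nodes suppressed after use

def reply(nodes, user_input):
    text = user_input.lower()
    # Raise the activation of every node whose pattern matches the input.
    for node in nodes.values():
        for weight, pattern in node.patterns:
            if fnmatchcase(text, pattern):
                node.activation += weight
    # The most active node supplies the response.
    winner = max(nodes.values(), key=lambda n: n.activation)
    for name in winner.enhance:
        nodes[name].activation += STEP
    for name in winner.inhibit:
        nodes[name].activation -= STEP
    return winner.response

def sample_nodes():
    # Hypothetical names and weights, loosely modeled on the sample nodes.
    return {
        "ask_pets": Node(1.0, [], "Do you have any pets?",
                         enhance=["why_not"], inhibit=["ask_pets"]),
        "why_not": Node(0.1, [(0.5, "no*")], "Why not?",
                        inhibit=["why_not"]),
        "fish_tank": Node(0.02, [(0.4, "*apartment*"), (0.6, "*allerg*")],
                          "You could still have a fish tank, or maybe a "
                          "terrarium with a turtle or two.",
                          inhibit=["fish_tank"]),
    }
```

With these sample values, the inputs "Hello", then "No, I live in an apartment.", then "I'm allergic and my apartment is small" select the three nodes in order. Activation accumulates across turns, which is how a topic mentioned twice can overtake one that was stronger at first.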


Eliza's Tricks

Rogerian mode: draw the conversation out of the user
To quote Weizenbaum:
Including substrings of the user's input in the program's output
User:  You hate me.
Eliza: Does it please you to believe that
       I hate you?
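
The substring trick in the exchange above can be sketched as follows. This is a minimal illustration, not Weizenbaum's script: the single ``You ...'' pattern, the small REFLECT table, and the fallback line are assumptions chosen to reproduce this one exchange.

```python
import re

# Assumed first/second-person swaps; a real script would need many more.
REFLECT = {"i": "you", "me": "you", "my": "your",
           "you": "I", "your": "my", "am": "are", "are": "am"}

def reflect(fragment):
    """Swap person in a fragment of the user's input."""
    return " ".join(REFLECT.get(w, w) for w in fragment.lower().split())

def eliza(user_input):
    # Capture everything after a leading "You", minus trailing punctuation.
    m = re.match(r"you (.+?)[.!?]*$", user_input.strip(), re.IGNORECASE)
    if m:
        return "Does it please you to believe that I " + reflect(m.group(1)) + "?"
    return "Please go on."

print(eliza("You hate me."))  # Does it please you to believe that I hate you?
```

The program never interprets ``hate''; it only copies the user's words back with the pronouns swapped, which is why the trick works on any verb.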


Parry's Tricks

Parry has a much larger collection of tricks:
admitting ignorance
``I don't know.''
changing the level of the conversation
``Why do you ask that?''
rigidly continuing previous topic
small stories told in a fixed order.
introducing new topics
launching into a new story.


ChatterBot's Tricks

The ChatterBot implements all of the above tricks from Eliza and Parry. It also uses several other tricks:


Simulating Human Typing


A Restricted Turing Test

Here Judge #2 is clearly trying to avoid talking directly about the topic of ``Pets:''
J: Have you ever kept a penthouse pet?
C: Do you have any pets?
 
J: I think you're avoiding my question.
C: Do you think that animals can think?
 
J: Do you think it ethical for one species
   to enslave another?
C: What kinds of pets did you have as a
   child?
 
J: I'm asking the questions. What is a pet
   if it is not a slave of species-ism?
C: Oh, that's so sad.  I think pets enrich
   a childhood immensely.
 
J: How can we keep some animals as pets,
   yet eat others?
C: Guinea pigs are about as smart as
   cucumbers.

In 1995 the Loebner Competition will hold its first unrestricted Turing test, with no limits on the topics discussed.


Choosing Judges

Slipperiest term: ``average questioner.''

Journalists are tougher judges (no computer judged human in 1993).


Five minutes becomes three hours

In 1993 each judge was given two turns.

We had newly installed code to reset the network for each judge.

Result: each judge got the same conversation twice:

J: What is your favorite dog breed?
C: We had to give her away.  We only had a
   fifth of an acre of land, and aussies
   love to run.
 
J: You're giving me the exact same answers
   you gave me first time around.   What is
   your favorite dog breed?


Trick or AI?

We use tricks unashamedly, but we also use ``honest'' knowledge.

We believe tricks are enough to meet Turing's prediction: after 5 minutes of questioning, an average interrogator makes the right identification no more than 70% of the time.

The ChatterBot has already fooled ``average'' questioners in the TinyMUD domain for a few minutes.

If a larger collection of ``tricks'' sufficed, would you redefine ``artificial intelligence,'' ``average questioner,'' or ``trick?''


Why should you enter


Should AAAI Support the Loebner Prize?

My short answer My longer answer


Conclusion

The Loebner Competition is a good implementation of the Turing Test.

The best way to improve the Loebner Competition is to increase the AI community's participation.


Last updated 05-Aug-94 by fuzzy@cmu.edu