Putting a face on A.I.

As the saying goes, a picture is worth a thousand words; a moment with someone, however, can be priceless. When people interact with one another, we tune in not only to the sound of their words but also to the motions of their bodies and faces.

These subtle cues provide context and add meaning to a conversation. The study of how human expressions and reactions are interpreted in conversation falls under the umbrella of emotional intelligence.

Emotions and expressions differentiate us as individuals and allow a deeper connection with one another.

Universal signs and expressions evoke consistent reactions: a smile generally means happiness, and a frown signals discontent, virtually anywhere in the world.

These universal expressions, which help people communicate across language barriers, are now being used to create a richer experience with technology. With the rise of machine learning techniques and the desire for a more intuitive, personal connection with bots, IBM Watson and Soul Machines have been working on ways to make conversations with machine interfaces more approachable.

Some expressions are universal. What emotions do you see?

That is where emotional intelligence, understanding a customer's behavior in the moment, makes a difference. Many bots are good at understanding what someone is saying; appreciating how they say it will become just as important.

Understanding emotional states can often make the difference between a successful conversation and lost business. With machines, it's even more important, as people tend to be more open and forthright. If successfully established, this connection between people and machines can help an exchange with a conversational system progress from a one-time transaction to a relationship built on empathetic interactions.

How can a company increase the lifetime value of a customer and the quality of the relationship? By using the situational context signals that customers provide, such as mood and tone. Until now, companies have leveraged demographics, log notes, and perhaps even social media, but those sources may not predict how a person will react in the moment of the dialog.

Human faces are where trusted relationships are formed, and people are naturally drawn to them. It is no accident that so many social robots have human-like facial features, C-3PO among them. The human face has over 40 individual muscles that help produce over 20 facial expressions.

These muscles help form the context of a conversation and have helped specialists find deeper meaning in dialog. It does not take a Ph.D. to understand that a wink after a sentence may be a signal not to take the sentence at face value. It does, however, take a handful of researchers to help machines understand these subtleties.

Some of those scientists work at Soul Machines and IBM. They are pushing emotional intelligence research to the next level by proactively understanding someone's emotional state in real time from both visual and audio signals.

This research is embodied in a lifelike avatar created by Soul Machines. Their computing engine uses neural networks to mimic the human nervous system, producing an experience that gets further past the uncanny valley than any before it. When this experience is coupled with Watson, the ability to jump from transaction to interaction begins to unfold; the two technologies working in tandem result in a seamless, interactive discussion.

Take a look:

When the user starts the system, a camera begins to read and interpret their facial expressions. From there, the system processes several signals simultaneously. As the end user interacts, a microphone picks up their inquiry and Watson Speech-to-Text transcribes the audio into unstructured text.

From there, Watson Conversation interprets the intent of the inquiry and provides an appropriate response. In parallel, Soul Machines' engine discerns the emotion from the face and the tone in the voice to better understand how someone is interacting with the system.

As all of this happens within seconds, the user is returned both a response and an expression suited to that specific user's interaction. The overall experience can help the end user feel special and catered to, and that, for many, is priceless.
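The turn-by-turn loop described above can be sketched in code. This is a minimal illustration only: the real Watson and Soul Machines services are cloud APIs, and every function, rule, and data structure below is a hypothetical stand-in, not their actual interfaces.

```python
def transcribe(audio_frames):
    """Hypothetical stand-in for speech-to-text: audio -> unstructured text."""
    # A real system would stream microphone audio to a recognition service.
    return " ".join(audio_frames)

def interpret_intent(text):
    """Hypothetical stand-in for intent classification: text -> (intent, reply)."""
    if "balance" in text:
        return "check_balance", "Your balance is available in the app."
    return "general", "Could you tell me more about that?"

def read_emotion(facial_landmarks, voice_tone):
    """Hypothetical stand-in for the emotion engine: fuse visual and audio cues."""
    # Toy fusion rule: a smile or a bright tone reads as positive.
    if facial_landmarks.get("smile", 0) > 0.5 or voice_tone == "bright":
        return "positive"
    if facial_landmarks.get("frown", 0) > 0.5 or voice_tone == "tense":
        return "negative"
    return "neutral"

def respond(audio_frames, facial_landmarks, voice_tone):
    """One conversational turn: transcribe, interpret, and pick an expression."""
    text = transcribe(audio_frames)
    intent, reply = interpret_intent(text)
    emotion = read_emotion(facial_landmarks, voice_tone)
    # The avatar pairs an expression with the verbal reply, mirroring the user.
    expression = {"positive": "smile", "negative": "concerned"}.get(emotion, "attentive")
    return {"intent": intent, "reply": reply, "expression": expression}

result = respond(["what", "is", "my", "balance"], {"smile": 0.8}, "bright")
print(result["intent"], result["expression"])  # check_balance smile
```

The key design point the article describes is the parallelism: intent interpretation and emotion reading are independent of each other and can run concurrently, with both results fused only at the final response step.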

Learn about Watson Conversation Service

Learn about Soul Machines

This article was originally published on Medium.
