February 14th, 2019
An AI-driven technology really shocked me recently
I joined my 7-year-old daughter, India, in the living room as she said ‘Hey Google’ – something she does often without following up with a question. It’s like she is just saying hi to a school friend as they pass by. Our Google Home responded ‘Namaste, India’. My daughter giggled and told me that means hello in Hindi.
The link between the greeting and my daughter’s name could well have just been a coincidence, but it struck me that this connected mic may well learn a vast amount about its owner or owners, and eventually tailor its response not just to answer questions in a factual way, but to provoke a broader emotional response. In this case, it was a connection which felt a tad deeper than the norm between a human and a machine or software. Like a human-on-human interaction.
Arguably humans build emotional connections with devices and software regularly – addiction to checking a phone or the joy that some find in making progress in the newest AAA game, for example. But this felt different. In this instance the connection was not born of a human hand; rather, the machine itself made the decision to use the word Namaste at that time, based on its learnt understanding of Indy.
"BY 2022, YOUR PERSONAL DEVICE WILL KNOW MORE ABOUT YOUR EMOTIONAL STATE THAN YOUR FAMILY"
Intrigued, I started looking deeper into how machines are being trained to understand how humans interact and react emotionally with one another and with the machines themselves.
It seems there is a lot going on. Annette Zimmermann, a VP of research for Gartner, recently said: “By 2022, your personal device will know more about your emotional state than your own family.” So, what does that mean if it’s to hold true? I’ve spent the last 10+ years working in the world of content creation, where generally my goal and the goal of the teams around me is to trigger some sort of emotion from our audiences. And when it’s content that’s being funded by a brand, we want that emotional reaction to trigger some sort of behaviour, like buy more Product X, or at the least remember Brand Y. When it’s content that’s being created for art’s sake, the goal is still to provoke a reaction. To tell some sort of story.
So, when machines will eventually know us better than we know each other, or indeed ourselves, what’s to stop them from creating the perfect content to provoke any or all reactions that may be desired in the audience? Not a huge amount it seems.
For some context, as humans, we are constantly analysing and making best guesses on the emotions of others. Much of that happens unconsciously. We deftly change how we interact with people based on some determination of the vibe and mood we detect in an individual and the situation. But very soon, machines may be much better social chameleons than we are.
The ability of machines to use more than simply facial expression or what we might say and how we say it (or type it) to deduce emotional state is already here. Ohio State University recently used a method that humans automatically follow in order to read emotions with a computer vision based AI model. After training, the model in the researchers' tests proved better at detecting emotional state in human subjects than fellow humans – successfully determining happiness 90% of the time, anger 80%, sadness 75%, and fear 70% by analysing small changes in facial skin colour. Apparently, that’s something that we innately do to help us determine whether the person we are engaging with is angry, sad, happy, bored, embarrassed etc…
But this stuff is embryonic – if we really want to approach human-level AI, it’s a whole other ball-game than a discrete, research-driven test. We need long-term investments, and deep collaboration between academia and corporations to make strides ahead.
What the Ohio State research and other similar projects show is that machines have a proven ability to grasp patterns in human behaviour – the next level is for a machine to understand the causality between the detected behaviours and the stimuli that have triggered them. This is a big task.
If we do eventually find ourselves with a good causal model of the world we are dealing with, we can generalize even in unfamiliar situations. This is crucial. We humans are able to project ourselves into situations that are very different from our day-to-day experience. Machines are not, because they don’t have these causal models. Yet.
"AN AI MODEL PROVED BETTER AT DETECTING EMOTIONAL STATE IN HUMAN SUBJECTS THAN FELLOW HUMANS"
There is no doubt that there are some incredible developments ahead in terms of machines better understanding human beings in real time, and adapting how they present information to better engage us.
My creative colleagues freak a little when discussing how machines and software may evolve to understand emotions and manipulate them. It’s understandable: any new advancement brings a certain trepidation… When the written word first became a popular means to store information, people were genuinely fearful we would lose our ability to memorise information – which seems crazy now, as we can probably all agree that without the written word we humans would be a hell of a lot dumber.
It’s also understandable when you come to think of the ‘world-changing’ rhetoric generally spouted by well-meaning, yet attention-loving futurists on how AI and automation will spell doom for the majority of our jobs or will create some sort of utopian future where we all have exactly what we need. Both may well end up being true, but then again they may not. There is a long way to go and many incremental steps in between before that is a potential reality. For the most part, futurists are making best guesses based on their current expertise.
With the arrival of colour TV, David Sarnoff, chief of the Radio Corporation of America (RCA), waxed lyrical about how the innovation would allow people to view fine art in all its technicolour glory at home. Of course, you could do that, but nobody really does. That haughty ideal wasn’t what TV ended up being about. In fact, we binge watch series and a mega industry has been built around filling the bits in between content with ads. The commercial market pressures have shaped what TV is today, and that’s taken time, happened incrementally and is still evolving. Surely it will be similar with AI and machine learning. The shape of things to come will be based on market pressures over time.
So people, stop freaking out about losing your job, for now at least – in the short term machines that know how we feel and think might not be bad news at all, and if you work in content creation and advertising or marketing they may well help you keep your job. New York State University research teams recently seemed to validate that when the parts of your brain that deal with emotion are stimulated, your memory also improves around any information you are exposed to within 30 minutes of that emotional stimulus. Potentially very powerful in improving education and training. Also very pertinent in the world of creative content production: it’s what we set out to do. If an AI can help us nail our creative content to better land our message, I’m all for it – as all of our brand clients would be, I’m sure.
"MACHINES WILL GET VERY GOOD AT READING OUR EMOTIONAL STATE AND DELIVERING CONTENT THAT WILL MANIPULATE US TO RESPOND"
So here goes my own futurism/educated guesswork. Machines will get very good at reading our emotional state and delivering content that will manipulate us to respond in ways that they, we or others desire.
All very interesting (I hope you are thinking). But emotions change pretty rapidly. They do in my home at least. We go from side-splitting laughter to all-out warfare in a split second at times. So in order for an AI-driven content creator to be able to deliver content to my family that is relevant to our ever mood-swinging emotional state, it would need to be able to adapt in real time. Well, there are some mind-blowing developments happening in machine learning and AI-driven real-time generation of content too.
We are seeing developments where unsupervised models are able to turn video of day to night, or of a snowy scene to a sunny scene without human involvement, thanks to Nvidia.
We have AI creative directors writing TV commercials for Lexus. We have AI film directors working to a set of human-defined parameters and outputting not just a screenplay but actually generating a short film. Yes, fair enough, it's nonsensical when watched. But it is happening.
And the newest development, and the one I am finding most incredible, comes from Nvidia researchers: a deep learning-based model that uses a conditional generative neural net as a starting point and, after being trained on existing videos shot from vehicles driving through cities, is able to render new 3D environments.
The model is able to generate explorable three-dimensional scenes in a game engine (UE4) based on what it’s learned from the countless hours of training videos. It’s effectively able to fill in the 3D gaps through its learned knowledge and generate objects.
We here at Happy Finish are working on some AI-based techniques that will allow non-technical, creative people to generate three-dimensional objects from two-dimensional imagery or sketches. It’s a long research road, but the end results for both our research and that of others in this field will be to get to a point where machines are able to intuitively generate virtual or immersive objects, worlds, narratives in real time. More best guessing from me there – but it feels inevitable. It’s all a bit Westworld.
In the future, are those people whose job has been taken by a machine, and who are living in an AI-induced utopia, really going to be sat living in a VR world that allows them to do anything they want or be anyone they want, as Ready Player One would have us believe? I doubt that.
So what’s going to happen next….
"IT'S AR AND NOT VR THAT WILL BE THE MOST COMMON IMMERSIVE TECHNOLOGY WE WILL ALL BE ENGAGING WITH IN THE YEARS TO COME"
Enter stage left, AR… If Tim Cook’s recent iPhone PR drive during the launch of the new Apple devices is anything to go by, the (momentarily) trillion dollar tech juggernaut is no doubt entering the world of AR in a big way. The iPhone moment of AR is impending.
Tim proclaims that eventually we will look back and wonder, ‘how did I ever survive without AR everywhere?’; much like how we are in awe that we ever managed to maintain a group of friends, navigate to somewhere new or order anything to arrive by post without the internet (that is, if you are old enough to remember a time without the internet – sadly, I am).
It’s AR and not VR that will be the most common immersive technology we will all be engaging with in the years to come.
Where an accompaniment to my childhood was computer games that gradually improved in graphical fidelity and story depth, we’ve adapted to new technologies that have had a profound impact on how we lead our lives.
My daughter now finds it completely normal to have a conversation with a smart device. Because of my job, she’s also pretty with it when it comes to AR. I have little doubt she will become used to information being embedded seamlessly in three dimensions around her.
So machines may well be on the verge of understanding in real time, or even predicting India’s emotions and delivering relevant data for her to consume all around her, immediately. In many ways, a story or a piece of entertainment content is simply a series of well-crafted pieces of data constructed and delivered in a specific sequence to deliver meaning and provoke response… With various story arcs bringing about a multitude of emotional responses.
When I started to look at it in this way it’s sort of obvious that machines will be able to construct ‘stories’ that will elicit desirable responses in their human audiences in future. The smart people at MIT have been working to create a ‘story learning machine’ which will ultimately be able to map the relationship between a specific story type and how we mortals then engage with and spread that story.
The implications of this in the world of brand comms within which I work will be profound. Machines scientifically guiding brand storytellers on what story will best engage what audiences, and when…. that used to be the sole domain of the Mad Men. We may not be on the verge of Creatives and Copywriters being replaced by machines. Creativity will remain a skill owned by humans for some time to come. But don’t be surprised if we start seeing AI popping up in the credits for creative campaigns in the near future.