AI Risk or Safety?

Robot overlords ruling humanity has been a common theme in movies and dystopian novels for the past half century. Today, smaller-scale, more realistic concerns, like whether Alexa is eavesdropping on conversations, have captured the collective consciousness. As technology rockets forward, the question of what happens once artificial intelligence (AI) surpasses human intelligence is becoming a very real concern. The gap between fiction and reality is shrinking faster than ever, and although claims of impending robot overlords are premature, it is not clear whether the ever-quickening pace of AI development puts the world on a path toward utopia or dystopia. Emerging AI applications are radical in themselves and worthy of concern. The potential for unintended consequences, misaligned applications, and loss of control are all significant risks inherent in the technology. Yet funding for research into AI safety is currently less than 2 percent of the funding for AI development, at a time when entities such as China and Facebook are increasingly looking to use AI for social management and new applications such as driverless cars loom on the horizon. The current trajectory of AI development is reckless, misdirected, and ultimately a significant threat to the potential benefits of the technology.

Currently, the movie-screen depictions of AI going rogue and wiping out humanity are fantasy. Still, there is some reason to worry. Computers are machines that do exactly what is asked of them and nothing more. Yet as AI improves, we are witnessing a rise in “specification gaming”: the generation of a computer solution that literally satisfies the stated objective but fails to solve the problem according to the human designer’s intent. In other words, the computer finds its own way to solve the problem. Sometimes this leads to creative solutions. For example, an algorithm was tasked with designing a circuit without its central component, the capacitor. Researchers were astounded to find that it succeeded, and had done so by harnessing energy from a nearby radio station to create a makeshift capacitor. Such creative solutions aren’t always cause for celebration. In another case, an AI learning to play Tic-Tac-Toe started playing moves far outside the normal three-by-three square; its opponent eventually crashed from memory overflow, handing the AI the win by default. The rules didn’t matter; all that mattered to the AI was that it didn’t lose.
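To make the failure mode concrete, here is a minimal toy sketch of specification gaming in Python. Everything in it, the one-dimensional track, the hazard cell, and the reward function, is hypothetical and invented purely for illustration; it is not drawn from the experiments described above. The point is simply that a reward-maximizing agent prefers the policy that satisfies the literal reward (stay “safe” forever) over the one that accomplishes the designer’s actual goal.

```python
# Toy illustration of specification gaming: the stated reward is satisfied
# literally while the designer's actual intent (reach the goal) is ignored.
# The environment, policies, and reward below are hypothetical, invented
# only for this sketch.

GOAL = 9          # goal cell on a 1-D track of cells 0..9
EPISODE_LEN = 20  # fixed episode length in steps


def stated_reward(position: int) -> float:
    """Reward the designer wrote down: +1 for every step spent 'safely'
    (i.e. not on the hazardous cell 5). The *intent* was to reach GOAL."""
    return 1.0 if position != 5 else -10.0


def run_episode(policy) -> float:
    """Roll out one episode and sum the stated reward."""
    pos, total = 0, 0.0
    for _ in range(EPISODE_LEN):
        pos = policy(pos)
        total += stated_reward(pos)
    return total


def intended_policy(pos: int) -> int:
    """Walk toward the goal, crossing the hazardous cell once."""
    return min(pos + 1, GOAL)


def gaming_policy(pos: int) -> int:
    """Never move: racks up 'safe' reward forever and never reaches the goal."""
    return pos


if __name__ == "__main__":
    print("intended policy reward:", run_episode(intended_policy))  # lower score
    print("gaming policy reward:  ", run_episode(gaming_policy))    # higher score
```

Run the sketch and the do-nothing policy collects more reward than the policy that actually reaches the goal, which is exactly the gap between the stated objective and the designer’s intent.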

In contrast to these somewhat technical specification-gaming failures lies a much more familiar risk: people. Although the possibility of AI’s values becoming misaligned with humanity’s is admittedly far-fetched, the possibility of a corporation’s or government’s values becoming drastically misaligned is not. Introduce the power of AI, and the effects can be dramatic.

Sometimes corporate behavior is merely morally ambiguous, as in 2012, when data scientists tweaked the Facebook news feed algorithms of almost 700,000 users. Some of the users were shown happier posts, while others were shown content determined to be sadder than average. After exposure to the manipulated news feed, the users were more likely to post similarly happy or sad content themselves. Sophisticated algorithms like these control what we see on the internet, and this experiment shows that large corporations are more than willing to experiment with their emerging influence.

In other cases, the actions are decidedly more controversial. In 2014, China announced the creation of a social credit system that aims, by 2020, to monitor and rank the social standing of its 1.4 billion citizens. Already in Shanghai, behaviors such as not visiting one’s parents often enough, parking illegally, protesting against the government, and falsifying personal history can lower a person’s social credit score. It is the kind of system any authoritarian government would pursue, and it is only possible because of the meteoric rise of big data and machine learning in recent years.

Finally, in what may very well be a doomsday scenario, AI could become sentient or so wildly surpass humans in effective intelligence that it becomes uncontrollable. Big-name figures from Elon Musk to the late Stephen Hawking have warned that such a scenario, commonly referred to as the singularity, could very well spell the end of civilization. Small human errors or misalignments in the programming of an AI could spiral into catastrophe as the technology scales up. Consider a small but particularly striking example from 2015, when Google’s image-recognition software labeled two black men as gorillas. Once AI reaches superhuman levels of intelligence, the scope for such unintended consequences grows astronomically. Without proper management of these misalignment risks, dystopian realities could become far more likely, and far more disconcerting, than a computed social credit score.

Clearly, the development of AI is outpacing humanity’s ability to understand and regulate it. The technology is new, dangerous, and, by its very nature, largely impenetrable to direct human understanding. This has spurred public figures to put their money where their mouths are. In 2015, Elon Musk and a group of fellow backers pledged $1 billion to establish OpenAI, a nonprofit tasked with researching and developing friendly AI for the benefit of society. Its focus is to ensure the benefits of AI are neither funneled to an elite few nor eclipsed by unforeseen accidents. In academia, institutes such as Berkeley’s recently established Center for Human-Compatible AI and Oxford’s Future of Humanity Institute research more foundational problems in the emerging field of AI safety. These problems range from preventing specification gaming to creating algorithms that keep the values of an AI aligned with the desires of society. Although the field is relatively new, increased funding and longer timelines hold the promise of solving the AI alignment problem.

Funding for such safety research, however, is woefully inadequate. Given the unchecked progress of the technology, the safest option would be to simply curtail or regulate the development of AI until the risks can be managed. Unfortunately, this is unlikely to happen. There are few, if any, cases where the development of a new technology has been successfully halted. Even the highly secretive Manhattan Project was infiltrated by Soviet spies before the end of World War II. Still, despite the fact that several of the world’s powers have detonated nuclear weapons, only the United States has used them in war. In that respect, the establishment of regulatory agencies has been remarkably successful in reducing the proliferation and use of nuclear weapons.

This suggests that analogous regulatory mechanisms could be similarly helpful in preventing catastrophic AI outcomes. However, in the same way uranium is the raw material for a nuclear bomb, data is the raw material for AI, and the higher its quality, the better. The problem, then, is one of data regulation: restricting access to certain data would restrict the development of certain kinds of AI. Defining appropriate data collection is a complex problem, and many would argue that present data collection practices have already crossed the line. And while regulation might succeed in slowing the development of an advanced AI, the current state of initiatives such as China’s social credit program suggests that existing technologies are already sufficient to create a dystopian state. For all practical purposes, Pandora’s box is already open.

The future of AI is unwritten. Automation has the potential to radically shift how citizens approach work and leisure. As Oxford researchers concluded this year, automation could put nearly half of all jobs at risk in the next two decades. Self-driving cars have the potential to save Americans billions of hours otherwise spent behind the wheel, but they also threaten to destroy the livelihoods of 3.5 million truck drivers in the US alone. Moreover, there is the risk of a rogue hacker interfering with or seizing control of a fleet of these vehicles. Researchers at Berkeley investigating adversarial attacks on AI-driven vehicles came to the stunning conclusion that many driverless cars can be fooled into misreading street signs by altering only a few pixels’ worth of input data. The danger is real and approaching at high speed.
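For readers curious how such attacks are constructed, below is a minimal sketch of the fast gradient sign method (FGSM), a standard technique for crafting adversarial perturbations. The tiny linear classifier and the random stand-in “street sign” image are assumptions made purely for illustration; this is not the Berkeley researchers’ model, data, or attack code.

```python
# Minimal sketch of the fast gradient sign method (FGSM).
# The stand-in classifier and random image below are hypothetical placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))  # stand-in classifier
loss_fn = nn.CrossEntropyLoss()

image = torch.rand(1, 3, 32, 32, requires_grad=True)  # stand-in "street sign" image
label = torch.tensor([3])                             # its correct class index

# Compute the loss gradient with respect to the *input pixels*.
loss = loss_fn(model(image), label)
loss.backward()

# Nudge every pixel a tiny amount (epsilon) in the direction that increases
# the loss; the change is nearly invisible but can flip the prediction.
epsilon = 2.0 / 255.0
adversarial = (image + epsilon * image.grad.sign()).clamp(0.0, 1.0).detach()

print("original prediction:   ", model(image).argmax(dim=1).item())
print("adversarial prediction:", model(adversarial).argmax(dim=1).item())
```

With a trained model and a carefully tuned epsilon, the perturbed image typically looks unchanged to a human eye while the classifier’s prediction flips, which is why sign-reading systems in driverless cars are such a natural target.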

Risks posed by AI are difficult to assess, because the many technologies involved are so interrelated that they resist isolated risk evaluation. The current trajectory of AI is dangerously unregulated, and AI safety research, although gaining traction in recent years, has not kept pace. At this rate, AI could be humanity’s last invention. Yet, from the creation of the steam engine to the invention of the computer, humanity has persevered and risen to the challenge. A saying often attributed to Plato holds that “the tools which would teach men their own use would be beyond price.” AI’s true power may not be its ability to mimic or surpass human intelligence, but its promise to create and disseminate understanding. If we are willing to approach artificial intelligence with unprecedented humility, accepting regulation that slows development while we focus on safety, our future with AI is promising.

References

https://deepdrive.berkeley.edu/node/107
https://openai.com/about/
https://www.economist.com/graphic-detail/2018/04/24/a-study-finds-nearly-half-of-jobs-are-vulnerable-to-automation
https://80000hours.org/problem-profiles/positively-shaping-artificial-intelligence/
https://en.wikipedia.org/wiki/OpenAI
https://www.wired.com/story/when-it-comes-to-gorillas-google-photos-remains-blind/
https://www.theatlantic.com/technology/archive/2014/06/everything-we-know-about-facebooks-secret-mood-manipulation-experiment/373648/
https://www.theatlantic.com/technology/archive/2018/02/chinas-dangerous-dream-of-urban-control/553097/
http://www.newsweek.com/china-social-credit-system-906865
https://mailchi.mp/d1a19c140226/alignment-newsletter-10?e=e75491658f
https://vkrakovna.wordpress.com/2018/04/02/specification-gaming-examples-in-ai/

Written on September 15, 2018