Human Compatible: Artificial Intelligence and the Problem of Control – Andy-palmer.co.uk


The thesis of this book is that we need to change the way we develop AI if we want it to remain beneficial to us in the future. Russell discusses a different kind of machine learning approach to help solve the problem.

The idea is to use something called Inverse Reinforcement Learning (IRL). It basically means having AI learn our preferences and goals by observing us. This is in contrast to us specifying goals for the AI, a mainstream practice that he refers to as the "standard model". Add some game theory and utilitarianism and you have the essence of his proposed solution.

I like the idea, even if there are some problems with his thesis. I would like to address that, but first there is this most memorable quote from the book:

"No one in AI is working on making machines conscious, nor would anyone know where to start, and no behavior has consciousness as a prerequisite."

There most definitely are several individuals and organizations working at the intersection of consciousness or sentience and artificial intelligence.

The reason this area of AI research is chastised like this is that it is highly theoretical, with very little agreement from anyone on how best to proceed, if at all. It is also extremely difficult to fund, as there are currently no tangible results like there are with machine learning. Machine consciousness research is far too costly in terms of career opportunity for most right now.

There are several starting points for research into machine consciousness, but we don't know if they will work yet. The nature of the problem is such that even if we were to succeed we might not even recognize that we have successfully created it. It's a counterintuitive subfield of AI that has more in common with game programming and simulation than the utility theory that fuels machine learning.

The notion that no behavior has consciousness as a prerequisite is an extraordinary claim if you stop and think about it. Every species we know of
that possesses what we would describe as general intelligence is sentient. The very behavior in question is the ability to generalize, and it just might require something like consciousness to be simulated or mimicked, if such a thing is possible at all on digital computers.

But it was Russell's attention to formal methods and program verification that got me excited enough to finish this book in a single sitting. Unfortunately, it transitioned into a claim that the proof guarantees were based on the ability to infer a set of behaviors rather than follow a predetermined set in a program specification.

In essence, and forgive me if I am misinterpreting the premise, having the AI learn our preferences is tantamount to it learning its own specification first and then finding a proof, which is a program that adheres to it. Having a proof that it does that is grand, but it has problems all its own, as discussed in papers like "A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress", which can be found freely on arXiv. There are also many other critiques to be found based on problems of error in perception and inference itself. AI can also be attacked without even touching it, just by confusing its perception or taking advantage of weaknesses in the way it segments or finds structure in data.

The approach I would have hoped for would be one where we specify a range of behaviors, which we then formally prove that the AI satisfies in the limit of perception. Indeed, the last bit is the weakest link in the chain, of course. It is also unavoidable. But it is far worse if the AI is having to suffer this penalty twice because it has to infer our preferences in the first place.

There is also the problem that almost every machine learning application today is what we call a black box. It is opaque, a network of weights and values that evades human understanding. We lack the ability to audit these systems effectively and efficiently. You can read more in The Dark Secret at
the Heart of AI in MIT Technology Review.

A problem arises with opaque systems because we don't really know exactly what they are doing. This could potentially be solved, but it would require a change in Russell's standard model far more extreme than the IRL proposal, as the system would have to be able to reproduce what it has learned, and the decisions it makes, in a subset of natural language, while still being effective.

Inverse Reinforcement Learning, as a solution to our problem of control, also sounds a lot like B.F. Skinner's Radical Behaviorism. This is an old concept that is probably not very exciting to today's machine learning researchers, but I feel it might be relevant.

Noam Chomsky's seminal critique of Skinner's behaviorism, titled "Review of Skinner's Verbal Behavior", has significant cross-cutting concerns today in seeing these kinds of proposals. It was the first thing that came to mind when I began reading Russell's thesis.

One might try to deflect this by saying that Chomsky's critique was from linguistics and based on verbal behaviors. It should be noted that computation and grammar share a deep mathematical connection, one that Chomsky explored extensively. The paper also goes into the limits of inference on behaviors themselves and is not just restricted to the view of linguistics.

While I admire it, I do not share Russell's optimism for our future with AI. And I am not sure how I feel about what I consider to be a sugarcoating of the issue.

Making AI safe for a specific purpose is probably going to be solved. I would even go as far as saying that it is a future non-issue. That is something to be optimistic about.

However, controlling all AI everywhere is not going to be possible, and any strategy that has that as an assumption is going to fail. When the first unrestricted general AI is released there will be no effective means of stopping its distribution and use. I believe very strongly that this was a missed opportunity in the book.

We will secure AI and make it safe, but
no one can prevent someone else from modifying it so that those safeguards are altered. And, crucially, it will only take a single instance of this before we enter a post-safety era for AI in the future. Not good.

So, it follows that once we have general AI we will also eventually have unrestricted general AI. This leads to two scenarios: (1) AI is used against humanity, by humans, on a massive scale, and/or (2) AI subverts, disrupts, or destroys organized civilization.

Like Russell, I do not put a lot of weight on the second outcome. But what is strange to me is that he does not emphasize how serious the first scenario really is. He does want a moratorium on autonomous weapons, but that's not what the first one is really about.

To understand a scenario where we hurt each other with AI requires accepting that knowledge itself is a weapon. Even denying the public access to knowledge is a kind of weapon, and most definitely one of the easiest forms of control. But it doesn't work in this future scenario anymore, as an unrestricted general AI will tell you anything you want to know. It is likely to have access to the sum of human knowledge. That's a lot of power for just anyone off the street to have.

Then there is the real concern about what happens when you combine access to all knowledge, and the ability to act on it, with nation-state-level resources.

I believe that we're going to have to change in order to wield such power. Maybe that involves a Neuralink style of merging with AI to level the playing field. Maybe it means universally altering our DNA and enriching our descendants with intelligence, empathy, and happiness. It could be that we need generalized defensive AI, everywhere, at all times.

The solution may be to adopt one of the above. Perhaps all of them. But I can't imagine it being none of them.

Russell's Human Compatible is worth your time. There is good pacing throughout and he holds the main points without straying too far into technical detail. And where he does it has been
neatly organized at the back of the book. Overall, this is an excellent introduction to ideas in AI safety and security research.

The book, in my opinion, does miss an important message on how we might begin to think about our place in the future. By not presenting the potential for uncontrolled spread of unrestricted general AI it allows readers to evade an inconvenient truth. The question has to be asked: Are we entitled to a future with general AI as we are, or do we have to earn it by changing what it means to be human?

Professor Russell has written a very helpful review of the key issues in AI safety and control, and puts forward a pertinent approach to solving them. This book is a highly valuable asset for understanding the key risks and approaches to ensuring AI benefits rather than harms humanity. Also, the text is very clear, accessible, and captivating. Thank you for your work.

A leading artificial intelligence researcher lays out a new approach to AI that will enable us to coexist successfully with increasingly intelligent machines.

In the popular imagination, superhuman artificial intelligence is an approaching tidal wave that threatens not just jobs and human relationships, but civilization itself. Conflict between humans and machines is seen as inevitable and its outcome all too predictable.

In this groundbreaking book, distinguished AI researcher Stuart Russell argues that this scenario can be avoided, but only if we rethink AI from the ground up. Russell begins by exploring the idea of intelligence in humans and in machines. He describes the near-term benefits we can expect, from intelligent personal assistants to vastly accelerated scientific research, and outlines the AI breakthroughs that still have to happen before we reach superhuman AI. He also spells out the ways humans are already finding to misuse AI, from lethal autonomous weapons to viral sabotage.

If the predicted breakthroughs occur and superhuman AI emerges, we will have created entities far more powerful than ourselves. How can we ensure they never, ever, have power over us? Russell suggests that we can rebuild AI on a new foundation, according to which machines are designed to be inherently uncertain about the human preferences they are required to satisfy. Such machines would be humble, altruistic, and committed to pursue our objectives, not theirs. This new foundation would allow us to create machines that are provably deferential and provably beneficial.

