A whole-of-society approach needed to tackle deepfakes: OpenAI exec | World News - Hindustan Times

A whole-of-society approach needed to tackle deepfakes: Anna Makanju, Global Public Policy head, OpenAI

Dec 14, 2023 12:47 AM IST

OpenAI's head of global public policy talks regulation, risks and benefits of AI in exclusive interview

Emerging harms from AI such as deepfakes need a whole-of-society solution and the conversations around regulating artificial intelligence (AI) technologies should strike a balance between laying down guardrails and how best to leverage the technology for society’s benefit, OpenAI’s head of Global Public Policy Anna Makanju said.

Anna Makanju, head of Global Public Policy, OpenAI (Sanjeev Verma/HT Photo)

In an exclusive interview with HT in New Delhi, Makanju, who has worked in various roles in US President Joe Biden’s administration, spoke of how the recent upheaval at the company (its CEO Sam Altman was briefly fired in a dramatic boardroom churn before being reinstated) has left the maker of ChatGPT with a more transparent governance structure. She also spoke on a wide range of issues concerning AI, including its potential for good, the unseen risks it poses, and how best to navigate what may be the biggest technological leap for the world in decades. Edited excerpts:

Your company recently went through a significant churn. It has been hypothesized that this was really a structural conflict between the nonprofit board, which is about the greater good for everyone, and the more profit- and business-oriented arm. Was that the conflict happening in the company?

It really isn’t. Now, there have been multiple public statements that reaffirm that from board members who have said that this is categorically not about safety, it’s categorically not about the rate of growth, the speed. It was ultimately about the relationship between Sam [Altman] and those board members and their feeling like there was really a breakdown in communication.

It was enabled by this unique structure, and their ability to remove our CEO; in many ways, I think it worked as intended in that they were able to exercise this power. And ultimately, it was they who came to the conclusion that this wasn’t the decision that made the most sense for the mission of the company, and so now we have the structure that we do. Ultimately, I think it was healthy for us. Now we have a more solid governance structure, and one that will be more transparent and clear to people.

What were your first reactions on the EU proposal on AI regulation?

There’s still a ton of technical work to be done in the text so I’m waiting to read the final text which hasn’t actually been shared yet. But in general, I think it’s a great thing that they were able to come to an agreement because they’ve been working on this since 2018. And I think in general, it’s great for democratic institutions that they were able to identify a compromise.

Are foundational models out of it and self-regulated?

The risk framework will still govern the work of these models. I don’t think that somehow this will mean that companies building these tools are left unregulated, because we are going to be implicated by the uses as well.

Europe has been seen as the vanguard of regulating everything. In some cases, that has ended up being counterproductive from the perspective of both investment and innovation. Are there fears that this could happen with AI too?

I think that the European governments themselves were concerned that this would happen. And this was part of the reason that they objected at the last minute to some of the specific rules that were in there. It is very difficult to know exactly what the implications of this law will be yet because so much depends on the details of implementation. I think one thing that they did that makes a lot of sense, is create flexibility for adjusting the way this law will be implemented because the technology is going to continue to evolve and change.

At your session at the GPAI Summit, the speaker from UNESCO [Gabriela Ramos, the assistant director-general for social and human sciences] talked about the concentration of power in the AI sector, especially in the US or in a few companies. How do you see these conversations?

From its founding, the intention of OpenAI’s unique structure was to really think about how to more broadly distribute benefits. It is in our charter. … I absolutely agree that thinking about how to make sure that all of these benefits are as widely distributed as possible is a critical part of developing these tools. But one of the problems is that there are all these foundational issues, such as access to even electricity or the internet, that limit people’s ability to take advantage of this. And that is, unfortunately, not something OpenAI can really solve, but it is one of the reasons that we’ve tried to get governments involved from the beginning.

How are your discussions with stakeholders in India going if you were to look at it from a spectrum with the EU’s AI Act at one end and the Biden Executive Order on the other?

The executive order does something that makes a lot of sense which is that they really work hard to strike the balance between ensuring there is an infrastructure that makes people feel like there are the necessary guardrails on the technology while also encouraging government agencies to adopt the technology. Right now, 95% of the conversations we have are about regulation. But governments aren’t really moving very fast to incorporate this technology, and take advantage of it.

In India, this is much more of the conversation: how do you leverage this as quickly as possible for the benefit of citizens while implementing guardrails? Here, the conversation is much more focused on leveraging the benefits.

Regulators have been trying to understand what foundation models do and trying to get ahead of risks, especially existential ones that may arise in the future. How should AI companies open up for regulators, academics or experts to see if they might be building something with too big an implication on society and economy?

I think one of the challenges here is that the conversation about existential risks has been extremely abstract. We started a new team a few months ago called the Preparedness Team [in October 2023] that’s actually going to specify and then measure the risks here: What are the risks that we can anticipate? How do we test our models for these risks as their capabilities advance? How do we share the information about these models externally prior to deployment?

What we really need is much more industry-wide evaluation and measurement, because we’re not always measuring the same things, or even using the same language to talk about them. We need to get this conversation much more grounded in facts, and then figure out how to share that information, particularly with governments, so that they’re at least aware of what the frontier capabilities and risks are.

I spent eight years working in government, so having governments at the forefront of this is something that I personally really care about. Anything governments can do to start testing, sandboxing, doing low-risk applications, is hugely important.

You might end up having unaligned foundation models by bad actors, similar to how the dark web exists today, which will possibly pose a threat to how regulators see these risks. How do you approach these risks that develop outside industry collaborations? And how do you technically stop them? Can you?

This is something that we have thought about and have done some research on. Right now, the amount of compute resources necessary to develop a model that is at the frontier, with the highest risk level, is still really substantial. There are some ideas for how to tackle this. It’s really complex. We know that there is a certain amount and type of hardware [required for this]. Governments can look at that and say we should at least be notified if you’re going to train a model of a certain size. And the Biden EO [executive order] looks at it this way: if you’re using this many compute resources, then you should alert us. I think that’s a good start right now, because once the model is trained, the amount of compute required to run it is actually quite small.

You talked about how in India, the conversation has also been about incorporating AI for governance. What are the use cases that the Indian government and OpenAI have been looking at?

Unfortunately, I’m not sure that we’re directly working with the Indian government right now on use cases, but we have pilots going with small states like Albania and Iceland. For languages with large datasets, the model performs pretty well and will continue to improve. But what about languages with really small datasets? If yours is a low-resource language, how can we make sure that the model will still perform? That’s what our project with Iceland is about.

These models are shockingly good at tourism. A lot of people are using these models to plan their day and itineraries.

The Indian government is very big on DPIs [digital public infrastructure]. In the debate between efficiency and privacy, the Indian government leans more towards efficiency than towards privacy. It makes it easier to layer these models atop DPIs.

One of the reasons I’m here is because I really want to do more of that.

Existential risk has dominated the discussion around AI regulation. One of the criticisms levelled against OpenAI and Anthropic, especially in the wake of the open letter that called for a moratorium on AI research, is that existential risk has been overplayed while immediate harms like bias, discrimination, and exclusion have not been addressed. What happens when these harms are combined with digital public infrastructure (DPI) and governance structures? How do you see that playing out?

One of the reasons that you saw that dynamic is because a lot of AI researchers believed that while there was already a lot of attention on present-day harms, people were not aware of and not paying attention to the potential down-the-road consequences of these harms. Now, I think that is no longer the case. I think everybody now is aware, and thinking quite hard about existential risks.

There was also the idea that many existing agencies and organisations were paying attention to the near-term risks. A new ecosystem was not required to address those things. We have been directly addressing those concerns [of bias, exclusion etc] in the development processes of these tools.

We take these huge data sets to train the model, and then there is a whole other process of post-training where we steer model behaviour in a deliberate way. This is where reinforcement learning and other techniques come in, so you can train the model to behave in a way that does not reflect the biases in the pre-training data sets.

How do you do that?

We steer it away from providing biased responses. We have subject matter experts from around the world and they review model outputs to say that this is the best response.

A lot of people play with these models with the intention to game them. There are rules that prevent ChatGPT from giving bomb-making instructions but if you were to give it a hypothetical where you need to blow up a field to make it more fertile, it would actually give you instructions on how to make a bomb.

I think the question is about jailbreaks. 87% of the jailbreaks that were effective on GPT-3.5 are no longer effective on GPT-4. As these models get smarter, they actually become more resilient to jailbreaks. Recent studies have shown that ChatGPT hallucinates the least among the large language models, at about 3%. That number is going to keep getting smaller.

When it comes to biases, these models are large language models, and rely on linguistics. Language is biased. For instance, ChatGPT at one time presumed a doctor to be male and a nurse to be female.

Now, if you were to type in the question about the doctor and nurse, I’m confident that it will not [have that bias]. This is why we have iterative deployment where we try to solve for as many of these things as possible before deployment. People will absolutely try to do the worst possible thing and we will then remove their ability to do that. We’re continuously incorporating our learnings into the models as people use them.

Does the model self-learn or is the human intervention required to correct for some of these jailbreaks?

It is not self-learning yet.

How do you account for people Voldemort-ing across systems?

We also have a problem with over-refusals because when we try to control for some of these things, sometimes the model can do unpredictable over-refusals. Certain safety guardrails can bleed over in ways that are [unexpected]. We are criticised for this as well.

This is just a matter of continuing to calibrate and using our models to help calibrate our models because sometimes the model itself can create better rules for how to get it to do something than we can. This is something that we’re actually increasingly using our models to do and to scale this work. Our safety systems team has really figured out how to make these things more efficient using the models which has been helpful in rolling these out quickly and across languages.

When it comes to frontier AI, who identifies and designates a particular AI model as a frontier AI model?

This is a great question, because what does it even mean to be at the frontier? We are now working to get a definition that really encompasses the key aspects. There may be a different definition for different uses. But for now, we are trying to combine compute thresholds with capability evaluations.

If the designation comes from the companies themselves, or from a forum such as the Frontier Model Forum, which is completely company-led, why should the government or anybody outside of these companies trust the companies to be transparent about when they’ve actually approached this frontier level, where there’s greater risk and therefore potential for greater regulation?

Governments are also working to define this and this is one of the tasks for these new AI institutes. For now, they’ve chosen one that’s just compute threshold based, but they’re going to evolve that over time as we learn more about what it means to be on the frontier.

The Frontier Model Forum is going to build an advisory board that will include civil society, academia and other players, so that the recommendations are not just based on what the companies themselves want to do. Right now, there is a race to the top on safety as much as there is on capability, because everyone knows that for these products to be used and incorporated, people need to believe that we take these issues seriously, and see us doing the work.

One of the safety-related solutions being looked at is watermarking AI content. How are you going to use watermarking technology for text? What is the thinking in terms of what could be foolproof and where do we stand on industry-wide collaboration?

This is an area of active research for us. Internally, we are able to identify any image that is DALL-E generated and we are working on how those tools could be deployed more publicly. Artists have told us that they are quite concerned about it.

Also, there is certainly a trade-off between provenance tools and privacy because in order to have really robust provenance, we have to collect information about users that we normally wouldn’t. There’s a vulnerability there as well. So, we’re researching this, we’re figuring out how to deploy it in a way that balances all of these questions, but it is certainly difficult.

Have technological solutions caught up to tackling the issue of deepfakes?

I worry a bit that there is over-reliance on there being a technical solution to this question, because I do think that there’s still a lot of societal [work to be done]. Fact checkers remain incredibly important. There’s kind of a distribution versus generation problem: an image generated for personal use within ChatGPT is one thing, but if it gets distributed through other platforms, then it becomes more concerning. How do you make sure that there’s no distribution of these images? Can you create campaign rules that require political parties to proactively say that they will not be deceptive? I think that this has to really be a whole-of-society solution. If we say that this has to be a technical-only solution, I don’t think that that will be robust.

In your conversations with governments, what are they most worried about?

Disinformation and election interference are among the top things. Existential risk often comes up as well, just to understand when this technology becomes dangerous. There is a lot of anxiety that governments are not going to integrate this technology quickly enough. At the same time, there is a desire to figure out how it can be used for new science; lots of scientific agencies want custom tools for science and for research. Then, job losses, of course, and trying to figure out how to predict the impact. And fake news.

In India, the government wants to allocate responsibility and liability, at least when it comes to regulation of intermediaries. And we have kind of imported the same language to AI as well. So how do you go about allocating responsibility and subsequently liability for what say, a ChatGPT produces?

I think this is a question that so many are struggling with, and I’m not better equipped to answer it than everyone else puzzling through this. In the US, this is an incredibly politicised conversation: what even constitutes speech, or fake news? These are all incredibly thorny questions.

If you look at a risk-based or a harms-based approach, the Indian government wants to know whom to hold responsible for generating harmful content, because it can easily order a Facebook or a Twitter to take content down. Who is the actual content generator in the case of ChatGPT?

One of the things that we do is work to prevent dissemination and scaling. At this moment, ChatGPT is a one-on-one interaction. The relative harm of one person generating fake news for themselves is [limited]. What we work on is preventing someone from hooking up a political chatbot to ChatGPT. Because at the end of the day, preventing individuals from having this kind of traction, I’m not sure if that’s where you prevent the most harm, especially because deciding what speech is permissible and what speech is not is also something we are grappling with. We have a bunch of democratic input grants right now to try to get as much societal feedback as possible here, because we have certain guardrails where everyone agrees that there is certain information the model shouldn’t provide, but within that, there is a ton of grey area. Certainly, for us to be the sole arbiters of what speech is permissible for one to have, that is very challenging.

Are the recommendations of the Frontier Model Forum advisory board going to be binding? We have seen that the Meta Oversight Board’s recommendations are not binding, so they are limited in their impact. How do you prevent that from happening with the Frontier Model Forum?

The Frontier Model Forum does safety best practices. It’s a little bit different; it’s just guidance. Our technical teams and our research teams have identified what works well, and obviously we will be implementing it. But these practices are also not meant to be the final say. They are meant to be helpful to governments as they think about what actually makes sense from a technical perspective.

Are you bound to implement them?

There is no legal requirement but the idea is that the legal requirements will presumably be informed by some of that work.
