Research contests

to advance AI alignment

We want to make sure advanced AI systems pursue our goals and let us turn them off.

(How hard could that be? Surprisingly hard. We think this may be the most important challenge of our time.)

There are no technical prerequisites. Submissions from people with ML or non-ML backgrounds are welcome.

Submit by March 1.

Contests

Goal Misgeneralization

In the 2022 paper Goal Misgeneralization in Deep Reinforcement Learning, researchers identified the problem that a reinforcement learning agent may retain its capabilities out-of-distribution yet pursue the wrong goal.

How can we prevent or detect goal misgeneralization?

Learn more →

Corrigibility

In the 2015 paper Corrigibility, researchers hypothesized that advanced AI systems are likely to resist attempts to turn them off.

How can we design AI systems that are open to being shut down, even as they get increasingly advanced?

Learn more →


How it works

1. Choose a contest

We’re currently accepting proposals for how to make progress on two problems in AI alignment research: goal misgeneralization and corrigibility. (Learn more about AI alignment here.)

2. Develop your submission

We’re open to a wide range of submissions that might help researchers make progress on the problems. See more details about acceptable submissions on each contest page. You can join our Discord to discuss ideas and form teams with other participants.

3. Submit

Submit your proposal as a research paper or writeup. Optionally add your graphics, code, math, data, etc. At minimum, you must submit an abstract or summary of your idea of up to 500 words. In general, strong submissions will be several pages or more. We may ask for supporting material, further elaboration, or revision if needed.

4. Judging

We will award up to $100,000 for each submission that makes progress on the contest problems. For especially promising submissions, we may offer additional research funding, fellowships, and workshop invitations. Submissions will be judged on a rolling basis by our panel of judges.

Note that we expect most prizes to range from $1,000 to $20,000. We will award higher prizes only if we receive exceptional submissions.

5. Proposals and winners made public

Participants can choose to keep their proposals private during the competition or discuss them on the participant Discord. However, all winning and non-winning proposals and the names of the winners will eventually be made public for the sake of transparency. In general, we will release proposals after the competition has ended.

Frequently Asked Questions

Is there an application deadline?

Yes, the deadline is March 1, 2023 at midnight (Anywhere on Earth). Promising early submissions may be eligible for feedback and resubmission.

What information should I include in my proposal?

Submissions to either contest should propose a new solution, define the problem more concretely, identify new examples or implications of the problems, or build on existing solutions in meaningful ways. 

All submissions must include an abstract of up to 500 words explaining your idea, how it helps with the contest problem, and what its limitations are. You may also include a supporting research paper, experimentation results, code, math, images, or a conceptual elaboration on your idea.

How will submissions be evaluated? Why will submissions be made public?

We’re focused on solving alignment problems that could arise in more advanced AI systems than currently exist. Because of this, proposals cannot be evaluated in a clear-cut manner. Researchers familiar with AI alignment will evaluate all submissions according to how valuable they expect each to be. To make prize decisions transparent, we will eventually make all final submissions public and release the names of the winning authors. We will award up to $100,000 for each winning submission that makes progress on the contest problems. For especially promising submissions, we may offer additional research funding, fellowships, and workshop invitations.

Can I work with others?

Yes – you can form teams of up to four people. For winning teams, prizes will be divided evenly among the teammates. You can find potential teammates on our Discord.

Can I submit multiple entries?

Yes. Submissions will be evaluated independently.

Who can participate?

Anyone can participate, regardless of age or background. Familiarity with machine learning or math may be helpful. We’re also excited about people with non-STEM backgrounds like philosophy, economics, and cognitive science.

Why focus on risks from more advanced AI systems?

We think current safety issues are important, but we focus on risks from more advanced AI systems because we think they are likely to be even more consequential. In particular, we think the development of unaligned artificial intelligence could be catastrophic for humanity. You can learn more about AI alignment here.

See more questions & answers