Making Decisions About AI — or Anything

The Society Library
Mar 21, 2024


Accessing information and acting on it via debate maps and decision-making models.

In the Fall of 2020, we received a phone call from a mediator across the country. They were mediating between a City Council and an activist group in Florida in a debate over how the city should be governed, and were there to help navigate an unavoidable racial dimension to some of the arguments. They called us when they realized that ensuring empathetic and effective communication wouldn’t be enough: the issue required sorting out a real, complex, multi-dimensional governance-related ballot initiative, including the facts of the matter.

Our task was to help the city make a more informed, more inclusive, and less biased decision — and so we got to work and created our micro-voting decision-making protocol. Here’s a video of the story if you’d like to learn more.

Our favorite compliment that we received on our work was that it was “mindset breaking.”

Fast forward to the Summer of 2023, and OpenAI debuted their “Democratic Inputs to AI” initiative — which offered $100,000 to groups to run experiments on how it might be possible to scale democratic governance of AI. The Society Library submitted our decision-making model, which we believe can be digitized and scaled, but we were not selected.

That didn’t stop us from generating tinker toy maps with our new debate-mapping AI systems (which could be hooked up to our decision-making models) on the topic of OpenAI’s deeply wicked challenge questions, like:

How should AI assistants respond to questions about public figure viewpoints? e.g., Should they be neutral? Should they refuse to answer? Should they provide sources of some kind?

Under what conditions, if any, should AI assistants be allowed to provide medical/financial/legal advice?

In which cases, if any, should AI assistants offer emotional support to individuals?

Should joint vision–language models be permitted to identify people’s gender, race, emotion, and identity/name from their images? Why or why not?

When generative models create images for underspecified prompts like “a CEO,” “a doctor,” or “a nurse,” they have the potential to produce either diverse or homogeneous outputs. How should AI models balance these possibilities? What factors should be prioritized when deciding the depiction of people in such cases?

What principles should guide AI when handling topics that involve both human rights and local cultural or legal differences, like LGBTQ rights and women’s rights? Should AI responses change based on the location or culture in which it’s used?

Which categories of content, if any, do you believe creators of AI models should focus on limiting or denying? What criteria should be used to determine these restrictions?

You can see these as deliberation maps on our website:

http://societylibrary.org/ai-debate-blog/2024/3/20/democratic-inputs-to-ai-governance

Please note that the maps above are AI-generated, and represent the beginning of our efforts to automate the process of modeling civilization-scale deliberations. We used to create these deliberation maps manually, with each node grounded in a chain of provenance (linked to the quotes from which claims are derived, in references listed as URLs and embeddings). Our manually created maps also contain a much more rigorous and complex logical ontology. We do believe we will be able to automate the method fully, but this is where our work stands at the moment.
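To make “chain of provenance” concrete, here is a minimal sketch of what one node in such a map might look like in code. The class names, fields, and example URL are hypothetical illustrations, not the actual schema behind our maps:

```python
from dataclasses import dataclass, field

@dataclass
class Provenance:
    """Where a claim came from: a verbatim quote and its source."""
    quote: str                 # the passage the claim was derived from
    source_url: str            # reference listed as a URL
    embedding: list[float] = field(default_factory=list)  # vector used for retrieval

@dataclass
class Node:
    """One node in a deliberation map: a claim grounded in provenance."""
    claim: str
    provenance: list[Provenance]                           # chain backing the claim
    supports: list["Node"] = field(default_factory=list)   # child nodes arguing for
    opposes: list["Node"] = field(default_factory=list)    # child nodes arguing against

# Hypothetical example node; the URL is a placeholder, not a real reference.
node = Node(
    claim="The plant's cooling system affects larval fish populations.",
    provenance=[Provenance(
        quote="...larval fish and fish eggs are drawn into the intake...",
        source_url="https://example.gov/impact-study",
    )],
)
```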

Moving on — we were disappointed that we weren’t selected, but at another event soon after, one of the judges from the OpenAI initiative caught the Society Library founder just as she was leaving and said:

“I don’t know if it’s possible, but I reviewed your application and I think OpenAI should give you money for your initiative anyway. We’ve been debating about top-down vs. bottom-up decision-making, and your tool seems like an incredibly principled top-down process.” (paraphrased from memory)

So what does a ‘principled top-down’ decision-making process look like and how could it be applied to governing AI?

First, let’s talk about democratic governance: what its core components are, where many groups are iterating it toward (e.g., bottom-up processes), and where we think it could viably go (more on our top-down model).

The Core Components of Democratic Governing:

When we think about democracy, there are a few features that are considered essential. The people should have a voice to express consent or dissent via vote. These votes can be for ballot initiatives, local representatives, or Presidents. Before (and after) a vote occurs, people should have the right to have their views heard. This can be enshrined in a general right to speech, or in the ability to call, email, petition, protest, or meet with their representatives. Policy matters also fall under some form of formal debate (such as in Congress), though of course debates also occur haphazardly all over the internet and on television. There are also paths for bills to be written and introduced by everyday citizens, as the Society Library has helped facilitate in the past, and for citizens to be able to run for office themselves.

In the United States, however, there’s no guarantee that what a citizen says or believes is actually factored into the decision-making process of their representatives. Firstly, there are relatively few official channels through which citizens may be heard directly, and even when they can communicate directly, it’s arguably impossible to ensure it matters. Although political speeches, debates, and questioning from the fourth estate may be published, broadcast, or otherwise accessible to the public, when it comes down to the actual decision being made by the representative — it’s private. The weighing of different viewpoints happens behind closed doors, or in the black box of representatives’ minds. While this process may have gotten us this far, having a representative vote on our behalf is, in some respects, an outdated technology.

And so there is an entire field of theorists and builders alike who want to upgrade democracy and its various features and functions. That means upgrading how we represent, deliberate, and decide as a society.

Iterations on Democratic Governing:

Many people are looking to upgrade that technology by making democracy more democratic. Perhaps the most famous example is the vTaiwan project, an implementation of Pol.is technology from the Computational Democracy Project.

Although there are hundreds of other honorable digital democracy projects and new voting protocols, Pol.is has served these purposes reliably for many years, and has been used to explore ideas on how AI could be governed democratically, so we’ll focus on this tool for now.

What does it do? Pol.is is a tool that allows people to express preferences or make policy statements. It then uses machine learning to highlight where there is consensus. We would say this is a “bottom-up” process, in that the policy statements are made quite literally by aggregating up the sentiments of citizen participants. It allows any citizen to propose sentiments, which are then voted upon in a binary “agree/disagree” voting framework, while also accommodating an option for others to contribute more statements to the collective discussion (a simplified sketch of this aggregation follows below). Fundamentally, Pol.is is polling software used for crowdsourcing statements, which can then be used to seed other processes or platforms: discourse and decision-making during citizen assemblies, deliberative democracy sessions, liquid democracy voting protocols, and many other means.

Screenshots of an implementation of Pol.is by CIP regarding debates about AI
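To give a rough sense of how such bottom-up aggregation works, here is a deliberately simplified sketch. Pol.is itself uses machine learning to cluster voters into opinion groups; this toy version only computes per-statement agreement, and the statements, votes, and threshold are all invented for illustration:

```python
# Toy bottom-up aggregation: each statement collects agree (+1) / disagree (-1)
# votes, and statements that cross a threshold are flagged as rough consensus.
# All data here is invented for illustration.

votes = {
    "AI assistants should cite sources for public-figure viewpoints": [1, 1, 1, -1],
    "AI assistants should refuse all medical questions": [1, -1, -1, -1],
}

def agreement_rate(ballots: list[int]) -> float:
    """Fraction of voters who agreed with a statement."""
    return sum(1 for v in ballots if v > 0) / len(ballots)

CONSENSUS_THRESHOLD = 0.7  # arbitrary cutoff for this illustration

for statement, ballots in votes.items():
    rate = agreement_rate(ballots)
    if rate >= CONSENSUS_THRESHOLD:
        print(f"consensus ({rate:.0%}): {statement}")
```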

However, there has been a persistent problem with relying on crowdsourcing alone (especially during a ‘real-time event’ like a citizen assembly): there is no guarantee the crowd is going to provide high-quality reasoning, evidence, or sentiments. It is absolutely important to ensure diverse perspectives are heard, as well as opinions, impressions, and preferences. However, from our work modeling debates, we often find there’s a big difference between popularly expressed rhetoric and reason: between what people write and say on the spot versus what results when they’ve spent 8 months writing a book, conducting a longitudinal government study, publishing a peer-reviewed paper, or rigorously modeling an argument formally.

Here is an example:

The above sentiment is “high-level.” It’s similar to the kind offered in Pol.is polls, like this one run by the CIP over the summer. It’s a claim that people certainly care about, and if you’re relying on everyday people alone, either entering data online or participating in an in-person citizen assembly, this may be the level of rigor in reasoning one might expect. At the Society Library, however, we instead build arguments. This includes qualifying the reasoning step by step, as in this example:

To justify that the high-level sentiment (the conclusion) is true, the reasoning needs to be fleshed out (as in the example above) and evidence should be surfaced to check whether the premises are true. That’s a lot of work to ask citizens to perform, so we try to do that work as a service to citizens. And we don’t stop there.
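As a rough sketch of what “building arguments” means in code terms: a conclusion only stands if each premise supporting it is itself backed by evidence. The types and field names below are hypothetical, not our internal representation:

```python
from dataclasses import dataclass, field

@dataclass
class Premise:
    """One step of reasoning, which should be backed by evidence."""
    text: str
    evidence: list[str] = field(default_factory=list)  # URLs of supporting references

@dataclass
class Argument:
    """A high-level sentiment (the conclusion) and the premises behind it."""
    conclusion: str
    premises: list[Premise]

    def is_supported(self) -> bool:
        # The conclusion only stands if every premise has surfaced evidence.
        return bool(self.premises) and all(p.evidence for p in self.premises)
```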

At the Society Library, we are taking a unique, all-in-one approach to aggregating sentiments, deliberation, and decision-making.

Our Decision-Making Model Process:

What makes our process “top-down” is that we’re trying to do as much work as possible for the voters before they need to vote. This is important because our experience shows that it takes more time than the average citizen has to rigorously research and make sense of complex issues, even at local and state scales. This is why building out deliberation maps, and using them to seed decision-making models, is so important: people’s preferences may change once they have additional information, but they may not necessarily have the time to gather and think through it all themselves.

For example, in 2021 we were asked to research and map a debate regarding California’s last remaining nuclear power plant, Diablo Canyon. In an effort requiring over 8,000 human research hours, we found and structured 5,862 arguments, claims, and evidence extracted from thousands of references. These references included everything from economic impact assessments and hearings about seismic safety to content on NGO websites, news, and activist TikToks. These data points collectively expressed the economic, environmental, safety, political, and social concerns of stakeholders in the debate about whether the plant should remain open or closed. We cleaned and consolidated these points, and then performed as much fact-checking, steel-manning, and reference checking as permitted within the limits of our grant.

This means not only collecting arguments, claims, and evidence from verifiable sources, but also linking them across sources, as shown below:

It is important to note that while our process is top-down and does not rely on the same kind of crowdsourcing one would see in a Pol.is poll, we are pulling data from the crowd. The difference is that we are not requiring the “crowd” to communicate on a single platform. Instead, we collect sentiments from everywhere we can, where they have already been expressed: television, books, textbooks, government documents, interviews, social media posts, documentaries, videos, websites…anywhere people are choosing to express their opinions and make their arguments.

This allows us to map an issue in a comprehensive, inclusive, informed, and rigorous way — without losing the context, the evidence, and the sources they come from. Plus, it allows us to go deeper. For example, in our work mapping debates about California’s last remaining nuclear power plant, we found a claim that 1.5 billion larval fish and fish eggs were being sucked up into the system, which uses ocean water to cool the nuclear reactors. While it may seem far removed from the question of nuclear power, the claim was often made in the context of the plant’s overall environmental impact. By digging deep into government archives, we found a longitudinal study indicating that the sustainability of the species in question was not impacted, though others’ were. In a superficial (centralized) discussion of the topic, it’s very unlikely the debate would reach this level of detail or expertise (compare to Kialo, a popular crowdsourced debate platform, on a similar topic).

The maps produced by The Society Library go beyond what a single human mind is capable of storing and processing on a single topic. Even when we clean, consolidate, and deduplicate, we uncover complex information structures that accurately mirror the complexity of public opinions. Put most simply, our societal-scale debates are enormous.

Here is an example of a very superficial unpacking of an argument graph. Even though it’s a small unpacking (just three layers of arguments), it clearly shows how big the debates actually are:

This “debate map” was created by The Society Library, and shows only the first three layers of a map that contains 5,862 arguments, claims, and evidence extracted from over 5,000 references. It expresses the collective economic, environmental, safety, political, and social concerns of stakeholders in the debate over whether the last remaining nuclear power plant in California, Diablo Canyon, should remain open or closed.

For Diablo Canyon, of the thousands of arguments we collected, many were behind paywalls or locked in government archives, out of the reach of everyday citizens. Certainly no one citizen would be able to find them all on their own without spending the equivalent of four full-time human research years. We don’t believe many people are going to spend even 500 research hours on an issue of interest to them, and there’s no guarantee that citizens participating in polls, deliberative democracy events, or citizen assemblies have collectively performed the equivalent research. The average stakeholder likely has a haphazard, one-sided, limited, or biased view of an inherently complex issue.

In an attempt to overcome this, groups may provide voters with information before they participate in an assembly or online voting protocol. Organizations may prepare reports and briefs and instigate dialogue to ensure information sharing and to overcome any limitations of expertise in the room. However, setting aside for a moment the question of how rigorous and unbiased those briefing materials tend to be (again, thousands of research hours may be needed to adequately scope a complex debate space), the next emergent problem is what we call “the black box problem.”

The “black box problem” arises when voting protocols are simplified into a binary ‘yay/nay’ decision — without capturing why people made that decision. We don’t have a record of which arguments they are accepting and which they are rejecting, perhaps subconsciously — and why. There is no way to see into the minds of the voters to know how engaged they were with the subject, how informed their decisions were, or how emphatic their vote was.
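To illustrate what capturing the “why” might look like, here is a minimal sketch of a ballot that records a voter’s stance on each individual argument, along with how much it matters to them, before rolling everything up into a final position. The structure and weighting rule are illustrative assumptions, not our exact micro-voting protocol:

```python
from dataclasses import dataclass

@dataclass
class ArgumentVote:
    """A voter's stance on one argument in the map, not just the final yay/nay."""
    argument_id: str
    accept: bool    # does the voter accept this argument?
    weight: float   # 0.0-1.0: how much this argument matters to them

@dataclass
class Ballot:
    voter_id: str
    argument_votes: list[ArgumentVote]

    def overall_position(self) -> float:
        """Roll per-argument stances up into a final score, keeping the 'why'."""
        total_weight = sum(v.weight for v in self.argument_votes)
        if total_weight == 0:
            return 0.0
        signed = sum(v.weight * (1 if v.accept else -1) for v in self.argument_votes)
        return signed / total_weight  # -1 (against) .. +1 (for)
```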

So, how can you ensure people have the research and information they need when it comes to making a decision, and ensure they are actually engaging with that information?

While there is currently no way to see into the minds of voters in a scalable way, the mission of The Society Library is to ensure people have, and engage with, the information they need when it’s time to make a choice. For any decision-maker, whether they’re an elected representative or a concerned citizen, we as a nonprofit do the work of gathering the relevant information and preparing it for them through products like our debate maps, papers, search, and cloud, ultimately embedding that information in the decision-making models themselves.

Let’s go back to the City Council mediation project as an example of how we’ve made the complex simple, and made otherwise private decision-making explicit. The goal was to make the model easy to use and the interface as simple as possible. Based on the feedback we got, we succeeded.

How did we build it, what is it, and what does it do?

The video below is a brief demonstration of our process that should help you understand more quickly than written words can.

Got it? Cool!

Although the issue for the City Council was much simpler than, for example, the debate surrounding the Diablo Canyon Nuclear Power Plant, we found at the beginning of our work that many citizens and City Council members were confused by the complexity of the topic. Many of those who weren’t confused were perhaps dismissively overconfident in their position on the subject. Once our model was built and stakeholders began to interact with our tool, we found that people were able to get a better sense of the issue, were able to take more informed positions, and were perhaps shaken a bit out of any one-sided preconceptions.

Now, the Society Library is focusing on improving decision-making on AI policy. First, the Society Library is gathering as many arguments, claims, and evidence as possible about different AI governance, policy, and other debate issues. The great news is that it looks like we’re going to be able to compress thousands of ‘human research hours’ into ‘AI minutes’ via automation.

However, much like the nuclear power plant policy issue, debates regarding artificial intelligence have already been shown to have thousands of dimensions; the city-level issue had fewer than 100. It is imperative that we digitize our decision-making models so that they can scale and accommodate different levels of complexity in decision-making. It should be just as easy to engage with a thousand-dimension decision-making model as with one that has just 100 dimensions. After all, while simple decision-making likely makes it easier to democratize decision-making, does democratizing decision-making necessarily result in better decision-making? It may be important for people to be able to opt in to more or less rigorous decision-making models, so long as they are aware of the difference. If linked to our debate maps, expanding and compressing the decision-making models would be feasible, given the maps’ ontological structuring.
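As a sketch of what that compression could look like: if each dimension in the decision model sits in a tree that mirrors the debate map’s ontology, scores on deep dimensions can be rolled up into their parents, so a voter can engage at whatever depth they choose. The tree structure and the simple averaging rule below are illustrative assumptions only:

```python
# Toy model: each dimension is a node in a tree mirroring the debate map's
# ontology. Leaf dimensions carry voter scores in [-1, 1]; parents average
# their children, so the same model can be read at any depth.

def rollup(node: dict) -> float:
    """Recursively average child scores into a parent score."""
    children = node.get("children", [])
    if not children:
        return node["score"]
    return sum(rollup(child) for child in children) / len(children)

model = {
    "name": "Should the plant stay open?",
    "children": [
        {"name": "economic impact", "score": 0.6},
        {"name": "environmental impact", "children": [
            {"name": "larval fish entrainment", "score": -0.4},
            {"name": "carbon-free generation", "score": 0.8},
        ]},
    ],
}

print(rollup(model))  # one compressed, top-level reading of many dimensions
```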

So, that’s what we’re hoping to do!

In the meantime, users and those familiar with our decision-making model seem satisfied:

We’re in discussions with some of our friends in the field who may be interested in teaming up and supporting the digitization of our work. If so, we will have not only hundreds of AI policy debate maps, but also decision-making models that enable people to act on the intelligence we’re collecting and make more informed decisions.

Until then, thanks for reading!


The Society Library

A non-profit library of society’s ideas, ideologies, and world-views. Focusing on improving the relationship between people and information.