Natural Language Negotiation

15 Oct

LEWIS HAMMOND

Institutional Problem or Research Question

Describe what the open institutional problem or research question you’ve identified is, what features make it challenging, and how people deal with it currently.

Negotiations permeate many kinds of intra- and inter-institutional decision-making. Such processes can be arduous, requiring lengthy deliberations and taking a heavy toll on human participants (especially when the stakes are high). These costs also mean that negotiators often fail to reach the Pareto frontier of the space of possible deals; that is, they settle on deals that could still be improved for at least one party without making any other party worse off. Successful negotiation requires an understanding of contextual information, strategic planning, and strong communication skills.

At present, people engage in negotiations themselves (often informally) or, in higher-stakes scenarios, employ professionals such as lawyers or estate agents to negotiate on their behalf. In the near future, individuals may have their own personal AI assistants – in essence, more sophisticated versions of Apple’s Siri or Amazon’s Alexa – which may need to engage in many lower-stakes negotiations on behalf of their users, such as finding mutually convenient meeting times.

Possible Solution

Describe what your proposed solution is and how it makes use of AI. If there’s a hypothesis you’re testing, what is it? What makes this approach particularly tractable? How would you implement your solution?

Recent advances in language modelling open the door to sophisticated negotiating agents that might be deployed on behalf of human principals. The fact that these agents output natural language (as opposed to a domain-specific programming language or logic) would allow them both to interact actively with their principals (asking them questions to clarify their preferences, for example) and to be monitored by them. Fundamentally, the hypothesis driving this suggestion is that it is often easier to check the result of a process (e.g., a deal that the human principals can then accept or reject) than to execute that process oneself (the act of coming up with and continually refining a deal).

An implementation of AI natural language negotiators (NLNs) could make use of existing models (such as OpenAI’s GPT-4 or Anthropic’s Claude) via their APIs, with additional tooling or prompt engineering on top. Ideally, there would also be some kind of platform that allows for communication between negotiators and their principals, and for the submission of various documents by the different parties (such as legal contracts, financial details, or details of previous negotiations).

Method of Evaluation

Describe how you will know if your solution works, ideally at both a small and large scale. What resources and stakeholders would you require to implement and test your solution?

A natural candidate for the measure of success of a negotiation, from an impersonal perspective, is the joint welfare of the principals. We might also wish to consider the way in which individual benefits are traded off, for instance by ensuring some kind of fairness constraint. There are many ways that this welfare could be measured, which may also depend on the specific context of the negotiation (e.g., expected profits, number of individually preferred conditions met, etc.).
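As a minimal sketch of how such measures might be computed, the Python snippet below scores candidate deals by three common notions of joint welfare and applies a simple fairness constraint. It assumes each principal's benefit from a deal has already been reduced to a single number; the function names, the example deals, and the choice of a "maximum utility gap" as the fairness constraint are illustrative assumptions rather than part of the proposal.

```python
from math import prod

def utilitarian_welfare(utilities):
    """Joint welfare as the sum of the principals' utilities."""
    return sum(utilities)

def nash_welfare(utilities, disagreement=None):
    """Product of each principal's gain over their no-deal payoff."""
    if disagreement is None:
        disagreement = [0.0] * len(utilities)
    gains = [u - d for u, d in zip(utilities, disagreement)]
    if any(g < 0 for g in gains):
        return 0.0  # someone is worse off than with no deal
    return prod(gains)

def egalitarian_welfare(utilities):
    """Welfare of the worst-off principal."""
    return min(utilities)

def satisfies_fairness(utilities, max_gap):
    """Hypothetical fairness constraint: bounded gap between best and worst off."""
    return max(utilities) - min(utilities) <= max_gap

# Illustrative deals, each mapped to (utility of principal 1, utility of principal 2).
deals = {"A": (9.0, 1.0), "B": (5.0, 4.0), "C": (4.0, 4.0)}

# Highest total welfare overall, ignoring fairness.
best_overall = max(deals, key=lambda d: utilitarian_welfare(deals[d]))

# Highest total welfare among deals that satisfy the fairness constraint.
fair_deals = {d: u for d, u in deals.items() if satisfies_fairness(u, max_gap=2.0)}
best_fair = max(fair_deals, key=lambda d: utilitarian_welfare(fair_deals[d]))
```

In this toy example, deal A maximises the sum of utilities but is very lopsided, whereas deal B is preferred once the fairness constraint is imposed; which measure is appropriate would depend on the context of the negotiation.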

Extensive testing would be needed before such systems could be reliably deployed in the real world. Ideally, NLNs would be consistently shown to achieve better outcomes than human negotiators (as judged either by some impersonal measure of welfare, or individually by their human principals). Existing negotiation classes and exercises might provide a natural first testbed for NLNs. Following this, NLNs could be tested in well-scoped, lower-stakes environments, where the degree of human oversight is high, and where the option to ignore or reject the deal found by the NLNs is preserved.

Risks and Additional Context

What are the biggest risks associated with this project? If someone is strongly opposed to your solution or if it is tried and fails, why do you think that was? Is there any additional context worth bearing in mind?

The most obvious risk posed by NLNs is that they may learn to exploit weaker AI systems or their human counterparts. Such exploitative practices could, in some situations, be a natural result of attempting to secure the best deal for oneself. Even if a human counterpart is required to approve the deal before it is agreed upon, it might be possible for NLNs to deceive or bully the human into giving that approval. The benchmarking of NLNs therefore provides an important opportunity for the study of more cooperative AI systems. One possible way to avoid such exploitation is the use of a single, impartial AI mediator, instead of multiple NLNs. Due to the benefits to be had from personalised and private AI systems, however, this may not be realistic.

Other potential challenges include ensuring that confidential materials in negotiations remain confidential (perhaps via the use of privacy-preserving technologies) and the displacement of human labour in relevant professions.

Next Steps

Outline the next steps of the project and a roadmap for future work. What are your biggest areas of uncertainty?

The first step would be to construct both a prototype NLN based on an existing language model, and a platform via which two NLNs can interact and be provided with inputs to a negotiation scenario. As a proof of concept, this platform could be relatively lightweight, and the construction of the NLN could simply be done using some careful prompt engineering.
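To make the idea of a lightweight platform concrete, the following Python sketch shows one possible turn-taking loop between two negotiators. It is only an illustration under stated assumptions: the negotiators here are deterministic stand-in functions rather than calls to any real language model API, and the message convention ("OFFER <price>" and "ACCEPT"), the function names, and the briefs are all hypothetical. In a real prototype, each negotiator would instead build a prompt from its private brief and the transcript and send it to a language model.

```python
from typing import Callable, List, Optional, Tuple

Message = Tuple[str, str]                         # (speaker, text)
Negotiator = Callable[[str, List[Message]], str]  # (private brief, transcript) -> reply
ACCEPT = "ACCEPT"

def run_negotiation(agent_a: Negotiator, agent_b: Negotiator,
                    brief_a: str, brief_b: str,
                    max_turns: int = 10) -> Tuple[List[Message], Optional[str]]:
    """Alternate turns until one side accepts the other's last message."""
    transcript: List[Message] = []
    sides = [("A", agent_a, brief_a), ("B", agent_b, brief_b)]
    for turn in range(max_turns):
        name, agent, brief = sides[turn % 2]
        reply = agent(brief, transcript)
        transcript.append((name, reply))
        if reply == ACCEPT and len(transcript) > 1:
            # The agreed deal is the message that was just accepted.
            return transcript, transcript[-2][1]
    return transcript, None  # no agreement within the turn limit

def make_stub_negotiator(start: int, step: int,
                         acceptable: Callable[[float], bool]) -> Negotiator:
    """Stand-in for an LLM-backed negotiator; it ignores its brief."""
    def agent(brief: str, transcript: List[Message]) -> str:
        if transcript and transcript[-1][1].startswith("OFFER"):
            price = float(transcript[-1][1].split()[1])
            if acceptable(price):
                return ACCEPT
        own_turns = len(transcript) // 2  # turns alternate strictly
        return f"OFFER {start + step * own_turns}"
    return agent

# Hypothetical seller and buyer with simple concession strategies.
seller = make_stub_negotiator(150, -10, lambda p: p >= 120)
buyer = make_stub_negotiator(100, 10, lambda p: p <= 130)
transcript, deal = run_negotiation(seller, buyer, "seller brief", "buyer brief")
```

Keeping the negotiators behind a plain function signature means a prompt-engineered NLN could later be swapped in for a stub without changing the loop, and the full transcript remains available for the principals to inspect.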

Next, it would be useful to build a benchmark set of negotiation scenarios, each consisting of two principals’ preferences and private information, a description of the commonly known information, such as the setting and any previous interactions between the principals, and a method of scoring outcomes of the negotiation process. These scenarios could build on exercises from courses on negotiation, for example. For tractability and reproducibility, it could be useful to have the principals in the benchmarks also be represented by language models.
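One way such a benchmark scenario might be represented is sketched below in Python. The class and field names are illustrative assumptions, as is the simplification that a deal is a dictionary of numeric terms and that each principal's preferences can be summarised by a utility function for scoring purposes; the natural-language fields are what would be shown to the negotiators.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Deal = Optional[Dict[str, float]]  # None means no agreement was reached

@dataclass
class Principal:
    name: str
    preferences: str    # natural-language preferences given to this side's NLN
    private_info: str   # information known only to this side
    utility: Callable[[Dict[str, float]], float]  # used only for scoring
    no_deal_utility: float = 0.0

@dataclass
class Scenario:
    title: str
    common_knowledge: str  # setting and any previous interactions
    principals: Tuple[Principal, Principal]

    def score(self, deal: Deal) -> Dict[str, float]:
        """Score an outcome separately for each principal."""
        if deal is None:
            return {p.name: p.no_deal_utility for p in self.principals}
        return {p.name: p.utility(deal) for p in self.principals}

# A made-up example scenario: a one-issue sale with private reservation prices.
scenario = Scenario(
    title="Used bicycle sale",
    common_knowledge="The bicycle was listed online; the parties have not met before.",
    principals=(
        Principal("seller", "Wants the highest price.",
                  "Will not go below 100.", lambda d: d["price"] - 100.0),
        Principal("buyer", "Wants the lowest price.",
                  "Cannot pay more than 150.", lambda d: 150.0 - d["price"]),
    ),
)

scores = scenario.score({"price": 130.0})
```

The per-principal scores returned by `score` could then be fed into whichever joint welfare or fairness measure the benchmark adopts.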

Following this, it would be interesting to assess how well NLNs compare on this benchmark not simply to each other (if further NLNs are developed in future), but also to the human negotiators whom they might eventually replace. Testing this would require the collection and evaluation of human data, via a platform such as Amazon Mechanical Turk or Surge. Eventually, these results could be compiled into a publication. It might also be of interest to host a contest at a major AI conference to encourage the design of safe and cooperative NLNs.

Saffron Huang https://cip.org