A.I. Alignment within the Framework of Predictive Processing

Alessia Cavallo, Hanna Greb, Julia Harzheim, Jonathan Mannhart


Given the rapid progress made in the field of Artificial Intelligence over the last couple of years, the creation of an Artificial Intelligence with cognitive abilities comparable to, or even surpassing, human ones seems increasingly likely. In pursuing this remarkable endeavour, it is increasingly crucial to take into account that the creation of any superintelligent A.I. would entail a great challenge: to carefully align such systems with human values and thereby ensure that they do not pose a threat to humanity.

In our project, we aim to address this issue of A.I. alignment by means of Predictive Processing, an increasingly popular model of human intelligent behaviour. Initially, we plan to gain a thorough understanding of Predictive Processing, including its core mechanism of prediction error minimisation, its relevance to different areas, and its limitations, through an extensive literature review and an additional psychological experiment. In doing so, particular emphasis will be placed on the role of values, the emergence of motivation and goals, and the process of decision-making within the model. Beyond analysing how these concepts come into play, we will formulate a conjecture about the normative maxims a possible future A.I. should act on, and the problems that would inevitably arise in doing so.
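To make the model's core idea concrete, the following toy sketch (our own illustration, not part of the project's methodology) shows prediction error minimisation in its simplest form: an agent holds a belief about a hidden cause, predicts the sensory signal it should receive, and nudges the belief in proportion to the precision-weighted prediction error. All names and parameter values (e.g. learning_rate, sensory_precision) are illustrative assumptions.

```python
import numpy as np

# Toy Predictive Processing loop (illustrative sketch, assumed parameters):
# the agent maintains a belief mu about a hidden cause, predicts the sensory
# signal via a trivial (identity) generative model, and updates mu using
# precision-weighted prediction errors balanced against a prior.

rng = np.random.default_rng(0)

true_cause = 2.0          # hidden state of the world (unknown to the agent)
mu = 0.0                  # agent's current belief about the cause
prior_mean = 0.0          # mean of the agent's prior belief
prior_precision = 1.0     # inverse variance of the prior
sensory_precision = 4.0   # inverse variance of the sensory signal
learning_rate = 0.05      # step size of the belief update

for step in range(200):
    # The world generates a noisy observation of the hidden cause.
    observation = true_cause + rng.normal(0.0, 1.0 / np.sqrt(sensory_precision))

    # Prediction error between incoming signal and the agent's prediction.
    sensory_error = observation - mu
    prior_error = prior_mean - mu

    # Gradient-style update: each error is weighted by its precision, so
    # more reliable signals pull the belief more strongly.
    mu += learning_rate * (sensory_precision * sensory_error
                           + prior_precision * prior_error)

print(f"true cause: {true_cause:.2f}, inferred belief: {mu:.2f}")
```

Note that the belief settles not on the true cause itself but on a precision-weighted compromise between prior and evidence (here roughly 1.6 rather than 2.0), which is precisely the feature of the model that makes the role of precision, values, and goals worth examining for alignment purposes.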

The outcome could thus be regarded as a rough proposal for how to align a Predictive Processing-based Artificial Intelligence with human interests and values.