Video observation as a method for researching natural human-robot interaction in public spaces

At ZEN-MRI, we are concerned not only with evaluating the coexistence of humans and robots in public spaces, but also with developing and improving neutral and independent research methods. In psychology, we have various tools at our disposal for this purpose. We can survey people (using interviews or questionnaires), invite them to focus groups or workshops, or measure their physiological reactions (e.g. heart rate or facial expressions).

Another method is to observe from the outside how people interact with robots and how they react to them. This observation method can be extended with video cameras, so that the complex interaction between humans and robots can be viewed repeatedly and analyzed in detail.

What is video observation?

Video observation is an established research method that makes it possible to observe and analyze human behavior using video recordings. In research on human-robot interaction, it is used in several ways to record and understand behaviors and reactions:

  • Identification of behavioral patterns: By analyzing video recordings, researchers can identify behavioral patterns such as evasive movements, gaze behavior and interaction attempts, which provide information about how people react to robots. Emotional reactions can also be evaluated, which allows conclusions to be drawn about user acceptance.
  • Evaluation of designs: Video observation makes it possible to evaluate the effects of different robot designs on the behavior and reactions of users and thus improve the design of future robots.
  • Safety assessments: By observing interactions, potential safety risks can be identified and mitigation measures can be developed. An important focus of the ZEN-MRI project is the identification and prevention of conflict situations between humans and robots in public spaces.

Advantages of video observation compared to other methods:

  • Non-invasive: The method does not interfere with the natural behavior of the participants and enables authentic recording of reactions. In addition, people do not have to be invited specifically; every encounter with the robot can be included in the evaluation. This means that a large number of different people and groups can be observed and taken into account when improving robot designs.
  • Scalability: Video observation can be used in different environments and is therefore extremely flexible.
  • Detailed analysis: The ability to view and analyze recordings repeatedly allows researchers to examine and understand fine details of interactions.

Challenges and ethical considerations:

Despite its benefits, video observation also raises challenges and ethical questions, particularly in relation to data protection and privacy. Researchers must ensure that individuals are informed about the recording and about their rights, and that recordings are anonymized where possible. Secure storage of the video data and restrictions on its access and use are also essential. In ZEN-MRI, we have a competent partner in the consortium, the Institute for Digital Ethics (IDE) at Stuttgart Media University, which monitors compliance with these requirements. In past video observations in Ulm, we ensured that passers-by had the option of taking alternative routes if they did not wish to be recorded. We also displayed extensive information documents on site and made them available online to explain data protection rights. As further data protection measures, we developed a storage and deletion concept and defined encryption and access restrictions for the video data.

The evaluation of video data also presents a number of challenges, particularly because the interpretation of behavior depends heavily on the person evaluating it, which limits objectivity. To address this challenge, we are working closely with the Institute of Media Informatics to explore the potential of computer vision and AI technologies for a more objective analysis. It should also be noted that field research is generally less standardized than laboratory research, which can affect the comparability and generalizability of results.
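To make the objectivity concern concrete: one common way to quantify how far two human coders agree on the same clips is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The following is a minimal sketch only. The text above does not say that ZEN-MRI uses this measure, and the category labels and coder data are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assigned one category per clip."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same, non-empty set of clips")
    n = len(rater_a)
    # Observed agreement: share of clips where both raters chose the same code.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: derived from each rater's category frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes assigned by two coders to the same six clips.
coder_1 = ["evade", "evade", "approach", "ignore", "approach", "ignore"]
coder_2 = ["evade", "approach", "approach", "ignore", "approach", "evade"]
kappa = cohens_kappa(coder_1, coder_2)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance. Reporting such a figure alongside coded video data is one way to show how much the results depend on who did the coding.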


Video observation is an extremely valuable method for researching human-robot interaction in public spaces. Through the detailed analysis of behavioral patterns and interaction dynamics, it helps to improve the design, safety and user acceptance of robots and to advance the development of these technologies.


We have already carried out two video observations in ZEN-MRI. Initial findings from the more than 10 hours of recorded interactions will be shared in an upcoming article. Further video observations, as well as evaluations and publications based on the video data, are planned. Among other things, we will address the following questions:

  • To what extent can computer vision and artificial intelligence be used effectively to anonymize people in video data? How can the pre-selection of relevant interaction events in video data be optimized? And what potential does the automated evaluation of behavioral patterns and interactions offer for large volumes of video data?
  • What advantages does lidar technology offer compared to video observation when recording the movement behavior of passers-by in public spaces with robots?
  • How can the comparability and generalizability of research results from different observational studies on human-robot interaction be improved? Which criteria and categories should be considered in a classification system for structuring video data from different environments and contexts with robots?
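As an illustration of the pre-selection question in the first bullet, a naive baseline would be to flag time spans in which a tracked person stays close to the robot and to review only those spans by hand. The sketch below assumes that positions have already been extracted (for example from video tracking or lidar). The `Sample` structure, the distance threshold and the minimum duration are hypothetical choices, not the project's method.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Sample:
    t: float          # time in seconds since the start of the recording
    person_xy: tuple  # (x, y) position of one tracked person, in metres
    robot_xy: tuple   # (x, y) position of the robot, in metres


def preselect_events(samples, max_dist=3.0, min_duration=1.0):
    """Return (start, end) spans where the person stays within max_dist of the robot."""
    events = []
    start = None
    last_close = None
    for s in samples:
        dist = hypot(s.person_xy[0] - s.robot_xy[0],
                     s.person_xy[1] - s.robot_xy[1])
        if dist <= max_dist:
            if start is None:
                start = s.t  # a new candidate span begins
            last_close = s.t
        elif start is not None:
            # The person moved away; keep the span only if it lasted long enough.
            if last_close - start >= min_duration:
                events.append((start, last_close))
            start = None
    # Close a span that is still open at the end of the recording.
    if start is not None and last_close - start >= min_duration:
        events.append((start, last_close))
    return events


# Hypothetical track: a person walks past a stationary robot at the origin.
track = [Sample(float(i), (10.0 - 2.0 * i, 0.0), (0.0, 0.0)) for i in range(9)]
events = preselect_events(track)
```

A purely distance-based filter like this will miss interactions at a distance, such as people watching the robot from afar, which is one reason the question of how to optimize pre-selection remains open.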