Benchmark on information processing multimedia systems

Now that information is largely digital and delivered through many media (sound, text, video ...), common standards are essential for qualifying the reliability of data processing systems. Such standards help integrators and consumers choose among the products available. Through international evaluation campaigns, they also let developers and researchers compare their approaches by quantifying how their systems perform on a specific task.

LNE organizes benchmarks of information processing systems which use different types of data:

  • text: machine translation, document classification, structuring and summarization, named entity recognition, question answering ...
  • speech: automatic speech recognition, language identification, speaker recognition, keyword spotting, translation ...
  • video and image: object recognition, head detection, person tracking, optical character recognition
  • sensor measurements: robotics and autonomous vehicles

The benchmarking data distributed by LNE is prepared in partnership with associations specialized in language technologies.

Open evaluations to help researchers and public institutions

LNE, as a trusted third party, conducts evaluation campaigns within the framework of projects funded by government organizations. In these campaigns, the systems of all participants are evaluated on the same data and at the same time. This method ensures fairness among participants and guarantees the reproducibility, repeatability and accuracy of the measurement. The assessments aim to:

  • evaluate the performance of competing systems using consistent evaluation methodologies,
  • estimate the state of the art in various areas of information technology,
  • promote research and a wider use of information technology in everyday life.

Analysing errors for developers and integrators

For developers and technology integrators, LNE offers analyses which highlight the strengths of the systems, as well as possible areas for improvement. The analysis is performed using data similar to those the final application will handle. The aim is to understand the factors that affect system performance. This approach aims to:

  • highlight the system’s strengths
  • help developers characterize the data for which processing can be improved

Evaluation approaches based on research conducted beforehand

In addition to benchmarking, LNE conducts research on evaluation protocols. To this end, it develops evaluation tools, carries out corpus analyses, and participates in international standardization activities.

Particular attention is given to evaluating the performance of complex, multi-technology systems in multimedia and multimodal information processing. For these, LNE is developing new, task-oriented metrics, evaluation methods, and sophisticated data annotation systems.

Relevant projects

Quaero

Quaero was a collaborative industrial research and innovation program which addressed the automatic processing, classification and use of multimedia and multilingual content. It was funded by OSEO and gathered 32 partners who collaborated on research, producing demonstrators, advanced application prototypes, and innovative services to access information present in media such as spoken language, images, video and music. The consortium included French and German public and private partners.

In this framework, LNE organised and conducted evaluations for:

  • Speech recognition
  • Speaker recognition and tracking
  • Detection of named entities in written and oral documents
  • Translation
  • Optical character recognition

These evaluations were used to help direct and adjust the research and development priorities of the Quaero project.

DEFI-REPERE

DEFI-REPERE was an evaluation project in the field of multimedia person recognition in videos. It was a 42-month project co-funded by the DGA (French defence procurement agency) and the ANR (French National Research Agency).

The project aimed at identifying persons present in audiovisual programs. The identities were derived from:

  • the image in which persons are visible,
  • pop-up texts in which the names of persons appear,
  • the soundtrack in which the voices of the speakers are recognizable,
  • the content of the speech signal in which the persons are named.

The systems' performance was benchmarked every year by LNE on French-language TV shows.

VERA – Advanced Error Analysis for Speech recognition

The VERA project, financed by ANR, aims at developing a methodology as well as generic tools for locating and diagnosing errors in automatic speech recognition (ASR) outputs. New measures are developed for a contrastive focus on the different types of error, depending on the application. The objective of this project is to investigate these errors in detail, in order to obtain an accurate diagnosis, enhance the evaluation tools and improve the performance of state-of-the-art ASR systems.

As part of this project, LNE is developing new metrics which take into account the relative importance of transcription errors according to the targeted application.
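The page does not describe the VERA metrics themselves, so the following is only an illustrative sketch of the general idea: a word error rate in which each error is weighted by how much the word matters to the application. The function name `weighted_word_error_rate`, the `weights` dictionary, and the cost scheme (substitutions and deletions cost the weight of the reference word, insertions the weight of the inserted word) are all assumptions made for this example, not LNE's method.

```python
from typing import Dict, List, Optional


def weighted_word_error_rate(
    reference: List[str],
    hypothesis: List[str],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Hypothetical importance-weighted WER (illustration only).

    Words missing from `weights` get a weight of 1.0, so calling the
    function without weights yields the standard word error rate.
    """
    weights = weights or {}

    def w(word: str) -> float:
        return weights.get(word, 1.0)

    n, m = len(reference), len(hypothesis)
    # cost[i][j]: minimal weighted edit cost between reference[:i] and hypothesis[:j]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + w(reference[i - 1])   # deletions only
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + w(hypothesis[j - 1])  # insertions only

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            ref_word, hyp_word = reference[i - 1], hypothesis[j - 1]
            match_or_sub = cost[i - 1][j - 1] + (0.0 if ref_word == hyp_word else w(ref_word))
            deletion = cost[i - 1][j] + w(ref_word)
            insertion = cost[i][j - 1] + w(hyp_word)
            cost[i][j] = min(match_or_sub, deletion, insertion)

    total_weight = sum(w(word) for word in reference)
    if total_weight == 0:
        raise ValueError("reference must contain at least one word with non-zero weight")
    return cost[n][m] / total_weight


if __name__ == "__main__":
    ref = "call doctor martin at nine".split()
    hyp = "call doctor marvin at nine".split()
    print(weighted_word_error_rate(ref, hyp))                   # unweighted WER
    print(weighted_word_error_rate(ref, hyp, {"martin": 3.0}))  # the name counts three times as much
```

In this sketch, a single wrong word out of five gives an unweighted rate of 1/5, while giving the proper name a weight of 3 raises the rate to 3/7, reflecting that the error matters more for an application such as named entity extraction.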

FaBiole – Reliability in biometric voice identification

This project, funded by ANR, belongs to the field of voice comparison. Voice comparison consists in expressing the likelihood that two voice recordings belong to the same person. Evaluating this task is not trivial, because it is important to assess whether an audio recording contains enough information about the speaker to actually recognize them. FaBiole aims at identifying which information is characteristic of the speaker and can be found in every recording, and at measuring its consistency.

IMM – Multimedia Multilingual Integration

This project is part of the Institute for Technological Research (IRT) SystemX. It belongs to the field of multimedia information processing, and aims at developing a platform able to synthesize the information present in video and textual data. In this context, LNE organizes benchmarks following three different protocols: a priori corpus appraisal, a posteriori corpus appraisal, and human-mediated usability testing. They enable developers from the project's seven companies to highlight both the strengths of the platform's systems and their areas for improvement. These evaluations also allow the funding institutions and use-case partners to estimate the project's progress.

SVA – Simulation for the safety of Autonomous Vehicles

Within the framework of the Autonomous Vehicle Plan of the New Face of Industry in France (Driverless vehicles), the SVA Project tackles the safety and reliability of the devices which perform part or all of the driving task. The main goal is to characterize, model and simulate the safety of the autonomous vehicle. To validate simulations, LNE is working on characterization methodologies for on-board sensors. LNE also evaluates the driving decision algorithms in a variety of scenarios, building upon the sensor response modeling.

Relevant publications

O. Galibert, S. Rosset, C. Grouin, P. Zweigenbaum and L. Quintard, 2011, « Structured and Extended Named Entity Evaluation in Automatic Speech Transcriptions », Proceedings of IJCNLP, Thailand.

G. Gravier, G. Adda, N. Paulsson, M. Carré, A. Giraudel and O. Galibert, 2012, « The ETAPE corpus for the evaluation of speech-based TV content processing in the French language », Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey.

A. Giraudel, M. Carré, V. Mapelli, J. Kahn, O. Galibert and L. Quintard, 2012, « The REPERE Corpus: a multimodal corpus for person recognition », Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey.

S. Rosset, C. Grouin, K. Fort, O. Galibert, J. Kahn and P. Zweigenbaum, 2012, « Structured named entities in two distinct press corpora: Contemporary broadcast news and old newspapers », Proceedings of the Sixth Linguistic Annotation Workshop (LAW), 40-48.

