New paper: Listening beyond seeing

Our new paper, “Listening beyond seeing: Event-related potentials to audiovisual processing in visual narrative,” has just been published in Brain and Language. My collaborator Mirella Manfredi carried out this study, which builds on her previous work looking at different types of words (Pow! vs. Hit!) substituted into visual narrative sequences.

Here, Mirella showed visual narratives where the climactic event either matched or mismatched auditory sounds or words. For example, as in the figure to the right, a panel showing Snoopy spitting would be accompanied by the sound of spitting or the word “spitting.” In the incongruous condition, the panel was instead paired with a mismatching sound, like the sound of something getting hit, or a mismatching word, like “hitting.”

We measured participants' brainwave responses (ERPs) to these panels/sounds. We found that these stimuli elicited an “N400 response,” which occurs during the processing of meaning in any modality (words, sounds, images, video, etc.). Although the overall semantic processing response (N400) was similar for both stimulus types, the incongruous sounds evoked a slightly different response across the scalp than the incongruous words. This suggests that, although the overall process of computing meaning is similar, these stimuli may be processed in different parts of the brain.

In addition, these response patterns closely resembled what is typically found when words or sounds are presented in isolation, and did not resemble what often appears to images. This suggests that, although the multimodal image-sound/word interaction determined whether stimuli were congruent or incongruent, the semantic processing of the images did not seem to factor into the responses (or was equally subtracted out across stimulus types).

So, overall, this implies that semantic processing across different modalities elicits a similar response (N400), but may engage different neural areas.

You can find the paper here (pdf) or along with my other downloadable papers.


Every day we integrate meaningful information coming from different sensory modalities, and previous work has debated whether conceptual knowledge is represented in modality-specific neural stores specialized for specific types of information, and/or in an amodal, shared system. In the current study, we investigated semantic processing through a cross-modal paradigm which asked whether auditory semantic processing could be modulated by the constraints of context built up across a meaningful visual narrative sequence. We recorded event-related brain potentials (ERPs) to auditory words and sounds associated to events in visual narratives—i.e., seeing images of someone spitting while hearing either a word (Spitting!) or a sound (the sound of spitting)—which were either semantically congruent or incongruent with the climactic visual event. Our results showed that both incongruent sounds and words evoked an N400 effect, however, the distribution of the N400 effect to words (centro-parietal) differed from that of sounds (frontal). In addition, words had an earlier latency N400 than sounds. Despite these differences, a sustained late frontal negativity followed the N400s and did not differ between modalities. These results support the idea that semantic memory balances a distributed cortical network accessible from multiple modalities, yet also engages amodal processing insensitive to specific modalities.

Full reference:

Manfredi, Mirella, Neil Cohn, Mariana De Araújo Andreoli, and Paulo Sergio Boggio. 2018. “Listening beyond seeing: Event-related potentials to audiovisual processing in visual narrative.” Brain and Language 185:1-8. doi:

