Dr. Vaibhava Goel

Vaibhava is our advisor and consultant on multi-modal research. He provides research and guidance in a number of areas of theoretical computer science, including approximation algorithms, combinatorics, complexity theory, computational geometry, distributed systems, learning theory, online algorithms, cryptography, and quantum computing.

Dr. Goel has been with IBM's T. J. Watson Research Center since December 2000, after completing a Ph.D. in Biomedical Engineering at Johns Hopkins University in Baltimore, MD.

For the last several years, Vaibhava has focused on the challenges of speech recognition by computers. This difficult problem draws upon a number of disciplines, such as statistics, probability theory, machine learning, information theory, and linguistics.

Dr. Goel’s work spans a wide range of areas in the human-computer interaction field. Some of the important themes of his research include:

Visualization. From discovering successful disease treatments in health care data to uncovering anomalous behaviors in social networks, easy-to-use visualizations are critical for helping people see important patterns in big data sets.

Cognitive & user modeling. Understanding users is at the heart of HCI research. By creating models of users’ cognitive processes, we can help detect usability problems in software systems before they lead to costly mistakes.

Usable mobile security. Security often gets in the way of the task at hand. He is investigating intuitive and natural methods for creating secure mobile apps based on biometric authentication.

Mobile collaboration and learning. Mobile field workers often need access to information and expertise on the job site. We are creating systems that enable workers to collaborate and improve their expertise while remaining in the field.

Visual and natural language comprehension are rapidly evolving areas of artificial intelligence (AI). A prime example is image captioning – the task of generating one or more natural language descriptions for an image, relying solely on the visual input – which demonstrates a machine’s comprehension of the visual content as well as its ability to describe that content in natural language. The image captioning task continues to be a very active area of research in academic and industrial research labs.

I am very proud to announce that IBM Watson recently submitted its first entry to the Microsoft COCO Image Captioning Challenge, a competition that has been running since 2015, and is currently in the top spot on the leaderboard!

The results obtained by the Watson entry on the various evaluation metrics can be viewed on the CodaLab results page (row labeled “etiennem”) and on the MS COCO results page (Watson Multimodal entry under Table-C5 or Table-C40).