Publications

|

Ready for You When You Are Back: Content-driven Session-based Recommendation for Continuity of Experience

Authors: Brijraj Singh, Sonal Dabral, Niranjan Pedanekar
AAAI | 2025

|

Enhancing Entertainment Translation for Indian Languages using Adaptive Context, Style and LLMs

Authors: Pratik Rakesh Singh, Mohammadi Zaki, Pankaj Wasnik
AAAI | 2025

|

EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion

Authors: Ashish Gudmalwar, Ishan Biyani, Nirmesh Shah, Pankaj S. Wasnik, Rajiv R. Shah
AAAI | 2025

|

LLM-BRec: Personalizing Session-based Social Recommendation with LLM-BERT Fusion Framework

Authors: Raksha Jalan, Tushar Prakash, Niranjan Pedanekar
Generative Information Retrieval (Gen-IR) workshop at the SIGIR 2024 conference | July 2024

|

DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing

Authors: Neha Sahipjohn, Ashishkumar Gudmalwar, Nirmesh Shah, Pankaj Wasnik, Rajiv Ratn Shah (IIIT Delhi)
INTERSPEECH | September 2024

|

VECL-TTS: Voice Identity and Emotional Style Aware Cross-Lingual TTS

Authors: Ashishkumar Gudmalwar, Nirmesh Shah, Sai Akarsh, Pankaj Wasnik, Rajiv Ratn Shah (IIIT Delhi)
INTERSPEECH | September 2024

|

Cross-Modal Fusion and Attention Mechanism for Weakly Supervised Video Anomaly Detection

Authors: Ayush Ghadiya, Purbayan Kar, Vishal Chudasama, Pankaj Wasnik
Computer Vision and Pattern Recognition (CVPR) 7th MULA Workshop | June 2024

|

Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning

Authors: Shivam Ratnakant Mhaskar, Nirmesh Shah, Mohammadi Zaki, Ashishkumar Gudmalwar, Pankaj Wasnik, Rajiv Ratn Shah (IIIT Delhi)
Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL): Findings | June 2024

|

Efficacy of Large Language Models in Predicting Hindi Movies' Attributes: A Comprehensive Survey and Content-Based Analysis

Authors: Prabir Mondal (IIT Patna), Siddharth Singh (IIT Patna), Kushum (IIT Patna), Sriparna Saha (IIT Patna), Jyoti Prakash Singh (IIT Patna), Brijraj Singh, Niranjan Pedanekar
WebConf (WWW) | May 2024

|

Optimizing Movie Selections: A Multi-Task, Multi-Modal Framework with Strategies for Missing Modality Challenges

Authors: Subham Raj (IIT Patna), Pawan Agrawal (IIT Patna), Sriparna Saha (IIT Patna), Brijraj Singh, Niranjan Pedanekar
ACM Symposium on Applied Computing (SAC) | April 2024

|

Estimation of individual causal effects in network setup for multiple treatments

Authors: Abhinav Thorat, Ravi Kolla, Niranjan Pedanekar, Naoyuki Onoe
38th Annual Association for the Advancement of Artificial Intelligence (AAAI) Conference on Artificial Intelligence [Graphs and Complex Structure for Learning and Reasoning (GCLR) Workshop] | February 2024

|

Open-set Object Detection By Aligning Known Class Representations

Authors: Hiran Sarkar, Vishal Chudasama, Naoyuki Onoe, Pankaj Wasnik, Vineeth Balasubramanian (IIT Hyderabad)
Winter Conference on Applications of Computer Vision (WACV) | January 2024

|

Efficient infusion of self-supervised representations in Automatic Speech Recognition

Authors: Darshan Prabhu, Saiganesh Mirishkar, Pankaj Wasnik
Poster presentation at the Neural Information Processing Systems (NeurIPS) 3rd Workshop | December 2023

|

Enhancing Social Recommendation with Multi-View BERT Network

Authors: Tushar Prakash, Raksha Jalan, Naoyuki Onoe
IEEE International Conference on Data Mining (ICDM) | December 2023

|

Fiducial Focus Augmentation for Facial Landmark Detection

Authors: Purbayan Kar, Vishal Chudasama, Naoyuki Onoe, Pankaj Wasnik, Vineeth Balasubramanian (IIT Hyderabad)
British Machine Vision Conference (BMVC) | November 2023

|

Impulsion of Movie's Content-Based Factors in Multi-Modal Movie Recommendation System

Authors: Prabir Mondal (IIT Patna), Pulkit Kapoor (IIT Patna), Siddharth Singh (IIT Patna), Sriparna Saha (IIT Patna), Naoyuki Onoe, Brijraj Singh
International Conference on Neural Information Processing (ICONIP) | November 2023

|

LLM Based Generation of Item-Description for Recommendation System

Authors: Arkadeep Acharya, Brijraj Singh, Naoyuki Onoe
Recommender Systems Conference (RECSYS) | September 2023

|

CR-SoRec: BERT driven Consistency Regularization for Social Recommendation

Authors: Tushar Prakash, Raksha Jalan, Brijraj Singh, Naoyuki Onoe
Recommender Systems Conference (RECSYS) | September 2023

|

Iteratively Improving Speech Recognition and Voice Conversion

Authors: Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe
INTERSPEECH | August 2023

|

Cd-HRNN: Content-Driven HRNN to Improve Session-Based Recommendation System

Authors: Sonal Dabral, Brijraj Singh, Naoyuki Onoe
International Joint Conference on Neural Networks (IJCNN Main Conference) | April 2023

|

A Multi-Modal Multi-Task Based Approach for Movie Recommendation

Authors: Sriparna Saha (IIT Patna), Naoyuki Onoe
International Joint Conference on Neural Networks (IJCNN Main Conference) | April 2023

|

A Meta-Learning Based Generative Model with Graph Attention Network for Multi-Modal Recommender Systems

Authors: Sriparna Saha (IIT Patna), Naoyuki Onoe
International Neural Network Society Workshop on Deep Learning Innovations and Applications (INNS DLIA)/International Joint Conference on Neural Networks (IJCNN) | April 2023

|

Task-Specific and Graph Convolutional Network Based Multi-Modal Movie Recommendation System in Indian Setting

Authors: Sriparna Saha (IIT Patna), Naoyuki Onoe
International Neural Network Society Workshop on Deep Learning Innovations and Applications (INNS DLIA)/International Joint Conference on Neural Networks (IJCNN) | April 2023

|

Revisiting Class Imbalance for End-to-end Semi-Supervised Object Detection

Authors: Purbayan Kar, Vishal Chudasama, Pankaj Wasnik, Naoyuki Onoe
Efficient Deep Learning for Computer Vision (ECV) Workshop in Computer Vision and Pattern Recognition (CVPR) | April 2023

|

Nonparallel Emotional Voice Conversion For Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing

Authors: Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe
The International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | February 2023

|

Hierarchical disentangled representation learning for singing voice conversion

Authors: Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji
The International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | February 2023

|

Graph Network based Approaches for Multi-modal Movie Recommendation System

Authors: Daipayan Chakder (IIT Patna), Prabir Mondal (IIT Patna), Subham Raj (IIT Patna), Sriparna Saha (IIT Patna), Angshuman Ghosh, Naoyuki Onoe
IEEE International Conference on System, Man, and Cybernetics (SMC) | November 2022

|

Semi-supervised Acoustic and Language Modeling for Hindi ASR

Authors: Tarun Sai Bandarupalli (IISc Bangalore), Shakti Rath (IISc Bangalore), Nirmesh Shah, Naoyuki Onoe, Sriram Ganapathy (IISc Bangalore)
INTERSPEECH | September 2022

|

Towards Developing a Multi-Modal Video Recommendation System

Authors: Sriram Pingali (IIT Patna), Prabir Mondal (IIT Patna), Daipayan Chakder (IIT Patna), Sriparna Saha (IIT Patna), Angshuman Ghosh
International Joint Conference on Neural Networks (IJCNN) | September 2022

|

Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer

Authors: Shrutina Agarwal (IISc Bangalore), Sriram Ganapathy (IISc Bangalore), Naoya Takahashi
INTERSPEECH | September 2022

|

M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation

Authors: Vishal Chudasama, Purbayan Kar, Ashish Gudmalwar, Nirmesh Shah, Pankaj Wasnik, Naoyuki Onoe
Conference on Computer Vision and Pattern Recognition (CVPR) | June 2022

|

A Unified Model for Fingerprint Authentication and Presentation Attack Detection

Authors: Additya Popli (IIIT Hyderabad), Saraansh Tandon (IIIT Hyderabad), Joshua J. Engelsma (Michigan State University), Naoyuki Onoe, Atsushi Okubo, Anoop Namboodiri (IIIT Hyderabad)
International Joint Conference on Biometrics (IJCB) | April 2021

|

End-to-end lyrics Recognition with Voice to Singing Style Transfer

Authors: Sakya Basak (IISc Bangalore), Shrutina Agarwal (IISc Bangalore), Sriram Ganapathy (IISc Bangalore), Naoya Takahashi
International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | February 2021

Affiliation abbreviations: IIIT Hyderabad = International Institute of Information Technology, Hyderabad; IIT Patna = Indian Institute of Technology Patna; IISc Bangalore = Indian Institute of Science, Bangalore.
