
Abstract
Partial multi-label learning aims to extract knowledge from incompletely annotated data, in which each instance may carry known correct labels, known incorrect labels, and unknown labels. The core challenge lies in accurately identifying the ambiguous relationships between labels and instances. In this paper, we emphasize that matching co-occurrence patterns between labels and instances is key to addressing this challenge. To this end, we propose the Semantic Co-occurrence Insight Network (SCINet), a novel and effective framework for partial multi-label learning. Specifically, SCINet introduces a bi-dominant prompter module, which leverages an off-the-shelf multimodal model to capture text-image correlations and enhance semantic alignment. To reinforce instance-label interdependencies, we develop a cross-modality fusion module that jointly models inter-label correlations, inter-instance relationships, and co-occurrence patterns across instance-label assignments. Moreover, we propose an intrinsic semantic augmentation strategy that enhances the model's understanding of intrinsic data semantics by applying diverse image transformations, thereby fostering a synergistic relationship between label confidence and sample difficulty. Extensive experiments on four widely used benchmark datasets demonstrate that SCINet surpasses state-of-the-art methods.
Methodology

The proposed SCINet framework addresses partial multi-label learning through three key components: 1) a Bi-Dominant Prompter, which leverages CLIP-based text and visual encoders to capture semantic co-occurrence between labels and instances, enhanced by weak, medium, and strong image transformations; 2) a Cross-Modality Fusion Module, which optimizes label confidence by integrating instance similarity (via Gaussian-based local relationships) and label correlations (via Pearson coefficients) into a unified confidence matrix; and 3) Intrinsic Semantic Augmentation, which applies the three transformation strengths (weak, medium, strong) together with consistency losses and self-distillation to align semantic distributions across perturbations. Together, these components exploit global label-instance dependencies, refine multi-modal alignment, and improve robustness under partial supervision.
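The two ingredients named for the Cross-Modality Fusion Module, Gaussian-based instance similarity and Pearson label correlation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the bandwidth `sigma`, the 0/1 label encoding (unknown labels treated as 0), and the toy data are all assumptions made for illustration, and the way SCINet fuses the two matrices into a unified confidence matrix is not reproduced here.

```python
import math


def gaussian_similarity(features, sigma=1.0):
    """Pairwise Gaussian (RBF) affinity between instance feature vectors.

    `sigma` is a hypothetical bandwidth; the source does not state its value.
    """
    n = len(features)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            sim[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return sim


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def label_correlation(labels):
    """Label-by-label Pearson matrix from an (instances x labels) matrix."""
    num_labels = len(labels[0])
    columns = [[row[k] for row in labels] for k in range(num_labels)]
    return [[pearson(columns[a], columns[b]) for b in range(num_labels)]
            for a in range(num_labels)]


# Toy data (assumed): three instances with 2-D features, four instances with three labels.
features = [[0.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
labels = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]]

sim = gaussian_similarity(features)   # nearby instances get affinity close to 1
corr = label_correlation(labels)      # co-occurring labels get correlation close to 1
```

In this toy example the first two labels always appear together and never with the third, so their correlation is strongly positive while the correlation with the third label is strongly negative. In practice the features would come from the visual encoder rather than hand-written vectors.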
Citation
@article{WU2025,
title = {Exploring Partial Multi-Label Learning via Integrating Semantic Co-occurrence Knowledge},
year = {2025},
author = {Wu, Xin and Teng, Fei and Feng, Yue and Shi, Kaibo and Lin, Zhuosheng and Zhang, Ji and Wang, James}
}