INFERENCELAB
Research output

Low-resource NLP & speech intelligence.

Independent research and international academic collaboration with King Saud University, EPU Kuwait, Doane University (USA), and Hanyang University (Korea). Every release ships reproducible pipelines, evaluation documentation, and a permanent DOI.

Low-Resource NLP

Datasets, annotation pipelines, and fine-tuned models for Roman Urdu — filling the gap between major-language NLP and underrepresented South Asian languages.

Speech Intelligence

Vocal fatigue estimation, speaker verification, and continuous vocal load monitoring — deployed as open libraries and production REST APIs.

Cognitive & Ergonomic Systems

Multi-institutional research applying Cognitive Systems Engineering to real-world healthcare settings.

Publications & preprints
  • Under Review · 2026

    Modeling Vocal Fatigue as Embedding-Space Deviation Using Contrastively Trained ECAPA-TDNNs

    ECAPA-TDNN-VHE, designed from scratch and trained with a supervised contrastive loss: 78% accuracy vs 36% for the baseline (roughly 2.2×), with F1 scores of 0.85 / 0.78 / 0.70 across three fatigue classes.

    Springer · EURASIP J. on Signal Processing · DOI
  • Under Review · 2026

    Continuous Vocal Load Monitoring in Professional Voice Users

    Development and occupational validation of an automated vocal load assessment tool for professional voice users — clinical-grade speech analysis in production.

    Journal of Voice · King Saud University & EPU Kuwait
  • Under Review · 2026

    RUEmoCorp: A Large-Scale Roman Urdu Emotion Corpus & Benchmark Suite

    First large-scale Roman Urdu emotion corpus — 134K labeled samples with Fleiss κ = 0.658 (substantial agreement), multi-institute annotation, fully open-source on HuggingFace and Harvard Dataverse.

    Language Resources and Evaluation (Springer) · DOI
  • Published Preprint · 2026

    RUDaSA: Roman Urdu Dataset for Sentiment Analysis — A Large-Scale, Curated Corpus with Privacy-Preserving Embeddings and Competitive Benchmarking of Transformer Models

    Large-scale Roman Urdu sentiment corpus built via privacy-preserving embedding pipelines. Benchmarks state-of-the-art Transformer models — addressing a critical gap in low-resource South Asian NLP.

    Research Square · Preprint · DOI
  • Published Preprint · 2025

    Data-Centric Roman Urdu NLP: Dataset Curation & Model Benchmarking

    Largest high-quality Roman Urdu sentiment dataset, built via privacy-preserving embedding pipelines, with state-of-the-art results of 0.84 accuracy and 0.83 macro-F1.

    Zenodo · Preprint · DOI
  • Published Preprint · 2025

    Forecast-Based Decision Support System for Mango Malformation

    Time-series forecasting and smart-agriculture DSS — demonstrated 50–60% yield improvement through data-driven intervention.

    Zenodo · Preprint · DOI
  • In Progress · 2026

    Ergonomic Interventions and Cognitive Workload in Healthcare Settings: A Qualitative Case Study Using Cognitive Systems Engineering

    Multi-institutional international study applying Cognitive Systems Engineering to healthcare ergonomics — systematic analysis of workload, safety, and intervention efficacy.

    Hanyang University (Korea) · King Saud University (Saudi Arabia) · Doane University (USA)