CV

The full CV can be found by clicking the icon on the right.

General Information

Full Name: Bingsheng "Arthur" Yao
Current Occupation: 5th Year Ph.D. Student in Computer Science
One-sentence Introduction? Hanc marginis exiguitās nōn caperet (Fermat, n.d.)

Education

  • 2024
    Ph.D. in Computer Science
    Rensselaer Polytechnic Institute
    • Thesis Topic
      • Enhance Machine Reasoning with Human Rationales
  • 2019
    MS in Information Technology & Web Science
    Rensselaer Polytechnic Institute
  • 2018
    BS in Computer & Systems Engineering
    Rensselaer Polytechnic Institute

Professional Experience

  • 2022 - 2023
    Objective Human Explanation Evaluation
    IBM AIRC Fellowship (Rensselaer Polytechnic Institute & IBM Research)
    • Proposed a prompt-based unified data format for different tasks
    • Enhanced the established Simulatability score with a novel metric to evaluate explanations' helpfulness at fine-tuning and inference
    • Showed that our metric evaluates explanations consistently across 5 datasets and 2 models (T5 and BART), whereas the established metric falls short
  • 2022 - 2023
    Active Learning for Human Labels and Explanations
    IBM AIRC Fellowship (Rensselaer Polytechnic Institute & IBM Research)
    • Proposed a dual-model AL framework to generate natural language explanations as additional information for the prediction model
    • Designed a novel AL selection strategy based on the similarity between unlabeled data and human-annotated explanations
    • Demonstrated that human explanations, used within our AL framework, help the prediction model perform better and converge faster
  • 2021
    QA-Pair Generation (QAG) for Children Storybooks
    Summer Research Extern (IBM Research)
    • Implemented a QAG system with (1) a heuristics-based answer extraction module; (2) a fine-tuned BART-based question generation module; (3) a DistilBERT-based ranking module to rank and select the best QA-pairs
    • Outperformed 2 SOTA QAG systems on the FairytaleQA dataset in terms of the mean average ROUGE-L precision at top 1, 3, 5, and 10 QA-pairs generated per story section
    • Developed an interactive storytelling web application built upon our QAG system and validated its usefulness through a user study with 12 pairs of parents and children
  • 2021
    FairytaleQA Dataset for Children Education
    Summer Research Extern (IBM Research)
    • Supervised education experts, following a carefully designed annotation schema, to create 10,580 QA-pairs on 278 fairytale stories
    • Demonstrated the usefulness of the FairytaleQA dataset for Question Answering and Question Generation tasks by fine-tuning various SOTA language models and performing in-depth analysis

(Selected) Side Research Projects

  • 2023
    Instruction Fine-tuning of LLMs for Mental Issue Detection
    • Curated high-quality human-annotated datasets for mental issue detection in online communities
    • Designed prompts for augmented tasks with curated datasets and instruction-fine-tuned an Alpaca model and a FLAN-T5 model
    • Benchmarked against SOTA models (e.g., fine-tuned Alpaca-LoRA, mental-RoBERTa) and conducted extensive ablation studies on prompt selection, subsampling, and transfer learning
  • 2023
    QA Annotation Framework with External KG Support
    • The framework lets human annotators select a preferred concept, then retrieves and ranks the most relevant commonsense knowledge from ConceptNet to facilitate QA-pair annotation
    • Human evaluation showed that QA-pairs created by experts using our framework are preferred over those generated by carefully prompted few-shot GPT-3.5 approaches

Honors and Awards

  • 2019
    Member of Upsilon Pi Epsilon
    • The International Honor Society for the Computing and Information Disciplines
  • 2019
    Member of Gamma Nu Eta
    • The National Information Technology Honor Society

Service

  • 2023
    Reviewer
    • ACL 2023 [2], EMNLP 2023 [5], ACL ARR Aug [1]

Leadership Experience

  • 2018
    Teaching Assistant
    • Computer Organization
  • 2019
    Teaching Assistant
    • Introduction to AI
  • 2023
    Teaching Assistant
    • Computer Organization