  • 8:00-8:30 Virtual Breakfast. Informal groups

  • 8:30-8:45 Intro Talk

  • 8:45-10:15 Invited Talks #1: Scalable, Multimodal NLP (I)

    • 8:45-9:15 NLP Intelligence for Microsoft Outlook and Teams (Mei-Yuh Hwang)

Abstract: I will talk about our work to improve productivity by helping users complete tasks more easily and quickly in Outlook and Teams. This includes smart email search, people search, file search, smart reply, and smart compose, among others. Internationalization is achieved via transfer learning on language-agnostic representations such as mBERT, XLM, and InfoXLM, with machine-translated training data. More delightful experiences will be added to Microsoft Office products, including SharePoint. Feedback is highly appreciated.

    • 9:15-9:45 Dhruv Batra


    • 9:45-10:15 NLP-enabled Design Assistance for Visual Communication (Thamar Solorio)

Abstract: Content authoring and design refers to the interdisciplinary research space that includes Graphic Design and Artificial Intelligence fields such as NLP and machine learning. This area addresses open problems in leveraging AI-empowered models to assist users during creation by modeling author and audience needs so that the outcome is aesthetically appealing and effectively communicates its intent. Research related to this space is emerging, and we have seen significant improvements in various platforms for generating, formatting, and editing digital text. During this talk, I will present recent efforts in this space. In particular, I will discuss our recent results on predicting word emphasis in short texts. For textual content, word emphasis is a powerful tool for conveying the desired meaning of the written text to the audience. In addition, whether on flyers, posters, ads, social media posts, or motivational messages, emphasis is usually designed to grab the viewer's attention by being distinct from the rest of the design elements. Due to the subjective nature of the task, where multiple appropriate solutions may exist, we formulate this task as label distribution learning. I will discuss the advantages and disadvantages of this formulation and will conclude with a brief overview of our ongoing work on the related task of emphasis prediction in presentation slides. My goal is to motivate more research in this exciting and relatively unexplored problem space.
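
As a rough illustration of the label distribution learning formulation mentioned in the abstract, the sketch below turns per-word annotator votes into a target distribution and scores a prediction against it. The vote counts, the two-label setup (emphasize / do not emphasize), the example phrase, and the use of KL divergence as the loss are all assumptions made for illustration, not details taken from the talk.

```python
import math

def vote_distribution(emphasis_votes, num_annotators):
    """Turn raw annotator votes for one word into a two-label distribution."""
    p = emphasis_votes / num_annotators
    return [p, 1.0 - p]  # [P(emphasize), P(do not emphasize)]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); labels with zero target mass contribute 0."""
    return sum(
        t * math.log(t / max(q, eps))
        for t, q in zip(target, predicted)
        if t > 0
    )

# Hypothetical data: 9 annotators mark which words of a short text to emphasize.
words = ["Seize", "the", "day"]
votes = [7, 0, 5]
targets = [vote_distribution(v, 9) for v in votes]

# Hypothetical model output: one predicted distribution per word.
predictions = [[0.7, 0.3], [0.1, 0.9], [0.5, 0.5]]

# Average the per-word divergence to get a single loss value.
loss = sum(kl_divergence(t, q) for t, q in zip(targets, predictions)) / len(words)
print(f"mean KL loss: {loss:.4f}")
```

Keeping the full vote distribution as the target, rather than a majority label, is one way to preserve the annotator disagreement that the abstract attributes to the subjective nature of the task.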

  • 10:15-10:30 Break. Informal groups

  • 10:30-12:00 Invited Talks #2: Privacy, Safety, Federated Learning, Explainability

    • 10:30-11:00 Irina Rish


    • 11:00-11:30 Tom Yeh


    • 11:30-12:00 Evaluating and Testing Natural Language Processing Models (Sameer Singh)

Abstract: Current evaluation of the generalization of natural language processing (NLP) systems, and of much of machine learning, primarily consists of measuring accuracy on held-out instances of the dataset. Since the held-out instances are often gathered using a similar annotation process to the training data, they include the same biases that act as shortcuts for machine learning models, allowing them to achieve accurate results without requiring actual natural language understanding. Thus held-out accuracy is often a poor proxy for measuring generalization, and further, aggregate metrics have little to say about where the problem may lie. In this talk, I will introduce a number of approaches we are investigating to perform a more thorough evaluation of NLP systems. I will first provide a quick overview of automated techniques for perturbing instances in the dataset that identify loopholes and shortcuts in NLP models, including semantic adversaries and universal triggers. I will then describe recent work on creating comprehensive and thorough tests and evaluation benchmarks for NLP using CheckList, which aim to directly evaluate comprehension and understanding capabilities. The talk will cover a number of NLP tasks, including sentiment analysis, textual entailment, paraphrase detection, and question answering.

  • 12:00-2:00 Lunch/Poster/Demo Breakouts

  • 2:00-2:30 Lightning Talks*

    • Social Bias Frames: Reasoning about Social and Power Implications of Language

    • They, Them, Theirs: Rewriting with Gender-neutral English

    • Simulated Multiple Reference Training Improves Low-Resource Machine Translation

(* The above papers are nominated for the best paper award, which will be announced during the event)

  • 2:30-4:00 Invited Talks #3: Scalable, Multimodal NLP (II)

    • 2:30-3:00 Building Embodied Conversational Agents (Asli Celikyilmaz)

Abstract: Language understanding and generation are particularly challenging for multimodal tasks such as vision-language navigation, which poses several challenges in building AI agents that can understand natural language instructions to navigate in real human environments to reach a goal (e.g., find an object). Success in these tasks requires building multimodal language groundings that allow the agent to successfully navigate while reasoning about vision-language dynamics. We train our agents to execute our commands but do not necessarily teach them how to react when uncertainties in the environment arise. In this talk, I will present our recent work in which we go beyond instruction following and teach the navigating agent to communicate in natural language to get help from another agent (oracle) to reach the goal more efficiently. I will also present our new thinking on the more general problem of understanding how a system asks for and receives assistance, with the goal of exploring techniques that transfer and generalize across the vision-language navigation research field.

    • 3:00-3:30 Efficient transformers for natural language processing (Hanna Hajishirzi)

Abstract: In this talk, I present our recent work introducing a deep and light-weight transformer, DeLighT, that delivers similar or better performance than standard transformer-based models with significantly fewer parameters. Our network allocates parameters more efficiently both (1) within each Transformer block, using a deep and light-weight transformation, and (2) across blocks, using block-wise scaling, which allows for shallower and narrower DeLighT blocks near the input and wider and deeper DeLighT blocks near the output. Overall, DeLighT networks are 2.5 to 4 times deeper than standard transformer models and yet have fewer parameters and operations. Experiments on benchmark machine translation and language modeling tasks show that DeLighT matches or improves the performance of baseline Transformers with 2 to 3 times fewer parameters on average.
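
To make the block-wise scaling idea concrete, here is a minimal sketch that interpolates per-block depth and width from the input side to the output side. The linear schedule, the function name `blockwise_scaling`, and all numeric values are assumptions for illustration; the abstract only states that blocks near the input are shallower and narrower and blocks near the output are deeper and wider.

```python
def blockwise_scaling(num_blocks, min_depth, max_depth, min_width, max_width):
    """Return (depth, width) per block, growing linearly from input to output."""
    configs = []
    for b in range(num_blocks):
        # Fraction of the way from the first block (0.0) to the last block (1.0).
        frac = b / (num_blocks - 1) if num_blocks > 1 else 0.0
        depth = round(min_depth + (max_depth - min_depth) * frac)
        width = round(min_width + (max_width - min_width) * frac)
        configs.append((depth, width))
    return configs

# Hypothetical values: 4 blocks, depth from 4 to 8 layers, width from 128 to 512 units.
for i, (depth, width) in enumerate(blockwise_scaling(4, 4, 8, 128, 512)):
    print(f"block {i}: depth={depth}, width={width}")
```

Under this assumed schedule the parameter budget is shifted toward the output side instead of being spread uniformly across identical blocks.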

    • 3:30-4:00 Y-Lan Boureau


  • 4:00-4:15 Break. Informal groups

  • 4:15-5:45 Panel: NLP research post COVID-19

    • Jason Williams (moderator), Ruhi Sarikaya, Svitlana Volkova, Alex Rudnicky, Mona Diab

  • 5:45-6:00 Closing Remarks

  • 6:00-7:00 (or later): Virtual Happy Hour