Search Goes Multimodal

Google will upgrade its search engine with a new model that tracks the 
relationships between words, images, and, in time, videos — the first fruit of 
its latest research into multimodal machine learning and multilingual language 
modeling.


*What’s new:* Early next year, Google will integrate a new architecture called 
Multitask Unified Model (MUM) into its traditional Search algorithm and Lens 
photo-finding system, VentureBeat 
<https://info.deeplearning.ai/e3t/Btc/LX+113/cJhC404/VW1DF-4wY9ZzM15xZCpSHDPW2L--MX4yFR-qN1sYgZG3q3phV1-WJV7CgYvMN81R7zCT9lWrW1wjv3P4ZfSJ-W2RCPXR3HtSBZW21L2hQ3C3VM-W3S1XJF6vQBGTW1Skmb87MqZ4NW7S2kr56RXmqcW5QN7tM4nFSHCVJ-WBV6DmFbnW5LzXHQ3SsDBvMW9QZP84w4lW4PGRP75ggmnlW5_7Wpq3WD2FgV2NMn28b1LTCW7WfnDQ2bWgZsW76Ryd01BPyb7W7DPZgS2rNmznVk2zSS2-T_9vW8Q4P1Y485SJNN867QtZ-vKNgW6d67z64WTLKPW3qQpQb1NsCb5W3V_Kpc68kl4KVh12Xn7kgH4HW7mWfs-1Y03pPN1p6LfHwlN1hW6HZ93v2rWfRpW4xc17r1H2DP534hb1>
 reported. The new model will enable the search engine to break down complex 
queries (“I’ve hiked Mt. Adams and now I want to hike Mt. Fuji next fall. What 
should I do differently to prepare?”) into simpler requests (“prepare to hike 
Mt. Adams,” “prepare to hike Mt. Fuji,” “Mt. Fuji next fall”), then combine 
the results of those simpler requests into a coherent response.
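The decompose-then-recombine idea can be sketched in a few lines. Everything here (regex entity extraction, the stand-in results lookup) is a hypothetical toy; Google has not published MUM’s actual decomposition method.

```python
# Toy sketch: split a complex query into simpler sub-queries, answer each,
# then merge the answers. The extraction rule and lookup are hypothetical.
import re

def extract_entities(query: str) -> list[str]:
    # Naive pattern for mountain names like "Mt. Adams" or "Mt. Fuji".
    return re.findall(r"Mt\.\s\w+", query)

def decompose(query: str) -> list[str]:
    # One simple sub-query per entity found in the complex query.
    return [f"prepare to hike {name}" for name in extract_entities(query)]

def combine(sub_queries: list[str], results: dict[str, str]) -> str:
    # Merge the per-sub-query results into a single response.
    return " | ".join(f"{q}: {results.get(q, 'no result')}" for q in sub_queries)

query = "I've hiked Mt. Adams and now I want to hike Mt. Fuji next fall."
subs = decompose(query)
print(subs)  # ['prepare to hike Mt. Adams', 'prepare to hike Mt. Fuji']
```

The real system presumably does all three steps with learned models rather than rules, but the pipeline shape is the same.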


*How it works:* Announced 
<https://info.deeplearning.ai/e3t/Btc/LX+113/cJhC404/VW1DF-4wY9ZzM15xZCpSHDPW2L--MX4yFR-qN1sYgZ33q3nJV1-WJV7CgGzsW4l2vBG6Vj3FyW5xQjrM4RsNtFW4J61Rs6g6JsmN6MS5MBG5wnzW26_WQD1RjmV0W2dQCyq474DsyW4yctQ26N55NMW7D43T36_W31tW1hp4ns6q60fKW6tq21f7X-zpZW3mfKKj3RTBf0W2DyyxH4_HdXrW5lGN4h35v8lQW4ybzps3PXqjPW6f7J1W3g1lLlW10wHsX7cxVRVW5v391b6nP9FxW4fpQl53sgpPgW2SjSSd2fvZmjN5Hz73-vrMMqW6v6Htr4PcrMwW3gZG_F6PnHF3Vt4q344Kv8vgW6J4lSX8nZfHk39TC1>
 in May, MUM is a transformer-based natural language model. It’s built on 
Google’s earlier T5 
Google’s earlier T5 
<https://info.deeplearning.ai/e3t/Btc/LX+113/cJhC404/VW1DF-4wY9ZzM15xZCpSHDPW2L--MX4yFR-qN1sYgYN3q3npV1-WJV7CgMQQW7CCGcG6ZRGjJW4-R5cj62RN9XW22V33w3SjnM6W23CRLs56Ny05W3M-H-Q4crpgsW2W4Tp14j1qJ_W317hSs4gRlRlW5-nKny1X2m-NW8Rxkr58dsWJpMrgJltlwvHxW7J6SfX6WtFHBW4018M13KNgZFW5cYsdS6Tj0l_W697_N47nxTrYW340drC3jgVbRW5CG7YG3PLcCRW5KhhKj80Mh9kW5zzdC44h80K5W8nWvfW8tsPX_W3wyL3q2vwJWdW5NKZQv5PBLm4W4RSPvt6F7xwy3dfq1>
 architecture and comprises around 110 billion parameters (compared to BERT’s 
110 million, GPT-3’s 175 billion, and Google’s own Switch Transformer’s 1.6 
trillion). It was trained on a dataset of text and image documents drawn from 
the web, from which hateful, abusive, sexually explicit, and misleading images 
and text were removed.
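That cleanup step can be illustrated with a minimal blocklist filter. The placeholder terms and exact-word matching below are assumptions for illustration; a production pipeline at this scale would rely on trained classifiers rather than keyword lists.

```python
# Minimal sketch of blocklist-style corpus filtering. The terms and the
# matching rule are hypothetical placeholders, not Google's actual method.

BLOCKED_TERMS = {"badterm1", "badterm2"}  # hypothetical placeholder vocabulary

def keep_document(text: str) -> bool:
    # Keep a document only if none of its words appear on the blocklist.
    return set(text.lower().split()).isdisjoint(BLOCKED_TERMS)

corpus = ["a guide to hiking boots", "badterm1 content to drop"]
cleaned = [doc for doc in corpus if keep_document(doc)]
print(cleaned)  # ['a guide to hiking boots']
```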


 * Google Search users will see three new features powered by MUM: an 
AI-curated list that turns broad queries into actionable items and step-by-step 
instructions, suggestions to tweak queries, and links to relevant audio and 
video results. 
 * Google Lens 
<https://info.deeplearning.ai/e3t/Btc/LX+113/cJhC404/VW1DF-4wY9ZzM15xZCpSHDPW2L--MX4yFR-qN1sYgYt3q3n5V1-WJV7CgF0GW4t_dSc8-7zFzW8rQ6Nr6gnCQzW6LTNYy2h5QCTW6GNtBX7kN9FCW6p7f4z700w7FW93y50L5zRs73W8bt9rg2lR4z0W75W2Sx6cH8FwW2X_s3153k0ZkW7jNv468j98sTW2NVXXk7fxZr9W4TNG3N6-7KC6W2MFmss4zdskxW4L4Lgl46bSH1W99lTNZ2slNRbN6lp-mQK9-DMW4zKvFQ3N4z-YV19Vcs8P_9KmW31XtKH3vTBFPW4xctgP8dGrDv3bvV1>
 users can take a photo of a pair of boots and ask, say, whether they’re 
suitable for hiking a particular mountain. MUM will answer based on the type 
of boot and the conditions on the mountain. 
 * The technology can answer queries in 75 languages and translate information 
from documents in a different language into the language of the query. 
 * Beyond filtering objectionable material from the training set, the company 
tried to mitigate the model’s potential for harm by enlisting humans to 
evaluate its results for evidence of bias.


*Behind the news:* In 2019, Google Search integrated BERT 
<https://info.deeplearning.ai/e3t/Btc/LX+113/cJhC404/VW1DF-4wY9ZzM15xZCpSHDPW2L--MX4yFR-qN1sYgZm3q3n_V1-WJV7CgYlZW7Dxwkt6JyX7SW7HWj_x2kXQkqW6wv8l_2jk6qSW2L8NFD3fmvvYVdvKjr78LgvVW8_-bF68fgr1RW8yqbKg7k0JVPW5m4p5L1SN6lyW1drXp-4-jg24Vzc8tt3qpclkW9hz5wS4LtzSdW6H86N-1Fs1WzW5PPNjB2V9ZzgW6y6jK799bsjRW45klL35w16dTN1RlbJbMhPPjW85bbTb32YrdCW1Fr4L-8LfBV8N1_cTKKvqj9lW4fx0tb7fpbBBW1FhHxG21pRvqW3G9VwC1s2h21W2TD1mS5MDN47W1tcvFL1WQfkRTB4PR6JQv0CN241yMpLWqx-36Fg1>.
 The change improved results for 10 percent of English-language queries, the 
company said, particularly those that included conversational language or 
prepositions like “to” (the earlier version couldn’t distinguish the 
destination country in a phrase like “brazil traveler to usa”). BERT helped 
spur a trend toward larger, more capable transformer-based language models. 


*Why it matters:* Web search is ubiquitous, but there’s still plenty of room 
for improvement. This work takes advantage of the rapidly expanding 
capabilities of transformer-based models.


*We’re thinking:* While we celebrate any advances in search, we found Google’s 
announcement short on technical detail. Apparently MUM really is the word.



------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Tfbe7571f16620c06-M6a148efe1d0b0714c3f3499c
Delivery options: https://agi.topicbox.com/groups/agi/subscription