I have been playing with a new toy -- a question and answer system. [1, 2]

Here's how it works. Save a document as a plain text file. The document can be 
just about anything that makes sense. Examples include: a job posting, a 
conference announcement, or a journal article. Apply a previously created 
machine learning model to the document, and the result is a list of questions. 
Feed the list of questions and the document to another model, and get back a 
list of answers. These models are embedded and configurable in a couple of 
Python scripts, as the links below outline. Most of the models are available 
from a repository of models called Hugging Face. [3]
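
A minimal sketch of the two-step flow described above, not the actual scripts. 
The names questions_and_answers and load_transformers_reader are hypothetical, 
and the model deepset/roberta-base-squad2 is an assumption, picked only because 
it is a widely used question-answering model on Hugging Face. The question 
generator is left as any function that takes a document and returns a list of 
questions.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases for the two models in the process:
# a generator maps a document to questions; a reader maps
# (question, document) to a dict with "answer", "start", and "end" keys.
QuestionGenerator = Callable[[str], List[str]]
Reader = Callable[[str, str], Dict]


def questions_and_answers(
    document: str, generate: QuestionGenerator, read: Reader
) -> List[Dict]:
    """Generate questions from a document, then answer each from the same text."""
    results = []
    for question in generate(document):
        found = read(question, document)
        results.append({"question": question, **found})
    return results


def load_transformers_reader(model_name: str = "deepset/roberta-base-squad2") -> Reader:
    """Wrap a Hugging Face question-answering pipeline as a Reader.

    The model name is an assumption; any extractive QA model could be swapped in.
    """
    # Imported lazily because loading the pipeline downloads the model.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_name)
    return lambda question, document: qa(question=question, context=document)
```

Keeping both models behind plain functions means either one can be swapped for 
a different model without touching the loop.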

I applied my implementation to a message sent to our list earlier today, and a 
few of the more interesting questions and answers include:

  How much do participants travel stipends?

     answer: up to $1000
    context: rous support from the Mellon Foundation, participant
             travel stipends (up to $1000) are available to offset air
             and/or ground transportation, parking, 

  What date will we follow up with you if your application is accepted?

     answer: February 3, 2023
    context: application is accepted, we will follow up with you no
             later than February 3, 2023. For more details, including an
             agenda, see the Event Website <ht

  What is a publication medium that is both a primary source and a networked
  container of primary sources?

     answer: the web
    context: is both a primary source and a networked container of
             primary sources, the web presents challenges of scale and
             complexity for those that seek to int


The full list of about twenty questions and answers is attached.
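
Each entry pairs an answer with a short window of the text surrounding it. A 
rough sketch of how such an entry might be cut and laid out, assuming the 
answering model reports the character offsets of each answer. The function 
names, the margin, and the wrap width are all hypothetical, not taken from the 
actual scripts.

```python
import textwrap


def context_window(document: str, start: int, end: int, margin: int = 50) -> str:
    """Return the answer span plus up to `margin` characters on either side."""
    left = max(0, start - margin)
    right = min(len(document), end + margin)
    # Collapse line breaks and runs of spaces so the excerpt re-wraps cleanly.
    return " ".join(document[left:right].split())


def format_entry(question: str, answer: str, context: str, width: int = 64) -> str:
    """Lay out one question, its answer, and its context as plain text."""
    wrapped = textwrap.wrap(context, width=width) or [""]
    lines = [
        "  " + question,
        "",
        "     answer: " + answer,
        "    context: " + wrapped[0],
    ]
    # Continuation lines align under the first line of the context.
    lines += [" " * 13 + line for line in wrapped[1:]]
    return "\n".join(lines)
```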

I did this same sort of thing against chapters in Moby Dick, asking questions 
like "Who is Ahab?", "Where did they sail?", and "What is whaling?" The answers 
are often quite plausible.

This sort of system can be applied more broadly in Library Land. Students, 
researchers, and scholars are suffering from information overload; we all 
continue to drink from the proverbial firehose. Given something like the system 
outlined above, librarians and libraries can go beyond providing access to 
data, information, and knowledge. More specifically, we can support the process 
of using & understanding data, information, and knowledge.

Fun with digital scholarship?


[1] generate questions - 
https://haystack.deepset.ai/tutorials/13_question_generation
[2] answer questions - 
https://haystack.deepset.ai/tutorials/01_basic_qa_pipeline
[3] Hugging Face - https://huggingface.co/models

--
Eric Lease Morgan
Navari Family Center for Digital Scholarship
Hesburgh Libraries
University of Notre Dame

https://cds.library.nd.edu


Questions and answers

This is a list of questions and answers rooted in a conference announcement 
posted to the Code4Lib mailing list. The announcement was fed to a machine 
learning model which returned a list of questions. The questions were then fed 
to another model which returned answers. In this particular case, the answers 
are more than plausible, if not 100% accurate. Fun with the digital 
scholarship. --Eric Lease Morgan <[email protected]>, January 12, 2023


  How much do participants travel stipends?

     answer: up to $1000
    context: rous support from the Mellon Foundation, participant
             travel stipends (up to $1000) are available to offset air
             and/or ground transportation, parking, 


  On what date will the workshop be held alongside the ACRL 2023 Conference?

     answer: March 15, 2023
    context: https://archive-it.org/blog/digital-scholarship-and-the-web/>
             held on March 15, 2023 alongside the ACRL 2023 Conference
             <https://acrl2023.us2.pathable.c


  What date will we follow up with you if your application is accepted?

     answer: February 3, 2023
    context: application is accepted, we will follow up with you no
             later than February 3, 2023. For more details, including an
             agenda, see the Event Website <ht


  What does the Event Website contain?

     answer: an agenda
    context: with you no later than February 3, 2023. For more
             details, including an agenda, see the Event Website
             <https://archive-it.org/blog/digital-scholarsh


  What do participants gain familiarity with using web archives?

     answer: web archive research use cases and how libraries support them
    context: s as a primary source, gain familiarity with web
             archive research use cases and how libraries support them; and
             acquire hands-on experience creating w


  What is a publication medium that is both a primary source and a networked
  container of primary sources?

     answer: the web
    context: is both a primary source and a networked container of
             primary sources, the web presents challenges of scale and
             complexity for those that seek to int


  What is required to attend the workshop?

     answer: applicants
    context: The Internet Archive <https://archive.org/> invites
             applicants to a daylong workshop Digital Scholarship and the
             Web: An Introduction to Data Analysi


  What is the acronym for Archives Research Compute Hub?

     answer: ARCH
    context: putationally analyzing web archives using Archives
             Research Compute Hub (ARCH)
             <https://webservices.archive.org/pages/arch>. Participant
             Support This


  What is the maximum amount of travel stipends?

     answer: $1000
    context: s support from the Mellon Foundation, participant
             travel stipends (up to $1000) are available to offset air
             and/or ground transportation, parking, two


  What is the priority deadline for all applications?

     answer: January 27, 2023
    context: space is limited and the priority deadline for all
             applications is January 27, 2023. If your application is
             accepted, we will follow up with you no la


  What kind of support does the Mellon Foundation provide?

     answer: generous support from the Mellon Foundation, participant
             travel stipends
    context: ever registration is limited, and with generous
             support from the Mellon Foundation, participant travel stipends
             (up to $1000) are available to offset 


  What type of production occurs globally?

     answer: digital information
    context: 023.us2.pathable.com/> in Pittsburgh, PA. Every day,
             significant digital information production occurs globally,
             much of it across the web (e.g., new


  What will participants learn about web archives as a primary source?

     answer: familiarity with web archive research use cases and how
             libraries support them
    context: archives as a primary source, gain familiarity with
             web archive research use cases and how libraries support them;
             and acquire hands-on experience cr


  Where can you send any questions?

     answer: [email protected].
    context: genda, see the Event Website
             <https://archive-it.org/blog/digital-scholarship-and-the-web/>.
             Please direct any questions to [email protected]. 


  Where can you submit an application?

     answer: The Internet Archive
    context: The Internet Archive <https://archive.org/> invites
             applicants to a daylong workshop Digital Scholarship and the
             Web: An Introduction to Data Analysi


  Where is the workshop held?

     answer: Pittsburgh, PA
    context: de the ACRL 2023 Conference
             <https://acrl2023.us2.pathable.com/> in Pittsburgh, PA. Every
             day, significant digital information production occurs glob


  Who invites applicants to a daylong workshop on Digital Scholarship and the
  Web: An Introduction to Data Analysis and Instruction?

     answer: The Internet Archive
    context: The Internet Archive <https://archive.org/> invites
             applicants to a daylong workshop Digital Scholarship and the
             Web: An Introduction to Data Analysi

