Thank you Ziko and Steven for the thoughtful responses.

My sense is that, for a class of readers, a generative UI that returns an 
answer rather than an article would be useful. It would probably put Quora out 
of business. :-)

If the models are not open source, this would indeed require developing our own 
models. For that kind of investment, we would probably want more application 
areas: translation, which Ziko already pointed out, but also summarization. 
These kinds of information retrieval queries would effectively index into 
specific parts of an article rather than returning the whole thing.

Wikipedia, as we all know, is not perfect, but it’s about the best you can get, 
with thousands of editors and reviewers doing quality control. If a bot were 
trained exclusively on Wikipedia, my guess is that falsehood generation would 
be as minimal as it can get. Garbage in, garbage out in all these models; good 
stuff in, good stuff out. I guess falsehoods can also arise when no relevant 
material exists in the model. So instead of making stuff up, the bot could 
default to “I don’t know the answer to that”. Or, in our case, we could add the 
topic to the list of article suggestions for editors…

I know I am almost daydreaming here, but I can’t help but think that all the 
recent advances in AI could create significantly broader free knowledge 
pathways for every human being. And I don’t see us going after them 
aggressively enough…

Best regards,

Victoria Coleman

> On Dec 29, 2022, at 5:17 PM, Steven Walling <steven.wall...@gmail.com> wrote:
> 
> 
> 
> 
>> On Thu, Dec 29, 2022 at 4:09 PM Victoria Coleman 
>> <vstavridoucole...@gmail.com> wrote:
>> Hi everyone. I have seen some of the reactions to the narratives generated 
>> by Chat GPT. There is an obvious question (to me at least) as to whether a 
>> Wikipedia chat bot would be a legitimate UI for some users. To that end, I 
>> would have hoped that it would have been developed by the WMF but the 
>> Foundation has historically massively underinvested in AI. That said, and 
>> assuming that GPT Open source licensing is compatible with the movement 
>> norms, should the WMF include that UI in the product?
> 
> This is a cool idea but what would the goals of developing a 
> Wikipedia-specific generative AI be? IMO it would be nice to have a natural 
> language search right in Wikipedia that could return factual answers not just 
> links to our (often too long) articles.
> 
> OpenAI models aren’t open source btw. Some of the products are free to use 
> right now, but their business model is to charge for API use etc. so 
> including it directly in Wikipedia is pretty much a non-starter. 
> 
>> My other question is around the corpus that Open AI is using to train the 
>> bot. It is creating very fluid narratives that are massively false in many 
>> cases. Are they training on Wikipedia? Something else?
> 
> They’re almost certainly using Wikipedia. The answer from ChatGPT is: 
> 
> “ChatGPT is a chatbot model developed by OpenAI. It was trained on a dataset 
> of human-generated text, including data from a variety of sources such as 
> books, articles, and websites. It is possible that some of the data used to 
> train ChatGPT may have come from Wikipedia, as Wikipedia is a widely-used 
> source of information and is likely to be included in many datasets of 
> human-generated text.”
> 
>> And to my earlier question, if GPT were to be trained on Wikipedia 
>> exclusively, would that help abate the false narratives?
> 
> Who knows but we would have to develop our own models to test this idea. 
> 
>> This is a significant matter for the community and seeing us step to it 
>> would be very encouraging.
>> 
>> Best regards,
>> 
>> Victoria Coleman
>> _______________________________________________
>> Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: 
>> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
>> https://meta.wikimedia.org/wiki/Wikimedia-l
>> Public archives at 
>> https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/CYPO3PEMM4FIWPNL6MRTORHZXVTS2VNN/
>> To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org
