[Wikimedia-l] Re: 23 March: Invitation to Open Community Call on ChatGPT, generative AI, and Wikimedia

Kimmo Virtanen Thu, 30 Mar 2023 04:25:54 -0700

Hi,

>> My understanding is that it is not proprietary, and the only reason it
>> doesn't qualify for Open Source Initiative approval is because of these
>> use restrictions:
>
> To generate or disseminate information or content, in any context (e.g.
> posts, articles, tweets, chatbots or other kinds of automated bots)
> without expressly and intelligibly disclaiming that the text is machine
> generated
>
This makes it useless in most content-related use cases, because every use
of the results would have to carry an explicit disclaimer, which adds too
much extra text.


As for FOSS-compatible LLMs: EleutherAI's GPT-J, GPT-NeoX, and Pythia, as
well as Cerebras-GPT, are licensed under Apache 2.0. The question is whether
these models are good enough to be useful. However, the same question
applies to BLOOM too.

Br,
-- Kimmo Virtanen, Zache

On Thu, Mar 30, 2023 at 3:34 AM Lauren Worden <laurenworde...@gmail.com>
wrote:

> On Wed, Mar 29, 2023 at 1:50 PM Jan Ainali <j...@aina.li> wrote:
> >
> > I think it is important to, as early as possible, deter all these
> attempts to weaken the concept of "open" and that we as a movement need to
> take a hard stance against them.
> > These proprietary licenses do not fit the spirit of sharing all
> knowledge and letting anyone do whatever they want with it.
>
> Is the BLOOM RAIL license [
> https://huggingface.co/spaces/bigscience/license ] proprietary?  My
> understanding is that it is not proprietary, and the only reason it
> doesn't qualify for Open Source Initiative approval is because of
> these use restrictions:
>
> "You agree not to use the Model or Derivatives of the Model:
> (a) In any way that violates any applicable national, federal, state,
> local or international law or regulation;
> (b) For the purpose of exploiting, harming or attempting to exploit or
> harm minors in any way;
> (c) To generate or disseminate verifiably false information with the
> purpose of harming others;
> (d) To generate or disseminate personal identifiable information that
> can be used to harm an individual;
> (e) To generate or disseminate information or content, in any context
> (e.g. posts, articles, tweets, chatbots or other kinds of automated
> bots) without expressly and intelligibly disclaiming that the text is
> machine generated;
> (f) To defame, disparage or otherwise harass others;
> (g) To impersonate or attempt to impersonate others;
> (h) For fully automated decision making that adversely impacts an
> individual’s legal rights or otherwise creates or modifies a binding,
> enforceable obligation;
> (i) For any use intended to or which has the effect of discriminating
> against or harming individuals or groups based on online or offline
> social behavior or known or predicted personal or personality
> characteristics
> (j) To exploit any of the vulnerabilities of a specific group of
> persons based on their age, social, physical or mental
> characteristics, in order to materially distort the behavior of a
> person pertaining to that group in a manner that causes or is likely
> to cause that person or another person physical or psychological harm;
> (k) For any use intended to or which has the effect of discriminating
> against individuals or groups based on legally protected
> characteristics or categories;
> (l) To provide medical advice and medical results interpretation;
> (m) To generate or disseminate information for the purpose to be used
> for administration of justice, law enforcement, immigration or asylum
> processes, such as predicting an individual will commit fraud/crime
> commitment (e.g. by text profiling, drawing causal relationships
> between assertions made in documents, indiscriminate and
> arbitrarily-targeted use)."
>
> Those restrictions seem very reasonable to me, and I would consider
> them an advantage given the problems the field is experiencing,
> including the threats to project content integrity. I don't see any
> drawbacks, and I see several advantages to encouraging such
> restrictions.
>
> So I expect the BLOOM license would therefore qualify for an exception
> as described in
> https://wikitech.wikimedia.org/wiki/Wikitech:Cloud_Services_Terms_of_use
>
> There is further discussion of these issues at
> https://arxiv.org/pdf/2011.03116.pdf
>
> -LW
> _______________________________________________
> Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines
> at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
> https://meta.wikimedia.org/wiki/Wikimedia-l
> Public archives at
> https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/L6DTD5QQWJPZVXDMT4L5NVFWCZKPLXJD/
> To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org

_______________________________________________
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/D3OT2HV2ZFGRH2ONOD7JVJ4R25MICEL2/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org
