Thanks very much for your feedback, Henrik! I've included a few comments
inline below:

On Sat, Jul 14, 2018 at 4:27 AM, Henrik Karlstrøm <henrik.karlst...@ntnu.no>
wrote:

> Hello all,
>
> I'd like to offer a few comments in this discussion. First I'd like to say
> that I think this project is very exciting and very much in line with the
> spirit of Open Access, by striving to make available the wealth of the
> world's knowledge. This also means to make it understandable. I am sure
> that this type of system will one day be important in the dissemination of
> expert knowledge.
>

Thanks for the encouragement! We quite agree. As I mentioned to Heather, we
hope that even if our approach to the project is less than successful, we
can generate useful lessons for those who come after.

At some point, I agree: someone is going to make this type of project work.
And if it's going to work someday, let's try to make it work now! The
sooner the better! Lord knows we sure could use a lot more access to
rigorous, fact-based perspectives right now.


>
> Having said that, there are some important questions that I think would
> have to be answered for this project to reap these benefits:
>
> - First of all I very much agree with the subject line of Heather's
> original email. While this is hardly "self-driving car, real-time
> processing of multiple source inputs in life-and-death-situations"-hard,
> automated parsing of semantic content is plenty difficult, and an area
> where a lot of work remains to be done. To take just one example, a new
> paper investigating the AI-assisted subject classification methods of the
> Dimensions app finds serious problems of reliability and validity of the
> procedure, even going so far as to say that "most of the papers seem
> misclassified" [1]. The Explanation Engine will need to make some serious
> headway in automated semantic analysis to achieve its goals.
>

Agreed, that's the plan: we're certainly aiming to make some "serious
headway in automated semantic analysis." We've got two years and a
generous grant, and we're optimistic we can do it.

That said, it's risky, so we'll be working hard to mitigate those risks.
That means starting with components of the Explanation Engine that don't
require fundamental advances in AI, but rather leverage proven tools and
techniques in (we hope) useful new ways. Entity recognition and
annotation are an example of this: they're already working in test
environments and looking great. The plan is to build from there.

One thing we've got going for us is that articles tend to be quite
structured compared to some forms of text (although the *type* of structure
varies quite widely depending on field-specific norms). This is an
encouraging feature in an otherwise quite challenging problem and we think
it offers a useful way in.

Another factor working on our side is that we don't actually have to reduce
an article to an arbitrary simplicity level. Many readers can *almost*
understand a scholarly article already, and helping them out is
easier. We're raising scaffolding, not building houses.

We are working in the zone of proximal development (ZPD) [1], the zone
where a person can succeed, *if they have help*. We're trying to expand the
ZPD as wide as we can, for as many people as we can.

So again, we'll start with the easier, low-risk parts of the problem:
we'll focus on readers who are already pretty well-prepared and
well-motivated, folks like citizen scientists, community college
instructors, and well-informed patients. We'll start with people whose ZPD
already *almost* reaches a given article. We're confident we can
substantially improve the reading experience of these folks using
completely off-the-shelf technology.

Then we iterate, building on the lessons users teach us. Over time, we'll
learn to build more and more robust scaffolding, **based on real experience
with real readers**. Engaging closely with users shows us the leverage
points where we can deploy AI most effectively, as well as the few
places where we actually need to push the tech forward.

This incremental, iterative strategy both reduces risk and supports very
fast development. We think that over two years, it'll get us to the point
where we reach audiences that would never before have considered reading
the scholarly literature on their own.

But time, as always, will tell. ⏳😃

[1] https://en.wikipedia.org/wiki/Zone_of_proximal_development


>
> - It is with this in mind that it is so important that the underlying code
> from this project is open source and transparent, so that the improvements
> or failures that arise from the project can be adopted and adapted by the
> wider community. In this way the project can have research benefits in
> addition to the possible dissemination/public exposure gains.
>

Couldn't agree more. We've been an open-source shop since we got started at
a hackathon seven years ago; that certainly won't change for this
project. You can find our code at https://github.com/Impactstory


>
> - While I believe Heather's interpretation of moral rights is incorrect*,
> I think you should carefully think about the legal ramifications of a
> project like this. At the very least a clear demarcation between the
> original work and the derived explanations should be immediately obvious to
> users of the application, and an opt-out/takedown mechanism be implemented.
> Presumably a way to include and exclude the different parts of the
> Explanation Engine could be interesting too, for example by being able to
> turn on or off the
>

Agreed. Good points.


>
> These caveats aside, I look forward to seeing the results of your work,
> and would like to offer my own CC-licensed work as guinea pig for the
> Engine if you need an actual author to give feedback on the quality of the
> annotations and summaries that are produced by your concept machine.
>

Thanks very much Henrik, we may take you up on that!

Best,
Jason


>
>
> Regards,
> Henrik Karlstrøm
>
> * The Berne Convention does NOT unilaterally grant authors the right to
> object to modifications of a work that "affects their reputation" - that
> would make for example literary criticism or academic debate very hard. For
> the Right to Integrity to be violated, it is not enough that the author
> feels that there is a violation, the author must be able to demonstrate
> prejudice on the part of the modifiers, and demonstrate it "on an objective
> standard that is based on public or expert opinion in order to establish
> that the author's opinion was reasonable" [2]. Of course, this only matters
> in jurisdictions that recognize moral rights. The US, for example, does
> not...
>
> [1] Bornmann, L. Scientometrics (2018). https://doi.org/10.1007/s11192-018-2855-y
> [2] https://en.wikipedia.org/wiki/Prise_de_parole_Inc_v_Gu%C3%A9rin,_%C3%A9diteur_Lt%C3%A9e
>
>
> >-----Original Message-----
> >From: goal-boun...@eprints.org <goal-boun...@eprints.org> On Behalf Of
> >Heather Morrison
> >Sent: Friday, July 13, 2018 8:32 PM
> >To: Global Open Access List (Successor of AmSci) <goal@eprints.org>
> >Subject: Re: [GOAL] Why translating all scholarly knowledge for
> non-specialists
> >using AI is complicated
> >
> >It is easy to cherry-pick some examples of where this might work and not
> be
> >problematic. This is useful as an analytic exercise to demonstrate the
> potential.
> >However it is important to consider and assess negative as well as
> positive
> >possible consequences.
> >
> >With respect to violation of author's moral rights, under Berne 6bis
> >http://www.wipo.int/treaties/en/text.jsp?file_id=283698 authors have the
> right
> >to object to certain modifications of their work, that may impact the
> authors
> >reputation, even after transfer of all economic rights. Reputation is
> critical to an
> >academic career.
> >
> >Has anyone conducted research to find out whether academic authors
> consider
> >Wikipedia annotations to be an acceptable modification of their work?
> >
> >As an academic author, after using CC licenses permitting modifications
> for many
> >years, after careful consideration, I have stopped doing this. Your work
> for me
> >reinforces the wisdom of this decision. I do not wish my work to be
> annotated or
> >automatically summarized by your project. I suspect that other academic
> authors
> >will share this perspective. This may include authors who have chosen
> liberal
> >licenses without realizing that they have inadvertently granted
> permission for
> >such experiments.
> >
> >CC licenses with the attribution element include author moral rights and
> remedies
> >for violation of such rights.
> >
> >My advice is to limit this experiment to willing participants. For the
> avoidance of
> >doubt: I object to your group annotating or automatically summarizing my
> work.
> >
> >Thank you for the offer to contribute to your project. These posts to
> GOAL are
> >my contribution.
> >
> >best,
> >
> >Heather Morrison
> >
> >________________________________
> >From: goal-boun...@eprints.org <goal-boun...@eprints.org> on behalf of
> >Jason Priem <ja...@impactstory.org>
> >Sent: Friday, July 13, 2018 1:35:51 PM
> >To: Global Open Access List (Successor of AmSci)
> >Subject: Re: [GOAL] Why translating all scholarly knowledge for
> non-specialists
> >using AI is complicated
> >
> >Thanks Heather for your continued comments! Good stuff in there. Some
> >responses below:
> >
> >
> >
> >HM: Q1: to clarify, we are talking about peer-reviewed journal articles,
> right? You
> >are planning to annotate journal articles that are written and vetted by
> experts
> >using definitions that are developed by anyone who chooses to participate
> in
> >Wikipedia / Wikidata, i.e. annotating works that are carefully vetted by
> experts
> >using the contributions of non-experts?
> >
> >Correct. An example may be useful here:
> >
> >The article "More than 75 percent decline over 27 years in total flying
> insect
> >biomass in protected areas" was published in 2017 by PLOS ONE [1], and
> >appeared in hundreds of news stories and thousands of tweets [2]. It's
> open
> >access which is great. But if you try to read the article, you run into
> sentences like
> >this:
> >
> >"Here, we used a standardized protocol to measure total insect biomass
> using
> >Malaise traps, deployed over 27 years in 63 nature protection areas in
> Germany
> >(96 unique location-year combinations) to infer on the status and trend
> of local
> >entomofauna."
> >
> >Even as a somewhat well-educated person, I sure don't know what a Malaise
> trap
> >is, or what entomofauna is. The more I trip over words and concepts like
> this, the
> >less I want to read the article. I feel like it's just...not for me.
> >
> >But Wiktionary can tell me entomofauna means "insect fauna," [3] and
> Wikipedia
> >can show me a picture of a Malaise trap (it looks like a tent, turns out)
> [4].
> >
> >We're going to bring those kinds of descriptions and definitions right
> next to the
> >text, so it will feel a bit more like this article IS for me. This isn't
> going to make
> >the article magically easy to understand, but we think it will help open
> a door that
> >makes engaging with the literature a bit more inviting. Our early tests
> with this
> >are very promising.
> >
> >That said, we're certainly going to be iterating on it a lot, and we're
> not actually
> >attached to any particular implementation details. The goal is to help
> laypeople
> >access the literature, and do it responsibly. If this turns out to be
> impossible with
> >this approach, then we'll move on to another one.
> >
> >For us, the key to the Explanation Engine idea is to be modular and
> flexible, using
> >multiple layered techniques, in order to reduce risk and increase speed.
> >
> >
> >[1] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0185809
> >[2] https://www.altmetric.com/details/27610705
> >[3] https://en.wiktionary.org/wiki/entomofauna
> >[4] https://en.wikipedia.org/wiki/Malaise_trap
> >
> >
> >
> >
> >Q2: who made the decision that this is safe, and how was this decision
> made?
> >
> >Hm, perhaps I should've been more careful in my original statement.
> Apologies.
> >There's certainly no formal Decision here...I'm just suggesting that we
> think the
> >risk of spreading misinformation is relatively low with this approach.
> That's why
> >we'll start there. But the proof will need to be in the pudding, of
> course. We'll
> >need to implement this, test it, and so on.
> >
> >Maybe I'm wrong and this is actually a horrible, dangerous idea.
> >
> >If so, we'll find out, and take it from there. Thanks for letting us know
> you are
> >concerned it's not safe. We'll take that seriously and so we'll make sure
> we are
> >evaluating this feature carefully. If you're interested in helping with
> that, we'd
> >love to have your input as well...drop me a line off-list and we can talk
> about how
> >to work together on it.
> >
> >
> >If the author has not given permission, this is a violation of the
> author's moral
> >rights under copyright. This includes all CC licensed works except CC-0.
> >
> >I'm not sure I see how this would be true? We are not modifying the text
> or
> >failing to give credit to the original author, but rather creating a
> commentary on
> >it...quite like one might do if discussing the paper in a journal club.
> >
> >I am not opposed to your project, just the assumption that a two-year
> project is
> >sufficient to create a real-world system to translate all scholarly
> knowledge for
> >the lay reader.
> >
> >Makes sense. You may be right...could be a quixotic errand. We will do
> our best,
> >and hopefully whatever we come up with will be a step in the right
> direction, at
> >least. I think something like this could make the world a better place,
> and maybe
> >if we aren't able to achieve it we can at least help give some ideas to
> the people
> >who ultimately do.
> >
> >
> > A cautious and iterative approach is wise; however this is not feasible
> in the
> >context of a two-year grant. May I suggest a small pilot project? Try
> this with a
> >few articles in an area where at least one member of your team has a
> doctorate.
> >Take the time to evaluate the summaries. If they look okay to your team,
> plan a
> >larger evaluation project involving other experts and the lay readers you
> are
> >aiming to engage (because what an expert thinks a summary says may not be
> the
> >same as how a non-expert would interpret the same summary).
> >
> >I think this sounds great! Your plan is very much what we have in mind to
> do. And
> >then we will continue from there on the "cautious iterative approach" to
> rolling
> >out features. I think the only area where we differ is in the
> timeline...sounds like
> >you don't project that we can get everything we need to done in a
> two-year time
> >frame.
> >
> >You may be right. Time will tell. Historically, Impactstory has been able
> to get
> >stuff done pretty fast, but once again, the proof will be in the pudding
> 😃. We're
> >certainly excited and motivated and will be doing our best!
> >
> >
> >
> >Thank you for posting openly about the approach and for the opportunity to
> >comment.
> >
> >Thank you for your thoughtful comments!
> >j
> >
> >
> >best,
> >
> >Heather Morrison
> >Associate Professor, School of Information Studies, University of Ottawa
> >Professeur Agrégé, École des Sciences de l'Information, Université
> d'Ottawa
> >heather.morri...@uottawa.ca<mailto:heather.morri...@uottawa.ca>
> >https://uniweb.uottawa.ca/?lang=en#/members/706
> >
> >_
> >
> >_______________________________________________
> >GOAL mailing list
> >GOAL@eprints.org<mailto:GOAL@eprints.org>
> >http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
> >
> >
> >
> >
> >--
> >Jason Priem, co-founder
> >Impactstory<http://impactstory.org/>: We make tools to power the Open
> >Science revolution follow at @jasonpriem<http://twitter.com/jasonpriem>
> and
> >@impactstory<http://twitter.com/impactstory>
> >
>
>



-- 
Jason Priem, co-founder
Impactstory <http://impactstory.org/>: We make tools to power the Open
Science revolution
follow at @jasonpriem <http://twitter.com/jasonpriem> and @impactstory
<http://twitter.com/impactstory>
