Cool, thanks! I read this a while ago, rereading again.

On Tue, Jan 15, 2019 at 3:28 AM Sebastian Hellmann <hellm...@informatik.uni-leipzig.de> wrote:
Hi all,

let me send you a paper from 2013, which might either help directly or at least give you some ideas...

A lemon lexicon for DBpedia. Christina Unger, John McCrae, Sebastian Walter, Sara Winter, Philipp Cimiano. 2013. Proceedings of the 1st International Workshop on NLP and DBpedia, co-located with the 12th International Semantic Web Conference (ISWC 2013), October 21-25, Sydney, Australia.

https://github.com/ag-sc/lemon.dbpedia
https://pdfs.semanticscholar.org/638e/b4959db792c94411339439013eef536fb052.pdf

Since the mappings from DBpedia to Wikidata properties are here:
http://mappings.dbpedia.org/index.php?title=Special:AllPages&namespace=202
(e.g. http://mappings.dbpedia.org/index.php/OntologyProperty:BirthDate),
you could directly use the DBpedia-lemon lexicalisation for Wikidata.

The mappings can be downloaded with:

    git clone https://github.com/dbpedia/extraction-framework
    cd extraction-framework/core
    ../run download-mappings

All the best,
Sebastian
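A minimal sketch of the reuse described above, assuming the DBpedia ontology publishes its Wikidata equivalences as owl:equivalentProperty triples on the public SPARQL endpoint (both the endpoint and the predicate are assumptions here, not details from the thread). The resulting map would let the lemon.dbpedia lexical entries, which are keyed on dbo: properties, be looked up for Wikidata property IDs:

    # Sketch: build a Wikidata-property -> DBpedia-property map, so that
    # lexicalisations written for dbo: properties can be reused for the
    # corresponding Wikidata statements. Endpoint and predicate are assumptions.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setReturnFormat(JSON)
    sparql.setQuery("""
        PREFIX owl: <http://www.w3.org/2002/07/owl#>
        SELECT ?dbp ?wd WHERE {
          ?dbp owl:equivalentProperty ?wd .
          FILTER(STRSTARTS(STR(?dbp), "http://dbpedia.org/ontology/"))
          FILTER(STRSTARTS(STR(?wd), "http://www.wikidata.org/entity/P"))
        }
    """)
    results = sparql.query().convert()

    # Key by the Wikidata property id (e.g. "P569"); if the equivalence is
    # declared as assumed above, "P569" should point at dbo:birthDate.
    wd_to_dbo = {
        row["wd"]["value"].rsplit("/", 1)[-1]: row["dbp"]["value"]
        for row in results["results"]["bindings"]
    }
    print(wd_to_dbo.get("P569"))

The same table could of course be derived offline from the downloaded mappings instead of querying the endpoint.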
On 14.01.19 18:34, Denny Vrandečić wrote:

Felipe,

thanks for the kind words.

There are a few research projects that use Wikidata to generate parts of Wikipedia articles - see for example https://arxiv.org/abs/1702.06235, which is almost as good as human results and beats templates by far, but only for the first sentence of biographies.

Lucie Kaffee also has quite a body of research on that topic, and has worked very successfully and closely with some Wikipedia communities on these questions. Here's her bibliography: https://scholar.google.com/citations?user=xiuGTq0AAAAJ&hl=de

Another project of hers is currently under review for a grant: https://meta.wikimedia.org/wiki/Grants:Project/Scribe:_Supporting_Under-resourced_Wikipedia_Editors_in_Creating_New_Articles - I would suggest taking a look and, if you are so inclined, expressing support. It is totally worth it!

My opinion is that these projects are great for starters and should be done (low-hanging fruit and all that), but they won't get much further, at least for a while, mostly because Wikidata rarely offers more than a skeleton of content. A decent Wikipedia article will include much, much more content than what is represented in Wikidata, and if you only use that as input, you're limiting yourself too much.

Here's a different approach based on summarization over input sources: https://www.wired.com/story/using-artificial-intelligence-to-fix-wikipedias-gender-problem/ - this is a more promising approach for the short to mid term.

I still maintain that the Abstract Wikipedia approach has certain advantages over both learned approaches, and it is most aligned with Lucie's work. The machine-learned approaches always fall short on the dimension of editability, due to the black-box nature of their solutions.

I also agree with Jeblad.

That leaves the question: why is there not more discussion? Maybe because there is nothing substantial to discuss yet :) The two white papers are rather high-level and the idea is not concrete enough yet, so I wouldn't expect too much discussion to be going on on-wiki. It was similar with Wikidata - the number of people who discussed Wikidata at this level of maturity was tiny; it increased considerably once an actual design plan was suggested, but still remained small - and then it exploded once the system was deployed. I would be surprised and delighted if we managed to avoid this pattern this time, but I can't do more than publicly present the idea, announce plans once they are there, and hope for a timely discussion :)

Cheers,
Denny

On Mon, Jan 14, 2019 at 2:54 AM John Erling Blad <jeb...@gmail.com> wrote:

An additional note: what Wikipedia urgently needs is a way to create and reuse canned text (aka "templates"), and a way to adapt that text to data from Wikidata. That is mostly just inflection rules, but in some cases it involves grammar rules. Creating larger pieces of text is much harder, especially if the text is supposed to be readable. Jumbling sentences together, as is commonly done by various bot scripts, does not work very well - or rather, it does not work at all.
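A toy illustration of that canned-text idea: a reusable sentence template filled with Wikidata-style values, where the only language-specific logic is an inflection rule. The property IDs are real Wikidata properties; the template shape, the rule, and the example values are made up for the purpose.

    # Toy sketch of canned text adapted to Wikidata data. P17 (country) and
    # P1082 (population) are real Wikidata properties; everything else here
    # is illustrative.
    def render_city_sentence(item: dict) -> str:
        """Fill a reusable sentence template and apply a minimal inflection rule."""
        population = item["P1082"]
        # The only language-specific logic: number agreement of the unit noun.
        unit = "inhabitant" if population == 1 else "inhabitants"
        return f'{item["label"]} is a city in {item["P17"]} with {population:,} {unit}.'

    # Example values (illustrative, not taken from Wikidata):
    print(render_city_sentence({"label": "Exampleville", "P17": "Norway", "P1082": 180000}))
    # -> "Exampleville is a city in Norway with 180,000 inhabitants."

For languages with richer morphology this grows into case, gender, and number agreement rules, which is exactly where it stops being "just templates".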
On Mon, Jan 14, 2019 at 11:44 AM John Erling Blad <jeb...@gmail.com> wrote:

Using an abstract language as a basis for translation has been tried before, and it is almost as hard as translating between two common languages.

There are two really hard problems: implied references and cultural context. An artificial language can get rid of the implied references, but it tends to create very weird and unnatural expressions. If the cultural context is removed, it can be extremely hard to put it back in, and without any cultural context it can be hard to explain anything.

But yes, you can make an abstract language; it just won't give you any high-quality prose.

On Mon, Jan 14, 2019 at 8:09 AM Felipe Schenone <scheno...@gmail.com> wrote:

This is quite an awesome idea. But thinking about it, wouldn't it be possible to use the structured data in Wikidata to generate articles? Can't we skip the need to learn an abstract language by using Wikidata?

Also, is there discussion about this idea anywhere on the Wikimedia wikis? I haven't found any...

On Sat, Sep 29, 2018 at 3:44 PM Pine W <wiki.p...@gmail.com> wrote:

Forwarding because this (ambitious!) proposal may be of interest to people on other lists. I'm not endorsing the proposal at this time, but I'm curious about it.

Pine
( https://meta.wikimedia.org/wiki/User:Pine )

---------- Forwarded message ---------
From: Denny Vrandečić <vrande...@gmail.com>
Date: Sat, Sep 29, 2018 at 6:32 PM
Subject: [Wikimedia-l] Wikipedia in an abstract language
To: Wikimedia Mailing List <wikimedi...@lists.wikimedia.org>

Semantic Web languages make it possible to express ontologies and knowledge bases in a way meant to be particularly amenable to the Web. Ontologies formalize the shared understanding of a domain. But the most expressive and widespread languages that we know of are human natural languages, and the largest knowledge base we have is the wealth of text written in human languages.

We look for a path to bridge the gap between knowledge representation languages such as OWL and human natural languages such as English. We propose a project that simultaneously exposes that gap, allows collaboration on closing it, makes progress widely visible, and is highly attractive and valuable in its own right: a Wikipedia written in an abstract language, to be rendered into any natural language on request. This would make current Wikipedia editors about 100x more productive and increase the content of Wikipedia by 10x. For billions of users this will unlock knowledge they currently do not have access to.

My first talk on this topic will be on October 10, 2018, 16:45-17:00, at the Asilomar in Monterey, CA, during the Blue Sky track of ISWC. My second, longer talk on the topic will be at the DL workshop in Tempe, AZ, October 27-29. Comments are very welcome as I prepare the slides and the talk.

Link to the paper: http://simia.net/download/abstractwikipedia.pdf

Cheers,
Denny

--
All the best,
Sebastian Hellmann

Director of Knowledge Integration and Linked Data Technologies (KILT) Competence Center
at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org