Re: [Wikidata] Concise/Notable Wikidata Dump

2020-01-07 Thread Simon Razniewski
Hi, Just wanted to express my belated support for such dumps: - We encounter the same problem in research, and for efficiency, reproducibility, and authoritativeness a centralized solution would be great. - Besides the filtering for existence in Wikipedia, I'd see much potential in

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-22 Thread Amirouche Boubekki
Hello all! On Tue, 17 Dec 2019 at 18:15, Aidan Hogan wrote: > > Hey all, > > As someone who likes to use Wikidata in their research, and likes to > give students projects relating to Wikidata, I am finding it more and > more difficult to (recommend to) work with recent versions of Wikidata >

Re: [Wikidata] Concise/Notable Wikidata Dump Wikidata Digest, Vol 97, Issue 13

2019-12-21 Thread PWN
--- > > Message: 1 > Date: Thu, 19 Dec 2019 19:15:09 -0300 > From: Aidan Hogan > To: wikidata@lists.wikimedia.org > Subject: Re: [Wikidata] Concise/Notable Wikidata Dump > Message-ID: > Content-Type: text/plain; charset=utf-8; for

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-21 Thread Sebastian Hellmann
Hi Aidan, since DBpedia has been around for twelve years now, we spent the last 3 years intensively re-engineering to solve problems like this. Last week, we finished the Virtuoso DBpedia Docker[1] to work on Databus Collections[2],[3]. Databus contains different repartitions of all the

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-21 Thread Lydia Pintscher
On Sat, Dec 21, 2019 at 6:37 PM Dan Brickley wrote: > That is also a fine place to record things! I don’t mean to fork the > discussion. Maybe we could have a call for interested parties in the new year? Yeah that sounds like a good idea. Cheers Lydia -- Lydia Pintscher -

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-21 Thread Dan Brickley
On Sat, 21 Dec 2019 at 17:25, Lydia Pintscher wrote: > On Thu, Dec 19, 2019 at 11:16 PM Aidan Hogan wrote: > > - @Lydia, good point! I was thinking that filtering by wikilinks will > > just drop some more obscure nodes (like Q51366847 for example), but had > > not considered that there are some

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-21 Thread Lydia Pintscher
On Thu, Dec 19, 2019 at 11:16 PM Aidan Hogan wrote: > - @Lydia, good point! I was thinking that filtering by wikilinks will > just drop some more obscure nodes (like Q51366847 for example), but had > not considered that there are some more general "concepts" that do not > have a corresponding

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-19 Thread Aidan Hogan
Hey all, Just a general response to all the comments thus far. - @Marco et al., regarding the WDumper by Benno, this is a very cool initiative! In fact I spotted it just *after* posting so I think this goes quite some ways towards addressing the general issue raised. - @Markus, I partially

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-19 Thread Lydia Pintscher
On Tue, Dec 17, 2019 at 7:16 PM Aidan Hogan wrote: > > Hey all, > > As someone who likes to use Wikidata in their research, and likes to > give students projects relating to Wikidata, I am finding it more and > more difficult to (recommend to) work with recent versions of Wikidata > due to the

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-19 Thread Markus Kroetzsch
Hi all, Yes, Benno's WDumper could be used for this purpose. The motivation for the whole project was very similar to what Aidan describes. We realised though that there won't be a single good way to build a smaller dump that would serve every conceivable use in research, which is why the UI

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-18 Thread James Heald
See also this recent discussion/brainstorm on "Wikidata subsetting" https://docs.google.com/document/d/1MmrpEQ9O7xA6frNk6gceu_IbQrUiEYGI9vcQjDvTL9c/edit#heading=h.7xg3cywpkgfq In a geographical context, whether or not an item has a Wikipedia entry has been contemplated as a criterion for

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-18 Thread Edgard Marx
It certainly helps; however, I think Aidan's suggestion goes in the direction of having an official dump distribution. Imagine how much CO2 can be spared just by avoiding the computational resources needed to recreate this dump every time one needs it. Besides, it standardises the dataset used for

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-18 Thread Marco Fossati
Hi everyone, Benno (in CC) has recently announced this tool: https://tools.wmflabs.org/wdumps/ I haven't checked it out yet, but it sounds related to Aidan's inquiry. Hope this helps. Cheers, Marco On 12/18/19 8:01 AM, Edgard Marx wrote: +1 On Tue, Dec 17, 2019, 19:14 Aidan Hogan

Re: [Wikidata] Concise/Notable Wikidata Dump

2019-12-17 Thread Edgard Marx
+1 On Tue, Dec 17, 2019, 19:14 Aidan Hogan wrote: > Hey all, > > As someone who likes to use Wikidata in their research, and likes to > give students projects relating to Wikidata, I am finding it more and > more difficult to (recommend to) work with recent versions of Wikidata > due to the

[Wikidata] Concise/Notable Wikidata Dump

2019-12-17 Thread Aidan Hogan
Hey all, As someone who likes to use Wikidata in their research, and likes to give students projects relating to Wikidata, I am finding it more and more difficult to (recommend to) work with recent versions of Wikidata due to the increasing dump sizes, where even the truthy version now costs