Re: [Wikidata-l] Fwd: [Wikidata-tech] Description(s)

2014-10-22 Thread Daniel Kinzler
On 22.10.2014 11:06, Daniel Kinzler wrote:
> Hi Lukas!
> 
> That really shouldn't happen...
> 
> Can you tell me on which item that happens?
> Also, please double-check the namespace and content model of the respective
> entry in the dump.

Never mind, I found it in the dump. Can't reproduce, though. Strange.

Filed https://bugzilla.wikimedia.org/show_bug.cgi?id=72348

-- daniel

-- 
Daniel Kinzler
Senior Software Developer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.

___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] Fwd: [Wikidata-tech] Description(s)

2014-10-22 Thread Daniel Kinzler
On 22.10.2014 07:29, Gerard Meijssen wrote:
> Hoi,
> Is this dump going to be cleaned up? Will the next dump be good? Why did this
> go wrong?

Frankly, we have no idea why this is going wrong. I cannot reproduce the problem
locally, and it seems to work fine with Special:Export.

Dump generation is a bit strange and wonderful, and few people actually know in
detail how it works on the live cluster. I vaguely remember that at one point,
only new revisions were dumped, and the result "stitched" into old dumps. That
would explain the issue - and it would be something we cannot fix on the
Wikibase side. I'm trying to get hold of someone who can confirm/fix this.

I have filed a bug report so this gets tracked. I'll also bring it up in our
next call with the foundation.

-- daniel





Re: [Wikidata-l] Fwd: [Wikidata-tech] Description(s)

2014-10-22 Thread Daniel Kinzler
Hi Lukas!

That really shouldn't happen...

Can you tell me on which item that happens?
Also, please double-check the namespace and content model of the respective
entry in the dump.
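
A minimal sketch of such a check, assuming the dump follows the usual MediaWiki
XML export layout with <ns> under <page> and <model>/<format> under <revision>.
The helper names local() and page_info() are made up for illustration, and the
title to look up is whichever item shows the problem.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace; the export schema version varies between dumps."""
    return tag.rsplit("}", 1)[-1]


def page_info(stream, wanted_title):
    """Return (ns, model, format) for the first page whose <title> matches.

    For pages with several revisions, the values of the last revision win.
    Returns None if no page with that title is found.
    """
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        # Flatten all descendant elements of this <page> into a tag -> text map.
        info = {local(child.tag): child.text for child in elem.iter()}
        if info.get("title") == wanted_title:
            return info.get("ns"), info.get("model"), info.get("format")
        elem.clear()  # drop the page's children to keep memory use down
    return None


# Possible usage against the dump named in this thread (title is an example):
#   import bz2
#   with bz2.open("wikidatawiki-20141009-pages-articles.xml.bz2", "rb") as f:
#       print(page_info(f, "Q24"))
```

The parse is streaming, so it should cope with the full dump, although it stops
only once the requested title is reached.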

-- daniel

On 21.10.2014 17:02, Lukas Benedix wrote:
> Different keys can still be found in the current XML dump
> wikidatawiki-20141009-pages-articles.xml.bz2.
>
> This bug/feature is also present in the current dump with history.
>
> page_id   wd_id    keys
> 111       Q15      ['aliases', 'claims', 'descriptions', 'id', 'labels', 'sitelinks', 'type']
> 137       Q24      ['aliases', 'claims', 'description', 'entity', 'label', 'links']
> 31500     Q28119   ['aliases', 'description', 'entity', 'label', 'links']
> 225144    ?        ['entity', 'redirect']
> 3916689   P6       ['aliases', 'claims', 'datatype', 'descriptions', 'id', 'labels', 'type']
> 3916937   P10      ['aliases', 'claims', 'datatype', 'description', 'entity', 'label']
> 
> 
> Lukas
> 
> On Thu 09.10.2014 19:32, Lydia Pintscher wrote:
>> On Thu, Oct 9, 2014 at 3:19 PM, Magnus Manske wrote:
>>> I managed to do the task at hand by switching to JSON dumps (because
>>> that's the new, officially supported, long-term-stable Wikidata dump
>>> format, right? Right???), so no hurry there.
>>> 
>>> Maybe the XML dump process was run in the middle of the switch to the
>>> new format, or got a stale cache for some items?
>> 
>> It looks like the switch happened in the middle of a dump creation so 
>> this one is half old and half new format mixed. The ones after that 
>> should be all new format. And yay for switching to JSON!
>> 
>> 
>> Cheers Lydia
>> 




Re: [Wikidata-l] Fwd: [Wikidata-tech] Description(s)

2014-10-21 Thread Gerard Meijssen
Hoi,
Is this dump going to be cleaned up? Will the next dump be good? Why did
this go wrong?
Thanks,
 GerardM

On 21 October 2014 17:02, Lukas Benedix wrote:

> Different keys can still be found in the current XML dump
> wikidatawiki-20141009-pages-articles.xml.bz2.
>
> This bug/feature is also present in the current dump with history.
>
> page_id   wd_id    keys
> 111       Q15      ['aliases', 'claims', 'descriptions', 'id', 'labels', 'sitelinks', 'type']
> 137       Q24      ['aliases', 'claims', 'description', 'entity', 'label', 'links']
> 31500     Q28119   ['aliases', 'description', 'entity', 'label', 'links']
> 225144    ?        ['entity', 'redirect']
> 3916689   P6       ['aliases', 'claims', 'datatype', 'descriptions', 'id', 'labels', 'type']
> 3916937   P10      ['aliases', 'claims', 'datatype', 'description', 'entity', 'label']
>
>
> Lukas
>
> On Thu 09.10.2014 19:32, Lydia Pintscher wrote:
> > On Thu, Oct 9, 2014 at 3:19 PM, Magnus Manske wrote:
> >> I managed to do the task at hand by switching to JSON dumps (because that's
> >> the new, officially supported, long-term-stable Wikidata dump format, right?
> >> Right???), so no hurry there.
> >>
> >> Maybe the XML dump process was run in the middle of the switch to the new
> >> format, or got a stale cache for some items?
> >
> > It looks like the switch happened in the middle of a dump creation so
> > this one is half old and half new format mixed. The ones after that
> > should be all new format. And yay for switching to JSON!
> >
> >
> > Cheers
> > Lydia
> >


[Wikidata-l] Fwd: [Wikidata-tech] Description(s)

2014-10-21 Thread Lukas Benedix
Different keys can still be found in the current XML dump
wikidatawiki-20141009-pages-articles.xml.bz2.

This bug/feature is also present in the current dump with history.

page_id   wd_id    keys
111       Q15      ['aliases', 'claims', 'descriptions', 'id', 'labels', 'sitelinks', 'type']
137       Q24      ['aliases', 'claims', 'description', 'entity', 'label', 'links']
31500     Q28119   ['aliases', 'description', 'entity', 'label', 'links']
225144    ?        ['entity', 'redirect']
3916689   P6       ['aliases', 'claims', 'datatype', 'descriptions', 'id', 'labels', 'type']
3916937   P10      ['aliases', 'claims', 'datatype', 'description', 'entity', 'label']
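
For anyone who wants to sort pages by serialization, here is a rough sketch
that guesses the format from the top-level keys listed above. The split into
"old" and "new" key sets is an assumption drawn from this table and from the
format switch mentioned in the quoted text below; classify_entity() is a
hypothetical helper, not part of Wikibase.

```python
import json

# Keys seen only in one of the two shapes in the table above. 'aliases',
# 'claims' and 'datatype' appear in both shapes, so they are left out.
# Which set is "old" and which is "new" is an assumption.
OLD_FORMAT_KEYS = {"entity", "label", "description", "links"}
NEW_FORMAT_KEYS = {"id", "type", "labels", "descriptions", "sitelinks"}


def classify_entity(text):
    """Guess the serialization of a page's JSON text from its top-level keys."""
    data = json.loads(text)
    if not isinstance(data, dict):
        return "unknown"
    keys = set(data)
    if "redirect" in keys:
        return "redirect"
    has_old = bool(keys & OLD_FORMAT_KEYS)
    has_new = bool(keys & NEW_FORMAT_KEYS)
    if has_new and not has_old:
        return "new"
    if has_old and not has_new:
        return "old"
    return "unknown"  # no distinguishing keys, or a mix of both shapes
```

Run over the <text> of each page, this would let one count how many pages of
each kind a given dump contains.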


Lukas

On Thu 09.10.2014 19:32, Lydia Pintscher wrote:
> On Thu, Oct 9, 2014 at 3:19 PM, Magnus Manske wrote:
>> I managed to do the task at hand by switching to JSON dumps (because that's
>> the new, officially supported, long-term-stable Wikidata dump format, right?
>> Right???), so no hurry there.
>>
>> Maybe the XML dump process was run in the middle of the switch to the new
>> format, or got a stale cache for some items?
> 
> It looks like the switch happened in the middle of a dump creation so
> this one is half old and half new format mixed. The ones after that
> should be all new format. And yay for switching to JSON!
> 
> 
> Cheers
> Lydia