Re: [Wikitech-l] Wikipedia database

2009-11-19 Thread Roan Kattouw
2009/11/19  :
> Greeting,
>
> May I ask the question about wikipedia database. I downloaded the Wikipedia
> revision current data. and found there are some records have the exactly
> same rev_id, rev_user and same timestamp. What does it mean? are they the
> same edit or different?
>
If they belong to the same wiki, they're very likely to be the same
edit. Of course such duplicates should theoretically not occur.

Roan Kattouw (Catrope)

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Wikipedia database

2009-11-19 Thread zh509
On Nov 19 2009, Roan Kattouw wrote:

>2009/11/19  :
>> Greeting,
>>
>> May I ask the question about wikipedia database. I downloaded the 
>> Wikipedia revision current data. and found there are some records have 
>> the exactly same rev_id, rev_user and same timestamp. What does it mean? 
>> are they the same edit or different?
>>
>If they belong to the same wiki, they're very likely to be the same
>edit. Of course such duplicates should theoretically not occur.
>
>Roan Kattouw (Catrope)
>

Thanks. I noticed this because I joined the revision table and the page table. 
May I ask why there are two identical records for the same page.page_latest? 
Is the link between revision and page rev_id = page.page_latest?

thanks. 

Zeyi



Re: [Wikitech-l] Wikipedia database

2009-11-19 Thread Platonides
zh...@york.ac.uk wrote:
> On Nov 19 2009, Roan Kattouw wrote:
> 
>> 2009/11/19  :
>>> Greeting,
>>>
>>> May I ask the question about wikipedia database. I downloaded the 
>>> Wikipedia revision current data. and found there are some records have 
>>> the exactly same rev_id, rev_user and same timestamp. What does it mean? 
>>> are they the same edit or different?
>>>
>> If they belong to the same wiki, they're very likely to be the same
>> edit. Of course such duplicates should theoretically not occur.
>>
>> Roan Kattouw (Catrope)
>>
> 
> Thanks, I noted that because i add Revision Table and Page table together. 
> May I ask why for the same page.page_latest, there are two same records on 
> the table? Is that the link between revision and Page is the 
> rev_id=page.page_latest?

page.page_latest points to the current revision.rev_id.


However, you shouldn't be able to have several revisions with the same
rev_id. Even if something went horribly wrong at the wiki level, rev_id
is a PRIMARY KEY.
How did you do the import?
I suspect you may have broken something importing or merging.
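
One way to check whether an import really produced duplicate keys is to group on the column and count. The following is a minimal sketch, assuming the data is reachable through a Python DB-API connection; it uses an in-memory SQLite table as a stand-in for the Oracle import, and the helper names find_duplicates and build_demo_db as well as the sample rows are hypothetical. The column names page_id, page_namespace and page_latest are the standard MediaWiki ones.

```python
import sqlite3


def find_duplicates(conn, table, column):
    """Return (value, count) pairs for values of `column` that occur more than once."""
    # Identifiers are interpolated into the SQL text, so pass trusted names only.
    query = (
        f"SELECT {column}, COUNT(*) FROM {table} "
        f"GROUP BY {column} HAVING COUNT(*) > 1 ORDER BY {column}"
    )
    return conn.execute(query).fetchall()


def build_demo_db():
    """Build a tiny page table WITHOUT key constraints (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    # No PRIMARY KEY on purpose: this simulates an import that dropped constraints.
    conn.execute(
        "CREATE TABLE page (page_id INTEGER, page_namespace INTEGER, page_latest INTEGER)"
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [(1, 0, 101), (2, 1, 102), (2, 1, 102)],  # page 2 was loaded twice
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(find_duplicates(demo, "page", "page_latest"))
    print(find_duplicates(demo, "page", "page_id"))
```

If page_id shows up as duplicated too, the rows were probably loaded twice during the import, rather than two different pages sharing one page_latest.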





Re: [Wikitech-l] Wikipedia database

2009-11-20 Thread zh509
On Nov 19 2009, Platonides wrote:

>zh...@york.ac.uk wrote:
>> On Nov 19 2009, Roan Kattouw wrote:
>> 
>>> 2009/11/19  :
>>>> Greeting,
>>>>
>>>> May I ask the question about wikipedia database. I downloaded the 
>>>> Wikipedia revision current data. and found there are some records have 
>>>> the exactly same rev_id, rev_user and same timestamp. What does it 
>>>> mean? are they the same edit or different?
>>>>
>>> If they belong to the same wiki, they're very likely to be the same
>>> edit. Of course such duplicates should theoretically not occur.
>>>
>>> Roan Kattouw (Catrope)
>>>
>> 
>> Thanks, I noted that because i add Revision Table and Page table 
>> together. May I ask why for the same page.page_latest, there are two 
>> same records on the table? Is that the link between revision and Page is 
>> the rev_id=page.page_latest?
>
>page.page_latest point to the current revision.rev_id
>
>
>However, you shouldn't be able to have several revisions with the same
>rev_id. Even if something went horribly wrong at the wiki level, rev_id
>is a PRIMARY KEY.
>How did you do the import?
>I suspect you may have broken something importing or merging.
>

I took the sub-current data from MediaWiki and imported it into Oracle. I 
found two rows in the page table with the same page_latest ID. When I then 
joined the revision table and the page table, this produced two rows with 
the same rev_id.

May I ask why I have two identical page_latest values in the page table, and 
what that means? If I want to combine the revision table and the page table, 
which columns should I join on?

thanks,
Zeyi


Re: [Wikitech-l] Wikipedia database

2009-11-20 Thread Platonides
Zeyi wrote:
> I took the sub-current data from MediaWiki and import them to Oracle. 
Which tool did you use for the import?

> I found there are two same page_latest ID in the page table. Then when I 
> tried to join Revision table and Page table together, this caused two same 
> rev_id.

Which pages are those?


> May I ask why I have two page_latest on page table, what it mean? If I want 
> to put Revision table and Page table together, which should be the link 
> point?

You shouldn't have that situation.
And why are you merging page and revision, anyway?

> thanks,
> Zeyi




Re: [Wikitech-l] Wikipedia database

2009-11-21 Thread zh509
On Nov 20 2009, Platonides wrote:

>Zeyi wrote:
>> I took the sub-current data from MediaWiki and import them to Oracle. 
>Which tool did you use for the import?
>
I used the xml2sql tool, which is easy to use.

>> I found there are two same page_latest ID in the page table. Then when 
>> I tried to join Revision table and Page table together, this caused two 
>> same rev_id.
>
>Which pages are those?
All kinds of pages. Is the page_latest ID unique?
>
>
>> May I ask why I have two page_latest on page table, what it mean? If I 
>> want to put Revision table and Page table together, which should be the 
>> link point?
>
>You shouldn't have that situation.
>And why are you merging page and revision, anyway?

I need to use rev_user and page_namespace for a cross-analysis. How can I 
put them in one table? Thanks again.

>> thanks,
>> Zeyi


Re: [Wikitech-l] Wikipedia database

2009-11-21 Thread Roan Kattouw
2009/11/21  :
> I need use rev_user and page_namespace to do crossing-analysis. How i can
> put them in the one table? thanks again.
>
You don't need to put them in one table, just use a query with a JOIN.

Roan Kattouw (Catrope)
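
A minimal sketch of the JOIN suggested above, assuming the standard MediaWiki columns page.page_id and revision.rev_page as the link between the two tables. It uses an in-memory SQLite database as a stand-in for the real import; the sample rows and the function name edits_per_user_and_namespace are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: two pages (namespaces 0 and 1) and three revisions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (
    page_id INTEGER PRIMARY KEY,
    page_namespace INTEGER,
    page_latest INTEGER
);
CREATE TABLE revision (
    rev_id INTEGER PRIMARY KEY,
    rev_page INTEGER,
    rev_user INTEGER
);
INSERT INTO page VALUES (1, 0, 11), (2, 1, 12);
INSERT INTO revision VALUES (10, 1, 7), (11, 1, 8), (12, 2, 7);
""")


def edits_per_user_and_namespace(conn):
    """Count revisions per (rev_user, page_namespace) without merging the tables."""
    return conn.execute("""
        SELECT r.rev_user, p.page_namespace, COUNT(*)
        FROM revision AS r
        JOIN page AS p ON r.rev_page = p.page_id
        GROUP BY r.rev_user, p.page_namespace
        ORDER BY r.rev_user, p.page_namespace
    """).fetchall()


if __name__ == "__main__":
    for row in edits_per_user_and_namespace(conn):
        print(row)
```

Joining on rev_page = page_id matches every revision to its page. Joining on rev_id = page_latest instead keeps only the current revision of each page.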



Re: [Wikitech-l] Wikipedia database

2009-11-23 Thread zh509
Thanks, but is page_latest unique in the page table?

On Nov 21 2009, Roan Kattouw wrote:

>2009/11/21  :
>> I need use rev_user and page_namespace to do crossing-analysis. How i can
>> put them in the one table? thanks again.
>>
>You don't need to put them in one table, just use a query with a JOIN.
>
>Roan Kattouw (Catrope)


Re: [Wikitech-l] Wikipedia database

2009-11-23 Thread Roan Kattouw
2009/11/23  :
> Thanks. but is that page_latest is unique in page table?
>
Yes. Every revision belongs to one page only (rev_page).

Roan Kattouw (Catrope)



Re: [Wikitech-l] Wikipedia database

2010-11-25 Thread Q
On 11/25/2010 2:14 AM, Petromir Dzhunev wrote:
> Hi everyone,
> 
>  
> 
> Would you like to put in "page" table coordinates for each page(of course
> for the pages, which have coordinates)?Is it possible?
> 
> The reason I'm asking you is that we want to know, which Wikipedia pages are
> marked in Google maps.

http://en.wikipedia.org/w/api.php?action=query&list=embeddedin&eititle=Template:Coord
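
The query above can also be built programmatically. This is a small sketch that only constructs the URL and does not send a request. The helper name embeddedin_url is hypothetical, and the optional eilimit and format parameters are additions that do not appear in the URL above.

```python
from urllib.parse import urlencode

API_ENDPOINT = "http://en.wikipedia.org/w/api.php"


def embeddedin_url(template_title, limit=None, response_format=None):
    """Build a list=embeddedin query URL for pages that transclude a template."""
    params = {"action": "query", "list": "embeddedin", "eititle": template_title}
    if limit is not None:
        params["eilimit"] = limit  # maximum number of pages per response
    if response_format is not None:
        params["format"] = response_format  # e.g. "json"
    # urlencode percent-encodes the colon in "Template:Coord" as %3A.
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(embeddedin_url("Template:Coord"))
```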
