Hm, that is quite right. The data I've got do not have this info.
But I won't run such a query again soon, since this still does the
job: for now I only want to establish when somebody has left sr.wp
and to record the reason by reviewing the talk and other relevant
pages from that time.
Thank you.
Михајло Анђелковић:
> Namespaces are easily determined from the page prefix, I am not
> bothered if there are any anomalies out there (e.g. a page starting
> with "User talk:" being in NS 0)
There are no page namespace prefixes in the databases.
Thank you, guys, I've already taken what I needed.
Namespaces are easily determined from the page prefix; I am not
bothered if there are any anomalies out there (e.g. a page starting
with "User talk:" being in NS 0), and the query is lighter if ns isn't
being pulled from the DB.
Михајло Анђелковић wrote:
> I would like to ask for permission to run a query that can be
> resource-consuming if not properly scaled:
>
> SELECT page.page_title as title, rev_user_text as user, rev_timestamp
> as timestamp, rev_len as len FROM revision JOIN page ON page.page_id =
> rev_page WHERE rev_id
Михајло Анђелковић wrote:
> Hello,
>
> I would like to ask for permission to run a query that can be
> resource-consuming if not properly scaled:
>
> SELECT page.page_title as title, rev_user_text as user, rev_timestamp
> as timestamp, rev_len as len FROM revision JOIN page ON page.page_id =
> rev_page
Михајло Анђелковић:
> WHERE rev_id > 0 AND rev_id < [...] AND rev_deleted = 0;
Please check that MySQL plans this correctly (using the rev_id index).
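That kind of check can be sketched locally before touching the live replica. The snippet below uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for MySQL's `EXPLAIN` (an assumption for illustration; on the actual Toolserver MySQL instance you would run `EXPLAIN SELECT ...` and look at the `key` column) to confirm that a `rev_id` range predicate is answered by an index seek rather than a full table scan:

```python
import sqlite3

# In-memory stand-in for the revision table. Declaring rev_id as
# INTEGER PRIMARY KEY gives it an index, playing the role of the
# primary key index on rev_id in the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_len INTEGER)")
conn.executemany("INSERT INTO revision VALUES (?, ?)",
                 [(i, i * 10) for i in range(1, 1000)])

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT rev_id, rev_len FROM revision WHERE rev_id > 0 AND rev_id < 500"
).fetchall()
for row in plan:
    # The detail column should report a SEARCH (index seek),
    # not a SCAN of the whole table.
    print(row[-1])
```

If the plan reports a scan instead of a seek, the range predicate is not being served by the index and the query should be reworked before running it at scale.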
> If doing this is generally allowed, my question is how large a chunk
> of data I can take at once,
Unfortunately, the complete dumps contain lots of data I don't
actually need, and I am afraid I am not willing to inflict such an
impact on my small HDD. Moreover, they have actually been unavailable
since 10.11, which is already quite a long time.
Right now I have time for this research and I want to
2010/11/29 Михајло Анђелковић:
> This is intended to extract basic data about all publicly visible
> revisions from 1 to [...]. Info about each revision would be a 4-tuple
> title/user name/time/length. I need this data to start generating a
> timeline of editing of srwiki, so it is intended to be
Hello,
I would like to ask for permission to run a query that can be
resource-consuming if not properly scaled:
SELECT page.page_title as title, rev_user_text as user, rev_timestamp
as timestamp, rev_len as len FROM revision JOIN page ON page.page_id =
rev_page WHERE rev_id > 0 AND rev_id < [...] AND re
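The chunked extraction being negotiated in this thread could be driven by a small script along the following lines. This is only a sketch: the 10 000-row chunk size is a hypothetical value (the thread leaves the allowed size open), and an in-memory SQLite database stands in for the Toolserver replica so the example is self-contained.

```python
import sqlite3

CHUNK = 10_000  # hypothetical chunk size; use whatever the admins allow


def fetch_revisions(conn, max_rev_id):
    """Yield title/user/timestamp/len 4-tuples in rev_id chunks,
    mirroring the query quoted in the thread."""
    for lo in range(0, max_rev_id, CHUNK):
        hi = min(lo + CHUNK, max_rev_id)
        yield from conn.execute(
            "SELECT page.page_title AS title, rev_user_text AS user, "
            "       rev_timestamp AS timestamp, rev_len AS len "
            "FROM revision JOIN page ON page.page_id = rev_page "
            "WHERE rev_id > ? AND rev_id <= ? AND rev_deleted = 0",
            (lo, hi),
        )


# Tiny stand-in database so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT)")
conn.execute(
    "CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER, "
    "rev_user_text TEXT, rev_timestamp TEXT, rev_len INTEGER, "
    "rev_deleted INTEGER)"
)
conn.execute("INSERT INTO page VALUES (1, 'Main_Page')")
conn.execute("INSERT INTO revision VALUES (1, 1, 'Mihajlo', '20101129000000', 1234, 0)")

for tup in fetch_revisions(conn, max_rev_id=50_000):
    print(tup)
```

Splitting the `rev_id` range into bounded windows keeps each individual query cheap, which is exactly the "properly scaled" concern raised above.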