ArielGlenn added a subscriber: hoo.
ArielGlenn added a comment.
I am proactively adding @hoo as he can provide some insight and perhaps tag
others as well.
TASK DETAIL
https://phabricator.wikimedia.org/T209390
EMAIL PREFERENCES
Sascha added a comment.
Hm, good point. Could the dumps be made consistent? Maybe like this: Before
starting a dump, find the current last revision; pass this cut-off revision ID
to the dumping shards; change the dump-producing code
Mitar added a comment.
Are you sure `lastrevid` works like that for the whole dump? I think the
dump is made from multiple shards, so `lastrevid` might not be consistent
across all items?
Sascha added a comment.
To find the timestamp of the last Wikidata change that went into a dump file,
couldn’t one, while processing the dump, extract the entity and revision ID
with the highest `lastrevid` value in the entire dump, and then retrieve the
corresponding `modified` timestamp
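A minimal sketch of that scan, assuming the JSON dump layout of one entity object per line (with a trailing comma) wrapped in `[` and `]` lines, and assuming each entity carries `id`, `lastrevid`, and possibly `modified`. The function name `find_latest_revision` is made up for illustration, and this does not address whether `lastrevid` is consistent across shards.

```python
import json
from typing import Iterable, Optional, Tuple


def find_latest_revision(
    lines: Iterable[str],
) -> Optional[Tuple[str, int, Optional[str]]]:
    """Return (entity id, lastrevid, modified) for the highest lastrevid seen.

    Assumes one JSON entity per line, as in the line-oriented JSON dump.
    `modified` is None when the entity does not carry that field.
    """
    best = None
    for raw in lines:
        # Drop surrounding whitespace and the comma that separates array items.
        line = raw.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue  # skip the array brackets and blank lines
        entity = json.loads(line)
        revid = entity.get("lastrevid")
        if revid is None:
            continue  # assumption: entities without lastrevid are ignored
        if best is None or revid > best[1]:
            best = (entity["id"], revid, entity.get("modified"))
    return best
```

For a compressed dump, the lines could be fed from something like `gzip.open(path, "rt", encoding="utf-8")`, so the file is streamed rather than loaded into memory.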
Mitar added a comment.
I realized I have exactly the same need as the poster on Stack Overflow: get a
dump and then use a real-time feed to keep it updated. But you have to know
where to start with the real-time feed through EventStreams, using historical
consumption
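As a rough illustration of the starting-point problem, here is a sketch that turns a `modified`-style timestamp into an EventStreams URL for historical consumption. It assumes the public `recentchange` stream and its `since` query parameter, and that the timestamp has the form `2021-01-02T00:00:00Z`; the helper name `stream_url_since` is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumption: the public recentchange stream is the feed being resumed.
STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"


def stream_url_since(modified: str) -> str:
    """Build a stream URL that resumes from the given UTC timestamp."""
    # Parse first so that a malformed timestamp fails early with ValueError.
    ts = datetime.strptime(modified, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    query = urlencode({"since": ts.strftime("%Y-%m-%dT%H:%M:%SZ")})
    return f"{STREAM_URL}?{query}"
```

If the shards of a dump are not cut at the same revision, starting somewhat earlier than the computed timestamp and skipping already-applied revisions may be the safer choice.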
Mitar added a comment.
Personally, I would love to have for each item in the dump a timestamp when
it was created and a timestamp when it was last modified.
Related: https://phabricator.wikimedia.org/T278031
Restricted Application added a project: wdwb-tech.
TASK DETAIL
https://phabricator.wikimedia.org/T209390
EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: Mitar
Cc: Mitar, ArielGlenn, Smalyshev, Addshore, Invadibot, maantietaja, jannee_e,
Akuckartz,