Hi Christophe,

You can find more information about the Live Extraction and how to set
it up here:
https://github.com/dbpedia/extraction-framework/wiki/DBpedia-live
I've written some scripts to automate the process, and I can also set it
up for you or guide you through it.
If you are the maintainer of a DBpedia language chapter, you can request
OAI proxy access from Dimitris.
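
For the OAI part, this is roughly what the update feed looks like once
you have proxy access. A minimal Python sketch; the endpoint URL and the
metadata prefix below are placeholders, the real values come with the
access Dimitris grants you:

    # Poll an OAI-PMH endpoint for pages changed since a given timestamp.
    # OAI_ENDPOINT is a placeholder; the real proxy URL comes with the
    # access granted by Dimitris.
    import urllib.parse
    import urllib.request

    OAI_ENDPOINT = "https://oai-proxy.example.org/frwiki"  # placeholder
    params = {
        "verb": "ListRecords",           # standard OAI-PMH request
        "metadataPrefix": "mediawiki",   # assumption: proxy serves full page XML
        "from": "2014-04-06T00:00:00Z",  # only changes since this timestamp
    }

    url = OAI_ENDPOINT + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        print(response.read()[:500])  # each <record> element is one changed page

The live extraction keeps polling such a feed and re-extracts every page
it sees, which is what keeps the triple store in sync.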

The approach with the incremental dumps is intriguing: since it doesn't
require access to the OAI-PMH stream, it would let a lot more people
host experimental clones of the system. However, we might get many more
extraction errors using it:

"*Here's the big fat disclaimer.*

/This service is experimental./ At any time it may not be working, for a
day, a week or a month. It is not intended to replace the full XML
dumps. We don't expect users to be able to construct full dumps of a
given date from the incrementals and an older dump. We don't guarantee
that the data included in these dumps is complete, or correct, or won't
break your Xbox. In short: don't blame us (but do get on the email list
and send mail: see xmldatadumps-l
<https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l>)."

I don't know if anyone has tried using the incremental dumps; I didn't
even know they existed, as the service seems to be quite new. The
extraction framework should actually work out of the box with these
dumps, either through the dump-based extraction or by loading the dumps
into a MediaWiki instance and running the live extraction against it.
I will try it out.
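
In case it helps, here is a rough Python sketch of how fetching the
incrementals could look. The file naming pattern is an assumption on my
part (check the directory listing), and the import step at the end
refers to MediaWiki's standard importDump.php maintenance script:

    # Download the newest frwiki incremental dump.
    # The directory layout (one YYYYMMDD/ folder per day) and the file name
    # pattern are assumptions; verify them against the actual listing.
    import re
    import urllib.request

    BASE = "http://dumps.wikimedia.org/other/incr/frwiki/"

    index = urllib.request.urlopen(BASE).read().decode("utf-8", "replace")
    dates = sorted(set(re.findall(r'href="(\d{8})/"', index)))
    latest = dates[-1]

    filename = "frwiki-%s-pages-meta-hist-incr.xml.bz2" % latest  # assumed name
    urllib.request.urlretrieve(BASE + latest + "/" + filename, filename)
    print("downloaded", filename)

From there the file can either go straight into the dump-based
extraction or be imported into a local MediaWiki with importDump.php so
the live extraction can pick the pages up.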


Cheers,
Alexandru

On 04/07/2014 12:30 PM, Christophe Desclaux wrote:
> Hello,
> I'm currently working on a French version of dbpedia-live and would like
> to know if my approach is good. In our case we want a live that is
> updated every 24 hours rather than in real time like the international
> version.
> I will use the live extraction framework published on 
> <https://github.com/dbpedia/extraction-framework/tree/master/live>.
> I have (like in the international live) a local Wikipedia mirror
> communicating with my live instance via OAI. But I have no access to the
> Wikipedia OAI stream for retrieving the upstream modifications, so I
> want to use the incremental dumps published every day by Wikimedia
> <http://dumps.wikimedia.org/other/incr/frwiki/>. Do you think it's a
> good approach?
> I'm currently working on the first extraction of data into Virtuoso and
> will work on the incremental dump import next week. Maybe this kind of
> work has already been done in another country?
> For the mappings wiki, what is the best way to plug the extraction
> framework in?
>
> I spent some time on your mailing list looking for information about
> the live. (My results are at
> <https://github.com/descl/extraction-framework/wiki/mails-about-live>)
> Thanks, Christophe

