Hi,

I am trying to set up a DBpedia Live Mirror on my personal Mac machine. Here is 
some technical host information about my setup:
Operating System: OS X 10.9.3
Processor: 2.6 GHz Intel Core i7
Memory: 16 GB 1600 MHz DDR3
Database server used for hosting data for the DBpedia Live Mirror: OpenLink
Virtuoso (Open-Source Edition: https://sourceforge.net/projects/virtuoso/)
Here's a summary of the steps I followed so far:

1. Downloaded the initial data seed from DBpedia Live at 
http://live.dbpedia.org/dumps/: dbpedia_2013_07_18.nt.bz2
2. Downloaded the synchronization tool from 
http://sourceforge.net/projects/dbpintegrator/files/.
3. Executed the virtload.sh script, after tweaking some of its commands to be 
compatible with OS X.
4. Adapted the synchronization tool's configuration files according to the 
README.txt file, as follows:
a) Set the start date in file "lastDownloadDate.dat" to the date of that dump 
(2013-07-18-00-000000).
b) Set the configuration information in file "dbpedia_updates_downloader.ini", 
such as the login credentials for Virtuoso and the GraphURI 
(http://live.dbpedia.org).
5. Executed "java -jar dbpintegrator-1.1.jar" on the command line.
This script repeatedly showed the following error:

INFO - Options file read successfully
INFO - File : http://live.dbpedia.org/changesets/lastPublishedFile.txt has been 
successfully downloaded
INFO - File : 
http://live.dbpedia.org/changesets/2014/06/16/13/000001.removed.nt.gz has been 
successfully downloaded
WARN - File 
/Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.removed.nt.gz 
cannot be decompressed due to Unexpected end of ZLIB input stream
ERROR - Error:  (No such file or directory)
INFO - File : 
http://live.dbpedia.org/changesets/2014/06/16/13/000001.added.nt.gz has been 
successfully downloaded
WARN - File 
/Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.added.nt.gz 
cannot be decompressed due to Unexpected end of ZLIB input stream
ERROR - Error:  (No such file or directory)
INFO - File : http://live.dbpedia.org/changesets/lastPublishedFile.txt has been 
successfully downloaded
INFO - File : 
http://live.dbpedia.org/changesets/2014/06/16/13/000002.removed.nt.gz has been 
successfully downloaded
INFO - File : 
/Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000002.removed.nt.gz 
decompressed successfully to 
/Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000002.removed.nt
WARN - null Function executeStatement
WARN - null Function executeStatement
WARN - null Function executeStatement
WARN - null Function executeStatement
WARN - null Function executeStatement
...

Questions
1) Why do I repeatedly see the following error when running the Java program 
("dbpintegrator-1.1.jar")? Does it mean that the triples from these files were 
not applied to my live mirror?
WARN - File 
/Users/shruti/virtuoso/dbpedia-live/UpdatesDownloadFolder/000001.removed.nt.gz 
cannot be decompressed due to Unexpected end of ZLIB input stream
ERROR - Error:  (No such file or directory)
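For what it's worth, I could reproduce the symptom locally: the Java message 
seems to correspond to a gzip stream that ends before its end-of-stream marker, 
i.e. a truncated download. A minimal Python sketch (synthetic data, not the 
actual changeset files):

```python
import gzip
import io
import zlib

def is_complete_gzip(data: bytes) -> bool:
    """Return True only if `data` is a fully intact gzip stream."""
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as f:
            f.read()
        return True
    except (EOFError, zlib.error, OSError):
        # EOFError here is the Python analogue of Java's
        # "Unexpected end of ZLIB input stream".
        return False

intact = gzip.compress(b"<s> <p> <o> .\n" * 100)
print(is_complete_gzip(intact))        # True
print(is_complete_gzip(intact[:-10]))  # False: tail cut off, stream truncated
```

So my working assumption is that those particular .gz files arrived 
incomplete, and the changesets in them were skipped.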

2) How can I verify that the data loaded in my mirror is up to date? Is there 
a SPARQL query I can use to validate this?
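The closest approach I have come up with is comparing the changeset ID stored 
locally in "lastDownloadDate.dat" with the latest one published in 
http://live.dbpedia.org/changesets/lastPublishedFile.txt -- is that the right 
idea? A sketch, assuming the "YYYY-MM-DD-HH-NNNNNN" layout I see in the log 
output above (the HTTP fetch itself is omitted):

```python
def changeset_key(changeset_id: str) -> tuple:
    """'2014-06-16-13-000002' -> (2014, 6, 16, 13, 2), for ordering."""
    return tuple(int(part) for part in changeset_id.strip().split("-"))

def mirror_is_current(local_id: str, published_id: str) -> bool:
    """True if every published changeset has been applied locally."""
    return changeset_key(local_id) >= changeset_key(published_id)

print(mirror_is_current("2013-07-18-00-000000", "2014-06-16-13-000002"))  # False
print(mirror_is_current("2014-06-16-13-000002", "2014-06-16-13-000002"))  # True
```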

3) I see that the data in my live mirror is missing wikiPageID 
(http://dbpedia.org/ontology/wikiPageID) and wikiPageRevisionID 
(http://dbpedia.org/ontology/wikiPageRevisionID) triples. Why is that? Is this 
data missing from the DBpedia Live data dumps located here 
(http://live.dbpedia.org/dumps/)?
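As a sanity check on my own mirror, I probed for those triples with a SPARQL 
ASK against Virtuoso's endpoint (http://localhost:8890/sparql is the usual 
Virtuoso default; yours may differ). A sketch that just builds the request URL:

```python
# Build a SPARQL ASK request that tests whether any wikiPageID triples
# exist in the mirror's graph. Endpoint URL and format parameter are
# the usual Virtuoso defaults and may differ per installation.
from urllib.parse import urlencode

ENDPOINT = "http://localhost:8890/sparql"
GRAPH = "http://live.dbpedia.org"
QUERY = (
    "ASK FROM <%s> "
    "{ ?page <http://dbpedia.org/ontology/wikiPageID> ?id }" % GRAPH
)

url = ENDPOINT + "?" + urlencode({"query": QUERY})
print(url)
```

In my case the ASK comes back false, which is why I suspect the dump or the 
changesets themselves.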

Please do let me know. Thanks!

~ Shruti
Software Engineer, San Francisco Area
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
