Roberto,

I think
http://joernhees.de/blog/2010/10/31/setting-up-a-local-dbpedia-mirror-with-virtuoso/
is very useful. We (well, Sarven, in CC) have done it for the
DBpedia mirror Ireland [1] with the following spec:

  System:
        Ubuntu x86_64 GNU/Linux
        Memory: 16GB
        Disk: 2TB (swap: 16GB)
        Filesystem: ext4
        Server: Apache/2.2.14

  Datadump:
        Source: http://downloads.dbpedia.org/
        Modified: 2011-09-13
        Size on disk: 354GB
        Number of files: 17694
        HTTP GET time length: ~3 days
            (~65.5 hours for raw files at 1.5MB/s + overhead)
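
If you want to redo that estimate for your own line speed, here is a rough sketch. It assumes decimal units (1 GB = 1000 MB) and ignores the per-request wait and any overhead; the function name estimate_hours is just something I made up for illustration.

```shell
#!/bin/sh
# Rough raw-transfer estimate in hours.
#   $1 = dump size in GB (decimal, 1 GB = 1000 MB; this is an assumption)
#   $2 = sustained download rate in MB/s
# estimate_hours is a hypothetical helper name, not part of any tool.
estimate_hours() {
    # LC_ALL=C keeps "." as the decimal separator regardless of locale.
    LC_ALL=C awk -v size_gb="$1" -v rate_mbs="$2" \
        'BEGIN { printf "%.1f\n", size_gb * 1000 / rate_mbs / 3600 }'
}

# Figures from the spec above: 354 GB at 1.5 MB/s.
estimate_hours 354 1.5
```

With the figures above this prints 65.6, which matches the ~65.5 hours quoted; the remainder of the ~3 days is the wait between requests plus overhead.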


  Replication:
        Requirements:
            wget
        HTTP GET command:
            wget -vcNr -w5 -np -nH http://downloads.dbpedia.org/
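
In case the short flags are hard to read, the same command can be spelled out with GNU wget's long options. This is only a sketch: the function name mirror_dbpedia and the RUN_MIRROR guard variable are my own additions for illustration, and the download lands in the current directory just as with the one-liner.

```shell
#!/bin/sh
# Long-option form of: wget -vcNr -w5 -np -nH http://downloads.dbpedia.org/
# mirror_dbpedia and RUN_MIRROR are hypothetical names used for illustration.
SOURCE_URL="http://downloads.dbpedia.org/"

mirror_dbpedia() {
    # --verbose              (-v)  detailed progress output
    # --continue             (-c)  resume partially downloaded files
    # --timestamping         (-N)  skip files that are not newer than the local copy
    # --recursive            (-r)  follow links below the start URL
    # --wait=5               (-w5) pause 5 seconds between requests
    # --no-parent            (-np) never ascend above the start directory
    # --no-host-directories  (-nH) do not create a downloads.dbpedia.org/ directory
    wget --verbose --continue --timestamping --recursive \
         --wait=5 --no-parent --no-host-directories \
         "$SOURCE_URL"
}

# Guard so that sourcing this file does not start a multi-day download.
if [ "${RUN_MIRROR:-0}" = "1" ]; then
    mirror_dbpedia
fi
```

Run it as RUN_MIRROR=1 sh script.sh from the directory that should hold the mirror. Because of --continue and --timestamping, re-running it after an interruption should pick up where it left off rather than fetching everything again.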


HTH ...

Cheers,
        Michael

[1] http://ie.mirror.linkeddata.org/dbpedia/
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 29 Feb 2012, at 12:27, Roberto Mirizzi wrote:

> Hi all,
> I think one of our IP addresses has been blacklisted by the DBpedia
> servers. We use these addresses just for research purposes within my
> university. Whom should I contact to kindly ask for it to be enabled again?
>
> Ok, the obvious answer to this question could be: "install a local dump
> of DBpedia and don't bother the DBpedia server".
> Well, that's what I would really like to do. We used to have the DBpedia
> 3.5 dumps. Then, we recently decided to install a brand new
> version with dump 3.7. The nightmare started. :-) Here's my story.
>
> I successfully installed Virtuoso Opensource 6.1.4 (latest version) on a
> Linux Ubuntu 10.04 64-bit distribution with 32GB RAM.
> Then, I tried several times to follow the instructions at:
> http://www.openlinksw.com/dataspace/dav/wiki/Main/VirtBulkRDFLoaderExampleDbpedia
> (I successfully did the same a couple of years ago for dump 3.5).
> Unfortunately, after an hour or two of correct execution, the
> rdf_loader_run() procedure gets stuck. The virtuoso-t process still
> appears to be active (at least "ps aux|grep virtuoso" shows that the
> process has not been killed), but everything concerning Virtuoso
> seems to be dead: the web interface http://localhost:8890 does not
> respond anymore, a "top" command from the shell does not show "virtuoso"
> (while in the beginning it used 100% of CPU), and the "isql-v" command
> lets me log in correctly, but then the statements I issue get no response.
> The virtuoso.log file does not show anything wrong.
>
> Finally, I've observed (from the virtuoso.log file) that some of the .nt
> files of the dump contain incorrect triples. For example, I get this
> error message:
> File /dbpedia-dump/3.7/en/external_links_en.nt error 23000 SR133:  Can
> not set NULL to not nullable column 'DB.DBA.RDF_QUAD.O'
>
> The problem is that when such an error is encountered, I think the
> loading of that file stops. In other words, I could lose
> important triples.
>
> Does anyone have any experience, successful or unsuccessful, with
> installing DBpedia dump 3.7 on Virtuoso?
>
> PS: where is the beloved openlinksw endpoint?
> http://lod.openlinksw.com/sparql/
>
>
> Thanks in advance,
> roberto
>
> -- 
> Roberto Mirizzi
> Politecnico of Bari
> http://sisinflab.poliba.it/mirizzi
>
>
>
> ------------------------------------------------------------------------------
> Virtualization & Cloud Management Using Capacity Planning
> Cloud computing makes use of virtualization - but cloud computing
> also focuses on allowing computing to be delivered as a service.
> http://www.accelacomm.com/jaw/sfnl/114/51521223/
> _______________________________________________
> Dbpedia-discussion mailing list
> Dbpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion

