Thanks for the advice! 

I used the spreadsheet and was able to size the application correctly. 17 hours 
later my rdf+xml triple file is 80% loaded. It looks like it might still take 
up to another 11 hours to finish, but again, this is based on my reading of 
"unescaped backslash" errors that are logged and timestamped with the file line 
number. 
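
To make that estimate less manual, here is a minimal sketch of pulling the 
latest line number out of the Rio warnings and turning it into a percentage. 
It assumes each warning sits on a single line in the real log and ends with 
"(LINE, COLUMN)", as in the sample quoted further down. The function names are 
hypothetical, and so are the paths in the usage note.

```shell
#!/bin/sh
# Print the input-file line number from the most recent "[Rio error]" entry.
# Assumption: each warning ends with "(LINE, COLUMN)", e.g. "(314764886, -1)".
last_rio_line() {
    grep -F '[Rio error]' "$1" \
        | sed -n 's/.*(\([0-9][0-9]*\), *-\{0,1\}[0-9][0-9]*)[[:space:]]*$/\1/p' \
        | tail -n 1
}

# Print percent complete, given the last parsed line and the total line count.
# Fails (exit status 1) if the total is not a positive number.
percent_done() {
    awk -v cur="$1" -v total="$2" \
        'BEGIN { if (total + 0 <= 0) exit 1; printf "%.1f\n", 100 * cur / total }'
}
```

For example, percent_done "$(last_rio_line /path/to/main.log)" "$(wc -l < 
/path/to/data/data.rdf)" would print the share of input lines parsed so far. 
Line position is only a proxy for triples loaded, so treat the result as a 
rough indicator.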

I am still running this under Tomcat using the workbench because the curl 
command threw the following error: MALFORMED DATA: Element type "http:" must be 
followed by either attribute specifications, ">" or "/>". I might try it again 
later using curl --data-urlencode -T /path/to/data/data.nt ... to see if that 
helps, but I just wanted to get something running overnight.
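
One possible reading of the error above (a guess, not something confirmed in 
this thread) is that the upload's Content-Type header and the file's actual 
serialization did not match, so N-Triples-style <http://...> terms reached the 
XML parser. A minimal sketch of picking the header from the file extension 
follows. The application/x-trig and application/rdf+xml values come from 
Marek's message below; the text/plain value for .nt is an assumption about 
Sesame 2.x, and the function names are hypothetical.

```shell
#!/bin/sh
# Map a file extension to a Content-Type header for the statements endpoint.
rdf_content_type() {
    case "$1" in
        *.rdf|*.xml) echo "application/rdf+xml" ;;
        *.trig)      echo "application/x-trig" ;;
        *.nt)        echo "text/plain" ;;  # assumption: Sesame 2.x N-Triples type
        *)           echo "unknown RDF file extension: $1" >&2; return 1 ;;
    esac
}

# Print (without running) the upload command in the form given in this thread.
# $1 is the data file, $2 is the repository name.
load_command() {
    file=$1
    repo=$2
    ctype=$(rdf_content_type "$file") || return 1
    printf 'curl -X POST -H "Content-Type:%s" -T %s localhost:8080/openrdf-sesame/repositories/%s/statements\n' \
        "$ctype" "$file" "$repo"
}
```

Printing the command rather than executing it makes it easy to check the 
header before starting a load that may run for many hours.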

Thanks again!

- Josh

It seems that the workbench application is better able to handle these 
On Mar 28, 2013, at 2:51 PM, Marek Šurek wrote:

> Hi,
> if you want to see progress in loading, there is an option to use the standard 
> "curl" command instead of openrdf-workbench. It gives you some information 
> about what has already been loaded.
> To load files into OWLIM (from a .trig file), run this command in your Linux 
> shell:
> 
> curl -X POST -H "Content-Type:application/x-trig" \
>      -T /path/to/data/datafile.trig \
>      localhost:8080/openrdf-sesame/repositories/repository-name/statements
> 
> If your data is RDF/XML, change the content type to application/rdf+xml.
> 
> 
> If you are loading a large amount of data, I recommend using 
> configuration.xls, which is part of OWLIM-SE.zip. It can help you size the 
> datastore properly.
> 
> Hope this will help.
> 
> Best regards,
> Marek
> 
> From: Joshua Greben <jgre...@stanford.edu>
> To: owlim-discussion@ontotext.com 
> Sent: Thursday, 28 March 2013, 22:30
> Subject: [Owlim-discussion] Loading a Large Triple Store using OWLIM-SE
> 
> Hello all,
> 
> I am new to this list and to OWLIM-SE and was wondering if anyone could offer 
> advice for loading a large triple store. I am trying to load 670M triples 
> into a repository using the openrdf-sesame workbench under tomcat6 on a 
> single linux VM with 64-bit hardware and 64GB of memory.  
> 
> My JVM has the following: -Xms32g -Xmx32g -XX:MaxPermSize=256m
> 
> Here is the log info for my repository configuration:
> 
> ...
> [INFO ] 2013-03-27 13:57:00,720 [repositories/BFWorks_STF] Configured 
> parameter 'entity-id-size' to '32'
> [INFO ] 2013-03-27 13:57:00,720 [repositories/BFWorks_STF] Configured 
> parameter 'enable-context-index' to 'false'
> [INFO ] 2013-03-27 13:57:00,720 [repositories/BFWorks_STF] Configured 
> parameter 'entity-index-size' to '100000000'
> [INFO ] 2013-03-27 13:57:00,720 [repositories/BFWorks_STF] Configured 
> parameter 'tuple-index-memory' to '1600m'
> [INFO ] 2013-03-27 13:57:00,721 [repositories/BFWorks_STF] Configured 
> parameter 'cache-memory' to '3200m'
> [INFO ] 2013-03-27 13:57:00,721 [repositories/BFWorks_STF] Cache pages for 
> tuples: 83886
> [INFO ] 2013-03-27 13:57:00,721 [repositories/BFWorks_STF] Cache pages for 
> predicates: 0
> [INFO ] 2013-03-27 13:57:00,721 [repositories/BFWorks_STF] Configured 
> parameter 'storage-folder' to 'storage'
> [INFO ] 2013-03-27 13:57:00,741 [repositories/BFWorks_STF] Configured 
> parameter 'in-memory-literal-properties' to 'false'
> [INFO ] 2013-03-27 13:57:00,742 [repositories/BFWorks_STF] Configured 
> parameter 'repository-type' to 'file-repository'
> 
> The loading came to a standstill after 19 hours and tomcat threw an 
> OutOfMemoryError: GC overhead limit exceeded. 
> 
> My question is what the application is doing with all this memory and whether 
> I configured my instance correctly for this load to finish.  I also see a lot 
> of entries in the main log such as this:
> 
>       [WARN ] 2013-03-28 08:50:59,114 [repositories/BFWorks_STF] [Rio error] 
> Unescaped backslash in: L\'ambassadrice (314764886, -1)
> 
> Could these "Rio errors" be contributing to my troubles? I was also wondering 
> if there was a way to configure logging to be able to track the application's 
> progress. Right now these warnings are the only way I can tell how far the 
> loading has progressed.
> 
> Advice from anyone who has experience successfully loading a large 
> triplestore is much appreciated! Thanks in advance!
> 
> - Josh
> 
> 
> Joshua Greben
> Library Systems Programmer & Analyst
> Stanford University Libraries                
> (650) 714-1937
> jgre...@stanford.edu
> 
> 
> 
> _______________________________________________
> Owlim-discussion mailing list
> Owlim-discussion@ontotext.com
> http://ontomail.semdata.org/cgi-bin/mailman/listinfo/owlim-discussion
> 
> 
