Hey Andy,

I wasn't able to unzip the file http://people.apache.org/~andy/jamendo.nt.gz, so I ran the test program against my own dataset instead and got an out-of-memory exception. I then changed line 42 to true and got the original error again. You can download the data file I have been testing with from http://www.kosmyna.com/mappingbased_properties_en.nt.bz2; unzipped it's 2.6 GB. This file has consistently failed to load.
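For reference, my understanding of the direct-mode switch is that it has to run before the dataset directory is first opened. A minimal sketch - the setFileMode call is the one from your mail, but the package names are from memory and may differ between TDB versions:

import com.hp.hpl.jena.tdb.TDBFactory;
import com.hp.hpl.jena.tdb.base.file.FileMode;
import com.hp.hpl.jena.tdb.store.DatasetGraphTDB;
import com.hp.hpl.jena.tdb.sys.SystemTDB;

// Switch TDB from memory-mapped files to direct, in-heap file I/O.
// This must happen before the TDB location is opened for the first time.
SystemTDB.setFileMode(FileMode.direct);
// tdbDir as in the program further down-thread.
DatasetGraphTDB datasetGraph = TDBFactory.createDatasetGraph(tdbDir);

The non-mapped run then gets roughly the heap you suggested, e.g. java -Xmx1200m.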
While trying other datasets and variations of the simple program, I had what seemed to be a successful BulkLoad; however, when I opened the dataset and tried to query it there were no results. I don't have the exact details of this run, but I can try to reproduce it if you think it would be useful.
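When I do reproduce it, I'll verify with something along these lines (just a sketch; I don't have the exact query from the original run):

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;
import com.hp.hpl.jena.tdb.TDBFactory;

// Reopen the freshly bulk-loaded TDB directory (tdbDir as below) and
// check whether any triples come back at all.
Dataset dataset = TDBFactory.createDataset(tdbDir);
Query query = QueryFactory.create("SELECT * WHERE { ?s ?p ?o } LIMIT 10");
QueryExecution qexec = QueryExecutionFactory.create(query, dataset);
try {
    ResultSet results = qexec.execSelect();
    System.out.println("any results? " + results.hasNext());
    ResultSetFormatter.out(results);
} finally {
    qexec.close();
}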
-jp

On Mon, Jun 20, 2011 at 4:57 PM, Andy Seaborne <[email protected]> wrote:
> Fixed - sorry about that.
>
> Andy
>
> On 20/06/11 21:50, jp wrote:
>>
>> Hey Andy,
>>
>> I assume the file you want me to run is
>> http://people.apache.org/~andy/ReportLoadOnSSD.java
>>
>> When I try to download it I get a permissions error. Let me know when
>> I should try again.
>>
>> -jp
>>
>> On Mon, Jun 20, 2011 at 3:30 PM, Andy Seaborne
>> <[email protected]> wrote:
>>>
>>> Hi there,
>>>
>>> I tried to recreate this but couldn't; I don't have an SSD to hand at
>>> the moment, though (it's being fixed :-)
>>>
>>> I've put my test program and the data from the jamendo-rdf you sent me
>>> in:
>>>
>>> http://people.apache.org/~andy/
>>>
>>> so we can agree on an exact test case. This code is single-threaded.
>>>
>>> The conversion from .rdf to .nt wasn't pure.
>>>
>>> I tried running using the in-memory store as well.
>>> downloads.dbpedia.org was down at the weekend - I'll try to get the
>>> same dbpedia data.
>>>
>>> Could you run exactly what I was running? The file name needs changing.
>>>
>>> You can also try uncommenting
>>>     SystemTDB.setFileMode(FileMode.direct) ;
>>> and run it using non-mapped files in about 1.2 G of heap.
>>>
>>> Looking through the stacktrace, there is a point where the code has
>>> passed an internal consistency test, then fails with something that
>>> should be caught by that test - and the code is sync'ed or
>>> single-threaded. This is, to put it mildly, worrying.
>>>
>>> Andy
>>>
>>> On 18/06/11 16:38, jp wrote:
>>>>
>>>> Hey Andy,
>>>>
>>>> My entire program runs in one JVM, as follows:
>>>>
>>>> import java.io.FileInputStream;
>>>> import java.io.IOException;
>>>> import java.io.InputStream;
>>>> import com.hp.hpl.jena.graph.Node;
>>>> import com.hp.hpl.jena.graph.Triple;
>>>> import com.hp.hpl.jena.vocabulary.RDF;
>>>> import com.hp.hpl.jena.tdb.TDBFactory;
>>>> import com.hp.hpl.jena.tdb.store.DatasetGraphTDB;
>>>> import com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader;
>>>>
>>>> public static void main(String[] args) throws IOException {
>>>>     String tdbDir = "DB";  // placeholder path - an empty directory
>>>>     String dbpediaData = "mappingbased_properties_en.nt";  // URL below
>>>>
>>>>     DatasetGraphTDB datasetGraph = TDBFactory.createDatasetGraph(tdbDir);
>>>>
>>>>     /* I saw the BulkLoader had two ways of loading data based on
>>>>        whether the dataset existed already. I did two runs, one with
>>>>        the following two statements commented out, to test both ways
>>>>        the BulkLoader runs. Hopefully this had the desired effect. */
>>>>     datasetGraph.getDefaultGraph().add(new Triple(
>>>>         Node.createURI("urn:hello"), RDF.type.asNode(),
>>>>         Node.createURI("urn:house")));
>>>>     datasetGraph.sync();
>>>>
>>>>     InputStream inputStream = new FileInputStream(dbpediaData);
>>>>
>>>>     BulkLoader bulkLoader = new BulkLoader();
>>>>     bulkLoader.loadDataset(datasetGraph, inputStream, true);
>>>> }
>>>>
>>>> The data can be found here:
>>>> http://downloads.dbpedia.org/3.6/en/mappingbased_properties_en.nt.bz2
>>>> I appended the ontology to the end of the file; it can be found here:
>>>> http://downloads.dbpedia.org/3.6/dbpedia_3.6.owl.bz2
>>>>
>>>> The tdbDir is an empty directory.
>>>> On my system the error starts occurring after about 2-3 minutes and
>>>> 8-12 million triples loaded.
>>>>
>>>> Thanks for looking over this, and please let me know if I can be of
>>>> further assistance.
>>>>
>>>> -jp
>>>> [email protected]
>>>>
>>>>
>>>> On Jun 17, 2011 9:29 am, Andy wrote:
>>>>>
>>>>> jp,
>>>>>
>>>>> How does this fit with running:
>>>>>
>>>>> datasetGraph.getDefaultGraph().add(new
>>>>> Triple(Node.createURI("urn:hello"), RDF.type.asNode(),
>>>>> Node.createURI("urn:house")));
>>>>> datasetGraph.sync();
>>>>>
>>>>> Is the preload of one triple in a separate JVM or the same JVM as the
>>>>> BulkLoader call? Could you provide a single complete minimal example?
>>>>>
>>>>> In attempting to reconstruct this, I don't want to hide the problem by
>>>>> guessing how things are wired together.
>>>>>
>>>>> Also - exactly which dbpedia file are you loading (URL?), although I
>>>>> doubt the exact data is the cause here.
>>>
>
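P.S. If it helps rule out my decompression step: the .gz could be streamed straight into the loader instead of being unzipped on disk first. An untested sketch, reusing datasetGraph and the loadDataset call from the program above (the .bz2 files would need e.g. commons-compress's BZip2CompressorInputStream instead of GZIPInputStream):

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Decompress jamendo.nt.gz on the fly rather than unzipping it to disk.
InputStream in = new GZIPInputStream(new FileInputStream("jamendo.nt.gz"));
new BulkLoader().loadDataset(datasetGraph, in, true);
in.close();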
