tdb2.tdbloader has a --loader=parallel option; you could try that. In my experience, a gradually decreasing loading speed is caused by I/O saturation. Is the database on an HDD or an SSD?
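
For reference, a minimal sketch of what that invocation might look like. The database directory and dump file name below are hypothetical placeholders, and the RUN_LOAD guard is only my own convention so the command can be checked before starting a load that may run for days; it is not part of Jena.

```shell
#!/bin/sh
# Hypothetical locations: adjust both to your own setup.
DB_DIR=/data/wikidata-tdb2        # target TDB2 database directory (assumed)
DUMP=/data/latest-all.nt.gz       # Wikidata RDF dump file (assumed name)

# Assemble the loader command as positional parameters so quoting is preserved.
set -- tdb2.tdbloader --loader=parallel --loc "$DB_DIR" "$DUMP"

# Print the command first so it can be reviewed.
echo "$*"

# Start the load only when explicitly requested, e.g. RUN_LOAD=1 sh load.sh
if [ "${RUN_LOAD:-0}" = "1" ]; then
    "$@"
fi
```

The parallel loader uses several threads and a lot of I/O at once, so it is likely to help most when the database directory is on an SSD.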

> Sent: Tuesday, November 12, 2019 at 1:29 PM
> From: "Amandeep Srivastava" <[email protected]>
> To: [email protected]
> Subject: TDB optimization query
>
> Hi,
>
> I'm trying to create a TDB database from Wikidata's official RDF dump to
> read the data using Fuseki service. I need to make a few queries for my
> personal project, running which the online service times out.
>
> I have a 12 core machine with 36 GB memory.
>
> Can you please advise on the best way for creating the database? Since the
> dump is huge, I cannot try all the approaches. Besides, I'm not sure if the
> tdbloader function works in a similar way on data of different sizes.
>
> Questions:
>
> 1. Which one would be better to use - tdb.tdbloader2 (TDB1) or
> tdb2.tdbloader (TDB2) for creating the database and why? Any specific
> configurations that I should be aware of?
>
> 2. I'm running a job currently using tdb.tdbloader2 but it is using just a
> single core. Also, it's loading speed is decreasing slowly. It started at
> an avg of 120k tuples and is currently at 80k tuples. Can you advise how
> can I utilize all the cores of my machine and maintain the loading speed at
> the same time?
>
> Regards,
> Aman
