I don't think there is anything TDB-specific here; the same would hold for any database, whether it holds the full data or not.

I'm not sure I understand what you're doing, nor how you generated the folds on a graph, but RDF datasets can manage so-called named graphs, and so does Jena.

So my question is: why can't you split the dataset into n named graphs and store them all in TDB? I mean, just keep the graphs stored. You then only have to select n-1 graphs for training and the remaining graph for validation, right? We also don't know how you assign weights to the RDF graph, but I'm pretty sure that in RDF this has to be done via some kind of property attached to each node. You can add and delete those triples each time via SPARQL 1.1 Update statements, or via Jena API methods of course.
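As a rough sketch of what I mean (the graph URIs and the weight property here are hypothetical, just placeholders for whatever your data actually uses): query the union of the n-1 training graphs via FROM clauses, and add/remove weight triples with SPARQL 1.1 Update.

```sparql
# Training query over folds 1..n-1, leaving the last fold out as test set.
# One FROM clause per training graph:
SELECT ?s ?p ?o
FROM <http://example.org/fold1>
FROM <http://example.org/fold2>
# ... up to FROM <http://example.org/foldN-1>
WHERE { ?s ?p ?o }
```

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Attach a weight to a node via a dedicated (hypothetical) property:
INSERT DATA {
  GRAPH <http://example.org/fold1> {
    <http://example.org/node42>
        <http://example.org/weight> "0.75"^^xsd:double .
  }
}
```

```sparql
# After an iteration, drop all weight triples again instead of
# deleting the whole dataset:
DELETE WHERE {
  GRAPH ?g { ?s <http://example.org/weight> ?w }
}
```

That way the fold graphs themselves are loaded once and never touched; only the weight triples change between iterations.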

Long story short, it would be helpful to explain what exactly you're doing, and even better to show the current source code and/or queries.

On 15.12.21 14:18, emri mbiemri wrote:
Hello all,

I am interested in conducting k-fold validation for an algorithm that uses
TDB as its database. The stored graph is weighted based on some criteria.
The point is that when performing k-fold cross-validation I have, for each
iteration (k times), to create the TDB repo, load the training models,
weight the graph, calculate the precision of the algorithm with the
remaining test models, and delete the complete graph again, and so it
iterates for each step.

My question is whether I have to completely delete all the files and
create a new dataset for each iteration. Or is there maybe another, more
appropriate way to perform k-fold cross-validation with TDB?

Thanks.
