There is nothing TDB-specific here, I think; the same would hold for any
database holding the full data, wouldn't it?
I'm not sure I understand what you're doing, nor how you generated the
folds on a graph, but RDF datasets can manage so-called named graphs,
and so does Jena.
So my question is: why can't you split the dataset into n named graphs
and store them all in TDB? I mean, just keep the graphs stored. You then
only have to select the n-1 graphs for training and the remaining graph
for validation, don't you? We also don't know how you assign weights to
the RDF graph, but I'm pretty sure in RDF this has to be done via some
kind of property attached to each node. You can add and delete those
triples each time via SPARQL 1.1 Update statements, or via Jena API
methods, of course.
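As a rough sketch of what I mean (the graph URIs and the ex:weight
property are made up here; you'd substitute your own naming scheme and
weighting property): with one graph per fold, a training query just
lists the n-1 training graphs in FROM clauses, e.g. holding out fold 0
of 5:

```sparql
# Hypothetical fold graphs; only the 4 training graphs are listed.
PREFIX ex: <http://example.org/>
SELECT ?s ?p ?o
FROM <http://example.org/fold/1>
FROM <http://example.org/fold/2>
FROM <http://example.org/fold/3>
FROM <http://example.org/fold/4>
WHERE { ?s ?p ?o }
```

And the weight triples could be added before an iteration and removed
afterwards via SPARQL 1.1 Update, without ever rebuilding the dataset:

```sparql
# Attach a (hypothetical) ex:weight property to each subject in a
# training fold, then strip the weights again after evaluation.
PREFIX ex: <http://example.org/>
INSERT { GRAPH <http://example.org/fold/1> { ?s ex:weight 1.0 } }
WHERE  { GRAPH <http://example.org/fold/1> { ?s ?p ?o } }
;
DELETE WHERE { GRAPH <http://example.org/fold/1> { ?s ex:weight ?w } }
```

That way the fold graphs themselves stay loaded across all k iterations,
and only the weight triples change.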
Long story short: it would be helpful if you explained what exactly
you're doing, and even better, showed your current source code and/or
queries.
On 15.12.21 14:18, emri mbiemri wrote:
Hello all,
I am interested in conducting k-fold cross-validation for an algorithm
that uses TDB as its database. The stored graph is weighted based on
some criteria. The point is that for each of the k iterations I have to
create the TDB repo, load the training models, weight the graph,
calculate the precision of the algorithm on the remaining test models,
and then delete the complete graph again, and so it goes for each step.
My question is whether I really have to delete all the files and create
a new dataset for each iteration, or is there maybe some other, more
appropriate way to perform k-fold cross-validation with TDB?
Thanks.