On Thu, Mar 8, 2012 at 7:50 AM, Andy Seaborne <[email protected]> wrote:
> On 08/03/12 15:03, Chris Dollin wrote:
>>
>> Mena said:
>>
>>> I want to apply the OnTools .FindShortestPath function on Yago.
>>> I am using the following code to load the model:
>>>
>>> Model model = TDBFactory.createModel(FullYagoDirectory);
>>>
>>> The FindShortestPath function taking too much time to return a result.
>>> I wonder if it is possible to load the model into main memory to make it
>>> faster or if there is any other way to make FindShortestPath much faster.
>>
>>
>> Model model = ModelFactory.createDefaultModel().add(
>> TDBFactory.createModel(FullYagoDirectory) );
>>
>> Of course you may then run out of memory if the model is big.
>>
>> Chris
>>
>> ("Default" models are in-memory models.)
>
>
> IIRC YAGO(2) is a bit big. The core is something like 30 million triples
> and full 80 million triples, I think.
>
> Bit big for memory unless you have a big server.
>
> Do you need "shortest path" or is just connectivity of entities acceptable?
>
> ARQ now has DISTINCT for paths and executes it (more) efficiently:
>
> { :x DISTINCT(path) ?y }
>
> in the ARQ language.
>
> (more to come here ... "soon")
>
>
> If you do want "shortest path", you may need to simplify the problem.
>
> Jena's OntTools shortest path is quite general - can you work with, say, the
> path being a fixed property?
>
> If so, maybe extract all the occurrences of that property and make a
> subgraph, hopefully smaller.

I'm currently working on a shortest path problem that does have a fixed
property. In my application, icons can optionally be associated with types,
and I'd like entities to use the "closest" icon(s) (i.e. direct type(s), then
parent(s), then grandparent(s), etc.). Assuming no RDFS inferencing were
performed, to get the icon(s) for a particular instance you'd do something
like:

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX ex: <http://example.org/>

    select ?icon where {
      ex:Entity1 rdf:type/SHORTEST(rdfs:subClassOf*/ex:hasIcon) ?icon .
    }

I saw some discussion of path lengths at [1][2], and I understand why general
path length is difficult because of the introduction of a new datatype, but
the shortest path operator here doesn't require reporting the actual length
to the user. Do you think it's feasible to implement such an operator that
utilizes a breadth-first search in order to match the shortest paths? I have
to admit I'm not as familiar with the property path code as I'd like to be.

Alternatively, is there some way I'm missing to do this with sub-queries?

Thanks,
Stephen

[1] http://www.w3.org/2009/sparql/meeting/2009-03-17#path_lengths___26___20_path__2d_matching_variables
[2] http://www.w3.org/2009/sparql/wiki/Feature%3aPathLength
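
To make the breadth-first idea above concrete, here is a minimal sketch of the matching I have in mind, written against plain in-memory maps rather than Jena or ARQ. Everything in it is an assumption for illustration: the class name ClosestIcons, the method find, and the two maps standing in for rdfs:subClassOf and ex:hasIcon are hypothetical and do not correspond to anything in the property path code. It also assumes that all direct types of the entity start at distance zero together, so the search stops at the first level where any class has an icon.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ClosestIcons {

    /**
     * Level-by-level breadth-first search up the class hierarchy.
     * directTypes   : the rdf:type values of the entity (hypothetical input)
     * superClasses  : class -> its direct superclasses (stand-in for rdfs:subClassOf)
     * icons         : class -> its icons (stand-in for ex:hasIcon)
     * Returns every icon found at the nearest level that has any icon at all.
     */
    public static Set<String> find(Set<String> directTypes,
                                   Map<String, Set<String>> superClasses,
                                   Map<String, Set<String>> icons) {
        Set<String> visited = new HashSet<>(directTypes);
        Set<String> frontier = new LinkedHashSet<>(directTypes);

        while (!frontier.isEmpty()) {
            // Collect the icons of every class at the current distance.
            Set<String> found = new TreeSet<>();
            for (String cls : frontier) {
                found.addAll(icons.getOrDefault(cls, Collections.<String>emptySet()));
            }
            if (!found.isEmpty()) {
                return found; // nearest level with icons; do not look further up
            }

            // Move one subClassOf step up, skipping classes already seen
            // so that cycles in the hierarchy cannot loop forever.
            Set<String> next = new LinkedHashSet<>();
            for (String cls : frontier) {
                for (String parent : superClasses.getOrDefault(cls, Collections.<String>emptySet())) {
                    if (visited.add(parent)) {
                        next.add(parent);
                    }
                }
            }
            frontier = next;
        }
        return Collections.emptySet();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> superClasses = new HashMap<>();
        superClasses.put("ex:Dog", Collections.singleton("ex:Mammal"));
        superClasses.put("ex:Mammal", Collections.singleton("ex:Animal"));

        Map<String, Set<String>> icons = new HashMap<>();
        icons.put("ex:Mammal", Collections.singleton("mammal.png"));
        icons.put("ex:Animal", Collections.singleton("animal.png"));

        // ex:Dog has no icon of its own, so the parent's icon wins.
        System.out.println(find(Collections.singleton("ex:Dog"), superClasses, icons));
        // prints [mammal.png]
    }
}
```

The point of the sketch is only that no path length ever needs to be exposed to the query: the search keeps the current frontier and returns as soon as one level yields a match.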
