Hi Joel,

On Wed, 2011-11-16 at 09:47 -0800, Joel Shellman wrote:

> > > And I need to make such statements not just one path at a time, but path
> > > expressions such as apply things to a/b/** (all descendants of a/b).
> > Is there a difference between "all descendants of a/b" and "all
> > descendants of b"?  If a and b represent nodes in your graph then the
> > path by which you get to a node shouldn't matter surely.
> 
> The path matters very much which is the whole challenge I'm working
> on. If it didn't matter, I wouldn't be writing this email :)

OK, I feared as much. In which case none of my suggestions work.

> Because it's a graph, there could be a/b, a/d/e/b, a/z/y/b, etc. There
> could be any number of ways to get to b. But I need to associate
> things both with a node/type/thing and with paths so when I query I
> get info from both. So, yes, I need to apply things to a/b/** that do
> NOT get applied to a/d/e/b/** and such things.

Understood.

> > > 100,000's of paths that would be queried for. And multiply that by
> > > about 10 for all the other information that would be about each node
> > > besides hasField.
> > Don't follow that last statement. You seem to have a tree/graph of fields
> > linked by hasChild. So the nodes are fields whereas hasField applies to
> > types.
> 
> Right, as mentioned above there will be other statements like hasFoo,
> hasBar, etc. about various fields and paths and such.
> 
> 
> > It's a little hard to follow the problem description in the abstract, if
> > you have a concrete example that might help.
> 
> I'm not sure how to be more concrete. I want to model as above, make
> statements like a/b/** hasChild foo, and query on paths like a/b/c

So fundamentally this is not a good match for RDF. RDF represents graphs
just fine, but making assertions about a path, as opposed to a node in
the graph, is not part of the model. You would need to somehow reify or
encode your paths into an RDF representation.

> > (1) Use SPARQL property paths.
> 
> If I understand what you're saying here, it would mean run a query and
> create statements for all the results? That won't work for me. I
> forgot to mention, I need to set this up such that if I then go add
> new fields, I need the statements to automatically apply. So, if I
> say: (a/b/** hasSomething foo) and after that I create a/b/x, then
> (a/b/x hasSomething foo) needs to implicitly exist without me creating
> it.

Sure, but whatever inference process you use, some work will have to be
done after an add. That work might be done lazily at query time or
eagerly after an update, but work still gets done.

However, this suggestion wouldn't work anyway, now that I understand you
really do want to associate a property with a path, not just use paths
as a shortcut for expressing information about a region of the graph.

> > (3) Use custom reasoning
> > It's possible that you could express your statements about class paths
> > as a set of rules which match the property path and conclude the
> > assertion.
> 
> This is what I expected to need to do. But I'm not understanding how
> to express those rules, nor how to do a query for a path.

Now that I understand your problem, I'm not sure I do either :)

> > One tricky bit of (3) is that Jena rules have no builtin notion of path
> > expression. In particular you would need to explicitly define the
> > descendentOf closure in order to use that in your rules. That might be
> > better done by relying on the builtin transitive reasoner which would
> > mean using the same trick as (2) of using subClassOf to represent your
> > child relationships but now you would have more power than pure OWL
> > restrictions for expressing your property path assertions.
> 
> Does subClassOf handle cycles such as a/b/c/b/c/b/c/...? I need to support 
> that.

SubClassOf allows cycles but not in that sense. You can have 

   c subClassOf b subClassOf c

but that just means b and c are equivalent classes. There's simply no
notion of traversal and no ability to distinguish a/b/c from a/b/c/b/c.
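
To make that concrete, here is a small plain-Python sketch (not Jena; the
helper name transitive_closure is made up for illustration) that closes a
two-element subClassOf cycle. Each class ends up a subclass of the other
and of itself, so all the closure can record is equivalence. It keeps no
trace of how many times the cycle was walked.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a set of (sub, super) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closure):
            for (y2, z) in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure


# c subClassOf b subClassOf c
sub_class_of = {("c", "b"), ("b", "c")}
closed = transitive_closure(sub_class_of)

# Mutual subclassing is all that is left: b and c are equivalent.
equivalent = ("b", "c") in closed and ("c", "b") in closed
```

The closed set has the same four pairs however long the walk a/b/c/b/c/...
was, which is why a/b/c and a/b/c/b/c cannot be told apart this way.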

> So... How could I model these statements on path expressions and path
> queries in the rules?

Possibly the best advice is "don't start from here". This doesn't sound
like an RDF-shaped problem. I would be inclined to develop a custom
solution in Prolog or whatever.

However, if you want to give this a go then the best I can come up with
is as follows ...


Encode your query as an RDF graph, using some predicate to represent the
links in the path (maybe ex:next) and some predicate to link the nodes
in your "query" graph to your fields (maybe ex:isa).  So a query "a/b/c"
would be an RDF graph:

   ex:root ex:next :a1 .
   :a1  ex:isa  :a;  ex:next :b1 .
   :b1  ex:isa  :b;  ex:next :c1 .
   :c1  ex:isa  :c;  ex:next ex:terminal .
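
Generating that query graph mechanically might look like the sketch
below. It is plain Python producing (subject, predicate, object) tuples
rather than a real RDF model. The function name encode_path and the naive
node-naming scheme (field name plus a per-field counter) are assumptions
of mine, chosen so that a repeated field gets a fresh node on each visit.

```python
from collections import Counter


def encode_path(path):
    """Encode a path such as 'a/b/c' as a list of query-graph triples."""
    seen = Counter()
    triples = []
    prev = "ex:root"
    for name in path.split("/"):
        seen[name] += 1
        node = f":{name}{seen[name]}"  # e.g. :b1, then :b2 on a second visit
        triples.append((prev, "ex:next", node))
        triples.append((node, "ex:isa", f":{name}"))
        prev = node
    triples.append((prev, "ex:next", "ex:terminal"))
    return triples


simple = encode_path("a/b/c")
cyclic = encode_path("a/b/c/b/c")
cyclic_nodes = [s for (s, p, o) in cyclic if p == "ex:isa"]
```

Because every step gets its own node, a/b/c/b/c stays distinguishable
from a/b/c, even though both visits to b point at the same field via
ex:isa.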

Represent your path assertions using rules and represent your node
assertions using an RDF graph of background knowledge.

For example, to attach a property to the field c you would have in your
background knowledge:

    :c  :hasColour  :green .

To "attach" a property to a path like a/b/c you would write a rule such
as:

   (ex:root ex:next ?a) (?a ex:isa :a)
   (?a ex:next ?b)      (?b ex:isa :b)
   (?b ex:next ?c)      (?c ex:isa :c)
      -> (?c eg:hasFlavour eg:sour) .

Some syntactic preprocessor could translate a more friendly assertional
language into such rules.
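
A very rough sketch of such a preprocessor, in Python, is below. The
function name path_rule and the ?n1, ?n2, ... variable naming are my own
inventions, and the emitted text simply mirrors the shape of the rule
above. I have not run its output through the Jena rule parser, so treat
it as illustrative only.

```python
def path_rule(path, prop, value):
    """Turn a path like 'a/b/c' plus a property/value into rule text."""
    clauses = []
    prev = "ex:root"
    for i, name in enumerate(path.split("/"), start=1):
        var = f"?n{i}"  # one variable per step, so repeated fields are safe
        clauses.append(f"({prev} ex:next {var}) ({var} ex:isa :{name})")
        prev = var
    body = "\n".join(clauses)
    # The conclusion is attached to the last node matched on the path.
    return f"{body}\n   -> ({prev} {prop} {value}) ."


rule = path_rule("a/b/c", "eg:hasFlavour", "eg:sour")
rule_lines = rule.splitlines()
```

Handling ** in the friendly syntax would need an extra clause over the
closure of ex:next, which this sketch does not attempt.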

Your rule set would also need to compute the transitive closure of
ex:next for use in ** rules and would need to compute properties of the
leaf node from the background knowledge:

   (?x ex:next ex:terminal)  (?x ex:isa ?f) (?f ?p ?v) -> (?x ?p ?v) .
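
The ** half of that could be pictured with the plain-Python sketch below
(not Jena rules; encode and apply_descendant_rule are hypothetical
helpers). It matches a fixed prefix from ex:root and then follows
ex:next one or more times, which is what the transitive closure gives
you, attaching the property to every node it passes. It assumes the
query graph is a simple chain with one ex:next per node.

```python
def encode(path):
    """Hypothetical helper: 'a/b/x' -> query-graph triples, one node per step."""
    triples, prev = [], "ex:root"
    for i, name in enumerate(path.split("/"), start=1):
        node = f":n{i}"
        triples += [(prev, "ex:next", node), (node, "ex:isa", f":{name}")]
        prev = node
    triples.append((prev, "ex:next", "ex:terminal"))
    return triples


def apply_descendant_rule(triples, prefix, prop, value):
    """Apply 'prefix/** prop value': conclude triples for nodes below the prefix."""
    nxt = {s: o for (s, p, o) in triples if p == "ex:next"}
    isa = {s: o for (s, p, o) in triples if p == "ex:isa"}
    node = "ex:root"
    for name in prefix:  # match the fixed part of the path, step by step
        node = nxt.get(node)
        if node is None or isa.get(node) != name:
            return set()
    derived = set()
    node = nxt.get(node)  # descendants only, not the prefix node itself
    while node is not None and node != "ex:terminal":
        derived.add((node, prop, value))
        node = nxt.get(node)
    return derived


ab = [":a", ":b"]
under_ab = apply_descendant_rule(encode("a/b/x"), ab, ":hasSomething", ":foo")
elsewhere = apply_descendant_rule(encode("a/d/e/b/x"), ab, ":hasSomething", ":foo")
```

Since the rule is evaluated against whatever query graph comes in, a
newly created a/b/x picks up the property without any extra statement,
while a/d/e/b/x does not.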

To answer a query you create a merge of your query graph and background
knowledge graph, run the rules over it and read off the answer from the
terminal node.
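
As a toy end-to-end illustration in plain Python (sets of tuples standing
in for the graphs, with hypothetical helper names leaf_rule and
read_answer), the merge, the leaf-node rule and the read-off step could
look like this. Path rules would add their conclusions to the merged set
in the same way before the answer is read.

```python
# Query graph for "a/b/c" and a one-triple background graph.
query = {
    ("ex:root", "ex:next", ":a1"),
    (":a1", "ex:isa", ":a"), (":a1", "ex:next", ":b1"),
    (":b1", "ex:isa", ":b"), (":b1", "ex:next", ":c1"),
    (":c1", "ex:isa", ":c"), (":c1", "ex:next", "ex:terminal"),
}
background = {(":c", ":hasColour", ":green")}


def leaf_rule(triples):
    """(?x ex:next ex:terminal) (?x ex:isa ?f) (?f ?p ?v) -> (?x ?p ?v)"""
    leaves = {s for (s, p, o) in triples if p == "ex:next" and o == "ex:terminal"}
    derived = set()
    for (x, p1, f) in triples:
        if x in leaves and p1 == "ex:isa":
            derived |= {(x, p, v) for (s, p, v) in triples if s == f}
    return derived


def read_answer(triples):
    """Collect the non-structural properties of the node before ex:terminal."""
    leaves = {s for (s, p, o) in triples if p == "ex:next" and o == "ex:terminal"}
    structural = ("ex:next", "ex:isa")
    return {(p, v) for (s, p, v) in triples if s in leaves and p not in structural}


merged = query | background
merged |= leaf_rule(merged)
answer = read_answer(merged)
```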

Whether that's a *good* way to do it and whether the performance would
meet your needs is unknown.

Dave

