I agree with John's assessment, and I'd add that classification can be a good
choice here. That is, instead of encoding meaning in the URIs, encode it in
class definitions, where you can also attach metadata about that meaning as
annotations. So you can have a class (owl:Class) "Xyz" whose members
(rdf:type) are the URIs you are currently searching for. Of course, you'd
still have to run a search once to create the class members, but that's a
one-time process, so its performance cost can be worked around. From there, it
becomes a straightforward SPARQL "lookup" to query for members of the class,
{?inst rdf:type :Xyz}.
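
To make the one-time classification pass concrete, here is a minimal sketch
in plain Python over an in-memory set of triples, which only stands in for
the triple store. The class IRI, the ex:p predicate, and the helper names
(local_name, classify, members) are made up for illustration; against a real
store you would do the equivalent with whatever update mechanism you have.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XYZ_CLASS = "http://example.org/model#Xyz"  # hypothetical class IRI
EX_P = "http://example.org/model#p"         # hypothetical predicate


def local_name(iri):
    """Return the part of the IRI after the last '/' or '#'."""
    return re.split(r"[/#]", iri)[-1]


def classify(triples, needle, cls):
    """One-time pass: type every subject whose local name contains needle."""
    subjects = {s for s, _, _ in triples}
    typed = {(s, RDF_TYPE, cls) for s in subjects if needle in local_name(s)}
    return triples | typed


def members(triples, cls):
    """Plain lookup, the analogue of {?inst rdf:type cls}."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == cls}


triples = {
    ("http://example.org/proj/component#xyz", EX_P, "1"),
    ("http://example.org/proj/component#abc-xyz", EX_P, "2"),
    ("http://example.org/proj/component/abc-xyz-123", EX_P, "3"),
    # 'xyz' appears only before the last '/', so this one is not classified
    ("http://example.org/proj/xyz/other", EX_P, "4"),
}

triples = classify(triples, "xyz", XYZ_CLASS)
xyz_members = members(triples, XYZ_CLASS)
```

The expensive string scan happens once, in classify; every later query is
just the members lookup on rdf:type.
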
You can also document the model directly in the model by adding annotations,
such as {:Xyz rdfs:comment "All members of this class have such-and-such
characteristics, starting with having an 'xyz' in the local name."}
You can then take the modeling a bit deeper and create subclasses, say for
example all of the Xyz members that also have a property 'x'. Then you can
find all direct members of that subclass {?inst a :Xyz-x} or find all of the
members of Xyz and its subclasses
{?cls rdfs:subClassOf* :Xyz . ?inst rdf:type ?cls}, etc.
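
As a rough illustration of what the rdfs:subClassOf* pattern computes, here
is a small Python sketch over in-memory triples. It is not how a triple store
evaluates the path, just the same logic spelled out; the instance names (:a,
:b, :c), the :Other class, and the helper names are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def subclasses_of(triples, cls):
    """cls plus every class reachable via rdfs:subClassOf ('*' includes cls)."""
    found = {cls}
    frontier = [cls]
    while frontier:
        parent = frontier.pop()
        for s, p, o in triples:
            if p == SUBCLASS_OF and o == parent and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def members_with_subclasses(triples, cls):
    """Analogue of {?cls rdfs:subClassOf* cls . ?inst rdf:type ?cls}."""
    classes = subclasses_of(triples, cls)
    return {s for s, p, o in triples if p == RDF_TYPE and o in classes}


triples = {
    (":Xyz-x", SUBCLASS_OF, ":Xyz"),
    (":a", RDF_TYPE, ":Xyz"),
    (":b", RDF_TYPE, ":Xyz-x"),
    (":c", RDF_TYPE, ":Other"),
}

# Direct members of the subclass only, the analogue of {?inst a :Xyz-x}
direct = {s for s, p, o in triples if p == RDF_TYPE and o == ":Xyz-x"}
# Members of :Xyz and all of its subclasses
all_members = members_with_subclasses(triples, ":Xyz")
```

Here direct contains only :b, while all_members contains both :a and :b.
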
RDFS/OWL modeling is really useful, and performant, for that kind of
classification use case and takes just a bit of data modeling to achieve.
Hope that helps some.
-- Scott
From: <[email protected]> on behalf of John Snelson
<[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Monday, September 26, 2016 at 7:26 AM
To: "[email protected]" <[email protected]>
Subject: Re: [MarkLogic Dev General] Triple index: all IRIs matching a string
There's no fast way to do that against the triple index. Sounds like you
might be better off encoding meaning into the RDF graph rather than the
IRI scheme.
John
On 25/09/16 14:13, Florent Georges wrote:
Hi,
Given a string "xyz", I'd like to find all the IRIs in a subject
position in the triple index, such that the string appears anywhere in
the IRI after the last '/' or '#'. That is, for "xyz", I'd like to
have:
http://example.org/proj/component#xyz
http://example.org/proj/component#abc-xyz
http://example.org/proj/component/abc-xyz-123
But I am not quite sure what would be the correct way to use the
triple index for this, potentially on large data sets. Any idea?
Regards,
--
John Snelson, Lead Engineer http://twitter.com/jpcs
MarkLogic Corporation http://www.marklogic.com
_______________________________________________
General mailing list
[email protected]<mailto:[email protected]>
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general