I'd agree with John's assessment, and add that classification can be a good 
choice here.  I.e. instead of encoding meaning in the URIs, encode meaning in 
class definitions, where you can also add metadata about meaning in 
annotations.  So you can have a class (owl:Class) "Xyz", and all of its members 
(rdf:type) are the URIs you are currently searching for. Of course, you'd have 
to do a search to create the class members, but that's a one-time process, so 
its performance cost can be worked around.  From there, it becomes a 
straightforward SPARQL "lookup" to query for members of the class, 
{?inst rdf:type :Xyz}.
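
To make the one-time step concrete, here is a minimal sketch in plain Python. 
It is not MarkLogic API code: triples are just tuples held in memory, and the 
class IRI, the predicate IRI and the helper names (local_name, classify, 
members) are assumptions made up for illustration. It takes the local name to 
be whatever follows the last '/' or '#', as in Florent's question.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XYZ_CLASS = "http://example.org/proj/classes#Xyz"  # hypothetical class IRI
SOME_PRED = "http://example.org/proj/prop#label"   # hypothetical predicate


def local_name(iri):
    """Return the part of the IRI after the last '/' or '#'."""
    return re.split(r"[/#]", iri)[-1]


def classify(triples, needle, class_iri):
    """One-time pass: build rdf:type triples for matching subject IRIs."""
    subjects = {s for s, _, _ in triples}
    return {(s, RDF_TYPE, class_iri)
            for s in subjects
            if needle in local_name(s)}


def members(triples, class_iri):
    """Cheap lookup afterwards, equivalent to {?inst rdf:type :Xyz}."""
    return sorted(s for s, p, o in triples
                  if p == RDF_TYPE and o == class_iri)


store = {
    ("http://example.org/proj/component#xyz", SOME_PRED, "a"),
    ("http://example.org/proj/component#abc-xyz", SOME_PRED, "b"),
    ("http://example.org/proj/component/abc-xyz-123", SOME_PRED, "c"),
    # 'xyz' appears only before the last separator, so this is not a member
    ("http://example.org/proj/xyz/component#other", SOME_PRED, "d"),
}

# Run the expensive scan once, then keep the generated triples in the store.
store |= classify(store, "xyz", XYZ_CLASS)

for iri in members(store, XYZ_CLASS):
    print(iri)
```

The scan over every subject is the slow part, which is why it is worth doing 
once and persisting the rdf:type triples; new subjects would need to be 
classified as they are loaded.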

You can also document the model directly in the model by adding annotations, 
such as {:Xyz rdfs:comment "All members of this class have such-and-such 
characteristics, starting with having an 'xyz' in the local name."}

You can then take the modeling a bit deeper and create subclasses, say for 
example all of the Xyz members that also have a property 'x'.  Then you can 
find all direct members of this class {?inst a :Xyz-x} or find all of the 
members of Xyz and its subclasses 
{?cls rdfs:subClassOf* :Xyz . ?inst rdf:type ?cls}, etc.
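
For anyone unsure what the rdfs:subClassOf* pattern does, here is a small 
plain-Python sketch of the same idea: collect the class plus everything 
reachable through zero or more subClassOf steps, then gather the instances of 
any of those classes. The prefixed names are plain strings and the helper 
names and sample data are assumptions for illustration only; a triple store 
would evaluate the property path for you.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def class_and_subclasses(triples, cls):
    """Return cls plus every class linked to it by subClassOf chains."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == SUBCLASS_OF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def direct_members(triples, cls):
    """Instances typed with exactly this class: {?inst a :Xyz-x}."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == cls}


def all_members(triples, cls):
    """Instances of the class or of any of its subclasses."""
    classes = class_and_subclasses(triples, cls)
    return {s for s, p, o in triples if p == RDF_TYPE and o in classes}


# Hypothetical sample data: :Xyz-x is a subclass of :Xyz.
model = [
    (":Xyz-x", SUBCLASS_OF, ":Xyz"),
    (":inst1", RDF_TYPE, ":Xyz"),
    (":inst2", RDF_TYPE, ":Xyz-x"),
    (":inst3", RDF_TYPE, ":Other"),
]

print(sorted(direct_members(model, ":Xyz-x")))
print(sorted(all_members(model, ":Xyz")))
```

The 'found' set doubles as a visited list, so a cycle in the subclass 
hierarchy does not cause an endless loop.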

RDFS/OWL modeling is really useful, and performant, for that kind of 
classification use case and takes just a bit of data modeling to achieve.

Hope that helps some.
-- Scott

From: <[email protected]> on behalf of John Snelson 
<[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Monday, September 26, 2016 at 7:26 AM
To: "[email protected]" <[email protected]>
Subject: Re: [MarkLogic Dev General] Triple index: all IRIs matching a string

There's no fast way to do that against the triple index. Sounds like you
might be better off encoding meaning into the RDF graph rather than the
IRI scheme.

John

On 25/09/16 14:13, Florent Georges wrote:
Hi,

Given a string "xyz", I'd like to find all the IRIs in a subject
position in the triple index, such that the string appears anywhere in
the IRI after the last '/' or '#'.  That is, for "xyz", I'd like to
have:

http://example.org/proj/component#xyz
http://example.org/proj/component#abc-xyz
http://example.org/proj/component/abc-xyz-123

But I am not quite sure what would be the correct way to use the
triple index for this, potentially on large data sets.  Any idea?

Regards,



--
John Snelson, Lead Engineer                    http://twitter.com/jpcs
MarkLogic Corporation                         http://www.marklogic.com

_______________________________________________
General mailing list
[email protected]
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general
