Re: [MarkLogic Dev General] Optimizing Reverse Queries

Jason Hunter Mon, 01 May 2017 20:26:30 -0700

> Another question: having gotten a result from a reverse search at the full 
> document level, is there a way to know *which* queries matched? If so then it 
> would be easy enough to apply those queries to the relevant elements to do 
> additional filtering (although I suppose that might get us back to the same 
> place).


I'm a little confused.  You're putting multiple serialized queries into each 
document?  If you have just one serialized query in a document it's going to be 
obvious which query was the reverse match -- it was that one.
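
To make that concrete, here is a toy sketch in plain Python. It is not MarkLogic's API and says nothing about how the indexes work; the URIs and the "query = set of required words" model are made up for illustration. The point is only that when each stored document holds exactly one query, the documents returned by the reverse search already tell you which queries matched.

```python
# Toy model only: plain Python, not MarkLogic's API or index structures.
# Each stored "query document" holds exactly one query, modeled here as a
# set of words that must all appear in the incoming document (an assumption).
stored_queries = {
    "/queries/q1.xml": {"reverse", "query"},
    "/queries/q2.xml": {"fragment"},
    "/queries/q3.xml": {"query", "index"},
}


def reverse_search(text, queries):
    """Return the URIs of stored query documents whose query matches text."""
    words = set(text.lower().split())
    # A stored query matches when all of its required words are present.
    return sorted(uri for uri, required in queries.items() if required <= words)


matches = reverse_search("A reverse query finds stored queries", stored_queries)
print(matches)
```

Each returned URI names one stored query, so no extra bookkeeping is needed to learn which query was the match.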

> In particular, if I have 125,000 reverse queries applied to a single document 
> (assuming that total database volume doesn’t affect query speed in this case) 
> on a modern fast server with appropriate indexes in place, how long should I 
> expect that query to take? 1ms? 10ms? 100ms? 1 second?

If you have 125,000 documents, each with a serialized query in it, and you do a 
reverse query for one document against those serialized queries and there are 
no hits, it should be extremely fast.  More hits will slow things a little bit 
because each hit involves a little work.  The IMLS (Inside MarkLogic Server) 
paper explains what the algorithm has to do.  I suspect (but haven't measured) 
that it's a lot like forward queries in that the timing depends a lot on the 
number of matches.

> Our corpus has about 25 million elements that would be fragments per the 
> advice above (about 1.5 million full documents). 

If you have 25 million elements you want to run against 125,000 serialized 
queries, wouldn't forward queries be faster?  You'd only have to do 125,000 
search calls instead of 25,000,000.  :)
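
The arithmetic behind that, as a quick Python sketch using the counts from this thread. It assumes one search call per stored query in the forward direction and one reverse query per element in the other; it only counts calls and says nothing about how long each call takes.

```python
# Back-of-the-envelope comparison using the counts quoted in this thread.
# Assumption: one search call per stored query (forward) or per element (reverse).
stored_queries = 125_000
elements = 25_000_000

forward_calls = stored_queries  # run each stored query once over the corpus
reverse_calls = elements        # run one reverse query per element

ratio = reverse_calls // forward_calls
print(f"forward: {forward_calls:,} calls, reverse: {reverse_calls:,} calls, ratio: {ratio}x")
```

Call count is only one input; per-call cost still depends on how many matches each search produces.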

> I’ve never done much with fragments in MarkLogic so I’m not sure what the 
> full implication of making these subelements into fragments would be for 
> other processing.

Yeah, fragmentation is not to be done lightly.

-jh-

