On Jan 18, 2014, at 4:29 PM, Andy Seaborne <[email protected]> wrote:

> On 17/01/14 14:15, Pierre-Andre Michel wrote:
>>
>> On Jan 16, 2014, at 7:53 PM, Andy Seaborne <[email protected]> wrote:
>>
>>> On 16/01/14 08:41, Pierre-Andre Michel wrote:
>>>> Hello Andy,
>>>>
>>>> As promised I have run a test to see if text:query allows the use of
>>>> multiple predicates as proposed below:
>>>
>>> thanks for testing that.  That way you can OR as well as AND.
>>>
>>>>
>>>>>>> So if you write:
>>>>>>>
>>>>>>> ?a text:query(pred:cv-name 'ubl AND field2:ubiquitin' 10) .
>>>>>>>
>>>>>>> where field2 is the name of text:field name then you may be a single,
>>>>>>> conjunctive query.
>>>>>>
>>>>>> OK, I will try what you suggest and tell you if it works or not.
>>>>
>>>> and the answer is: Yes, it works, great !
>>>>
>>>> So I can run efficiently queries with multiple criteria (fields /
>>>> predicates) for a single subject variable.
>>>> Now if I want to text:query 2 subject variables ?a and ?b, for example:
>>>>
>>>> ?a text:query(pred:organ 'liver', 25) .
>>>> ?b text:query(pred:author 'John Smith') .
>>>>
>>>> the second query will still be called 25 times if 25 solutions are found
>>>> for ?a during graph traversal.
>>>> Why don't we cache the result of the queries so that after the first call
>>>> we don't invoke solr or lucene anymore but simply return an iterator on the
>>>> result list previously built ?
>>>> Does it make sense to you ?
>>>
>>> The optimizer does nothing here so that's what happens.  It needs a
>>> cross-product spotter to do that; it doesn't have one.
>>> The optimizer/evaluator has no concept of "text:query" being special so it
>>> blindly executes it.
>>
>> OK, so is there a way to provide the optimizer with a cross-product spotter
>> concerning text:query ?
>
> Currently, you would need to change the default optimizations applied, or
> provide your own, to add a new one.
>
> There may be a way to write a query that stops the optimization applying at
> this point but it will be a visible change to the query.
>
> The specific optimization that is causing this is controlled by a switch
> "ARQ.optIndexJoinStrategy".  It's not specific to property functions like
> text:query and is one of the more important optimizations done.  You could try
> running with it off but it may have consequences elsewhere in the overall
> query.
>
>>> But aren't you going to connect ?a and ?b in some way?
>>
>> Yes ?a and ?b would be connected some way but the problem remains.
>
> Could you describe the use case here?  If I understand the situation better
> it will at least guide future work.  The general external index usage is more
> centred around one access per pattern.  Your case looks like it is a bit more
> complicated.
>
Hi Andy,

Thanks for your explanations about the switch ARQ.optIndexJoinStrategy.

Our use case is roughly the following: we have a dataset with Np Proteins
described with Na Annotations supported by Ne Evidences based on Npub
Publications (Np=20'000, Na>5'000'000, Ne>10'000'000, Npub=400'000).

A query involving 2 text:query calls on distinct entities/variables could
look like:

?isoform :has-function ?annotation .
?annotation text:query(pred:description 'glucose' 1000) .
?annotation :supported-by ?evidence .
?evidence :based-on ?publication .
?publication text:query(pred:abstract '(+crystallographic +SPR studies)' 10000) .

I hope this helps you to better understand our needs.

Cheers,
Pierre-Andre

> Andy
>
>>
>>>
>>> Andy
>>>
>>>
>>
>
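To make the cost discussed in this thread concrete, here is a minimal Python sketch of the two behaviours: re-running the second index lookup once per solution of the first, versus caching the result list after the first call. It is only an illustration of the idea, not how ARQ or the text index integration is implemented. The names CountingIndex, CachedIndex, nested_join and make_index are hypothetical, the index contents are made up, and the limit of 10 on the second lookup is an assumption (the query in the thread gives none).

```python
from typing import Dict, Iterator, List, Tuple

Key = Tuple[str, str, int]


class CountingIndex:
    """Hypothetical stand-in for an external text index such as Lucene or Solr."""

    def __init__(self, hits: Dict[Tuple[str, str], List[str]]) -> None:
        self.hits = hits
        self.calls = 0  # number of times the index was actually queried

    def search(self, field: str, query: str, limit: int) -> List[str]:
        self.calls += 1
        return self.hits.get((field, query), [])[:limit]


class CachedIndex:
    """Wraps an index and remembers the result list of each distinct query."""

    def __init__(self, index: CountingIndex) -> None:
        self.index = index
        self.cache: Dict[Key, List[str]] = {}

    def search(self, field: str, query: str, limit: int) -> List[str]:
        key = (field, query, limit)
        if key not in self.cache:
            # First call for this query: hit the real index and keep the list.
            self.cache[key] = self.index.search(field, query, limit)
        return self.cache[key]


def nested_join(index) -> Iterator[Tuple[str, str]]:
    """Evaluate the second lookup once per solution of the first one."""
    for a in index.search("organ", "liver", 25):
        # Assumed limit of 10; the original query specifies none.
        for b in index.search("author", "John Smith", 10):
            yield (a, b)


def make_index() -> CountingIndex:
    """Made-up data: 25 matches for ?a and 2 matches for ?b."""
    return CountingIndex({
        ("organ", "liver"): [f"a{i}" for i in range(25)],
        ("author", "John Smith"): ["b0", "b1"],
    })


if __name__ == "__main__":
    plain = make_index()
    list(nested_join(plain))
    print("index calls without cache:", plain.calls)

    backing = make_index()
    list(nested_join(CachedIndex(backing)))
    print("index calls with cache:", backing.calls)
```

Both variants produce the same (a, b) pairs; only the number of round trips to the index differs. A cache keyed on (field, query, limit) is only valid when the lookup does not depend on the current bindings, which is the cross-product case described above.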
