Re: using multiple text searches in a query

Pierre-Andre Michel Sun, 19 Jan 2014 03:56:45 -0800

On Jan 18, 2014, at 4:29 PM, Andy Seaborne <[email protected]> wrote:

> On 17/01/14 14:15, Pierre-Andre Michel wrote:
>> 
>> On Jan 16, 2014, at 7:53 PM, Andy Seaborne <[email protected]> wrote:
>> 
>>> On 16/01/14 08:41, Pierre-Andre Michel wrote:
>>>> Hello Andy,
>>>> 
>>>> As promised I have run a test to see if text:query allows the use of 
>>>> multiple predicates as proposed below:
>>> 
>>> thanks for testing that.  That way you can OR as well as AND.
>>> 
>>>> 
>>>>>>> So if you write:
>>>>>>> 
>>>>>>> ?a text:query(pred:cv-name 'ubl AND field2:ubiquitin' 10) .
>>>>>>> 
>>>>>>> where field2 is the name of a text:field, then you may get a single, 
>>>>>>> conjunctive query.
>>>>>> 
>>>>>> OK,  I will try what you suggest and tell you if it works or not.
>>>> 
>>>> and the answer is: Yes, it works, great !
>>>> 
>>>> So I can efficiently run queries with multiple criteria (fields / 
>>>> predicates) for a single subject variable.
>>>> Now if I want to text:query 2 subject variables ?a and ?b, for example:
>>>> 
>>>> ?a text:query(pred:organ 'liver' 25) .
>>>> ?b text:query(pred:author 'John Smith') .
>>>> 
>>>> the second query will still be called 25 times if 25 solutions are found 
>>>> for ?a during graph traversal.
>>>> Why don't we cache the result of the queries so that after the first call 
>>>> we don't invoke Solr or Lucene anymore but simply return an iterator on the 
>>>> result list previously built?
>>>> Does it make sense to you?
>>> 
>>> The optimizer does nothing here so that's what happens.  It needs a 
>>> cross-product spotter to do that; it doesn't have one.
>>> The optimizer/evaluator has no concept of "text:query" being special so it 
>>> blindly executes it.
>> 
>> OK, so is there a way to provide the optimizer with a cross-product spotter 
>> for text:query?
> 
> Currently, you would need to change the default optimizations applied, or 
> provide your own, to add a new one.
> 
> There may be a way to write a query that stops the optimization applying at 
> this point, but it will be a visible change to the query.
> 
> The specific optimization that is causing this is controlled by a switch 
> "ARQ.optIndexJoinStrategy".  It's not specific to property functions like 
> text:query and is one of the more important optimizations done. You could try 
> running with it off but it may have consequences elsewhere in the overall 
> query.
> 
>>> But aren't you going to connect ?a and ?b in some way?
>> 
>> Yes, ?a and ?b would be connected in some way, but the problem remains.
> 
> Could you describe the use case here?  If I understand the situation better 
> it will at least guide future work.  The general external index usage is more 
> centred around one access per pattern.  Your case looks like it is a bit more 
> complicated.
>


Hi Andy,

Thanks for your explanation of the ARQ.optIndexJoinStrategy switch.

Our use case is roughly the following: we have a dataset with Np proteins 
described by Na annotations, supported by Ne evidences based on Npub 
publications (Np = 20'000, Na > 5'000'000, Ne > 10'000'000, Npub = 400'000).
A query involving two text:query calls on distinct entities/variables could 
look like:

?isoform :has-function ?annotation .
?annotation text:query(pred:description 'glucose'  1000) .
?annotation :supported-by ?evidence .
?evidence :based-on ?publication .
?publication text:query(pred:abstract '(+crystallographic +SPR studies)' 10000) .
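
To make the caching idea raised earlier in this thread more concrete (build the 
result list of a text:query once, then hand back the stored list instead of 
calling Solr or Lucene again for every solution), here is a minimal sketch 
written outside Jena, in Python. It is only an illustration under assumptions: 
CachedTextSearch, fake_index, the sample data and the limit of 100 are all 
hypothetical and are not part of jena-text or ARQ.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signature of an index lookup: (field, query string, limit) -> hits
Backend = Callable[[str, str, int], List[str]]


class CachedTextSearch:
    """Memoize text-index lookups keyed by (field, query string, limit)."""

    def __init__(self, backend: Backend) -> None:
        self._backend = backend
        self._cache: Dict[Tuple[str, str, int], List[str]] = {}
        self.backend_calls = 0  # counts real index invocations

    def search(self, field: str, query: str, limit: int) -> List[str]:
        key = (field, query, limit)
        if key not in self._cache:
            # First call for this key: hit the index and store the result list.
            self.backend_calls += 1
            self._cache[key] = list(self._backend(field, query, limit))
        # Return a copy so callers cannot modify the cached list.
        return list(self._cache[key])


def fake_index(field: str, query: str, limit: int) -> List[str]:
    """Hypothetical stand-in for a Lucene/Solr call, with made-up data."""
    data = {("author", "John Smith"): ["pub1", "pub2"]}
    return data.get((field, query), [])[:limit]


search = CachedTextSearch(fake_index)

# Simulate the second pattern being evaluated once per ?a solution (25 times).
results = [search.search("author", "John Smith", 100) for _ in range(25)]
```

In this sketch the index is only consulted on the first of the 25 evaluations; 
the other 24 reuse the stored list. A real implementation would also have to 
decide how long a cached list stays valid (for instance, per query execution).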

I hope this helps you better understand our needs.
Cheers
Pierre-Andre


>       Andy
> 
>> 
>>> 
>>>     Andy
>>> 
>>> 
>> 
>
