Phoenix Version: 4.14.2-HBase-1.4
HBase Version: AWS EMR 
Release
label:emr-5.24.1
Hadoop distribution:Amazon
Applications:Phoenix 4.14.1,
Hue 4.4.0, HBase 1.4.9


Having an issue where ANY(ARRAY[]) stops the query returning any
results when used in a 'AND' conjunction

e.g (fielda = 'a') AND (fieldb = any(array['a','b','c]))

Always returns zero results, if i change to disjunction (OR) it works
fine but obviously isn't what's wanted here.  Excuse the long post but
I wanted to be as clear as possible.

It's quite possible I have misunderstood the way ANY() works but...

Here's a simple query that returns a correct number of results:

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
 FROM DEV_OAPI.OBSERVATION
WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME = 'TRIAL00015')  
 ORDER BY OBSERVATIONDATE DESC
 LIMIT 10
 OFFSET 0

So there's definitely records where biomaterial name and trialname have
these values

If I change it to

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
 FROM DEV_OAPI.OBSERVATION
WHERE (TRIALNAME = ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))  
 ORDER BY OBSERVATIONDATE DESC
 LIMIT 10
 OFFSET 0

I get valid results

So the ANY(ARRAY[]) function works

Here's the explain which looks very odd to more, but it works

PLAN                                                                   
                                                                       
CLIENT 1-CHUNK 50574 ROWS 314572800 BYTES PARALLEL 1-WAY FULL SCAN OVER
DEV_OAPI.OBSERVATION                                                   
                                                                       
SERVER FILTER BY
org.apache.phoenix.expression.function.ArrayAnyComparisonExpression
[children=[ARRAY['TRIAL00015'], TRIALNAME =
org.apache.phoenix.expression.function.ArrayElemRefExpression
[children=[ARRAY['TRIAL00015'], 1]]]]
    SERVER TOP 10 ROWS SORTED BY [OBSERVATIONDATE
DESC]                                                                 C
CLIENT MERGE
SORT                                                                   
CLIENT LIMIT 10    

So far so good, BUT.

However if I combine the ARRAY expression with any expression using AND
I get zero results, even tho as above both sides of the conjunction
return true.

e.g.

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
 FROM DEV_OAPI.OBSERVATION
WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME =
ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))
 ORDER BY OBSERVATIONDATE DESC
 LIMIT 10
 OFFSET 0

Explain (newlines added):
 SERVER FILTER BY 
 (BIOMATERIALNAME = 'SCION00424' 
 AND 
 org.apache.phoenix.expression.function.ArrayAnyComparisonExpression
[children=[ARRAY['TRIAL00015','SOMETHING ELSE'], 
 TRIALNAME =
org.apache.phoenix.expression.function.ArrayElemRefExpression
[children=[ARRAY['TRIAL00015','SOMETHING ELSE'], 1]]]])

Just out of interest I tried with strings only in the array check

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
 FROM DEV_OAPI.OBSERVATION
WHERE (BIOMATERIALNAME = 'ROOT00386') AND ('TRIAL00015' =
ANY(ARRAY['TRIAL00015']))  
 ORDER BY OBSERVATIONDATE DESC
 LIMIT 10
 OFFSET 0

This works fine (in a kind of unhelpful way)

I have tested using the thick client:
        <!-- HBASE THICK CLIENT DEPS -->
        <dependency>
            <groupId>org.apache.phoenix</groupId>
            <artifactId>phoenix-core</artifactId>
            <version>4.14.2-HBase-1.4</version>
        </dependency>

and the thin client

         <dependency>
            <groupId>org.apache.phoenix</groupId>
            <artifactId>phoenix-queryserver-client</artifactId>
            <version>4.14.2-HBase-1.4</version>
        </dependency>

I've tried using braces and re-ordering but any query of the form:
<anyclause> AND field = ANY(ARRAY['value1'...])  
Returns zero results regardless of values

We can't change version as we are using the Amazon AWS EMR managed
stack and no other phoenix libraries work.

Thanks for taking the time to read this far!

Cheers

Simon

Reply via email to