Re: Issue using ANY ARRAY feature

2019-12-09 Thread Josh Elser

Hi Simon,

Thanks for replying back with your fix. We appreciate when folks do this 
so that others can also see the solution.


http://phoenix.apache.org/download.html only publishes the "latest" 
release for a line that we're maintaining. That's why you'll see 4.14.3 
listed on the website, not 4.14.0/1/2.


If you see a release x.y.z, you can reasonably assume[1] that you'll 
also find release x.y.z' where z'=[0,z). We expect that compatibility is 
maintained in the bugfix releases to some line, so there is no reason to 
not update to the latest version.


- Josh

[1] The one caveat here is that if there is a security-issues, we may 
explicitly pull a release from being downloaded.


On 12/8/19 6:12 PM, Simon Mottram wrote:

Update:

Just in case anyone hits this issue in future with the AWS managed
HBase, the fix is to use a very specific version of the driver

For thick client:

 
 org.apache.phoenix
 phoenix-core
 4.14.1-HBase-1.4
 

For thin client:

 
 org.apache.phoenix
 phoenix-queryserver-client
 4.14.1-HBase-1.4
 

This required support help from AWS as this driver version is not
mentioned on the official Apache Phoenix download page

Regards

Simon


On Sun, 2019-11-17 at 21:03 +, Simon Mottram wrote:

Phoenix Version: 4.14.2-HBase-1.4
HBase Version: AWS EMR
Release
label:emr-5.24.1
Hadoop distribution:Amazon
Applications:Phoenix 4.14.1,
Hue 4.4.0, HBase 1.4.9


Having an issue where ANY(ARRAY[]) stops the query returning any
results when used in a 'AND' conjunction

e.g (fielda = 'a') AND (fieldb = any(array['a','b','c]))

Always returns zero results, if i change to disjunction (OR) it works
fine but obviously isn't what's wanted here.  Excuse the long post
but
I wanted to be as clear as possible.

It's quite possible I have misunderstood the way ANY() works but...

Here's a simple query that returns a correct number of results:

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
  FROM DEV_OAPI.OBSERVATION
WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME =
'TRIAL00015')
  ORDER BY OBSERVATIONDATE DESC
  LIMIT 10
  OFFSET 0

So there's definitely records where biomaterial name and trialname
have
these values

If I change it to

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
  FROM DEV_OAPI.OBSERVATION
WHERE (TRIALNAME = ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))
  ORDER BY OBSERVATIONDATE DESC
  LIMIT 10
  OFFSET 0

I get valid results

So the ANY(ARRAY[]) function works

Here's the explain which looks very odd to more, but it works

PLAN
   
  
   
CLIENT 1-CHUNK 50574 ROWS 314572800 BYTES PARALLEL 1-WAY FULL SCAN

OVER
DEV_OAPI.OBSERVATION
   
  
   
SERVER FILTER BY

org.apache.phoenix.expression.function.ArrayAnyComparisonExpression
[children=[ARRAY['TRIAL00015'], TRIALNAME =
org.apache.phoenix.expression.function.ArrayElemRefExpression
[children=[ARRAY['TRIAL00015'], 1
 SERVER TOP 10 ROWS SORTED BY [OBSERVATIONDATE
DESC]
  C
CLIENT MERGE
SORT
   
CLIENT LIMIT 10


So far so good, BUT.

However if I combine the ARRAY expression with any expression using
AND
I get zero results, even tho as above both sides of the conjunction
return true.

e.g.

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
  FROM DEV_OAPI.OBSERVATION
WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME =
ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))
  ORDER BY OBSERVATIONDATE DESC
  LIMIT 10
  OFFSET 0

Explain (newlines added):
  SERVER FILTER BY
  (BIOMATERIALNAME = 'SCION00424'
  AND
  org.apache.phoenix.expression.function.ArrayAnyComparisonExpression
[children=[ARRAY['TRIAL00015','SOMETHING ELSE'],
  TRIALNAME =
org.apache.phoenix.expression.function.ArrayElemRefExpression
[children=[ARRAY['TRIAL00015','SOMETHING ELSE'], 1)

Just out of interest I tried with strings only in the array check

SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
  FROM DEV_OAPI.OBSERVATION
WHERE (BIOMATERIALNAME = 'ROOT00386') AND ('TRIAL00015' =
ANY(ARRAY['TRIAL00015']))
  ORDER BY OBSERVATIONDATE DESC
  LIMIT 10
  OFFSET 0

This works fine (in a kind of unhelpful way)

I have tested using the thick client:
 
 
 org.apache.phoenix
 phoenix-core
 4.14.2-HBase-1.4
 

and the thin client

  
 org.apache.phoenix
 phoenix-queryserver-client
 4.14.2-HBase-1.4
 

I've tried using braces and re-ordering but any query of the form:
 AND field = ANY(ARRAY['value1'...])
Returns zero results regardless of values

We can't change version as we are using the Amazon AWS EMR managed
stack and no other phoenix libraries work.

Thanks for taking the time to read this far!

Cheers

Simon


Re: Issue using ANY ARRAY feature

2019-12-08 Thread Simon Mottram
Update:

Just in case anyone hits this issue in future with the AWS managed
HBase, the fix is to use a very specific version of the driver

For thick client:


org.apache.phoenix
phoenix-core
4.14.1-HBase-1.4


For thin client:


org.apache.phoenix
phoenix-queryserver-client
4.14.1-HBase-1.4


This required support help from AWS as this driver version is not
mentioned on the official Apache Phoenix download page

Regards

Simon


On Sun, 2019-11-17 at 21:03 +, Simon Mottram wrote:
> Phoenix Version: 4.14.2-HBase-1.4
> HBase Version: AWS EMR 
> Release
> label:emr-5.24.1
> Hadoop distribution:Amazon
> Applications:Phoenix 4.14.1,
> Hue 4.4.0, HBase 1.4.9
> 
> 
> Having an issue where ANY(ARRAY[]) stops the query returning any
> results when used in a 'AND' conjunction
> 
> e.g (fielda = 'a') AND (fieldb = any(array['a','b','c]))
> 
> Always returns zero results, if i change to disjunction (OR) it works
> fine but obviously isn't what's wanted here.  Excuse the long post
> but
> I wanted to be as clear as possible.
> 
> It's quite possible I have misunderstood the way ANY() works but...
> 
> Here's a simple query that returns a correct number of results:
> 
> SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
>  FROM DEV_OAPI.OBSERVATION
> WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME =
> 'TRIAL00015')  
>  ORDER BY OBSERVATIONDATE DESC
>  LIMIT 10
>  OFFSET 0
> 
> So there's definitely records where biomaterial name and trialname
> have
> these values
> 
> If I change it to
> 
> SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
>  FROM DEV_OAPI.OBSERVATION
> WHERE (TRIALNAME = ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))  
>  ORDER BY OBSERVATIONDATE DESC
>  LIMIT 10
>  OFFSET 0
> 
> I get valid results
> 
> So the ANY(ARRAY[]) function works
> 
> Here's the explain which looks very odd to more, but it works
> 
> PLAN 
>   
>  
>   
> CLIENT 1-CHUNK 50574 ROWS 314572800 BYTES PARALLEL 1-WAY FULL SCAN
> OVER
> DEV_OAPI.OBSERVATION 
>   
>  
>   
> SERVER FILTER BY
> org.apache.phoenix.expression.function.ArrayAnyComparisonExpression
> [children=[ARRAY['TRIAL00015'], TRIALNAME =
> org.apache.phoenix.expression.function.ArrayElemRefExpression
> [children=[ARRAY['TRIAL00015'], 1
> SERVER TOP 10 ROWS SORTED BY [OBSERVATIONDATE
> DESC]
>  C
> CLIENT MERGE
> SORT 
>   
> CLIENT LIMIT 10
> 
> So far so good, BUT.
> 
> However if I combine the ARRAY expression with any expression using
> AND
> I get zero results, even tho as above both sides of the conjunction
> return true.
> 
> e.g.
> 
> SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
>  FROM DEV_OAPI.OBSERVATION
> WHERE (BIOMATERIALNAME = 'ROOT00386') AND (TRIALNAME =
> ANY(ARRAY['TRIAL00015', 'SOMETHING ELSE']))
>  ORDER BY OBSERVATIONDATE DESC
>  LIMIT 10
>  OFFSET 0
> 
> Explain (newlines added):
>  SERVER FILTER BY 
>  (BIOMATERIALNAME = 'SCION00424' 
>  AND 
>  org.apache.phoenix.expression.function.ArrayAnyComparisonExpression
> [children=[ARRAY['TRIAL00015','SOMETHING ELSE'], 
>  TRIALNAME =
> org.apache.phoenix.expression.function.ArrayElemRefExpression
> [children=[ARRAY['TRIAL00015','SOMETHING ELSE'], 1)
> 
> Just out of interest I tried with strings only in the array check
> 
> SELECT BIOMATERIALNAME, OBSERVATIONDATE, TRIALNAME
>  FROM DEV_OAPI.OBSERVATION
> WHERE (BIOMATERIALNAME = 'ROOT00386') AND ('TRIAL00015' =
> ANY(ARRAY['TRIAL00015']))  
>  ORDER BY OBSERVATIONDATE DESC
>  LIMIT 10
>  OFFSET 0
> 
> This works fine (in a kind of unhelpful way)
> 
> I have tested using the thick client:
> 
> 
> org.apache.phoenix
> phoenix-core
> 4.14.2-HBase-1.4
> 
> 
> and the thin client
> 
>  
> org.apache.phoenix
> phoenix-queryserver-client
> 4.14.2-HBase-1.4
> 
> 
> I've tried using braces and re-ordering but any query of the form:
>  AND field = ANY(ARRAY['value1'...])  
> Returns zero results regardless of values
> 
> We can't change version as we are using the Amazon AWS EMR managed
> stack and no other phoenix libraries work.
> 
> Thanks for taking the time to read this far!
> 
> Cheers
> 
> Simon