Hi
Got my fingers crossed that there's a work around for this as this really is a
big problem for us
We are using:
Amazon EMR
Release label:emr-6.1.0
Hadoop distribution:Amazon
Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0
Thin Client version:
phoenix-5.0.0-HBase-2.0-thin-client.jar
We get the following error when doing an LAST_VALUE aggregation where
1. A JOIN is empty
2. The column is INTEGER or DATETIME
Remote driver error: IllegalArgumentException: offset (25) + length (4) exceed
the capacity of the array: 25
The query that breaks is:
SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
ON DOCID = OBSERVATION_VALUE_ID
AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe'
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
OFFSET 0;
I can refactor this using EXIST but get same error, presumably the driver knows
to treat them the same:
SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
EXISTS (
SELECT
DOCID
FROM
VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
WHERE
DOCID = OBSERVATION_VALUE_ID
AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
OFFSET 0;
If we remove the external reference we get no error, regardless of whether
there are any hits or not
-- these all work
There are no hits for this query
SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
BIOMATERIAL_TYPE = 'aardvark'
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
OFFSET 0;
Lots of hits for this query:
SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10 OFFSET 0;
I've tried weird things like:
Comparing exists to TRUE to try and force it into a normal BOOLEAN value, same
IllegalArgumentException.
SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
(EXISTS (
SELECT
DOCID
FROM
VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
WHERE
DOCID = OBSERVATION_VALUE_ID
AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')) = TRUE
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
OFFSET 0;
And are you prepared for this one, which throws exact same error:
SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
TRUE
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
OFFSET 0;
Change FIRST_VALUE to AVG and it works fine.