Rafit Izhak Ratzin created ABDERA-426:
-----------------------------------------
Summary: Group by a divided value (e.g., time/10) returns NULL.
Key: ABDERA-426
URL: https://issues.apache.org/jira/browse/ABDERA-426
Project: Abdera
Issue Type: Bug
Reporter: Rafit Izhak Ratzin
Running with an environment that includes: phoenix version 4.5.0, HBase
version 0.98, Cloudera 5.2
When I group by a divided value (e.g., time/10) the divided value will return
as NULL.
Please see the simple examples below,
Thanks in advance,
--Rafit
create table test(time integer not null, hostname varchar not null,usage float
constraint pk PRIMARY KEY(time, hostname));
upsert into test values(1439853441,'qa8',3.28);
upsert into test values(1439853449,'qa8',3.28);
upsert into test values(1439853459,'qa8',3.28);
upsert into test values(1439853458,'qa8',3.27);
upsert into test values(1439853457,'qa8',6.27);
upsert into test values(1439853462,'qa8',8.27);
upsert into test values(1439853462,'qa9',8.27);
upsert into test values(1439853457,'qa9',6.27);
0: jdbc:phoenix:localhost> select * from test;
+------------------------------------------+------------------------------------------+----------------------------------------+
| TIME | HOSTNAME |
USAGE |
+------------------------------------------+------------------------------------------+----------------------------------------+
| 1439853441 | qa8
| 3.28 |
| 1439853449 | qa8
| 3.28 |
| 1439853457 | qa8
| 6.27 |
| 1439853457 | qa9
| 6.27 |
| 1439853458 | qa8
| 3.27 |
| 1439853459 | qa8
| 3.28 |
| 1439853462 | qa8
| 8.27 |
| 1439853462 | qa9
| 8.27 |
+------------------------------------------+------------------------------------------+----------------------------------------+
0: jdbc:phoenix:localhost> select time/10, hostname, usage from test;
+----------------------------------------+------------------------------------------+------------------------------------------+
| (TIME / 10) | HOSTNAME |
USAGE |
+----------------------------------------+------------------------------------------+------------------------------------------+
| 143985344 | qa8
| 3.28 |
| 143985344 | qa8
| 3.28 |
| 143985345 | qa8
| 6.27 |
| 143985345 | qa9
| 6.27 |
| 143985345 | qa8
| 3.27 |
| 143985345 | qa8
| 3.28 |
| 143985346 | qa8
| 8.27 |
| 143985346 | qa9
| 8.27 |
+----------------------------------------+------------------------------------------+------------------------------------------+
0: jdbc:phoenix:localhost> select time/10 as tm, hostname, avg(usage) from test
group by hostname, tm;
+----------------------------------------+------------------------------------------+------------------------------------------+
| TM | HOSTNAME |
AVG(USAGE) |
+----------------------------------------+------------------------------------------+------------------------------------------+
| null |
| 3.2799 |
| null |
| 4.2733 |
| null |
| 6.2699 |
| null |
| 8.27 |
| null |
| 8.27 |
+----------------------------------------+------------------------------------------+------------------------------------------+
hostname is empty, time/10 is null
I thought it might be related to the fact that the time is a primary key so I
ran the next test case:
0: jdbc:phoenix:localhost> create table test1(time integer not null, hostname
varchar not null,usage float,period integer constraint pk PRIMARY KEY(time,
hostname));
0: jdbc:phoenix:localhost> upsert into test1
values(1439853462,'qa9',8.27,1439853462);
0: jdbc:phoenix:localhost> upsert into test1
values(1439853461,'qa9',8.27,1439853362);
0: jdbc:phoenix:localhost> upsert into test1
values(1439853461,'qa9',5.27,1439853461);
0: jdbc:phoenix:localhost> upsert into test1
values(1439853451,'qa9',4.27,1439853451);
0: jdbc:phoenix:localhost> select * from test1;
+-------------------------------------+-------------------------------------+--------------------------------------+---------------------------------------+
| TIME | HOSTNAME |
USAGE | PERIOD |
+-------------------------------------+-------------------------------------+--------------------------------------+---------------------------------------+
| 1439853451 | qa9 | 4.27
| 1439853451 |
| 1439853461 | qa9 | 5.27
| 1439853461 |
| 1439853462 | qa9 | 8.27
| 1439853462 |
+-------------------------------------+------------------------------------+--------------------------------------+----------------------------------------+
0: jdbc:phoenix:localhost> select period/10 as tm, hostname, usage from test1 ;
+----------------------------------------+------------------------------------------+------------------------------------------+
| TM | HOSTNAME |
USAGE |
+----------------------------------------+------------------------------------------+------------------------------------------+
| 143985345 | qa9 |
4.27 |
| 143985346 | qa9 |
5.27 |
| 143985346 | qa9 |
8.27 |
+----------------------------------------+------------------------------------------+------------------------------------------+
+----------------------------------------+------------------------------------------+------------------------------------------+
| TM | HOSTNAME |
AVG(USAGE) |
+----------------------------------------+------------------------------------------+------------------------------------------+
| null | qa8
| 4.2699 |
| null | qa9
| 4.2699 |
| null | qa9
| 6.77 |
+----------------------------------------+------------------------------------------+------------------------------------------+
Now we can see the hostname (it is not an empty field as before) but tm is
still null.
Please see below the response from James for a work arround:
Hi Rafit,
Looks like a bug. Please file a JIRA. The following seems to work as a
workaround:
select cast(time/10.0 as integer) as tm, hostname, avg(usage) from test group
by hostname, tm;
You might also consider using a date[1] type instead of an integer and then
using the TRUNC function[2] which work with both numeric and time types.
Thanks,
James
[1] https://phoenix.apache.org/language/datatypes.html#date_type
[2] https://phoenix.apache.org/language/functions.html#trunc
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)