[jira] [Created] (IMPALA-8110) Parquet stat filtering does not handle narrowed int types correctly

2019-01-24 Thread Csaba Ringhofer (JIRA)
Csaba Ringhofer created IMPALA-8110:
---

 Summary: Parquet stat filtering does not handle narrowed int types 
correctly
 Key: IMPALA-8110
 URL: https://issues.apache.org/jira/browse/IMPALA-8110
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Csaba Ringhofer


Impala can read int32 Parquet columns as TINYINT/SMALLINT SQL columns. If a 
value does not fit into the 8/16-bit signed integer range, it overflows, 
e.g. writing 128 as int32 and then reading it back as int8 returns -128. 
This is normal as far as I understand, but min/max stat filtering does not 
handle this case correctly:

create table tnarrow (i int) stored as parquet;
insert into tnarrow values (1), (201); 
alter table tnarrow change column i i tinyint;
set PARQUET_READ_STATISTICS=0;
select * from tnarrow where i < 0;
-> returns 1 row: -55
set PARQUET_READ_STATISTICS=1;
select * from tnarrow where i < 0;
-> returns 0 rows
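
A minimal Python sketch of what seems to go wrong, assuming the row group's min/max stats still describe the original int32 values (1 and 201) while the predicate is evaluated on the narrowed tinyint values. All function names here are hypothetical illustrations, not Impala code, and the "safe" check is only one possible way to avoid the wrong result:

```python
def narrow(value: int, bits: int) -> int:
    """Wrap an integer into the signed two's-complement range of `bits` bits."""
    mask = (1 << bits) - 1
    wrapped = value & mask
    if wrapped >= (1 << (bits - 1)):
        wrapped -= (1 << bits)
    return wrapped


def naive_can_skip_less_than(stat_min: int, bound: int) -> bool:
    """Naive min/max filter for `col < bound`: skip the row group if min >= bound.

    Hypothetical: compares the int32 stat directly, ignoring the narrowing.
    """
    return stat_min >= bound


def stats_fit(stat_min: int, stat_max: int, bits: int) -> bool:
    """True if both stats fit into the signed range of the narrowed type."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= stat_min and stat_max <= hi


def safe_can_skip_less_than(stat_min: int, stat_max: int, bound: int, bits: int) -> bool:
    """Only trust the stats when no stored value can overflow on read (assumed policy)."""
    return stats_fit(stat_min, stat_max, bits) and stat_min >= bound


# Values written as int32, as in the repro above.
stored = [1, 201]
stat_min, stat_max = min(stored), max(stored)

# Values as seen through the tinyint column: 201 wraps to a negative number.
read_back = [narrow(v, 8) for v in stored]
matching = [v for v in read_back if v < 0]

print("rows matching i < 0 without stats:", matching)
print("naive filter skips row group:", naive_can_skip_less_than(stat_min, 0))
print("safe filter skips row group:", safe_can_skip_less_than(stat_min, stat_max, 0, 8))
```

In this sketch the naive filter skips the row group because the int32 minimum is not below 0, even though one narrowed value is negative. The guarded variant declines to use the stats because the int32 maximum does not fit into 8 bits.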



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)



