Paul Rogers created DRILL-4762:
----------------------------------
Summary: Parquet file with INT_32 column fails in simple SELECT
Key: DRILL-4762
URL: https://issues.apache.org/jira/browse/DRILL-4762
Project: Apache Drill
Issue Type: Bug
Components: Execution - Data Types
Affects Versions: 1.7.0
Reporter: Paul Rogers
Create a Parquet file with the following schema:
message int32Data { required int32 index; required int32 value (INT_32); }
See attached file int_32.parquet.
Query it as a local file using the web UI as follows:
SELECT * from `local`.`root`.`int_32.parquet`;
The following error is reported:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
UnsupportedOperationException: unsupported type: INT32 INT_32 Fragment 0:0
[Error Id: 79fdbc5d-2c69-47bd-a8a5-28939546e13d on 172.30.1.28:31010]
This message suggests that the Parquet logical (or "original") type of signed
INT_32 is not supported. Logical types are important because the storage type
(int32) says only how to store the data, while the logical type says how to
interpret that data. In this case, the logical type is identical to the storage
type: a 32-bit signed integer.
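To illustrate the storage-versus-interpretation distinction, here is a minimal Python sketch. It does not use Drill or a Parquet library; it only packs the five test values as little-endian 32-bit words (the layout Parquet's PLAIN encoding uses for int32) and decodes the same bytes two ways. The variable names are illustrative assumptions.

```python
import struct

# The five values used in the attached test files.
values = [0, -1, 1, -2**31, 2**31 - 1]

# Storage type int32: four little-endian bytes per value.
raw = struct.pack("<5i", *values)

# Interpreted as INT_32 (signed), which is also the default for plain int32.
as_signed = list(struct.unpack("<5i", raw))

# Interpreted as UINT_32 (unsigned): same bytes, different meaning.
as_unsigned = list(struct.unpack("<5I", raw))

print(as_signed)
print(as_unsigned)
```

The signed decoding matches the unannotated int32 case exactly, which is why an INT_32 annotation should not change how the column is read.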
Strangely, a file with the same data, but without the logical type, works:
message int32Data { required int32 index; required int32 value; }
This schema creates the file int32.parquet (attached), queried with:
SELECT * from `local`.`root`.`int32.parquet`;
This query produces the expected 5 rows of output (the values are 0, -1, 1,
min int, and max int).
Expected behavior: Drill supports all Parquet logical types (or at least those
on top of the scalar types).
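As a rough sketch of what such support could look like for the integer annotations on int32, the hypothetical lookup below maps each annotation to a bit width and signedness. This is not Drill's code: the table name and the function `resolve_int32` are assumptions made for illustration, and only the integer annotations defined by the Parquet format for int32 are listed.

```python
# Hypothetical mapping: annotation on an int32 column -> (bit width, signed).
# None stands for an int32 column with no logical type.
INT32_ANNOTATIONS = {
    None: (32, True),
    "INT_8": (8, True),
    "INT_16": (16, True),
    "INT_32": (32, True),
    "UINT_8": (8, False),
    "UINT_16": (16, False),
    "UINT_32": (32, False),
}


def resolve_int32(annotation=None):
    """Return (bit width, signed) for an int32 column, or raise if unknown."""
    try:
        return INT32_ANNOTATIONS[annotation]
    except KeyError:
        raise ValueError(f"unsupported type: INT32 {annotation}") from None


# INT_32 resolves to the same interpretation as an unannotated int32 column.
print(resolve_int32("INT_32") == resolve_int32(None))
```

Under this sketch, the two attached files would be read identically, since INT_32 and the unannotated case resolve to the same entry.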