-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20877/
-----------------------------------------------------------
Review request for Tajo.
Bugs: TAJO-806
https://issues.apache.org/jira/browse/TAJO-806
Repository: tajo
Description
-------
In the case below, TajoWriteSupport currently takes the schema of the source
table {{orders}}. As a result, each column qualifier is {{default.orders}}
instead of {{default.parquet_test}}. This is a bug: reading the resulting
Parquet files fails with the following error.
{noformat}
default> create table parquet_test using parquet as select * from orders;
Progress: 0%, response time: 1.119 sec
Progress: 0%, response time: 2.121 sec
Progress: 0%, response time: 3.123 sec
Progress: 83%, response time: 4.126 sec
Progress: 100%, response time: 4.709 sec
(1500000 rows, 4.709 sec, 109.9 MiB inserted)
default> select * from parquet_test;
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.NullPointerException
at parquet.hadoop.InternalParquetRecordReader.close(InternalParquetRecordReader.java:118)
at parquet.hadoop.ParquetReader.close(ParquetReader.java:144)
at org.apache.tajo.storage.parquet.ParquetScanner.close(ParquetScanner.java:87)
at org.apache.tajo.storage.MergeScanner.close(MergeScanner.java:137)
at org.apache.tajo.jdbc.TajoResultSet.close(TajoResultSet.java:153)
at org.apache.tajo.cli.TajoCli.localQueryCompleted(TajoCli.java:387)
at org.apache.tajo.cli.TajoCli.executeQuery(TajoCli.java:365)
at org.apache.tajo.cli.TajoCli.executeParsedResults(TajoCli.java:322)
at org.apache.tajo.cli.TajoCli.runShell(TajoCli.java:311)
at org.apache.tajo.cli.TajoCli.main(TajoCli.java:490)
Apr 30, 2014 11:04:01 AM INFO: parquet.hadoop.ParquetFileReader: reading another 1 footers
{noformat}
The patch fixes the bug where CreateTableNode takes the wrong schema.
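
To make the qualifier mismatch concrete, here is a minimal sketch in plain Java. It is not Tajo code: the class name, the {{requalify}} helper, and the column names {{o_orderkey}} and {{o_custkey}} are assumptions chosen for illustration. It only shows the difference between passing the source table's qualified columns through unchanged and re-qualifying them with the name of the table being created.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; none of these names are Tajo APIs. */
class QualifierSketch {

    /** Re-qualifies names such as "default.orders.o_orderkey" with a new table name. */
    public static List<String> requalify(List<String> qualifiedColumns, String targetTable) {
        List<String> result = new ArrayList<>();
        for (String column : qualifiedColumns) {
            // The simple column name is whatever follows the last dot
            // (the whole string if there is no dot).
            String simpleName = column.substring(column.lastIndexOf('.') + 1);
            result.add(targetTable + "." + simpleName);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed column names; the real schema of orders is not shown in this request.
        List<String> sourceSchema =
            Arrays.asList("default.orders.o_orderkey", "default.orders.o_custkey");

        // Behaviour described as the bug: the source qualifiers are used as-is.
        System.out.println("as written before the patch: " + sourceSchema);

        // Intended behaviour: columns carry the qualifier of the created table.
        System.out.println("intended: " + requalify(sourceSchema, "default.parquet_test"));
    }
}
```
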
In addition, I found a potential problem: the Parquet file stores the Tajo
schema in its extra metadata. I think this will be a problem when users rename
a database or a table. So I removed the code that inserts the Tajo schema into
the extra metadata, and I changed Parquet reading so that it no longer uses the
extra metadata.
Tajo mainly uses its catalog system to manage schemas, and reading Parquet
files in Tajo depends on the Tajo catalog, so this will work well. Other systems
can still access the Parquet files by reading Parquet's native schema directly.
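
The rename concern can be sketched as follows. This is an assumption-laden toy model, not the patch itself: {{CatalogLookupSketch}}, its methods, and the table name {{default.parquet_renamed}} are hypothetical, and a plain map stands in for the catalog. It shows that a schema string frozen at write time keeps the old qualifier after a rename, while a schema resolved from the catalog at read time follows the new name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; these classes are not part of Tajo or Parquet. */
class CatalogLookupSketch {
    // Stand-in for the catalog: qualified table name -> simple column names.
    private final Map<String, List<String>> catalog = new HashMap<>();

    public void createTable(String table, List<String> columns) {
        catalog.put(table, columns);
    }

    // A rename updates the catalog entry only; files already written are untouched.
    public void renameTable(String oldName, String newName) {
        catalog.put(newName, catalog.remove(oldName));
    }

    // Read-time schema: qualifiers are derived from the current catalog entry.
    public List<String> readSchema(String table) {
        List<String> columns = catalog.get(table);
        if (columns == null) {
            throw new IllegalArgumentException("unknown table: " + table);
        }
        List<String> qualified = new ArrayList<>();
        for (String column : columns) {
            qualified.add(table + "." + column);
        }
        return qualified;
    }

    public static void main(String[] args) {
        CatalogLookupSketch sketch = new CatalogLookupSketch();
        sketch.createTable("default.parquet_test", Arrays.asList("o_orderkey", "o_custkey"));

        // A schema embedded in the file would be frozen with the name used at write time.
        String embeddedAtWriteTime = "default.parquet_test.o_orderkey";

        sketch.renameTable("default.parquet_test", "default.parquet_renamed");

        System.out.println("embedded in file: " + embeddedAtWriteTime);
        System.out.println("from catalog:     " + sketch.readSchema("default.parquet_renamed"));
    }
}
```
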
Diffs
-----
tajo-client/src/main/java/org/apache/tajo/cli/ConnectDatabaseCommand.java 4ec4611
tajo-core/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f2ddf13
tajo-core/src/test/java/org/apache/tajo/QueryTestCaseBase.java 961184c
tajo-core/src/test/java/org/apache/tajo/engine/query/TestInsertQuery.java e058943
tajo-core/src/test/resources/queries/TestInsertQuery/full_table_csv_ddl.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/full_table_parquet_ddl.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/table1_ddl.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwrite.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteLocation.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteLocationWithCompression.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteSmallerColumns.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithAsterisk.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithCompression.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithCompression_ddl.sql PRE-CREATION
tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithTargetColumns.sql PRE-CREATION
tajo-core/src/test/resources/results/TestInsertQuery/testInsertOverwriteWithAsteriskUsingParquet.result PRE-CREATION
tajo-core/src/test/resources/results/TestInsertQuery/testInsertOverwriteWithAsteriskUsingParquet2.result PRE-CREATION
tajo-storage/src/main/java/org/apache/tajo/storage/parquet/ParquetScanner.java 38d8ca4
tajo-storage/src/main/java/org/apache/tajo/storage/parquet/TajoReadSupport.java 4733a2f
tajo-storage/src/main/java/org/apache/tajo/storage/parquet/TajoWriteSupport.java 83b2f7b
Diff: https://reviews.apache.org/r/20877/diff/
Testing
-------
Both of the following builds pass:
mvn clean install -Phcatalog-0.12.0 -Dtajo.catalog.store.class=org.apache.tajo.catalog.store.HCatalogStore
mvn clean install
Thanks,
Hyunsik Choi