-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20877/
-----------------------------------------------------------

Review request for Tajo.


Bugs: TAJO-806
    https://issues.apache.org/jira/browse/TAJO-806


Repository: tajo


Description
-------

In the case below, TajoWriteSupport currently takes the schema of the source 
table {{orders}} as is. In other words, each column is qualified with 
{{default.orders}} instead of {{default.parquet_test}}. This is a bug. In such 
a case, we hit the following error when reading the Parquet files.

{noformat}
default> create table parquet_test using parquet as select * from orders;
Progress: 0%, response time: 1.119 sec
Progress: 0%, response time: 2.121 sec
Progress: 0%, response time: 3.123 sec
Progress: 83%, response time: 4.126 sec
Progress: 100%, response time: 4.709 sec
(1500000 rows, 4.709 sec, 109.9 MiB inserted)

default> select * from parquet_test;
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.lang.NullPointerException
        at parquet.hadoop.InternalParquetRecordReader.close(InternalParquetRecordReader.java:118)
        at parquet.hadoop.ParquetReader.close(ParquetReader.java:144)
        at org.apache.tajo.storage.parquet.ParquetScanner.close(ParquetScanner.java:87)
        at org.apache.tajo.storage.MergeScanner.close(MergeScanner.java:137)
        at org.apache.tajo.jdbc.TajoResultSet.close(TajoResultSet.java:153)
        at org.apache.tajo.cli.TajoCli.localQueryCompleted(TajoCli.java:387)
        at org.apache.tajo.cli.TajoCli.executeQuery(TajoCli.java:365)
        at org.apache.tajo.cli.TajoCli.executeParsedResults(TajoCli.java:322)
        at org.apache.tajo.cli.TajoCli.runShell(TajoCli.java:311)
        at org.apache.tajo.cli.TajoCli.main(TajoCli.java:490)
Apr 30, 2014 11:04:01 AM INFO: parquet.hadoop.ParquetFileReader: reading another 1 footers
{noformat}

The patch fixes the bug where CreateTableNode takes the wrong schema.
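
To illustrate the idea behind the fix, here is a minimal sketch in plain Java. 
It is not the patch itself and does not use Tajo's actual classes: the class 
SchemaRequalifier, its methods, and the column names o_orderkey and o_custkey 
are hypothetical and chosen only for illustration. It shows columns taken from 
the source table being re-qualified with the name of the table being created.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these are NOT Tajo's actual classes or APIs.
class SchemaRequalifier {

  // Strips the qualifier: "default.orders.o_orderkey" -> "o_orderkey".
  static String simpleName(String qualifiedName) {
    int idx = qualifiedName.lastIndexOf('.');
    return idx < 0 ? qualifiedName : qualifiedName.substring(idx + 1);
  }

  // Re-qualifies every source column with the target table's name.
  static List<String> requalify(List<String> sourceColumns, String targetTable) {
    List<String> result = new ArrayList<>();
    for (String column : sourceColumns) {
      result.add(targetTable + "." + simpleName(column));
    }
    return result;
  }

  public static void main(String[] args) {
    // Assumed column names, for illustration only.
    List<String> source =
        Arrays.asList("default.orders.o_orderkey", "default.orders.o_custkey");
    // prints [default.parquet_test.o_orderkey, default.parquet_test.o_custkey]
    System.out.println(requalify(source, "default.parquet_test"));
  }
}
```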

In addition, I found a potential problem: ParquetFile stores the Tajo schema 
in its extra metadata. I think this will cause problems when users rename a 
database or a table. So, I removed the code that inserts the Tajo schema into 
the extra metadata, and I changed Parquet reading so that it does not use the 
extra metadata.

Tajo mainly uses its catalog system to manage schemas, and reading Parquet 
files in Tajo depends on the Tajo catalog, so this will work well. Also, other 
systems can access the Parquet files by directly reading Parquet's native schema.
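
As a rough sketch of why a catalog-driven read survives a rename, the 
following plain Java example resolves catalog columns against a file's field 
names by simple name only. It assumes that the file's native schema carries 
unqualified field names. The class CatalogDrivenProjection, the table name 
default.renamed, and the column names are hypothetical and not part of Tajo or 
Parquet.

```java
import java.util.List;

// Hypothetical sketch; NOT Tajo's or Parquet's actual API.
class CatalogDrivenProjection {

  // For each catalog column, returns the index of the file field that has
  // the same simple (unqualified) name, or -1 if the file has no such field.
  static int[] resolve(List<String> catalogColumns, List<String> fileFields) {
    int[] indexes = new int[catalogColumns.size()];
    for (int i = 0; i < indexes.length; i++) {
      String name = catalogColumns.get(i);
      // lastIndexOf returns -1 when there is no qualifier, so substring(0)
      // keeps the whole name in that case.
      String simple = name.substring(name.lastIndexOf('.') + 1);
      indexes[i] = fileFields.indexOf(simple);
    }
    return indexes;
  }
}
```

For example, resolve(["default.renamed.o_orderkey", "default.renamed.o_comment"], 
["o_orderkey", "o_custkey", "o_comment"]) yields the indexes 0 and 2, whatever 
the table is currently called in the catalog.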


Diffs
-----

  tajo-client/src/main/java/org/apache/tajo/cli/ConnectDatabaseCommand.java 4ec4611
  tajo-core/src/main/java/org/apache/tajo/engine/planner/LogicalPlanner.java f2ddf13
  tajo-core/src/test/java/org/apache/tajo/QueryTestCaseBase.java 961184c
  tajo-core/src/test/java/org/apache/tajo/engine/query/TestInsertQuery.java e058943
  tajo-core/src/test/resources/queries/TestInsertQuery/full_table_csv_ddl.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/full_table_parquet_ddl.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/table1_ddl.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwrite.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteLocation.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteLocationWithCompression.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteSmallerColumns.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithAsterisk.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithCompression.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithCompression_ddl.sql PRE-CREATION
  tajo-core/src/test/resources/queries/TestInsertQuery/testInsertOverwriteWithTargetColumns.sql PRE-CREATION
  tajo-core/src/test/resources/results/TestInsertQuery/testInsertOverwriteWithAsteriskUsingParquet.result PRE-CREATION
  tajo-core/src/test/resources/results/TestInsertQuery/testInsertOverwriteWithAsteriskUsingParquet2.result PRE-CREATION
  tajo-storage/src/main/java/org/apache/tajo/storage/parquet/ParquetScanner.java 38d8ca4
  tajo-storage/src/main/java/org/apache/tajo/storage/parquet/TajoReadSupport.java 4733a2f
  tajo-storage/src/main/java/org/apache/tajo/storage/parquet/TajoWriteSupport.java 83b2f7b

Diff: https://reviews.apache.org/r/20877/diff/


Testing
-------

Both of the following builds passed successfully.

mvn clean install -Phcatalog-0.12.0 -Dtajo.catalog.store.class=org.apache.tajo.catalog.store.HCatalogStore

mvn clean install


Thanks,

Hyunsik Choi
