[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15152510#comment-15152510 ]

Yongzhi Chen commented on HIVE-13039:

Yes, only for branch-1

> BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
> ---
>
>                 Key: HIVE-13039
>                 URL: https://issues.apache.org/jira/browse/HIVE-13039
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
>             Fix For: 1.3.0, 2.1.0
>
>         Attachments: HIVE-13039.1.branch1.txt, HIVE-13039.1.patch, HIVE-13039.2.branch-1.txt, HIVE-13039.2.patch, HIVE-13039.3.patch
>
>
> BETWEEN becomes exclusive in parquet table when predicate pushdown is on (as it is by default in newer Hive versions). To reproduce (in a cluster, not local setup):
> CREATE TABLE parquet_tbl(
>   key int,
>   ldate string)
> PARTITIONED BY (
>   lyear string )
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';
> insert overwrite table parquet_tbl partition (lyear='2016') select
>   1,
>   '2016-02-03' from src limit 1;
> set hive.optimize.ppd.storage = true;
> set hive.optimize.ppd = true;
> select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
> No row will be returned in a cluster.
> But if you turn off hive.optimize.ppd, one row will be returned.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
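To make the reported symptom concrete, here is a minimal sketch of the two readings of BETWEEN applied to the value from the reproduction steps. It is a toy model, not Hive or Parquet code: the class and method names are hypothetical, and it only assumes that the pushed-down filter behaved as if both bounds were strict, which is what "BETWEEN becomes exclusive" suggests.

```java
// Toy model only: this is NOT Hive or Parquet code. The class and method
// names are hypothetical and exist just to illustrate the semantics.
final class BetweenSemanticsSketch {

    /** SQL semantics: x BETWEEN lo AND hi means lo <= x AND x <= hi. */
    static boolean inclusiveBetween(String x, String lo, String hi) {
        return lo.compareTo(x) <= 0 && x.compareTo(hi) <= 0;
    }

    /** The behaviour the report describes: both bounds treated as strict. */
    static boolean exclusiveBetween(String x, String lo, String hi) {
        return lo.compareTo(x) < 0 && x.compareTo(hi) < 0;
    }

    public static void main(String[] args) {
        // Same value and bounds as the query in the reproduction steps.
        String ldate = "2016-02-03";
        System.out.println("inclusive: "
                + inclusiveBetween(ldate, "2016-02-03", "2016-02-03"));
        System.out.println("exclusive: "
                + exclusiveBetween(ldate, "2016-02-03", "2016-02-03"));
    }
}
```

With equal bounds the strict reading can never match any value, which is consistent with the query returning no rows when pushdown is on and one row when it is off.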
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151171#comment-15151171 ]

Sergio Peña commented on HIVE-13039:

Never mind. I will revert the patch and re-commit it. Is it only for branch-1?
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151170#comment-15151170 ]

Sergio Peña commented on HIVE-13039:

Thanks. Could you create another JIRA for the new failure? I already committed the changes :P.
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150798#comment-15150798 ]

Sergio Peña commented on HIVE-13039:

Thanks. I committed to branch-1 as well.
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150704#comment-15150704 ]

Yongzhi Chen commented on HIVE-13039:

[~spena], I tried to add the tests, but each one has many build errors. I think the files were added with other fixes that are not in branch-1. This JIRA has its own tests, so it is safe to include only the newly added tests.
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149371#comment-15149371 ]

Sergio Peña commented on HIVE-13039:

[~ychena] Can we add those unit tests to branch-1? At least the ones that do not take too much time to include.
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149348#comment-15149348 ]

Yongzhi Chen commented on HIVE-13039:

Thanks [~spena] for reviewing the code.
The following 3 files are not in branch-1, so I removed the changes related to them:
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java
I attached the change for branch-1.
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148812#comment-15148812 ]

Sergio Peña commented on HIVE-13039:

Looks good +1
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148645#comment-15148645 ]

Yongzhi Chen commented on HIVE-13039:

The failures are not related.
[~spena], could you review the change?
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146697#comment-15146697 ]

Yongzhi Chen commented on HIVE-13039:

The failures are related.
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146515#comment-15146515 ]

Hive QA commented on HIVE-13039:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787679/HIVE-13039.3.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9788 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-vector_partition_diff_num_cols.q-orc_merge9.q-vector_decimal_aggregate.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_mat_5
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_5
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_3
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_4
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_cte_mat_5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestMetaStoreAuthorization.testMetaStoreAuthorization
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6976/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6976/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6976/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787679 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144712#comment-15144712 ]

Sergio Peña commented on HIVE-13039:

Test {{org.apache.hadoop.hive.ql.io.parquet.TestParquetRecordReaderWrapper.testBuilder}} is failing because it uses {{between}} tests as well.
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144721#comment-15144721 ]

Yongzhi Chen commented on HIVE-13039:

Three failures are related. Patch 3 fixes the failures.
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144185#comment-15144185 ]

Hive QA commented on HIVE-13039:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12787599/HIVE-13039.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9762 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.io.parquet.TestParquetRecordReaderWrapper.testBuilder
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression3
org.apache.hadoop.hive.ql.io.sarg.TestConvertAstToSearchArg.testExpression5
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6952/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6952/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6952/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12787599 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142854#comment-15142854 ]

Yongzhi Chen commented on HIVE-13039:

[~spena], could you review the change? Thanks
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142910#comment-15142910 ]

Sergio Peña commented on HIVE-13039:

Thanks [~ychena]. The patch looks good.
Could you add some test cases to {{TestParquetFilterPredicate}} as well? It would be great if you could include different examples, like {{BETWEEN 5 and 1}}, {{BETWEEN 1 and 5}}, {{BETWEEN 1 and 1}}, to see what the filter predicate would be.
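As a rough illustration of why those three ranges make useful test cases, the sketch below lists which integers in a small sample satisfy an inclusive BETWEEN for each pair of bounds. It is a hypothetical stand-alone model and does not use {{TestParquetFilterPredicate}} or any Hive or Parquet API; the class name, method name, and the sample range 0 to 6 are assumptions chosen for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical stand-alone model; no Hive or Parquet classes are involved.
final class BetweenCasesSketch {

    /** Values in the sample 0..6 that satisfy lo <= x AND x <= hi. */
    static List<Integer> matches(int lo, int hi) {
        return IntStream.rangeClosed(0, 6)
                .filter(x -> lo <= x && x <= hi)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("BETWEEN 5 AND 1 -> " + matches(5, 1)); // reversed bounds
        System.out.println("BETWEEN 1 AND 5 -> " + matches(1, 5)); // ordinary range
        System.out.println("BETWEEN 1 AND 1 -> " + matches(1, 1)); // single point
    }
}
```

The reversed range matches nothing, the ordinary range includes both endpoints, and the single-point range matches exactly one value. The last case is the one an exclusive translation would get wrong most visibly.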
[jira] [Commented] (HIVE-13039) BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table
[ https://issues.apache.org/jira/browse/HIVE-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143475#comment-15143475 ]

Sergio Peña commented on HIVE-13039:

Thanks. Those tests are good.
Btw, {{testFilterFloatColumns}} is failing:
{noformat}
Expected :and(and(not(eq(a, null)), not(and(lt(a, 20.3), not(lteq(a, 10.2), not(or(or(eq(b, 1), eq(b, 2)), eq(b, 3
Actual :and(and(not(eq(a, null)), not(and(lteq(a, 20.3), not(lt(a, 10.2), not(or(or(eq(b, 1), eq(b, 2)), eq(b, 3
{noformat}
It's related to your change. I did not notice that we're using a {{between}} call there as well.
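The difference between the two predicate strings in {{testFilterFloatColumns}} comes down to which comparison is strict. The sketch below models only the inner range fragment on column {{a}}, reading the "Expected" line as the test's older expectation and the "Actual" line as the output after the patch; that reading is an assumption drawn from the comment, not something the log states. The code uses plain {{java.util.function.DoublePredicate}} rather than Parquet's filter classes, and the class and method names are hypothetical.

```java
import java.util.function.DoublePredicate;

// Hypothetical model of the inner range fragment only; not Parquet's FilterApi.
final class FloatBetweenSketch {

    /** Shape in the "Expected" line: and(lt(a, hi), not(lteq(a, lo))). */
    static DoublePredicate expectedLineShape(double lo, double hi) {
        DoublePredicate ltHi = a -> a < hi;
        DoublePredicate lteqLo = a -> a <= lo;
        return ltHi.and(lteqLo.negate()); // lo < a < hi, both bounds excluded
    }

    /** Shape in the "Actual" line: and(lteq(a, hi), not(lt(a, lo))). */
    static DoublePredicate actualLineShape(double lo, double hi) {
        DoublePredicate lteqHi = a -> a <= hi;
        DoublePredicate ltLo = a -> a < lo;
        return lteqHi.and(ltLo.negate()); // lo <= a <= hi, both bounds included
    }

    public static void main(String[] args) {
        double[] samples = {10.2, 15.0, 20.3};
        for (double a : samples) {
            System.out.println(a
                    + " expected-line shape: " + expectedLineShape(10.2, 20.3).test(a)
                    + ", actual-line shape: " + actualLineShape(10.2, 20.3).test(a));
        }
    }
}
```

The two shapes agree on interior values such as 15.0 and differ only at the endpoints 10.2 and 20.3, so a test that compares the rendered predicate string would be expected to change once the translation becomes inclusive.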