[jira] [Commented] (HIVE-11890) Create ORC module

2015-12-12 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054814#comment-15054814
 ] 

Owen O'Malley commented on HIVE-11890:
--

Yeah, I still have the Reader & RecordReader to move.

> Create ORC module
> -
>
> Key: HIVE-11890
> URL: https://issues.apache.org/jira/browse/HIVE-11890
> Project: Hive
>  Issue Type: Sub-task
>  Components: ORC
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.0.0
>
> Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, 
> HIVE-11890.patch
>
>
> Start moving classes over to the ORC module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12055) Create row-by-row shims for the write path

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054792#comment-15054792
 ] 

Hive QA commented on HIVE-12055:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777322/HIVE-12055.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 9895 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge11
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6338/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6338/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6338/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777322 - PreCommit-HIVE-TRUNK-Build

> Create row-by-row shims for the write path 
> ---
>
> Key: HIVE-12055
> URL: https://issues.apache.org/jira/browse/HIVE-12055
> Project: Hive
>  Issue Type: Sub-task
>  Components: ORC, Shims
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, 
> HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch
>
>
> As part of removing the row-by-row writer, we'll need to shim out the higher 
> level API (OrcSerde and OrcOutputFormat) so that we maintain backwards 
> compatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12656) Turn hive.compute.query.using.stats on by default

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054774#comment-15054774
 ] 

Hive QA commented on HIVE-12656:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777306/HIVE-12656.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 111 failed/errored test(s), 9865 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_decimal_round.q-cbo_windowing.q-tez_schema_evolution.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udf_udaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_udaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_dependency2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fold_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_orig_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_orig_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_boolexpr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_coltype_literals
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_decode_name
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_special_char
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_where
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rename_partition_location
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_and
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_not
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_unquote_or
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_noscan_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_u

[jira] [Updated] (HIVE-11775) Implement limit push down through union all in CBO

2015-12-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11775:
---
Attachment: HIVE-11775.07.patch

> Implement limit push down through union all in CBO
> --
>
> Key: HIVE-11775
> URL: https://issues.apache.org/jira/browse/HIVE-11775
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11775.01.patch, HIVE-11775.02.patch, 
> HIVE-11775.03.patch, HIVE-11775.04.patch, HIVE-11775.05.patch, 
> HIVE-11775.06.patch, HIVE-11775.07.patch
>
>
> Enlightened by HIVE-11684 (kudos to [~jcamachorodriguez]), we can actually 
> push the limit down through union all, which reduces the number of 
> intermediate rows in the union branches. 
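For illustration only (a sketch of the idea, not the attached patch): the outer limit is kept for correctness while a copy of it is pushed into each union branch, so no branch has to produce more rows than the final result can use. The table names below are placeholders.

{noformat}
-- Original: both union branches are fully evaluated before the limit applies.
select * from (
  select key, value from src_a
  union all
  select key, value from src_b
) u
limit 10;

-- After pushing the limit through the union (outer limit retained):
-- each branch emits at most 10 rows.
select * from (
  select * from (select key, value from src_a limit 10) a
  union all
  select * from (select key, value from src_b limit 10) b
) u
limit 10;
{noformat}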



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path

2015-12-12 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-12055:
-
Attachment: HIVE-12055.patch

Reloaded for Jenkins.

> Create row-by-row shims for the write path 
> ---
>
> Key: HIVE-12055
> URL: https://issues.apache.org/jira/browse/HIVE-12055
> Project: Hive
>  Issue Type: Sub-task
>  Components: ORC, Shims
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, 
> HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch
>
>
> As part of removing the row-by-row writer, we'll need to shim out the higher 
> level API (OrcSerde and OrcOutputFormat) so that we maintain backwards 
> compatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054732#comment-15054732
 ] 

Hive QA commented on HIVE-12661:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777298/HIVE-12661.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1322 failed/errored test(s), 9881 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_orig_table.q-mapreduce2.q-load_dyn_part3.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_file_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_clusterby_sortby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_skewed_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_not_sorted
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguitycheck
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguous_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_table_null_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ansi_sql_arithmetic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join18_multi_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join21
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join28
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join31
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_stats

[jira] [Commented] (HIVE-12590) Repeated UDAFs with literals can produce incorrect result

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054663#comment-15054663
 ] 

Hive QA commented on HIVE-12590:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777291/HIVE-12590.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 9895 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_self_join
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf_matchpath
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_self_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_date_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_round_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_interval_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6334/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6334/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6334/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 41 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777291 - PreCommit-HIVE-TRUNK-Build

> Repeated UDAFs with literals can produce incorrect result
> -
>
> Key: HIVE-12590
> URL: https://issues.apache.org/jira/browse/HIVE-12590
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.0.1, 1.1.1, 1.2.1, 2.0.0
>Reporter: Laljo John Pullokkaran
>Assignee: Ashutosh Chauhan
>Priority: Critical

[jira] [Updated] (HIVE-12656) Turn hive.compute.query.using.stats on by default

2015-12-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12656:
---
Attachment: HIVE-12656.01.patch

> Turn hive.compute.query.using.stats on by default
> -
>
> Key: HIVE-12656
> URL: https://issues.apache.org/jira/browse/HIVE-12656
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12656.01.patch
>
>
> We now have hive.compute.query.using.stats=false by default. We plan to turn 
> it on by default so that we get better performance. We can also set it 
> to false in some test cases to maintain the original purpose of those tests.
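As a concrete example of the last point, a test that relies on the old behavior would simply pin the setting at the top of its q file; the query below is a placeholder.

{noformat}
-- preserve the original intent of this test once the default flips to true
set hive.compute.query.using.stats=false;

select max(key) from src;
{noformat}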



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12662) StackOverflowError in HiveSortJoinReduceRule when limit=0

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054616#comment-15054616
 ] 

Hive QA commented on HIVE-12662:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777282/HIVE-12662.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9895 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6333/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6333/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6333/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777282 - PreCommit-HIVE-TRUNK-Build

> StackOverflowError in HiveSortJoinReduceRule when limit=0
> -
>
> Key: HIVE-12662
> URL: https://issues.apache.org/jira/browse/HIVE-12662
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12662.patch
>
>
> At L96 of HiveSortJoinReduceRule, you will see 
> {noformat}
> // Finally, if we do not reduce the input size, we bail out
> if (RexLiteral.intValue(sortLimit.fetch)
> >= RelMetadataQuery.getRowCount(reducedInput)) {
>   return false;
> }
> {noformat}
> It is using “RelMetadataQuery.getRowCount”, which is always at least 1. This 
> is the problem that we resolved in CALCITE-987.
> To confirm this, I just ran the following q file:
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.optimize.limitjointranspose=true;
> set hive.optimize.limitjointranspose.reductionpercentage=1f;
> set hive.optimize.limitjointranspose.reductiontuples=0;
> explain
> select *
> from src src1 right outer join (
>   select *
>   from src src2 left outer join src src3
>   on src2.value = src3.value) src2
> on src1.key = src2.key
> limit 0;
> {noformat}
>   And I got
> {noformat}
> 2015-12-11T10:21:04,435 ERROR [c1efb099-f900-46dc-9f74-97af0944a99d main[]]: 
> parse.CalcitePlanner (CalcitePlanner.java:genOPTree(301)) - CBO failed, 
> skipping CBO.
> java.lang.RuntimeException: java.lang.StackOverflowError
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:749)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:645)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:264)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10076)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:223)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>  [hive-exec-2

[jira] [Commented] (HIVE-12478) Improve Hive/Calcite Transitive Predicate inference

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054532#comment-15054532
 ] 

Hive QA commented on HIVE-12478:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777279/HIVE-12478.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 139 failed/errored test(s), 9880 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_without_localtask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_position
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_self_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_alt_syntax
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_hive_626
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_parse
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_star
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoins
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_join_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin_mapjoin8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin_mapjoin9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoinopt13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoinopt14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_table_access_keys_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mr_diff_schema_alias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_nested_mapjoin
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver

[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly

2015-12-12 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12661:
---
Attachment: HIVE-12661.01.patch

> StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
> ---
>
> Key: HIVE-12661
> URL: https://issues.apache.org/jira/browse/HIVE-12661
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12661.01.patch
>
>
> PROBLEM:
> Hive stats are auto-gathered properly until an 'analyze table [tablename] 
> compute statistics for columns' is run. After that, the stats are not 
> auto-updated until the command is run again. Repro:
> {code}
> set hive.stats.autogather=true; 
> set hive.stats.atomic=false ; 
> set hive.stats.collect.rawdatasize=true ; 
> set hive.stats.collect.scancols=false ; 
> set hive.stats.collect.tablekeys=false ; 
> set hive.stats.fetch.column.stats=true; 
> set hive.stats.fetch.partition.stats=true ; 
> set hive.stats.reliable=false ; 
> set hive.compute.query.using.stats=true; 
> CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( 
> 'orc.compress'='NONE') ; 
> insert into calendar values (2010), (2011), (2012); 
> select * from calendar; 
> +----------------+ 
> | calendar.year  | 
> +----------------+ 
> | 2010           | 
> | 2011           | 
> | 2012           | 
> +----------------+ 
> select max(year) from calendar; 
> | 2012 | 
> insert into calendar values (2013); 
> select * from calendar; 
> +----------------+ 
> | calendar.year  | 
> +----------------+ 
> | 2010           | 
> | 2011           | 
> | 2012           | 
> | 2013           | 
> +----------------+ 
> select max(year) from calendar; 
> | 2013 | 
> insert into calendar values (2014); 
> select max(year) from calendar; 
> | 2014 |
> analyze table calendar compute statistics for columns;
> insert into calendar values (2015);
> select max(year) from calendar;
> | 2014 |
> insert into calendar values (2016), (2017), (2018);
> select max(year) from calendar;
> | 2014  |
> analyze table calendar compute statistics for columns;
> select max(year) from calendar;
> | 2018  |
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12590) Repeated UDAFs with literals can produce incorrect result

2015-12-12 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12590:

Attachment: HIVE-12590.2.patch

> Repeated UDAFs with literals can produce incorrect result
> -
>
> Key: HIVE-12590
> URL: https://issues.apache.org/jira/browse/HIVE-12590
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 1.0.1, 1.1.1, 1.2.1, 2.0.0
>Reporter: Laljo John Pullokkaran
>Assignee: Ashutosh Chauhan
>Priority: Critical
> Attachments: HIVE-12590.2.patch, HIVE-12590.patch
>
>
> Repeated UDAFs with literals can produce an incorrect result.
> This is not a common use case, but it is nevertheless a bug.
> hive> select max('pants'), max('pANTS') from t1 group by key;
>  Total MapReduce CPU Time Spent: 0 msec
> OK
> pANTS pANTS
> pANTS pANTS
> pANTS pANTS
> pANTS pANTS
> pANTS pANTS
> Time taken: 296.252 seconds, Fetched: 5 row(s)
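For reference: since max over a constant literal is that literal for every group, the first column above should be 'pants' rather than 'pANTS'. Assuming the same five groups, the expected output is:

{noformat}
hive> select max('pants'), max('pANTS') from t1 group by key;
pants pANTS
pants pANTS
pants pANTS
pants pANTS
pants pANTS
{noformat}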



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054367#comment-15054367
 ] 

Hive QA commented on HIVE-12643:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777237/HIVE-12643.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9896 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6331/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6331/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6331/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777237 - PreCommit-HIVE-TRUNK-Build

> For self describing InputFormat don't replicate schema information in 
> partitions
> 
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, HIVE-12643.patch
>
>
> Since self-describing InputFormats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down the plan size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12662) StackOverflowError in HiveSortJoinReduceRule when limit=0

2015-12-12 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054331#comment-15054331
 ] 

Pengcheng Xiong commented on HIVE-12662:


LGTM +1. I think we do not have any better solution for Hive until the next 
release of Calcite.

> StackOverflowError in HiveSortJoinReduceRule when limit=0
> -
>
> Key: HIVE-12662
> URL: https://issues.apache.org/jira/browse/HIVE-12662
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12662.patch
>
>
> At L96 of HiveSortJoinReduceRule, you will see 
> {noformat}
> // Finally, if we do not reduce the input size, we bail out
> if (RexLiteral.intValue(sortLimit.fetch)
> >= RelMetadataQuery.getRowCount(reducedInput)) {
>   return false;
> }
> {noformat}
> It is using “RelMetadataQuery.getRowCount”, which is always at least 1. This 
> is the problem that we resolved in CALCITE-987.
> To confirm this, I just ran the following q file:
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.optimize.limitjointranspose=true;
> set hive.optimize.limitjointranspose.reductionpercentage=1f;
> set hive.optimize.limitjointranspose.reductiontuples=0;
> explain
> select *
> from src src1 right outer join (
>   select *
>   from src src2 left outer join src src3
>   on src2.value = src3.value) src2
> on src1.key = src2.key
> limit 0;
> {noformat}
>   And I got
> {noformat}
> 2015-12-11T10:21:04,435 ERROR [c1efb099-f900-46dc-9f74-97af0944a99d main[]]: 
> parse.CalcitePlanner (CalcitePlanner.java:genOPTree(301)) - CBO failed, 
> skipping CBO.
> java.lang.RuntimeException: java.lang.StackOverflowError
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:749)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:645)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:264)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10076)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:223)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1138) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1187) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1063) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> {noformat}
> via [~pxiong]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12541) SymbolicTextInputFormat should support the path with regex

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054319#comment-15054319
 ] 

Hive QA commented on HIVE-12541:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777227/HIVE-12541.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9895 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6330/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6330/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6330/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777227 - PreCommit-HIVE-TRUNK-Build

> SymbolicTextInputFormat should support the path with regex
> ---
>
> Key: HIVE-12541
> URL: https://issues.apache.org/jira/browse/HIVE-12541
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.14.0, 1.2.0, 1.2.1
>Reporter: Xiaowei Wang
>Assignee: Xiaowei Wang
> Attachments: HIVE-12541.1.patch, HIVE-12541.2.patch
>
>
> 1. In fact, SymbolicTextInputFormat supports paths with a regex; I add some 
> test SQL. 
> 2. But when using CombineHiveInputFormat to combine input files, it cannot 
> resolve a path with a regex, so it gets a wrong result. I give an example 
> and fix the problem.
> Table desc :
> {noformat}
> CREATE External TABLE `symlink_text_input_format`(
>   `key` string,
>   `value` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'viewfs://nsX/user/hive/warehouse/symlink_text_input_format'  
> {noformat}
> There is a link file in the dir 
> '/user/hive/warehouse/symlink_text_input_format'; the content of the link 
> file is 
> {noformat}
>  viewfs://nsx/tmp/symlink* 
> {noformat}
> It contains one path, and the path contains a regex.
> Execute the SQL: 
> {noformat}
> set hive.rework.mapredwork = true ;
> set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
> set mapred.min.split.size.per.rack= 0 ;
> set mapred.min.split.size.per.node= 0 ;
> set mapred.max.split.size= 0 ;
> select count(*) from  symlink_text_input_format ;
> {noformat}
> It will get a wrong result: 0.
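As a point of contrast (per item 1 of the description, not part of the patch itself), the same query is expected to resolve the regex correctly when the non-combining input format is used:

{noformat}
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
select count(*) from symlink_text_input_format;
-- expected: the real row count of the files matched by viewfs://nsx/tmp/symlink*, not 0
{noformat}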



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12662) StackOverflowError in HiveSortJoinReduceRule when limit=0

2015-12-12 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12662:
---
Attachment: HIVE-12662.patch

[~pxiong], could you take a look? Thanks

> StackOverflowError in HiveSortJoinReduceRule when limit=0
> -
>
> Key: HIVE-12662
> URL: https://issues.apache.org/jira/browse/HIVE-12662
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12662.patch
>
>
> At L96 of HiveSortJoinReduceRule, you will see 
> {noformat}
> // Finally, if we do not reduce the input size, we bail out
> if (RexLiteral.intValue(sortLimit.fetch)
> >= RelMetadataQuery.getRowCount(reducedInput)) {
>   return false;
> }
> {noformat}
> It is using “RelMetadataQuery.getRowCount”, which is always at least 1. This 
> is the problem that we resolved in CALCITE-987.
> To confirm this, I just ran the following q file:
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.optimize.limitjointranspose=true;
> set hive.optimize.limitjointranspose.reductionpercentage=1f;
> set hive.optimize.limitjointranspose.reductiontuples=0;
> explain
> select *
> from src src1 right outer join (
>   select *
>   from src src2 left outer join src src3
>   on src2.value = src3.value) src2
> on src1.key = src2.key
> limit 0;
> {noformat}
>   And I got
> {noformat}
> 2015-12-11T10:21:04,435 ERROR [c1efb099-f900-46dc-9f74-97af0944a99d main[]]: 
> parse.CalcitePlanner (CalcitePlanner.java:genOPTree(301)) - CBO failed, 
> skipping CBO.
> java.lang.RuntimeException: java.lang.StackOverflowError
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.rethrowCalciteException(CalcitePlanner.java:749)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:645)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:264)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10076)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:223)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1138) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1187) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1063) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> {noformat}
> via [~pxiong]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Transitive Predicate inference

2015-12-12 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12478:
---
Attachment: HIVE-12478.patch

> Improve Hive/Calcite Transitive Predicate inference
> --
>
> Key: HIVE-12478
> URL: https://issues.apache.org/jira/browse/HIVE-12478
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Laljo John Pullokkaran
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12478.patch
>
>
> HiveJoinPushTransitivePredicatesRule does not pull up predicates for 
> transitive inference if they contain more than one column.
> EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from 
> srcpart where  (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds);
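To spell out the inference the rule is missing: the subquery fixes s.ds to '2008-04-08', and the join condition equates srcpart.ds with s.ds, so srcpart.ds = '2008-04-08' could be pushed to the outer srcpart scan (and prune its partitions). For illustration only, a semantically equivalent form with the inferred predicate made explicit:

{noformat}
EXPLAIN
select * from srcpart
join (select ds as ds, ds as `date` from srcpart
      where (ds = '2008-04-08' and value = 1)) s
on (srcpart.ds = s.ds)
where srcpart.ds = '2008-04-08';  -- transitively inferred from s.ds and the join key
{noformat}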



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Transitive Predicate inference

2015-12-12 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12478:
---
Attachment: (was: HIVE-12478.patch)

> Improve Hive/Calcite Transitive Predicate inference
> --
>
> Key: HIVE-12478
> URL: https://issues.apache.org/jira/browse/HIVE-12478
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Laljo John Pullokkaran
>Assignee: Jesus Camacho Rodriguez
>
> HiveJoinPushTransitivePredicatesRule does not pull up predicates for 
> transitive inference if they contain more than one column.
> EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from 
> srcpart where  (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12478) Improve Hive/Calcite Transitive Predicate inference

2015-12-12 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054278#comment-15054278
 ] 

Jesus Camacho Rodriguez commented on HIVE-12478:


The initial patch groups PPD and JoinTransitive inference in a single invocation. 
I have been running some local tests, and we do not fall into any loop.

In addition, HiveRuleRegistry has been enhanced in case we want to register 
predicates that have been pushed. However, the enhancement is not used yet. 
Further, I think we will not need to use it, as being able to register the 
operators that we have already visited in a given rule will be sufficient.

Another discussion that we had is whether HiveRuleRegistry should register the 
operators or the digest of the plan. We still register the operator objects. 
The reason is that if we had registered the digest of a given subplan in the 
tree, and the same subplan appears somewhere else in the tree, we would not 
trigger the optimization (and we should trigger it).

[~pxiong], could you check if after applying this patch, ReduceExpressions 
could be plugged in the same invocation as PPD and join transitive inference 
(step 3 in pre-join optimizations)?

> Improve Hive/Calcite Transitive Predicate inference
> --
>
> Key: HIVE-12478
> URL: https://issues.apache.org/jira/browse/HIVE-12478
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Laljo John Pullokkaran
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12478.patch
>
>
> HiveJoinPushTransitivePredicatesRule does not pull up predicates for 
> transitive inference if they contain more than one column.
> EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from 
> srcpart where  (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Transitive Predicate inference

2015-12-12 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12478:
---
Attachment: HIVE-12478.patch

> Improve Hive/Calcite Transitive Predicate inference
> --
>
> Key: HIVE-12478
> URL: https://issues.apache.org/jira/browse/HIVE-12478
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Laljo John Pullokkaran
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12478.patch
>
>
> HiveJoinPushTransitivePredicatesRule does not pull up predicates for 
> transitive inference if they contain more than one column.
> EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from 
> srcpart where  (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12590) Repeated UDAFs with literals can produce incorrect result

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054261#comment-15054261
 ] 

Hive QA commented on HIVE-12590:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777223/HIVE-12590.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 104 failed/errored test(s), 9895 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cluster
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cp_sel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_identity_project_remove_skip
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join38
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_clusterby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_repeated_alias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf_matchpath
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_regex_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semijoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqual_corr_expr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_folder_constants
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_top_level
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_date_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_round_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_1
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_self_join
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestM

[jira] [Commented] (HIVE-12644) Support for offset in HiveSortMergeRule

2015-12-12 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054221#comment-15054221
 ] 

Jesus Camacho Rodriguez commented on HIVE-12644:


We need to check specifically for {{topOffset < (bottomOffset + bottomLimit)}}.

Observe this setting:
topOffset=1 topLimit=3
bottomOffset=1 bottomLimit=2

In this case, with the change that you propose, we would create a limit of 0. 
However, that is not correct. We need to create a single SortLimit operator 
with:
offset=2, limit=1
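
A minimal sketch (a hypothetical helper, not the actual HiveSortMergeRule code) 
of folding the two stacked offset/limit operators into one, assuming the top 
operator consumes the rows emitted by the bottom one; with the values above it 
yields offset=2, limit=1.
{code}
public final class MergeLimitsSketch {
  // Merge a top (offset/limit) applied on the output of a bottom (offset/limit).
  static int[] merge(int topOffset, int topLimit, int bottomOffset, int bottomLimit) {
    // The merged operator skips the bottom's offset plus the top's offset rows of the input.
    int mergedOffset = bottomOffset + topOffset;
    // It can emit at most what the bottom still has left after the top's skip,
    // capped by the top's own limit (never negative).
    int mergedLimit = Math.max(0, Math.min(topLimit, bottomLimit - topOffset));
    return new int[] { mergedOffset, mergedLimit };
  }

  public static void main(String[] args) {
    int[] merged = merge(1, 3, 1, 2);
    System.out.println("offset=" + merged[0] + " limit=" + merged[1]);  // offset=2 limit=1
  }
}
{code}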

> Support for offset in HiveSortMergeRule
> ---
>
> Key: HIVE-12644
> URL: https://issues.apache.org/jira/browse/HIVE-12644
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12644.patch
>
>
> After HIVE-11531 goes in, HiveSortMergeRule needs to be extended to support 
> offset properly when it merges operators that contain Limit. Otherwise, limit 
> pushdown through outer join optimization (introduced in HIVE-11684) will not 
> work properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12633) LLAP: package included serde jars

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054205#comment-15054205
 ] 

Hive QA commented on HIVE-12633:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777219/HIVE-12633.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 9880 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_custom_key2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6328/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6328/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6328/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777219 - PreCommit-HIVE-TRUNK-Build

> LLAP: package included serde jars
> -
>
> Key: HIVE-12633
> URL: https://issues.apache.org/jira/browse/HIVE-12633
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12633.01.patch, HIVE-12633.02.patch, 
> HIVE-12633.patch
>
>
> Some SerDes, like JSONSerde, are not packaged with LLAP. One cannot localize 
> jars on the daemon (due to security considerations, if nothing else), so we 
> should package them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-12-12 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054188#comment-15054188
 ] 

Gopal V commented on HIVE-12473:


bq. The converter should work off of the final type; it needs to convert the 
string partition value to the output of the evaluator.

What it was doing before the fix was {{cast(dt as int)}}, which is wrong; it 
should do {{cast(dt as date)}}, even if the final UDF type is an int due to 
{{year(dt)}}.
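
A hedged sketch of the distinction being argued here, using the same 
ObjectInspector APIs that appear in the snippet quoted below (the "date" and 
"int" types are assumptions taken from the {{year(dt)}} example): the string 
partition value should be converted to the partition column's own type, not to 
the type the enclosing UDF finally produces.
{code}
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.Converter;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;

public class ConverterSketch {
  public static void main(String[] args) {
    // Effectively cast(dt as int): targeting the UDF's final output type. A
    // date-formatted partition value cannot be meaningfully read as an int.
    ObjectInspector finalTypeOI = PrimitiveObjectInspectorFactory
        .getPrimitiveWritableObjectInspector(TypeInfoFactory.getPrimitiveTypeInfo("int"));
    Converter toFinalType = ObjectInspectorConverters.getConverter(
        PrimitiveObjectInspectorFactory.javaStringObjectInspector, finalTypeOI);

    // Effectively cast(dt as date): targeting the partition column's type, so
    // that year() is then evaluated on a proper date value.
    ObjectInspector columnTypeOI = PrimitiveObjectInspectorFactory
        .getPrimitiveWritableObjectInspector(TypeInfoFactory.getPrimitiveTypeInfo("date"));
    Converter toColumnType = ObjectInspectorConverters.getConverter(
        PrimitiveObjectInspectorFactory.javaStringObjectInspector, columnTypeOI);

    // Converts the string partition value to a date writable, e.g. 2015-12-12.
    System.out.println(toColumnType.convert("2015-12-12"));
  }
}
{code}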

> DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-12473
> URL: https://issues.apache.org/jira/browse/HIVE-12473
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12473.patch
>
>
> Related to HIVE-12462
> {code}
> select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) 
> and account_id = 22;
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> {code}
> Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only 
> checks for final type, not the column type.
> {code}
> ObjectInspector oi =
> 
> PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory
> .getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));
> Converter converter =
> ObjectInspectorConverters.getConverter(
> PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-12-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054165#comment-15054165
 ] 

Gunther Hagleitner commented on HIVE-11634:
---

FYI, it is these changes that cause dpp to no longer work:
{noformat}
-Select Operator
-  expressions: UDFToDouble(UDFToInteger((hr / 2))) (type: 
double)
-  outputColumnNames: _col0
-  Statistics: Num rows: 1 Data size: 7 Basic stats: 
COMPLETE Column stats: NONE
-  Group By Operator
-keys: _col0 (type: double)
-mode: hash
-outputColumnNames: _col0
-Statistics: Num rows: 1 Data size: 7 Basic stats: 
COMPLETE Column stats: NONE
-Dynamic Partitioning Event Operator
-  Target Input: srcpart
-  Partition key expr: UDFToDouble(hr)
-  Statistics: Num rows: 1 Data size: 7 Basic stats: 
COMPLETE Column stats: NONE
-  Target column: hr
-  Target Vertex: Map 1
{noformat}

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>  Labels: TODOC1.3
> Fix For: 1.3.0
>
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, 
> HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch, 
> HIVE-11634.99.patch, HIVE-11634.990.patch, HIVE-11634.991.patch, 
> HIVE-11634.992.patch, HIVE-11634.993.patch, HIVE-11634.994.patch, 
> HIVE-11634.995.patch, HIVE-11634.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate where as we can prune  partition 
> (ds='2000-04-10'). 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'))  
> is used by partition pruner to prune the columns which otherwise will not be 
> pruned.
> This is an extension of the idea presented in HIVE-11573.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12573) some DPP tests are broken

2015-12-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054161#comment-15054161
 ] 

Gunther Hagleitner commented on HIVE-12573:
---

I don't believe the effect is only cosmetic. Having an unexecutable synthetic 
predicate left over in the TS operator breaks anyone trying to implement filter 
pushdown. However, this was introduced by HIVE-12462, which I don't think is 
the right fix in the first place. It makes sense to find out what is happening 
there before proceeding with this.

> some DPP tests are broken
> -
>
> Key: HIVE-12573
> URL: https://issues.apache.org/jira/browse/HIVE-12573
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12573.patch
>
>
> -It looks like LLAP out files were not updated in some DPP JIRA because the 
> test was entirely broken in HiveQA at the time- actually looks like out files 
> have explain output with a glitch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-12-12 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-12462:
--
Priority: Blocker  (was: Critical)

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch
>
>
> After HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)

2015-12-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054154#comment-15054154
 ] 

Gunther Hagleitner commented on HIVE-11634:
---

[~hsubramaniyan]/[~jpullokkaran], this breaks dpp (see the changed golden file in 
this patch, or HIVE-12462). I'm not sure what the best way to fix this is.

> Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
> --
>
> Key: HIVE-11634
> URL: https://issues.apache.org/jira/browse/HIVE-11634
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>  Labels: TODOC1.3
> Fix For: 1.3.0
>
> Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, 
> HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, 
> HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, 
> HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, 
> HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, 
> HIVE-11634.96.patch, HIVE-11634.97.patch, HIVE-11634.98.patch, 
> HIVE-11634.99.patch, HIVE-11634.990.patch, HIVE-11634.991.patch, 
> HIVE-11634.992.patch, HIVE-11634.993.patch, HIVE-11634.994.patch, 
> HIVE-11634.995.patch, HIVE-11634.patch
>
>
> Currently, we do not support partition pruning for the following scenario
> {code}
> create table pcr_t1 (key int, value string) partitioned by (ds string);
> insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src 
> where key < 20 order by key;
> insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src 
> where key < 20 order by key;
> explain extended select ds from pcr_t1 where struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> If we run the above query, we see that all the partitions of table pcr_t1 are 
> present in the filter predicate where as we can prune  partition 
> (ds='2000-04-10'). 
> The optimization is to rewrite the above query into the following.
> {code}
> explain extended select ds from pcr_t1 where  (struct(ds)) IN 
> (struct('2000-04-08'), struct('2000-04-09')) and  struct(ds, key) in 
> (struct('2000-04-08',1), struct('2000-04-09',2));
> {code}
> The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09'))  
> is used by partition pruner to prune the columns which otherwise will not be 
> pruned.
> This is an extension of the idea presented in HIVE-11573.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-12-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054151#comment-15054151
 ] 

Gunther Hagleitner commented on HIVE-12462:
---

I've looked into this some more. This should have already worked. There are 
tests in dynamic_partition_pruning.q that check for this type of join condition 
(with a udf). However, these tests no longer work: HIVE-11634 broke them. 
[~hsubramaniyan]/[~jpullokkaran], can you please take a look at this? I don't 
think the patch proposed here is the right fix, and it should probably be 
reverted.

HIVE-11634 changes the golden file of dynamic_partition_pruning.q - it 
effectively disables the optimization, and I'm not sure why. The synthetic 
predicate in dpp is of the form (col IN (reducesink operator)), which for some 
reason gets lost in HIVE-11634.

HIVE-11634 also seems to leave you with different expressions in the table scan 
and the filter, and I think this is wrong as well (i.e., the fix in this patch 
shouldn't work either).

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch
>
>
> After HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-12-12 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15054147#comment-15054147
 ] 

Gunther Hagleitner commented on HIVE-12473:
---

I can't figure out why this fix would be doing the right thing. The converter 
should work off of the final type; it needs to convert the string partition 
value to the output of the evaluator. I think this should be reverted.

> DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-12473
> URL: https://issues.apache.org/jira/browse/HIVE-12473
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12473.patch
>
>
> Related to HIVE-12462
> {code}
> select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) 
> and account_id = 22;
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> {code}
> Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only 
> checks for final type, not the column type.
> {code}
> ObjectInspector oi =
> 
> PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory
> .getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));
> Converter converter =
> ObjectInspectorConverters.getConverter(
> PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-12-12 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner reopened HIVE-12462:
---

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch
>
>
> After HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)