[jira] [Commented] (HIVE-13354) Add ability to specify Compaction options per table and per request

2016-05-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299500#comment-15299500
 ] 

Hive QA commented on HIVE-13354:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12805815/HIVE-13354.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 80 failed/errored test(s), 9902 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_sortmerge_join_16.q-skewjoin.q-vectorization_div0.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not 
produce a TEST-*.xml file
TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-vector_decimal_trailing.q-subquery_in.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-tez_union_group_by.q-vector_auto_smb_mapjoin_14.q-union_fast_stats.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_orig_table.q-union2.q-bucket4.q-and-12-more - did 
not produce a TEST-*.xml file
TestSparkCliDriver-auto_join30.q-join2.q-input17.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby2.q-custom_input_output_format.q-join41.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-groupby3_map.q-skewjoinopt8.q-union_remove_1.q-and-12-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_complex_types.q-groupby_map_ppr_multi_distinct.q-vectorization_16.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_grouping_id2.q-vectorization_13.q-auto_sortmerge_join_13.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_cond_pushdown_unqual4.q-bucketmapjoin12.q-avro_decimal_native.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-parallel_join1.q-escape_distributeby1.q-auto_sortmerge_join_7.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_transform.q-union_remove_7.q-date_udf.q-and-12-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-script_pipe.q-stats12.q-auto_join24.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-skewjoinopt15.q-join39.q-avro_joins_native.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadata_only_queries_with_filters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge9
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_vec_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_dynamic_partition
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_where_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_round_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_left_outer_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_orderby_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_context
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_nested_mapjoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join21
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join28
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby6
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_map
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input_part2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_leftsemijoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_subquery2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert

[jira] [Commented] (HIVE-13736) View's input/output formats are TEXT by default

2016-05-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299491#comment-15299491
 ] 

Lefty Leverenz commented on HIVE-13736:
---

Doc note: This should be documented in the Create View section of the DDL doc, 
and mentioned in the descriptions for *hive.default.fileformat* and 
*hive.default.fileformat.managed* (with version information).

* [DDL -- Create View | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateView]
* [Configuration Properties -- File Formats | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-FileFormats]
** [hive.default.fileformat | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.default.fileformat]
** [hive.default.fileformat.managed | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.default.fileformat.managed]

The link above to *hive.default.fileformat.managed* won't work until it gets 
documented.  It was introduced by HIVE-9915 in release 1.2.0.

> View's input/output formats are TEXT by default
> ---
>
> Key: HIVE-13736
> URL: https://issues.apache.org/jira/browse/HIVE-13736
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Pavas Garg
>Assignee: Yongzhi Chen
>Priority: Minor
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-13736.1.patch
>
>
> Feature request where Hive View's input/output formats are text by default in 
> order to help 3rd party compatibility



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13736) View's input/output formats are TEXT by default

2016-05-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-13736:
--
Labels: TODOC2.1  (was: )

> View's input/output formats are TEXT by default
> ---
>
> Key: HIVE-13736
> URL: https://issues.apache.org/jira/browse/HIVE-13736
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Pavas Garg
>Assignee: Yongzhi Chen
>Priority: Minor
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-13736.1.patch
>
>
> Feature request where Hive View's input/output formats are text by default in 
> order to help 3rd party compatibility





[jira] [Commented] (HIVE-13760) Add a HIVE_QUERY_TIMEOUT configuration to kill a query if a query is running for more than the configured timeout value.

2016-05-24 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299444#comment-15299444
 ] 

zhihai xu commented on HIVE-13760:
--

I attached a patch, HIVE-13760.000.patch, which adds a configuration 
HIVE_QUERY_TIMEOUT_SECONDS and uses the smaller of the two timeout values as 
the real timeout in SQLOperation. I think it may give users more flexibility, 
which may help save Hadoop cluster resources.
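The "smaller of the two timeouts" rule described above can be sketched as follows (a minimal illustration with hypothetical names; the actual SQLOperation logic may differ):

```python
def effective_timeout(server_timeout_s: int, client_timeout_s: int) -> int:
    """Pick the effective query timeout: the smaller of the server-configured
    timeout and the client-requested one. A value <= 0 means 'no timeout'
    on that side; -1 overall means the query never times out."""
    candidates = [t for t in (server_timeout_s, client_timeout_s) if t > 0]
    return min(candidates) if candidates else -1

# The server-wide config caps the client's request:
assert effective_timeout(3600, 7200) == 3600
# No server limit configured: the client's value applies:
assert effective_timeout(-1, 120) == 120
```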

> Add a HIVE_QUERY_TIMEOUT configuration to kill a query if a query is running 
> for more than the configured timeout value.
> 
>
> Key: HIVE-13760
> URL: https://issues.apache.org/jira/browse/HIVE-13760
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 2.0.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: HIVE-13760.000.patch
>
>
> Add a HIVE_QUERY_TIMEOUT configuration to kill a query if it has been running 
> for more than the configured timeout value. The default value will be -1, 
> which means no timeout. This will be useful for users to manage queries with 
> SLAs.





[jira] [Updated] (HIVE-13760) Add a HIVE_QUERY_TIMEOUT configuration to kill a query if a query is running for more than the configured timeout value.

2016-05-24 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HIVE-13760:
-
Status: Patch Available  (was: Open)

> Add a HIVE_QUERY_TIMEOUT configuration to kill a query if a query is running 
> for more than the configured timeout value.
> 
>
> Key: HIVE-13760
> URL: https://issues.apache.org/jira/browse/HIVE-13760
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 2.0.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: HIVE-13760.000.patch
>
>
> Add a HIVE_QUERY_TIMEOUT configuration to kill a query if it has been running 
> for more than the configured timeout value. The default value will be -1, 
> which means no timeout. This will be useful for users to manage queries with 
> SLAs.





[jira] [Updated] (HIVE-13760) Add a HIVE_QUERY_TIMEOUT configuration to kill a query if a query is running for more than the configured timeout value.

2016-05-24 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated HIVE-13760:
-
Attachment: HIVE-13760.000.patch

> Add a HIVE_QUERY_TIMEOUT configuration to kill a query if a query is running 
> for more than the configured timeout value.
> 
>
> Key: HIVE-13760
> URL: https://issues.apache.org/jira/browse/HIVE-13760
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 2.0.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Attachments: HIVE-13760.000.patch
>
>
> Add a HIVE_QUERY_TIMEOUT configuration to kill a query if it has been running 
> for more than the configured timeout value. The default value will be -1, 
> which means no timeout. This will be useful for users to manage queries with 
> SLAs.





[jira] [Commented] (HIVE-13617) LLAP: support non-vectorized execution in IO

2016-05-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299384#comment-15299384
 ] 

Hive QA commented on HIVE-13617:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12805678/HIVE-13617.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 45 failed/errored test(s), 10035 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not 
produce a TEST-*.xml file
TestMiniTezCliDriver-mapjoin_mapjoin.q-insert_into1.q-vector_decimal_2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-vector_decimal_trailing.q-subquery_in.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_coalesce.q-cbo_windowing.q-tez_join.q-and-12-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_grouping_id2.q-vectorization_13.q-auto_sortmerge_join_13.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_cond_pushdown_unqual4.q-bucketmapjoin12.q-avro_decimal_native.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-multi_insert.q-join5.q-groupby6.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_llap_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_udf
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llapdecider
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mapjoin_decimal
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_ppd_basic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_bmj_schema_evolution
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_group_by
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llapdecider
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_minimr_broken_pipe
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.org.apache.hadoop.hive.metastore.TestMetaStoreMetrics
org.apache.hadoop.hive.metastore.TestRetryingHMSHandler.testRetryingHMSHandler
org.apache.hadoop.hive.ql.security.TestClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testDelegationTokenSharedStore
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testMetastoreProxyUser
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testSaslWithHiveMetaStore
org.apache.hive.hcatalog.cli.TestPermsGrp.testCustomPerms
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.org.apache.hive.service.TestHS2ImpersonationWithRemoteMS
{noformat}

Test results: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/380/testReport
Console output: 

[jira] [Commented] (HIVE-13829) Property "hive.mapjoin.optimized.keys" does not exist

2016-05-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299380#comment-15299380
 ] 

Lefty Leverenz commented on HIVE-13829:
---

HIVE-9331 removed *hive.mapjoin.optimized.keys* and 
*hive.mapjoin.lazy.hashtable* in release 1.1.0, but the documentation hasn't 
been updated yet.

> Property "hive.mapjoin.optimized.keys" does not exist
> -
>
> Key: HIVE-13829
> URL: https://issues.apache.org/jira/browse/HIVE-13829
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Affects Versions: 2.0.0
> Environment: Hadoop 2.7.2, Hive 2.0.0, Spark 1.6.1, Kerberos
>Reporter: Alexandre Linte
>
> Referring to the documentation 
> (https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties), 
> it is possible to set the following property "hive.mapjoin.optimized.keys". 
> Unfortunately, this property seems to be unknown to Hive.
> Here is an extract of the hive-site.xml which includes the property:
> {noformat}
> <property>
>   <name>hive.mapjoin.optimized.hashtable</name>
>   <value>true</value>
>   <description>Whether Hive should use a memory-optimized hash table for 
> MapJoin. Only works on Tez, because memory-optimized hash table cannot be 
> serialized.</description>
> </property>
> {noformat}
> In the logs I have:
> {noformat}
> May 24 09:09:02 hiveserver2.bigdata.fr HiveConf of name 
> hive.mapjoin.optimized.keys does not exist
> {noformat}
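The log message above comes from HiveConf warning about configuration keys it no longer recognizes; a toy sketch of that validation behavior (hypothetical names and key sets, not Hive's actual implementation):

```python
# Hypothetical key sets for illustration only.
KNOWN_KEYS = {"hive.mapjoin.optimized.hashtable"}

def load_conf(pairs):
    """Accept known keys; collect a warning for each unknown key,
    mirroring the 'HiveConf of name ... does not exist' log line."""
    conf, warnings = {}, []
    for key, value in pairs:
        if key in KNOWN_KEYS:
            conf[key] = value
        else:
            warnings.append(f"HiveConf of name {key} does not exist")
    return conf, warnings

conf, warns = load_conf([("hive.mapjoin.optimized.keys", "true"),
                         ("hive.mapjoin.optimized.hashtable", "true")])
assert warns == ["HiveConf of name hive.mapjoin.optimized.keys does not exist"]
assert conf == {"hive.mapjoin.optimized.hashtable": "true"}
```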





[jira] [Comment Edited] (HIVE-13829) Property "hive.mapjoin.optimized.keys" does not exist

2016-05-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299380#comment-15299380
 ] 

Lefty Leverenz edited comment on HIVE-13829 at 5/25/16 3:10 AM:


HIVE-9331 removed *hive.mapjoin.optimized.keys* and 
*hive.mapjoin.lazy.hashtable* in release 1.1.0 but the documentation hasn't 
been updated yet.


was (Author: le...@hortonworks.com):
HIVE-9331 removed *hive.mapjoin.optimized.keys* and 
*hive.mapjoin.lazy.hashtable* in release 1.1.0 but the documentation hasn't be 
updated yet.

> Property "hive.mapjoin.optimized.keys" does not exist
> -
>
> Key: HIVE-13829
> URL: https://issues.apache.org/jira/browse/HIVE-13829
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration
>Affects Versions: 2.0.0
> Environment: Hadoop 2.7.2, Hive 2.0.0, Spark 1.6.1, Kerberos
>Reporter: Alexandre Linte
>
> Referring to the documentation 
> (https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties), 
> it is possible to set the following property "hive.mapjoin.optimized.keys". 
> Unfortunately, this property seems to be unknown to Hive.
> Here is an extract of the hive-site.xml which includes the property:
> {noformat}
> <property>
>   <name>hive.mapjoin.optimized.hashtable</name>
>   <value>true</value>
>   <description>Whether Hive should use a memory-optimized hash table for 
> MapJoin. Only works on Tez, because memory-optimized hash table cannot be 
> serialized.</description>
> </property>
> {noformat}
> In the logs I have:
> {noformat}
> May 24 09:09:02 hiveserver2.bigdata.fr HiveConf of name 
> hive.mapjoin.optimized.keys does not exist
> {noformat}





[jira] [Updated] (HIVE-13841) Orc split generation returns different strategies with cache enabled vs disabled

2016-05-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13841:
-
Attachment: HIVE-13841.1.patch

[~sershe] Can you please take a look?

> Orc split generation returns different strategies with cache enabled vs 
> disabled
> 
>
> Key: HIVE-13841
> URL: https://issues.apache.org/jira/browse/HIVE-13841
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13841.1.patch
>
>
> Split strategy chosen by OrcInputFormat should not change when enabling or 
> disabling the footer cache. Currently, if the footer cache is disabled, 
> minSplits in OrcInputFormat.Context is set to -1, which is used during the 
> determination of split strategies. minSplits should be set to the requested 
> value or some default instead of the cache size.
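The described minSplits interaction can be illustrated with a toy model of the strategy choice (a deliberate simplification, not the actual OrcInputFormat code):

```python
def choose_split_strategy(num_files, avg_file_size_bytes, min_splits,
                          max_size=256 * 1024 * 1024):
    """Toy model: few files relative to min_splits, or large files, favor an
    ETL-style strategy (read footers, split by stripe); otherwise a BI-style
    strategy (one split per file). A min_splits sentinel of -1 skews the
    comparison, which is the inconsistency reported above."""
    if num_files <= min_splits or avg_file_size_bytes > max_size:
        return "ETL"
    return "BI"

# Same input, different strategy once min_splits degrades to -1:
assert choose_split_strategy(10, 1024, 16) == "ETL"
assert choose_split_strategy(10, 1024, -1) == "BI"
```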





[jira] [Updated] (HIVE-13841) Orc split generation returns different strategies with cache enabled vs disabled

2016-05-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13841:
-
Status: Patch Available  (was: Open)

> Orc split generation returns different strategies with cache enabled vs 
> disabled
> 
>
> Key: HIVE-13841
> URL: https://issues.apache.org/jira/browse/HIVE-13841
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13841.1.patch
>
>
> Split strategy chosen by OrcInputFormat should not change when enabling or 
> disabling the footer cache. Currently, if the footer cache is disabled, 
> minSplits in OrcInputFormat.Context is set to -1, which is used during the 
> determination of split strategies. minSplits should be set to the requested 
> value or some default instead of the cache size.





[jira] [Updated] (HIVE-13840) Orc split generation is reading file footers twice

2016-05-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13840:
-
Status: Patch Available  (was: Open)

> Orc split generation is reading file footers twice
> --
>
> Key: HIVE-13840
> URL: https://issues.apache.org/jira/browse/HIVE-13840
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13840.1.patch
>
>
> Recent refactorings to move orc out introduced a regression in split 
> generation. This leads to reading the orc file footers twice during split 
> generation.
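The double read is the classic "recompute instead of reuse" pattern; a minimal sketch of a read-once fix (hypothetical names, not Hive's actual code):

```python
class SplitGenerator:
    """Read each ORC file footer once and reuse it for both stripe
    enumeration and split creation, instead of re-reading per use."""

    def __init__(self, read_footer):
        self._read_footer = read_footer   # expensive I/O call
        self._cache = {}                  # path -> parsed footer

    def footer(self, path):
        if path not in self._cache:
            self._cache[path] = self._read_footer(path)
        return self._cache[path]

reads = []
gen = SplitGenerator(lambda p: reads.append(p) or {"stripes": 3})
gen.footer("a.orc")
gen.footer("a.orc")            # second use hits the cache
assert reads == ["a.orc"]      # footer was read from disk only once
```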





[jira] [Updated] (HIVE-13840) Orc split generation is reading file footers twice

2016-05-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13840:
-
Attachment: HIVE-13840.1.patch

[~owen.omalley] Can you please take a look? This fixes the perf regression. 

> Orc split generation is reading file footers twice
> --
>
> Key: HIVE-13840
> URL: https://issues.apache.org/jira/browse/HIVE-13840
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13840.1.patch
>
>
> Recent refactorings to move orc out introduced a regression in split 
> generation. This leads to reading the orc file footers twice during split 
> generation.





[jira] [Updated] (HIVE-10417) Parallel Order By return wrong results for partitioned tables

2016-05-24 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-10417:
-
Status: Open  (was: Patch Available)

> Parallel Order By return wrong results for partitioned tables
> -
>
> Key: HIVE-10417
> URL: https://issues.apache.org/jira/browse/HIVE-10417
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.0.0, 0.13.1, 0.14.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
> Attachments: HIVE-10417.patch
>
>
> Following is the script that reproduces this bug.
> set hive.optimize.sampling.orderby=true;
> set mapreduce.job.reduces=10;
> select * from src order by key desc limit 10;
> +----------+------------+
> | src.key  | src.value  |
> +----------+------------+
> | 98       | val_98     |
> | 98       | val_98     |
> | 97       | val_97     |
> | 97       | val_97     |
> | 96       | val_96     |
> | 95       | val_95     |
> | 95       | val_95     |
> | 92       | val_92     |
> | 90       | val_90     |
> | 90       | val_90     |
> +----------+------------+
> 10 rows selected (47.916 seconds)
> reset;
> create table src_orc_p (key string ,value string )
> partitioned by (kp string)
> stored as orc
> tblproperties("orc.compress"="SNAPPY");
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.max.dynamic.partitions.pernode=1;
> set hive.exec.max.dynamic.partitions=1;
> insert into table src_orc_p partition(kp) select *,substring(key,1) from src 
> distribute by substring(key,1);
> set mapreduce.job.reduces=10;
> set hive.optimize.sampling.orderby=true;
> select * from src_orc_p order by key desc limit 10;
> +----------------+------------------+-----------------+
> | src_orc_p.key  | src_orc_p.value  | src_orc_p.kend  |
> +----------------+------------------+-----------------+
> | 0              | val_0            | 0               |
> | 0              | val_0            | 0               |
> | 0              | val_0            | 0               |
> +----------------+------------------+-----------------+
> 3 rows selected (39.861 seconds)
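For context, sampled (parallel) order-by builds range cut points from a key sample so each reducer owns a contiguous key range; a toy sketch (not Hive's implementation):

```python
import random

def build_cutpoints(sample_keys, num_reducers):
    """Derive num_reducers - 1 boundary keys from a sorted sample."""
    s = sorted(sample_keys)
    step = len(s) / num_reducers
    return [s[int(step * i)] for i in range(1, num_reducers)]

def reducer_for(key, cutpoints):
    """Route a key to the reducer owning its range."""
    for i, cut in enumerate(cutpoints):
        if key < cut:
            return i
    return len(cutpoints)

random.seed(0)
keys = [str(random.randint(0, 99)) for _ in range(1000)]
cuts = build_cutpoints(random.sample(keys, 100), 4)
# Every key routes to exactly one of the 4 reducers:
assert all(0 <= reducer_for(k, cuts) <= 3 for k in keys)
```

If the sample is drawn unevenly (for example, from only some partitions), the cut points misrepresent the key space and rows are routed to the wrong ranges, which is consistent with the truncated result set shown above.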





[jira] [Updated] (HIVE-13839) [Refactor] Remove SHIMS.listLocatedStatus

2016-05-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13839:

Attachment: HIVE-13839.patch

[~sershe] Can you please take a look?

> [Refactor] Remove SHIMS.listLocatedStatus
> -
>
> Key: HIVE-13839
> URL: https://issues.apache.org/jira/browse/HIVE-13839
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Affects Versions: 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13839.patch
>
>






[jira] [Updated] (HIVE-13839) [Refactor] Remove SHIMS.listLocatedStatus

2016-05-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13839:

Status: Patch Available  (was: Open)

> [Refactor] Remove SHIMS.listLocatedStatus
> -
>
> Key: HIVE-13839
> URL: https://issues.apache.org/jira/browse/HIVE-13839
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Affects Versions: 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13839.patch
>
>






[jira] [Commented] (HIVE-13332) support dumping all row indexes in ORC FileDump

2016-05-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299240#comment-15299240
 ] 

Hive QA commented on HIVE-13332:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12805677/HIVE-13332.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 86 failed/errored test(s), 10089 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_dynpart_hashjoin_3.q-orc_vectorization_ppd.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not 
produce a TEST-*.xml file
TestMiniTezCliDriver-smb_cache.q-transform_ppr2.q-vector_outer_join0.q-and-5-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestHBaseCliDriver.org.apache.hadoop.hive.cli.TestHBaseCliDriver
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge9
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_orc_merge_diff_fs
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_create_merge_compressed
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cte_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join0
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_all_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_grouping_sets
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_join_part_col_char
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_pushdown
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_minimr_broken_pipe
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorWithinDagPriority
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote
org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks
org.apache.hadoop.hive.metastore.TestHiveMetaStoreStatsMerge.testStatsMerge

[jira] [Updated] (HIVE-13794) HIVE_RPC_QUERY_PLAN should always be set when generating LLAP splits

2016-05-24 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-13794:
--
   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Committed to master

> HIVE_RPC_QUERY_PLAN should always be set when generating LLAP splits
> 
>
> Key: HIVE-13794
> URL: https://issues.apache.org/jira/browse/HIVE-13794
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 2.1.0
>
> Attachments: HIVE-13794.1.patch
>
>
> This option was being added in the test, but really should be set any time we 
> are generating the LLAP input splits.
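
A minimal sketch of the fix described above, in Python for illustration (Hive's actual implementation is Java): the configuration option is set inside the split-generation path itself instead of only in the test. The function name `generate_llap_splits`, the dict-based conf, and the config key string are illustrative assumptions, not Hive's API.

```python
# Illustrative sketch only: always set the RPC-query-plan option when
# generating LLAP splits, instead of relying on the caller (e.g. a test)
# to have set it. generate_llap_splits is a hypothetical stand-in.
def generate_llap_splits(conf, num_splits=2):
    conf = dict(conf)                        # copy so the caller's conf is untouched
    conf["hive.rpc.query.plan"] = "true"     # set unconditionally here, not in tests
    return [{"split_id": i, "conf": conf} for i in range(num_splits)]
```

With this shape, every code path that asks for LLAP splits gets the option, whether or not the caller remembered it.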



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started

2016-05-24 Thread Nachiket Vaidya (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299189#comment-15299189
 ] 

Nachiket Vaidya commented on HIVE-13836:


I tested the patch with 8 threads concurrently running DDL operations, and it is 
working fine.
I also measured the time with and without db notifications enabled, and the 
timings are almost the same.


> DbNotifications giving an error = Invalid state. Transaction has already 
> started
> 
>
> Key: HIVE-13836
> URL: https://issues.apache.org/jira/browse/HIVE-13836
> Project: Hive
>  Issue Type: Bug
>Reporter: Nachiket Vaidya
>Priority: Critical
> Attachments: HIVE-13836.patch
>
>
> I used the pyhs2 python client to create tables/partitions in Hive. It was working 
> fine until I moved to multithreaded scripts, which created 8 connections and 
> ran DDL queries concurrently.
> I got the following error:
> {noformat}
> 2016-05-04 17:49:26,226 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-4-thread-194]: 
> HMSHandler Fatal error: Invalid state. Transaction has already started
> org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
> Transaction has already started
> at 
> org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47)
> at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131)
> at 
> org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88)
> at 
> org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:463)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.addNotificationEvent(ObjectStore.java:7522)
> at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
> at com.sun.proxy.$Proxy10.addNotificationEvent(Unknown Source)
> at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.enqueue(DbNotificationListener.java:261)
> at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:123)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1483)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1502)
> at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:138)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
> at 
> com.sun.proxy.$Proxy14.create_table_with_environment_context(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:9267)
> {noformat}
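
The stack trace above shows two handler threads driving the same persistence-layer transaction, so the second `begin()` finds it already active. The following is a hedged sketch of that failure mode and of the usual fix (one transaction holder per thread), in Python for brevity; `Transaction` and `get_txn` are hypothetical stand-ins, not Hive or DataNucleus APIs.

```python
# Sketch of the race: a transaction object shared across threads raises
# exactly this kind of "already started" error on the second begin().
# Transaction and get_txn are illustrative, not real Hive/DataNucleus classes.
import threading

class Transaction:
    def __init__(self):
        self.active = False

    def begin(self):
        if self.active:
            raise RuntimeError("Invalid state. Transaction has already started")
        self.active = True

    def commit(self):
        self.active = False

# Fix sketch: give each thread its own transaction via thread-local storage,
# analogous to a metastore handler using per-thread store instances.
_local = threading.local()

def get_txn():
    if not hasattr(_local, "txn"):
        _local.txn = Transaction()
    return _local.txn
```

With per-thread instances, concurrent DDL callers never observe each other's transaction state.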





[jira] [Commented] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started

2016-05-24 Thread Nachiket Vaidya (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299177#comment-15299177
 ] 

Nachiket Vaidya commented on HIVE-13836:


[~sushanth] and [~alangates], could either of you please take a quick look?



[jira] [Updated] (HIVE-13836) DbNotifications giving an error = Invalid state. Transaction has already started

2016-05-24 Thread Nachiket Vaidya (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nachiket Vaidya updated HIVE-13836:
---
Attachment: HIVE-13836.patch

Patch for the fix



[jira] [Updated] (HIVE-13564) Deprecate HIVE_STATS_COLLECT_RAWDATASIZE

2016-05-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13564:
---
Attachment: HIVE-13564.01.patch

> Deprecate HIVE_STATS_COLLECT_RAWDATASIZE
> 
>
> Key: HIVE-13564
> URL: https://issues.apache.org/jira/browse/HIVE-13564
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer, Statistics
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Minor
> Attachments: HIVE-13564.01.patch
>
>
> Reasons: (1) it is only used in stats20.q; (2) we already have a 
> "HIVESTATSAUTOGATHER" configuration that tells whether we are going to collect 
> rawDataSize and #rows.





[jira] [Updated] (HIVE-13564) Deprecate HIVE_STATS_COLLECT_RAWDATASIZE

2016-05-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13564:
---
Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-13564) Deprecate HIVE_STATS_COLLECT_RAWDATASIZE

2016-05-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13564:
---
Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-13564) Deprecate HIVE_STATS_COLLECT_RAWDATASIZE

2016-05-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13564:
---
Attachment: (was: HIVE-13564.01.patch)



[jira] [Commented] (HIVE-13837) current_timestamp() output format is different in some cases

2016-05-24 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299137#comment-15299137
 ] 

Jason Dere commented on HIVE-13837:
---

+1 if tests look good

> current_timestamp() output format is different in some cases
> 
>
> Key: HIVE-13837
> URL: https://issues.apache.org/jira/browse/HIVE-13837
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13837.01.patch
>
>
> As [~jdere] reports:
> {code}
> The current_timestamp() UDF returns results in different formats in some cases.
> select current_timestamp() returns a result with decimal precision:
> {noformat}
> hive> select current_timestamp();
> OK
> 2016-04-14 18:26:58.875
> Time taken: 0.077 seconds, Fetched: 1 row(s)
> {noformat}
> But the output format is different for select current_timestamp() from all100k 
> union select current_timestamp() from over100k limit 5:
> {noformat}
> hive> select current_timestamp() from all100k union select 
> current_timestamp() from over100k limit 5;
> Query ID = hrt_qa_20160414182956_c4ed48f2-9913-4b3b-8f09-668ebf55b3e3
> Total jobs = 1
> Launching Job 1 out of 1
> Tez session was closed. Reopening...
> Session re-established.
> Status: Running (Executing on YARN cluster with App id 
> application_1460611908643_0624)
> --
> VERTICES  MODESTATUS  TOTAL  COMPLETED  RUNNING  PENDING  
> FAILED  KILLED  
> --
> Map 1 ..  llap SUCCEEDED  1  100  
>  0   0  
> Map 4 ..  llap SUCCEEDED  1  100  
>  0   0  
> Reducer 3 ..  llap SUCCEEDED  1  100  
>  0   0  
> --
> VERTICES: 03/03  [==>>] 100%  ELAPSED TIME: 0.92 s
>  
> --
> OK
> 2016-04-14 18:29:56
> Time taken: 10.558 seconds, Fetched: 1 row(s)
> {noformat}
> explain plan for select current_timestamp();
> {noformat}
> hive> explain extended select current_timestamp();
> OK
> ABSTRACT SYNTAX TREE:
>   
> TOK_QUERY
>TOK_INSERT
>   TOK_DESTINATION
>  TOK_DIR
> TOK_TMP_FILE
>   TOK_SELECT
>  TOK_SELEXPR
> TOK_FUNCTION
>current_timestamp
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> TableScan
>   alias: _dummy_table
>   Row Limit Per Split: 1
>   GatherStats: false
>   Select Operator
> expressions: 2016-04-14 18:30:57.206 (type: timestamp)
> outputColumnNames: _col0
> ListSink
> Time taken: 0.062 seconds, Fetched: 30 row(s)
> {noformat}
> explain plan for select current_timestamp() from all100k union select 
> current_timestamp() from over100k limit 5;
> {noformat}
> hive> explain extended select current_timestamp() from all100k union select 
> current_timestamp() from over100k limit 5;
> OK
> ABSTRACT SYNTAX TREE:
>   
> TOK_QUERY
>TOK_FROM
>   TOK_SUBQUERY
>  TOK_QUERY
> TOK_FROM
>TOK_SUBQUERY
>   TOK_UNIONALL
>  TOK_QUERY
> TOK_FROM
>TOK_TABREF
>   TOK_TABNAME
>  all100k
> TOK_INSERT
>TOK_DESTINATION
>   TOK_DIR
>  TOK_TMP_FILE
>TOK_SELECT
>   TOK_SELEXPR
>  TOK_FUNCTION
> current_timestamp
>  TOK_QUERY
> TOK_FROM
>TOK_TABREF
>   TOK_TABNAME
>  over100k
> TOK_INSERT
>TOK_DESTINATION
>   TOK_DIR
>  TOK_TMP_FILE
>TOK_SELECT
>   TOK_SELEXPR
>  TOK_FUNCTION
> current_timestamp
>   _u1
> TOK_INSERT
>TOK_DESTINATION
>   TOK_DIR
>  
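
The discrepancy reported above is the same instant being rendered once with fractional seconds and once truncated to whole seconds. A small Python sketch (illustrative only; Hive's formatting code is Java) reproduces the two renderings from one timestamp:

```python
# Sketch of the formatting discrepancy: one timestamp value, two renderings,
# matching the two query outputs quoted above.
from datetime import datetime

ts = datetime(2016, 4, 14, 18, 26, 58, 875000)

# Constant-folded path keeps millisecond precision, e.g. "2016-04-14 18:26:58.875"
with_millis = ts.strftime("%Y-%m-%d %H:%M:%S.") + f"{ts.microsecond // 1000:03d}"

# The other execution path truncates to seconds, e.g. "2016-04-14 18:29:56"
without_millis = ts.strftime("%Y-%m-%d %H:%M:%S")
```

A consistent UDF would pick one of these formats on every code path, which is what the attached patch is meant to ensure.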

[jira] [Updated] (HIVE-13564) Deprecate HIVE_STATS_COLLECT_RAWDATASIZE

2016-05-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13564:
---
Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-13564) Deprecate HIVE_STATS_COLLECT_RAWDATASIZE

2016-05-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13564:
---
Attachment: HIVE-13564.01.patch



[jira] [Commented] (HIVE-13837) current_timestamp() output format is different in some cases

2016-05-24 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299132#comment-15299132
 ] 

Pengcheng Xiong commented on HIVE-13837:


[~jdere], could you please review? Thanks.


[jira] [Updated] (HIVE-13837) current_timestamp() output format is different in some cases

2016-05-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13837:
---
Status: Patch Available  (was: Open)


[jira] [Updated] (HIVE-13837) current_timestamp() output format is different in some cases

2016-05-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13837:
---
Attachment: HIVE-13837.01.patch

Not sure why we discovered this so late. CCing [~ashutoshc].


[jira] [Updated] (HIVE-13835) TestMiniTezCliDriver.vector_complex_all.q needs golden file update

2016-05-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13835:

Attachment: HIVE-13835.patch

[~mmccline] Can you take a quick look? Doesn't warrant Hive QA run.

> TestMiniTezCliDriver.vector_complex_all.q needs golden file update
> --
>
> Key: HIVE-13835
> URL: https://issues.apache.org/jira/browse/HIVE-13835
> Project: Hive
>  Issue Type: Task
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13835.patch
>
>






[jira] [Commented] (HIVE-13824) NOSUCHMethodFound org.fusesource.jansi.internal.Kernel32.GetConsoleOutputCP()I

2016-05-24 Thread Ekta Paliwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298956#comment-15298956
 ] 

Ekta Paliwal commented on HIVE-13824:
-

Please help me sir.

> NOSUCHMethodFound org.fusesource.jansi.internal.Kernel32.GetConsoleOutputCP()I
> --
>
> Key: HIVE-13824
> URL: https://issues.apache.org/jira/browse/HIVE-13824
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, Hive
> Environment: WIndows 8, HADOOP 2.7, HIVE 1.2.1, SPARK 1.6.1
>Reporter: Ekta Paliwal
>
> I have been trying to install Hive on Windows. I have 64-bit Windows 8, on 
> which HADOOP and SPARK are running. I have
> 1.HADOOP_HOME
> 2.HIVE_HOME
> 3.SPARK_HOME
> 4.Platform
> 5.PATH
> all these variables set up on my system. Also, I was getting this error 
> before:
> Missing Hive Execution Jar: 
> C:\hadoop1\hadoop-2.7.2\apache-hive-1.2.1-bin/lib/hive-exec-*.jar
> I solved that error by editing the Hive file inside the bin folder of HIVE. 
> The error was caused by the forward slash "/" in environment variables in the 
> HIVE file. I replaced them with "\" and that error was gone. But now I am 
> facing another problem. I am getting this error:
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/C:/spark/spark-1.6.1-bin-hadoop2.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/C:/hadoop2.7/hadoop-2.7.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> Beeline version 1.6.1 by Apache Hive
> Exception in thread "main" java.lang.NoSuchMethodError: 
> org.fusesource.jansi.internal.Kernel32.GetConsoleOutputCP()I
> at 
> jline.WindowsTerminal.getConsoleOutputCodepage(WindowsTerminal.java:293)
> at jline.WindowsTerminal.getOutputEncoding(WindowsTerminal.java:186)
> at jline.console.ConsoleReader.<init>(ConsoleReader.java:230)
> at jline.console.ConsoleReader.<init>(ConsoleReader.java:221)
> at jline.console.ConsoleReader.<init>(ConsoleReader.java:209)
> at org.apache.hive.beeline.BeeLine.getConsoleReader(BeeLine.java:834)
> at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:770)
> at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:484)
> at org.apache.hive.beeline.BeeLine.main(BeeLine.java:467)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> I have searched a lot for this error and also posted this question on the 
> Hive user mailing list, but got no response. Googling the error returns no 
> results either. Please help me with this.
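A NoSuchMethodError at runtime usually means a different version of a class was loaded than the one the caller was compiled against. The sketch below is a generic diagnostic (not part of Hive) that asks the classloader which classpath entry actually supplies a class, e.g. org.fusesource.jansi.internal.Kernel32, so conflicting jansi/jline jars can be spotted:

```java
public class WhichJar {
    // Resolve which classpath entry provides a given class; conflicting
    // versions of the same library on the classpath are a common cause of
    // NoSuchMethodError like the one in the stack trace above.
    public static String locate(String className) {
        String resource = className.replace('.', '/') + ".class";
        java.net.URL url = WhichJar.class.getClassLoader().getResource(resource);
        return url == null ? null : url.toString();
    }

    public static void main(String[] args) {
        // e.g. pass org.fusesource.jansi.internal.Kernel32 to see which jar wins
        System.out.println(locate("java.lang.String"));
    }
}
```

Running it with the suspect class name on the Beeline classpath would show whether the jar that wins delegation actually contains the expected method's version.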





[jira] [Commented] (HIVE-13248) Change date_add/date_sub/to_date functions to return Date type rather than String

2016-05-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298948#comment-15298948
 ] 

Ashutosh Chauhan commented on HIVE-13248:
-

[~mmccline] is the best person to review this change.

> Change date_add/date_sub/to_date functions to return Date type rather than 
> String
> -
>
> Key: HIVE-13248
> URL: https://issues.apache.org/jira/browse/HIVE-13248
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-13248.1.patch, HIVE-13248.2.patch
>
>
> Some of the original "date" related functions return string values rather 
> than Date values, because they were created before the Date type existed in 
> Hive. We can try to change these to return Date in the 2.x line.
> Date values should be implicitly convertible to String.
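The proposed semantics can be sketched with java.time (an illustration only, not Hive's implementation): the function returns a typed date value, and the familiar string form remains available through conversion.

```java
import java.time.LocalDate;

public class DateAddSketch {
    // date_add returning a typed Date instead of a String
    static LocalDate dateAdd(LocalDate d, int days) {
        return d.plusDays(days);
    }

    public static void main(String[] args) {
        LocalDate r = dateAdd(LocalDate.parse("2016-05-24"), 1);
        System.out.println(r);            // prints 2016-05-25
        String s = r.toString();          // string form stays available
        System.out.println(s);
    }
}
```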





[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-05-24 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298923#comment-15298923
 ] 

Aihua Xu commented on HIVE-13149:
-

Those tests don't seem to be related.

[~ctang.ma] I just fixed the test TestJdbcWithMiniHS2 in the new patch. Can you 
take another look?

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, 
> HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later.
> When SessionState is accessed by the tasks in TaskRunner.java, most tasks, 
> apart from a few like StatsTask, don't need to access HMS, yet a new HMS 
> connection is currently established for each Task thread. If HiveServer2 is 
> configured to run in parallel and the query involves many tasks, the 
> connections are created but never used.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}
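One way to avoid the eager per-thread connection is to defer it behind a lazy holder, so a task that never touches HMS never opens one. This is a sketch with hypothetical names, not Hive's actual code:

```java
import java.util.function.Supplier;

// Lazily create an expensive client (e.g. a metastore connection) only on
// first use, instead of eagerly in SessionState.start().
public class LazyClient<T> {
    private final Supplier<T> factory;
    private T client;  // created on demand

    public LazyClient(Supplier<T> factory) {
        this.factory = factory;
    }

    public synchronized T get() {
        if (client == null) {
            client = factory.get();  // connect only when actually needed
        }
        return client;
    }

    public synchronized boolean isConnected() {
        return client != null;
    }
}
```

Tasks that never call get() never pay for a connection, which is the behavior the description argues for.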





[jira] [Commented] (HIVE-9422) LLAP: row-level vectorized SARGs

2016-05-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298912#comment-15298912
 ] 

Hive QA commented on HIVE-9422:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12798204/HIVE-9422.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/378/testReport
Console output: 
http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/378/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-378/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-378/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   8a67958..115d225  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 8a67958 HIVE-13799 :  Optimize TableScanRule::checkBucketedTable 
(Rajesh Balamohan via Gopal V)
+ git clean -f -d
Removing 
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java.orig
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at 115d225 HIVE-12467: Add number of dynamic partitions to error 
message (Lars Francke reviewed by Prasanth Jayachandran)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12798204 - PreCommit-HIVE-MASTER-Build

> LLAP: row-level vectorized SARGs
> 
>
> Key: HIVE-9422
> URL: https://issues.apache.org/jira/browse/HIVE-9422
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Sergey Shelukhin
>Assignee: Yohei Abe
> Attachments: HIVE-9422.2.patch, HIVE-9422.WIP1.patch
>
>
> When VRBs are built from encoded data, SARGs can be applied at a low level 
> to reduce the number of rows to process.
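The idea can be illustrated with a simplified row-level filter (hypothetical structures, not Hive's actual classes; Hive's VectorizedRowBatch uses a similar selected-index array): a predicate is evaluated per row and survivors are recorded in an index array, so downstream operators only visit the selected rows.

```java
public class SargFilter {
    // Evaluate a simple "column >= min" predicate over a column vector and
    // record qualifying row indices in the selected array.
    public static int filter(long[] column, long min, int[] selected) {
        int n = 0;
        for (int i = 0; i < column.length; i++) {
            if (column[i] >= min) {
                selected[n++] = i;  // keep only qualifying rows
            }
        }
        return n;  // number of selected rows
    }
}
```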





[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-05-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298910#comment-15298910
 ] 

Hive QA commented on HIVE-13149:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12805914/HIVE-13149.8.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 9986 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-explainuser_4.q-update_after_multiple_inserts.q-mapreduce2.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_grouping_sets.q-update_all_partitioned.q-cte_5.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_interval_2.q-schema_evol_text_nonvec_mapwork_part_all_primitive.q-tez_fsstat.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorized_parquet.q-insert_values_non_partitioned.q-schema_evol_orc_nonvec_mapwork_part.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-auto_join_reordering_values.q-ptf_seqfile.q-auto_join18.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-bucketmapjoin10.q-join_rc.q-skewjoinopt13.q-and-12-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_grouping_id2.q-vectorization_13.q-auto_sortmerge_join_13.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-order.q-auto_join18_multi_distinct.q-union2.q-and-12-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-skewjoin_noskew.q-sample2.q-skewjoinopt10.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.llap.daemon.impl.TestTaskExecutorService.testPreemptionQueueComparator
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparator
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparatorParallelism
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparatorWithinDagPriority
org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote
org.apache.hadoop.hive.metastore.TestFilterHooks.org.apache.hadoop.hive.metastore.TestFilterHooks
org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf.org.apache.hadoop.hive.metastore.TestHiveMetaStoreGetMetaConf
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs
org.apache.hadoop.hive.metastore.TestHiveMetaStoreStatsMerge.testStatsMerge
org.apache.hadoop.hive.metastore.TestMarkPartitionRemote.testMarkingPartitionSet
org.apache.hadoop.hive.metastore.TestMetaStoreEndFunctionListener.testEndFunctionListener
org.apache.hadoop.hive.metastore.TestMetaStoreInitListener.testMetaStoreInitListener
org.apache.hadoop.hive.metastore.TestMetaStoreMetrics.org.apache.hadoop.hive.metastore.TestMetaStoreMetrics
org.apache.hadoop.hive.metastore.TestRemoteUGIHiveMetaStoreIpAddress.testIpAddress
org.apache.hadoop.hive.metastore.TestRetryingHMSHandler.testRetryingHMSHandler
org.apache.hadoop.hive.ql.security.TestClientSideAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestFolderPermissions.org.apache.hadoop.hive.ql.security.TestFolderPermissions
org.apache.hadoop.hive.ql.security.TestMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropDatabase
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationDrops.testDropPartition
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProvider.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationProviderWithACL.testSimplePrivileges
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbFailure
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadDbSuccess
org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.testReadTableFailure

[jira] [Commented] (HIVE-13759) LlapTaskUmbilicalExternalClient should be closed by the record reader

2016-05-24 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298843#comment-15298843
 ] 

Jason Dere commented on HIVE-13759:
---

[~sseth] can you take a look?

> LlapTaskUmbilicalExternalClient should be closed by the record reader
> -
>
> Key: HIVE-13759
> URL: https://issues.apache.org/jira/browse/HIVE-13759
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-13759.1.patch
>
>
> The umbilical external client (and the server socket it creates) doesn't look 
> like it's getting closed.
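A minimal sketch of the proposed ownership (hypothetical types, not the actual LLAP classes): the record reader's close() closes the client, which in turn is responsible for releasing the server socket it created.

```java
// The record reader owns the umbilical client and closes it, so the
// client's server socket is released along with the reader.
public class LlapRecordReaderSketch implements AutoCloseable {
    private final AutoCloseable umbilicalClient;

    public LlapRecordReaderSketch(AutoCloseable client) {
        this.umbilicalClient = client;
    }

    @Override
    public void close() throws Exception {
        umbilicalClient.close();  // client closes its own server socket here
    }
}
```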





[jira] [Updated] (HIVE-13248) Change date_add/date_sub/to_date functions to return Date type rather than String

2016-05-24 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-13248:
--
Attachment: HIVE-13248.2.patch

Updating another qtest - CASE statement required casting date_add() to string 
due to the changed return type.

> Change date_add/date_sub/to_date functions to return Date type rather than 
> String
> -
>
> Key: HIVE-13248
> URL: https://issues.apache.org/jira/browse/HIVE-13248
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-13248.1.patch, HIVE-13248.2.patch
>
>
> Some of the original "date" related functions return string values rather 
> than Date values, because they were created before the Date type existed in 
> Hive. We can try to change these to return Date in the 2.x line.
> Date values should be implicitly convertible to String.





[jira] [Updated] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException

2016-05-24 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-13282:
--
Attachment: smb_fail_issue.patch

> GroupBy and select operator encounter ArrayIndexOutOfBoundsException
> 
>
> Key: HIVE-13282
> URL: https://issues.apache.org/jira/browse/HIVE-13282
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1, 2.0.0, 2.1.0
>Reporter: Vikram Dixit K
>Assignee: Matt McCline
> Attachments: smb_fail_issue.patch, smb_groupby.q, smb_groupby.q.out
>
>
> The group by and select operators run into the ArrayIndexOutOfBoundsException 
> when they incorrectly initialize themselves with tag 0 but the incoming tag 
> id is different.
> {code}
> select count(*) from
> (select rt1.id from
> (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1
> join
> (select rt2.id from
> (select t2.key as id, t2.value as od from tab_part t2 group by key, value) 
> rt2) vt2
> where vt1.id=vt2.id;
> {code}
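The failure mode can be reproduced in miniature (hypothetical, simplified code, not Hive's operators): state that is sized and initialized as if only tag 0 can arrive indexes out of bounds as soon as a row carries a different tag.

```java
public class TagState {
    private final Object[] perTag;

    public TagState(int numTags) {
        perTag = new Object[numTags];
    }

    public void init(int tag) {
        perTag[tag] = new Object();
    }

    public boolean ready(int tag) {
        return perTag[tag] != null;  // AIOOBE for an unexpected tag
    }
}
```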





[jira] [Updated] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException

2016-05-24 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-13282:
--
Attachment: (was: smb_fail_issue.patch)

> GroupBy and select operator encounter ArrayIndexOutOfBoundsException
> 
>
> Key: HIVE-13282
> URL: https://issues.apache.org/jira/browse/HIVE-13282
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1, 2.0.0, 2.1.0
>Reporter: Vikram Dixit K
>Assignee: Matt McCline
> Attachments: smb_fail_issue.patch, smb_groupby.q, smb_groupby.q.out
>
>
> The group by and select operators run into the ArrayIndexOutOfBoundsException 
> when they incorrectly initialize themselves with tag 0 but the incoming tag 
> id is different.
> {code}
> select count(*) from
> (select rt1.id from
> (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1
> join
> (select rt2.id from
> (select t2.key as id, t2.value as od from tab_part t2 group by key, value) 
> rt2) vt2
> where vt1.id=vt2.id;
> {code}





[jira] [Commented] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException

2016-05-24 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298795#comment-15298795
 ] 

Vikram Dixit K commented on HIVE-13282:
---

[~mmccline] Can you try the patch I have attached above? I think that is how I 
was able to repro the issue. With the patch attached here, it sometimes 
produces a result, but the result is wrong (and if you switch the tables 
around, it may even throw an exception).

> GroupBy and select operator encounter ArrayIndexOutOfBoundsException
> 
>
> Key: HIVE-13282
> URL: https://issues.apache.org/jira/browse/HIVE-13282
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1, 2.0.0, 2.1.0
>Reporter: Vikram Dixit K
>Assignee: Matt McCline
> Attachments: smb_fail_issue.patch, smb_groupby.q, smb_groupby.q.out
>
>
> The group by and select operators run into the ArrayIndexOutOfBoundsException 
> when they incorrectly initialize themselves with tag 0 but the incoming tag 
> id is different.
> {code}
> select count(*) from
> (select rt1.id from
> (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1
> join
> (select rt2.id from
> (select t2.key as id, t2.value as od from tab_part t2 group by key, value) 
> rt2) vt2
> where vt1.id=vt2.id;
> {code}





[jira] [Updated] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException

2016-05-24 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-13282:
--
Attachment: smb_fail_issue.patch

> GroupBy and select operator encounter ArrayIndexOutOfBoundsException
> 
>
> Key: HIVE-13282
> URL: https://issues.apache.org/jira/browse/HIVE-13282
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.1, 2.0.0, 2.1.0
>Reporter: Vikram Dixit K
>Assignee: Matt McCline
> Attachments: smb_fail_issue.patch, smb_groupby.q, smb_groupby.q.out
>
>
> The group by and select operators run into the ArrayIndexOutOfBoundsException 
> when they incorrectly initialize themselves with tag 0 but the incoming tag 
> id is different.
> {code}
> select count(*) from
> (select rt1.id from
> (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1
> join
> (select rt2.id from
> (select t2.key as id, t2.value as od from tab_part t2 group by key, value) 
> rt2) vt2
> where vt1.id=vt2.id;
> {code}





[jira] [Updated] (HIVE-11956) SHOW LOCKS should indicate what acquired the lock

2016-05-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11956:
--
Status: Patch Available  (was: Open)

> SHOW LOCKS should indicate what acquired the lock
> -
>
> Key: HIVE-11956
> URL: https://issues.apache.org/jira/browse/HIVE-11956
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-11956.patch
>
>
> This can be a queryId, Flume agent id, Storm bolt id, etc.  This would 
> dramatically help diagnosing issues.





[jira] [Commented] (HIVE-13518) Hive on Tez: Shuffle joins do not choose the right 'big' table.

2016-05-24 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298769#comment-15298769
 ] 

Vikram Dixit K commented on HIVE-13518:
---

Yes. The test failures are related. I will post an update addressing them.

> Hive on Tez: Shuffle joins do not choose the right 'big' table.
> ---
>
> Key: HIVE-13518
> URL: https://issues.apache.org/jira/browse/HIVE-13518
> Project: Hive
>  Issue Type: Bug
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-13518.1.patch
>
>
> Currently the big table is always assumed to be at position 0 but this isn't 
> efficient for some queries as the big table at position 1 could have a lot 
> more keys/skew. We already have a mechanism of choosing the big table that 
> can be leveraged to make the right choice.
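A sketch of the selection logic, under the stated assumption that per-input size estimates are available (hypothetical names, not Hive's planner code): pick the position of the largest input instead of defaulting to position 0.

```java
public class BigTableChooser {
    // Choose the "big" table side of a shuffle join by comparing estimated
    // row counts, rather than assuming it sits at position 0.
    public static int choose(long[] estimatedRows) {
        int big = 0;
        for (int i = 1; i < estimatedRows.length; i++) {
            if (estimatedRows[i] > estimatedRows[big]) {
                big = i;  // larger input found at this position
            }
        }
        return big;
    }
}
```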





[jira] [Updated] (HIVE-13833) Add an initial delay when starting the heartbeat

2016-05-24 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13833:
-
Status: Patch Available  (was: Open)

> Add an initial delay when starting the heartbeat
> 
>
> Key: HIVE-13833
> URL: https://issues.apache.org/jira/browse/HIVE-13833
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Minor
> Attachments: HIVE-13833.1.patch
>
>
> Since the heartbeat is scheduled immediately after lock acquisition, it is 
> unnecessary to send a heartbeat at the moment the locks are acquired. Add an 
> initial delay to skip this.
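With a ScheduledExecutorService, the change amounts to passing a non-zero initial delay. A sketch under the assumption that heartbeats run on such a pool (names are hypothetical, not Hive's code):

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HeartbeatSketch {
    // initialDelay == intervalMs: the first heartbeat fires only after one
    // full interval, skipping the redundant beat at lock-acquisition time.
    public static void start(ScheduledExecutorService pool, Runnable beat, long intervalMs) {
        pool.scheduleAtFixedRate(beat, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }
}
```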





[jira] [Commented] (HIVE-13248) Change date_add/date_sub/to_date functions to return Date type rather than String

2016-05-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298767#comment-15298767
 ] 

Ashutosh Chauhan commented on HIVE-13248:
-

[~jdere] If this is ready for review can you please create a RB?

> Change date_add/date_sub/to_date functions to return Date type rather than 
> String
> -
>
> Key: HIVE-13248
> URL: https://issues.apache.org/jira/browse/HIVE-13248
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-13248.1.patch
>
>
> Some of the original "date" related functions return string values rather 
> than Date values, because they were created before the Date type existed in 
> Hive. We can try to change these to return Date in the 2.x line.
> Date values should be implicitly convertible to String.





[jira] [Updated] (HIVE-13833) Add an initial delay when starting the heartbeat

2016-05-24 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13833:
-
Attachment: HIVE-13833.1.patch

[~ekoifman] Can you take a look?

> Add an initial delay when starting the heartbeat
> 
>
> Key: HIVE-13833
> URL: https://issues.apache.org/jira/browse/HIVE-13833
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>Priority: Minor
> Attachments: HIVE-13833.1.patch
>
>
> Since the heartbeat is scheduled immediately after lock acquisition, it is 
> unnecessary to send a heartbeat at the moment the locks are acquired. Add an 
> initial delay to skip this.





[jira] [Commented] (HIVE-13518) Hive on Tez: Shuffle joins do not choose the right 'big' table.

2016-05-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298763#comment-15298763
 ] 

Ashutosh Chauhan commented on HIVE-13518:
-

[~vikram.dixit] Does this need more work?

> Hive on Tez: Shuffle joins do not choose the right 'big' table.
> ---
>
> Key: HIVE-13518
> URL: https://issues.apache.org/jira/browse/HIVE-13518
> Project: Hive
>  Issue Type: Bug
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: HIVE-13518.1.patch
>
>
> Currently the big table is always assumed to be at position 0 but this isn't 
> efficient for some queries as the big table at position 1 could have a lot 
> more keys/skew. We already have a mechanism of choosing the big table that 
> can be leveraged to make the right choice.





[jira] [Updated] (HIVE-11956) SHOW LOCKS should indicate what acquired the lock

2016-05-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-11956:
--
Attachment: HIVE-11956.patch

> SHOW LOCKS should indicate what acquired the lock
> -
>
> Key: HIVE-11956
> URL: https://issues.apache.org/jira/browse/HIVE-11956
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Transactions
>Affects Versions: 0.14.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-11956.patch
>
>
> This can be a queryId, Flume agent id, Storm bolt id, etc.  This would 
> dramatically help diagnosing issues.





[jira] [Updated] (HIVE-12467) Add number of dynamic partitions to error message

2016-05-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12467:
-
  Resolution: Fixed
   Fix Version/s: 2.1.0
Target Version/s: 2.1.0
  Status: Resolved  (was: Patch Available)

llap_partitioned ran successfully without any issues. 

Thanks [~lars_francke] for the contribution! Committed patch to master. 

> Add number of dynamic partitions to error message
> -
>
> Key: HIVE-12467
> URL: https://issues.apache.org/jira/browse/HIVE-12467
> Project: Hive
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Lars Francke
>Priority: Minor
> Fix For: 2.1.0
>
> Attachments: HIVE-12467.2.patch, HIVE-12467.patch
>
>
> Currently, when using a dynamic partition insert, we get an error message 
> saying that the client tried to create too many dynamic partitions ("Maximum 
> was set to"). I'll extend the error message to specify the number of dynamic 
> partitions, which can be helpful for debugging.
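A sketch of the extended message (hypothetical helper, not the actual Hive code path): reporting both the configured limit and the attempted count tells the user exactly how far over the limit the insert went.

```java
public class DynPartError {
    // Build an error message carrying both the configured maximum and the
    // number of dynamic partitions the insert actually tried to create.
    static String tooManyDynamicPartitions(int attempted, int max) {
        return "Fatal error: tried to create " + attempted
                + " dynamic partitions; maximum was set to " + max + ".";
    }
}
```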





[jira] [Updated] (HIVE-13808) Use constant expressions to backtrack when we create ReduceSink

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13808:
---
Attachment: HIVE-13808.patch

> Use constant expressions to backtrack when we create ReduceSink
> ---
>
> Key: HIVE-13808
> URL: https://issues.apache.org/jira/browse/HIVE-13808
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13808.patch
>
>
> Follow-up of HIVE-13068.
> When we create an RS with constant expressions as keys/values, we 
> immediately create a SEL operator afterwards that backtracks the expressions 
> from the RS. Currently, we automatically create references for all the 
> keys/values.
> Previously, we could rely on Hive ConstantPropagate to propagate the 
> constants to the SEL. However, after HIVE-13068, Hive ConstantPropagate is 
> no longer exercised. Thus, we can simply create constant expressions when we 
> create the SEL operator instead of references.
> Ex. ql/src/test/results/clientpositive/vector_coalesce.q.out
> {noformat}
> EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, 
> cstring1, cint, cfloat, csmallint) as c
> FROM alltypesorc
> WHERE (cdouble IS NULL)
> ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
> LIMIT 10
> {noformat}
> Plan:
> {noformat}
> EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, 
> cstring1, cint, cfloat, csmallint) as c
> FROM alltypesorc
> WHERE (cdouble IS NULL)
> ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
> LIMIT 10
> POSTHOOK: type: QUERY
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: alltypesorc
> Statistics: Num rows: 12288 Data size: 2641964 Basic stats: 
> COMPLETE Column stats: NONE
> Filter Operator
>   predicate: cdouble is null (type: boolean)
>   Statistics: Num rows: 6144 Data size: 1320982 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: cstring1 (type: string), cint (type: int), 
> cfloat (type: float), csmallint (type: smallint), 
> COALESCE(null,cstring1,cint,cfloat,csmallint) (type: string)
> outputColumnNames: _col1, _col2, _col3, _col4, _col5
> Statistics: Num rows: 6144 Data size: 1320982 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: null (type: double), _col1 (type: string), 
> _col2 (type: int), _col3 (type: float), _col4 (type: smallint), _col5 (type: 
> string)
>   sort order: ++
>   Statistics: Num rows: 6144 Data size: 1320982 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
>   Execution mode: vectorized
>   Reduce Operator Tree:
> Select Operator
>   expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey1 
> (type: string), KEY.reducesinkkey2 (type: int), KEY.reducesinkkey3 (type: 
> float), KEY.reducesinkkey4 (type: smallint), KEY.reducesinkkey5 (type: string)
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
>   Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE 
> Column stats: NONE
>   Limit
> Number of rows: 10
> Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE 
> Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE 
> Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 10
>   Processor Tree:
> ListSink
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-13808) Use constant expressions to backtrack when we create ReduceSink

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13808 started by Jesus Camacho Rodriguez.
--
> Use constant expressions to backtrack when we create ReduceSink
> ---
>
> Key: HIVE-13808
> URL: https://issues.apache.org/jira/browse/HIVE-13808
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up of HIVE-13068.
> When we create an RS with constant expressions as keys/values, immediately 
> afterwards we create a SEL operator that backtracks the expressions from the RS. 
> Currently, we automatically create references for all the keys/values.
> Before, we could rely on Hive ConstantPropagate to propagate the constants to 
> the SEL. However, after HIVE-13068, Hive ConstantPropagate does not get 
> exercised anymore. Thus, we can simply create constant expressions, instead of 
> references, when we create the SEL operator.
> Ex. ql/src/test/results/clientpositive/vector_coalesce.q.out
> {noformat}
> EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, 
> cstring1, cint, cfloat, csmallint) as c
> FROM alltypesorc
> WHERE (cdouble IS NULL)
> ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
> LIMIT 10
> {noformat}
> Plan:
> {noformat}
> EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, 
> cstring1, cint, cfloat, csmallint) as c
> FROM alltypesorc
> WHERE (cdouble IS NULL)
> ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
> LIMIT 10
> POSTHOOK: type: QUERY
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: alltypesorc
> Statistics: Num rows: 12288 Data size: 2641964 Basic stats: 
> COMPLETE Column stats: NONE
> Filter Operator
>   predicate: cdouble is null (type: boolean)
>   Statistics: Num rows: 6144 Data size: 1320982 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: cstring1 (type: string), cint (type: int), 
> cfloat (type: float), csmallint (type: smallint), 
> COALESCE(null,cstring1,cint,cfloat,csmallint) (type: string)
> outputColumnNames: _col1, _col2, _col3, _col4, _col5
> Statistics: Num rows: 6144 Data size: 1320982 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: null (type: double), _col1 (type: string), 
> _col2 (type: int), _col3 (type: float), _col4 (type: smallint), _col5 (type: 
> string)
>   sort order: ++
>   Statistics: Num rows: 6144 Data size: 1320982 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
>   Execution mode: vectorized
>   Reduce Operator Tree:
> Select Operator
>   expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey1 
> (type: string), KEY.reducesinkkey2 (type: int), KEY.reducesinkkey3 (type: 
> float), KEY.reducesinkkey4 (type: smallint), KEY.reducesinkkey5 (type: string)
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
>   Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE 
> Column stats: NONE
>   Limit
> Number of rows: 10
> Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE 
> Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE 
> Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 10
>   Processor Tree:
> ListSink
> {noformat}
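The change above can be sketched as follows (a purely illustrative Python sketch; Hive's actual operator code is Java, and all names here are hypothetical): when building the SEL on top of the ReduceSink, constant key/value expressions are kept as constants instead of being backtracked into column references, so no later constant-propagation pass is needed.

```python
# Illustrative sketch (not Hive's real code; names are hypothetical) of
# emitting constants directly when creating the SEL operator over a
# ReduceSink, instead of creating a column reference for every key/value.

def build_sel_exprs(rs_key_exprs):
    """Map each RS key expression to its SEL-side expression.

    Constants stay constants; anything else becomes a reference to the
    corresponding reducesink output column.
    """
    sel_exprs = []
    for i, expr in enumerate(rs_key_exprs):
        if expr[0] == "const":                  # ("const", value)
            sel_exprs.append(expr)              # keep the constant itself
        else:                                   # ("col", name)
            sel_exprs.append(("ref", "KEY.reducesinkkey%d" % i))
    return sel_exprs

# In the vector_coalesce plan above, cdouble is known to be NULL, so the
# first sort key is a constant and the SEL can carry it directly.
keys = [("const", None), ("col", "cstring1"), ("col", "cint")]
print(build_sel_exprs(keys))
```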





[jira] [Updated] (HIVE-13808) Use constant expressions to backtrack when we create ReduceSink

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13808:
---
Status: Patch Available  (was: In Progress)

> Use constant expressions to backtrack when we create ReduceSink
> ---
>
> Key: HIVE-13808
> URL: https://issues.apache.org/jira/browse/HIVE-13808
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>
> Follow-up of HIVE-13068.
> When we create an RS with constant expressions as keys/values, immediately 
> afterwards we create a SEL operator that backtracks the expressions from the RS. 
> Currently, we automatically create references for all the keys/values.
> Before, we could rely on Hive ConstantPropagate to propagate the constants to 
> the SEL. However, after HIVE-13068, Hive ConstantPropagate does not get 
> exercised anymore. Thus, we can simply create constant expressions, instead of 
> references, when we create the SEL operator.
> Ex. ql/src/test/results/clientpositive/vector_coalesce.q.out
> {noformat}
> EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, 
> cstring1, cint, cfloat, csmallint) as c
> FROM alltypesorc
> WHERE (cdouble IS NULL)
> ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
> LIMIT 10
> {noformat}
> Plan:
> {noformat}
> EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, 
> cstring1, cint, cfloat, csmallint) as c
> FROM alltypesorc
> WHERE (cdouble IS NULL)
> ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
> LIMIT 10
> POSTHOOK: type: QUERY
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: alltypesorc
> Statistics: Num rows: 12288 Data size: 2641964 Basic stats: 
> COMPLETE Column stats: NONE
> Filter Operator
>   predicate: cdouble is null (type: boolean)
>   Statistics: Num rows: 6144 Data size: 1320982 Basic stats: 
> COMPLETE Column stats: NONE
>   Select Operator
> expressions: cstring1 (type: string), cint (type: int), 
> cfloat (type: float), csmallint (type: smallint), 
> COALESCE(null,cstring1,cint,cfloat,csmallint) (type: string)
> outputColumnNames: _col1, _col2, _col3, _col4, _col5
> Statistics: Num rows: 6144 Data size: 1320982 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   key expressions: null (type: double), _col1 (type: string), 
> _col2 (type: int), _col3 (type: float), _col4 (type: smallint), _col5 (type: 
> string)
>   sort order: ++
>   Statistics: Num rows: 6144 Data size: 1320982 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
>   Execution mode: vectorized
>   Reduce Operator Tree:
> Select Operator
>   expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey1 
> (type: string), KEY.reducesinkkey2 (type: int), KEY.reducesinkkey3 (type: 
> float), KEY.reducesinkkey4 (type: smallint), KEY.reducesinkkey5 (type: string)
>   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
>   Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE 
> Column stats: NONE
>   Limit
> Number of rows: 10
> Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE 
> Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE 
> Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 10
>   Processor Tree:
> ListSink
> {noformat}





[jira] [Resolved] (HIVE-13072) ROW_NUMBER() function creates wrong results

2016-05-24 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen resolved HIVE-13072.
-
Resolution: Duplicate

> ROW_NUMBER() function creates wrong results
> ---
>
> Key: HIVE-13072
> URL: https://issues.apache.org/jira/browse/HIVE-13072
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Philipp Brandl
>Assignee: Yongzhi Chen
>
> When used on tables with more than 25000 rows, ROW_NUMBER() duplicates rows 
> with separate row numbers.
> Reproduce by querying a large table with more than 25000 distinct rows using 
> ROW_NUMBER(). The result then contains the same distinct values twice, with 
> row numbers 25000 apart.
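For reference, the expected semantics (a pure-Python illustration, not Hive code): ROW_NUMBER() over an ordered set assigns each row exactly one number from 1 to N, so a correct result never contains the same distinct value under two different numbers.

```python
# Pure-Python illustration of correct ROW_NUMBER() semantics; the bug above
# shows up as a distinct value appearing twice, with numbers 25000 apart.

def row_number(rows, key):
    """Assign 1..N in sort order, exactly one number per input row."""
    return [(i + 1, r) for i, r in enumerate(sorted(rows, key=key))]

rows = ["val%05d" % i for i in range(30000)]      # > 25000 distinct rows
numbered = row_number(rows, key=lambda r: r)

assert len(numbered) == len(rows)                 # one output row per input row
values = [r for _, r in numbered]
assert len(values) == len(set(values))            # no distinct value duplicated
```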





[jira] [Commented] (HIVE-13444) LLAP: add HMAC signatures to LLAP; verify them on LLAP side

2016-05-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298741#comment-15298741
 ] 

Siddharth Seth commented on HIVE-13444:
---

+1, after addressing two minor comments.

> LLAP: add HMAC signatures to LLAP; verify them on LLAP side
> ---
>
> Key: HIVE-13444
> URL: https://issues.apache.org/jira/browse/HIVE-13444
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13444.01.patch, HIVE-13444.02.patch, 
> HIVE-13444.03.patch, HIVE-13444.WIP.patch, HIVE-13444.patch
>
>






[jira] [Commented] (HIVE-13444) LLAP: add HMAC signatures to LLAP; verify them on LLAP side

2016-05-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298740#comment-15298740
 ] 

Siddharth Seth commented on HIVE-13444:
---

+1, after addressing two minor comments.

> LLAP: add HMAC signatures to LLAP; verify them on LLAP side
> ---
>
> Key: HIVE-13444
> URL: https://issues.apache.org/jira/browse/HIVE-13444
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13444.01.patch, HIVE-13444.02.patch, 
> HIVE-13444.03.patch, HIVE-13444.WIP.patch, HIVE-13444.patch
>
>






[jira] [Commented] (HIVE-13797) Provide a connection string example in beeline

2016-05-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298720#comment-15298720
 ] 

Sergio Peña commented on HIVE-13797:


Thanks [~vihangk1]
The patch looks simple
+1
I'll wait for tests before doing the commit.

> Provide a connection string example in beeline
> --
>
> Key: HIVE-13797
> URL: https://issues.apache.org/jira/browse/HIVE-13797
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-13797.01.patch, HIVE-13797.02.patch, 
> HIVE-13797.04.patch
>
>
> It would save a bunch of googling if we could provide some examples of 
> connection strings directly in the beeline help message.
> Eg:
> {code}
> ./bin/beeline --help
> Usage: java org.apache.hive.cli.beeline.BeeLine 
>-uthe JDBC URL to connect to
>-r  reconnect to last saved connect url (in 
> conjunction with !save)
>-nthe username to connect as
>-pthe password to connect as
>-dthe driver class to use
>-i   script file for initialization
>-e   query that should be executed
>-f   script file that should be executed
>-w (or) --password-file   the password file to read 
> password from
>--hiveconf property=value   Use value for given property
>--hivevar name=valuehive variable name and value
>These are Hive-specific settings in which 
> variables
>can be set at session level and referenced 
> in Hive
>commands or queries.
>--color=[true/false]control whether color is used for display
>--showHeader=[true/false]   show column names in query results
>--headerInterval=ROWS;  the interval between which headers are 
> displayed
>--fastConnect=[true/false]  skip building table/column list for 
> tab-completion
>--autoCommit=[true/false]   enable/disable automatic transaction commit
>--verbose=[true/false]  show verbose error messages and debug info
>--showWarnings=[true/false] display connection warnings
>--showNestedErrs=[true/false]   display nested errors
>--numberFormat=[pattern]format numbers using DecimalFormat pattern
>--force=[true/false]continue running script even after errors
>--maxWidth=MAXWIDTH the maximum width of the terminal
>--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
> columns
>--silent=[true/false]   be more silent
>--autosave=[true/false] automatically save preferences
>--outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for 
> result display
>Note that csv, and tsv are deprecated - 
> use csv2, tsv2 instead
>--incremental=[true/false]  Defaults to false. When set to false, the 
> entire result set
>is fetched and buffered before being 
> displayed, yielding optimal
>display column sizing. When set to true, 
> result rows are displayed
>immediately as they are fetched, yielding 
> lower latency and
>memory usage at the price of extra display 
> column padding.
>Setting --incremental=true is recommended 
> if you encounter an OutOfMemory
>on the client side (due to the fetched 
> result set size being large).
>--truncateTable=[true/false]truncate table column when it exceeds 
> length
>--delimiterForDSV=DELIMITER specify the delimiter for 
> delimiter-separated values output format (default: |)
>--isolation=LEVEL   set the transaction isolation level
>--nullemptystring=[true/false]  set to true to get historic behavior of 
> printing null as empty string
>--addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline 
> client side
>--addlocaldrivername=DRIVERNAME Add driver name needs to be supported in 
> the beeline client side
>--showConnectedUrl=[true/false] Prompt HiveServer2's URI to which this 
> beeline connected.
>Only works for HiveServer2 cluster mode.
>--help  display this message
>  
>Example:
> 1. beeline -u jdbc:hive2://localhost:1 username password
> 2. beeline -n username -p password -u jdbc:hive2://hs2.local:10012
> {code}




[jira] [Commented] (HIVE-13797) Provide a connection string example in beeline

2016-05-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298719#comment-15298719
 ] 

Vihang Karajgaonkar commented on HIVE-13797:


The -r option was missing; adding it back.
{noformat}
./beeline --help
Usage: java org.apache.hive.cli.beeline.BeeLine 
   -uthe JDBC URL to connect to
   -r  reconnect to last saved connect url (in 
conjunction with !save)
   -nthe username to connect as
   -pthe password to connect as
   -dthe driver class to use
   -i   script file for initialization
   -e   query that should be executed
   -f   script file that should be executed
   -w (or) --password-file   the password file to read password 
from
   --hiveconf property=value   Use value for given property
   --hivevar name=valuehive variable name and value
   These are Hive-specific settings in which 
variables
   can be set at session level and referenced 
in Hive
   commands or queries.
   --color=[true/false]control whether color is used for display
   --showHeader=[true/false]   show column names in query results
   --headerInterval=ROWS;  the interval between which headers are 
displayed
   --fastConnect=[true/false]  skip building table/column list for 
tab-completion
   --autoCommit=[true/false]   enable/disable automatic transaction commit
   --verbose=[true/false]  show verbose error messages and debug info
   --showWarnings=[true/false] display connection warnings
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]format numbers using DecimalFormat pattern
   --force=[true/false]continue running script even after errors
   --maxWidth=MAXWIDTH the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
columns
   --silent=[true/false]   be more silent
   --autosave=[true/false] automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for 
result display
   Note that csv, and tsv are deprecated - use 
csv2, tsv2 instead
   --incremental=[true/false]  Defaults to false. When set to false, the 
entire result set
   is fetched and buffered before being 
displayed, yielding optimal
   display column sizing. When set to true, 
result rows are displayed
   immediately as they are fetched, yielding 
lower latency and
   memory usage at the price of extra display 
column padding.
   Setting --incremental=true is recommended if 
you encounter an OutOfMemory
   on the client side (due to the fetched 
result set size being large).
   --truncateTable=[true/false]truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER specify the delimiter for 
delimiter-separated values output format (default: |)
   --isolation=LEVEL   set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of 
printing null as empty string
   --addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline client 
side
   --addlocaldrivername=DRIVERNAME Add driver name needs to be supported in the 
beeline client side
   --showConnectedUrl=[true/false] Prompt HiveServer2's URI to which this 
beeline connected.
   Only works for HiveServer2 cluster mode.
   --help  display this message
 
   Example:
1. Connect using simple authentication to HiveServer2 on localhost:1
$ beeline -u jdbc:hive2://localhost:1 username password

2. Connect using simple authentication to HiveServer2 on hs.local:1 
using -n for username and -p for password
$ beeline -n username -p password -u jdbc:hive2://hs2.local:10012

3. Connect using Kerberos authentication with hive/localh...@mydomain.com 
as HiveServer2 principal
$ beeline -u 
"jdbc:hive2://hs2.local:10013/default;principal=hive/localh...@mydomain.com

4. Connect using SSL connection to HiveServer2 on localhost at 1
$ beeline 
jdbc:hive2://localhost:1/default;ssl=true;sslTrustStore=/usr/local/truststore;trustStorePassword=mytruststorepassword

5. Connect using LDAP authentication
$ beeline -u jdbc:hive2://hs2.local:10013/default  

 
{noformat}
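The connection-string variants above can be assembled mechanically. A hypothetical helper (not part of Beeline; the principal and truststore values below are placeholders) makes the URL structure explicit: host:port/db plus semicolon-separated session variables such as principal or ssl.

```python
# Hypothetical helper (not part of Beeline) showing the structure of the
# HiveServer2 JDBC URLs in the examples above.

def hive2_url(host, port, db="default", **session_vars):
    """Build jdbc:hive2://host:port/db[;key=value;...]."""
    base = "jdbc:hive2://%s:%d/%s" % (host, port, db)
    extras = ";".join("%s=%s" % (k, v) for k, v in session_vars.items())
    return base + (";" + extras if extras else "")

# Plain, Kerberos, and SSL variants (placeholder principal/truststore values).
print(hive2_url("hs2.local", 10012))
print(hive2_url("hs2.local", 10013, principal="hive/hs2.local@EXAMPLE.COM"))
print(hive2_url("hs2.local", 10013, ssl="true",
                sslTrustStore="/usr/local/truststore",
                trustStorePassword="mytruststorepassword"))
```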

> Provide a connection string example in beeline
> --
>
> Key: HIVE-13797
> URL: 

[jira] [Updated] (HIVE-13797) Provide a connection string example in beeline

2016-05-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-13797:
---
Attachment: HIVE-13797.04.patch

> Provide a connection string example in beeline
> --
>
> Key: HIVE-13797
> URL: https://issues.apache.org/jira/browse/HIVE-13797
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-13797.01.patch, HIVE-13797.02.patch, 
> HIVE-13797.04.patch
>
>
> It would save a bunch of googling if we could provide some examples of 
> connection strings directly in the beeline help message.
> Eg:
> {code}
> ./bin/beeline --help
> Usage: java org.apache.hive.cli.beeline.BeeLine 
>-uthe JDBC URL to connect to
>-r  reconnect to last saved connect url (in 
> conjunction with !save)
>-nthe username to connect as
>-pthe password to connect as
>-dthe driver class to use
>-i   script file for initialization
>-e   query that should be executed
>-f   script file that should be executed
>-w (or) --password-file   the password file to read 
> password from
>--hiveconf property=value   Use value for given property
>--hivevar name=valuehive variable name and value
>These are Hive-specific settings in which 
> variables
>can be set at session level and referenced 
> in Hive
>commands or queries.
>--color=[true/false]control whether color is used for display
>--showHeader=[true/false]   show column names in query results
>--headerInterval=ROWS;  the interval between which headers are 
> displayed
>--fastConnect=[true/false]  skip building table/column list for 
> tab-completion
>--autoCommit=[true/false]   enable/disable automatic transaction commit
>--verbose=[true/false]  show verbose error messages and debug info
>--showWarnings=[true/false] display connection warnings
>--showNestedErrs=[true/false]   display nested errors
>--numberFormat=[pattern]format numbers using DecimalFormat pattern
>--force=[true/false]continue running script even after errors
>--maxWidth=MAXWIDTH the maximum width of the terminal
>--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
> columns
>--silent=[true/false]   be more silent
>--autosave=[true/false] automatically save preferences
>--outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for 
> result display
>Note that csv, and tsv are deprecated - 
> use csv2, tsv2 instead
>--incremental=[true/false]  Defaults to false. When set to false, the 
> entire result set
>is fetched and buffered before being 
> displayed, yielding optimal
>display column sizing. When set to true, 
> result rows are displayed
>immediately as they are fetched, yielding 
> lower latency and
>memory usage at the price of extra display 
> column padding.
>Setting --incremental=true is recommended 
> if you encounter an OutOfMemory
>on the client side (due to the fetched 
> result set size being large).
>--truncateTable=[true/false]truncate table column when it exceeds 
> length
>--delimiterForDSV=DELIMITER specify the delimiter for 
> delimiter-separated values output format (default: |)
>--isolation=LEVEL   set the transaction isolation level
>--nullemptystring=[true/false]  set to true to get historic behavior of 
> printing null as empty string
>--addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline 
> client side
>--addlocaldrivername=DRIVERNAME Add driver name needs to be supported in 
> the beeline client side
>--showConnectedUrl=[true/false] Prompt HiveServer2's URI to which this 
> beeline connected.
>Only works for HiveServer2 cluster mode.
>--help  display this message
>  
>Example:
> 1. beeline -u jdbc:hive2://localhost:1 username password
> 2. beeline -n username -p password -u jdbc:hive2://hs2.local:10012
> {code}





[jira] [Updated] (HIVE-13797) Provide a connection string example in beeline

2016-05-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-13797:
---
Attachment: (was: HIVE-13797.03.patch)

> Provide a connection string example in beeline
> --
>
> Key: HIVE-13797
> URL: https://issues.apache.org/jira/browse/HIVE-13797
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-13797.01.patch, HIVE-13797.02.patch
>
>
> It would save a bunch of googling if we could provide some examples of 
> connection strings directly in the beeline help message.
> Eg:
> {code}
> ./bin/beeline --help
> Usage: java org.apache.hive.cli.beeline.BeeLine 
>-uthe JDBC URL to connect to
>-r  reconnect to last saved connect url (in 
> conjunction with !save)
>-nthe username to connect as
>-pthe password to connect as
>-dthe driver class to use
>-i   script file for initialization
>-e   query that should be executed
>-f   script file that should be executed
>-w (or) --password-file   the password file to read 
> password from
>--hiveconf property=value   Use value for given property
>--hivevar name=valuehive variable name and value
>These are Hive-specific settings in which 
> variables
>can be set at session level and referenced 
> in Hive
>commands or queries.
>--color=[true/false]control whether color is used for display
>--showHeader=[true/false]   show column names in query results
>--headerInterval=ROWS;  the interval between which headers are 
> displayed
>--fastConnect=[true/false]  skip building table/column list for 
> tab-completion
>--autoCommit=[true/false]   enable/disable automatic transaction commit
>--verbose=[true/false]  show verbose error messages and debug info
>--showWarnings=[true/false] display connection warnings
>--showNestedErrs=[true/false]   display nested errors
>--numberFormat=[pattern]format numbers using DecimalFormat pattern
>--force=[true/false]continue running script even after errors
>--maxWidth=MAXWIDTH the maximum width of the terminal
>--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
> columns
>--silent=[true/false]   be more silent
>--autosave=[true/false] automatically save preferences
>--outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for 
> result display
>Note that csv, and tsv are deprecated - 
> use csv2, tsv2 instead
>--incremental=[true/false]  Defaults to false. When set to false, the 
> entire result set
>is fetched and buffered before being 
> displayed, yielding optimal
>display column sizing. When set to true, 
> result rows are displayed
>immediately as they are fetched, yielding 
> lower latency and
>memory usage at the price of extra display 
> column padding.
>Setting --incremental=true is recommended 
> if you encounter an OutOfMemory
>on the client side (due to the fetched 
> result set size being large).
>--truncateTable=[true/false]truncate table column when it exceeds 
> length
>--delimiterForDSV=DELIMITER specify the delimiter for 
> delimiter-separated values output format (default: |)
>--isolation=LEVEL   set the transaction isolation level
>--nullemptystring=[true/false]  set to true to get historic behavior of 
> printing null as empty string
>--addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline 
> client side
>--addlocaldrivername=DRIVERNAME Add driver name needs to be supported in 
> the beeline client side
>--showConnectedUrl=[true/false] Prompt HiveServer2's URI to which this 
> beeline connected.
>Only works for HiveServer2 cluster mode.
>--help  display this message
>  
>Example:
> 1. beeline -u jdbc:hive2://localhost:1 username password
> 2. beeline -n username -p password -u jdbc:hive2://hs2.local:10012
> {code}





[jira] [Commented] (HIVE-13453) Support ORDER BY and windowing clause in partitioning clause with distinct function

2016-05-24 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298699#comment-15298699
 ] 

Aihua Xu commented on HIVE-13453:
-

[~ctang.ma], [~ychena], [~szehon] and [~mohitsabharwal] Can you guys review the 
code? The test failures don't seem to be related. 

> Support ORDER BY and windowing clause in partitioning clause with distinct 
> function
> ---
>
> Key: HIVE-13453
> URL: https://issues.apache.org/jira/browse/HIVE-13453
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13453.1.patch, HIVE-13453.2.patch, 
> HIVE-13453.3.patch, HIVE-13453.4.patch
>
>
> Currently, the distinct function on partitioning doesn't support the ORDER BY 
> and windowing clauses, for performance reasons. Explore an efficient way to 
> support them.





[jira] [Commented] (HIVE-13797) Provide a connection string example in beeline

2016-05-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298689#comment-15298689
 ] 

Vihang Karajgaonkar commented on HIVE-13797:


Adding back the new lines after each example, which were removed by mistake.

{noformat}
./beeline --help
Usage: java org.apache.hive.cli.beeline.BeeLine 
   -uthe JDBC URL to connect to
   -nthe username to connect as
   -pthe password to connect as
   -dthe driver class to use
   -i   script file for initialization
   -e   query that should be executed
   -f   script file that should be executed
   -w (or) --password-file   the password file to read password 
from
   --hiveconf property=value   Use value for given property
   --hivevar name=valuehive variable name and value
   These are Hive-specific settings in which 
variables
   can be set at session level and referenced 
in Hive
   commands or queries.
   --color=[true/false]control whether color is used for display
   --showHeader=[true/false]   show column names in query results
   --headerInterval=ROWS;  the interval between which headers are 
displayed
   --fastConnect=[true/false]  skip building table/column list for 
tab-completion
   --autoCommit=[true/false]   enable/disable automatic transaction commit
   --verbose=[true/false]  show verbose error messages and debug info
   --showWarnings=[true/false] display connection warnings
   --showNestedErrs=[true/false]   display nested errors
   --numberFormat=[pattern]format numbers using DecimalFormat pattern
   --force=[true/false]continue running script even after errors
   --maxWidth=MAXWIDTH the maximum width of the terminal
   --maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
columns
   --silent=[true/false]   be more silent
   --autosave=[true/false] automatically save preferences
   --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for 
result display
   Note that csv, and tsv are deprecated - use 
csv2, tsv2 instead
   --incremental=[true/false]  Defaults to false. When set to false, the 
entire result set
   is fetched and buffered before being 
displayed, yielding optimal
   display column sizing. When set to true, 
result rows are displayed
   immediately as they are fetched, yielding 
lower latency and
   memory usage at the price of extra display 
column padding.
   Setting --incremental=true is recommended if 
you encounter an OutOfMemory
   on the client side (due to the fetched 
result set size being large).
   --truncateTable=[true/false]truncate table column when it exceeds length
   --delimiterForDSV=DELIMITER specify the delimiter for 
delimiter-separated values output format (default: |)
   --isolation=LEVEL   set the transaction isolation level
   --nullemptystring=[true/false]  set to true to get historic behavior of 
printing null as empty string
   --addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline client 
side
   --addlocaldrivername=DRIVERNAME Add driver name needs to be supported in the 
beeline client side
   --showConnectedUrl=[true/false] Prompt HiveServer2's URI to which this 
beeline connected.
   Only works for HiveServer2 cluster mode.
   --help  display this message
 
   Example:
1. Connect using simple authentication to HiveServer2 on localhost:1
$ beeline -u jdbc:hive2://localhost:1 username password

2. Connect using simple authentication to HiveServer2 on hs.local:1 
using -n for username and -p for password
$ beeline -n username -p password -u jdbc:hive2://hs2.local:10012

3. Connect using Kerberos authentication with hive/localh...@mydomain.com 
as HiveServer2 principal
$ beeline -u 
"jdbc:hive2://hs2.local:10013/default;principal=hive/localh...@mydomain.com"

4. Connect using SSL connection to HiveServer2 on localhost at 1
$ beeline 
jdbc:hive2://localhost:1/default;ssl=true;sslTrustStore=/usr/local/truststore;trustStorePassword=mytruststorepassword

5. Connect using LDAP authentication
$ beeline -u jdbc:hive2://hs2.local:10013/default  

 
{noformat}

> Provide a connection string example in beeline
> --
>
> Key: HIVE-13797
> URL: https://issues.apache.org/jira/browse/HIVE-13797
> Project: Hive
>

[jira] [Updated] (HIVE-13797) Provide a connection string example in beeline

2016-05-24 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-13797:
---
Attachment: HIVE-13797.03.patch

Patch which adds new lines after each connection example.

> Provide a connection string example in beeline
> --
>
> Key: HIVE-13797
> URL: https://issues.apache.org/jira/browse/HIVE-13797
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-13797.01.patch, HIVE-13797.02.patch, 
> HIVE-13797.03.patch
>
>
> It would save a bunch of googling if we could provide some examples of 
> connection strings directly in the beeline help message.
> Eg:
> {code}
> ./bin/beeline --help
> Usage: java org.apache.hive.cli.beeline.BeeLine 
>-u <database url>   the JDBC URL to connect to
>-r  reconnect to last saved connect url (in 
> conjunction with !save)
>-n <username>the username to connect as
>-p <password>the password to connect as
>-d <driver class>the driver class to use
>-i <init file>   script file for initialization
>-e <query>   query that should be executed
>-f <exec file>   script file that should be executed
>-w (or) --password-file <password file>   the password file to read 
> password from
>--hiveconf property=value   Use value for given property
>--hivevar name=valuehive variable name and value
>This is Hive specific settings in which 
> variables
>can be set at session level and referenced 
> in Hive
>commands or queries.
>--color=[true/false]control whether color is used for display
>--showHeader=[true/false]   show column names in query results
>--headerInterval=ROWS;  the interval between which headers are 
> displayed
>--fastConnect=[true/false]  skip building table/column list for 
> tab-completion
>--autoCommit=[true/false]   enable/disable automatic transaction commit
>--verbose=[true/false]  show verbose error messages and debug info
>--showWarnings=[true/false] display connection warnings
>--showNestedErrs=[true/false]   display nested errors
>--numberFormat=[pattern]format numbers using DecimalFormat pattern
>--force=[true/false]continue running script even after errors
>--maxWidth=MAXWIDTH the maximum width of the terminal
>--maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying 
> columns
>--silent=[true/false]   be more silent
>--autosave=[true/false] automatically save preferences
>--outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for 
> result display
>Note that csv, and tsv are deprecated - 
> use csv2, tsv2 instead
>--incremental=[true/false]  Defaults to false. When set to false, the 
> entire result set
>is fetched and buffered before being 
> displayed, yielding optimal
>display column sizing. When set to true, 
> result rows are displayed
>immediately as they are fetched, yielding 
> lower latency and
>memory usage at the price of extra display 
> column padding.
>Setting --incremental=true is recommended 
> if you encounter an OutOfMemory
>on the client side (due to the fetched 
> result set size being large).
>--truncateTable=[true/false]truncate table column when it exceeds 
> length
>--delimiterForDSV=DELIMITER specify the delimiter for 
> delimiter-separated values output format (default: |)
>--isolation=LEVEL   set the transaction isolation level
>--nullemptystring=[true/false]  set to true to get historic behavior of 
> printing null as empty string
>--addlocaldriverjar=DRIVERJARNAME Add driver jar file in the beeline 
> client side
>--addlocaldrivername=DRIVERNAME Add driver name that needs to be supported in 
> the beeline client side
>--showConnectedUrl=[true/false] Prompt HiveServer2's URI to which this 
> beeline is connected.
>Only works for HiveServer2 cluster mode.
>--help  display this message
>  
>Example:
> 1. beeline -u jdbc:hive2://localhost:1 username password
> 2. beeline -n username -p password -u jdbc:hive2://hs2.local:10012
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-12467) Add number of dynamic partitions to error message

2016-05-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298685#comment-15298685
 ] 

Prasanth Jayachandran commented on HIVE-12467:
--

[~lars_francke] This mostly looks good. I will run the llap_partitioned test 
case alone to see if it is related. If not, I will go ahead and commit it shortly.

> Add number of dynamic partitions to error message
> -
>
> Key: HIVE-12467
> URL: https://issues.apache.org/jira/browse/HIVE-12467
> Project: Hive
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Lars Francke
>Priority: Minor
> Attachments: HIVE-12467.2.patch, HIVE-12467.patch
>
>
> Currently when using dynamic partition insert we get an error message saying 
> that the client tried to create too many dynamic partitions ("Maximum was set 
> to"). I'll extend the error message to specify the number of dynamic 
> partitions which can be helpful for debugging.





[jira] [Commented] (HIVE-13773) Stats state is not captured correctly in dynpart_sort_optimization_acid.q

2016-05-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298676#comment-15298676
 ] 

Ashutosh Chauhan commented on HIVE-13773:
-

I agree with you in general. But in this particular test case, the stats are 
wrong even when the query is run in isolation.

> Stats state is not captured correctly in dynpart_sort_optimization_acid.q
> -
>
> Key: HIVE-13773
> URL: https://issues.apache.org/jira/browse/HIVE-13773
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13773.01.patch, t.q, t.q.out, t.q.out.right
>
>






[jira] [Commented] (HIVE-13453) Support ORDER BY and windowing clause in partitioning clause with distinct function

2016-05-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298671#comment-15298671
 ] 

Hive QA commented on HIVE-13453:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12805655/HIVE-13453.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 106 failed/errored test(s), 10030 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_join1.q-schema_evol_text_vec_mapwork_part_all_complex.q-vector_complex_join.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_sortmerge_join_16.q-skewjoin.q-vectorization_div0.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-dynpart_sort_optimization2.q-tez_dynpart_hashjoin_3.q-orc_vectorization_ppd.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-groupby2.q-tez_dynpart_hashjoin_1.q-custom_input_output_format.q-and-12-more
 - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_orig_table.q-union2.q-bucket4.q-and-12-more - did 
not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_coalesce.q-cbo_windowing.q-tez_join.q-and-12-more - 
did not produce a TEST-*.xml file
TestSparkCliDriver-groupby10.q-groupby4_noskew.q-union5.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkClient - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_no_match
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_into1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_values_dynamic_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge_incompat2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_part
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_short_regress
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_date_funcs
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_timestamp_funcs
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join22
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_count
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_decimal_1_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby1_map_nomap
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_innerjoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input1_limit
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_insert_into2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join16
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_alt_syntax
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_casesensitive
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_unqual3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_vc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part7

[jira] [Commented] (HIVE-13832) Add missing license header to files

2016-05-24 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298662#comment-15298662
 ] 

Vikram Dixit K commented on HIVE-13832:
---

+1

> Add missing license header to files
> ---
>
> Key: HIVE-13832
> URL: https://issues.apache.org/jira/browse/HIVE-13832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Vikram Dixit K
> Attachments: HIVE-13832.1.patch, HIVE-13832.2.patch, HIVE-13832.patch
>
>
> Preparing to cut the branch for 2.1.0.





[jira] [Commented] (HIVE-13832) Add missing license header to files

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298661#comment-15298661
 ] 

Jesus Camacho Rodriguez commented on HIVE-13832:


Exactly, that is what I thought. I uploaded the new patch, could you check? 
Thanks!

> Add missing license header to files
> ---
>
> Key: HIVE-13832
> URL: https://issues.apache.org/jira/browse/HIVE-13832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Vikram Dixit K
> Attachments: HIVE-13832.1.patch, HIVE-13832.2.patch, HIVE-13832.patch
>
>
> Preparing to cut the branch for 2.1.0.





[jira] [Updated] (HIVE-13832) Add missing license header to files

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13832:
---
Attachment: HIVE-13832.2.patch

> Add missing license header to files
> ---
>
> Key: HIVE-13832
> URL: https://issues.apache.org/jira/browse/HIVE-13832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Vikram Dixit K
> Attachments: HIVE-13832.1.patch, HIVE-13832.2.patch, HIVE-13832.patch
>
>
> Preparing to cut the branch for 2.1.0.





[jira] [Commented] (HIVE-13832) Add missing license header to files

2016-05-24 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298644#comment-15298644
 ] 

Vikram Dixit K commented on HIVE-13832:
---

I think in that case, a change in bin.xml is missing in the original patch. 
That would work too!

> Add missing license header to files
> ---
>
> Key: HIVE-13832
> URL: https://issues.apache.org/jira/browse/HIVE-13832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Vikram Dixit K
> Attachments: HIVE-13832.1.patch, HIVE-13832.patch
>
>
> Preparing to cut the branch for 2.1.0.





[jira] [Commented] (HIVE-13832) Add missing license header to files

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298642#comment-15298642
 ] 

Jesus Camacho Rodriguez commented on HIVE-13832:


The output folder in {{packaging/src/main/assembly/bin.xml}} would still be the 
same, so that should not impact Ambari, right? It is not such a big deal to add 
a single file to the rat exclusions, but it would be a bit more robust to use 
the same convention for the whole project, i.e. storing within the _scripts_ 
folder. That would also avoid additional changes in the future.

> Add missing license header to files
> ---
>
> Key: HIVE-13832
> URL: https://issues.apache.org/jira/browse/HIVE-13832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Vikram Dixit K
> Attachments: HIVE-13832.1.patch, HIVE-13832.patch
>
>
> Preparing to cut the branch for 2.1.0.





[jira] [Commented] (HIVE-13773) Stats state is not captured correctly in dynpart_sort_optimization_acid.q

2016-05-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298628#comment-15298628
 ] 

Eugene Koifman commented on HIVE-13773:
---

I don't see how queries can be answered from stats for Acid tables.  Acid 
tables are versioned and stats, afaik, are not.
So unless the query is asking for approximate info, I would not rely on stats.

> Stats state is not captured correctly in dynpart_sort_optimization_acid.q
> -
>
> Key: HIVE-13773
> URL: https://issues.apache.org/jira/browse/HIVE-13773
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13773.01.patch, t.q, t.q.out, t.q.out.right
>
>






[jira] [Updated] (HIVE-13832) Add missing license header to files

2016-05-24 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-13832:
--
Attachment: HIVE-13832.1.patch

Unfortunately, the Ambari project depends on a specific location of this script. 
I have added a specific exclude for the file. If, in the future, we end up 
having more such SQL scripts, we can exclude that directory.

> Add missing license header to files
> ---
>
> Key: HIVE-13832
> URL: https://issues.apache.org/jira/browse/HIVE-13832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Vikram Dixit K
> Attachments: HIVE-13832.1.patch, HIVE-13832.patch
>
>
> Preparing to cut the branch for 2.1.0.





[jira] [Updated] (HIVE-13832) Add missing license header to files

2016-05-24 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-13832:
--
Assignee: Vikram Dixit K  (was: Jesus Camacho Rodriguez)

> Add missing license header to files
> ---
>
> Key: HIVE-13832
> URL: https://issues.apache.org/jira/browse/HIVE-13832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Vikram Dixit K
> Attachments: HIVE-13832.patch
>
>
> Preparing to cut the branch for 2.1.0.





[jira] [Updated] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2016-05-24 Thread Austin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Lee updated HIVE-12679:
--
Attachment: HIVE-12679.2.patch

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Affects Versions: 2.1.0
>Reporter: Austin Lee
>Assignee: Austin Lee
>Priority: Minor
>  Labels: metastore
> Attachments: HIVE-12679.1.patch, HIVE-12679.2.patch, HIVE-12679.patch
>
>
> Hi,
> I would like to propose a change that would make it possible for users to 
> choose an implementation of IMetaStoreClient via HiveConf, i.e. 
> hive-site.xml.  Currently, in Hive the choice is hard coded to be 
> SessionHiveMetaStoreClient in org.apache.hadoop.hive.ql.metadata.Hive.  There 
> is no other direct reference to SessionHiveMetaStoreClient other than the 
> hard coded class name in Hive.java and the QL component operates only on the 
> IMetaStoreClient interface so the change would be minimal and it would be 
> quite similar to how an implementation of RawStore is specified and loaded in 
> hive-metastore.  One use case this change would serve is a user who wishes 
> to use an implementation of this interface without the dependency on the 
> Thrift server.
>   
> Thank you,
> Austin
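The reflection-based loading the proposal compares to RawStore can be sketched as follows. This is a hypothetical stand-alone illustration, not Hive's actual API: `MetaClient`, the implementation classes, and the `metastore.client.impl` property key are all invented names for the sketch.

```java
import java.util.Properties;

public class PluggableClientDemo {
    public interface MetaClient { String describe(); }

    public static class DefaultClient implements MetaClient {
        public String describe() { return "default"; }
    }

    public static class ThriftFreeClient implements MetaClient {
        public String describe() { return "thrift-free"; }
    }

    // Resolve the implementation class name from configuration, falling back
    // to the hard-coded default when no override is set, then instantiate it
    // reflectively -- the same pattern used for RawStore in hive-metastore.
    static MetaClient load(Properties conf) throws Exception {
        String cls = conf.getProperty("metastore.client.impl",
                DefaultClient.class.getName());
        return (MetaClient) Class.forName(cls)
                .getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        conf.setProperty("metastore.client.impl",
                ThriftFreeClient.class.getName());
        System.out.println(load(conf).describe());             // configured impl
        System.out.println(load(new Properties()).describe()); // default impl
    }
}
```

Because the caller only ever sees the interface, swapping implementations requires no code change, just a different value in the configuration file.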





[jira] [Updated] (HIVE-12679) Allow users to be able to specify an implementation of IMetaStoreClient via HiveConf

2016-05-24 Thread Austin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Lee updated HIVE-12679:
--
Status: In Progress  (was: Patch Available)

> Allow users to be able to specify an implementation of IMetaStoreClient via 
> HiveConf
> 
>
> Key: HIVE-12679
> URL: https://issues.apache.org/jira/browse/HIVE-12679
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Metastore, Query Planning
>Affects Versions: 2.1.0
>Reporter: Austin Lee
>Assignee: Austin Lee
>Priority: Minor
>  Labels: metastore
> Attachments: HIVE-12679.1.patch, HIVE-12679.patch
>
>
> Hi,
> I would like to propose a change that would make it possible for users to 
> choose an implementation of IMetaStoreClient via HiveConf, i.e. 
> hive-site.xml.  Currently, in Hive the choice is hard coded to be 
> SessionHiveMetaStoreClient in org.apache.hadoop.hive.ql.metadata.Hive.  There 
> is no other direct reference to SessionHiveMetaStoreClient other than the 
> hard coded class name in Hive.java and the QL component operates only on the 
> IMetaStoreClient interface so the change would be minimal and it would be 
> quite similar to how an implementation of RawStore is specified and loaded in 
> hive-metastore.  One use case this change would serve is a user who wishes 
> to use an implementation of this interface without the dependency on the 
> Thrift server.
>   
> Thank you,
> Austin





[jira] [Commented] (HIVE-13828) Enable hive.orc.splits.include.file.footer by default

2016-05-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298489#comment-15298489
 ] 

Sergey Shelukhin commented on HIVE-13828:
-

{quote}
 that ends up in the recovery information for Tez.
{quote}
1) Does it have to? 
2) Can we just use the binary form like the HBase metastore cache? The paths in 
LocalCache and RemoteCache should be relatively easy to reuse/reconcile.

> Enable hive.orc.splits.include.file.footer by default
> -
>
> Key: HIVE-13828
> URL: https://issues.apache.org/jira/browse/HIVE-13828
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> As a part of setting up the OrcInputFormat.getRecordReader on the task side, 
> Hive ends up opening the file path and reading the metadata information. If 
> hive.orc.splits.include.file.footer=true, this metadata info can be passed on 
> to the task side, which can help reduce the overhead.  It would be good to 
> consider enabling this parameter by default.





[jira] [Commented] (HIVE-13832) Add missing license header to files

2016-05-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298473#comment-15298473
 ] 

Sergey Shelukhin commented on HIVE-13832:
-

[~vikram.dixit] this was a part of HIVE-13438, is it ok to move it? Is there 
some test? Maybe there could just be a license header added.
The rest looks ok

> Add missing license header to files
> ---
>
> Key: HIVE-13832
> URL: https://issues.apache.org/jira/browse/HIVE-13832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13832.patch
>
>
> Preparing to cut the branch for 2.1.0.





[jira] [Commented] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298467#comment-15298467
 ] 

Ashutosh Chauhan commented on HIVE-13826:
-

+1 pending tests

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> 
>
> Key: HIVE-13826
> URL: https://issues.apache.org/jira/browse/HIVE-13826
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13826.01.patch
>
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER 
> (i.e. as single item for WHERE).





[jira] [Updated] (HIVE-13826) Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER

2016-05-24 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-13826:

Status: Patch Available  (was: Open)

> Make VectorUDFAdaptor work for GenericUDFBetween when used as FILTER
> 
>
> Key: HIVE-13826
> URL: https://issues.apache.org/jira/browse/HIVE-13826
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-13826.01.patch
>
>
> GenericUDFBetween doesn't vectorize with VectorUDFAdaptor when used as FILTER 
> (i.e. as single item for WHERE).





[jira] [Updated] (HIVE-13825) Using JOIN in 2 tables that have same path locations, but different column names fails with an error exception

2016-05-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-13825:
---
Summary: Using JOIN in 2 tables that have same path locations, but different 
column names fails with an error exception  (was: Map joins with cloned tables 
with same locations, but different column names throw error exceptions)

> Using JOIN in 2 tables that have same path locations, but different column 
> names fails with an error exception
> ---
>
> Key: HIVE-13825
> URL: https://issues.apache.org/jira/browse/HIVE-13825
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergio Peña
>
> The following scenario of 2 tables with same locations cannot be used on a 
> JOIN query:
> {noformat}
> hive> create table t1 (a string, b string) location 
> '/user/hive/warehouse/test1';
> OK
> hive> create table t2 (c string, d string) location 
> '/user/hive/warehouse/test1';
> OK
> hive> select t1.a from t1 join t2 on t1.a = t2.c;
> ...
> 2016-05-23 16:39:57 Starting to launch local task to process map join;
>   maximum memory = 477102080
> Execution failed with exit status: 2
> Obtaining error information
> Task failed!
> Task ID:
>   Stage-4
> Logs:
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> {noformat}
> The logs contain this error exception:
> {noformat}
> 2016-05-23T16:39:58,163 ERROR [main]: mr.MapredLocalTask (:()) - Hive Runtime 
> Error: Map local work failed
> java.lang.RuntimeException: cannot find field a from [0:c, 1:d]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:485)
> at 
> org.apache.hadoop.hive.serde2.BaseStructObjectInspector.getStructFieldRef(BaseStructObjectInspector.java:133)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:55)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:973)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:999)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:75)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:457)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:365)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:457)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:365)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:499)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:403)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:383)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:751)
> {noformat}
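The "cannot find field" failure can be illustrated outside Hive with a minimal, hypothetical sketch: because both tables share one location, the rows get materialized under one table's schema (c, d), and the join's select expression then probes them with the other table's column name (a). Nothing below is Hive code; it only mimics the lookup in getStandardStructFieldRef.

```java
import java.util.Arrays;
import java.util.List;

public class FieldLookupDemo {
    // Mimics resolving a column reference against a struct's field list.
    static int fieldIndex(List<String> schema, String name) {
        int i = schema.indexOf(name);
        if (i < 0) {
            throw new RuntimeException(
                "cannot find field " + name + " from " + schema);
        }
        return i;
    }

    public static void main(String[] args) {
        // Rows were deserialized with t2's schema because t1 and t2 share a path.
        List<String> t2Schema = Arrays.asList("c", "d");
        System.out.println(fieldIndex(t2Schema, "c"));  // t2's own column resolves
        try {
            fieldIndex(t2Schema, "a");                  // t1's column name fails
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The real exception carries positional indices ("[0:c, 1:d]"), but the mechanism is the same: the evaluator asks a struct inspector built from the wrong table's schema for a field it does not contain.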





[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-05-24 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Attachment: HIVE-13149.8.patch

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, 
> HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later. 
> When SessionState is accessed by the tasks in TaskRunner.java, a new HMS 
> connection is currently established for each Task thread, although most of 
> the tasks (other than a few like StatsTask) don't need to access HMS. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, the connections are created but unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}
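The idea behind the fix, deferring an expensive connection until a task actually needs it instead of opening it eagerly in start(), can be sketched as below. This is a hypothetical stand-in, not Hive's SessionState: `Session` and `getHmsClient` are invented names, and a plain Object stands in for the metastore connection.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyConnectionDemo {
    static final AtomicInteger connectionsOpened = new AtomicInteger();

    static class Session {
        private Object hmsClient;  // stands in for an HMS connection

        // Open the connection on first use instead of eagerly at session start.
        Object getHmsClient() {
            if (hmsClient == null) {
                connectionsOpened.incrementAndGet();
                hmsClient = new Object();
            }
            return hmsClient;
        }
    }

    public static void main(String[] args) {
        Session ss = new Session();
        // Ten tasks' worth of work that never touches the metastore:
        for (int i = 0; i < 10; i++) { /* runSequential() analogue */ }
        System.out.println(connectionsOpened.get());  // no connection opened yet
        ss.getHmsClient();                            // first real use
        System.out.println(connectionsOpened.get());  // exactly one connection
    }
}
```

In a real multi-threaded TaskRunner setting, the lazy initializer would of course need synchronization (or a double-checked volatile), which is omitted here for brevity.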





[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-05-24 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Attachment: (was: HIVE-13149.8.patch)

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, 
> HIVE-13149.6.patch, HIVE-13149.7.patch, HIVE-13149.8.patch
>
>
> In the SessionState class, we currently always try to get an HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, 
> regardless of whether the connection will be used later. 
> When SessionState is accessed by the tasks in TaskRunner.java, a new HMS 
> connection is currently established for each Task thread, although most of 
> the tasks (other than a few like StatsTask) don't need to access HMS. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, the connections are created but unused.
> {noformat}
>   @Override
>   public void run() {
>     runner = Thread.currentThread();
>     try {
>       OperationLog.setCurrentOperationLog(operationLog);
>       SessionState.start(ss);
>       runSequential();
> {noformat}
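The fix the description points toward, connecting only when a task actually needs the metastore, can be sketched with a lazy holder. `LazyConnection`, `MetaStoreClient`, and the supplier wiring below are hypothetical stand-ins for illustration, not Hive's actual SessionState API:

```java
import java.util.function.Supplier;

// Hypothetical sketch of deferring the metastore connection until first use;
// MetaStoreClient and the supplier stand in for Hive's real classes.
class LazyConnection {
    interface MetaStoreClient {
        String getDatabase(String name);
    }

    private final Supplier<MetaStoreClient> factory;
    private MetaStoreClient client;  // created only on first access
    private int connectCount = 0;    // instrumentation for the example

    LazyConnection(Supplier<MetaStoreClient> factory) {
        this.factory = factory;
    }

    // Tasks that never call this (most tasks in TaskRunner) never connect.
    synchronized MetaStoreClient get() {
        if (client == null) {
            client = factory.get();
            connectCount++;
        }
        return client;
    }

    synchronized int connections() {
        return connectCount;
    }
}
```

With this shape, SessionState.start() would only store the supplier, and the expensive connection happens at most once, inside the tasks that really use HMS.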





[jira] [Updated] (HIVE-13640) Disable Hive PartitionConditionRemover optimizer when CBO has optimized the plan

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13640:
---
Target Version/s:   (was: 2.1.0)

> Disable Hive PartitionConditionRemover optimizer when CBO has optimized the 
> plan
> 
>
> Key: HIVE-13640
> URL: https://issues.apache.org/jira/browse/HIVE-13640
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13640.wip.patch
>
>
> We should bring PartitionConditionRemover to CBO and disable it in Hive. This 
> should allow us to fold expressions in CBO more tightly.





[jira] [Updated] (HIVE-12637) make retryable SQLExceptions in TxnHandler configurable

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12637:
---
Target Version/s:   (was: 1.3.0, 2.1.0)

> make retryable SQLExceptions in TxnHandler configurable
> ---
>
> Key: HIVE-12637
> URL: https://issues.apache.org/jira/browse/HIVE-12637
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>  Labels: TODOC1.3, TODOC2.1
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12637.1.patch, HIVE-12637.2.patch
>
>
> Same for CompactionTxnHandler.
> It would be convenient if the user could specify a RegEx (perhaps by db type) 
> which tells TxnHandler.checkRetryable() that the exception should be retried.
> The regex should probably apply to the String produced by 
> {noformat}
>   private static String getMessage(SQLException ex) {
>     return ex.getMessage() + "(SQLState=" + ex.getSQLState() + ",ErrorCode=" + ex.getErrorCode() + ")";
>   }
> {noformat}
> This makes it flexible.
> See if we need to add the type (and possibly version) of the DB being used.
> With 5 different DBs supported, this gives control to end users.
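The proposal above, matching a user-configurable regex against the same string `getMessage(SQLException)` builds, might look like the following sketch. The class name and configuration plumbing are hypothetical; only the `getMessage` formatting mirrors the snippet quoted above:

```java
import java.sql.SQLException;
import java.util.regex.Pattern;

// Sketch of the proposed configurable retry check; the regex would come
// from a new configuration property (name not decided in the issue).
class RetryableCheck {
    private final Pattern retryPattern;

    RetryableCheck(String regexFromConf) {
        this.retryPattern = Pattern.compile(regexFromConf);
    }

    // Same formatting as TxnHandler.getMessage(SQLException) quoted above.
    static String getMessage(SQLException ex) {
        return ex.getMessage() + "(SQLState=" + ex.getSQLState()
            + ",ErrorCode=" + ex.getErrorCode() + ")";
    }

    // checkRetryable() would consult this in addition to its built-in rules.
    boolean isRetryable(SQLException ex) {
        return retryPattern.matcher(getMessage(ex)).find();
    }
}
```

Because the SQLState and vendor error code are part of the matched string, a single pattern like `deadlock|SQLState=40001` can cover both message text and DB-specific codes.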





[jira] [Updated] (HIVE-10280) LLAP: Handle errors while sending source state updates to the daemons

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10280:
---
Fix Version/s: 2.1.0

> LLAP: Handle errors while sending source state updates to the daemons
> -
>
> Key: HIVE-10280
> URL: https://issues.apache.org/jira/browse/HIVE-10280
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.1.0
>
> Attachments: HIVE-10280.1.patch
>
>
> This will likely be handled by marking the node as bad. A retry policy may be 
> needed before marking a node bad, though, to handle temporary network glitches.





[jira] [Updated] (HIVE-10155) LLAP: switch to sensible cache policy

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10155:
---
Target Version/s:   (was: 2.1.0)

> LLAP: switch to sensible cache policy
> -
>
> Key: HIVE-10155
> URL: https://issues.apache.org/jira/browse/HIVE-10155
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> FIFO policy is currently the default. We can test the LRFU one, but there were 
> concerns that it won't scale. One option is to implement a two-tier policy: 
> FIFO/LRU for blocks referenced once, and something more complex (LFU, LRFU) 
> for blocks referenced more than once. That should be friendly to large scans 
> of a fact table in terms of both behavior and overhead.
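The two-tier idea can be illustrated with plain JDK building blocks. This is a toy sketch, not the LLAP cache: blocks seen once live in a FIFO tier, and a block referenced a second time is promoted to an LRU tier, so a single large fact-table scan cannot flush the frequently reused blocks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy two-tier cache policy: FIFO for one-shot blocks, LRU for reused ones.
class TwoTierCache<K> {
    private final LinkedHashMap<K, Boolean> fifo;  // tier 1: insertion order
    private final LinkedHashMap<K, Boolean> lru;   // tier 2: access order

    TwoTierCache(final int fifoCap, final int lruCap) {
        fifo = new LinkedHashMap<K, Boolean>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Boolean> eldest) {
                return size() > fifoCap;  // FIFO eviction for one-shot blocks
            }
        };
        lru = new LinkedHashMap<K, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Boolean> eldest) {
                return size() > lruCap;   // LRU eviction for reused blocks
            }
        };
    }

    /** Records a reference; returns true on a hit. FIFO hits are promoted. */
    boolean touch(K block) {
        if (lru.containsKey(block)) {
            lru.get(block);               // refresh LRU recency
            return true;
        }
        if (fifo.remove(block) != null) {
            lru.put(block, Boolean.TRUE); // second reference: promote
            return true;
        }
        fifo.put(block, Boolean.TRUE);    // first reference: tier 1 only
        return false;
    }

    boolean inLruTier(K block) {
        return lru.containsKey(block);
    }
}
```

A scan that touches every block once only churns the FIFO tier; swapping the LRU tier for LFU or LRFU would not change this structure.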





[jira] [Updated] (HIVE-12626) Publish some of the LLAP cache counters via Tez

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12626:
---
Target Version/s:   (was: 2.1.0)

> Publish some of the LLAP cache counters via Tez
> ---
>
> Key: HIVE-12626
> URL: https://issues.apache.org/jira/browse/HIVE-12626
> Project: Hive
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>
> To make them available via the final DAG details.





[jira] [Updated] (HIVE-10280) LLAP: Handle errors while sending source state updates to the daemons

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10280:
---
Target Version/s:   (was: 2.1.0)

> LLAP: Handle errors while sending source state updates to the daemons
> -
>
> Key: HIVE-10280
> URL: https://issues.apache.org/jira/browse/HIVE-10280
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.1.0
>
> Attachments: HIVE-10280.1.patch
>
>
> This will likely be handled by marking the node as bad. A retry policy may be 
> needed before marking a node bad, though, to handle temporary network glitches.





[jira] [Updated] (HIVE-12637) make retryable SQLExceptions in TxnHandler configurable

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12637:
---
Fix Version/s: 2.1.0
   1.3.0

> make retryable SQLExceptions in TxnHandler configurable
> ---
>
> Key: HIVE-12637
> URL: https://issues.apache.org/jira/browse/HIVE-12637
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>  Labels: TODOC1.3, TODOC2.1
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12637.1.patch, HIVE-12637.2.patch
>
>
> Same for CompactionTxnHandler.
> It would be convenient if the user could specify a RegEx (perhaps by db type) 
> which tells TxnHandler.checkRetryable() that the exception should be retried.
> The regex should probably apply to the String produced by 
> {noformat}
>   private static String getMessage(SQLException ex) {
>     return ex.getMessage() + "(SQLState=" + ex.getSQLState() + ",ErrorCode=" + ex.getErrorCode() + ")";
>   }
> {noformat}
> This makes it flexible.
> See if we need to add the type (and possibly version) of the DB being used.
> With 5 different DBs supported, this gives control to end users.





[jira] [Updated] (HIVE-13201) Compaction shouldn't be allowed on non-ACID table

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13201:
---
Target Version/s:   (was: 1.3.0, 2.1.0)

> Compaction shouldn't be allowed on non-ACID table
> -
>
> Key: HIVE-13201
> URL: https://issues.apache.org/jira/browse/HIVE-13201
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13201.1.patch, HIVE-13201.2.patch
>
>
> Looks like compaction is allowed on non-ACID tables, although that makes no 
> sense and does nothing. Moreover, the compaction request will be enqueued into 
> the COMPACTION_QUEUE metastore table, which adds unnecessary overhead.
> We should prevent compaction commands from being allowed on non-ACID tables.





[jira] [Updated] (HIVE-13201) Compaction shouldn't be allowed on non-ACID table

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13201:
---
Fix Version/s: 2.1.0
   1.3.0

> Compaction shouldn't be allowed on non-ACID table
> -
>
> Key: HIVE-13201
> URL: https://issues.apache.org/jira/browse/HIVE-13201
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13201.1.patch, HIVE-13201.2.patch
>
>
> Looks like compaction is allowed on non-ACID tables, although that makes no 
> sense and does nothing. Moreover, the compaction request will be enqueued into 
> the COMPACTION_QUEUE metastore table, which adds unnecessary overhead.
> We should prevent compaction commands from being allowed on non-ACID tables.





[jira] [Updated] (HIVE-12639) Handle exceptions during SARG creation

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12639:
---
Target Version/s:   (was: 2.0.0, 2.1.0)

> Handle exceptions during SARG creation
> --
>
> Key: HIVE-12639
> URL: https://issues.apache.org/jira/browse/HIVE-12639
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Bad predicates can cause SearchArgument creation to throw an exception. For 
> example, filters like where ts = '2014-15-16 17:18:19.20' can throw an 
> IllegalArgumentException during SARG creation because the timestamp has the 
> wrong format (the month is invalid). If SARG creation fails, it should return 
> the YES_NO_NULL TruthValue instead of throwing an exception. 
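The proposed behavior can be sketched as a guard around predicate conversion: on failure, fall back to a truth value that prunes nothing, so the query still runs. `TruthValue` and `SargBuilder` below are simplified stand-ins for the ORC SearchArgument API, not the real classes:

```java
// Sketch: catch SARG-construction failures and return YES_NO_NULL,
// which disables row-group elimination instead of failing the query.
class SargGuard {
    enum TruthValue { YES, NO, NULL, YES_NO_NULL }

    interface SargBuilder {
        TruthValue evaluate(String predicate);
    }

    static TruthValue safeEvaluate(SargBuilder builder, String predicate) {
        try {
            return builder.evaluate(predicate);
        } catch (IllegalArgumentException e) {
            // e.g. ts = '2014-15-16 17:18:19.20' (month 15 is invalid):
            // keep the query running and simply skip row-group elimination.
            return TruthValue.YES_NO_NULL;
        }
    }
}
```

Returning YES_NO_NULL is safe because it means "may match", so no data is skipped; the only cost is losing the pruning optimization for that predicate.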





[jira] [Updated] (HIVE-13245) VectorDeserializeRow throws IndexOutOfBoundsException

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13245:
---
Target Version/s:   (was: 2.1.0)

> VectorDeserializeRow throws IndexOutOfBoundsException
> -
>
> Key: HIVE-13245
> URL: https://issues.apache.org/jira/browse/HIVE-13245
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>
> When running the following query on TPCDS 1000 scale, VectorDeserializeRow 
> threw an ArrayIndexOutOfBoundsException
> {code:title=Query}
> SELECT `customer_address`.`ca_zip`   AS `ca_zip`, 
>`customer_demographics`.`cd_education_status` AS 
> `cd_education_status`, 
>Sum(`store_sales`.`ss_net_paid`)  AS `SUM:SS_NET_PAID:ok` 
> FROM   `store_sales` `store_sales` 
>INNER JOIN `customer` `customer` 
>ON ( `store_sales`.`ss_customer_sk` = 
>   `customer`.`c_customer_sk` ) 
>INNER JOIN `customer_address` `customer_address` 
>ON ( `customer`.`c_current_addr_sk` = 
>   `customer_address`.`ca_address_sk` ) 
>INNER JOIN `customer_demographics` `customer_demographics` 
>ON ( `customer`.`c_current_cdemo_sk` = 
> `customer_demographics`.`cd_demo_sk` ) 
> WHERE  ( `customer`.`c_first_sales_date_sk` > 2452300 
>  AND `customer_demographics`.`cd_gender` = 'F' 
>  AND `customer`.`c_current_addr_sk` IS NOT NULL 
>  AND `store_sales`.`ss_sold_date_sk` IS NOT NULL 
>  AND `customer`.`c_current_cdemo_sk` IS NOT NULL ) 
> GROUP  BY `ca_zip`, 
>   `cd_education_status`;
> {code}
> {code:title=Exception}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:354)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:59)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:59)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:36)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:356)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
>   ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:62)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
>   ... 17 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ArrayIndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:392)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:143)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:121)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>   at 
> 

[jira] [Updated] (HIVE-13256) LLAP: RowGroup counter is wrong

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13256:
---
Target Version/s:   (was: 2.1.0)

> LLAP: RowGroup counter is wrong
> ---
>
> Key: HIVE-13256
> URL: https://issues.apache.org/jira/browse/HIVE-13256
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Log line from LlapIOCounter
> {code}
> ROWS_EMITTED=23528469, SELECTED_ROWGROUPS=87
> {code}
> If rowgroups contain 10K rows by default, then the expected count is roughly 
> 2,353 for the above case.





[jira] [Updated] (HIVE-13284) Make ORC Reader resilient to 0 length files

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13284:
---
Target Version/s:   (was: 2.1.0)

> Make ORC Reader resilient to 0 length files
> ---
>
> Key: HIVE-13284
> URL: https://issues.apache.org/jira/browse/HIVE-13284
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> HIVE-13040 creates 0 length ORC files. Reading such files will throw the 
> following exception. ORC is resilient to corrupt footers but not to 0 length 
> files.
> {code}
> Processing data file file:/app/warehouse/concat_incompat/00_0 [length: 0]
> Exception in thread "main" java.lang.IndexOutOfBoundsException
>   at java.nio.Buffer.checkIndex(Buffer.java:540)
>   at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
>   at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:510)
>   at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:361)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:83)
>   at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.getReader(FileDump.java:239)
>   at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:312)
>   at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:291)
>   at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:138)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
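The resilience the issue asks for amounts to checking the file length before handing the file to footer extraction, treating a 0-length file as containing zero rows. The names below are illustrative stand-ins, not the real OrcFile/ReaderImpl API:

```java
// Sketch: classify a file by length before attempting to parse an ORC
// footer, so a 0-length file is treated as empty instead of letting
// footer extraction throw IndexOutOfBoundsException.
class OrcLengthGuard {
    enum Action { PARSE_FOOTER, TREAT_AS_EMPTY }

    static Action classify(long fileLength) {
        // A 0-length file has no postscript or footer to read.
        return fileLength == 0 ? Action.TREAT_AS_EMPTY : Action.PARSE_FOOTER;
    }
}
```

The reader would then return an empty-reader stub for TREAT_AS_EMPTY instead of calling extractMetaInfoFromFooter().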





[jira] [Updated] (HIVE-13343) Need to disable hybrid grace hash join in llap mode except for dynamically partitioned hash join

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13343:
---
Fix Version/s: 2.1.0

> Need to disable hybrid grace hash join in llap mode except for dynamically 
> partitioned hash join
> 
>
> Key: HIVE-13343
> URL: https://issues.apache.org/jira/browse/HIVE-13343
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-13343.1.patch, HIVE-13343.2.patch, 
> HIVE-13343.3.patch, HIVE-13343.4.patch, HIVE-13343.5.patch, 
> HIVE-13343.6.patch, HIVE-13343.7.patch
>
>
> For performance reasons, we should disable the use of hybrid grace hash join 
> in LLAP when dynamic partition hash join is not used. With dynamic partition 
> hash join, we need hybrid grace hash join due to the possibility of skews.





[jira] [Updated] (HIVE-13602) TPCH q16 return wrong result when CBO is on

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13602:
---
Fix Version/s: 2.1.0

> TPCH q16 return wrong result when CBO is on
> ---
>
> Key: HIVE-13602
> URL: https://issues.apache.org/jira/browse/HIVE-13602
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.0.0, 1.2.2
>Reporter: Nemon Lou
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-13602.01.patch, HIVE-13602.03.patch, 
> HIVE-13602.04.patch, HIVE-13602.05.patch, HIVE-13602.final.patch, 
> calcite_cbo_bad.out, calcite_cbo_good.out, explain_cbo_bad_part1.out, 
> explain_cbo_bad_part2.out, explain_cbo_bad_part3.out, 
> explain_cbo_good(rewrite)_part1.out, explain_cbo_good(rewrite)_part2.out, 
> explain_cbo_good(rewrite)_part3.out
>
>
> Running TPCH with factor 2, 
> q16 returns 1,160 rows when CBO is on, 
> while it returns 24,581 rows when CBO is off.
> See the attachments for details.





[jira] [Updated] (HIVE-13629) Expose Merge-File task and Column-Truncate task from DDLTask

2016-05-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13629:
---
Fix Version/s: 2.1.0

> Expose Merge-File task and Column-Truncate task from DDLTask
> 
>
> Key: HIVE-13629
> URL: https://issues.apache.org/jira/browse/HIVE-13629
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 2.0.0
>Reporter: zhihai xu
>Assignee: zhihai xu
> Fix For: 2.1.0
>
> Attachments: HIVE-13629.000.patch
>
>
> DDLTask creates subtasks in mergeFiles and truncateTable to support 
> HiveOperation.TRUNCATETABLE, HiveOperation.ALTERTABLE_MERGEFILES, and 
> HiveOperation.ALTERPARTITION_MERGEFILES.
> It would be better to expose the tasks created in the mergeFiles and 
> truncateTable functions of DDLTask to users.




