[jira] [Commented] (HIVE-5795) Hive should be able to skip header and footer rows when reading data file for a table
[ https://issues.apache.org/jira/browse/HIVE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061630#comment-15061630 ]

Sivanesan commented on HIVE-5795:
---------------------------------

But it was the other way around: it was able to skip when the file size is less than the block size, and it skips a random detail record when the file size is more than the block size. My assumption: while using CombineHiveInputFormat, a record around the end of the first block is skipped.

> Hive should be able to skip header and footer rows when reading data file for a table
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-5795
>                 URL: https://issues.apache.org/jira/browse/HIVE-5795
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Shuaishuai Nie
>            Assignee: Shuaishuai Nie
>              Labels: TODOC13
>             Fix For: 0.13.0
>
>         Attachments: HIVE-5795.1.patch, HIVE-5795.2.patch, HIVE-5795.3.patch, HIVE-5795.4.patch, HIVE-5795.5.patch
>
>
> Hive should be able to skip header and footer lines when reading a data file for a table. That way, users don't need to preprocess data generated by other applications with a header or footer, and can use the file directly for table operations.
> To implement this, the idea is to add new table properties that define the number of header and footer lines, and to skip those lines when reading records from the record reader. A DDL example for creating a table with a header and footer:
> {code}
> Create external table testtable (name string, message string)
> row format delimited fields terminated by '\t' lines terminated by '\n'
> location '/testtable'
> tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2");
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
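Sivanesan's observation (skipping works for a single-block file, but a record goes missing once the file spans multiple splits) matches a common failure mode for this kind of feature: a reader that skips N header lines at the start of every split, instead of only in the split that starts at file offset 0, silently drops real records. Here is a minimal sketch of that failure mode; the class and method names are invented for illustration, and this is not Hive's actual record-reader code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HeaderSkipSketch {
    // Buggy per-split skip: drops the first `headerCount` lines of EVERY split,
    // so any split after the first loses real data rows.
    static List<String> readSplitNaive(List<String> split, int headerCount) {
        List<String> out = new ArrayList<>();
        for (int i = headerCount; i < split.size(); i++) {
            out.add(split.get(i));
        }
        return out;
    }

    // Safer variant: only the split that begins at file offset 0 skips the header.
    static List<String> readSplit(List<String> split, int headerCount, boolean firstSplit) {
        return readSplitNaive(split, firstSplit ? headerCount : 0);
    }

    public static void main(String[] args) {
        // A two-split file: split 1 holds the header plus one record,
        // split 2 holds two more records.
        List<String> split1 = Arrays.asList("name\tmessage", "a\thi");
        List<String> split2 = Arrays.asList("b\tbye", "c\tok");

        System.out.println(readSplitNaive(split2, 1)); // loses record "b\tbye"
        System.out.println(readSplit(split2, 1, false)); // keeps both records
        System.out.println(readSplit(split1, 1, true));  // correctly skips only the header
    }
}
```

If CombineHiveInputFormat packs different blocks of one file into different splits, this kind of unconditional per-split skip would produce exactly the symptom described: a record near the first block boundary disappears.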
[jira] [Commented] (HIVE-11355) Hive on tez: memory manager for sort buffers (input/output) and operators
[ https://issues.apache.org/jira/browse/HIVE-11355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061617#comment-15061617 ]

Hive QA commented on HIVE-11355:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12778139/HIVE-11355.7.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 184 failed/errored test(s), 9949 tests executed

*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_constprog_dpp
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llapdecider
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_tests
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_joins_explain
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_self_join
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_smb_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_smb_main
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_group_by
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_join_part_col_char
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join21
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_gby
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_limit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_semijoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_not_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_union
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_views
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_constprog_dpp
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join0
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_leftsemijoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.Tes
[jira] [Commented] (HIVE-11487) Add getNumPartitionsByFilter api in metastore api
[ https://issues.apache.org/jira/browse/HIVE-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061564#comment-15061564 ]

Akshay Goyal commented on HIVE-11487:
-------------------------------------

Sorry, I missed adding the generated files in the patch. Updated.

> Add getNumPartitionsByFilter api in metastore api
> -------------------------------------------------
>
>                 Key: HIVE-11487
>                 URL: https://issues.apache.org/jira/browse/HIVE-11487
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Akshay Goyal
>         Attachments: HIVE-11487.01.patch, HIVE-11487.02.patch, HIVE-11487.03.patch, HIVE-11487.04.patch
>
>
> Adding an API that returns the number of partitions matching a filter is more efficient when we are only interested in the count. getAllPartitions constructs all the partition objects, which can be time-consuming and is not required.
> Here is a commit we pushed in a forked repo in our organization:
> https://github.com/inmobi/hive/commit/68b3534d3e6c4d978132043cec668798ed53e444
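The motivation for the new API can be sketched as follows. The classes and the predicate-based filter below are simplified stand-ins for illustration, not the metastore's real Thrift API (the real filter is a string expression evaluated server-side, typically as a database COUNT query):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class PartitionCountSketch {
    // Stand-in for a metastore Partition object (the real one is much heavier).
    static class Partition {
        final String value;
        Partition(String value) { this.value = value; }
    }

    // Expensive path: materialize every matching partition object, even when the
    // caller only needs the count.
    static List<Partition> getPartitionsByFilter(List<String> values, Predicate<String> filter) {
        List<Partition> out = new ArrayList<>();
        for (String v : values) {
            if (filter.test(v)) out.add(new Partition(v));
        }
        return out;
    }

    // Cheap path: count the matches without constructing any partition objects.
    static int getNumPartitionsByFilter(List<String> values, Predicate<String> filter) {
        int n = 0;
        for (String v : values) {
            if (filter.test(v)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("dt=2015-12-01", "dt=2015-12-02", "dt=2016-01-01");
        Predicate<String> dec2015 = v -> v.startsWith("dt=2015-12");
        System.out.println(getNumPartitionsByFilter(parts, dec2015)); // 2
    }
}
```

On a table with many thousands of partitions, avoiding object construction (and, in the real metastore, avoiding fetching the rows at all) is where the win comes from.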
[jira] [Updated] (HIVE-11487) Add getNumPartitionsByFilter api in metastore api
[ https://issues.apache.org/jira/browse/HIVE-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akshay Goyal updated HIVE-11487:
--------------------------------
    Attachment: HIVE-11487.04.patch
[jira] [Updated] (HIVE-12590) Repeated UDAFs with literals can produce incorrect result
[ https://issues.apache.org/jira/browse/HIVE-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-12590:
------------------------------------
    Attachment: HIVE-12590.3.patch

> Repeated UDAFs with literals can produce incorrect result
> ---------------------------------------------------------
>
>                 Key: HIVE-12590
>                 URL: https://issues.apache.org/jira/browse/HIVE-12590
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>    Affects Versions: 1.0.1, 1.1.1, 1.2.1, 2.0.0
>            Reporter: Laljo John Pullokkaran
>            Assignee: Ashutosh Chauhan
>            Priority: Critical
>         Attachments: HIVE-12590.2.patch, HIVE-12590.3.patch, HIVE-12590.patch
>
>
> Repeated UDAFs with literal arguments can produce a wrong result. This is not a common use case, but it is nevertheless a bug: both columns below return 'pANTS', even though the first should be 'pants'.
> {noformat}
> hive> select max('pants'), max('pANTS') from t1 group by key;
> Total MapReduce CPU Time Spent: 0 msec
> OK
> pANTS	pANTS
> pANTS	pANTS
> pANTS	pANTS
> pANTS	pANTS
> pANTS	pANTS
> Time taken: 296.252 seconds, Fetched: 5 row(s)
> {noformat}
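The thread does not state the root cause, but the symptom (two UDAF calls that differ only in literal case returning one shared result) is what one would expect if the planner deduplicated aggregation expressions using a comparison that ignores literal case. This sketch illustrates that hypothesized failure mode only; it is not Hive's actual planner code, and the string-keyed map is an invented simplification:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UdafDedupSketch {
    // Hypothesized bug: aggregation expressions are deduplicated with a
    // case-insensitive key, so max('pants') and max('pANTS') collapse into
    // a single aggregation whose one result feeds both output columns.
    static Map<String, String> dedupBuggy(String... exprs) {
        Map<String, String> agg = new LinkedHashMap<>();
        for (String e : exprs) {
            agg.putIfAbsent(e.toLowerCase(), e);
        }
        return agg;
    }

    // Fix: compare literals exactly, so each distinct literal keeps its own
    // aggregation.
    static Map<String, String> dedupFixed(String... exprs) {
        Map<String, String> agg = new LinkedHashMap<>();
        for (String e : exprs) {
            agg.putIfAbsent(e, e);
        }
        return agg;
    }

    public static void main(String[] args) {
        System.out.println(dedupBuggy("max('pants')", "max('pANTS')").size()); // 1: both columns share one aggregation
        System.out.println(dedupFixed("max('pants')", "max('pANTS')").size()); // 2: each literal aggregated separately
    }
}
```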
[jira] [Comment Edited] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061531#comment-15061531 ]

Thejas M Nair edited comment on HIVE-12698 at 12/17/15 5:41 AM:
----------------------------------------------------------------

[~sershe] I think we should get this interface update into 2.0.0 as well. This improves on what is currently in the branch, which is an unreleased change.

was (Author: thejas):
[~sershe] I think we should get this interface update into 2.0.0 as well. This improves on what is currently in the branch.

> Remove exposure to internal privilege and principal classes in HiveAuthorizer
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-12698
>                 URL: https://issues.apache.org/jira/browse/HIVE-12698
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 1.3.0, 2.0.0
>
>         Attachments: HIVE-12698.1.patch, HIVE-12698.2.patch
>
>
> The changes in HIVE-11179 expose several internal classes to HiveAuthorizer implementations. These include PrivilegeObjectDesc, PrivilegeDesc, PrincipalDesc and AuthorizationUtils.
> We should avoid exposing these to all authorization implementations, while still making it possible for implementations such as Apache Sentry (incubating) to customize the mapping of internal classes to the public API classes.
[jira] [Commented] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061531#comment-15061531 ]

Thejas M Nair commented on HIVE-12698:
--------------------------------------

[~sershe] I think we should get this interface update into 2.0.0 as well. This improves on what is currently in the branch.
[jira] [Updated] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated HIVE-12698:
---------------------------------
    Attachment: HIVE-12698.2.patch

Updated patch to address comments.
[jira] [Commented] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061520#comment-15061520 ]

Dapeng Sun commented on HIVE-12698:
-----------------------------------

Sorry for the repeated messages; something went wrong with my web browser.
[jira] [Commented] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061509#comment-15061509 ]

Dapeng Sun commented on HIVE-12698:
-----------------------------------

I agree with you; we can also fix getHivePrincipal() in {{DDLTask}}.
[jira] [Commented] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061473#comment-15061473 ]

Ferdinand Xu commented on HIVE-12698:
-------------------------------------

Agree. It would be great for it to be
{code}
public HivePrincipal getHivePrincipal(PrincipalDesc principal)
{code}
which is the same style as
{code}
public HivePrivilegeObject getHivePrivilegeObject(PrivilegeObjectDesc privSubjectDesc) throws HiveException;
{code}
[jira] [Commented] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061454#comment-15061454 ]

Thejas M Nair commented on HIVE-12698:
--------------------------------------

Also, I noticed that DDLTask.java still calls AuthorizationUtils.getHivePrincipal in a few places. I guess that needs to be fixed as well.
[jira] [Commented] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061452#comment-15061452 ]

Thejas M Nair commented on HIVE-12698:
--------------------------------------

[~dapengsun] [~Ferd] I am wondering if the HiveAuthorizationTranslator api should have methods that operate on a single element instead of a list of them. That seems more generic and captures the basic logic - i.e., instead of
{code}
public List<HivePrincipal> getHivePrincipals(List<PrincipalDesc> principals)
{code}
have
{code}
public HivePrincipal getHivePrincipal(PrincipalDesc principal)
{code}
What are your thoughts?
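The design suggested above (expose the single-element translation and derive the list form from it) can be sketched like this. The classes below are simplified stand-ins for Hive's PrincipalDesc/HivePrincipal, and the default-method composition is my own illustration of the idea, not the patch's actual code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TranslatorSketch {
    // Simplified stand-ins for the internal and public classes.
    static class PrincipalDesc {
        final String name;
        PrincipalDesc(String name) { this.name = name; }
    }
    static class HivePrincipal {
        final String name;
        HivePrincipal(String name) { this.name = name; }
    }

    // Implementations (e.g. Sentry) only override the single-element mapping;
    // the list form falls out of it for free.
    interface HiveAuthorizationTranslator {
        HivePrincipal getHivePrincipal(PrincipalDesc principal);

        default List<HivePrincipal> getHivePrincipals(List<PrincipalDesc> principals) {
            List<HivePrincipal> out = new ArrayList<>();
            for (PrincipalDesc p : principals) {
                out.add(getHivePrincipal(p));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        HiveAuthorizationTranslator t = p -> new HivePrincipal(p.name);
        System.out.println(
            t.getHivePrincipals(Arrays.asList(new PrincipalDesc("alice"), new PrincipalDesc("bob"))).size()); // 2
    }
}
```

The appeal of the single-element shape is exactly what the comment says: it captures the basic logic once, so list, array, or map variants never need to be reimplemented by each authorization plugin.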
[jira] [Updated] (HIVE-12684) NPE in stats annotation when all values in decimal column are NULLs
[ https://issues.apache.org/jira/browse/HIVE-12684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-12684:
-----------------------------------------
    Attachment: HIVE-12684.3.patch

Reuploading for precommit tests.

> NPE in stats annotation when all values in decimal column are NULLs
> -------------------------------------------------------------------
>
>                 Key: HIVE-12684
>                 URL: https://issues.apache.org/jira/browse/HIVE-12684
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-12684.1.patch, HIVE-12684.2.patch, HIVE-12684.3.patch, HIVE-12684.3.patch
>
>
> When all values of a decimal column are NULL and column stats exist, the AnnotateWithStatistics optimization can throw an NPE. Following is the exception trace:
> {code}
> java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:712)
>   at org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:764)
>   at org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:750)
>   at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:197)
>   at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:143)
>   at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:131)
>   at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:114)
>   at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.walk(LevelOrderWalker.java:143)
>   at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122)
>   at org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78)
>   at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:228)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10156)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:225)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>   at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
> {code}
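The trace above suggests stats conversion dereferences min/max values that are simply absent when every value in the column is NULL. A hedged sketch of the kind of null guard such a fix needs; the class here is invented for illustration and is not StatsUtils' actual code:

```java
import java.math.BigDecimal;

public class NullSafeDecimalStats {
    // Simplified stand-in for a decimal column's statistics: min and max are
    // null when every value in the column is NULL.
    static class DecimalColStats {
        final BigDecimal min;
        final BigDecimal max;
        DecimalColStats(BigDecimal min, BigDecimal max) {
            this.min = min;
            this.max = max;
        }
    }

    // Null-safe range computation: a naive `max.subtract(min)` throws a
    // NullPointerException for an all-NULL column.
    static BigDecimal range(DecimalColStats s) {
        if (s.min == null || s.max == null) {
            return BigDecimal.ZERO; // degrade gracefully instead of throwing
        }
        return s.max.subtract(s.min);
    }

    public static void main(String[] args) {
        System.out.println(range(new DecimalColStats(new BigDecimal("1.5"), new BigDecimal("4.0")))); // 2.5
        System.out.println(range(new DecimalColStats(null, null))); // 0, instead of an NPE
    }
}
```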
[jira] [Commented] (HIVE-12667) Proper fix for HIVE-12473
[ https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061422#comment-15061422 ]

Hive QA commented on HIVE-12667:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12778100/HIVE-12667.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6376/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6376/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6376/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6376/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 0f1c112 HIVE-12610: Hybrid Grace Hash Join should fail task faster if processing first batch fails, instead of continuing processing the rest (Wei Zheng via Vikram Dixit K)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 0f1c112 HIVE-12610: Hybrid Grace Hash Join should fail task faster if processing first batch fails, instead of continuing processing the rest (Wei Zheng via Vikram Dixit K)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12778100 - PreCommit-HIVE-TRUNK-Build

> Proper fix for HIVE-12473
> -------------------------
>
>                 Key: HIVE-12667
>                 URL: https://issues.apache.org/jira/browse/HIVE-12667
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gunther Hagleitner
>            Assignee: Gunther Hagleitner
>         Attachments: HIVE-12667.1.patch, HIVE-12667.1.patch
>
>
> HIVE-12473 has added an incorrect comment and also lacks a test case.
> Benefits of this fix:
> * Does not say: "Probably doesn't work"
> * Does not use grammar like "subquery columns and such"
> * Adds test cases that let you verify the fix
> * Doesn't rely on a certain structure of the key expr; just takes the type at compile time
> * Doesn't require an additional walk of each key expression
> * Shows the type used in explain
[jira] [Commented] (HIVE-12695) LLAP: use somebody else's cluster
[ https://issues.apache.org/jira/browse/HIVE-12695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061419#comment-15061419 ] Hive QA commented on HIVE-12695: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12778096/HIVE-12695.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9964 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6375/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6375/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6375/ Messages: {noformat} 
Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12778096 - PreCommit-HIVE-TRUNK-Build > LLAP: use somebody else's cluster > - > > Key: HIVE-12695 > URL: https://issues.apache.org/jira/browse/HIVE-12695 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12695.patch > > > For non-HS2 case cluster sharing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12701) select on table with boolean as partition column shows wrong result
[ https://issues.apache.org/jira/browse/HIVE-12701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudipto Nandan updated HIVE-12701: -- Component/s: SQL Database/Schema > select on table with boolean as partition column shows wrong result > --- > > Key: HIVE-12701 > URL: https://issues.apache.org/jira/browse/HIVE-12701 > Project: Hive > Issue Type: Bug > Components: Database/Schema, SQL >Affects Versions: 1.1.0 >Reporter: Sudipto Nandan > > create table hive_aprm02ht7(a int, b int, c int) partitioned by (p boolean) > row format delimited fields terminated by ',' stored as textfile; > load data local inpath 'hive_data8.txt' into table hive_aprm02ht7 partition > (p=true); > load data local inpath 'hive_data8.txt' into table hive_aprm02ht7 partition > (p=false); > describe hive_aprm02ht7; > col_name data_type comment > a int > b int > c int > p boolean > # Partition Information > # col_name data_type comment > p boolean > show partitions hive_aprm02ht7; > OK > p=false > p=true > Time taken: 0.057 seconds, Fetched: 2 row(s) > -- everything is shown as true, but the first three rows should be true and the last > three rows should be false > hive> select * from hive_aprm02ht7 where p in (true,false); > OK > 1 2 3 true > 4 5 6 true > 7 8 9 true > 1 2 3 true > 4 5 6 true > 7 8 9 true > Time taken: 0.068 seconds, Fetched: 6 row(s) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8494) Hive partitioned table with smallint datatype
[ https://issues.apache.org/jira/browse/HIVE-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudipto Nandan updated HIVE-8494: - Affects Version/s: 1.1.0 > Hive partitioned table with smallint datatype > - > > Key: HIVE-8494 > URL: https://issues.apache.org/jira/browse/HIVE-8494 > Project: Hive > Issue Type: Bug > Components: CLI, Query Processor >Affects Versions: 0.12.0, 0.13.0, 1.1.0 >Reporter: Sudipto Nandan > > create a hive partitioned table with partitioning column datatype smallint > col_name data_type comment > a int None > b int None > c int None > p smallint None > Partition Information > col_name data_type comment > p smallint None > Load the following data; note that the partition value is 32768, which exceeds the > smallint limit by 1: > select * from t; > a b c p > 1 2 3 32768 > 4 5 6 32768 > 7 8 9 32768 > hive> select sum(p) from t; > also works > but > hive> select min(p) from t; > fails > Hive should disallow creation of a partition with value 32768, as it exceeds the > smallint limit (SMALLINT: 2-byte signed integer, from -32,768 to 32,767). > The same issue exists with int and a partition column value of 2,147,483,648, > which exceeds the int limit (INT: 4-byte signed integer, from -2,147,483,648 > to 2,147,483,647). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
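The validation this report asks for amounts to a simple bounds check at partition-creation time. A minimal Python sketch, for illustration only (the helper name is hypothetical and this is not Hive's actual partition-handling code), using the signed-integer limits cited in the report:

```python
# Illustrative sketch: validate a partition value against the declared
# column type's range, as the reporter suggests Hive should do.
# (Hypothetical helper, not Hive's actual implementation.)

INT_RANGES = {
    "smallint": (-32768, 32767),            # 2-byte signed integer
    "int": (-2147483648, 2147483647),       # 4-byte signed integer
}

def validate_partition_value(col_type, value):
    lo, hi = INT_RANGES[col_type]
    if not lo <= value <= hi:
        raise ValueError(
            f"partition value {value} out of range for {col_type} [{lo}, {hi}]")
    return value

# 32768 exceeds the smallint limit by 1, so partition creation with that
# value would be rejected instead of failing later in queries like min(p).
```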
[jira] [Updated] (HIVE-11797) Alter table change columnname doesn't work on avro serde hive table
[ https://issues.apache.org/jira/browse/HIVE-11797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudipto Nandan updated HIVE-11797: -- Affects Version/s: 1.1.0 > Alter table change columnname doesn't work on avro serde hive table > --- > > Key: HIVE-11797 > URL: https://issues.apache.org/jira/browse/HIVE-11797 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Sudipto Nandan >Assignee: Chaoyu Tang > > We create a table using Avro serde > Hive table name hive_t1. > Then we try to change the column name. > The commands ends successfully but the name of the column is not modified. > create table if not exists hive_t1 > partitioned by (p1 int) > row format SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' > STORED AS > INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' > TBLPROPERTIES ('avro.schema.literal'='{ > "namespace": "testing.hive.avro.serde", > "name": "avro_table", > "type": "record", > "fields": [ > { > "name":"number", > "type":"int", > "doc":"Order of playing the role" > }, > { > "name":"first_name", > "type":"string", > "doc":"first name of actor playing role" > }, > { > "name":"last_name", > "type":"string", > "doc":"last name of actor playing role" > }, > { > "name":"extra_field", > "type":"string", > "doc:":"an extra field not in the original file", > "default":"fishfingers and custard" > } > ] > }'); > hive> alter table hive_t1 change column number number1 int; > OK > Time taken: 0.12 seconds > hive> select * from hive_t1 limit 5; > OK > hive_t1.number hive_t1.first_name hive_t1.last_name > hive_t1.extra_field hive_t1.p1 > 6 Colin Baker fishfingers and custard 100 > 3 Jon Pertwee fishfingers and custard 100 > 4 Tom Baker fishfingers and custard 100 > 5 Peter Davison fishfingers and custard 100 > 11 MattSmith fishfingers and custard 100 > Time taken: 0.05 seconds, Fetched: 5 row(s) > hive> describe hive_t1; > OK 
> col_namedata_type comment > number int from deserializer > first_name string from deserializer > last_name string from deserializer > extra_field string from deserializer > p1 int > > # Partition Information > # col_name data_type comment > > p1 int > Time taken: 0.051 seconds, Fetched: 10 row(s) > -- Using the below command also the column name is not changed from "number" > to "number1" > hive> alter table hive_t1 change number number1 int; > OK > Time taken: 0.081 seconds > hive> describe hive_t1; > OK > col_namedata_type comment > number int from deserializer > first_name string from deserializer > last_name string from deserializer > extra_field string from deserializer > p1 int > > # Partition Information > # col_name data_type comment > > p1 int > Time taken: 0.054 seconds, Fetched: 10 row(s) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12470) Allow splits to provide custom consistent locations, instead of being tied to data locality
[ https://issues.apache.org/jira/browse/HIVE-12470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061390#comment-15061390 ] Siddharth Seth commented on HIVE-12470: --- RB already exists. On the sorting - the list can change on each refresh, and it isn't known whether the list actually changes or not. That could be tracked. However, given this is not accessed very frequently, I did not try to optimize away the sort. Cache registries by name - for a single client which may communicate with different llap instances. e.g. a single hive server instance which can submit to different llap daemons. > Allow splits to provide custom consistent locations, instead of being tied to > data locality > --- > > Key: HIVE-12470 > URL: https://issues.apache.org/jira/browse/HIVE-12470 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-12470.1.txt, HIVE-12470.1.wip.txt > > > LLAP instances may not run on the same nodes as HDFS, or may run on a subset > of the cluster. > Using split locations based on FileSystem locality is not very useful in such > cases - since that guarantees not getting any locality. > Allow a split to map to a specific location - so that there's a chance of > getting cache locality across different queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
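The general technique behind this improvement — mapping each split to a stable location independent of HDFS block placement — can be sketched as follows. This is an illustrative Python sketch under stated assumptions (hash the split path into a sorted instance list), not the patch's actual Java code; sorting keeps the mapping stable even when the refreshed instance list arrives in a different order, which is the point of the sort discussed in the comment above.

```python
import hashlib

def consistent_location(split_path, llap_instances):
    """Map a split to the same LLAP instance across queries by hashing
    its path into a sorted instance list (sketch, not the patch's code)."""
    ordered = sorted(llap_instances)  # stable order regardless of refresh order
    digest = hashlib.md5(split_path.encode()).digest()
    idx = int.from_bytes(digest[:4], "big") % len(ordered)
    return ordered[idx]

# The same split always lands on the same instance, giving a chance of
# cache locality across queries even when LLAP does not run on the HDFS nodes.
```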
[jira] [Commented] (HIVE-12676) [hive+impala] Alter table Rename to + Set location in a single step
[ https://issues.apache.org/jira/browse/HIVE-12676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061378#comment-15061378 ] Lefty Leverenz commented on HIVE-12676: --- [~egmont@c], yes you should file this request separately for Impala. Thanks. > [hive+impala] Alter table Rename to + Set location in a single step > --- > > Key: HIVE-12676 > URL: https://issues.apache.org/jira/browse/HIVE-12676 > Project: Hive > Issue Type: Improvement > Components: hpl/sql >Reporter: Egmont Koblinger >Assignee: Dmitry Tolpeko >Priority: Minor > > Assume a nonstandard table location, let's say /foo/bar/table1. You might > want to rename from table1 to table2 and move the underlying data accordingly > to /foo/bar/table2. > The "alter table ... rename to ..." clause alters the table name, but in the > same step moves the data into the standard location > /user/hive/warehouse/table2. Then a subsequent "alter table ... set location > ..." can move it back to the desired location /foo/bar/table2. > This is problematic if there's any permission problem in play, e.g. not > being able to write to /user/hive/warehouse. So it should be possible to move > the underlying data to its desired final place without intermediate locations in > between. > A hard-to-discover workaround is to set the table to external, then > rename it, then set it back to internal and then change its location. > It would be great to be able to do an "alter table ... rename to ... set > location ..." operation in a single step. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12667) Proper fix for HIVE-12473
[ https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061376#comment-15061376 ] Sergey Shelukhin commented on HIVE-12667: - +1 pending tests. I guess we don't have to check for string. > Proper fix for HIVE-12473 > - > > Key: HIVE-12667 > URL: https://issues.apache.org/jira/browse/HIVE-12667 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-12667.1.patch, HIVE-12667.1.patch > > > HIVE-12473 has added an incorrect comment and also lacks a test case. > Benefits of this fix: >* Does not say: "Probably doesn't work" >* Does not use grammar like "subquery columns and such" >* Adds test cases, that let you verify the fix >* Doesn't rely on certain structure of key expr, just takes the type at > compile time >* Doesn't require an additional walk of each key expression >* Shows the type used in explain -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12699) LLAP: hive.llap.daemon.work.dirs setting backward compat name doesn't work
[ https://issues.apache.org/jira/browse/HIVE-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061369#comment-15061369 ] Gopal V commented on HIVE-12699: LGTM - +1. > LLAP: hive.llap.daemon.work.dirs setting backward compat name doesn't work > --- > > Key: HIVE-12699 > URL: https://issues.apache.org/jira/browse/HIVE-12699 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Trivial > Attachments: HIVE-12699.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12352) CompactionTxnHandler.markCleaned() may delete too much
[ https://issues.apache.org/jira/browse/HIVE-12352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12352: -- Description: Worker will start with DB in state X (wrt this partition). while it's working more txns will happen, against partition it's compacting. then this will delete state up to X and since then. There may be new delta files created between compaction starting and cleaning. These will not be compacted until more transactions happen. So this ideally should only delete up to TXN_ID that was compacted (i.e. HWM in Worker?) Then this can also run at READ_COMMITTED. So this means we'd want to store HWM in COMPACTION_QUEUE when Worker picks up the job. Actually the problem is even worse (but also solved using HWM as above): Suppose some transactions (against same partition) have started and aborted since the time Worker ran compaction job. That means there are never-compacted delta files with data that belongs to these aborted txns. Following will pick up these aborted txns. s = "select txn_id from TXNS, TXN_COMPONENTS where txn_id = tc_txnid and txn_state = '" + TXN_ABORTED + "' and tc_database = '" + info.dbname + "' and tc_table = '" + info.tableName + "'"; if (info.partName != null) s += " and tc_partition = '" + info.partName + "'"; The logic after that will delete relevant data from TXN_COMPONENTS and if one of these txns becomes empty, it will be picked up by cleanEmptyAbortedTxns(). At that point any metadata about an Aborted txn is gone and the system will think it's committed. HWM in this case would be (in ValidCompactorTxnList) if(minOpenTxn > 0) min(highWaterMark, minOpenTxn) else highWaterMark was: Worker will start with DB in state X (wrt this partition). while it's working more txns will happen, against partition it's compacting. then this will delete state up to X and since then. There may be new delta files created between compaction starting and cleaning. 
These will not be compacted until more transactions happen. So this ideally should only delete up to TXN_ID that was compacted (i.e. HWM in Worker?) Then this can also run at READ_COMMITTED. So this means we'd want to store HWM in COMPACTION_QUEUE when Worker picks up the job. Actually the problem is even worse (but also solved using HWM as above): Suppose some transactions (against same partition) have started and aborted since the time Worker ran compaction job. That means there are never-compacted delta files with data that belongs to these aborted txns. Following will pick up these aborted txns. s = "select txn_id from TXNS, TXN_COMPONENTS where txn_id = tc_txnid and txn_state = '" + TXN_ABORTED + "' and tc_database = '" + info.dbname + "' and tc_table = '" + info.tableName + "'"; if (info.partName != null) s += " and tc_partition = '" + info.partName + "'"; The logic after that will delete relevant data from TXN_COMPONENTS and if one of these txns becomes empty, it will be picked up by cleanEmptyAbortedTxns(). At that point any metadata about an Aborted txn is gone and the system will think it's committed. > CompactionTxnHandler.markCleaned() may delete too much > -- > > Key: HIVE-12352 > URL: https://issues.apache.org/jira/browse/HIVE-12352 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Blocker > >Worker will start with DB in state X (wrt this partition). >while it's working more txns will happen, against partition it's > compacting. >then this will delete state up to X and since then. There may be new > delta files created >between compaction starting and cleaning. These will not be compacted > until more >transactions happen. So this ideally should only delete >up to TXN_ID that was compacted (i.e. HWM in Worker?) Then this can also > run >at READ_COMMITTED. So this means we'd want to store HWM in > COMPACTION_QUEUE when >Worker picks up the job. 
> Actually the problem is even worse (but also solved using HWM as above): > Suppose some transactions (against same partition) have started and aborted > since the time Worker ran compaction job. > That means there are never-compacted delta files with data that belongs to > these aborted txns. > Following will pick up these aborted txns. > s = "select txn_id from TXNS, TXN_COMPONENTS where txn_id = tc_txnid and > txn_state = '" + > TXN_ABORTED + "' and tc_database = '" + info.dbname + "' and > tc_table = '" + > info.tableName + "'"; > if (info.partName != null) s += " and tc_partition = '" + > info.partName + "'
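The high-water-mark rule quoted at the end of the updated description is compact enough to state as code. A minimal sketch, assuming the semantics described for {{ValidCompactorTxnList}} (this is a restatement of the prose above, not Hive's actual implementation):

```python
def compaction_high_water_mark(high_water_mark, min_open_txn):
    """Sketch of the HWM rule in the description: the cleaner should only
    delete state up to the txn id the Worker actually compacted, so cap the
    high-water mark at the lowest open transaction when one exists."""
    if min_open_txn > 0:
        return min(high_water_mark, min_open_txn)
    return high_water_mark
```

With this cap, delta files and TXN_COMPONENTS rows for transactions that started (or aborted) after the Worker picked up the job are left alone, avoiding the "delete too much" problem the issue describes.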
[jira] [Updated] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-12698: - Attachment: HIVE-12698.1.patch > Remove exposure to internal privilege and principal classes in HiveAuthorizer > - > > Key: HIVE-12698 > URL: https://issues.apache.org/jira/browse/HIVE-12698 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12698.1.patch > > > The changes in HIVE-11179 expose several internal classes to > HiveAuthorization implementations. These include PrivilegeObjectDesc, > PrivilegeDesc, PrincipalDesc and AuthorizationUtils. > We should avoid exposing that to all Authorization implementations, but also > make the ability to customize the mapping of internal classes to the public > api classes possible for Apache Sentry (incubating). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
[ https://issues.apache.org/jira/browse/HIVE-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061356#comment-15061356 ] ASF GitHub Bot commented on HIVE-12698: --- GitHub user thejasmn opened a pull request: https://github.com/apache/hive/pull/58 HIVE-12698 : introduce HiveAuthorizationTranslator interface for isolating authori… …zation impls from hive internal classes You can merge this pull request into a Git repository by running: $ git pull https://github.com/thejasmn/hive HIVE-12698 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/58.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #58 commit 27e73f2a45aa3e5a158c14fb0693567158cef0d7 Author: Thejas Nair Date: 2015-12-17T02:36:52Z introduce HiveAuthorizationTranslator interface for isolating authorization impls from hive internal classes > Remove exposure to internal privilege and principal classes in HiveAuthorizer > - > > Key: HIVE-12698 > URL: https://issues.apache.org/jira/browse/HIVE-12698 > Project: Hive > Issue Type: Bug > Components: Authorization >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Fix For: 1.3.0, 2.0.0 > > > The changes in HIVE-11179 expose several internal classes to > HiveAuthorization implementations. These include PrivilegeObjectDesc, > PrivilegeDesc, PrincipalDesc and AuthorizationUtils. > We should avoid exposing that to all Authorization implementations, but also > make the ability to customize the mapping of internal classes to the public > api classes possible for Apache Sentry (incubating). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function
[ https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061350#comment-15061350 ] Lefty Leverenz commented on HIVE-12570: --- [~hsubramaniyan], Fix Version/s should include 1.3.0. > Incorrect error message Expression not in GROUP BY key thrown instead of > Invalid function > - > > Key: HIVE-12570 > URL: https://issues.apache.org/jira/browse/HIVE-12570 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Fix For: 2.1.0 > > Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, > HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch > > > {code} > explain create table avg_salary_by_supervisor3 as select average(key) as > key_avg from src group by value; > {code} > We get the following stack trace : > {code} > FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY > key 'key' > ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 > Expression not in GROUP BY key 'key' > org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not > in GROUP BY key 'key' > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561) > at > 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} > Instead of the above error message, it be more appropriate to throw the 
below > error : > ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid > function 'average' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11179) HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers
[ https://issues.apache.org/jira/browse/HIVE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061343#comment-15061343 ] Dapeng Sun commented on HIVE-11179: --- Thanks [~thejas] for your follow-up. I will also think about how to minimize the API change. > HIVE should allow custom converting from HivePrivilegeObjectDesc to > privilegeObject for different authorizers > - > > Key: HIVE-11179 > URL: https://issues.apache.org/jira/browse/HIVE-11179 > Project: Hive > Issue Type: Improvement >Reporter: Dapeng Sun >Assignee: Dapeng Sun > Labels: Authorization > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11179.001.patch, HIVE-11179.001.patch > > > HIVE should allow custom converting from HivePrivilegeObjectDesc to > privilegeObject for different authorizers: > There is a case in Apache Sentry: Sentry supports URI- and server-level > privileges, but on the Hive side, Hive uses > {{AuthorizationUtils.getHivePrivilegeObject(privSubjectDesc)}} to do the > converting, and the code in {{getHivePrivilegeObject()}} only handles the > cases for table and database: > {noformat} > privSubjectDesc.getTable() ? HivePrivilegeObjectType.TABLE_OR_VIEW : > HivePrivilegeObjectType.DATABASE; > {noformat} > A solution is to move this method to {{HiveAuthorizer}}, so that a custom > Authorizer could enhance it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
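The proposed fix — making the conversion an overridable method on {{HiveAuthorizer}} — is a standard template-method pattern. A minimal Python sketch with simplified, hypothetical class and method names (Hive's real API is Java and differs in detail):

```python
class HiveAuthorizer:
    # Default conversion: only table-or-view vs. database, mirroring the
    # simplified ternary from AuthorizationUtils.getHivePrivilegeObject.
    def get_privilege_object_type(self, priv_subject):
        return "TABLE_OR_VIEW" if priv_subject.get("table") else "DATABASE"

class SentryAuthorizer(HiveAuthorizer):
    # A custom authorizer can now recognize scopes the default ignores,
    # such as Sentry's URI- and server-level privileges.
    def get_privilege_object_type(self, priv_subject):
        if priv_subject.get("uri"):
            return "URI"
        if priv_subject.get("server"):
            return "SERVER"
        return super().get_privilege_object_type(priv_subject)
```

Hooking the conversion into the authorizer interface this way lets each implementation extend the mapping without Hive hard-coding every scope.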
[jira] [Updated] (HIVE-12700) complex join keys cannot be recognized in Hive 0.13
[ https://issues.apache.org/jira/browse/HIVE-12700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyong Zhu updated HIVE-12700: Attachment: job explain plan.txt Implicit Joins.hql explicit join key.hql > complex join keys cannot be recognized in Hive 0.13 > --- > > Key: HIVE-12700 > URL: https://issues.apache.org/jira/browse/HIVE-12700 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 0.13.1 >Reporter: Xiaoyong Zhu >Priority: Critical > Attachments: Implicit Joins.hql, explicit join key.hql, job explain > plan.txt > > > Hi Experts > I am using Hive 0.13 and find a potential bug. Attached “implicit join.hql” > has several join keys (for example store_sales.ss_addr_sk = > customer_address.ca_address_sk) that cannot be recognized by Hive. In such > cases Hive won’t be able to optimize and can only do a cross join first, which > makes the job run really long. If I change the query to explicit join keys, > then it works well. > For the simple query below Hive can recognize the join keys, and I think > Hive should be able to handle complex situations such as my example, > right? > > SELECT * > FROM table1 t1, table2 t2, table3 t3 > WHERE t1.id = t2.id AND t2.id = t3.id AND t1.zipcode = '02535'; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11865) Disable Hive PPD optimizer when CBO has optimized the plan
[ https://issues.apache.org/jira/browse/HIVE-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061329#comment-15061329 ] Hive QA commented on HIVE-11865: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12778074/HIVE-11865.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 82 failed/errored test(s), 9964 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_boolexpr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_columns org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_join_part_col_char org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_between_columns org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_join_part_col_char org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join5 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query13 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query15 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query17 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query18 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query19 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query20 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query21 
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query22 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query25 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query26 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query27 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query28 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query29 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query3 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query31 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query32 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query34 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query39 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query40 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query42 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query43 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query45 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query46 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query48 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query50 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query51 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query52 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query54 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query55 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query58 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query64 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query65 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query66 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query67 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query68 
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query7 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query70 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query71 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query72 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query73 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query75 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query76 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query79 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query80 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query82 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query84 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query85 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query87 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCli
[jira] [Updated] (HIVE-12699) LLAP: hive.llap.daemon.work.dirs setting backward compat name doesn't work
[ https://issues.apache.org/jira/browse/HIVE-12699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12699: Attachment: HIVE-12699.patch [~gopalv] can you take a look? Trivial patch. > LLAP: hive.llap.daemon.work.dirs setting backward compat name doesn't work > --- > > Key: HIVE-12699 > URL: https://issues.apache.org/jira/browse/HIVE-12699 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Trivial > Attachments: HIVE-12699.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11179) HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers
[ https://issues.apache.org/jira/browse/HIVE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061317#comment-15061317 ] Thejas M Nair commented on HIVE-11179: -- Created HIVE-12698 to track changes that reduce the exposure of Hive internal classes to general authorization implementations. Those changes should also reduce the chance that newer Hive changes break other authorization implementations. > HIVE should allow custom converting from HivePrivilegeObjectDesc to > privilegeObject for different authorizers > - > > Key: HIVE-11179 > URL: https://issues.apache.org/jira/browse/HIVE-11179 > Project: Hive > Issue Type: Improvement >Reporter: Dapeng Sun >Assignee: Dapeng Sun > Labels: Authorization > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11179.001.patch, HIVE-11179.001.patch > > > HIVE should allow custom converting from HivePrivilegeObjectDesc to > privilegeObject for different authorizers: > There is a case in Apache Sentry: Sentry supports URI and server level > privileges, but on the Hive side, it uses > {{AuthorizationUtils.getHivePrivilegeObject(privSubjectDesc)}} to do the > conversion, and the code in {{getHivePrivilegeObject()}} only handles the > table and database cases > {noformat} > privSubjectDesc.getTable() ? HivePrivilegeObjectType.TABLE_OR_VIEW : > HivePrivilegeObjectType.DATABASE; > {noformat} > A solution is to move this method to {{HiveAuthorizer}}, so that a custom > Authorizer could enhance it.
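The gap described above is easiest to see from the DDL side: a table-level grant converts cleanly, while a URI-level grant has no branch in the ternary quoted in the description. A sketch follows; the URI grant syntax is Sentry's dialect and the role/path names are assumptions, not stock Hive behavior:

```sql
-- Converts cleanly: getHivePrivilegeObject() maps this to TABLE_OR_VIEW
GRANT SELECT ON TABLE testtable TO ROLE analyst;

-- Sentry-style URI grant: neither TABLE_OR_VIEW nor DATABASE applies,
-- so the default conversion in AuthorizationUtils cannot represent it
GRANT ALL ON URI 'hdfs://namenode:8020/data/external' TO ROLE analyst;
```

Moving the conversion into {{HiveAuthorizer}}, as the description proposes, would let an authorizer such as Sentry map the URI and server cases itself.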
[jira] [Commented] (HIVE-12470) Allow splits to provide custom consistent locations, instead of being tied to data locality
[ https://issues.apache.org/jira/browse/HIVE-12470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061291#comment-15061291 ] Sergey Shelukhin commented on HIVE-12470: - Can you post a RB? Why not store the list pre-sorted instead of sorting every time? Also, what is the need for the cache of registries by name? > Allow splits to provide custom consistent locations, instead of being tied to > data locality > --- > > Key: HIVE-12470 > URL: https://issues.apache.org/jira/browse/HIVE-12470 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-12470.1.txt, HIVE-12470.1.wip.txt > > > LLAP instances may not run on the same nodes as HDFS, or may run on a subset > of the cluster. > Using split locations based on FileSystem locality is not very useful in such > cases - since that guarantees not getting any locality. > Allow a split to map to a specific location - so that there's a chance of > getting cache locality across different queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function
[ https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061283#comment-15061283 ] Laljo John Pullokkaran commented on HIVE-12570: --- https://issues.apache.org/jira/browse/HIVE-12570 is not about this bug. > Incorrect error message Expression not in GROUP BY key thrown instead of > Invalid function > - > > Key: HIVE-12570 > URL: https://issues.apache.org/jira/browse/HIVE-12570 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Fix For: 2.1.0 > > Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, > HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch > > > {code} > explain create table avg_salary_by_supervisor3 as select average(key) as > key_avg from src group by value; > {code} > We get the following stack trace : > {code} > FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY > key 'key' > ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 > Expression not in GROUP BY key 'key' > org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not > in GROUP BY key 'key' > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561) > at > 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} > Instead of the above error message, it be more appropriate to throw the 
below > error : > ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid > function 'average'
[jira] [Resolved] (HIVE-12534) Date functions with vectorization is returning wrong results
[ https://issues.apache.org/jira/browse/HIVE-12534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V resolved HIVE-12534. Resolution: Duplicate This is duplicated by HIVE-12479 and HIVE-12535 > Date functions with vectorization is returning wrong results > > > Key: HIVE-12534 > URL: https://issues.apache.org/jira/browse/HIVE-12534 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Critical > Attachments: p26_explain.txt, plan.txt > > > {noformat} > select c.effective_date, year(c.effective_date), month(c.effective_date) from > customers c where c.customer_id = 146028; > hive> set hive.vectorized.execution.enabled=true; > hive> select c.effective_date, year(c.effective_date), > month(c.effective_date) from customers c where c.customer_id = 146028; > 2015-11-19 0 0 > hive> set hive.vectorized.execution.enabled=false; > hive> select c.effective_date, year(c.effective_date), > month(c.effective_date) from customers c where c.customer_id = 146028; > 2015-11-19 2015 11 > {noformat} > \cc [~gopalv], [~sseth], [~sershe]
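Until a fix from the linked duplicates lands, the session setting already used in the repro suggests a workaround: run the affected query in row mode. A sketch only; the table and column names are taken from the report above:

```sql
-- Fall back to non-vectorized execution, which returns the correct
-- year()/month() values in the repro above
set hive.vectorized.execution.enabled=false;
select c.effective_date, year(c.effective_date), month(c.effective_date)
from customers c
where c.customer_id = 146028;

-- Restore vectorization for the rest of the session
set hive.vectorized.execution.enabled=true;
```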
[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11927: --- Attachment: HIVE-11927.14.patch > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, > HIVE-11927.06.patch, HIVE-11927.07.patch, HIVE-11927.08.patch, > HIVE-11927.09.patch, HIVE-11927.10.patch, HIVE-11927.11.patch, > HIVE-11927.12.patch, HIVE-11927.13.patch, HIVE-11927.14.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12518) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test failure for groupby_resolution.q
[ https://issues.apache.org/jira/browse/HIVE-12518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061248#comment-15061248 ] Pengcheng Xiong commented on HIVE-12518: cc'ing [~jpullokkaran]. This is an issue with the current return path. Thanks. > CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix test > failure for groupby_resolution.q > --- > > Key: HIVE-12518 > URL: https://issues.apache.org/jira/browse/HIVE-12518 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > > The problem can be reproduced on the return path when there is no map-side > group by and the data is skewed.
[jira] [Updated] (HIVE-12610) Hybrid Grace Hash Join should fail task faster if processing first batch fails, instead of continuing processing the rest
[ https://issues.apache.org/jira/browse/HIVE-12610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-12610: -- Fix Version/s: 2.0.0 > Hybrid Grace Hash Join should fail task faster if processing first batch > fails, instead of continuing processing the rest > - > > Key: HIVE-12610 > URL: https://issues.apache.org/jira/browse/HIVE-12610 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Wei Zheng >Assignee: Wei Zheng > Fix For: 1.3.0, 2.0.0, 1.2.2, 2.1.0 > > Attachments: HIVE-12610.1.patch, HIVE-12610.2.patch, > HIVE-12610.branch-1.patch > > > While processing the spilled partitions, if there is a fatal error, such as a > Kryo exception, we should exit early instead of moving on to process > the rest of the spilled partitions.
[jira] [Commented] (HIVE-11179) HIVE should allow custom converting from HivePrivilegeObjectDesc to privilegeObject for different authorizers
[ https://issues.apache.org/jira/browse/HIVE-11179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061252#comment-15061252 ] Thejas M Nair commented on HIVE-11179: -- This patch exposes many Hive internal classes in the Authorization plugin interface. Classes exposed through the interface would be considered public API by users. But I also understand that Sentry is quite intertwined with Hive internals and needs this ability to do custom conversion. I think we can minimize the exposure to other API users, and still provide Sentry with the ability it needs, by tweaking this change some more. I will create a follow-up JIRA. > HIVE should allow custom converting from HivePrivilegeObjectDesc to > privilegeObject for different authorizers > - > > Key: HIVE-11179 > URL: https://issues.apache.org/jira/browse/HIVE-11179 > Project: Hive > Issue Type: Improvement >Reporter: Dapeng Sun >Assignee: Dapeng Sun > Labels: Authorization > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11179.001.patch, HIVE-11179.001.patch > > > HIVE should allow custom converting from HivePrivilegeObjectDesc to > privilegeObject for different authorizers: > There is a case in Apache Sentry: Sentry supports URI and server level > privileges, but on the Hive side, it uses > {{AuthorizationUtils.getHivePrivilegeObject(privSubjectDesc)}} to do the > conversion, and the code in {{getHivePrivilegeObject()}} only handles the > table and database cases > {noformat} > privSubjectDesc.getTable() ? HivePrivilegeObjectType.TABLE_OR_VIEW : > HivePrivilegeObjectType.DATABASE; > {noformat} > A solution is to move this method to {{HiveAuthorizer}}, so that a custom > Authorizer could enhance it.
[jira] [Updated] (HIVE-12470) Allow splits to provide custom consistent locations, instead of being tied to data locality
[ https://issues.apache.org/jira/browse/HIVE-12470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-12470: -- Attachment: HIVE-12470.1.txt Patch for review. cc [~gopalv], [~sershe] > Allow splits to provide custom consistent locations, instead of being tied to > data locality > --- > > Key: HIVE-12470 > URL: https://issues.apache.org/jira/browse/HIVE-12470 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-12470.1.txt, HIVE-12470.1.wip.txt > > > LLAP instances may not run on the same nodes as HDFS, or may run on a subset > of the cluster. > Using split locations based on FileSystem locality is not very useful in such > cases - since that guarantees not getting any locality. > Allow a split to map to a specific location - so that there's a chance of > getting cache locality across different queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12610) Hybrid Grace Hash Join should fail task faster if processing first batch fails, instead of continuing processing the rest
[ https://issues.apache.org/jira/browse/HIVE-12610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061242#comment-15061242 ] Vikram Dixit K commented on HIVE-12610: --- Thanks Wei! > Hybrid Grace Hash Join should fail task faster if processing first batch > fails, instead of continuing processing the rest > - > > Key: HIVE-12610 > URL: https://issues.apache.org/jira/browse/HIVE-12610 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Wei Zheng >Assignee: Wei Zheng > Fix For: 1.3.0, 1.2.2, 2.1.0 > > Attachments: HIVE-12610.1.patch, HIVE-12610.2.patch, > HIVE-12610.branch-1.patch > > > While processing the spilled partitions, if there is a fatal error, such as a > Kryo exception, we should exit early instead of moving on to process > the rest of the spilled partitions.
[jira] [Commented] (HIVE-11775) Implement limit push down through union all in CBO
[ https://issues.apache.org/jira/browse/HIVE-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061229#comment-15061229 ] Pengcheng Xiong commented on HIVE-11775: We have a clean QA run. [~jpullokkaran], could you please take a look? Thanks. > Implement limit push down through union all in CBO > -- > > Key: HIVE-11775 > URL: https://issues.apache.org/jira/browse/HIVE-11775 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11775.01.patch, HIVE-11775.02.patch, > HIVE-11775.03.patch, HIVE-11775.04.patch, HIVE-11775.05.patch, > HIVE-11775.06.patch, HIVE-11775.07.patch, HIVE-11775.08.patch, > HIVE-11775.09.patch, HIVE-11775.10.patch > > > Inspired by HIVE-11684 (kudos to [~jcamachorodriguez]), we can push > limit down through union all, which reduces the intermediate number of > rows in the union branches.
[jira] [Commented] (HIVE-11775) Implement limit push down through union all in CBO
[ https://issues.apache.org/jira/browse/HIVE-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061224#comment-15061224 ] Hive QA commented on HIVE-11775: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12778068/HIVE-11775.10.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9965 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarDataNucleusUnCaching org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6373/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6373/console Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6373/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12778068 - PreCommit-HIVE-TRUNK-Build > Implement limit push down through union all in CBO > -- > > Key: HIVE-11775 > URL: https://issues.apache.org/jira/browse/HIVE-11775 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11775.01.patch, HIVE-11775.02.patch, > HIVE-11775.03.patch, HIVE-11775.04.patch, HIVE-11775.05.patch, > HIVE-11775.06.patch, HIVE-11775.07.patch, HIVE-11775.08.patch, > HIVE-11775.09.patch, HIVE-11775.10.patch > > > Enlightened by HIVE-11684 (Kudos to [~jcamachorodriguez]), we can actually > push limit down through union all, which reduces the intermediate number of > rows in union branches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12661: --- Attachment: HIVE-12661.04.patch > StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly > --- > > Key: HIVE-12661 > URL: https://issues.apache.org/jira/browse/HIVE-12661 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, > HIVE-12661.03.patch, HIVE-12661.04.patch > > > PROBLEM: > Hive stats are autogathered properly till an 'analyze table [tablename] > compute statistics for columns' is run. Then it does not auto-update the > stats till the command is run again. repro: > {code} > set hive.stats.autogather=true; > set hive.stats.atomic=false ; > set hive.stats.collect.rawdatasize=true ; > set hive.stats.collect.scancols=false ; > set hive.stats.collect.tablekeys=false ; > set hive.stats.fetch.column.stats=true; > set hive.stats.fetch.partition.stats=true ; > set hive.stats.reliable=false ; > set hive.compute.query.using.stats=true; > CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( > 'orc.compress'='NONE') ; > insert into calendar values (2010), (2011), (2012); > select * from calendar; > +----------------+ > | calendar.year | > +----------------+ > | 2010 | > | 2011 | > | 2012 | > +----------------+ > select max(year) from calendar; > | 2012 | > insert into calendar values (2013); > select * from calendar; > +----------------+ > | calendar.year | > +----------------+ > | 2010 | > | 2011 | > | 2012 | > | 2013 | > +----------------+ > select max(year) from calendar; > | 2013 | > insert into calendar values (2014); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > insert into calendar values (2015); > select max(year) from calendar; > | 2014 | > insert into calendar values (2016), (2017), (2018); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > select max(year) from calendar; > | 2018 | > {code}
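The repro shows {{max(year)}} being answered from column statistics that are no longer accurate. Until the flag handling is fixed, two session-level workarounds follow from the settings already used above (a sketch, not part of the patch):

```sql
-- Option 1: stop answering aggregates from stored column stats,
-- forcing the query to scan the data
set hive.compute.query.using.stats=false;
select max(year) from calendar;

-- Option 2: refresh the stats so the stats-based fast path is correct
analyze table calendar compute statistics for columns;
select max(year) from calendar;
```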
[jira] [Updated] (HIVE-11355) Hive on tez: memory manager for sort buffers (input/output) and operators
[ https://issues.apache.org/jira/browse/HIVE-11355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-11355: -- Attachment: HIVE-11355.7.patch Fix for a couple of failing tests. > Hive on tez: memory manager for sort buffers (input/output) and operators > - > > Key: HIVE-11355 > URL: https://issues.apache.org/jira/browse/HIVE-11355 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-11355.1.patch, HIVE-11355.2.patch, > HIVE-11355.3.patch, HIVE-11355.4.patch, HIVE-11355.5.patch, > HIVE-11355.6.patch, HIVE-11355.7.patch > > > We need to better manage the sort buffer allocations to ensure better > performance. Also, we need to provide configurations to certain operators to > stay within memory limits.
[jira] [Commented] (HIVE-12685) Remove invalid property in common/src/test/resources/hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061138#comment-15061138 ] Wei Zheng commented on HIVE-12685: -- [~ashutoshc] Can you take a look? > Remove invalid property in common/src/test/resources/hive-site.xml > -- > > Key: HIVE-12685 > URL: https://issues.apache.org/jira/browse/HIVE-12685 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12685.1.patch, HIVE-12685.2.patch, > HIVE-12685.3.patch > > > Currently there's such a property as below, which is obviously wrong > {code} > <property> > <name>javax.jdo.option.ConnectionDriverName</name> > <value>hive-site.xml</value> > <description>Override ConfVar defined in HiveConf</description> > </property> > {code}
[jira] [Updated] (HIVE-12685) Remove invalid property in common/src/test/resources/hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-12685: - Attachment: HIVE-12685.3.patch Patch 3, which removes the unnecessary common/src/test/resources/hive-site.xml > Remove invalid property in common/src/test/resources/hive-site.xml > -- > > Key: HIVE-12685 > URL: https://issues.apache.org/jira/browse/HIVE-12685 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12685.1.patch, HIVE-12685.2.patch, > HIVE-12685.3.patch > > > Currently there's such a property as below, which is obviously wrong > {code} > <property> > <name>javax.jdo.option.ConnectionDriverName</name> > <value>hive-site.xml</value> > <description>Override ConfVar defined in HiveConf</description> > </property> > {code}
[jira] [Commented] (HIVE-12685) Remove invalid property in common/src/test/resources/hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061131#comment-15061131 ] Wei Zheng commented on HIVE-12685: -- Currently we have different versions of hive-site.xml located everywhere {code} wzheng /tmp/hive $ find . -name hive-site.xml ./beeline/src/test/resources/hive-site.xml ./common/src/test/resources/hive-site.xml ./conf/hive-site.xml ./data/conf/hive-site.xml ./data/conf/llap/hive-site.xml ./data/conf/perf-reg/hive-site.xml ./data/conf/spark/standalone/hive-site.xml ./data/conf/spark/yarn-client/hive-site.xml ./data/conf/tez/hive-site.xml ./hcatalog/src/test/e2e/templeton/deployers/config/hive/hive-site.xml {code} The one causing the problem in this JIRA is common/src/test/resources/hive-site.xml. Instead of maintaining multiple copies, we should get rid of unnecessary hive-site.xml files as much as possible. So we should remove common/src/test/resources/hive-site.xml and just use the default one from data/conf (for TestHiveConf). > Remove invalid property in common/src/test/resources/hive-site.xml > -- > > Key: HIVE-12685 > URL: https://issues.apache.org/jira/browse/HIVE-12685 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12685.1.patch, HIVE-12685.2.patch > > > Currently there's such a property as below, which is obviously wrong > {code} > <property> > <name>javax.jdo.option.ConnectionDriverName</name> > <value>hive-site.xml</value> > <description>Override ConfVar defined in HiveConf</description> > </property> > {code}
[jira] [Commented] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function
[ https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061112#comment-15061112 ] Matt McCline commented on HIVE-12570: - [~hsubramaniyan] [~jpullokkaran] I'm getting "Expression not in GROUP BY key 'wr_return_quantity'] from (TOK_TABLE_OR_COL wr)" on TPCDS-49 on master but not on branch-1. It occurs before and after HIVE-12570 fix, so perhaps the fix is incomplete??? See https://hortonworks.jira.com/browse/BUG-48057 for the query text. > Incorrect error message Expression not in GROUP BY key thrown instead of > Invalid function > - > > Key: HIVE-12570 > URL: https://issues.apache.org/jira/browse/HIVE-12570 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Fix For: 2.1.0 > > Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, > HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch > > > {code} > explain create table avg_salary_by_supervisor3 as select average(key) as > key_avg from src group by value; > {code} > We get the following stack trace : > {code} > FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY > key 'key' > ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 > Expression not in GROUP BY key 'key' > org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not > in GROUP BY key 'key' > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862) > at > 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} > Instead of the above error message, it be more appropriate to throw the below > error : > ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid > function 'average' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061074#comment-15061074 ] Hive QA commented on HIVE-11927: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12778057/HIVE-11927.13.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 35 failed/errored test(s), 9966 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_expr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_hour org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_minute org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_parse_url org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_second org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_elt org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketizedhiveinputformat org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_elt org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query31 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query39 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query42 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query52 
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query64 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query66 org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query75 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_elt org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6372/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6372/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6372/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 35 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12778057 - PreCommit-HIVE-TRUNK-Build > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, > HIVE-11927.06.patch, HIVE-11927.07.patch, HIVE-11927.08.patch, > HIVE-11927.09.patch, HIVE-11927.10.patch, HIVE-11927.11.patch, > HIVE-11927.12.patch, HIVE-11927.13.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12697) Remove deprecated post option from webhcat test files
[ https://issues.apache.org/jira/browse/HIVE-12697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aswathy Chellammal Sreekumar updated HIVE-12697: Attachment: HIVE-12697.1.patch [~eugene.koifman] Please review the patch > Remove deprecated post option from webhcat test files > - > > Key: HIVE-12697 > URL: https://issues.apache.org/jira/browse/HIVE-12697 > Project: Hive > Issue Type: Test > Components: WebHCat >Affects Versions: 2.0.0 >Reporter: Aswathy Chellammal Sreekumar >Assignee: Aswathy Chellammal Sreekumar > Labels: test > Attachments: HIVE-12697.1.patch > > > Tests are still having the deprecated post option user.name. Need to remove > them and add the same to query string -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11388) there should only be 1 Initiator for compactions per Hive installation
[ https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061006#comment-15061006 ] Eugene Koifman commented on HIVE-11388: --- here is one general purpose mechanism: create table MUTEX_TABLE(keyname varchar(512) PRIMARY KEY) Then any process that requires a mutex needs to insert a row into this table (as long as everyone agrees on the key) and then do a "Select for update" on this row. If the process dies, "select for update" lock is automatically released. For example, if 2 Initiator instances want to schedule a compaction, each could 1. select * from MUTEX_TABLE where keyname="initiator" for update. If the "initiator" row is already there, only 1 will succeed. The other one, once it unblocks, will already see "this" compaction scheduled. 2. if select in 1 misses, then Initiator can insert "initiator" row and then goto 1. Because of PK only 1 will succeed. Since the keyname is arbitrary, it can be "db/table/partition" to coordinate Workers if necessary. A little primitive, but workable and avoids ZooKeeper and allows all parts of Compaction/HouseKeeping to run on multiple MS nodes. > there should only be 1 Initiator for compactions per Hive installation > -- > > Key: HIVE-11388 > URL: https://issues.apache.org/jira/browse/HIVE-11388 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > > org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs > inside the metastore service to manage compactions of ACID tables. There > should be exactly 1 instance of this thread (even with multiple Thrift > services). > This is documented in > https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration > but not enforced. 
> Should add enforcement, since more than 1 Initiator could cause concurrent > attempts to compact the same table/partition - which will not work.
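The comment above sketches a database-backed mutex: every contender tries to INSERT the agreed-on key into MUTEX_TABLE, the primary key guarantees only one insert succeeds, and the winner then holds the row with SELECT ... FOR UPDATE so the lock is released automatically if the process dies. The primary-key race can be illustrated with SQLite (which has no SELECT ... FOR UPDATE, so only the insert step is shown; the table and key names come from the comment, everything else is illustrative):

```python
import sqlite3

# Two connections to one shared in-memory database, standing in for two
# Initiator instances talking to the same metastore RDBMS.
conn1 = sqlite3.connect("file:mutexdemo?mode=memory&cache=shared", uri=True)
conn2 = sqlite3.connect("file:mutexdemo?mode=memory&cache=shared", uri=True)

conn1.execute("CREATE TABLE IF NOT EXISTS MUTEX_TABLE "
              "(keyname VARCHAR(512) PRIMARY KEY)")
conn1.commit()

def try_acquire(conn, key):
    """Return True if this connection inserted the key row, i.e. won the race.

    In a real RDBMS the winner would follow up with SELECT ... FOR UPDATE,
    so the row lock is released automatically if the process dies.
    """
    try:
        conn.execute("INSERT INTO MUTEX_TABLE (keyname) VALUES (?)", (key,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # Primary-key violation: somebody else already holds this key.
        conn.rollback()
        return False

first = try_acquire(conn1, "initiator")
second = try_acquire(conn2, "initiator")
print(first, second)
```

Because the key name is arbitrary (e.g. "db/table/partition"), the same pattern coordinates Workers as well as Initiators, without requiring ZooKeeper.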
[jira] [Commented] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
[ https://issues.apache.org/jira/browse/HIVE-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060973#comment-15060973 ] Ashutosh Chauhan commented on HIVE-12688: - makes sense +1 for revert. Lets explore alternatives in follow-up. > HIVE-11826 makes hive unusable in properly secured cluster > -- > > Key: HIVE-12688 > URL: https://issues.apache.org/jira/browse/HIVE-12688 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > Attachments: HIVE-12688.1.patch > > > HIVE-11826 makes a change to restrict connections to metastore to users who > belong to groups under 'hadoop.proxyuser.hive.groups'. > That property was only a meant to be a hadoop property, which controls what > users the hive user can impersonate. What this change is doing is to enable > use of that to also restrict who can connect to metastore server. This is new > functionality, not a bug fix. There is value to this functionality. > However, this change makes hive unusable in a properly secured cluster. If > 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run > Metastore and Hiveserver2 (instead of a very open "*"), then users will be > able to connect to metastore only from those hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
[ https://issues.apache.org/jira/browse/HIVE-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060939#comment-15060939 ] Aihua Xu commented on HIVE-12688: - I'm not able to work on that until next week. So please revert it and I will try to provide better approach for that. So agree with your approach. Sent from my iPhone > HIVE-11826 makes hive unusable in properly secured cluster > -- > > Key: HIVE-12688 > URL: https://issues.apache.org/jira/browse/HIVE-12688 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > Attachments: HIVE-12688.1.patch > > > HIVE-11826 makes a change to restrict connections to metastore to users who > belong to groups under 'hadoop.proxyuser.hive.groups'. > That property was only a meant to be a hadoop property, which controls what > users the hive user can impersonate. What this change is doing is to enable > use of that to also restrict who can connect to metastore server. This is new > functionality, not a bug fix. There is value to this functionality. > However, this change makes hive unusable in a properly secured cluster. If > 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run > Metastore and Hiveserver2 (instead of a very open "*"), then users will be > able to connect to metastore only from those hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12548) Hive metastore goes down in Kerberos,sentry enabled CDH5.5 cluster
[ https://issues.apache.org/jira/browse/HIVE-12548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-12548: Assignee: (was: Vaibhav Gumashta) > Hive metastore goes down in Kerberos,sentry enabled CDH5.5 cluster > -- > > Key: HIVE-12548 > URL: https://issues.apache.org/jira/browse/HIVE-12548 > Project: Hive > Issue Type: Bug > Components: Hive, HiveServer2 > Environment: RHEL 6.5 CLOUDERA CDH 5.5 >Reporter: narendra reddy ganesana > > [pool-3-thread-10]: Error occurred during processing of message. > java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: > Invalid status -128 > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:356) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.thrift.transport.TTransportException: Invalid status > -128 > at > org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232) > at > org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184) > at > 
org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > ... 10 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12075) add analyze command to explictly cache file metadata in HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060930#comment-15060930 ] Alan Gates commented on HIVE-12075: --- +1, looks good. > add analyze command to explictly cache file metadata in HBase metastore > --- > > Key: HIVE-12075 > URL: https://issues.apache.org/jira/browse/HIVE-12075 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12075.01.nogen.patch, HIVE-12075.01.patch, > HIVE-12075.02.patch, HIVE-12075.03.patch, HIVE-12075.04.patch, > HIVE-12075.nogen.patch, HIVE-12075.patch > > > ANALYZE TABLE (spec as usual) CACHE METADATA -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12646) beeline and HIVE CLI do not parse ; in quote properly
[ https://issues.apache.org/jira/browse/HIVE-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-12646: Assignee: (was: Vaibhav Gumashta) > beeline and HIVE CLI do not parse ; in quote properly > - > > Key: HIVE-12646 > URL: https://issues.apache.org/jira/browse/HIVE-12646 > Project: Hive > Issue Type: Bug > Components: CLI, Clients >Reporter: Yongzhi Chen > > Beeline and Cli have to escape ; in the quote while most other shell scripts > need not. For example: > in Beeline: > {noformat} > 0: jdbc:hive2://localhost:1> select ';' from tlb1; > select ';' from tlb1; > 15/12/10 10:45:26 DEBUG TSaslTransport: writing data length: 115 > 15/12/10 10:45:26 DEBUG TSaslTransport: CLIENT: reading data length: 3403 > Error: Error while compiling statement: FAILED: ParseException line 1:8 > cannot recognize input near '' ' > {noformat} > while in mysql shell: > {noformat} > mysql> SELECT CONCAT(';', 'foo') FROM test limit 3; > ++ > | ;foo | > | ;foo | > | ;foo | > ++ > 3 rows in set (0.00 sec) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
[ https://issues.apache.org/jira/browse/HIVE-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060922#comment-15060922 ] Thejas M Nair commented on HIVE-12688: -- None of the failures are related, these are failing in previous test runs as well. Looks like we really need some cleanup! Can someone please review it ? I think its simpler to commit this small patch and follow up in another jira to implement this feature without the regression. > HIVE-11826 makes hive unusable in properly secured cluster > -- > > Key: HIVE-12688 > URL: https://issues.apache.org/jira/browse/HIVE-12688 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > Attachments: HIVE-12688.1.patch > > > HIVE-11826 makes a change to restrict connections to metastore to users who > belong to groups under 'hadoop.proxyuser.hive.groups'. > That property was only a meant to be a hadoop property, which controls what > users the hive user can impersonate. What this change is doing is to enable > use of that to also restrict who can connect to metastore server. This is new > functionality, not a bug fix. There is value to this functionality. > However, this change makes hive unusable in a properly secured cluster. If > 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run > Metastore and Hiveserver2 (instead of a very open "*"), then users will be > able to connect to metastore only from those hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11355) Hive on tez: memory manager for sort buffers (input/output) and operators
[ https://issues.apache.org/jira/browse/HIVE-11355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-11355: -- Attachment: HIVE-11355.6.patch > Hive on tez: memory manager for sort buffers (input/output) and operators > - > > Key: HIVE-11355 > URL: https://issues.apache.org/jira/browse/HIVE-11355 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-11355.1.patch, HIVE-11355.2.patch, > HIVE-11355.3.patch, HIVE-11355.4.patch, HIVE-11355.5.patch, HIVE-11355.6.patch > > > We need to better manage the sort buffer allocations to ensure better > performance. Also, we need to provide configurations to certain operators to > stay within memory limits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
[ https://issues.apache.org/jira/browse/HIVE-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060896#comment-15060896 ] Hive QA commented on HIVE-12688: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777933/HIVE-12688.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9947 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6371/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6371/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6371/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777933 - PreCommit-HIVE-TRUNK-Build > HIVE-11826 makes hive unusable in properly secured cluster > -- > > Key: HIVE-12688 > URL: https://issues.apache.org/jira/browse/HIVE-12688 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > Attachments: HIVE-12688.1.patch > > > HIVE-11826 makes a change to restrict connections to metastore to users who > belong to groups under 'hadoop.proxyuser.hive.groups'. > That property was only a meant to be a hadoop property, which controls what > users the hive user can impersonate. What this change is doing is to enable > use of that to also restrict who can connect to metastore server. This is new > functionality, not a bug fix. There is value to this functionality. > However, this change makes hive unusable in a properly secured cluster. If > 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run > Metastore and Hiveserver2 (instead of a very open "*"), then users will be > able to connect to metastore only from those hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12675) PerfLogger should log performance metrics at debug level
[ https://issues.apache.org/jira/browse/HIVE-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060886#comment-15060886 ] Laljo John Pullokkaran commented on HIVE-12675: --- +1 conditional on clean QA run & adding documentation on perf logger log level. > PerfLogger should log performance metrics at debug level > > > Key: HIVE-12675 > URL: https://issues.apache.org/jira/browse/HIVE-12675 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12675.1.patch > > > As more and more subcomponents of Hive (Tez, Optimizer) etc are using > PerfLogger to track the performance metrics, it will be more meaningful to > set the PerfLogger logging level to DEBUG. Otherwise, we will print the > performance metrics unnecessarily for each and every query if the underlying > subcomponent does not control the PerfLogging via a parameter on its own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12667) Proper fix for HIVE-12473
[ https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-12667: -- Attachment: HIVE-12667.1.patch re-upload to trigger run > Proper fix for HIVE-12473 > - > > Key: HIVE-12667 > URL: https://issues.apache.org/jira/browse/HIVE-12667 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-12667.1.patch, HIVE-12667.1.patch > > > HIVE-12473 has added an incorrect comment and also lacks a test case. > Benefits of this fix: >* Does not say: "Probably doesn't work" >* Does not use grammar like "subquery columns and such" >* Adds test cases, that let you verify the fix >* Doesn't rely on certain structure of key expr, just takes the type at > compile time >* Doesn't require an additional walk of each key expression >* Shows the type used in explain -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-12695) LLAP: use somebody else's cluster
[ https://issues.apache.org/jira/browse/HIVE-12695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-12695: --- Assignee: Sergey Shelukhin > LLAP: use somebody else's cluster > - > > Key: HIVE-12695 > URL: https://issues.apache.org/jira/browse/HIVE-12695 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12695.patch > > > For non-HS2 case cluster sharing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12667) Proper fix for HIVE-12473
[ https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060807#comment-15060807 ] Vikram Dixit K commented on HIVE-12667: --- I guess the string-string could be special-cased to avoid some unnecessary calls. Otherwise the code looks good to me. +1 > Proper fix for HIVE-12473 > - > > Key: HIVE-12667 > URL: https://issues.apache.org/jira/browse/HIVE-12667 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-12667.1.patch > > > HIVE-12473 has added an incorrect comment and also lacks a test case. > Benefits of this fix: >* Does not say: "Probably doesn't work" >* Does not use grammar like "subquery columns and such" >* Adds test cases, that let you verify the fix >* Doesn't rely on certain structure of key expr, just takes the type at > compile time >* Doesn't require an additional walk of each key expression >* Shows the type used in explain -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12695) LLAP: use somebody else's cluster
[ https://issues.apache.org/jira/browse/HIVE-12695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12695: Attachment: HIVE-12695.patch [~gopalv] does this make sense? @user:instance uses that user's instance. > LLAP: use somebody else's cluster > - > > Key: HIVE-12695 > URL: https://issues.apache.org/jira/browse/HIVE-12695 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > Attachments: HIVE-12695.patch > > > For non-HS2 case cluster sharing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12694) LLAP: Slider destroy semantics require force
[ https://issues.apache.org/jira/browse/HIVE-12694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060761#comment-15060761 ] Vikram Dixit K commented on HIVE-12694: --- +1 LGTM. > LLAP: Slider destroy semantics require force > > > Key: HIVE-12694 > URL: https://issues.apache.org/jira/browse/HIVE-12694 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-12694.1.patch > > > {code} > 2015-12-16 20:10:55,118 [main] ERROR main.ServiceLauncher - Destroy will > permanently delete directories and registries. Reissue this command with the > --force option if you want to proceed. > {code} > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12694) LLAP: Slider destroy semantics require force
[ https://issues.apache.org/jira/browse/HIVE-12694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-12694: --- Description: {code} 2015-12-16 20:10:55,118 [main] ERROR main.ServiceLauncher - Destroy will permanently delete directories and registries. Reissue this command with the --force option if you want to proceed. {code} NO PRECOMMIT TESTS was: {code} 2015-12-16 20:10:55,118 [main] ERROR main.ServiceLauncher - Destroy will permanently delete directories and registries. Reissue this command with the --force option if you want to proceed. {code} > LLAP: Slider destroy semantics require force > > > Key: HIVE-12694 > URL: https://issues.apache.org/jira/browse/HIVE-12694 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-12694.1.patch > > > {code} > 2015-12-16 20:10:55,118 [main] ERROR main.ServiceLauncher - Destroy will > permanently delete directories and registries. Reissue this command with the > --force option if you want to proceed. > {code} > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12694) LLAP: Slider destroy semantics require force
[ https://issues.apache.org/jira/browse/HIVE-12694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-12694: --- Attachment: HIVE-12694.1.patch > LLAP: Slider destroy semantics require force > > > Key: HIVE-12694 > URL: https://issues.apache.org/jira/browse/HIVE-12694 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-12694.1.patch > > > {code} > 2015-12-16 20:10:55,118 [main] ERROR main.ServiceLauncher - Destroy will > permanently delete directories and registries. Reissue this command with the > --force option if you want to proceed. > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.
[ https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12353: -- Description: One of the things that this method does is delete entries from TXN_COMPONENTS for partition that it was trying to compact. This causes Aborted transactions in TXNS to become empty according to CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be deleted. Once they are deleted, data that belongs to these txns is deemed committed... We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) states. We should also not delete then entry from markedCleaned() We'll have separate process that cleans 'f' and 's' records after X minutes (or after > N records for a given partition exist). This allows SHOW COMPACTIONS to show some history info and how many times compaction failed on a given partition (subject to retention interval) so that we don't have to call markCleaned() on Compactor failures at the same time preventing Compactor to constantly getting stuck on the same bad partition/table. Ideally we'd want to include END_TIME field. was: One of the things that this method does is delete entries from TXN_COMPONENTS for partition that it was trying to compact. This causes Aborted transactions in TXNS to become empty according to CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be delete. We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) states. We should also not delete then entry from markedCleaned() We'll have separate process that cleans 'f' and 's' records after X minutes (or after > N records for a given partition exist). 
This allows SHOW COMPACTIONS to show some history info and how many times compaction failed on a given partition (subject to retention interval) so that we don't have to call markCleaned() on Compactor failures at the same time preventing Compactor to constantly getting stuck on the same bad partition/table. Ideally we'd want to include END_TIME field. > When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it > should not. > --- > > Key: HIVE-12353 > URL: https://issues.apache.org/jira/browse/HIVE-12353 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Blocker > > One of the things that this method does is delete entries from TXN_COMPONENTS > for partition that it was trying to compact. > This causes Aborted transactions in TXNS to become empty according to > CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be > deleted. > Once they are deleted, data that belongs to these txns is deemed committed... > We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) > states. We should also not delete then entry from markedCleaned() > We'll have separate process that cleans 'f' and 's' records after X minutes > (or after > N records for a given partition exist). > This allows SHOW COMPACTIONS to show some history info and how many times > compaction failed on a given partition (subject to retention interval) so > that we don't have to call markCleaned() on Compactor failures at the same > time preventing Compactor to constantly getting stuck on the same bad > partition/table. > Ideally we'd want to include END_TIME field. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
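The proposal above is a small state-machine change: instead of markedCleaned() deleting the COMPACTION_QUEUE entry, a terminal 'f' (failed) or 's' (success) state plus an END_TIME is recorded, and a separate cleaner purges old terminal records by retention interval or per-partition count. The following is a hypothetical in-memory sketch of that retention policy, not Hive's actual implementation (all names are illustrative):

```python
import time

# Stand-in for COMPACTION_QUEUE rows kept in terminal states.
queue = []  # each entry: {"partition": str, "state": str, "end_time": float}

def mark_finished(partition, succeeded):
    # Instead of deleting the row, record a terminal state and END_TIME,
    # so SHOW COMPACTIONS can report history and failure counts.
    queue.append({
        "partition": partition,
        "state": "s" if succeeded else "f",
        "end_time": time.time(),
    })

def purge_history(retention_secs, max_per_partition):
    # Separate cleanup process: drop terminal records older than the
    # retention interval, or beyond N records for a given partition.
    cutoff = time.time() - retention_secs
    kept, per_partition = [], {}
    for rec in sorted(queue, key=lambda r: -r["end_time"]):  # newest first
        n = per_partition.get(rec["partition"], 0)
        if rec["end_time"] >= cutoff and n < max_per_partition:
            kept.append(rec)
            per_partition[rec["partition"]] = n + 1
    queue[:] = kept

mark_finished("db.t/p=1", succeeded=False)
mark_finished("db.t/p=1", succeeded=False)
mark_finished("db.t/p=2", succeeded=True)
purge_history(retention_secs=3600, max_per_partition=1)
# after the purge only the newest record per partition remains
```

Keeping the records also lets the Initiator skip a partition that has failed repeatedly, instead of getting stuck retrying the same bad table/partition.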
[jira] [Commented] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
[ https://issues.apache.org/jira/browse/HIVE-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060683#comment-15060683 ] Thejas M Nair commented on HIVE-12688: -- I think it's better to roll it out and put in a proper fix when it's ready. As it affects only a small number of lines, adding it back with a proper fix should be straightforward. Do you agree, [~aihuaxu]? > HIVE-11826 makes hive unusable in properly secured cluster > -- > > Key: HIVE-12688 > URL: https://issues.apache.org/jira/browse/HIVE-12688 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > Attachments: HIVE-12688.1.patch > > > HIVE-11826 makes a change to restrict connections to metastore to users who > belong to groups under 'hadoop.proxyuser.hive.groups'. > That property was only meant to be a hadoop property, which controls what > users the hive user can impersonate. What this change is doing is to enable > use of that to also restrict who can connect to metastore server. This is new > functionality, not a bug fix. There is value to this functionality. > However, this change makes hive unusable in a properly secured cluster. If > 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run > Metastore and Hiveserver2 (instead of a very open "*"), then users will be > able to connect to metastore only from those hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060603#comment-15060603 ] Pengcheng Xiong commented on HIVE-12663: [~ekoifman], pushed to branch-2.0 just now. Thanks. > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 1.2.1 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, > HIVE-12663.03.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12663: --- Fix Version/s: 2.0.0 > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 1.2.1 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.0.0, 2.1.0 > > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, > HIVE-12663.03.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
[ https://issues.apache.org/jira/browse/HIVE-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060580#comment-15060580 ] Thejas M Nair commented on HIVE-12688: -- [~sershe] I was thinking it is better to keep the release clear of blockers to avoid issues. But we can give a couple of days for a better fix if you are OK with that (as the release manager for 2.0.0). It depends on the cycles someone has to provide to fix the feature to prevent this regression. If we make the change to roll back this feature, there is not too much pressure on anyone working on this. [~aihuaxu] What do you prefer? Would you have cycles to fix the regression soon? Or would you prefer adding this feature back again after this patch to roll it out (it gives you more time that way)? > HIVE-11826 makes hive unusable in properly secured cluster > -- > > Key: HIVE-12688 > URL: https://issues.apache.org/jira/browse/HIVE-12688 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > Attachments: HIVE-12688.1.patch > > > HIVE-11826 makes a change to restrict connections to metastore to users who > belong to groups under 'hadoop.proxyuser.hive.groups'. > That property was only meant to be a hadoop property, which controls what > users the hive user can impersonate. What this change is doing is to enable > use of that to also restrict who can connect to metastore server. This is new > functionality, not a bug fix. There is value to this functionality. > However, this change makes hive unusable in a properly secured cluster. If > 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run > Metastore and Hiveserver2 (instead of a very open "*"), then users will be > able to connect to metastore only from those hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060581#comment-15060581 ] Eugene Koifman commented on HIVE-12663: --- [~pxiong] could you commit to 2.0 as well please to maintain parity > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 1.2.1 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, > HIVE-12663.03.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12692) Make use of the Tez HadoopShim in TaskRunner usage
[ https://issues.apache.org/jira/browse/HIVE-12692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-12692: -- Attachment: HIVE-12692.1.txt > Make use of the Tez HadoopShim in TaskRunner usage > -- > > Key: HIVE-12692 > URL: https://issues.apache.org/jira/browse/HIVE-12692 > Project: Hive > Issue Type: Sub-task > Components: llap >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-12692.1.txt > > > TEZ-2910 adds shims for Hadoop to make use of caller context and other > changing hadoop APIs. Hive usage of TezTaskRunner needs to work with this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11865) Disable Hive PPD optimizer when CBO has optimized the plan
[ https://issues.apache.org/jira/browse/HIVE-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11865: --- Attachment: HIVE-11865.03.patch The new patch contains the following parts: - Disabling Hive PPD. It was just necessary to keep a small part of the code that is responsible for pushing Filter predicates to TableScan operators (SimplePredicatePushDown). - Disabling Hive inference for _isnotnull_ predicates on equi-join inputs. This was done in SemanticAnalyzer, and it is not necessary anymore when we run purely through Calcite. - It introduces a new rule in Calcite that pushes Filter through the Sort operator. This was present in Hive, but it was missing on the Calcite side. - It includes logic related to pushing Filter down when the return path was on. This should have been added when HIVE-0 went in, but it was difficult to detect as Hive PPD was doing the work for us. I already went through the changes in the q files: they are either changes in the order of Filter predicate factors, or removal of redundant _isnotnull_ factors. I will post the patch to RB for review. [~jpullokkaran], [~ashutoshc], could you take a look? Thanks. > Disable Hive PPD optimizer when CBO has optimized the plan > -- > > Key: HIVE-11865 > URL: https://issues.apache.org/jira/browse/HIVE-11865 > Project: Hive > Issue Type: Bug > Components: CBO, Logical Optimizer >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11865.01.patch, HIVE-11865.02.patch, > HIVE-11865.02.patch, HIVE-11865.03.patch, HIVE-11865.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
[ https://issues.apache.org/jira/browse/HIVE-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060551#comment-15060551 ] Sergey Shelukhin commented on HIVE-12688: - I think it makes sense to keep this as a blocker. If in a while the proper fix is not in sight, we can roll back the regression and postpone a better fix to a future version. [~aihuaxu] how hard would it be to make the better fix? [~thejas] do you think we should rather roll back now? > HIVE-11826 makes hive unusable in properly secured cluster > -- > > Key: HIVE-12688 > URL: https://issues.apache.org/jira/browse/HIVE-12688 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > Attachments: HIVE-12688.1.patch > > > HIVE-11826 makes a change to restrict connections to metastore to users who > belong to groups under 'hadoop.proxyuser.hive.groups'. > That property was only meant to be a hadoop property, which controls what > users the hive user can impersonate. What this change is doing is to enable > use of that to also restrict who can connect to metastore server. This is new > functionality, not a bug fix. There is value to this functionality. > However, this change makes hive unusable in a properly secured cluster. If > 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run > Metastore and Hiveserver2 (instead of a very open "*"), then users will be > able to connect to metastore only from those hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12683) Does Tez run slower than hive on larger dataset (~2.5 TB)?
[ https://issues.apache.org/jira/browse/HIVE-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060525#comment-15060525 ] Gopal V commented on HIVE-12683: The known 0.4.x OOMs were during split-generation for uncompressed text files (HIVE-10746) or when combine inputformat is used on S3 (HADOOP-11584). > Does Tez run slower than hive on larger dataset (~2.5 TB)? > -- > > Key: HIVE-12683 > URL: https://issues.apache.org/jira/browse/HIVE-12683 > Project: Hive > Issue Type: Bug >Reporter: rohit garg > > We have started to look into testing tez query engine. From initial results, > we are getting 30% performance boost over Hive on smaller data set(1-10 GB) > but Hive starts to perform better than Tez as data size increases. Like when > we run a hive query with Tez on about 2.3 TB worth of data, it performs worse > than hive alone.(~20% less performance) Details are in the post below. > On a cluster with 1.3 TB RAM, I set the following property : > set tez.task.resource.memory.mb=1; set tez.am.resource.memory.mb=59205; > set tez.am.launch.cmd-opts =-Xmx47364m; set hive.tez.container.size=59205; > set hive.tez.java.opts=-Xmx47364m; set tez.am.grouping.max-size=3670016; > Is it normal or I am missing some property / not configuring some property > properly? Also, I am using an older version of Tez as of now. Could that be > the issue too? I still have to bootstrap latest version of Tez on EMR and > test it and see if that could do any better. > Thought of asking here too > http://www.jwplayer.com/blog/hive-with-tez-on-emr/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11775) Implement limit push down through union all in CBO
[ https://issues.apache.org/jira/browse/HIVE-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11775: --- Attachment: HIVE-11775.10.patch > Implement limit push down through union all in CBO > -- > > Key: HIVE-11775 > URL: https://issues.apache.org/jira/browse/HIVE-11775 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11775.01.patch, HIVE-11775.02.patch, > HIVE-11775.03.patch, HIVE-11775.04.patch, HIVE-11775.05.patch, > HIVE-11775.06.patch, HIVE-11775.07.patch, HIVE-11775.08.patch, > HIVE-11775.09.patch, HIVE-11775.10.patch > > > Enlightened by HIVE-11684 (Kudos to [~jcamachorodriguez]), we can actually > push limit down through union all, which reduces the intermediate number of > rows in union branches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
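The limit-push-down-through-union-all idea described above can be illustrated outside Hive. This is a hedged sketch in plain Java streams, not Hive or Calcite code: a LIMIT n sitting above a UNION ALL can be copied into each branch (while keeping the outer LIMIT), so each branch feeds at most n rows into the union instead of its full output.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative model of the rewrite: two "union branches" are lists,
// UNION ALL is Stream.concat, and LIMIT is Stream.limit.
class LimitPushDownSketch {

    // Original plan shape: materialize the whole union, then limit.
    static List<Integer> unionAllThenLimit(List<Integer> a, List<Integer> b, int n) {
        return Stream.concat(a.stream(), b.stream())
                     .limit(n)
                     .collect(Collectors.toList());
    }

    // Rewritten plan shape: limit each branch first (each branch can
    // contribute at most n rows), and keep the outer limit to enforce
    // the final row count.
    static List<Integer> limitPushedIntoBranches(List<Integer> a, List<Integer> b, int n) {
        return Stream.concat(a.stream().limit(n), b.stream().limit(n))
                     .limit(n)
                     .collect(Collectors.toList());
    }
}
```

Without an ORDER BY, a SQL LIMIT may pick any rows, so in general only the cardinality of the two plans is guaranteed to match; in this ordered-stream model the results are identical.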
[jira] [Commented] (HIVE-12683) Does Tez run slower than hive on larger dataset (~2.5 TB)?
[ https://issues.apache.org/jira/browse/HIVE-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060484#comment-15060484 ] rohit garg commented on HIVE-12683: --- Thanks Hitesh for the inputs. I am going to try the recommended settings and update the results here. The AM size was 1024MB. I was just using the bootstrap script provided by Amazon for initial testing. They had an older version of Tez (I think 0.4 or 0.5). > Does Tez run slower than hive on larger dataset (~2.5 TB)? > -- > > Key: HIVE-12683 > URL: https://issues.apache.org/jira/browse/HIVE-12683 > Project: Hive > Issue Type: Bug >Reporter: rohit garg > > We have started to look into testing tez query engine. From initial results, > we are getting 30% performance boost over Hive on smaller data set(1-10 GB) > but Hive starts to perform better than Tez as data size increases. Like when > we run a hive query with Tez on about 2.3 TB worth of data, it performs worse > than hive alone.(~20% less performance) Details are in the post below. > On a cluster with 1.3 TB RAM, I set the following property : > set tez.task.resource.memory.mb=1; set tez.am.resource.memory.mb=59205; > set tez.am.launch.cmd-opts =-Xmx47364m; set hive.tez.container.size=59205; > set hive.tez.java.opts=-Xmx47364m; set tez.am.grouping.max-size=3670016; > Is it normal or I am missing some property / not configuring some property > properly? Also, I am using an older version of Tez as of now. Could that be > the issue too? I still have to bootstrap latest version of Tez on EMR and > test it and see if that could do any better. > Thought of asking here too > http://www.jwplayer.com/blog/hive-with-tez-on-emr/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12683) Does Tez run slower than hive on larger dataset (~2.5 TB)?
[ https://issues.apache.org/jira/browse/HIVE-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060465#comment-15060465 ] Hitesh Shah commented on HIVE-12683: The Tez AM resource sizing has no relation to the task container sizing. That said, for various benchmarks done in the past, I don't believe anyone has needed to go beyond 16GB for the Tez AM for very large DAGs. [~rohitgarg1989] What was the AM size configured to when the OOM happened? If you are running a version older than Tez 0.7.0, there were some memory issues that require a large AM size, i.e. large being say 16 GB, but for 0.7.0 and higher, even 4 GB should be sufficient for a decent sized DAG. You can set it to 8 GB to be safe for now, with Xmx say 6.4 GB, and that should be sufficient. If you still hit an OOM with 8 GB, a jira against Tez with the heap dump would be helpful. [~gopalv] anything to add? any configs that need to be tuned / turned off for Hive that ends up using more memory in the AM? Any implicit caching of splits, etc? > Does Tez run slower than hive on larger dataset (~2.5 TB)? > -- > > Key: HIVE-12683 > URL: https://issues.apache.org/jira/browse/HIVE-12683 > Project: Hive > Issue Type: Bug >Reporter: rohit garg > > We have started to look into testing tez query engine. From initial results, > we are getting 30% performance boost over Hive on smaller data set(1-10 GB) > but Hive starts to perform better than Tez as data size increases. Like when > we run a hive query with Tez on about 2.3 TB worth of data, it performs worse > than hive alone.(~20% less performance) Details are in the post below. 
> On a cluster with 1.3 TB RAM, I set the following property : > set tez.task.resource.memory.mb=1; set tez.am.resource.memory.mb=59205; > set tez.am.launch.cmd-opts =-Xmx47364m; set hive.tez.container.size=59205; > set hive.tez.java.opts=-Xmx47364m; set tez.am.grouping.max-size=3670016; > Is it normal or I am missing some property / not configuring some property > properly? Also, I am using an older version of Tez as of now. Could that be > the issue too? I still have to bootstrap latest version of Tez on EMR and > test it and see if that could do any better. > Thought of asking here too > http://www.jwplayer.com/blog/hive-with-tez-on-emr/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
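Hitesh's sizing advice above can be written in the same `set`-statement form used in the issue. This is a hedged sketch, not a tuned configuration: the numbers are illustrative round figures (an ~8 GB AM sized independently of the task containers, with Xmx around 80% of each allocation), and the right values depend on the cluster and workload.

```sql
-- Size the Tez AM independently of task containers (~8 GB per the
-- comment above), rather than mirroring the 59 GB container settings.
set tez.am.resource.memory.mb=8192;
set tez.am.launch.cmd-opts=-Xmx6554m;
-- Size task containers for the work a single task does.
set hive.tez.container.size=4096;
set hive.tez.java.opts=-Xmx3277m;
```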
[jira] [Commented] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
[ https://issues.apache.org/jira/browse/HIVE-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060409#comment-15060409 ] Thejas M Nair commented on HIVE-12688: -- Yeah, looks like the CDH version of Hive has been using this property to restrict access. This is not old behavior of Apache Hive. This new feature is not a pattern commonly seen in the hadoop ecosystem. In the case of HDFS, for example, access is restricted based on file permissions and not on a user group setting. To secure metastore access, you can already use storage based authorization. I am fine with this feature being added. However, the way it is implemented right now makes hive not work if hadoop.proxyuser.hive.hosts is properly set. I am not sure why CDH users didn't face this issue; I assume cloudera manager might not be securing this for the clusters. I don't think we can ship Hive 2.0.0 in this form as it is a major regression. If you can change the implementation to fix this issue, please create a follow-up jira with a patch. I created this patch to roll back the change so that we don't block the 2.0.0 release. > HIVE-11826 makes hive unusable in properly secured cluster > -- > > Key: HIVE-12688 > URL: https://issues.apache.org/jira/browse/HIVE-12688 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > Attachments: HIVE-12688.1.patch > > > HIVE-11826 makes a change to restrict connections to metastore to users who > belong to groups under 'hadoop.proxyuser.hive.groups'. > That property was only meant to be a hadoop property, which controls what > users the hive user can impersonate. What this change is doing is to enable > use of that to also restrict who can connect to metastore server. This is new > functionality, not a bug fix. There is value to this functionality. > However, this change makes hive unusable in a properly secured cluster. 
If > 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run > Metastore and Hiveserver2 (instead of a very open "*"), then users will be > able to connect to metastore only from those hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
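For reference, the two properties discussed in this issue live in Hadoop's core-site.xml. The following is a minimal sketch of the "properly secured" shape described above; the hostnames and group name are placeholders, not values from the issue.

```xml
<!-- Restrict where the hive user may impersonate from: only the hosts
     actually running the Metastore and HiveServer2, not "*". -->
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>metastore-host.example.com,hs2-host.example.com</value>
</property>
<!-- Groups whose members the hive user may impersonate. -->
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>hadoop-users</value>
</property>
```

The regression described here is that HIVE-11826 reused these impersonation settings as a metastore connection whitelist, so tightening the host list (good hygiene for impersonation) locked ordinary clients out of the metastore.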
[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11927: --- Attachment: (was: HIVE-11927.13.patch) > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, > HIVE-11927.06.patch, HIVE-11927.07.patch, HIVE-11927.08.patch, > HIVE-11927.09.patch, HIVE-11927.10.patch, HIVE-11927.11.patch, > HIVE-11927.12.patch, HIVE-11927.13.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11927: --- Attachment: HIVE-11927.13.patch > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, > HIVE-11927.06.patch, HIVE-11927.07.patch, HIVE-11927.08.patch, > HIVE-11927.09.patch, HIVE-11927.10.patch, HIVE-11927.11.patch, > HIVE-11927.12.patch, HIVE-11927.13.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12683) Does Tez run slower than hive on larger dataset (~2.5 TB)?
[ https://issues.apache.org/jira/browse/HIVE-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060337#comment-15060337 ] rohit garg commented on HIVE-12683: --- Yeah, it was the application master. We read somewhere that the Tez AM memory and Xmx settings should be the same as the tez container. So, in one of our tests which ran (as mentioned in the blog), we did set tez.am.resource.memory.mb=59205; set tez.am.launch.cmd-opts =-Xmx47364m; We were tweaking the following properties mainly: set tez.task.resource.memory.mb set tez.am.resource.memory.mb set tez.am.launch.cmd-opts set hive.tez.container.size set hive.tez.java.opts set tez.am.grouping.max-size > Does Tez run slower than hive on larger dataset (~2.5 TB)? > -- > > Key: HIVE-12683 > URL: https://issues.apache.org/jira/browse/HIVE-12683 > Project: Hive > Issue Type: Bug >Reporter: rohit garg > > We have started to look into testing tez query engine. From initial results, > we are getting 30% performance boost over Hive on smaller data set(1-10 GB) > but Hive starts to perform better than Tez as data size increases. Like when > we run a hive query with Tez on about 2.3 TB worth of data, it performs worse > than hive alone.(~20% less performance) Details are in the post below. > On a cluster with 1.3 TB RAM, I set the following property : > set tez.task.resource.memory.mb=1; set tez.am.resource.memory.mb=59205; > set tez.am.launch.cmd-opts =-Xmx47364m; set hive.tez.container.size=59205; > set hive.tez.java.opts=-Xmx47364m; set tez.am.grouping.max-size=3670016; > Is it normal or I am missing some property / not configuring some property > properly? Also, I am using an older version of Tez as of now. Could that be > the issue too? I still have to bootstrap latest version of Tez on EMR and > test it and see if that could do any better. > Thought of asking here too > http://www.jwplayer.com/blog/hive-with-tez-on-emr/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12683) Does Tez run slower than hive on larger dataset (~2.5 TB)?
[ https://issues.apache.org/jira/browse/HIVE-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060302#comment-15060302 ] Hitesh Shah commented on HIVE-12683: It seems like the application master is running out of memory. What is the Tez AM being configured for in terms of memory and Xmx? > Does Tez run slower than hive on larger dataset (~2.5 TB)? > -- > > Key: HIVE-12683 > URL: https://issues.apache.org/jira/browse/HIVE-12683 > Project: Hive > Issue Type: Bug >Reporter: rohit garg > > We have started to look into testing tez query engine. From initial results, > we are getting 30% performance boost over Hive on smaller data set(1-10 GB) > but Hive starts to perform better than Tez as data size increases. Like when > we run a hive query with Tez on about 2.3 TB worth of data, it performs worse > than hive alone.(~20% less performance) Details are in the post below. > On a cluster with 1.3 TB RAM, I set the following property : > set tez.task.resource.memory.mb=1; set tez.am.resource.memory.mb=59205; > set tez.am.launch.cmd-opts =-Xmx47364m; set hive.tez.container.size=59205; > set hive.tez.java.opts=-Xmx47364m; set tez.am.grouping.max-size=3670016; > Is it normal or I am missing some property / not configuring some property > properly? Also, I am using an older version of Tez as of now. Could that be > the issue too? I still have to bootstrap latest version of Tez on EMR and > test it and see if that could do any better. > Thought of asking here too > http://www.jwplayer.com/blog/hive-with-tez-on-emr/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12691) Compute stats on hbase tables causes Zookeeper connection leaks.
[ https://issues.apache.org/jira/browse/HIVE-12691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060270#comment-15060270 ] Naveen Gangam commented on HIVE-12691: -- I was referring to these classes in the above comment: hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsPublisher.java hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsAggregator.java > Compute stats on hbase tables causes Zookeeper connection leaks. > > > Key: HIVE-12691 > URL: https://issues.apache.org/jira/browse/HIVE-12691 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Fix For: 1.3.0 > > > hive.stats.autogather defaults to true in newer hive releases which causes > stats to be collected on hbase-backed hive tables. > Using HTable APIs causes new zookeeper connections to be created. So if > HTable.close() is not called, the underlying ZK connection remains open as in > HIVE-12250. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12691) Compute stats on hbase tables causes Zookeeper connection leaks.
[ https://issues.apache.org/jira/browse/HIVE-12691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam resolved HIVE-12691. -- Resolution: Not A Problem Fix Version/s: 1.3.0 These classes have been deleted from Hive 1.3 and Hive 2.0 via HIVE-12005, so this is no longer an issue in the development releases; only the older releases have this issue. Closing the jira as no fixes are needed. > Compute stats on hbase tables causes Zookeeper connection leaks. > > > Key: HIVE-12691 > URL: https://issues.apache.org/jira/browse/HIVE-12691 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Fix For: 1.3.0 > > > hive.stats.autogather defaults to true in newer hive releases which causes > stats to be collected on hbase-backed hive tables. > Using HTable APIs causes new zookeeper connections to be created. So if > HTable.close() is not called, the underlying ZK connection remains open as in > HIVE-12250. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
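The leak described above comes down to a handle that owns a ZooKeeper connection not being closed on every path. The sketch below illustrates the deterministic-close pattern only; the affected HBaseStatsPublisher/HBaseStatsAggregator classes were removed by HIVE-12005, and TrackingTable is a made-up stand-in for org.apache.hadoop.hbase.client.HTable, not a real HBase class.

```java
// Hypothetical stand-in for an HTable-style handle whose close()
// releases an underlying ZooKeeper connection.
class TrackingTable implements AutoCloseable {
    boolean closed = false;

    void publishStats() {
        // In real code this would write a stats row and could throw.
    }

    @Override
    public void close() {
        closed = true; // releases the underlying ZK connection
    }
}

class StatsPublishExample {
    static TrackingTable publish() {
        TrackingTable handle = new TrackingTable();
        // try-with-resources guarantees close() runs even if
        // publishStats() throws, so the ZK connection cannot leak.
        try (TrackingTable table = handle) {
            table.publishStats();
        }
        return handle;
    }
}
```

The same effect can be had with a finally block; try-with-resources is simply harder to get wrong when multiple exit paths exist.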
[jira] [Commented] (HIVE-12541) SymbolicTextInputFormat should support the path with regex
[ https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060190#comment-15060190 ] Xiaowei Wang commented on HIVE-12541: - The symlink_text_input_format test case has been updated in the 2.1.0 version. There are still other failed test cases, but they seem unrelated to my patch. > SymbolicTextInputFormat should support the path with regex > --- > > Key: HIVE-12541 > URL: https://issues.apache.org/jira/browse/HIVE-12541 > Project: Hive > Issue Type: Improvement >Affects Versions: 0.14.0, 1.2.0, 1.2.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang > Fix For: 1.2.1 > > Attachments: HIVE-12541.1.patch, HIVE-12541.2.patch, > HIVE-12541.3.patch, HIVE-12541.4.patch > > > 1. In fact, SymbolicTextInputFormat supports the path with regex. I added some > test sql. > 2. But when using CombineHiveInputFormat to combine input files, it cannot > resolve the path with regex, so it will get a wrong result. I give an example > and fix the problem. > Table desc : > {noformat} > CREATE External TABLE `symlink_text_input_format`( > `key` string, > `value` string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'viewfs://nsX/user/hive/warehouse/symlink_text_input_format' > {noformat} > There is a link file in the dir > '/user/hive/warehouse/symlink_text_input_format', the content of the link > file is > {noformat} > viewfs://nsx/tmp/symlink* > {noformat} > it contains one path, and the path contains a regex! 
> Execute the sql : > {noformat} > set hive.rework.mapredwork = true ; > set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; > set mapred.min.split.size.per.rack= 0 ; > set mapred.min.split.size.per.node= 0 ; > set mapred.max.split.size= 0 ; > select count(*) from symlink_text_input_format ; > {noformat} > It will get a wrong result: 0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11828) beeline -f fails on scripts with tabs between column type and comment
[ https://issues.apache.org/jira/browse/HIVE-11828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060183#comment-15060183 ] Andrés Cordero commented on HIVE-11828: --- Same problem between column name and column type. > beeline -f fails on scripts with tabs between column type and comment > - > > Key: HIVE-11828 > URL: https://issues.apache.org/jira/browse/HIVE-11828 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 1.2.0 >Reporter: Krzysztof Adamski >Priority: Minor > > This issue was supposed to be resolved by > https://issues.apache.org/jira/browse/HIVE-6359 > However when invoking >create table test (id intCOMMENT 'test'); > the following error appears > beeline -f test.sql > -u"jdbc:hive2://localhost:1/default;principal=hive/FQDN@US-WEST-2.COMPUTE.INTERNAL" > scan complete in 4ms > Connecting to > jdbc:hive2://localhost:1/default;principal=hiveFQDN@US-WEST-2.COMPUTE.INTERNAL > Connected to: Apache Hive (version 1.1.0-cdh5.4.4) > Driver: Hive JDBC (version 1.1.0-cdh5.4.4) > Transaction isolation: TRANSACTION_REPEATABLE_READ > 0: jdbc:hive2://localhost:1/default> create table test (id intCOMMENT > 'test'); > Error: Error while compiling statement: FAILED: ParseException line 1:22 > cannot recognize input near 'intCOMMENT' ''test'' ')' in column type > (state=42000,code=4) > There is no problem when is between the columns e.g. > create table test (id int COMMENT 'test',id2 string COMMENT > 'test2'); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12683) Does Tez run slower than hive on larger dataset (~2.5 TB)?
[ https://issues.apache.org/jira/browse/HIVE-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060179#comment-15060179 ] rohit garg commented on HIVE-12683: --- I will update here with the results. > Does Tez run slower than hive on larger dataset (~2.5 TB)? > -- > > Key: HIVE-12683 > URL: https://issues.apache.org/jira/browse/HIVE-12683 > Project: Hive > Issue Type: Bug >Reporter: rohit garg > > We have started to look into testing tez query engine. From initial results, > we are getting 30% performance boost over Hive on smaller data set(1-10 GB) > but Hive starts to perform better than Tez as data size increases. Like when > we run a hive query with Tez on about 2.3 TB worth of data, it performs worse > than hive alone.(~20% less performance) Details are in the post below. > On a cluster with 1.3 TB RAM, I set the following property : > set tez.task.resource.memory.mb=1; set tez.am.resource.memory.mb=59205; > set tez.am.launch.cmd-opts =-Xmx47364m; set hive.tez.container.size=59205; > set hive.tez.java.opts=-Xmx47364m; set tez.am.grouping.max-size=3670016; > Is it normal or I am missing some property / not configuring some property > properly? Also, I am using an older version of Tez as of now. Could that be > the issue too? I still have to bootstrap latest version of Tez on EMR and > test it and see if that could do any better. > Thought of asking here too > http://www.jwplayer.com/blog/hive-with-tez-on-emr/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12683) Does Tez run slower than hive on larger dataset (~2.5 TB)?
[ https://issues.apache.org/jira/browse/HIVE-12683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060175#comment-15060175 ] rohit garg commented on HIVE-12683: --- Thanks for your inputs. I will try these changes and see if that would give me any performance boost over hive query engine. This was the OOM error I was getting before I tweaked memory settings : 0 FATAL [Socket Reader #1 for port 55739] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[Socket Reader #1 for port 55739,5,main] threw an Error. Shutting down now... java.lang.OutOfMemoryError: GC overhead limit exceeded at java.nio.ByteBuffer.allocate(ByteBuffer.java:331) at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1510) at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:750) at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:624) at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:595) 2015-12-07 20:31:32,859 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread java.lang.OutOfMemoryError: GC overhead limit exceeded 2015-12-07 20:31:30,590 WARN [IPC Server handler 0 on 55739] org.apache.hadoop.ipc.Server: IPC Server handler 0 on 55739, call heartbeat({ containerId=container_1449516549171_0001_01_000100, requestId=10184, startIndex=0, maxEventsToGet=0, taskAttemptId=null, eventCount=0 }), rpc version=2, client version=19, methodsFingerPrint=557389974 from 10.10.30.35:47028 Call#11165 Retry#0: error: java.lang.OutOfMemoryError: GC overhead limit exceeded java.lang.OutOfMemoryError: GC overhead limit exceeded at javax.security.auth.SubjectDomainCombiner.optimize(SubjectDomainCombiner.java:464) at javax.security.auth.SubjectDomainCombiner.combine(SubjectDomainCombiner.java:267) at java.security.AccessControlContext.goCombiner(AccessControlContext.java:499) at 
java.security.AccessControlContext.optimize(AccessControlContext.java:407) at java.security.AccessController.getContext(AccessController.java:501) at javax.security.auth.Subject.doAs(Subject.java:412) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) 2015-12-07 20:32:53,495 INFO [Thread-60] amazon.emr.metrics.MetricsSaver: Saved 4:3 records to /mnt/var/em/raw/i-782f08c8_20151207_7921_07921_raw.bin 2015-12-07 20:32:53,495 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye.. 2015-12-07 20:32:50,435 INFO [IPC Server handler 20 on 55739] org.apache.hadoop.ipc.Server: IPC Server handler 20 on 55739, call getTask(org.apache.tez.common.ContainerContext@409a6aa9), rpc version=2, client version=19, methodsFingerPrint=557389974 from 10.10.30.33:33644 Call#11094 Retry#0: error: java.io.IOException: java.lang.OutOfMemoryError: GC overhead limit exceeded java.io.IOException: java.lang.OutOfMemoryError: GC overhead limit exceeded 2015-12-07 20:32:29,117 WARN [IPC Server handler 23 on 55739] org.apache.hadoop.ipc.Server: IPC Server handler 23 on 55739, call getTask(org.apache.tez.common.ContainerContext@7c7e6992), rpc version=2, client version=19, methodsFingerPrint=557389974 from 10.10.30.38:44218 Call#11260 Retry#0: error: java.lang.OutOfMemoryError: GC overhead limit exceeded java.lang.OutOfMemoryError: GC overhead limit exceeded 2015-12-07 20:32:53,497 INFO [Thread-60] amazon.emr.metrics.MetricsSaver: Saved 1:1 records to /mnt/var/em/raw/i-782f08c8_20151207_7921_07921_raw.bin 2015-12-07 20:32:53,498 INFO [Thread-61] amazon.emr.metrics.MetricsSaver: Saved 1:1 records to /mnt/var/em/raw/i-782f08c8_20151207_7921_07921_raw.bin 2015-12-07 20:32:53,498 INFO [Thread-2] org.apache.tez.dag.app.DAGAppMaster: DAGAppMaster received a signal. 
Signaling TaskScheduler 2015-12-07 20:32:53,498 INFO [Thread-2] org.apache.tez.dag.app.rm.TaskSchedulerEventHandler: TaskScheduler notified that iSignalled was : true 2015-12-07 20:32:53,499 INFO [Thread-2] org.apache.tez.dag.history.HistoryEventHandler: Stopping HistoryEventHandler 2015-12-07 20:32:53,499 INFO [Thread-2] org.apache.tez.dag.history.recovery.RecoveryService: Stopping RecoveryService 2015-12-07 20:32:53,499 INFO [Thread-2] org.apache.tez.dag.history.recovery.RecoveryService: Closing Summary Stream 2015-12-07 20:32:53,499 INFO [LeaseRenewer:hadoop@10.10.30.148:9000] org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException > Does Tez run slower than hive on larger dataset (~2.5 TB)? > -- > > Key: HIVE-12683 > URL: https://issues.apache.org/jira/browse/HIVE-12683 > Project: Hive > Issue Type: Bug >Reporter: rohit garg > > We h
[jira] [Commented] (HIVE-11487) Add getNumPartitionsByFilter api in metastore api
[ https://issues.apache.org/jira/browse/HIVE-11487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060121#comment-15060121 ] Hive QA commented on HIVE-11487: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777985/HIVE-11487.03.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6370/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6370/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6370/ Messages: {noformat} This message was trimmed, see log for full details [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-common --- [INFO] Tests are skipped. [INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-common --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/common/target/hive-common-2.1.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-common --- [INFO] [INFO] --- maven-jar-plugin:2.2:test-jar (default) @ hive-common --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/common/target/hive-common-2.1.0-SNAPSHOT-tests.jar [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-common --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/common/target/hive-common-2.1.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-common/2.1.0-SNAPSHOT/hive-common-2.1.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/common/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-common/2.1.0-SNAPSHOT/hive-common-2.1.0-SNAPSHOT.pom [INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/common/target/hive-common-2.1.0-SNAPSHOT-tests.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-common/2.1.0-SNAPSHOT/hive-common-2.1.0-SNAPSHOT-tests.jar [INFO] [INFO] [INFO] Building Hive Serde 2.1.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-serde --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/serde/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/serde (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-serde --- [INFO] [INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-serde --- [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/serde/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/serde/src/gen/thrift/gen-javabean added. [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-serde --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-serde --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/serde/src/main/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-serde --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-serde --- [INFO] Compiling 406 source files to /data/hive-ptest/working/apache-github-source-source/serde/target/classes [WARNING] /data/hive-ptest/working/apache-github-source-source/serde/src/java/org/apache/hadoop/hive/serde2/SerDe.java: Some input files use or override a deprecated API. 
[WARNING] /data/hive-ptest/working/apache-github-source-source/serde/src/java/org/apache/hadoop/hive/serde2/SerDe.java: Recompile with -Xlint:deprecation for details. [WARNING] /data/hive-ptest/working/apache-github-source-source/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/AbstractPrimitiveLazyObjectInspector.java: Some input files use unchecked or unsafe operations. [WARNING] /data/hive-ptest/working/apache-github-source-source/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/AbstractPrimitiveLazyObjectInspector.java: Recompile with -Xlint:unchecked for details. [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hive-serde --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] Copying 2 resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-pl
[jira] [Commented] (HIVE-12541) SymbolicTextInputFormat should supports the path with regex
[ https://issues.apache.org/jira/browse/HIVE-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060086#comment-15060086 ] Hive QA commented on HIVE-12541: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777960/HIVE-12541.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9948 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6368/testReport Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6368/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6368/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777960 - PreCommit-HIVE-TRUNK-Build > SymbolicTextInputFormat should supports the path with regex > --- > > Key: HIVE-12541 > URL: https://issues.apache.org/jira/browse/HIVE-12541 > Project: Hive > Issue Type: Improvement >Affects Versions: 0.14.0, 1.2.0, 1.2.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang > Fix For: 1.2.1 > > Attachments: HIVE-12541.1.patch, HIVE-12541.2.patch, > HIVE-12541.3.patch, HIVE-12541.4.patch > > > 1. In fact, SymbolicTextInputFormat supports paths with a regex; I add some > test SQL. > 2. But when using CombineHiveInputFormat to combine input files, it cannot > resolve a path with a regex, so it will get a wrong result. I give an example > and fix the problem. > Table desc: > {noformat} > CREATE External TABLE `symlink_text_input_format`( > `key` string, > `value` string) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'viewfs://nsX/user/hive/warehouse/symlink_text_input_format' > {noformat} > There is a link file in the dir > '/user/hive/warehouse/symlink_text_input_format'; the content of the link > file is > {noformat} > viewfs://nsx/tmp/symlink* > {noformat} > It contains one path, and the path contains a regex!
> Execute the SQL: > {noformat} > set hive.rework.mapredwork = true ; > set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; > set mapred.min.split.size.per.rack= 0 ; > set mapred.min.split.size.per.node= 0 ; > set mapred.max.split.size= 0 ; > select count(*) from symlink_text_input_format ; > {noformat} > It returns a wrong result: 0
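The fix described in this issue amounts to expanding each line of the symlink file as a glob pattern before the paths reach the split-combining logic; the actual patch does this in Java against the Hadoop FileSystem, so the following is only an illustrative sketch, and the file names in it are hypothetical:

```python
import fnmatch

def expand_symlink_targets(link_lines, known_paths):
    """Expand each symlink-file line as a glob over a list of known paths.

    Without this expansion, a line like 'viewfs://nsx/tmp/symlink*' matches
    nothing as a literal path, which is how the combined-input query ends
    up counting 0 rows.
    """
    matched = []
    for pattern in link_lines:
        matched.extend(fnmatch.filter(known_paths, pattern))
    return matched

# Hypothetical files the pattern should resolve to:
paths = [
    "viewfs://nsx/tmp/symlink1.txt",
    "viewfs://nsx/tmp/symlink2.txt",
    "viewfs://nsx/tmp/other.txt",
]
print(expand_symlink_targets(["viewfs://nsx/tmp/symlink*"], paths))
# -> ['viewfs://nsx/tmp/symlink1.txt', 'viewfs://nsx/tmp/symlink2.txt']
```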
[jira] [Commented] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
[ https://issues.apache.org/jira/browse/HIVE-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059945#comment-15059945 ] Aihua Xu commented on HIVE-12688: --- You are right, this is a Hadoop property. It seems the property should not limit access to the metastore server, yet we were historically using it to limit access to Hive, and in our own version we somehow documented it as such. Is this the old behavior? Rather than changing it back, since that would reintroduce the issue of unauthorized access to Hive not being blocked, can we keep this behavior and rework it in a later version so that Hive access is blocked in the correct place elsewhere? > HIVE-11826 makes hive unusable in properly secured cluster > -- > > Key: HIVE-12688 > URL: https://issues.apache.org/jira/browse/HIVE-12688 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair >Priority: Blocker > Attachments: HIVE-12688.1.patch > > > HIVE-11826 makes a change to restrict connections to the metastore to users who > belong to groups under 'hadoop.proxyuser.hive.groups'. > That property was only meant to be a Hadoop property, which controls which > users the hive user can impersonate. What this change does is enable > use of that property to also restrict who can connect to the metastore server. This is new > functionality, not a bug fix. There is value to this functionality. > However, this change makes Hive unusable in a properly secured cluster. If > 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run > the Metastore and HiveServer2 (instead of a very open "*"), then users will be > able to connect to the metastore only from those hosts.
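For reference, the properties at issue live in Hadoop's core-site.xml. A sketch of the "properly secured" shape the description refers to, with placeholder host and group names (the values below are assumptions, not taken from this issue):

```xml
<!-- core-site.xml: hosts from which the hive user may act as a proxy,
     and groups it may impersonate. Host and group names are placeholders. -->
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>metastore-host.example.com,hs2-host.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>hive-users</value>
</property>
```

With hosts narrowed this way, the HIVE-11826 check would reject metastore connections from any other machine, which is exactly the breakage the report describes.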