[jira] [Created] (HIVE-16822) Document how to update the errata.txt file
Lefty Leverenz created HIVE-16822:
----------------------------------

Summary: Document how to update the errata.txt file
Key: HIVE-16822
URL: https://issues.apache.org/jira/browse/HIVE-16822
Project: Hive
Issue Type: Bug
Components: Documentation
Reporter: Lefty Leverenz

The wiki should explain when and how to update the errata.txt file. Details belong in How to Commit, and there should be a cross reference in How to Contribute so that non-committers will learn about errata.txt.

* [How To Commit -- Commit | https://cwiki.apache.org/confluence/display/Hive/HowToCommit#HowToCommit-Commit]
* [How To Contribute -- Contributing Your Work | https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-ContributingYourWork]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Created] (HIVE-15523) Fix broken "Builds" link on Hive home page
Lefty Leverenz created HIVE-15523:
----------------------------------

Summary: Fix broken "Builds" link on Hive home page
Key: HIVE-15523
URL: https://issues.apache.org/jira/browse/HIVE-15523
Project: Hive
Issue Type: Bug
Components: Website
Reporter: Lefty Leverenz

The Builds link in the side bar on hive.apache.org points to http://bigtop01.cloudera.org:8080/view/Hive/, which times out with the message "the server where this page is located isn't responding." This was reported by Laurel Hale (thanks, Laurel).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HIVE-15515) Remove the docs directory
Lefty Leverenz created HIVE-15515:
----------------------------------

Summary: Remove the docs directory
Key: HIVE-15515
URL: https://issues.apache.org/jira/browse/HIVE-15515
Project: Hive
Issue Type: Bug
Components: Documentation
Reporter: Lefty Leverenz

Hive xdocs have not been used since 2012. The docs directory only holds six xml documents, and their contents are in the wiki. It's past time to remove the docs directory from the Hive code.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HIVE-12841) Document existence of multiple hive-site.xml files
Lefty Leverenz created HIVE-12841:
----------------------------------

Summary: Document existence of multiple hive-site.xml files
Key: HIVE-12841
URL: https://issues.apache.org/jira/browse/HIVE-12841
Project: Hive
Issue Type: Bug
Components: Documentation
Reporter: Lefty Leverenz

The wiki's AdminManual Configuration doc discusses hive-site.xml without giving any filepaths or explaining which files override the others. This needs to be clarified, as well as the relationship between hive-site.xml and HiveConf.java.

* [AdminManual -- Configuration -- Configuring Hive | https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-ConfiguringHive]
* [AdminManual -- Configuration -- hive-site.xml and hive-default.xml.template | https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-hive-site.xmlandhive-default.xml.template]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HIVE-10486) Update wiki for switch from svn to git
Lefty Leverenz created HIVE-10486:
----------------------------------

Summary: Update wiki for switch from svn to git
Key: HIVE-10486
URL: https://issues.apache.org/jira/browse/HIVE-10486
Project: Hive
Issue Type: Bug
Reporter: Lefty Leverenz

The Hive wiki has many svn instructions that need to be changed to their git equivalents.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HIVE-10160) Give a warning when grouping or ordering by a constant column
Lefty Leverenz created HIVE-10160:
----------------------------------

Summary: Give a warning when grouping or ordering by a constant column
Key: HIVE-10160
URL: https://issues.apache.org/jira/browse/HIVE-10160
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Lefty Leverenz
Priority: Minor

To avoid confusion, a warning should be issued when users specify column positions instead of names in a GROUP BY or ORDER BY clause (unless hive.groupby.orderby.position.alias is set to true in Hive 0.11.0 or later).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HIVE-10124) Add iss...@hive.apache.org to Mailing Lists page
Lefty Leverenz created HIVE-10124:
----------------------------------

Summary: Add iss...@hive.apache.org to Mailing Lists page
Key: HIVE-10124
URL: https://issues.apache.org/jira/browse/HIVE-10124
Project: Hive
Issue Type: Bug
Reporter: Lefty Leverenz

Now that Hive has a separate mailing list for issue comments and QA messages, it needs to be added to the Mailing Lists page on the website (http://hive.apache.org/mailing_lists.html).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HIVE-9838) Add issues@hive to Mailing Lists page of Hive website
Lefty Leverenz created HIVE-9838:
---------------------------------

Summary: Add issues@hive to Mailing Lists page of Hive website
Key: HIVE-9838
URL: https://issues.apache.org/jira/browse/HIVE-9838
Project: Hive
Issue Type: Bug
Reporter: Lefty Leverenz

The new Hive mailing list, iss...@hive.apache.org, needs to be included on the Mailing Lists page of the website (http://hive.apache.org/mailing_lists.html).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-6977) Delete Hiveserver1
[ https://issues.apache.org/jira/browse/HIVE-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327240#comment-14327240 ]

Lefty Leverenz commented on HIVE-6977:
--------------------------------------

Was the commit to version 1.0.0 reverted, or should this issue have fix version 1.0.0 as well as 1.1.0?

Delete Hiveserver1
------------------

Key: HIVE-6977
URL: https://issues.apache.org/jira/browse/HIVE-6977
Project: Hive
Issue Type: Task
Components: JDBC, Server Infrastructure
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Labels: TODOC15
Fix For: 1.1.0
Attachments: HIVE-6977.1.patch, HIVE-6977.patch

See mailing list discussion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.
[ https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327190#comment-14327190 ]

Lefty Leverenz commented on HIVE-7100:
--------------------------------------

Doc note: This is documented in the wiki for DROP TABLE and ALTER TABLE DROP PARTITION.

* [Drop Table | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DropTable]
* [Drop Partitions | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DropPartitions]

I included DROP PARTITION based on deletePartitionData() in the patch -- please review my changes and correct anything that's not right. If it's okay, the TODOC14 label can be removed.

Users of hive should be able to specify skipTrash when dropping tables.
------------------------------------------------------------------------

Key: HIVE-7100
URL: https://issues.apache.org/jira/browse/HIVE-7100
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.13.0
Reporter: Ravi Prakash
Assignee: david serafini
Labels: TODOC14
Fix For: 0.14.0
Attachments: HIVE-7100.1.patch, HIVE-7100.10.patch, HIVE-7100.11.patch, HIVE-7100.2.patch, HIVE-7100.3.patch, HIVE-7100.4.patch, HIVE-7100.5.patch, HIVE-7100.8.patch, HIVE-7100.9.patch, HIVE-7100.patch

Users of our clusters are often running up against their quota limits because of Hive tables. When they drop tables, they have to then manually delete the files from HDFS using skipTrash. This is cumbersome and unnecessary. We should enable users to skipTrash directly when dropping tables. We should also be able to provide this functionality without polluting SQL syntax.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
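The trash-versus-purge behavior discussed in this issue can be sketched with ordinary local files (a hypothetical Python illustration of HDFS-trash-like semantics; `drop_file`, the `.Trash` layout, and the file names are invented for the example and are not Hive or HDFS code):

```python
import os
import shutil
import tempfile

def drop_file(path: str, trash_dir: str, skip_trash: bool = False) -> None:
    """Default drop moves the data into a trash directory (recoverable,
    but still counted against the owner's quota); skip_trash removes it
    immediately, analogous to skipping the trash when dropping a table."""
    if skip_trash:
        os.remove(path)  # gone for good, space freed at once
    else:
        os.makedirs(trash_dir, exist_ok=True)
        shutil.move(path, os.path.join(trash_dir, os.path.basename(path)))

base = tempfile.mkdtemp()
trash = os.path.join(base, ".Trash")
for name, skip in [("kept.db", False), ("purged.db", True)]:
    path = os.path.join(base, name)
    open(path, "w").close()  # stand-in for a table's data files
    drop_file(path, trash, skip_trash=skip)
print(sorted(os.listdir(trash)))  # ['kept.db'] -- only the default drop is recoverable
```

This is why users hit quota limits with the default behavior: trashed files still occupy space until the trash is emptied.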
[jira] [Updated] (HIVE-9556) create UDF to calculate the Levenshtein distance between two strings
[ https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-9556:
---------------------------------
Labels:  (was: TODOC1.2)

create UDF to calculate the Levenshtein distance between two strings
---------------------------------------------------------------------

Key: HIVE-9556
URL: https://issues.apache.org/jira/browse/HIVE-9556
Project: Hive
Issue Type: Improvement
Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
Fix For: 1.2.0
Attachments: HIVE-9556.1.patch, HIVE-9556.2.patch, HIVE-9556.3.patch

Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other. It is named after Vladimir Levenshtein, who considered this distance in 1965.

Example: The Levenshtein distance between kitten and sitting is 3
1. kitten → sitten (substitution of s for k)
2. sitten → sittin (substitution of i for e)
3. sittin → sitting (insertion of g at the end).

{code}
select levenshtein('kitten', 'sitting');
3
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9556) create UDF to calculate the Levenshtein distance between two strings
[ https://issues.apache.org/jira/browse/HIVE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328585#comment-14328585 ]

Lefty Leverenz commented on HIVE-9556:
--------------------------------------

Thanks for the doc, [~apivovarov]. I removed the TODOC1.2 label.

create UDF to calculate the Levenshtein distance between two strings
---------------------------------------------------------------------

Key: HIVE-9556
URL: https://issues.apache.org/jira/browse/HIVE-9556
Project: Hive
Issue Type: Improvement
Components: UDF
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov
Fix For: 1.2.0
Attachments: HIVE-9556.1.patch, HIVE-9556.2.patch, HIVE-9556.3.patch

Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other. It is named after Vladimir Levenshtein, who considered this distance in 1965.

Example: The Levenshtein distance between kitten and sitting is 3
1. kitten → sitten (substitution of s for k)
2. sitten → sittin (substitution of i for e)
3. sittin → sitting (insertion of g at the end).

{code}
select levenshtein('kitten', 'sitting');
3
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
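The edit-distance definition quoted in this issue can be sketched with the classic dynamic-programming algorithm (an illustrative Python version; the Hive UDF itself is implemented in Java in the attached patches):

```python
def levenshtein(s: str, t: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions required to change s into t."""
    # prev[j] holds the distance between the current prefix of s
    # and the first j characters of t.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty prefix of t
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs from s
                            curr[j - 1] + 1,      # insert ct into s
                            prev[j - 1] + cost))  # substitute cs -> ct
        prev = curr
    return prev[-1]

# Matches the worked example in the issue description:
print(levenshtein('kitten', 'sitting'))  # 3
```

The two-row formulation keeps memory at O(len(t)) instead of the full O(len(s)·len(t)) matrix.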
[jira] [Commented] (HIVE-9728) LLAP: add heap mode to allocator (for q files, YARN w/o direct buffer accounting support)
[ https://issues.apache.org/jira/browse/HIVE-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328618#comment-14328618 ]

Lefty Leverenz commented on HIVE-9728:
--------------------------------------

Doc note: This adds *hive.llap.io.cache.direct* to HiveConf.java in the LLAP branch, so it will need to be documented when the branch gets merged to trunk.

LLAP: add heap mode to allocator (for q files, YARN w/o direct buffer accounting support)
------------------------------------------------------------------------------------------

Key: HIVE-9728
URL: https://issues.apache.org/jira/browse/HIVE-9728
Project: Hive
Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9086) Add language support to PURGE data while dropping partitions.
[ https://issues.apache.org/jira/browse/HIVE-9086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328606#comment-14328606 ]

Lefty Leverenz commented on HIVE-9086:
--------------------------------------

Doc note: Uh oh, I documented this prematurely (for HIVE-7100). But did I get it right, except for the jira attribution and release number?

* [DDL -- Drop Partitions | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DropPartitions]

Add language support to PURGE data while dropping partitions.
-------------------------------------------------------------

Key: HIVE-9086
URL: https://issues.apache.org/jira/browse/HIVE-9086
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.15.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
Attachments: HIVE-9086.1.patch

HIVE-9083 adds metastore-support to skip-trash while dropping partitions. This patch includes language support to do the same.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-9581) CBO (Calcite Return Path): Translate Join to Hive Op [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-9581:
---------------------------------
Labels: TODOC-CBO  (was: )

CBO (Calcite Return Path): Translate Join to Hive Op [CBO branch]
------------------------------------------------------------------

Key: HIVE-9581
URL: https://issues.apache.org/jira/browse/HIVE-9581
Project: Hive
Issue Type: Sub-task
Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Labels: TODOC-CBO
Fix For: cbo-branch
Attachments: HIVE-9581.cbo.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9581) CBO (Calcite Return Path): Translate Join to Hive Op [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328411#comment-14328411 ]

Lefty Leverenz commented on HIVE-9581:
--------------------------------------

Doc note: This adds *hive.cbo.returnpath.hiveop* to HiveConf.java in the CBO branch, so it will need to be documented in the wiki when the branch gets merged to trunk. The parameter description is rather cryptic (Flag to control calcite plan to hive operator conversion) and the jira comments don't elaborate on usage, so a release note would be helpful.

I created a new label for the CBO branch: TODOC-CBO.

CBO (Calcite Return Path): Translate Join to Hive Op [CBO branch]
------------------------------------------------------------------

Key: HIVE-9581
URL: https://issues.apache.org/jira/browse/HIVE-9581
Project: Hive
Issue Type: Sub-task
Components: CBO
Affects Versions: cbo-branch
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Labels: TODOC-CBO
Fix For: cbo-branch
Attachments: HIVE-9581.cbo.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.
[ https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-7100:
---------------------------------
Labels:  (was: TODOC14)

Users of hive should be able to specify skipTrash when dropping tables.
------------------------------------------------------------------------

Key: HIVE-7100
URL: https://issues.apache.org/jira/browse/HIVE-7100
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.13.0
Reporter: Ravi Prakash
Assignee: david serafini
Fix For: 0.14.0
Attachments: HIVE-7100.1.patch, HIVE-7100.10.patch, HIVE-7100.11.patch, HIVE-7100.2.patch, HIVE-7100.3.patch, HIVE-7100.4.patch, HIVE-7100.5.patch, HIVE-7100.8.patch, HIVE-7100.9.patch, HIVE-7100.patch

Users of our clusters are often running up against their quota limits because of Hive tables. When they drop tables, they have to then manually delete the files from HDFS using skipTrash. This is cumbersome and unnecessary. We should enable users to skipTrash directly when dropping tables. We should also be able to provide this functionality without polluting SQL syntax.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9606) Need a tool to export metadata from RDBMS based metastore into HBase
[ https://issues.apache.org/jira/browse/HIVE-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328439#comment-14328439 ]

Lefty Leverenz commented on HIVE-9606:
--------------------------------------

Does this need documentation?

Need a tool to export metadata from RDBMS based metastore into HBase
---------------------------------------------------------------------

Key: HIVE-9606
URL: https://issues.apache.org/jira/browse/HIVE-9606
Project: Hive
Issue Type: Sub-task
Components: Metastore
Reporter: Alan Gates
Assignee: Alan Gates
Fix For: 1.2.0
Attachments: HIVE-9606.2.patch, HIVE-9606.patch

For testing (and eventually for end user use) we need a tool that can take data from an existing RDBMS based metastore and create the corresponding objects in an HBase based metastore.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9699) Extend PTFs to provide referenced columns for CP
[ https://issues.apache.org/jira/browse/HIVE-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325584#comment-14325584 ]

Lefty Leverenz commented on HIVE-9699:
--------------------------------------

Does this need any user documentation?

Extend PTFs to provide referenced columns for CP
-------------------------------------------------

Key: HIVE-9699
URL: https://issues.apache.org/jira/browse/HIVE-9699
Project: Hive
Issue Type: Improvement
Components: PTF-Windowing
Reporter: Navis
Assignee: Navis
Priority: Trivial
Fix For: 1.2.0
Attachments: HIVE-9699.1.patch.txt, HIVE-9699.2.patch.txt

As described in HIVE-9341, if PTFs can provide referenced column names, the column pruner can use that.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-3781) Index related events should be delivered to metastore event listener
[ https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325560#comment-14325560 ]

Lefty Leverenz commented on HIVE-3781:
--------------------------------------

Doc done: The wiki has been updated so I removed the TODOC15 label. Version information was not needed, because *hive.exec.drop.ignorenonexistent* has covered DROP INDEX since 0.7.0 when the parameter was created (HIVE-1858).

Index related events should be delivered to metastore event listener
---------------------------------------------------------------------

Key: HIVE-3781
URL: https://issues.apache.org/jira/browse/HIVE-3781
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.9.0
Reporter: Sudhanshu Arora
Assignee: Navis
Fix For: 1.1.0
Attachments: HIVE-3781.5.patch.txt, HIVE-3781.6.patch.txt, HIVE-3781.7.patch.txt, HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch, HIVE-3781.D7731.3.patch, HIVE-3781.D7731.4.patch, hive.3781.3.patch, hive.3781.4.patch

An event listener must be called for any DDL activity. For example, create_index, drop_index today does not call metaevent listener.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-3781) Index related events should be delivered to metastore event listener
[ https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-3781:
---------------------------------
Labels:  (was: TODOC15)

Index related events should be delivered to metastore event listener
---------------------------------------------------------------------

Key: HIVE-3781
URL: https://issues.apache.org/jira/browse/HIVE-3781
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.9.0
Reporter: Sudhanshu Arora
Assignee: Navis
Fix For: 1.1.0
Attachments: HIVE-3781.5.patch.txt, HIVE-3781.6.patch.txt, HIVE-3781.7.patch.txt, HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch, HIVE-3781.D7731.3.patch, HIVE-3781.D7731.4.patch, hive.3781.3.patch, hive.3781.4.patch

An event listener must be called for any DDL activity. For example, create_index, drop_index today does not call metaevent listener.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-2573) Create per-session function registry
[ https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325556#comment-14325556 ]

Lefty Leverenz commented on HIVE-2573:
--------------------------------------

Doc update: The description of *hive.exec.drop.ignorenonexistent* has been updated in the wiki. Does the per-session function registry need to be documented?

Create per-session function registry
-------------------------------------

Key: HIVE-2573
URL: https://issues.apache.org/jira/browse/HIVE-2573
Project: Hive
Issue Type: Improvement
Components: Server Infrastructure
Reporter: Navis
Assignee: Navis
Priority: Minor
Labels: TODOC1.2
Fix For: 1.2.0
Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt

Currently the function registry is a shared resource and could be overridden by other users when using HiveServer. If a per-session function registry is provided, this situation could be prevented.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9188) BloomFilter support in ORC
[ https://issues.apache.org/jira/browse/HIVE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325694#comment-14325694 ]

Lefty Leverenz commented on HIVE-9188:
--------------------------------------

Doc note: [~prasanth_j] documented this in the ORC wikidoc.

* [ORC Files -- Bloom Filter Index | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-BloomFilterIndex]

BloomFilter support in ORC
--------------------------

Key: HIVE-9188
URL: https://issues.apache.org/jira/browse/HIVE-9188
Project: Hive
Issue Type: New Feature
Components: File Formats
Affects Versions: 0.15.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Labels: orcfile
Fix For: 1.2.0
Attachments: HIVE-9188.1.patch, HIVE-9188.10.patch, HIVE-9188.11.patch, HIVE-9188.2.patch, HIVE-9188.3.patch, HIVE-9188.4.patch, HIVE-9188.5.patch, HIVE-9188.6.patch, HIVE-9188.7.patch, HIVE-9188.8.patch, HIVE-9188.9.patch

BloomFilters are a well-known probabilistic data structure for set membership checking. We can use bloom filters in the ORC index for better row group pruning. Currently, the ORC row group index uses min/max statistics to eliminate row groups (stripes as well) that do not satisfy the predicate condition specified in the query. But in some cases, the efficiency of min/max based elimination is not optimal (unsorted columns with a wide range of entries). Bloom filters can be an effective and efficient alternative for row group/split elimination for point queries or queries with an IN clause.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
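For readers unfamiliar with the data structure named in this issue, a minimal Bloom filter can be sketched as follows (a hypothetical Python illustration — the class, bit count, and SHA-256-based hashing are invented for the example; ORC's real implementation differs). The property that makes it safe for row-group pruning is that membership tests can return false positives but never false negatives, so a negative answer proves the value is absent from the row group.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k salted hashes set k bits in a fixed-size bit set."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int doubles as an arbitrary-width bit set

    def _positions(self, item: str):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(self.bits & (1 << pos) for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
bf.add("bob")
print(bf.might_contain("alice"))  # True -- added values are always found
print(bf.might_contain("zelda"))  # almost certainly False (small false-positive rate)
```

Unlike min/max statistics, this works even for unsorted columns with a wide value range, at the cost of a tunable false-positive rate.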
[jira] [Commented] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326675#comment-14326675 ]

Lefty Leverenz commented on HIVE-7292:
--------------------------------------

Doc note: See comments on HIVE-9257 and HIVE-9448 for documentation issues.

* [HIVE-9257 commit comment with doc notes | https://issues.apache.org/jira/browse/HIVE-9257?focusedCommentId=14273166&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14273166]
* HIVE-9448 doc comments
** [list of configuration parameters | https://issues.apache.org/jira/browse/HIVE-9448?focusedCommentId=14292487&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14292487]
** [where documented | https://issues.apache.org/jira/browse/HIVE-9448?focusedCommentId=14298353&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14298353]

Hive on Spark
-------------

Key: HIVE-7292
URL: https://issues.apache.org/jira/browse/HIVE-7292
Project: Hive
Issue Type: Improvement
Components: Spark
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5
Attachments: Hive-on-Spark.pdf

Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantage of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide users a new alternative so that they can consolidate their backend.

Secondly, providing such an alternative further increases Hive's adoption as it exposes Spark users to a viable, feature-rich de facto standard SQL tool on Hadoop.

Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does.

This is an umbrella JIRA which will cover many coming subtasks. Design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated!

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326644#comment-14326644 ]

Lefty Leverenz edited comment on HIVE-7292 at 2/18/15 10:49 PM:
----------------------------------------------------------------

Although this issue is still marked Unresolved, the Spark branch has been merged to trunk and is Resolved for the 1.1.0 release (HIVE-9257 and HIVE-9352). (Edit: Also HIVE-9448.)

was (Author: leftylev):
Although this issue is still marked Unresolved, the Spark branch has been merged to trunk and is Resolved for the 1.1.0 release (HIVE-9257 and HIVE-9352).

Hive on Spark
-------------

Key: HIVE-7292
URL: https://issues.apache.org/jira/browse/HIVE-7292
Project: Hive
Issue Type: Improvement
Components: Spark
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5
Attachments: Hive-on-Spark.pdf

Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantage of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide users a new alternative so that they can consolidate their backend.

Secondly, providing such an alternative further increases Hive's adoption as it exposes Spark users to a viable, feature-rich de facto standard SQL tool on Hadoop.

Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does.

This is an umbrella JIRA which will cover many coming subtasks. Design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated!

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9703) Merge from Spark branch to trunk 02/16/2015
[ https://issues.apache.org/jira/browse/HIVE-9703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326776#comment-14326776 ]

Lefty Leverenz commented on HIVE-9703:
--------------------------------------

Does any of this need documentation, or can we assume it's all covered by jiras that patched the Spark branch?

Merge from Spark branch to trunk 02/16/2015
-------------------------------------------

Key: HIVE-9703
URL: https://issues.apache.org/jira/browse/HIVE-9703
Project: Hive
Issue Type: Task
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Fix For: 1.2.0
Attachments: HIVE-9703.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-8807) Obsolete default values in webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326827#comment-14326827 ]

Lefty Leverenz commented on HIVE-8807:
--------------------------------------

This also needs to be done for release 1.1.0, but I don't think we should have a new Jira for each release. Would it make sense to reopen this issue for each release? Or is there a better way to make sure webhcat-default.xml gets updated?

Obsolete default values in webhcat-default.xml
----------------------------------------------

Key: HIVE-8807
URL: https://issues.apache.org/jira/browse/HIVE-8807
Project: Hive
Issue Type: Bug
Components: WebHCat
Affects Versions: 0.12.0, 0.13.0, 0.14.0
Reporter: Lefty Leverenz
Assignee: Eugene Koifman
Fix For: 1.0.0
Attachments: HIVE8807.patch

The defaults for templeton.pig.path & templeton.hive.path are 0.11 in webhcat-default.xml but they ought to match current release numbers. The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml).

no precommit tests

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9638) Drop Index does not check Index or Table exisit or not
[ https://issues.apache.org/jira/browse/HIVE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324602#comment-14324602 ]

Lefty Leverenz commented on HIVE-9638:
--------------------------------------

HIVE-6754 also requested setting the default to true for hive.exec.drop.ignorenonexistent.

Drop Index does not check Index or Table exisit or not
-------------------------------------------------------

Key: HIVE-9638
URL: https://issues.apache.org/jira/browse/HIVE-9638
Project: Hive
Issue Type: Bug
Components: Parser
Affects Versions: 0.11.0, 0.13.0, 0.14.0, 1.0.0
Reporter: Will Du

The DROP INDEX index_name ON table_name; statement will always succeed, whether or not the index_name or table_name exists.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (HIVE-9638) Drop Index does not check Index or Table exisit or not
[ https://issues.apache.org/jira/browse/HIVE-9638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324602#comment-14324602 ]

Lefty Leverenz edited comment on HIVE-9638 at 2/17/15 6:13 PM:
---------------------------------------------------------------

HIVE-6754 also requested setting the default to false for hive.exec.drop.ignorenonexistent.

was (Author: le...@hortonworks.com):
HIVE-6754 also requested setting the default to true for hive.exec.drop.ignorenonexistent.

Drop Index does not check Index or Table exisit or not
-------------------------------------------------------

Key: HIVE-9638
URL: https://issues.apache.org/jira/browse/HIVE-9638
Project: Hive
Issue Type: Bug
Components: Parser
Affects Versions: 0.11.0, 0.13.0, 0.14.0, 1.0.0
Reporter: Will Du

The DROP INDEX index_name ON table_name; statement will always succeed, whether or not the index_name or table_name exists.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-8785) HiveServer2 LogDivertAppender should be more selective for beeline getLogs
[ https://issues.apache.org/jira/browse/HIVE-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323947#comment-14323947 ]

Lefty Leverenz commented on HIVE-8785:
--------------------------------------

Will this get more documentation (see previous comment) or can we remove the TODOC14 label now?

HiveServer2 LogDivertAppender should be more selective for beeline getLogs
---------------------------------------------------------------------------

Key: HIVE-8785
URL: https://issues.apache.org/jira/browse/HIVE-8785
Project: Hive
Issue Type: Bug
Reporter: Gopal V
Assignee: Thejas M Nair
Labels: TODOC14
Fix For: 0.14.0
Attachments: HIVE-8785.1.patch, HIVE-8785.2.patch, HIVE-8785.3.patch, HIVE-8785.4.patch, HIVE-8785.4.patch, HIVE-8785.5.patch

A simple query run via beeline JDBC like {{explain select count(1) from testing.foo;}} produces 50 lines of output which looks like

{code}
0: jdbc:hive2://localhost:10002> explain select count(1) from testing.foo;
14/11/06 00:35:59 INFO log.PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
14/11/06 00:35:59 INFO log.PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
14/11/06 00:35:59 INFO parse.ParseDriver: Parsing command: explain select count(1) from testing.foo
14/11/06 00:35:59 INFO parse.ParseDriver: Parse Completed
14/11/06 00:35:59 INFO log.PerfLogger: </PERFLOG method=parse start=1415262959379 end=1415262959380 duration=1 from=org.apache.hadoop.hive.ql.Driver>
14/11/06 00:35:59 INFO log.PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
14/11/06 00:35:59 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
14/11/06 00:35:59 INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
14/11/06 00:35:59 INFO parse.SemanticAnalyzer: Get metadata for source tables
14/11/06 00:35:59 INFO parse.SemanticAnalyzer: Get metadata for subqueries
14/11/06 00:35:59 INFO parse.SemanticAnalyzer: Get metadata for destination tables
14/11/06 00:35:59 INFO ql.Context: New scratch dir is hdfs://cn041-10.l42scl.hortonworks.com:8020/tmp/hive/gopal/6b3980f6-3238-4e91-ae53-cb3f54092dab/hive_2014-11-06_00-35-59_379_317426424610374080-1
14/11/06 00:35:59 INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
14/11/06 00:35:59 INFO parse.SemanticAnalyzer: Set stats collection dir : hdfs://cn041-10.l42scl.hortonworks.com:8020/tmp/hive/gopal/6b3980f6-3238-4e91-ae53-cb3f54092dab/hive_2014-11-06_00-35-59_379_317426424610374080-1/-ext-10002
14/11/06 00:35:59 INFO ppd.OpProcFactory: Processing for FS(16)
14/11/06 00:35:59 INFO ppd.OpProcFactory: Processing for SEL(15)
14/11/06 00:35:59 INFO ppd.OpProcFactory: Processing for GBY(14)
14/11/06 00:35:59 INFO ppd.OpProcFactory: Processing for RS(13)
14/11/06 00:35:59 INFO ppd.OpProcFactory: Processing for GBY(12)
14/11/06 00:35:59 INFO ppd.OpProcFactory: Processing for SEL(11)
14/11/06 00:35:59 INFO ppd.OpProcFactory: Processing for TS(10)
14/11/06 00:35:59 INFO optimizer.ColumnPrunerProcFactory: RS 13 oldColExprMap: {VALUE._col0=Column[_col0]}
14/11/06 00:35:59 INFO optimizer.ColumnPrunerProcFactory: RS 13 newColExprMap: {VALUE._col0=Column[_col0]}
14/11/06 00:35:59 INFO parse.SemanticAnalyzer: Completed plan generation
14/11/06 00:35:59 INFO ql.Driver: Semantic Analysis Completed
14/11/06 00:35:59 INFO log.PerfLogger: </PERFLOG method=semanticAnalyze start=1415262959381 end=1415262959401 duration=20 from=org.apache.hadoop.hive.ql.Driver>
14/11/06 00:35:59 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:Explain, type:string, comment:null)], properties:null)
14/11/06 00:35:59 INFO log.PerfLogger: </PERFLOG method=compile start=1415262959378 end=1415262959402 duration=24 from=org.apache.hadoop.hive.ql.Driver>
14/11/06 00:35:59 INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
+----------------------------+
|          Explain           |
+----------------------------+
| STAGE DEPENDENCIES:        |
|   Stage-0 is a root stage  |
|                            |
| STAGE PLANS:               |
|   Stage: Stage-0           |
|     Fetch Operator         |
|       limit: 1             |
|       Processor Tree:      |
|         ListSink           |
|                            |
+----------------------------+
10 rows selected (0.1 seconds)
14/11/06 00:35:59 INFO log.PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
14/11/06 00:35:59 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
14/11/06 00:35:59 INFO log.PerfLogger: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
14/11/06 00:35:59 INFO ql.Driver: Starting command: explain select count(1) from
[jira] [Commented] (HIVE-9481) allow column list specification in INSERT statement
[ https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14321763#comment-14321763 ] Lefty Leverenz commented on HIVE-9481: -- This should be documented in the wiki here: * [LanguageManual DML -- Inserting data into Hive Tables from queries | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingdataintoHiveTablesfromqueries] If it also applies to INSERT ... VALUES, mention it here: * [LanguageManual DML -- Inserting values into tables from SQL | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingvaluesintotablesfromSQL] allow column list specification in INSERT statement --- Key: HIVE-9481 URL: https://issues.apache.org/jira/browse/HIVE-9481 Project: Hive Issue Type: Bug Components: Parser, Query Processor, SQL Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, HIVE-9481.6.patch, HIVE-9481.patch Given a table FOO(a int, b int, c int), ANSI SQL supports insert into FOO(c,b) select x,y from T. The expectation is that 'x' is written to column 'c', 'y' is written to column 'b', and 'a' is set to NULL, assuming column 'a' is NULLABLE. Hive does not support this. In Hive one has to ensure that the data-producing statement has a schema that matches the target table schema. Since Hive doesn't support DEFAULT values for columns in CREATE TABLE, when the target schema is explicitly provided, missing columns will be set to NULL if they are NULLABLE; otherwise an error will be raised. If/when a DEFAULT clause is supported, this can be enhanced to set the default value rather than NULL. 
Thus, given {noformat} create table source (a int, b int); create table target (x int, y int, z int); create table target2 (x int, y int, z int); {noformat} {noformat}insert into target(y,z) select * from source;{noformat} will mean {noformat}insert into target select null as x, a, b from source;{noformat} and {noformat}insert into target(z,y) select * from source;{noformat} will mean {noformat}insert into target select null as x, b, a from source;{noformat} Also, {noformat} from source insert into target(y,z) select null as x, * insert into target2(y,z) select null as x, source.*; {noformat} and for partitioned tables, given {noformat} Given: CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING) PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC; INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) VALUES ('jsmith', 'mail.com'); {noformat} And dynamic partitioning {noformat} INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) VALUES ('jsmith', '2014-09-23', 'mail.com'); {noformat} In all cases, the schema specification contains columns of the target table which are matched by position to the values produced by the VALUES clause/SELECT statement. If the producer side provides values for a dynamic partition column, the column should be in the specified schema. Static partition values are part of the partition spec and thus are not produced by the producer and should not be part of the schema specification. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
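For readers without a Hive 1.2.0 build handy, the positional-matching and NULL-filling behavior described above can be sketched in standard SQL. The following uses Python's sqlite3 module (SQLite, not Hive) purely to illustrate the ANSI semantics, with table names borrowed from the example:

```python
import sqlite3

# Sketch of ANSI column-list INSERT semantics (the behavior HIVE-9481
# adds to Hive), demonstrated with SQLite as a stand-in engine.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE source (a INTEGER, b INTEGER)")
cur.execute("CREATE TABLE target (x INTEGER, y INTEGER, z INTEGER)")
cur.execute("INSERT INTO source VALUES (1, 2)")

# insert into target(y,z) select * from source
# -> unlisted column x defaults to NULL; a feeds y, b feeds z by position.
cur.execute("INSERT INTO target (y, z) SELECT * FROM source")
rows = cur.execute("SELECT x, y, z FROM target").fetchall()
print(rows)  # [(None, 1, 2)]
```

Swapping the column list, as in `INSERT INTO target (z, y) ...`, reverses the mapping the same way the issue describes.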
[jira] [Updated] (HIVE-9481) allow column list specification in INSERT statement
[ https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9481: - Labels: TODOC1.2 (was: ) allow column list specification in INSERT statement --- Key: HIVE-9481 URL: https://issues.apache.org/jira/browse/HIVE-9481 Project: Hive Issue Type: Bug Components: Parser, Query Processor, SQL Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.5.patch, HIVE-9481.6.patch, HIVE-9481.patch Given a table FOO(a int, b int, c int), ANSI SQL supports insert into FOO(c,b) select x,y from T. The expectation is that 'x' is written to column 'c', 'y' is written to column 'b', and 'a' is set to NULL, assuming column 'a' is NULLABLE. Hive does not support this. In Hive one has to ensure that the data-producing statement has a schema that matches the target table schema. Since Hive doesn't support DEFAULT values for columns in CREATE TABLE, when the target schema is explicitly provided, missing columns will be set to NULL if they are NULLABLE; otherwise an error will be raised. If/when a DEFAULT clause is supported, this can be enhanced to set the default value rather than NULL. 
Thus, given {noformat} create table source (a int, b int); create table target (x int, y int, z int); create table target2 (x int, y int, z int); {noformat} {noformat}insert into target(y,z) select * from source;{noformat} will mean {noformat}insert into target select null as x, a, b from source;{noformat} and {noformat}insert into target(z,y) select * from source;{noformat} will mean {noformat}insert into target select null as x, b, a from source;{noformat} Also, {noformat} from source insert into target(y,z) select null as x, * insert into target2(y,z) select null as x, source.*; {noformat} and for partitioned tables, given {noformat} Given: CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING) PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC; INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) VALUES ('jsmith', 'mail.com'); {noformat} And dynamic partitioning {noformat} INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) VALUES ('jsmith', '2014-09-23', 'mail.com'); {noformat} In all cases, the schema specification contains columns of the target table which are matched by position to the values produced by the VALUES clause/SELECT statement. If the producer side provides values for a dynamic partition column, the column should be in the specified schema. Static partition values are part of the partition spec and thus are not produced by the producer and should not be part of the schema specification. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9425) Add jar/file doesn't work with yarn-cluster mode [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14321769#comment-14321769 ] Lefty Leverenz commented on HIVE-9425: -- Should this also be committed to trunk? Add jar/file doesn't work with yarn-cluster mode [Spark Branch] --- Key: HIVE-9425 URL: https://issues.apache.org/jira/browse/HIVE-9425 Project: Hive Issue Type: Sub-task Components: spark-branch Reporter: Xiaomin Zhang Assignee: Rui Li Fix For: spark-branch, 1.1.0 Attachments: HIVE-9425.1-spark.patch {noformat} 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file or directory)), was the --addJars option used? 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or directory)), was the --addJars option used? 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or directory)), was the --addJars option used? 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or directory)), was the --addJars option used? 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or directory)), was the --addJars option used? 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request fef081b0-5408-4804-9531-d131fdd628e6 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is deprecated. 
Instead, use mapreduce.input.fileinputformat.split.minsize 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job fef081b0-5408-4804-9531-d131fdd628e6 org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: de.bankmark.bigbench.queries.q10.SentimentUDF Serialization trace: genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc) conf (org.apache.hadoop.hive.ql.exec.UDTFOperator) childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator) childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator) aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork) invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138) at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115) {noformat} It seems the additional Jar files are not uploaded to DistributedCache, so that the Driver cannot access it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9673) Set operationhandle in ATS entities for lookups
[ https://issues.apache.org/jira/browse/HIVE-9673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14321765#comment-14321765 ] Lefty Leverenz commented on HIVE-9673: -- Does this need Hive documentation, or maybe YARN documentation? Set operationhandle in ATS entities for lookups --- Key: HIVE-9673 URL: https://issues.apache.org/jira/browse/HIVE-9673 Project: Hive Issue Type: Improvement Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.2.0 Attachments: HIVE-9673.1.patch, HIVE-9673.2.patch Yarn App Timeline Server (ATS) users can find their query using hive query-id. However, query id is available only through the logs at the moment. Thrift api users such as Hue have another unique id for queries, which the operation handle contains (TExecuteStatementResp.TOperationHandle.THandleIdentifier.guid) . Adding the operationhandle guid to ATS will enable such thrift users to get information from ATS for the queries that they have spawned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-2573) Create per-session function registry
[ https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319779#comment-14319779 ] Lefty Leverenz commented on HIVE-2573: -- Doc note: This adds Function to the description of *hive.exec.drop.ignorenonexistent* in 1.2.0, so the wiki needs to be updated (with version information). By the way, HIVE-3781 added Index to the description in 1.1.0. * [Configuration Properties -- hive.exec.drop.ignorenonexistent | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.exec.drop.ignorenonexistent] What other documentation does this need? Should there be a release note? Create per-session function registry - Key: HIVE-2573 URL: https://issues.apache.org/jira/browse/HIVE-2573 Project: Hive Issue Type: Improvement Components: Server Infrastructure Reporter: Navis Assignee: Navis Priority: Minor Labels: TODOC1.2 Fix For: 1.2.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt Currently the function registry is a shared resource and could be overridden by other users when using HiveServer. If a per-session function registry is provided, this situation could be prevented. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-2573) Create per-session function registry
[ https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-2573: - Labels: TODOC1.2 (was: ) Create per-session function registry - Key: HIVE-2573 URL: https://issues.apache.org/jira/browse/HIVE-2573 Project: Hive Issue Type: Improvement Components: Server Infrastructure Reporter: Navis Assignee: Navis Priority: Minor Labels: TODOC1.2 Fix For: 1.2.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, HIVE-2573.1.patch.txt, HIVE-2573.10.patch.txt, HIVE-2573.11.patch.txt, HIVE-2573.12.patch.txt, HIVE-2573.13.patch.txt, HIVE-2573.14.patch.txt, HIVE-2573.15.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch, HIVE-2573.8.patch.txt, HIVE-2573.9.patch.txt Currently the function registry is a shared resource and could be overridden by other users when using HiveServer. If a per-session function registry is provided, this situation could be prevented. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7759) document hive cli authorization behavior when SQL std auth is enabled
[ https://issues.apache.org/jira/browse/HIVE-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7759: - Labels: (was: TODOC14) document hive cli authorization behavior when SQL std auth is enabled - Key: HIVE-7759 URL: https://issues.apache.org/jira/browse/HIVE-7759 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.13.0, 0.14.0, 0.13.1 Reporter: Thejas M Nair Assignee: Thejas M Nair There should be a section in the sql standard auth doc that highlights how hive-cli behaves with SQL standard authorization turned on. Changes in HIVE-7533 and HIVE-7209 should be documented as part of it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9668) Create proper documentation for setting up Kerberos authentication on Hive.
[ https://issues.apache.org/jira/browse/HIVE-9668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9668: - Labels: TODOC1.0 (was: ) Create proper documentation for setting up Kerberos authentication on Hive. --- Key: HIVE-9668 URL: https://issues.apache.org/jira/browse/HIVE-9668 Project: Hive Issue Type: Task Components: Authentication Environment: Apache Hive 1.0.0 Reporter: Anilkumar Kalshetti Priority: Minor Labels: TODOC1.0 Issue: Create proper documentation for setting up Kerberos authentication on Hive. 1. Mention the OS, Hadoop, and Hive versions, the machine hostname, and any dependencies required to be installed, with proper download links. 2. Attach the hadoop and hive properties files where changes are made to this issue. 3. Attach a screenshot of the hive terminal which shows starting hive with kerberos authentication using the service principal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9664) Hive add jar command should be able to download and add jars from a repository
[ https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317840#comment-14317840 ] Lefty Leverenz commented on HIVE-9664: -- Here's where you can find information about Hive's compile syntax and Groovy: * [Hive Commands | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Commands]: {{compile `groovy string` AS GROOVY NAMED name}} * [Nov. 2013 Hive Contributors Meetup Presentations -- Using Dynamic Compilation with Hive | https://cwiki.apache.org/confluence/download/attachments/27362054/HiveContrib-Nov13-groovy_plus_hive.pptx?version=1modificationDate=1385171856000api=v2] Note: The last slide in the presentation has a typo. Instead of HIVE-520 see HIVE-5250 and its subtask HIVE-5252 (Add ql syntax for inline java code creation). Hive add jar command should be able to download and add jars from a repository Key: HIVE-9664 URL: https://issues.apache.org/jira/browse/HIVE-9664 Project: Hive Issue Type: Improvement Reporter: Anant Nag Labels: hive Currently Hive's add jar command takes a local path to the dependency jar. This clutters the local file-system, as users may forget to remove this jar later. It would be nice if Hive supported a Gradle-like notation to download the jar from a repository. Example: add jar org:module:version It should also be backward compatible and should take a jar from the local file-system as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
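The Gradle-style coordinate proposed in HIVE-9664 (add jar org:module:version, falling back to a local path for backward compatibility) could be resolved with logic along these lines. This is a hypothetical sketch, not Hive's implementation; parse_coordinate is an invented helper name:

```python
# Hypothetical parser for the "add jar" argument proposed in HIVE-9664.
# A spec with exactly two colons is read as org:module:version; anything
# else is treated as a local filesystem path, which preserves the
# existing "add jar /path/to/file.jar" behavior.
def parse_coordinate(spec: str) -> dict:
    parts = spec.split(":")
    if len(parts) == 3 and all(parts):
        org, module, version = parts
        return {"org": org, "module": module, "version": version}
    return {"path": spec}

print(parse_coordinate("org.apache.hive:hive-exec:1.2.0"))
# {'org': 'org.apache.hive', 'module': 'hive-exec', 'version': '1.2.0'}
print(parse_coordinate("/tmp/my-udfs.jar"))
# {'path': '/tmp/my-udfs.jar'}
```

A real implementation would then hand the parsed coordinate to a repository resolver (Ivy, Maven, etc.) and download the jar into the session's resource directory.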
[jira] [Commented] (HIVE-9593) ORC Reader should ignore unknown metadata streams
[ https://issues.apache.org/jira/browse/HIVE-9593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317243#comment-14317243 ] Lefty Leverenz commented on HIVE-9593: -- Does this need documentation? ORC Reader should ignore unknown metadata streams -- Key: HIVE-9593 URL: https://issues.apache.org/jira/browse/HIVE-9593 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.11.0, 0.12.0, 0.13.1, 1.0.0, 1.2.0, 1.1.0 Reporter: Gopal V Assignee: Owen O'Malley Fix For: 1.1.0, 1.0.1 Attachments: HIVE-9593.no-autogen.patch, hive-9593.patch ORC readers should ignore metadata streams which are non-essential additions to the main data streams. This will include additional indices, histograms or anything we add as an optional stream. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9594) Add qtests for LAST_DAY udf
[ https://issues.apache.org/jira/browse/HIVE-9594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317318#comment-14317318 ] Lefty Leverenz commented on HIVE-9594: -- _No-doc note:_ *last_day* is already documented in the wiki for 1.1.0 (HIVE-9358) and individual qtests don't need documentation, so I think this jira doesn't need its TODOC1.2 label. Or am I missing something? * [Hive Operators and UDFs -- Date Functions -- last_day (near end of table) | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions] Add qtests for LAST_DAY udf --- Key: HIVE-9594 URL: https://issues.apache.org/jira/browse/HIVE-9594 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Priority: Minor Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9594.1.patch, HIVE-9594.2.patch currently udf_last_day.q contains only {code} DESCRIBE FUNCTION last_day; DESCRIBE FUNCTION EXTENDED last_day; {code} Better to add several function executions to the file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317342#comment-14317342 ] Lefty Leverenz commented on HIVE-9039: -- Looks good, thanks [~pxiong]. I'm removing the TODOC1.2 label. Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Components: Query Planning Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.0 Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch CLEAR LIBRARY CACHE Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9656) Create Index Failed without WITH DEFERRED REBUILD
[ https://issues.apache.org/jira/browse/HIVE-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14317383#comment-14317383 ] Lefty Leverenz commented on HIVE-9656: -- CREATE INDEX syntax in the DDL wikidoc also shows WITH DEFERRED REBUILD as optional. * [LanguageManual DDL -- Create Index | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateIndex] Create Index Failed without WITH DEFERRED REBUILD - Key: HIVE-9656 URL: https://issues.apache.org/jira/browse/HIVE-9656 Project: Hive Issue Type: Bug Components: Indexing Affects Versions: 0.13.0, 0.14.0, 1.0.0 Reporter: Will Du Creating a Hive index without specifying WITH DEFERRED REBUILD fails with the following error. CREATE INDEX table01_index ON TABLE table01 (column2) AS 'COMPACT'; FAILED: Error in metadata: java.lang.RuntimeException: Please specify deferred rebuild using WITH DEFERRED REBUILD . FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask According to the design here https://cwiki.apache.org/confluence/display/Hive/IndexDev WITH DEFERRED REBUILD is optional, correct? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9039: - Labels: (was: TODOC1.2) Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Components: Query Planning Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.0 Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch CLEAR LIBRARY CACHE Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315759#comment-14315759 ] Lefty Leverenz commented on HIVE-9039: -- Nicely done, [~pxiong]. I added emphasis for version changes at the beginning and trimmed the version note at the end. Feel free to change things back again if you disagree. Some minor questions: 1. In this example, should there be a space before a and b (after the parentheses) or is it optional? {code} SELECT key FROM (SELECT key FROM src ORDER BY key LIMIT 10)a UNION SELECT key FROM (SELECT key FROM src1 ORDER BY key LIMIT 10)b {code} 2. Does DISTRIBUTE BY belong in the first half of this sentence, or should it be removed from the second half? And can after the last one be rephrased after the last SELECT of the UNION or something similar? {quote} To apply an ORDER BY, SORT BY, CLUSTER BY or LIMIT clause to the entire UNION result, place the ORDER BY, SORT BY, CLUSTER BY, DISTRIBUTE BY or LIMIT after the last one. {quote} Thanks. Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Components: Query Planning Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch CLEAR LIBRARY CACHE Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
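The rewrite strategy HIVE-9039 describes — UNION DISTINCT implemented as UNION ALL followed by GROUP BY over the select list — can be checked for semantic equivalence outside Hive. This sketch uses Python's sqlite3 with made-up sample data (SQLite, not Hive, and table names src/src1 borrowed from the wiki example):

```python
import sqlite3

# Demonstrates that UNION [DISTINCT] produces the same rows as
# UNION ALL followed by GROUP BY over all selected columns --
# the rewrite HIVE-9039 applies internally.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE src (key INTEGER)")
cur.execute("CREATE TABLE src1 (key INTEGER)")
cur.executemany("INSERT INTO src VALUES (?)", [(1,), (2,), (2,)])
cur.executemany("INSERT INTO src1 VALUES (?)", [(2,), (3,)])

union_distinct = cur.execute(
    "SELECT key FROM src UNION SELECT key FROM src1 ORDER BY key"
).fetchall()

rewritten = cur.execute(
    "SELECT key FROM (SELECT key FROM src UNION ALL SELECT key FROM src1) u "
    "GROUP BY key ORDER BY key"
).fetchall()

print(union_distinct == rewritten, union_distinct)
# True [(1,), (2,), (3,)]
```

Note how the trailing ORDER BY applies to the whole union result, which is the placement rule the comment above asks about.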
[jira] [Commented] (HIVE-9578) Add support for getDatabases and alterDatabase calls [hbase-metastore branch]
[ https://issues.apache.org/jira/browse/HIVE-9578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14315184#comment-14315184 ] Lefty Leverenz commented on HIVE-9578: -- No user documentation, right? Add support for getDatabases and alterDatabase calls [hbase-metastore branch] - Key: HIVE-9578 URL: https://issues.apache.org/jira/browse/HIVE-9578 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Alan Gates Assignee: Alan Gates Fix For: 1.2.0 Attachments: HIVE-9578.2.patch, HIVE-9578.patch The initial patch only supporting getting a single database, add database, and drop database. Support needs to be added for alter database, getting all the databases, and getting database names by pattern. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9507) Make LATERAL VIEW inline(expression) mytable tolerant to nulls
[ https://issues.apache.org/jira/browse/HIVE-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313737#comment-14313737 ] Lefty Leverenz commented on HIVE-9507: -- No doc needed, right? Make LATERAL VIEW inline(expression) mytable tolerant to nulls Key: HIVE-9507 URL: https://issues.apache.org/jira/browse/HIVE-9507 Project: Hive Issue Type: Bug Components: Query Processor, UDF Affects Versions: 0.14.0 Environment: hdp 2.2 Windows server 2012 R2 64-bit Reporter: Moustafa Aboul Atta Assignee: Navis Priority: Minor Fix For: 1.2.0 Attachments: HIVE-9507.1.patch.txt, HIVE-9507.2.patch.txt, HIVE-9507.3.patch.txt, parial_log.log I have tweets stored with avro on hdfs with the default twitter status (tweet) schema. There's an object called entities that contains arrays of structs. When I run {{SELECT mytable.*}} {{FROM tweets}} {{LATERAL VIEW INLINE(entities.media) mytable}} I get the exception attached as partial_log.log, however, if I add {{WHERE entities.media IS NOT NULL}} it runs perfectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9611) Allow SPARK_HOME as well as spark.home to define sparks location
[ https://issues.apache.org/jira/browse/HIVE-9611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313255#comment-14313255 ] Lefty Leverenz commented on HIVE-9611: -- Does this need to be documented? Allow SPARK_HOME as well as spark.home to define sparks location Key: HIVE-9611 URL: https://issues.apache.org/jira/browse/HIVE-9611 Project: Hive Issue Type: Bug Components: Spark Affects Versions: spark-branch, 1.1.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Fix For: 1.1.0 Attachments: HIVE-9611.patch Right now {{SparkClientImpl}} requires {{spark.home}} to be defined. We should allow {{SPARK_HOME}} as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9039: - Labels: TODOC1.2 (was: ) Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Components: Query Planning Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch CLEAR LIBRARY CACHE Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9587) UDF decode should accept STRING_GROUP types for the second parameter
[ https://issues.apache.org/jira/browse/HIVE-9587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312995#comment-14312995 ] Lefty Leverenz commented on HIVE-9587: -- Does this need user documentation? If so, it belongs here: * [Hive Operators and UDFs -- String Functions | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-StringFunctions] UDF decode should accept STRING_GROUP types for the second parameter Key: HIVE-9587 URL: https://issues.apache.org/jira/browse/HIVE-9587 Project: Hive Issue Type: Bug Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Fix For: 1.2.0 Attachments: HIVE-9587.1.patch, HIVE-9587.2.patch 1. UDF decode should accept STRING_GROUP types for the second parameter 2. Fix error messages. (replace Encode with Decode) {code} select decode(cast('A' as binary), cast('utf-8' as varchar(5))); FAILED: SemanticException [Error 10016]: Line 1:59 Argument type mismatch ''utf-8'': The second argument to Encode() must be a string {code} 3. remove unused imports 4. add udf_decode.q test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
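The relaxation requested in HIVE-9587 — letting decode()'s second argument be any string-group type (STRING, VARCHAR, CHAR) — amounts to coercing the charset parameter to a plain string before decoding. A hypothetical Python sketch, not Hive source; hive_decode and Varchar are invented names:

```python
# Illustration of the HIVE-9587 fix: decode(binary, charset) should not
# reject a VARCHAR-typed charset. Coercing the argument to a plain
# string before decoding models the STRING_GROUP acceptance.
def hive_decode(data: bytes, charset) -> str:
    return data.decode(str(charset))

class Varchar:
    """Stand-in for a VARCHAR-typed value arriving from a query."""
    def __init__(self, value: str):
        self.value = value
    def __str__(self) -> str:
        return self.value

print(hive_decode(b"A", "utf-8"))           # A  (plain STRING charset)
print(hive_decode(b"A", Varchar("utf-8")))  # A  (VARCHAR charset now accepted)
```

Before the patch, the VARCHAR case raised the mismatched "Encode" error message quoted in the issue; the patch also corrects that message to say Decode.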
[jira] [Commented] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14312940#comment-14312940 ] Lefty Leverenz commented on HIVE-9039: -- Doc note: This should be documented in the UNION doc, with version information. The SELECT doc needs to be updated, because currently its link to the UNION doc says UNION ALL. Also, the section ALL and DISTINCT Clauses should link to the UNION doc (and 0.15 should be updated to 1.1.0.) * [LanguageManual -- Union | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union] * [LanguageManual -- Select | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select] ** [Select -- ALL and DISTINCT Clauses | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-ALLandDISTINCTClauses] ** [Select -- More Select Syntax -- UNION ALL | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select#LanguageManualSelect-UNIONALL] And how about a release note? Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Components: Query Planning Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch, HIVE-9039.15.patch, HIVE-9039.16.patch, HIVE-9039.17.patch, HIVE-9039.18.patch, HIVE-9039.19.patch, HIVE-9039.20.patch, HIVE-9039.21.patch, HIVE-9039.22.patch, HIVE-9039.23.patch CLEAR LIBRARY CACHE Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'
[ https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9350: - Labels: TODOC1.2 (was: ) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases' --- Key: HIVE-9350 URL: https://issues.apache.org/jira/browse/HIVE-9350 Project: Hive Issue Type: Bug Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, HIVE-9350.4.patch It should be possible for HiveAuthorizer implementations to control if a user is able to see a table or database in results of 'show tables' and 'show databases' respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'
[ https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14310610#comment-14310610 ] Lefty Leverenz commented on HIVE-9350: -- Doc note: This changes the description of *hive.metastore.filter.hook* which was created by HIVE-8612 in 1.1.0 (aka 0.15.0) and documented in the Metastore Administration wikidoc. So the wiki needs to be updated for 1.2.0. * [Hive Metastore Administration -- Additional Configuration Parameters | https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-AdditionalConfigurationParameters] Does this need any other documentation? Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases' --- Key: HIVE-9350 URL: https://issues.apache.org/jira/browse/HIVE-9350 Project: Hive Issue Type: Bug Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, HIVE-9350.4.patch It should be possible for HiveAuthorizer implementations to control if a user is able to see a table or database in results of 'show tables' and 'show databases' respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4639) Add has null flag to ORC internal index
[ https://issues.apache.org/jira/browse/HIVE-4639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14311024#comment-14311024 ] Lefty Leverenz commented on HIVE-4639: -- Doc note: [~prasanth_j] documented this in the ORC wiki. * [ORC -- Column Statistics | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-ColumnStatistics] But it says the hasNull flag is added in 1.2.0 -- shouldn't that be 1.1.0, since this jira's fix version is 0.15? Add has null flag to ORC internal index --- Key: HIVE-4639 URL: https://issues.apache.org/jira/browse/HIVE-4639 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Owen O'Malley Assignee: Prasanth Jayachandran Fix For: 0.15.0 Attachments: HIVE-4639.1.patch, HIVE-4639.2.patch, HIVE-4639.3.patch It would enable more predicate pushdown if we added a flag to the index entry recording if there were any null values in the column for the 10k rows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6099) Multi insert does not work properly with distinct count
[ https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-6099: - Labels: TODOC1.2 count distinct insert multi-insert (was: count distinct insert multi-insert)

Multi insert does not work properly with distinct count --- Key: HIVE-6099 URL: https://issues.apache.org/jira/browse/HIVE-6099 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0 Reporter: Pavan Gadam Manohar Assignee: Ashutosh Chauhan Labels: TODOC1.2, count, distinct, insert, multi-insert Fix For: 1.2.0 Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.3.patch, HIVE-6099.4.patch, HIVE-6099.patch, explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt

Need 2 rows to reproduce this bug. Here are the steps.

Step 1) Create a table Table_A:

CREATE EXTERNAL TABLE Table_A (
  user string,
  type int
)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS RCFILE
LOCATION '/hive/path/Table_A';

Step 2) Scenario: let us say user tommy belongs to both user types 111 and 123. Insert 2 records into the table created above.

hive> select * from table_a;
OK
tommy 123 2013-12-02
tommy 111 2013-12-02

Step 3) Create 2 destination tables to simulate multi-insert:

CREATE EXTERNAL TABLE dest_Table_A (
  p_date string,
  Distinct_Users int,
  Type111Users int,
  Type123Users int
)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS RCFILE
LOCATION '/hive/path/dest_Table_A';

CREATE EXTERNAL TABLE dest_Table_B (
  p_date string,
  Distinct_Users int,
  Type111Users int,
  Type123Users int
)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS RCFILE
LOCATION '/hive/path/dest_Table_B';

Step 4) Multi-insert statement:

from Table_A a
INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
select a.dt
  ,count(distinct a.user) as AllDist
  ,count(distinct case when a.type = 111 then a.user else null end) as Type111User
  ,count(distinct case when a.type != 111 then a.user else null end) as Type123User
group by a.dt
INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
select a.dt
  ,count(distinct a.user) as AllDist
  ,count(distinct case when a.type = 111 then a.user else null end) as Type111User
  ,count(distinct case when a.type != 111 then a.user else null end) as Type123User
group by a.dt;

Step 5) Verify results.

hive> select * from dest_table_a;
OK
2013-12-02 2 1 1 2013-12-02
Time taken: 0.116 seconds
hive> select * from dest_table_b;
OK
2013-12-02 2 1 1 2013-12-02
Time taken: 0.13 seconds

Conclusion: Hive gives a count of 2 for distinct users although there is only one distinct user. After trying many datasets, observed that Hive is doing Type111Users + Type123Users = DistinctUsers, which is wrong.

hive> select count(distinct a.user) from table_a a;
Total MapReduce CPU Time Spent: 4 seconds 350 msec
OK
1
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
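The wrong result described in the conclusion above can be checked outside Hive. A small Python sketch (illustrative only, using the two rows from the reproduction steps) contrasts what the query asks for with what the buggy plan effectively computed:

```python
# Rows of Table_A as (user, type) pairs, from the reproduction steps.
rows = [("tommy", 123), ("tommy", 111)]

# What the query asks for:
all_distinct = len({u for u, t in rows})         # count(distinct user)
type111 = len({u for u, t in rows if t == 111})  # distinct type-111 users
type123 = len({u for u, t in rows if t != 111})  # distinct non-111 users

# What the buggy plan effectively did: sum the partial distinct counts.
buggy_all_distinct = type111 + type123

print(all_distinct)        # 1 (correct)
print(buggy_all_distinct)  # 2 (the wrong value Hive returned)
```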
[jira] [Commented] (HIVE-3728) make optimizing multi-group by configurable
[ https://issues.apache.org/jira/browse/HIVE-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308848#comment-14308848 ] Lefty Leverenz commented on HIVE-3728: -- Doc note: This adds configuration parameter *hive.optimize.multigroupby.common.distincts* to HiveConf.java and the template file in version 0.11.0, so it needs to be documented in the wiki. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] HIVE-6099 removes *hive.optimize.multigroupby.common.distincts* in version 1.2.0. make optimizing multi-group by configurable --- Key: HIVE-3728 URL: https://issues.apache.org/jira/browse/HIVE-3728 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Labels: TODOC11 Fix For: 0.11.0 Attachments: hive.3728.2.patch, hive.3728.3.patch This was done as part of https://issues.apache.org/jira/browse/HIVE-609. This should be configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9564) Extend HIVE-9298 for JsonSerDe
[ https://issues.apache.org/jira/browse/HIVE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308872#comment-14308872 ] Lefty Leverenz commented on HIVE-9564: -- Doc note: There isn't much documentation for the JSON SerDe, but this could be mentioned (with version information) in the Hive SerDe wikidoc as well as the Timestamps section of Hive Data Types. * [SerDe Overview -- Third-party SerDes | https://cwiki.apache.org/confluence/display/Hive/SerDe#SerDe-Third-partySerDes] * [Hive Data Types -- Timestamps | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Timestamps] Extend HIVE-9298 for JsonSerDe -- Key: HIVE-9564 URL: https://issues.apache.org/jira/browse/HIVE-9564 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jason Dere Assignee: Jason Dere Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9564.1.patch Allow alternate timestamp formats for JsonSerDe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6099) Multi insert does not work properly with distinct count
[ https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308856#comment-14308856 ] Lefty Leverenz commented on HIVE-6099: -- Doc note: This removes configuration parameter *hive.optimize.multigroupby.common.distincts* from HiveConf.java in version 1.2.0. It was introduced in 0.11.0 by HIVE-3728 and hasn't been documented yet. The Configuration Properties wikidoc should specify that it exists in versions 0.11.0 through 1.1.0, with links to HIVE-3728 and this issue (HIVE-6099). * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]

Multi insert does not work properly with distinct count --- Key: HIVE-6099 URL: https://issues.apache.org/jira/browse/HIVE-6099 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0 Reporter: Pavan Gadam Manohar Assignee: Ashutosh Chauhan Labels: TODOC1.2, count, distinct, insert, multi-insert Fix For: 1.2.0 Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.3.patch, HIVE-6099.4.patch, HIVE-6099.patch, explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt

Need 2 rows to reproduce this bug. Here are the steps.

Step 1) Create a table Table_A:

CREATE EXTERNAL TABLE Table_A (
  user string,
  type int
)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS RCFILE
LOCATION '/hive/path/Table_A';

Step 2) Scenario: let us say user tommy belongs to both user types 111 and 123. Insert 2 records into the table created above.

hive> select * from table_a;
OK
tommy 123 2013-12-02
tommy 111 2013-12-02

Step 3) Create 2 destination tables to simulate multi-insert:

CREATE EXTERNAL TABLE dest_Table_A (
  p_date string,
  Distinct_Users int,
  Type111Users int,
  Type123Users int
)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS RCFILE
LOCATION '/hive/path/dest_Table_A';

CREATE EXTERNAL TABLE dest_Table_B (
  p_date string,
  Distinct_Users int,
  Type111Users int,
  Type123Users int
)
PARTITIONED BY (dt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS RCFILE
LOCATION '/hive/path/dest_Table_B';

Step 4) Multi-insert statement:

from Table_A a
INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02')
select a.dt
  ,count(distinct a.user) as AllDist
  ,count(distinct case when a.type = 111 then a.user else null end) as Type111User
  ,count(distinct case when a.type != 111 then a.user else null end) as Type123User
group by a.dt
INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02')
select a.dt
  ,count(distinct a.user) as AllDist
  ,count(distinct case when a.type = 111 then a.user else null end) as Type111User
  ,count(distinct case when a.type != 111 then a.user else null end) as Type123User
group by a.dt;

Step 5) Verify results.

hive> select * from dest_table_a;
OK
2013-12-02 2 1 1 2013-12-02
Time taken: 0.116 seconds
hive> select * from dest_table_b;
OK
2013-12-02 2 1 1 2013-12-02
Time taken: 0.13 seconds

Conclusion: Hive gives a count of 2 for distinct users although there is only one distinct user. After trying many datasets, observed that Hive is doing Type111Users + Type123Users = DistinctUsers, which is wrong.

hive> select count(distinct a.user) from table_a a;
Total MapReduce CPU Time Spent: 4 seconds 350 msec
OK
1
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-3728) make optimizing multi-group by configurable
[ https://issues.apache.org/jira/browse/HIVE-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-3728: - Labels: TODOC11 (was: ) make optimizing multi-group by configurable --- Key: HIVE-3728 URL: https://issues.apache.org/jira/browse/HIVE-3728 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Labels: TODOC11 Fix For: 0.11.0 Attachments: hive.3728.2.patch, hive.3728.3.patch This was done as part of https://issues.apache.org/jira/browse/HIVE-609. This should be configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9564) Extend HIVE-9298 for JsonSerDe
[ https://issues.apache.org/jira/browse/HIVE-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9564: - Labels: TODOC1.2 (was: ) Extend HIVE-9298 for JsonSerDe -- Key: HIVE-9564 URL: https://issues.apache.org/jira/browse/HIVE-9564 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jason Dere Assignee: Jason Dere Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9564.1.patch Allow alternate timestamp formats for JsonSerDe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7998) Enhance JDBC Driver to not require class specification
[ https://issues.apache.org/jira/browse/HIVE-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14306859#comment-14306859 ] Lefty Leverenz commented on HIVE-7998: -- Doc note: This should be documented (with version information) in the JDBC section of HiveServer2 Clients. * [HiveServer2 Clients -- Using JDBC | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-UsingJDBC] Enhance JDBC Driver to not require class specification -- Key: HIVE-7998 URL: https://issues.apache.org/jira/browse/HIVE-7998 Project: Hive Issue Type: New Feature Components: JDBC Reporter: Prateek Rungta Assignee: Alexander Pivovarov Priority: Trivial Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-7998.1.patch The HotSpot VM offers a way to avoid having to specify the driver class explicitly when using the JDBC driver. The DriverManager methods getConnection and getDrivers have been enhanced to support the Java Standard Edition Service Provider mechanism. JDBC 4.0 drivers must include the file META-INF/services/java.sql.Driver. This file contains the name of the JDBC driver's implementation of java.sql.Driver. For example, to load the my.sql.Driver class, the META-INF/services/java.sql.Driver file would contain the entry: `my.sql.Driver` Applications no longer need to explicitly load JDBC drivers using Class.forName(). Existing programs which currently load JDBC drivers using Class.forName() will continue to work without modification. via http://docs.oracle.com/javase/7/docs/api/java/sql/DriverManager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
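The service-provider mechanism described above hinges on a plain text file packaged inside the driver jar. As an illustration (the Hive driver class name shown is the standard org.apache.hive.jdbc.HiveDriver, not taken from this issue's patch), the provider-configuration file would look like:

```
# File: META-INF/services/java.sql.Driver inside the driver jar.
# One fully qualified java.sql.Driver implementation per line;
# lines starting with '#' are comments per the ServiceLoader spec.
org.apache.hive.jdbc.HiveDriver
```

With this file present on the classpath, DriverManager.getConnection("jdbc:hive2://...") can locate the driver without any prior Class.forName() call.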
[jira] [Updated] (HIVE-7998) Enhance JDBC Driver to not require class specification
[ https://issues.apache.org/jira/browse/HIVE-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7998: - Labels: TODOC1.2 (was: ) Enhance JDBC Driver to not require class specification -- Key: HIVE-7998 URL: https://issues.apache.org/jira/browse/HIVE-7998 Project: Hive Issue Type: New Feature Components: JDBC Reporter: Prateek Rungta Assignee: Alexander Pivovarov Priority: Trivial Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-7998.1.patch The HotSpot VM offers a way to avoid having to specify the driver class explicitly when using the JDBC driver. The DriverManager methods getConnection and getDrivers have been enhanced to support the Java Standard Edition Service Provider mechanism. JDBC 4.0 drivers must include the file META-INF/services/java.sql.Driver. This file contains the name of the JDBC driver's implementation of java.sql.Driver. For example, to load the my.sql.Driver class, the META-INF/services/java.sql.Driver file would contain the entry: `my.sql.Driver` Applications no longer need to explicitly load JDBC drivers using Class.forName(). Existing programs which currently load JDBC drivers using Class.forName() will continue to work without modification. via http://docs.oracle.com/javase/7/docs/api/java/sql/DriverManager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7175) Provide password file option to beeline
[ https://issues.apache.org/jira/browse/HIVE-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7175: - Labels: TODOC1.2 TODOC13 features security (was: TODOC1.2 features security) Provide password file option to beeline --- Key: HIVE-7175 URL: https://issues.apache.org/jira/browse/HIVE-7175 Project: Hive Issue Type: Improvement Components: CLI, Clients Affects Versions: 0.13.0 Reporter: Robert Justice Assignee: Dr. Wendell Urth Labels: TODOC1.2, TODOC13, features, security Fix For: 1.2.0 Attachments: HIVE-7175.1.patch, HIVE-7175.1.patch, HIVE-7175.2.patch, HIVE-7175.branch-13.patch, HIVE-7175.patch For people connecting to Hive Server 2 with LDAP authentication enabled, in order to batch run commands, we currently have to provide the password openly in the command line. They could use some expect scripting, but I think a valid improvement would be to provide a password file option similar to other CLI commands in hadoop (e.g. sqoop) to be more secure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7175) Provide password file option to beeline
[ https://issues.apache.org/jira/browse/HIVE-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7175: - Labels: TODOC1.2 features security (was: features security) Provide password file option to beeline --- Key: HIVE-7175 URL: https://issues.apache.org/jira/browse/HIVE-7175 Project: Hive Issue Type: Improvement Components: CLI, Clients Affects Versions: 0.13.0 Reporter: Robert Justice Assignee: Dr. Wendell Urth Labels: TODOC1.2, features, security Fix For: 1.2.0 Attachments: HIVE-7175.1.patch, HIVE-7175.1.patch, HIVE-7175.2.patch, HIVE-7175.branch-13.patch, HIVE-7175.patch For people connecting to Hive Server 2 with LDAP authentication enabled, in order to batch run commands, we currently have to provide the password openly in the command line. They could use some expect scripting, but I think a valid improvement would be to provide a password file option similar to other CLI commands in hadoop (e.g. sqoop) to be more secure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9572) Merge from Spark branch to trunk 02/03/2015
[ https://issues.apache.org/jira/browse/HIVE-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308283#comment-14308283 ] Lefty Leverenz commented on HIVE-9572: -- Doc note: This adds configuration parameter *hive.spark.client.rpc.sasl.mechanisms* to HiveConf.java in trunk (originally added to Spark branch by HIVE-9487). So it needs to be documented for version 1.2.0 in Configuration Properties and Hive on Spark: Getting Started. * [Hive on Spark: Getting Started | https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started] * [Configuration Properties -- Spark | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Spark] Merge from Spark branch to trunk 02/03/2015 --- Key: HIVE-9572 URL: https://issues.apache.org/jira/browse/HIVE-9572 Project: Hive Issue Type: Task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9572.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9273) Add option to fire metastore event on insert
[ https://issues.apache.org/jira/browse/HIVE-9273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308342#comment-14308342 ] Lefty Leverenz commented on HIVE-9273: -- Doc query: This adds the configuration parameter *hive.metastore.dml.events* to HiveConf.java -- doesn't it need to be documented in the Metastore section of Configuration Properties? * [Configuration Properties -- Metastore | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore] Add option to fire metastore event on insert Key: HIVE-9273 URL: https://issues.apache.org/jira/browse/HIVE-9273 Project: Hive Issue Type: New Feature Reporter: Alan Gates Assignee: Alan Gates Fix For: 1.2.0 Attachments: HIVE-9273.2.patch, HIVE-9273.patch HIVE-9271 adds the ability for the client to request firing metastore events. This can be used in the MoveTask to fire events when an insert is done that does not add partitions to a table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7175) Provide password file option to beeline
[ https://issues.apache.org/jira/browse/HIVE-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308221#comment-14308221 ] Lefty Leverenz commented on HIVE-7175: -- Doc note: This should be documented in the Beeline section of HiveServer2 Clients, with version information for 1.2.0 and a link to this issue for the 0.13 patch. * [HiveServer2 Clients -- Beeline Command Options | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions] Provide password file option to beeline --- Key: HIVE-7175 URL: https://issues.apache.org/jira/browse/HIVE-7175 Project: Hive Issue Type: Improvement Components: CLI, Clients Affects Versions: 0.13.0 Reporter: Robert Justice Assignee: Dr. Wendell Urth Labels: TODOC1.2, features, security Fix For: 1.2.0 Attachments: HIVE-7175.1.patch, HIVE-7175.1.patch, HIVE-7175.2.patch, HIVE-7175.branch-13.patch, HIVE-7175.patch For people connecting to Hive Server 2 with LDAP authentication enabled, in order to batch run commands, we currently have to provide the password openly in the command line. They could use some expect scripting, but I think a valid improvement would be to provide a password file option similar to other CLI commands in hadoop (e.g. sqoop) to be more secure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9572) Merge from Spark branch to trunk 02/03/2015
[ https://issues.apache.org/jira/browse/HIVE-9572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9572: - Labels: TODOC1.2 (was: ) Merge from Spark branch to trunk 02/03/2015 --- Key: HIVE-9572 URL: https://issues.apache.org/jira/browse/HIVE-9572 Project: Hive Issue Type: Task Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9572.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9487: - Labels: TODOC-SPARK (was: TODOC-SPARK TODOC15) Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Labels: TODOC-SPARK Fix For: spark-branch, 1.1.0 Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9453) Initial patch [hbase-metastore branch]
[ https://issues.apache.org/jira/browse/HIVE-9453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9453: - Labels: TODOC1.2 (was: ) Initial patch [hbase-metastore branch] -- Key: HIVE-9453 URL: https://issues.apache.org/jira/browse/HIVE-9453 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Alan Gates Assignee: Alan Gates Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9453-reviewcomments.pdf, HIVE-9453.2.patch, HIVE-9453.3.patch, HIVE-9453.patch This initial patch has several important features: # HBaseStore, a new implementation of RawStore that stores the data in HBase. # Subclasses of the thrift metastore objects to remove the massive duplication of data where every partition contains a nearly identical storage descriptor. # Caches for catalog objects and statistics so that repeated metastore calls don't result in repeated calls against HBase. Currently this works to the point that load table and select work. I have not tested any other statements, and I suspect most fail. There is no security, no authorization, and a lot of other things are still missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9453) Initial patch [hbase-metastore branch]
[ https://issues.apache.org/jira/browse/HIVE-9453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14308330#comment-14308330 ] Lefty Leverenz commented on HIVE-9453: -- Doc note: This adds three configuration properties to HiveConf.java in the HBase-metastore branch, so they will need to be documented in the wiki when the branch gets merged to trunk. Adding a TODOC-1.2 label because that's currently shown as the fix version. * hive.metastore.fastpath * hive.metastore.hbase.cache.size * hive.metastore.hbase.cache.ttl (I have some quibbles with the parameter descriptions, which can be fixed in the trunk merge.) Initial patch [hbase-metastore branch] -- Key: HIVE-9453 URL: https://issues.apache.org/jira/browse/HIVE-9453 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Alan Gates Assignee: Alan Gates Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9453-reviewcomments.pdf, HIVE-9453.2.patch, HIVE-9453.3.patch, HIVE-9453.patch This initial patch has several important features: # HBaseStore, a new implementation of RawStore that stores the data in HBase. # Subclasses of the thrift metastore objects to remove the massive duplication of data where every partition contains a nearly identical storage descriptor. # Caches for catalog objects and statistics so that repeated metastore calls don't result in repeated calls against HBase. Currently this works to the point that load table and select work. I have not tested any other statements, and I suspect most fail. There is no security, no authorization, and a lot of other things are still missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9521) Drop support for Java6
[ https://issues.apache.org/jira/browse/HIVE-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9521: - Labels: TODOC1.2 (was: ) Drop support for Java6 -- Key: HIVE-9521 URL: https://issues.apache.org/jira/browse/HIVE-9521 Project: Hive Issue Type: Improvement Reporter: Nick Dimiduk Assignee: Nick Dimiduk Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9521.00.patch As logical continuation of HIVE-4583, let's start using java7 syntax as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9298) Support reading alternate timestamp formats
[ https://issues.apache.org/jira/browse/HIVE-9298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302946#comment-14302946 ] Lefty Leverenz commented on HIVE-9298: -- Doc note: timestamp.formats can be documented in the Timestamps section of the Hive Data Types wikidoc. (At present we don't have a section on SerDe properties in the wiki, but perhaps we should.) * [Hive Data Types -- Timestamps | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Timestamps] * [SerDe Overview | https://cwiki.apache.org/confluence/display/Hive/SerDe#SerDe-SerDeOverview] Support reading alternate timestamp formats --- Key: HIVE-9298 URL: https://issues.apache.org/jira/browse/HIVE-9298 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Jason Dere Assignee: Jason Dere Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9298.1.patch, HIVE-9298.2.patch, HIVE-9298.3.patch There are some users who want to be able to parse ISO-8601 timestamps, as well as to set their own custom timestamp formats. We may be able to support this in LazySimpleSerDe through the use of a SerDe parameter to specify one or more alternative timestamp patterns to use to parse timestamp values from string. If we are doing this it might also be nice to work in support for HIVE-3844, to parse numeric strings as timestamp by treating the numeric value as millis since Unix epoch. This can be enabled through the SerDe params as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
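As a rough illustration of what "alternate timestamp patterns" plus the HIVE-3844 fallback means, here is a hedged Python sketch; Python's strptime codes merely stand in for the Java SimpleDateFormat-style patterns the SerDe property would take, and the function is illustrative, not Hive code:

```python
from datetime import datetime, timezone

def parse_timestamp(value, patterns):
    """Try each alternate pattern in order; if none match, fall back to
    treating a numeric string as milliseconds since the Unix epoch
    (the HIVE-3844 behavior mentioned in the description)."""
    for pattern in patterns:
        try:
            return datetime.strptime(value, pattern)
        except ValueError:
            pass
    return datetime.fromtimestamp(int(value) / 1000.0, tz=timezone.utc)

iso = parse_timestamp("2015-02-03T08:49:00", ["%Y-%m-%dT%H:%M:%S"])
millis = parse_timestamp("1422953340000", ["%Y-%m-%dT%H:%M:%S"])
print(iso)     # 2015-02-03 08:49:00
print(millis)  # 2015-02-03 08:49:00+00:00
```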
[jira] [Commented] (HIVE-9521) Drop support for Java6
[ https://issues.apache.org/jira/browse/HIVE-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302921#comment-14302921 ] Lefty Leverenz commented on HIVE-9521: -- Doc note: The wiki needs to be updated in two places. * [Getting Started -- Requirements | https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-Requirements] * [Installing Hive | https://cwiki.apache.org/confluence/display/Hive/AdminManual+Installation#AdminManualInstallation-InstallingHive] Drop support for Java6 -- Key: HIVE-9521 URL: https://issues.apache.org/jira/browse/HIVE-9521 Project: Hive Issue Type: Improvement Reporter: Nick Dimiduk Assignee: Nick Dimiduk Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9521.00.patch As logical continuation of HIVE-4583, let's start using java7 syntax as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9298) Support reading alternate timestamp formats
[ https://issues.apache.org/jira/browse/HIVE-9298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9298: - Labels: TODOC1.2 (was: ) Support reading alternate timestamp formats --- Key: HIVE-9298 URL: https://issues.apache.org/jira/browse/HIVE-9298 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Jason Dere Assignee: Jason Dere Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9298.1.patch, HIVE-9298.2.patch, HIVE-9298.3.patch There are some users who want to be able to parse ISO-8601 timestamps, as well as to set their own custom timestamp formats. We may be able to support this in LazySimpleSerDe through the use of a SerDe parameter to specify one or more alternative timestamp patterns to use to parse timestamp values from string. If we are doing this it might also be nice to work in support for HIVE-3844, to parse numeric strings as timestamp by treating the numeric value as millis since Unix epoch. This can be enabled through the SerDe params as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-5472) support a simple scalar which returns the current timestamp
[ https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302905#comment-14302905 ] Lefty Leverenz edited comment on HIVE-5472 at 2/3/15 8:49 AM: -- Does this need any documentation? _Edit:_ [~jdere] documented current_date and current_timestamp in the UDFs wikidoc, so the only doc question is whether the changes to some hive.test.* parameter definitions should be explained somewhere in the wiki. (The changes just added false after parameter descriptions, so my guess is that no doc is needed.) * [Hive Operators and UDFs -- Date Functions | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions] * [Configuration Properties -- Test Properties | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TestProperties] was (Author: le...@hortonworks.com): Does this need any documentation? support a simple scalar which returns the current timestamp --- Key: HIVE-5472 URL: https://issues.apache.org/jira/browse/HIVE-5472 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0 Reporter: N Campbell Assignee: Jason Dere Fix For: 1.2.0 Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch, HIVE-5472.4.patch ISO-SQL has two forms of functions local and current timestamp where the former is a TIMESTAMP WITHOUT TIMEZONE and the latter with TIME ZONE select cast ( unix_timestamp() as timestamp ) from T implement a function which computes LOCAL TIMESTAMP which would be the current timestamp for the users session time zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.
[ https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14304438#comment-14304438 ] Lefty Leverenz commented on HIVE-9500: -- Thanks for the javadoc changes, [~aihuaxu]. Now that you've changed the semicolon, I realize the description was better the original way (a three-part sentence giving each possible case) so if you're making other changes you might want to revert to the semicolon ... but then also make If lowercase in the next clause. Here's a new version with a few other edits: {code} + * To be backward-compatible, initialize the first 3 separators to + * be the given values. The default number of separators is 8; if only + * hive.serialization.extend.nesting.levels is set, extend the number of + * separators to 24; if hive.serialization.extend.additional.nesting.levels + * is set, extend the number of separators to 154. {code} By the way, in the first sentence what does the given values mean? Wait ... I just noticed that the doc comments are on a private method. So there won't be any javadocs. Support nested structs over 24 levels. -- Key: HIVE-9500 URL: https://issues.apache.org/jira/browse/HIVE-9500 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Labels: SerDe Fix For: 1.2.0 Attachments: HIVE-9500.1.patch, HIVE-9500.2.patch Customer has deeply nested avro structure and is receiving the following error when performing queries. 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting supported for LazySimpleSerde is 23 Unable to work with level 24 Currently we support up to 24 levels of nested structs when hive.serialization.extend.nesting.levels is set to true, while the customers have the requirement to support more than that. It would be better to make the supported levels configurable or completely removed (i.e., we can support any number of levels). 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9500) Support nested structs over 24 levels.
[ https://issues.apache.org/jira/browse/HIVE-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302595#comment-14302595 ] Lefty Leverenz commented on HIVE-9500: -- Doc note: hive.serialization.extend.nesting.levels and hive.serialization.extend.additional.nesting.levels are SerDe properties, not HiveConf properties. Their documentation should go in the Hive Types wikidoc, either in a new section for structs or in the existing Complex Types section, and perhaps in the DDL wikidoc. (One of these days we should add a section on SerDe properties to the Hive SerDes wikidoc.) * [Hive Data Types -- Complex Types | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-ComplexTypes] * [DDL -- Create Table -- Row Format, Storage Format, and SerDe | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RowFormat,StorageFormat,andSerDe] * [DDL -- Alter Table -- Add SerDe Properties | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AddSerDeProperties] * [Hive SerDes | https://cwiki.apache.org/confluence/display/Hive/SerDe] Javadoc descriptions for the properties are in serde/src/java/org/apache/hadoop/hive/serde2/SerDeParameters.java: {code} + /** +* To be backward-compatible, initialize the first 3 separators to + * be the given values. Default number of separators to be 8; If only + * hive.serialization.extend.nesting.levels is set, extend the number of + * separators to be 24; if hive.serialization.extend.additional.nesting.levels + * is set, extend the number of separators to 154. +* @param tbl +*/ {code} Editorial review: Please align the asterisks and change the first semicolon to a period (to be 8; If only...). Support nested structs over 24 levels. 
-- Key: HIVE-9500 URL: https://issues.apache.org/jira/browse/HIVE-9500 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Labels: SerDe Fix For: 1.2.0 Attachments: HIVE-9500.1.patch Customer has deeply nested avro structure and is receiving the following error when performing queries. 15/01/09 20:59:29 ERROR ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting supported for LazySimpleSerde is 23 Unable to work with level 24 Currently we support up to 24 levels of nested structs when hive.serialization.extend.nesting.levels is set to true, while the customers have the requirement to support more than that. It would be better to make the supported levels configurable or completely removed (i.e., we can support any number of levels). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
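Since the two nesting-level properties are SerDe properties rather than HiveConf properties, they would be set per table. A hedged sketch (table and column names are invented for illustration):

{code:sql}
-- Sketch: extend LazySimpleSerDe separators for a deeply nested struct,
-- per the javadoc quoted above (24 levels, or 154 with the additional property).
CREATE TABLE deep_struct (s STRUCT<a:STRUCT<b:STRUCT<c:STRING>>>)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ('hive.serialization.extend.additional.nesting.levels'='true');
{code}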
[jira] [Commented] (HIVE-5472) support a simple scalar which returns the current timestamp
[ https://issues.apache.org/jira/browse/HIVE-5472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14302905#comment-14302905 ] Lefty Leverenz commented on HIVE-5472: -- Does this need any documentation? support a simple scalar which returns the current timestamp --- Key: HIVE-5472 URL: https://issues.apache.org/jira/browse/HIVE-5472 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0 Reporter: N Campbell Assignee: Jason Dere Fix For: 1.2.0 Attachments: HIVE-5472.1.patch, HIVE-5472.2.patch, HIVE-5472.3.patch, HIVE-5472.4.patch ISO-SQL has two forms of functions local and current timestamp where the former is a TIMESTAMP WITHOUT TIMEZONE and the latter with TIME ZONE select cast ( unix_timestamp() as timestamp ) from T implement a function which computes LOCAL TIMESTAMP which would be the current timestamp for the users session time zone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9492) Enable caching in MapInput for Spark
[ https://issues.apache.org/jira/browse/HIVE-9492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300846#comment-14300846 ] Lefty Leverenz commented on HIVE-9492: -- Doc review: Please capitalize Hive and Spark in the parameter description, and change the initial If to Whether. And shouldn't mapinput be MapInput? {code} +HIVE_CACHE_MAPINPUT("hive.exec.cache.mapinput", false, + "Whether Hive (in Spark mode only) should cache MapInput if it applies."), {code} Enable caching in MapInput for Spark Key: HIVE-9492 URL: https://issues.apache.org/jira/browse/HIVE-9492 Project: Hive Issue Type: Bug Components: Spark Reporter: Xuefu Zhang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-9492.1-spark.patch, HIVE-9492.2-spark.patch, prototype.patch Because of the IOContext problem (HIVE-8920, HIVE-9084), RDD caching is currently disabled in MapInput. Prototyping shows that the problem can be solved. Thus, we should formalize the prototype and enable the caching. A good query to test this is: {code} from (select * from dec union all select * from dec2) s insert overwrite table dec3 select s.name, sum(s.value) group by s.name insert overwrite table dec4 select s.name, s.value order by s.value; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8528) Add remote Spark client to Hive [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299721#comment-14299721 ] Lefty Leverenz commented on HIVE-8528: -- Doc note: [~szehon] added a Remote Spark Drive section to Configuration Properties with a nice overview. (Thanks, Szehon.) * [Configuration Properties -- Remote Spark Driver | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-RemoteSparkDriver] Add remote Spark client to Hive [Spark Branch] -- Key: HIVE-8528 URL: https://issues.apache.org/jira/browse/HIVE-8528 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Labels: TODOC-SPARK Fix For: spark-branch Attachments: HIVE-8528.1-spark-client.patch, HIVE-8528.1-spark.patch, HIVE-8528.2-spark.patch, HIVE-8528.2-spark.patch, HIVE-8528.3-spark.patch For the time being, at least, we've decided to build the Spark client (see SPARK-3215) inside Hive. This task tracks merging the ongoing work into the Spark branch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8807) Obsolete default values in webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299731#comment-14299731 ] Lefty Leverenz commented on HIVE-8807: -- Thanks [~ekoifman] and [~thejas] and [~vikram.dixit]. What can be done to make sure webhcat-default.xml gets updated in future releases? Is there a release checklist for things like this? Obsolete default values in webhcat-default.xml -- Key: HIVE-8807 URL: https://issues.apache.org/jira/browse/HIVE-8807 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Lefty Leverenz Assignee: Eugene Koifman Fix For: 1.0.0 Attachments: HIVE8807.patch The defaults for templeton.pig.path templeton.hive.path are 0.11 in webhcat-default.xml but they ought to match current release numbers. The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml). no precommit tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9519) Bump up spark client connection timeout
[ https://issues.apache.org/jira/browse/HIVE-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299483#comment-14299483 ] Lefty Leverenz commented on HIVE-9519: -- Doc note: This changes the default value of *hive.spark.client.server.connect.timeout* in HiveConf.java, so the wiki needs to be updated. * [ConfigurationProperties -- hive.spark.client.server.connect.timeout | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.spark.client.server.connect.timeout] _TLDR:_ Since it's a new parameter, version information won't be needed for the change of default. However the wiki currently says this parameter was created in 1.2.0 although Spark merged to trunk in 0.15.0 aka 1.1.0 (HIVE-9448). That's why this issue committed patches to 1.1.0 as well as 1.2.0, so the 1.2.0 version in the wiki needs to be changed to 1.1.0. Bump up spark client connection timeout --- Key: HIVE-9519 URL: https://issues.apache.org/jira/browse/HIVE-9519 Project: Hive Issue Type: Bug Components: Spark Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Priority: Blocker Labels: TODOC15 Fix For: 1.1.0 Attachments: HIVE-9519.patch Yarn apparently needs longer than current timeout. Bumping it up to 90 secs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
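For reference, a site-level override of the bumped default would look like this in hive-site.xml (a sketch; the value shown is the new 90-second default expressed in milliseconds, which should be confirmed against HiveConf.java):

{code:xml}
<property>
  <name>hive.spark.client.server.connect.timeout</name>
  <value>90000ms</value>
</property>
{code}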
[jira] [Updated] (HIVE-9519) Bump up spark client connection timeout
[ https://issues.apache.org/jira/browse/HIVE-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9519: - Labels: TODOC15 (was: ) Bump up spark client connection timeout --- Key: HIVE-9519 URL: https://issues.apache.org/jira/browse/HIVE-9519 Project: Hive Issue Type: Bug Components: Spark Reporter: Ashish Kumar Singh Assignee: Ashish Kumar Singh Priority: Blocker Labels: TODOC15 Fix For: 1.1.0 Attachments: HIVE-9519.patch Yarn apparently needs longer than current timeout. Bumping it up to 90 secs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9482) Hive parquet timestamp compatibility
[ https://issues.apache.org/jira/browse/HIVE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299695#comment-14299695 ] Lefty Leverenz commented on HIVE-9482: -- Got your back, [~szehon] -- I changed the version number from 0.14.0 to 1.2.0 (probably a copy-paste error). * [hive.parquet.timestamp.skip.conversion | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.parquet.timestamp.skip.conversion] Hive parquet timestamp compatibility Key: HIVE-9482 URL: https://issues.apache.org/jira/browse/HIVE-9482 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 1.2.0 Attachments: HIVE-9482.2.patch, HIVE-9482.patch, HIVE-9482.patch, parquet_external_time.parq In the current Hive implementation, timestamps are stored in UTC (converted from the current timezone), based on the original parquet timestamp spec. However, we find this is not compatible with other tools, and after some investigation it is not the way of the other file formats, or even some databases (Hive Timestamp is more the equivalent of a 'timestamp without timezone' datatype). This is the first part of the fix, which will restore compatibility with parquet-timestamp files generated by external tools by skipping conversion on reading. A later fix will change the write path to not convert, and stop the read-conversion even for files written by Hive itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9487: - Labels: TODOC-SPARK TODOC15 (was: TODOC-SPARK) Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Labels: TODOC-SPARK, TODOC15 Fix For: spark-branch, 1.1.0 Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298397#comment-14298397 ] Lefty Leverenz commented on HIVE-9487: -- Doc note: This adds configuration parameter *hive.spark.client.rpc.sasl.mechanisms* to HiveConf.java in the Spark branch. So when it gets merged to trunk, it will need to be documented in Configuration Properties and Hive on Spark: Getting Started. * [Hive on Spark: Getting Started | https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started] * [Configuration Properties -- Spark | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Spark] Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Labels: TODOC-SPARK Fix For: spark-branch Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9487) Make Remote Spark Context secure [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9487: - Labels: TODOC-SPARK (was: ) Make Remote Spark Context secure [Spark Branch] --- Key: HIVE-9487 URL: https://issues.apache.org/jira/browse/HIVE-9487 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Labels: TODOC-SPARK Fix For: spark-branch Attachments: HIVE-9487.1-spark.patch, HIVE-9487.2-spark.patch The RSC currently uses an ad-hoc, insecure authentication mechanism. We should instead use a proper auth mechanism and add encryption to the mix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296557#comment-14296557 ] Lefty Leverenz commented on HIVE-9489: -- +1 ... although two more quibbles could be fixed (in "@return true if the udf is deterministic", udf should be UDF; and "non deterministic" should be "non-deterministic"). Sorry I missed them the first time. add javadoc for UDFType annotation -- Key: HIVE-9489 URL: https://issues.apache.org/jira/browse/HIVE-9489 Project: Hive Issue Type: Bug Components: Documentation, UDF Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.2.0 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch It is not clearly described when a UDF should be marked as deterministic, stateful or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted
[ https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297960#comment-14297960 ] Lefty Leverenz commented on HIVE-8966: -- Does this also need to be checked into branch-1.1 (formerly known as 0.15)? Delta files created by hive hcatalog streaming cannot be compacted -- Key: HIVE-8966 URL: https://issues.apache.org/jira/browse/HIVE-8966 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.14.0 Environment: hive Reporter: Jihong Liu Assignee: Alan Gates Priority: Critical Fix For: 1.0.0 Attachments: HIVE-8966-branch-1.patch, HIVE-8966.2.patch, HIVE-8966.3.patch, HIVE-8966.4.patch, HIVE-8966.5.patch, HIVE-8966.6.patch, HIVE-8966.patch Hive hcatalog streaming will also create a file like bucket_n_flush_length in each delta directory, where n is the bucket number. But compactor.CompactorMR thinks this file also needs to be compacted. However, this file of course cannot be compacted, so compactor.CompactorMR will not continue with the compaction. Did a test: after removing the bucket_n_flush_length file, the alter table partition compact finished successfully. If we don't delete that file, nothing will be compacted. This is probably a very severe bug. Both 0.13 and 0.14 have this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9473) sql std auth should disallow built-in udfs that allow any java methods to be called
[ https://issues.apache.org/jira/browse/HIVE-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298014#comment-14298014 ] Lefty Leverenz commented on HIVE-9473: -- Should this be documented in the SQL Standard Based Hive Authorization wikidoc (along with the configuration parameters created in HIVE-8893 -- *hive.server2.builtin.udf.whitelist* *hive.server2.builtin.udf.blacklist*)? * [SQL Standard Based Hive Authorization | https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization] ** [Configuration | https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization#SQLStandardBasedHiveAuthorization-Configuration] sql std auth should disallow built-in udfs that allow any java methods to be called --- Key: HIVE-9473 URL: https://issues.apache.org/jira/browse/HIVE-9473 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.0.0, 1.2.0 Attachments: HIVE-9473.1.patch As mentioned in HIVE-8893, some udfs can be used to execute arbitrary java methods. This should be disallowed when sql standard authorization is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
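As a sketch of how the whitelist/blacklist parameters from HIVE-8893 might be applied in hive-site.xml (the UDF names listed are my assumption about which built-ins can invoke arbitrary Java methods, not taken from the patch):

{code:xml}
<!-- Sketch: block built-in UDFs that can call arbitrary Java methods. -->
<property>
  <name>hive.server2.builtin.udf.blacklist</name>
  <value>reflect,reflect2,java_method</value>
</property>
{code}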
[jira] [Updated] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9489: - Labels: TODOC1.2 (was: ) add javadoc for UDFType annotation -- Key: HIVE-9489 URL: https://issues.apache.org/jira/browse/HIVE-9489 Project: Hive Issue Type: Bug Components: Documentation, UDF Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch, HIVE-9489.3.patch It is not clearly described, when a UDF should be marked as deterministic, stateful or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297852#comment-14297852 ] Lefty Leverenz commented on HIVE-9489: -- +1 Thanks for indulging my nitpicks, Thejas. add javadoc for UDFType annotation -- Key: HIVE-9489 URL: https://issues.apache.org/jira/browse/HIVE-9489 Project: Hive Issue Type: Bug Components: Documentation, UDF Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.2.0 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch, HIVE-9489.3.patch It is not clearly described, when a UDF should be marked as deterministic, stateful or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14297852#comment-14297852 ] Lefty Leverenz edited comment on HIVE-9489 at 1/29/15 11:05 PM: +1 Thanks for indulging my nitpicks, Thejas. _Edit:_ Oops, voted after the commit. Well then I'll just add a TODOC1.2 label to make this comment useful. was (Author: le...@hortonworks.com): +1 Thanks for indulging my nitpicks, Thejas. add javadoc for UDFType annotation -- Key: HIVE-9489 URL: https://issues.apache.org/jira/browse/HIVE-9489 Project: Hive Issue Type: Bug Components: Documentation, UDF Reporter: Thejas M Nair Assignee: Thejas M Nair Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9489.1.patch, HIVE-9489.2.patch, HIVE-9489.3.patch It is not clearly described, when a UDF should be marked as deterministic, stateful or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9489) add javadoc for UDFType annotation
[ https://issues.apache.org/jira/browse/HIVE-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294894#comment-14294894 ] Lefty Leverenz commented on HIVE-9489: -- Hmph. Not many typos for me to find. ;) {{+ * Certain optimizations should not be applied if UDF is not deterministic}} ... needs a period at end of line. {{+ * don't apply for such UDFS, as they need to be invoked for each record.}} ... UDFs, not UDFS. {{+ * A UDF is considered distinctLike if the udf can be evaluated on just the}} ... udf should be UDF. add javadoc for UDFType annotation -- Key: HIVE-9489 URL: https://issues.apache.org/jira/browse/HIVE-9489 Project: Hive Issue Type: Bug Components: Documentation, UDF Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.2.0 Attachments: HIVE-9489.1.patch It is not clearly described, when a UDF should be marked as deterministic, stateful or distinctLike. Adding javadoc for now. This information should also be incorporated in the wikidoc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted
[ https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294911#comment-14294911 ] Lefty Leverenz commented on HIVE-8966: -- Any documentation needed? Delta files created by hive hcatalog streaming cannot be compacted -- Key: HIVE-8966 URL: https://issues.apache.org/jira/browse/HIVE-8966 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.14.0 Environment: hive Reporter: Jihong Liu Assignee: Alan Gates Priority: Critical Fix For: 1.0.0 Attachments: HIVE-8966-branch-1.patch, HIVE-8966.2.patch, HIVE-8966.3.patch, HIVE-8966.4.patch, HIVE-8966.5.patch, HIVE-8966.6.patch, HIVE-8966.patch hive hcatalog streaming will also create a file like bucket_n_flush_length in each delta directory. Where n is the bucket number. But the compactor.CompactorMR think this file also needs to compact. However this file of course cannot be compacted, so compactor.CompactorMR will not continue to do the compaction. Did a test, after removed the bucket_n_flush_length file, then the alter table partition compact finished successfully. If don't delete that file, nothing will be compacted. This is probably a very severity bug. Both 0.13 and 0.14 have this issue -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9460) LLAP: Fix some static vars in the operator pipeline
[ https://issues.apache.org/jira/browse/HIVE-9460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294854#comment-14294854 ] Lefty Leverenz commented on HIVE-9460: -- Doc note: This adds configuration parameter *hive.execution.mode* to HiveConf.java, so it will need to be documented in the wiki when the LLAP branch gets merged to trunk. Should we add a TODOC-LLAP label to keep track of these doc issues? LLAP: Fix some static vars in the operator pipeline --- Key: HIVE-9460 URL: https://issues.apache.org/jira/browse/HIVE-9460 Project: Hive Issue Type: Sub-task Affects Versions: llap Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-9460.1.patch There are a few static vars left in the operator pipeline. Can't have those with multi-threaded execution... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7935) Support dynamic service discovery for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295931#comment-14295931 ] Lefty Leverenz commented on HIVE-7935: -- Another doc note: Preliminary documentation is available in HiveServer2DynamicServiceDiscovery.pdf, which is attached to the umbrella jira HIVE-8376. * [HiveServer2DynamicServiceDiscovery.pdf | https://issues.apache.org/jira/secure/attachment/12673832/HiveServer2DynamicServiceDiscovery.pdf] Support dynamic service discovery for HiveServer2 - Key: HIVE-7935 URL: https://issues.apache.org/jira/browse/HIVE-7935 Project: Hive Issue Type: Sub-task Components: HiveServer2, JDBC Affects Versions: 0.14.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-7935.1.patch, HIVE-7935.2.patch, HIVE-7935.3.patch, HIVE-7935.4.patch, HIVE-7935.5.patch, HIVE-7935.6.patch, HIVE-7935.7.patch, HIVE-7935.8.patch To support Rolling Upgrade / HA, we need a mechanism by which a JDBC client can dynamically resolve a HiveServer2 instance to connect to. *High Level Design:* Whether dynamic service discovery is supported or not can be configured by setting HIVE_SERVER2_SUPPORT_DYNAMIC_SERVICE_DISCOVERY. ZooKeeper is used to support this. * When an instance of HiveServer2 comes up, it adds itself as a znode to ZooKeeper under a configurable namespace (HIVE_SERVER2_ZOOKEEPER_NAMESPACE). * A JDBC/ODBC client now specifies the ZooKeeper ensemble in its connection string, instead of pointing to a specific HiveServer2 instance. The JDBC driver uses the ZooKeeper ensemble to pick an instance of HiveServer2 to connect to for the entire session. * When an instance is removed from ZooKeeper, the existing client sessions continue till completion. When the last client session completes, the instance shuts down. * All new client connections pick one of the available HiveServer2 URIs from ZooKeeper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
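Based on the design described above, a client would point its connection string at the ZooKeeper ensemble instead of a single server. A rough sketch (hosts and namespace are invented; the parameter names follow the attached design doc and may differ in the final implementation):

{code}
-- Sketch of a service-discovery JDBC URL
jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
{code}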
[jira] [Commented] (HIVE-2949) bug with describe and reserved keyword
[ https://issues.apache.org/jira/browse/HIVE-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293155#comment-14293155 ] Lefty Leverenz commented on HIVE-2949: -- According to [~navis]'s comment on HIVE-6187, backticks worked for DESCRIBE in trunk on 13/Jan/2014, so that means the fix was included in Hive 0.13.0, released 21/Apr/2014. It may have been fixed by HIVE-6013 (Supporting Quoted Identifiers in Column Names), committed to trunk 19/Dec/2013. bug with describe and reserved keyword -- Key: HIVE-2949 URL: https://issues.apache.org/jira/browse/HIVE-2949 Project: Hive Issue Type: Bug Reporter: Fabian Alenius Priority: Minor Hi, there seems to be an issue with the describe statement and using reserved keywords for table names. Specifically, describe does not seem to work on a table called 'user' even though it's escaped in the query. So this works: show partitions `user`; But this does not work: describe `user`; FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask -- This message was sent by Atlassian JIRA (v6.3.4#6332)
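Per the comment above, in Hive 0.13.0 and later the reported failure case should succeed:

{code:sql}
SHOW PARTITIONS `user`;
DESCRIBE `user`;   -- works in 0.13.0+
{code}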
[jira] [Commented] (HIVE-6187) Add test to verify that DESCRIBE TABLE works with quoted table names
[ https://issues.apache.org/jira/browse/HIVE-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293242#comment-14293242 ] Lefty Leverenz commented on HIVE-6187: -- Documentation done, please review the last information box (Bug fixed in Hive 0.13.0 — quoted identifiers) in this section: * [Language Manual -- DDL -- Describe Table/View/Column | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-DescribeTable/View/Column] Add test to verify that DESCRIBE TABLE works with quoted table names Key: HIVE-6187 URL: https://issues.apache.org/jira/browse/HIVE-6187 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Andy Mok Assignee: Carl Steinbach Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-6187.1.patch Backticks around tables named after special keywords, such as items, allow us to create, drop, and alter the table. For example {code:sql} CREATE TABLE foo.`items` (bar INT); DROP TABLE foo.`items`; ALTER TABLE `items` RENAME TO `items_`; {code} However, we cannot call {code:sql} DESCRIBE foo.`items`; DESCRIBE `items`; {code} The DESCRIBE query does not permit backticks to surround table names. The error returned is {code:sql} FAILED: SemanticException [Error 10001]: Table not found `items` {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)