[jira] [Updated] (HIVE-6333) Generate vectorized plan for decimal expressions.
[ https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6333:
    Description: Transform non-vector plan to vectorized plan for supported decimal expressions.

Generate vectorized plan for decimal expressions.
    Key: HIVE-6333  URL: https://issues.apache.org/jira/browse/HIVE-6333
    Project: Hive  Issue Type: Sub-task
    Reporter: Jitendra Nath Pandey  Assignee: Jitendra Nath Pandey
    Attachments: HIVE-6333.1.patch

Transform non-vector plan to vectorized plan for supported decimal expressions.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6333) Generate vectorized plan for decimal expressions.
[ https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6333:
    Attachment: HIVE-6333.1.patch

An early version of the patch.

Generate vectorized plan for decimal expressions.
    Key: HIVE-6333  URL: https://issues.apache.org/jira/browse/HIVE-6333
    Project: Hive  Issue Type: Sub-task
    Reporter: Jitendra Nath Pandey  Assignee: Jitendra Nath Pandey
    Attachments: HIVE-6333.1.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-6356:
    Attachment: HIVE-6356.1.patch.txt

Dependency injection in hbase storage handler is broken
    Key: HIVE-6356  URL: https://issues.apache.org/jira/browse/HIVE-6356
    Project: Hive  Issue Type: Bug  Components: HBase Handler
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: HIVE-6356.1.patch.txt

Dependent jars for hbase are not added to tmpjars, which is caused by a change in the method signature (TableMapReduceUtil.addDependencyJars).

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6356) Dependency injection in hbase storage handler is broken
Navis created HIVE-6356:
    Summary: Dependency injection in hbase storage handler is broken
    Key: HIVE-6356  URL: https://issues.apache.org/jira/browse/HIVE-6356
    Project: Hive  Issue Type: Bug  Components: HBase Handler
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: HIVE-6356.1.patch.txt

Dependent jars for hbase are not added to tmpjars, which is caused by a change in the method signature (TableMapReduceUtil.addDependencyJars).

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-6356:
    Status: Patch Available (was: Open)

Dependency injection in hbase storage handler is broken
    Key: HIVE-6356  URL: https://issues.apache.org/jira/browse/HIVE-6356
    Project: Hive  Issue Type: Bug  Components: HBase Handler
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: HIVE-6356.1.patch.txt

Dependent jars for hbase are not added to tmpjars, which is caused by a change in the method signature (TableMapReduceUtil.addDependencyJars).

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
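The HIVE-6356 description attributes the breakage to a changed signature of TableMapReduceUtil.addDependencyJars. A common way to survive such a change is reflection-based dispatch: probe for the newer overload and fall back to the older one. A minimal sketch of that pattern, using stand-in classes rather than the real HBase utility or the actual Hive patch:

```java
import java.lang.reflect.Method;

public class DependencyJarsShim {
    // Stand-ins for two releases of a utility class whose static method
    // signature changed between versions (names are illustrative only).
    public static class UtilV1 {
        public static String addDependencyJars(String conf) { return "v1:" + conf; }
    }
    public static class UtilV2 {
        public static String addDependencyJars(String conf, boolean addHBaseJars) { return "v2:" + conf; }
    }

    // Dispatch to whichever overload the class on the classpath provides:
    // try the newer two-argument signature first, then fall back to the old one.
    public static String call(Class<?> util, String conf) {
        try {
            Method m = util.getMethod("addDependencyJars", String.class, boolean.class);
            return (String) m.invoke(null, conf, true);
        } catch (NoSuchMethodException e) {
            try {
                Method m = util.getMethod("addDependencyJars", String.class);
                return (String) m.invoke(null, conf);
            } catch (ReflectiveOperationException e2) {
                throw new IllegalStateException(e2);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call(UtilV1.class, "job")); // v1:job
        System.out.println(call(UtilV2.class, "job")); // v2:job
    }
}
```

The cost of the reflective lookup is paid once per job setup, which is why this shim style is acceptable where a hard compile-time dependency on one HBase version is not.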
[jira] [Updated] (HIVE-6047) Permanent UDFs in Hive
[ https://issues.apache.org/jira/browse/HIVE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Dere updated HIVE-6047:
    Attachment: PermanentFunctionsinHive.pdf

Updating the doc to change the jar/file management. Rather than the idea of jar sets, each jar/file would be created as a separate resource and referenced by the UDF. This would make the metastore changes a bit simpler.

Permanent UDFs in Hive
    Key: HIVE-6047  URL: https://issues.apache.org/jira/browse/HIVE-6047
    Project: Hive  Issue Type: Bug  Components: UDF
    Reporter: Jason Dere  Assignee: Jason Dere
    Attachments: PermanentFunctionsinHive.pdf, PermanentFunctionsinHive.pdf

Currently Hive only supports temporary UDFs, which must be re-registered when starting up a Hive session. Provide some support to register permanent UDFs with Hive.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889299#comment-13889299 ]

Hive QA commented on HIVE-4144:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12626585/HIVE-4144.12.patch.txt

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 4999 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1159/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1159/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12626585

Add select database() command to show the current database
    Key: HIVE-4144  URL: https://issues.apache.org/jira/browse/HIVE-4144
    Project: Hive  Issue Type: Bug  Components: SQL
    Reporter: Mark Grover  Assignee: Navis
    Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch

A recent hive-user mailing list conversation asked about having a command to show the current database.
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E

MySQL seems to have a command to do so:
{code}
select database();
{code}
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database

We should look into having something similar in Hive.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889308#comment-13889308 ]

Navis commented on HIVE-4144:

Rebased to trunk

Add select database() command to show the current database
    Key: HIVE-4144  URL: https://issues.apache.org/jira/browse/HIVE-4144
    Project: Hive  Issue Type: Bug  Components: SQL
    Reporter: Mark Grover  Assignee: Navis
    Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch

A recent hive-user mailing list conversation asked about having a command to show the current database.
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E

MySQL seems to have a command to do so:
{code}
select database();
{code}
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database

We should look into having something similar in Hive.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-4144:
    Attachment: HIVE-4144.13.patch.txt

Add select database() command to show the current database
    Key: HIVE-4144  URL: https://issues.apache.org/jira/browse/HIVE-4144
    Project: Hive  Issue Type: Bug  Components: SQL
    Reporter: Mark Grover  Assignee: Navis
    Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch

A recent hive-user mailing list conversation asked about having a command to show the current database.
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E

MySQL seems to have a command to do so:
{code}
select database();
{code}
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database

We should look into having something similar in Hive.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6267) Explain explain
[ https://issues.apache.org/jira/browse/HIVE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889311#comment-13889311 ]

Navis commented on HIVE-6267:

It's breaking all of the pending patches. Is this that good?

Explain explain
    Key: HIVE-6267  URL: https://issues.apache.org/jira/browse/HIVE-6267
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Fix For: 0.13.0
    Attachments: HIVE-6267.1.partial, HIVE-6267.2.partial, HIVE-6267.3.partial, HIVE-6267.4.patch, HIVE-6267.5.patch, HIVE-6267.6.patch, HIVE-6267.7.patch.gz, HIVE-6267.8.patch

I've gotten feedback over time saying that it's very difficult to grok our explain command. There's supposedly a lot of information that mainly matters to developers or the testing framework. Comparing it to other major DBs, it does seem like we're packing way more into explain than other folks. I've gone through the explain output, checking what could be done to improve readability. Here's a list of things I've found:

- AST (unreadable in its Lisp syntax, not really required for end users)
- Vectorization (enough to display once per task and only when true)
- Expression representation is very lengthy, could be much more compact
- "if not exists" on DDL (enough to display only when true, or maybe not at all)
- bucketing info (enough if displayed only when the table is actually bucketed)
- external flag (show only if external)
- GlobalTableId (not needed in plain explain, maybe in extended)
- Position of big table (already clear from the plan)
- Stats always (most DBs only show stats in explain; that gives a sense of what the planner thinks will happen)
- skew join (only if true should be enough)
- limit doesn't show the actual limit
- Alias in Map Operator tree (alias is duplicated in the TableScan operator)
- tag is only useful at runtime (move to explain extended)
- Some names are camel case or abbreviated; clearer if spelled out in full
- Tez is missing the vertex map (aka edges)
- explain formatted (json) is broken right now (swallows some information)

Since changing explain results in many golden file updates, I'd like to take a stab at all of these at once.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6357) Extend the orcalltypes test table to include DECIMAL columns
Remus Rusanu created HIVE-6357:
    Summary: Extend the orcalltypes test table to include DECIMAL columns
    Key: HIVE-6357  URL: https://issues.apache.org/jira/browse/HIVE-6357
    Project: Hive  Issue Type: Sub-task
    Reporter: Remus Rusanu  Priority: Minor

The orcalltypes table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
Gunther Hagleitner created HIVE-6358:
    Summary: filterExpr not printed in explain for tablescan operators (ppd)
    Key: HIVE-6358  URL: https://issues.apache.org/jira/browse/HIVE-6358
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-6358:
    Attachment: HIVE-6358.1.patch

filterExpr not printed in explain for tablescan operators (ppd)
    Key: HIVE-6358  URL: https://issues.apache.org/jira/browse/HIVE-6358
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6358.1.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6353) Update hadoop-2 golden files after HIVE-6267
[ https://issues.apache.org/jira/browse/HIVE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-6353:
    Attachment: HIVE-6353.1.patch

Update hadoop-2 golden files after HIVE-6267
    Key: HIVE-6353  URL: https://issues.apache.org/jira/browse/HIVE-6353
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6353.1.patch

HIVE-6267 changed explain output, with lots of changes to golden files. Separate jira because of the number of files changed.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-6358:
    Status: Patch Available (was: Open)

filterExpr not printed in explain for tablescan operators (ppd)
    Key: HIVE-6358  URL: https://issues.apache.org/jira/browse/HIVE-6358
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6358.1.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6353) Update hadoop-2 golden files after HIVE-6267
[ https://issues.apache.org/jira/browse/HIVE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-6353:
    Status: Patch Available (was: Open)

Update hadoop-2 golden files after HIVE-6267
    Key: HIVE-6353  URL: https://issues.apache.org/jira/browse/HIVE-6353
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6353.1.patch

HIVE-6267 changed explain output, with lots of changes to golden files. Separate jira because of the number of files changed.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6357) Extend the orcalltypes test table to include DECIMAL columns
[ https://issues.apache.org/jira/browse/HIVE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-6357:
    Description: The alltypesorc table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases. alltypesorc was introduced with HIVE-5314.
    (was: The orcalltypes table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases.)

Extend the orcalltypes test table to include DECIMAL columns
    Key: HIVE-6357  URL: https://issues.apache.org/jira/browse/HIVE-6357
    Project: Hive  Issue Type: Sub-task
    Reporter: Remus Rusanu  Priority: Minor

The alltypesorc table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases. alltypesorc was introduced with HIVE-5314.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6357) Extend the alltypesorc test table to include DECIMAL columns
[ https://issues.apache.org/jira/browse/HIVE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-6357:
    Summary: Extend the alltypesorc test table to include DECIMAL columns (was: Extend the orcalltypes test table to include DECIMAL columns)

Extend the alltypesorc test table to include DECIMAL columns
    Key: HIVE-6357  URL: https://issues.apache.org/jira/browse/HIVE-6357
    Project: Hive  Issue Type: Sub-task
    Reporter: Remus Rusanu  Priority: Minor

The alltypesorc table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases. alltypesorc was introduced with HIVE-5314.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5859) Create view does not capture inputs
[ https://issues.apache.org/jira/browse/HIVE-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889354#comment-13889354 ]

Hive QA commented on HIVE-5859:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12626588/HIVE-5859.5.patch.txt

{color:green}SUCCESS:{color} +1 4997 tests passed

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1160/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1160/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12626588

Create view does not capture inputs
    Key: HIVE-5859  URL: https://issues.apache.org/jira/browse/HIVE-5859
    Project: Hive  Issue Type: Bug  Components: Authorization
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: D14235.1.patch, HIVE-5859.2.patch.txt, HIVE-5859.3.patch.txt, HIVE-5859.4.patch.txt, HIVE-5859.5.patch.txt

For example,
{code}
CREATE VIEW view_j5jbymsx8e_1 as SELECT * FROM tbl_j5jbymsx8e;
{code}
should capture default.tbl_j5jbymsx8e as an input entity for the authorization process, but currently it does not.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6267) Explain explain
[ https://issues.apache.org/jira/browse/HIVE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889380#comment-13889380 ]

Gunther Hagleitner commented on HIVE-6267:

Sorry [~navis]. I tried to keep the disruption as minimal as possible. I collected all the things I thought needed fixing before making the changes. Then I waited until the weekend, at a time when the queue was empty, and tried to get everything back in shape before people start working again. I think it's worth it, otherwise I wouldn't have spent so much time on it. As I mentioned above, I've gotten feedback multiple times about seeing if we can improve explain. Unfortunately, that means tons of golden files. If you can think of a better way, I can back out and try again. But it's not clear to me how to avoid changing that many golden files, since we rely on them so heavily in the q files.

Explain explain
    Key: HIVE-6267  URL: https://issues.apache.org/jira/browse/HIVE-6267
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Fix For: 0.13.0
    Attachments: HIVE-6267.1.partial, HIVE-6267.2.partial, HIVE-6267.3.partial, HIVE-6267.4.patch, HIVE-6267.5.patch, HIVE-6267.6.patch, HIVE-6267.7.patch.gz, HIVE-6267.8.patch

I've gotten feedback over time saying that it's very difficult to grok our explain command. There's supposedly a lot of information that mainly matters to developers or the testing framework. Comparing it to other major DBs, it does seem like we're packing way more into explain than other folks. I've gone through the explain output, checking what could be done to improve readability. Here's a list of things I've found:

- AST (unreadable in its Lisp syntax, not really required for end users)
- Vectorization (enough to display once per task and only when true)
- Expression representation is very lengthy, could be much more compact
- "if not exists" on DDL (enough to display only when true, or maybe not at all)
- bucketing info (enough if displayed only when the table is actually bucketed)
- external flag (show only if external)
- GlobalTableId (not needed in plain explain, maybe in extended)
- Position of big table (already clear from the plan)
- Stats always (most DBs only show stats in explain; that gives a sense of what the planner thinks will happen)
- skew join (only if true should be enough)
- limit doesn't show the actual limit
- Alias in Map Operator tree (alias is duplicated in the TableScan operator)
- tag is only useful at runtime (move to explain extended)
- Some names are camel case or abbreviated; clearer if spelled out in full
- Tez is missing the vertex map (aka edges)
- explain formatted (json) is broken right now (swallows some information)

Since changing explain results in many golden file updates, I'd like to take a stab at all of these at once.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf
[ https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889402#comment-13889402 ]

Lefty Leverenz commented on HIVE-6037:

bq. It's not auto-generated. All by hand (was tedious).

Wow, that's a labor of love. Let's not abandon it; let's use it as a one-time fix to get all the parameter descriptions into HiveConf. Then later we can figure out how to generate hive-default.xml.template from HiveConf for each new release.

However, there's a problem with the default values, since HiveConf sets them, so those are the correct values. Also, updating would be needed for changes since December 15th. But that's easier once the two files are synchronized.

Hmm ... if HiveConf already has a comment that's different from the description in hive-default.xml.template, should both be kept? Ideally they'd be merged; sometimes that's an easy edit, but other times it requires expert information.

To muddy the waters further, the wiki has some release information and miscellaneous notes that don't have to be merged with HiveConf but shouldn't get lost if we eventually generate the wiki from one of the code files. I've been wanting to go through the list and fill in the "Added In:" information by grepping the config names in a directory of HiveConf files for all the branches. (Another labor of love, but it's less important than making sure all the parameters are listed in the wiki and hive-default.xml.template.)

[~cwsteinbach] created the Configuration Properties wikidoc in the first place -- how was that done?

Synchronize HiveConf with hive-default.xml.template and support show conf
    Key: HIVE-6037  URL: https://issues.apache.org/jira/browse/HIVE-6037
    Project: Hive  Issue Type: Improvement  Components: Configuration
    Reporter: Navis  Priority: Minor
    Attachments: HIVE-6037.1.patch.txt

see HIVE-5879

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6354) Some index test golden files produce non-deterministic stats in explain
[ https://issues.apache.org/jira/browse/HIVE-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889409#comment-13889409 ]

Hive QA commented on HIVE-6354:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12626589/HIVE-6354.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4994 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1162/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1162/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12626589

Some index test golden files produce non-deterministic stats in explain
    Key: HIVE-6354  URL: https://issues.apache.org/jira/browse/HIVE-6354
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6354.1.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6329) Support column level encryption/decryption
[ https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889457#comment-13889457 ]

Hive QA commented on HIVE-6329:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12626591/HIVE-6329.3.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4998 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1163/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1163/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12626591

Support column level encryption/decryption
    Key: HIVE-6329  URL: https://issues.apache.org/jira/browse/HIVE-6329
    Project: Hive  Issue Type: New Feature  Components: Security, Serializers/Deserializers
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: HIVE-6329.1.patch.txt, HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt

We have been receiving some requirements on encryption recently, but Hive does not support it. Before the full implementation via HIVE-5207, this might be useful for some cases.

{noformat}
hive> create table encode_test(id int, name STRING, phone STRING, address STRING)
    > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    > WITH SERDEPROPERTIES ('column.encode.indices'='2,3', 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly')
    > STORED AS TEXTFILE;
OK
Time taken: 0.584 seconds
hive> insert into table encode_test select 100,'navis','010-0000-0000','Seoul, Seocho' from src tablesample (1 rows);
..
OK
Time taken: 5.121 seconds
hive> select * from encode_test;
OK
100	navis	MDEwLTAwMDAtMDAwMA==	U2VvdWwsIFNlb2Nobw==
Time taken: 0.078 seconds, Fetched: 1 row(s)
hive>
{noformat}

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
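The stored values in the sample session above are plain Base64: decoding MDEwLTAwMDAtMDAwMA== yields 010-0000-0000, and U2VvdWwsIFNlb2Nobw== yields Seoul, Seocho. A quick standalone check with java.util.Base64 (independent of Hive; the class name is just for illustration):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64ColumnCheck {
    // Encode a column value the way a Base64 write-only encoder would store it.
    static String encode(String column) {
        return Base64.getEncoder().encodeToString(column.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a stored value back to the original column text.
    static String decode(String stored) {
        return new String(Base64.getDecoder().decode(stored), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // These match the encoded phone and address columns in the query output.
        System.out.println(encode("010-0000-0000")); // MDEwLTAwMDAtMDAwMA==
        System.out.println(encode("Seoul, Seocho")); // U2VvdWwsIFNlb2Nobw==
        System.out.println(decode("MDEwLTAwMDAtMDAwMA==")); // 010-0000-0000
    }
}
```

Since Base64 is an encoding rather than encryption, this also illustrates why the SerDe approach is positioned as a stopgap before the full HIVE-5207 work.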
[jira] [Commented] (HIVE-6336) Issue is hive 12 datanucleus incompatibility with org.apache.hadoop.hive.contrib.serde2.RegexSerDe
[ https://issues.apache.org/jira/browse/HIVE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889488#comment-13889488 ]

Andy Jefferson commented on HIVE-6336:

@Nigel Savage, the most recent release is as follows: datanucleus-core-3.2.12, datanucleus-api-jdo-3.2.8, datanucleus-rdbms-3.2.11. Note that HIVE-5218 requires datanucleus-rdbms-3.2.7 or later, and HIVE-6136 requires datanucleus-rdbms-3.2.11 too.

Issue is hive 12 datanucleus incompatibility with org.apache.hadoop.hive.contrib.serde2.RegexSerDe
    Key: HIVE-6336  URL: https://issues.apache.org/jira/browse/HIVE-6336
    Project: Hive  Issue Type: Wish  Components: HiveServer2
    Affects Versions: 0.12.0
    Environment: Hadoop 2.2, local Derby metastore (embedded)
    Reporter: Nigel Savage  Priority: Blocker
    Labels: HADOOP

There is a Hive 12 DataNucleus incompatibility which seems to affect org.apache.hadoop.hive.contrib.serde2.RegexSerDe. The main question: *If Hive 0.12.0 and DataNucleus are compatible, then what version of DataNucleus should I be using with Hive 12 and Hadoop 2.2?* The error I'm getting blocks me from properly running Hive queries invoked from the test phase of a Maven project.

*To reproduce*

I have Hadoop and Hive running as a pseudo cluster in local mode and Derby as the metastore. I have the following environment variables:
{noformat}
HADOOP_HOME=/home/ubu/hadoop
JAVA_HOME=/usr/lib/jvm/java-7-oracle
{noformat}

I have the RegexSerDe declared in the hive-site.xml:
{noformat}
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/ubu/hadoop/lib/hive-contrib-0.12.0.jar</value>
  <description>This JAR file available to all users for all jobs</description>
</property>
{noformat}

If I run with
{noformat}
<datanucleus.version>3.0.2</datanucleus.version>
{noformat}
I get the following exception only: 'java.lang.ClassNotFoundException...org.datanucleus.store.types.backed.Ma'

HOWEVER, if I run with
{noformat}
<datanucleus.version>3.2.0-release</datanucleus.version>
{noformat}
I get the following exception only: java.lang.ClassNotFoundException: org/apache/hadoop/hive/contrib/serde2/RegexSerDe

EXPLANATION: The RegexSerDe class is picked up at run time but the DataNucleus Map class is not available. I have checked in the datanucleus-core 3.0.2 jar and it is missing. Upgrading to the first DataNucleus above 3.0.2 that includes the Map class throws the ClassNotFoundException for RegexSerDe. With the earlier *3.0.2* DataNucleus, the code fails with the missing Map class but the RegexSerDe class is found; when I upgrade to the 3.2.0-release, the Map class is found but for some unknown reason the code/Hive no longer finds the RegexSerDe class.

I started using the same DataNucleus dependencies found in this Hive pom: http://maven-repository.com/artifact/org.apache.hive/hive-metastore/0.12.0/pom. Below are the dependencies from my latest attempts to get a functioning pom:
{noformat}
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-server</artifactId>
  <version>0.96.0-hadoop2</version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.96.0-hadoop2</version>
</dependency>
<!-- misc -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-lang3</artifactId>
  <version>3.1</version>
</dependency>
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>${guava.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.derby</groupId>
  <artifactId>derby</artifactId>
  <version>${derby.version}</version>
</dependency>
<dependency>
  <groupId>org.datanucleus</groupId>
  <artifactId>datanucleus-core</artifactId>
  <version>${datanucleus.version}</version>
</dependency>
<dependency>
  <groupId>org.datanucleus</groupId>
  <artifactId>datanucleus-rdbms</artifactId>
  <version>${datanucleus-rdbms.version}</version>
</dependency>
<dependency>
  <groupId>javax.jdo</groupId>
  <artifactId>jdo-api</artifactId>
  <version>3.0.1</version>
</dependency>
<dependency>
  <groupId>org.datanucleus</groupId>
  <artifactId>datanucleus-api-jdo</artifactId>
  <version>${datanucleus.jdo.version}</version>
  <exclusions>
    <exclusion>
{noformat}
[jira] [Commented] (HIVE-5252) Add ql syntax for inline java code creation
[ https://issues.apache.org/jira/browse/HIVE-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889489#comment-13889489 ] Lefty Leverenz commented on HIVE-5252: -- I added compile to the list of values for hive.security.command.whitelist in the wiki, but it needs to be added to *hive-default.xml.template* too. * [Configuration Properties: hive.security.command.whitelist |https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.security.command.whitelist] Also, the wiki needs to explain inline Java code creation with a few examples. (I'd need more to go on than compile_processor.q in the patch.) Does it belong in the SELECT wikidoc or a new wikidoc, or with the UDFs? How about a new wikidoc under the CLI doc? * [Language Manual |https://cwiki.apache.org/confluence/display/Hive/LanguageManual] * [SELECT |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select] * [Operators and UDFs |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF] * [CLI |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] Or should documentation wait until the parent jira HIVE-5250 gets committed? Add ql syntax for inline java code creation --- Key: HIVE-5252 URL: https://issues.apache.org/jira/browse/HIVE-5252 Project: Hive Issue Type: Sub-task Reporter: Edward Capriolo Assignee: Edward Capriolo Fix For: 0.13.0 Attachments: HIVE-5252.1.patch.txt, HIVE-5252.2.patch.txt Something to the effect of compile 'my code here' using 'groovycompiler'. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17661: HIVE-6327: A few mathematic functions don't take decimal input
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/ --- Review request for hive. Bugs: HIVE-6327 https://issues.apache.org/jira/browse/HIVE-6327 Repository: hive-git Description --- Added methods for those UDFs such that decimal type can be accepted and evaluated. Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 4e2b50e ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java f7adce4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java fb861a5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java d9ce12d ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java 9c8c836 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java 883b541 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java a90e622 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 1cc70a4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java e3f2026 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java 47e18ee ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java e714c17 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 4eefc8c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java 8ebd727 ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java PRE-CREATION Diff: https://reviews.apache.org/r/17661/diff/ Testing --- New unit test cases are added to test all the UDFs regarding decimal input. Thanks, Xuefu Zhang
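The approach in this patch, per the description above, is to give each math UDF a decimal overload that converts the input and delegates to the existing double-based evaluation. Below is a standalone sketch of that idea using java.math.BigDecimal in place of Hive's HiveDecimal/DoubleWritable wrappers; the class and method names here are illustrative, not Hive's actual UDF code.

```java
import java.math.BigDecimal;

// Standalone sketch (not Hive's actual UDF code): decimal inputs for math
// functions can be supported by converting to double, applying the
// java.lang.Math function, and returning the result.
public class DecimalMathSketch {

    // Mirrors an evaluate(double) overload.
    public static Double log(double d) {
        if (d <= 0.0) {
            return null; // Hive's convention: NULL for out-of-domain input
        }
        return Math.log(d);
    }

    // Decimal overload: convert, delegate, return. Precision beyond what a
    // double can hold is necessarily lost in the conversion.
    public static Double log(BigDecimal d) {
        if (d == null) {
            return null;
        }
        return log(d.doubleValue());
    }

    public static void main(String[] args) {
        System.out.println(log(new BigDecimal("10.0")));
    }
}
```

The notable design point is returning NULL rather than throwing for out-of-domain input (e.g. log of a non-positive number), which matches how Hive math UDFs behave.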
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889509#comment-13889509 ] Brock Noland commented on HIVE-5783: That test failure is unrelated to the patch. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889527#comment-13889527 ] Hive QA commented on HIVE-6204: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626608/HIVE-6204.1.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4997 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_keyword_1 org.apache.hadoop.hive.jdbc.TestJdbcDriver.testShowGrant {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1164/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1164/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12626608 The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt {noformat} hive> show grant role role1 on all; OK database default table src principalName role1 principalType ROLE privilege Create grantTime Wed Dec 18 14:17:56 KST 2013 grantor navis database default table srcpart principalName role1 principalType ROLE privilege Update grantTime Wed Dec 18 14:18:28 KST 2013 grantor navis {noformat} This should be something like below, especially for JDBC clients. 
{noformat} hive> show grant role role1 on all; OK default src role1 ROLE Create false 1387343876000 navis default srcpart role1 ROLE Update false 1387343908000 navis {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf
[ https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889529#comment-13889529 ] Brock Noland commented on HIVE-6037: bq. Then would hive-default.xml get used when hive-site.xml doesn't exist? I believe the HiveConf class would then use hive-default.xml to load defaults and descriptions. bq. Wow, that's a labor of love. Let's not abandon it, Agreed, I think we should go forward with this patch bq. let's use it as a one-time fix to get all the parameter descriptions into HiveConf. Then later we can figure out how to generate hive-default.xml.template from HiveConf for each new release. However there's a problem with the default values, since HiveConf sets them so those are the correct values. AFAICT this patch moves the descriptions and defaults to hive-default.xml.template. I think the next step is generating hive-default.xml from the template. Then the HiveConf class uses hive-default.xml to load defaults and descriptions. Synchronize HiveConf with hive-default.xml.template and support show conf - Key: HIVE-6037 URL: https://issues.apache.org/jira/browse/HIVE-6037 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Priority: Minor Attachments: HIVE-6037.1.patch.txt see HIVE-5879 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
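The overlay behavior being discussed (hive-default.xml supplies defaults and descriptions, hive-site.xml overrides them) can be sketched with java.util.Properties defaults chaining. This is a simplified stand-in: HiveConf actually builds on Hadoop's Configuration class, and the two property values below are only examples.

```java
import java.util.Properties;

// Simplified sketch of the defaults-overlay idea: a "site" Properties object
// backed by a "defaults" one, the way HiveConf might load hive-default.xml
// before overlaying hive-site.xml. Hive really uses Hadoop's Configuration;
// java.util.Properties stands in here for illustration.
public class ConfOverlaySketch {
    public static Properties load() {
        Properties defaults = new Properties();
        defaults.setProperty("hive.exec.parallel", "false");
        defaults.setProperty("hive.exec.reducers.max", "999");

        // Site config falls back to defaults for any key it omits.
        Properties site = new Properties(defaults);
        site.setProperty("hive.exec.parallel", "true"); // user override
        return site;
    }

    public static void main(String[] args) {
        Properties conf = load();
        System.out.println(conf.getProperty("hive.exec.parallel"));     // override wins
        System.out.println(conf.getProperty("hive.exec.reducers.max")); // default used
    }
}
```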
[jira] [Commented] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889605#comment-13889605 ] Hive QA commented on HIVE-6356: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626614/HIVE-6356.1.patch.txt {color:green}SUCCESS:{color} +1 4997 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1165/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1165/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12626614 Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889611#comment-13889611 ] Ashutosh Chauhan commented on HIVE-6204: Patch looks good. Failure seems to be related to update of .q.out file. I wonder, for grant time, is there any value in showing that at all. Shall we not display it ever? I want to avoid a test conf variable, since current usage looks fine, but in future once it's in, folks may abuse it. The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt {noformat} hive> show grant role role1 on all; OK database default table src principalName role1 principalType ROLE privilege Create grantTime Wed Dec 18 14:17:56 KST 2013 grantor navis database default table srcpart principalName role1 principalType ROLE privilege Update grantTime Wed Dec 18 14:18:28 KST 2013 grantor navis {noformat} This should be something like below, especially for JDBC clients. {noformat} hive> show grant role role1 on all; OK default src role1 ROLE Create false 1387343876000 navis default srcpart role1 ROLE Update false 1387343908000 navis {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6122) Implement show grant on resource
[ https://issues.apache.org/jira/browse/HIVE-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889618#comment-13889618 ] Ashutosh Chauhan commented on HIVE-6122: Certainly resolving conflicts was easier than writing patch in first place : ) Thanks for all your work! Implement show grant on resource -- Key: HIVE-6122 URL: https://issues.apache.org/jira/browse/HIVE-6122 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6122.1.patch.txt, HIVE-6122.2.patch.txt, HIVE-6122.3.patch.txt, HIVE-6122.4.patch, HIVE-6122.4.patch, HIVE-6122.5.patch, HIVE-6122.6.patch Currently, hive shows privileges owned by a principal. Reverse API is also needed, which shows all principals for a resource. {noformat} show grant user hive_test_user on database default; show grant user hive_test_user on table dummy; show grant user hive_test_user on all; {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889639#comment-13889639 ] Ashutosh Chauhan commented on HIVE-6356: Is htrace strictly required? If so, then don't we need to make sure its jar is available at run-time (currently it seems we don't have it in our lib/ dir of package?) [~ndimiduk] Can you also take a look? Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6359) beeline -f fails on scripts with tabs in them.
Carter Shanklin created HIVE-6359: - Summary: beeline -f fails on scripts with tabs in them. Key: HIVE-6359 URL: https://issues.apache.org/jira/browse/HIVE-6359 Project: Hive Issue Type: Bug Reporter: Carter Shanklin Priority: Minor On a recent trunk build I used beeline -f on a script with tabs in it. Beeline rather unhelpfully attempts to perform tab expansion on the tabs and the query fails. Here's a screendump. {code} Connecting to jdbc:hive2://mymachine:1/mydb Connected to: Apache Hive (version 0.13.0-SNAPSHOT) Driver: Hive JDBC (version 0.13.0-SNAPSHOT) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 0.13.0-SNAPSHOT by Apache Hive 0: jdbc:hive2://mymachine:1/mydb select i_brand_id as brand_id, i_brand as brand, . . . . . . . . . . . . . . . . . . . . . . . Display all 560 possibilities? (y or n) . . . . . . . . . . . . . . . . . . . . . . . ager_id=36 . . . . . . . . . . . . . . . . . . . . . . . Display all 560 possibilities? (y or n) . . . . . . . . . . . . . . . . . . . . . . . d d_moy=12 . . . . . . . . . . . . . . . . . . . . . . . Display all 560 possibilities? (y or n) . . . . . . . . . . . . . . . . . . . . . . . d d_year=2001 . . . . . . . . . . . . . . . . . . . . . . . and ss_sold_date between '2001-12-01' and '2001-12-31' . . . . . . . . . . . . . . . . . . . . . . . group by i_brand, i_brand_id . . . . . . . . . . . . . . . . . . . . . . . order by ext_price desc, brand_id . . . . . . . . . . . . . . . . . . . . . . . limit 100 ; Error: Error while compiling statement: FAILED: ParseException line 1:65 missing FROM at 'd_moy' near 'd' in from source (state=42000,code=4) Closing: org.apache.hive.jdbc.HiveConnection {code} The same query works fine if I replace tabs with some spaces. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
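Until beeline stops performing tab expansion on scripted input, one hypothetical client-side workaround is to rewrite tabs to spaces before the lines reach the interactive reader, keeping tabs inside quoted literals since those may be data. A sketch; the class name and the simple quoting rules are assumptions, not beeline code.

```java
// Hypothetical workaround for the tab-completion problem described above:
// rewrite tabs to spaces outside quoted string literals before handing
// script lines to an interactive shell.
public class TabRewriter {
    public static String rewrite(String line) {
        StringBuilder out = new StringBuilder(line.length());
        boolean inQuote = false;
        char quote = 0;
        for (char c : line.toCharArray()) {
            if (inQuote) {
                out.append(c); // preserve everything inside quotes, tabs included
                if (c == quote) inQuote = false;
            } else if (c == '\'' || c == '"') {
                inQuote = true;
                quote = c;
                out.append(c);
            } else {
                out.append(c == '\t' ? ' ' : c); // tab -> space elsewhere
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite("select\ti_brand\tfrom t where x = 'a\tb';"));
    }
}
```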
Re: Review Request 17622: VectorExpressionWriter for date and decimal datatypes.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/#review33378 --- Looks good to me. See one comment inline. ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java https://reviews.apache.org/r/17622/#comment62885 Please add a comment why you are using decimal.* and why it's different than the others. - Eric Hanson On Jan. 31, 2014, 10:19 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/ --- (Updated Jan. 31, 2014, 10:19 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- VectorExpressionWriter for date and decimal datatypes. Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 729908a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java e5c3aa4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java a242fef ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java ad96fa5 ql/src/test/queries/clientpositive/vectorization_decimal_date.q PRE-CREATION ql/src/test/results/clientpositive/vectorization_decimal_date.q.out PRE-CREATION Diff: https://reviews.apache.org/r/17622/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889712#comment-13889712 ] Hive QA commented on HIVE-4144: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626617/HIVE-4144.13.patch.txt {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4999 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hcatalog.hbase.snapshot.lock.TestWriteLock.testRun {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1166/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1166/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12626617 Add select database() command to show the current database Key: HIVE-4144 URL: https://issues.apache.org/jira/browse/HIVE-4144 Project: Hive Issue Type: Bug Components: SQL Reporter: Mark Grover Assignee: Navis Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch A recent hive-user mailing list conversation asked about having a command to show the current database. 
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E MySQL seems to have a command to do so: {code} select database(); {code} http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database We should look into having something similar in Hive. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889708#comment-13889708 ] Nick Dimiduk commented on HIVE-6356: I stumbled into this recently as well. HTrace is now a required runtime dependency, even when it's not used. This patch is incorrect, however. Because Hive is using the mapred namespace classes, the correct API is to invoke o.a.h.hbase.mapred.TableMapReduceUtil#addDependencyJars(JobConf). This will wire in all of HBase's runtime dependencies for you, and also attempt to auto-detect additional dependencies based on the JobConf (output classes, partitioners, formats, etc). If you want more fine-grained control over these dependencies (as Pig did, see PIG-3285), there are additional static methods in the o.a.h.hbase.mapreduce.TableMapReduceUtil class. For Hive's purpose, I think you'll be fine with just calling mapred.TableMapReduceUtil#addDependencyJars(JobConf). Having a smoke test that runs in pseudo-distributed mode would be helpful in verifying all requirements are met. Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work stopped] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6234 stopped by Eric Hanson. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf, state-diagram.jpg Implement support for vectorized scan input of text files (plain text with configurable record and field separators). This should work for CSV files, tab delimited files, etc. The goal is to provide high-performance reading of these files using vectorized scans, and also to do it as an extension of existing Hive. Then, if vectorized query is enabled, existing tables based on text files will be able to benefit immediately without the need to use a different input format. After upgrading to new Hive bits that support this, faster, vectorized processing over existing text tables should just work, when vectorization is enabled. Another goal is to go beyond a simple layering of vectorized row batch iterator over the top of the existing row iterator. It should be possible to, say, read a chunk of data into a byte buffer (several thousand or even million rows), and then read data from it into vectorized row batches directly. Object creations should be minimized to save allocation time and GC overhead. If it is possible to save CPU for values like dates and numbers by caching the translation from string to the final data type, that should ideally be implemented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
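The "read a chunk into a byte buffer, then fill vectorized row batches directly" idea above can be sketched in plain Java: parse delimited bytes into a primitive array with no per-row object creation. The batch size, the newline separator, and the single integer column are assumptions for illustration; Hive's real VectorizedRowBatch and input format are far more general.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch of the batching idea in this issue: scan a chunk of
// delimited ASCII text held in a byte array and fill a fixed-size primitive
// array ("vectorized row batch") directly, creating no per-row objects.
public class TextBatchReader {
    public static final int BATCH_SIZE = 1024;

    // Parses newline-separated records of a single integer column into
    // batch[0..n); returns the number of rows filled.
    public static int fillBatch(byte[] buf, long[] batch) {
        int rows = 0;
        long value = 0;
        boolean negative = false, inField = false;
        for (byte b : buf) {
            if (b == '\n') {
                if (inField) {
                    batch[rows++] = negative ? -value : value;
                    if (rows == batch.length) break;
                }
                value = 0; negative = false; inField = false;
            } else if (b == '-') {
                negative = true; inField = true;
            } else {
                value = value * 10 + (b - '0'); // digits parsed in place
                inField = true;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        byte[] chunk = "17\n-3\n42\n".getBytes(StandardCharsets.US_ASCII);
        long[] batch = new long[BATCH_SIZE];
        int n = fillBatch(chunk, batch);
        System.out.println(n + " rows, first=" + batch[0]);
    }
}
```

The point of the design is visible even in this toy: the hot loop touches only primitives, so allocation and GC cost per row is zero, which is exactly what the issue asks of the real reader.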
[jira] [Resolved] (HIVE-6232) allow user to control out-of-range values in HCatStorer
[ https://issues.apache.org/jira/browse/HIVE-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman resolved HIVE-6232. -- Resolution: Won't Fix This was rolled into the patch for HIVE-5814 allow user to control out-of-range values in HCatStorer --- Key: HIVE-6232 URL: https://issues.apache.org/jira/browse/HIVE-6232 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Pig values support a wider range than Hive, e.g. Pig BIGDECIMAL vs Hive DECIMAL. When storing Pig data into a Hive table, if the value is out of range there are 2 options: 1. throw an exception. 2. write NULL instead of the value. The 1st has the drawback that it may kill the process that loads 100M rows after 90M rows have been loaded. But the 2nd may not be appropriate for all use cases. Should add support for additional parameters in HCatStorer where the user can specify an option to control this. see org.apache.pig.backend.hadoop.hbase.HBaseStorage for examples -- This message was sent by Atlassian JIRA (v6.1.5#6160)
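The two options described (throw an exception vs. write NULL) amount to a policy switch applied at store time. A hedged sketch, with an illustrative precision bound standing in for the Hive DECIMAL limit; the class and method names are not HCatalog's.

```java
import java.math.BigDecimal;

// Sketch of the two out-of-range policies discussed above, applied to a
// Pig-style BigDecimal being stored into a bounded Hive DECIMAL column.
// The precision limit and names are illustrative, not HCatalog code.
public class OutOfRangePolicy {
    public static final int MAX_PRECISION = 38; // Hive decimal precision limit

    public static BigDecimal store(BigDecimal v, boolean nullOnOverflow) {
        if (v == null || v.precision() <= MAX_PRECISION) {
            return v; // fits; store as-is
        }
        if (nullOnOverflow) {
            return null; // option 2: write NULL instead of the value
        }
        // option 1: fail fast (may kill a load that is 90M rows in)
        throw new IllegalArgumentException(
            "value exceeds DECIMAL(" + MAX_PRECISION + ")");
    }
}
```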
[jira] [Updated] (HIVE-6316) Document support for new types in HCat
[ https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6316: - Description: HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Also, HCatStorer now takes new parameter which is described in the PDF doc. was: HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Document support for new types in HCat -- Key: HIVE-6316 URL: https://issues.apache.org/jira/browse/HIVE-6316 Project: Hive Issue Type: Sub-task Components: Documentation, HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Also, HCatStorer now takes new parameter which is described in the PDF doc. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Proposal to un-fork Sqlline
As you probably know, Hive’s SQL command-line interface Beeline was created by forking Sqlline [1] [2]. At the time it was a useful but low-activity project languishing on SourceForge without an active owner. Around the same time, I independently picked up the Sqlline code, moved it to github [3], put in place a maven build process, and gave it some love. Now several projects are using it, including Apache Drill, Apache Phoenix, Cascading Lingual and Optiq. So, now we have two active forks of Sqlline. I propose to merge these development forks. This will achieve a few things. We should be able to fix more bugs, and add more features, and get more people using sqlline. (Just today, someone ran into a bug that Drill was not saving/restoring command history, then noticed that it was fixed in sqlline-1.1.3 [4] [5]. It seems that that bug still exists in Hive’s beeline.) I propose the following: 1. Move the parts of hive-beeline module that do not depend upon Hive (about 90% of the code) into a new module in the hive repo, hive-sqlline. 2. What remains in the hive-beeline module is Beeline.java (a derived class of Sqlline.java) and Hive-specific extensions. The hive-beeline module depends upon the hive-sqlline module. 3. Make sure that the new Hive sqlline module contains all fixes and useful changes from both forks. 4. Release sqlline as a maven artifact, say {groupId=org.apache.hive, artifactId=hive-sqlline} and tell clients of julianhyde-sqlline to migrate to it. 5. Longer term, consider moving hive-sqlline out of Hive, but still within Apache. This achieves continuity for Hive’s users, gives the users of the non-Hive sqlline a version with minimal dependencies, unifies the two code lines, and brings everything under the Apache roof. Please let me know if this sounds like a good proposal. I’ll log a jira case, then start work on a patch. 
Julian [1] https://issues.apache.org/jira/browse/HIVE-987 [2] https://issues.apache.org/jira/browse/HIVE-3100 [3] https://github.com/julianhyde/sqlline [4] https://github.com/julianhyde/sqlline/issues/19 [5] https://issues.apache.org/jira/browse/DRILL-327
[jira] [Commented] (HIVE-6268) Network resource leak with HiveClientCache when using HCatInputFormat
[ https://issues.apache.org/jira/browse/HIVE-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889741#comment-13889741 ] Sushanth Sowmyan commented on HIVE-6268: Hi Lefty, Yes, I think we should document it in a release note. I'm planning on finishing HIVE-6332 in a week but it's good to have it in a release note as well. Network resource leak with HiveClientCache when using HCatInputFormat - Key: HIVE-6268 URL: https://issues.apache.org/jira/browse/HIVE-6268 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.13.0 Attachments: HIVE-6268.2.patch, HIVE-6268.3.patch, HIVE-6268.patch HCatInputFormat has a cache feature that allows HCat to cache hive client connections to the metastore, so as to not keep reinstantiating a new hive server every single time. This uses a guava cache of hive clients, which only evicts entries from cache on the next write, or by manually managing the cache. So, in a single threaded case, where we reuse the hive client, the cache works well, but in a massively multithreaded case, where each thread might perform one action, and then is never used, there are no more writes to the cache, and all the clients stay alive, thus keeping ports open. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
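The failure mode described above, a cache whose expired entries are only reclaimed when a later write triggers eviction, can be modeled in a few lines: if every thread performs one put and then goes away, nothing else writes, so expired clients (and their open ports) linger until an explicit maintenance call. This toy model is not HiveClientCache's code; entry types and timings are assumptions.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the leak: expiry is enforced only when a write happens,
// so an idle cache never shrinks; an explicit cleanUp() (as guava's Cache
// offers) reclaims expired entries without waiting for the next write.
public class IdleCacheSketch {
    static final long TTL_MS = 1;
    final Map<String, Long> created = new LinkedHashMap<>();

    void put(String key, long now) {
        evictExpired(now);   // eviction happens only on writes
        created.put(key, now);
    }

    void cleanUp(long now) { // explicit maintenance call
        evictExpired(now);
    }

    private void evictExpired(long now) {
        // entries are in insertion order, so oldest come first
        for (Iterator<Long> it = created.values().iterator(); it.hasNext();) {
            if (now - it.next() > TTL_MS) it.remove(); else break;
        }
    }

    int size() { return created.size(); }

    public static void main(String[] args) {
        IdleCacheSketch cache = new IdleCacheSketch();
        cache.put("client-1", 0);
        // client-1 is long expired, but with no further writes it lingers:
        System.out.println(cache.size()); // 1
        cache.cleanUp(1_000);
        System.out.println(cache.size()); // 0
    }
}
```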
[jira] [Commented] (HIVE-6342) hive drop partitions should use standard expr filter instead of some custom class
[ https://issues.apache.org/jira/browse/HIVE-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889751#comment-13889751 ] Sergey Shelukhin commented on HIVE-6342: parallel_orderby appears to be flaky, unrelated failure hive drop partitions should use standard expr filter instead of some custom class -- Key: HIVE-6342 URL: https://issues.apache.org/jira/browse/HIVE-6342 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6342.01.patch, HIVE-6342.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Proposal to un-fork Sqlline
Hi Julian, Thanks for sharing your thoughts. I'm certainly on board with code sharing among projects. However, I don't see immediate benefits for Hive in separating Beeline into two modules. Instead, it requires additional work and potentially creates instability, while code sharing isn't achieved until the proposed hive-sqlline module is promoted to an independent project. On the other hand, I'm wondering if it makes more sense to fork sqlline directly into Apache. Upon its completion, Hive gets rid of its copy of sqlline and creates a dependency on the forked sqlline instead. I guess this is a top-down approach and the benefits are immediate across multiple projects. Thanks, Xuefu On Mon, Feb 3, 2014 at 10:49 AM, Julian Hyde julianh...@gmail.com wrote: As you probably know, Hive's SQL command-line interface Beeline was created by forking Sqlline [1] [2]. At the time it was a useful but low-activity project languishing on SourceForge without an active owner. Around the same time, I independently picked up the Sqlline code, moved it to github [3], put in place a maven build process, and gave it some love. Now several projects are using it, including Apache Drill, Apache Phoenix, Cascading Lingual and Optiq. So, now we have two active forks of Sqlline. I propose to merge these development forks. This will achieve a few things. We should be able to fix more bugs, and add more features, and get more people using sqlline. (Just today, someone ran into a bug that Drill was not saving/restoring command history, then noticed that it was fixed in sqlline-1.1.3 [4] [5]. It seems that that bug still exists in Hive's beeline.) I propose the following: 1. Move the parts of hive-beeline module that do not depend upon Hive (about 90% of the code) into a new module in the hive repo, hive-sqlline. 2. What remains in the hive-beeline module is Beeline.java (a derived class of Sqlline.java) and Hive-specific extensions. The hive-beeline module depends upon the hive-sqlline module. 3. 
Make sure that the new Hive sqlline module contains all fixes and useful changes from both forks. 4. Release sqlline as a maven artifact, say {groupId=org.apache.hive, artifactId=hive-sqlline} and tell clients of julianhyde-sqlline to migrate to it. 5. Longer term, consider moving hive-sqlline out of Hive, but still within Apache. This achieves continuity for Hive's users, gives the users of the non-Hive sqlline a version with minimal dependencies, unifies the two code lines, and brings everything under the Apache roof. Please let me know if this sounds like a good proposal. I'll log a jira case, then start work on a patch. Julian [1] https://issues.apache.org/jira/browse/HIVE-987 [2] https://issues.apache.org/jira/browse/HIVE-3100 [3] https://github.com/julianhyde/sqlline [4] https://github.com/julianhyde/sqlline/issues/19 [5] https://issues.apache.org/jira/browse/DRILL-327
Re: Review Request 17661: HIVE-6327: A few mathematic functions don't take decimal input
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/#review33459 --- I think this looks fine. ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java https://reviews.apache.org/r/17661/#comment62895 Are all 4 permutations of evaluate() with double/decimal args necessary here? Could we just do eval(double, double) and eval(decimal, decimal)? Though if a double and a decimal arg are passed in, that would result in conversion from double -> decimal -> double in the case of the double arg, which I suppose is less than ideal. - Jason Dere On Feb. 3, 2014, 2:26 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/ --- (Updated Feb. 3, 2014, 2:26 p.m.) Review request for hive. Bugs: HIVE-6327 https://issues.apache.org/jira/browse/HIVE-6327 Repository: hive-git Description --- Added methods for those UDFs such that decimal type can be accepted and evaluated. Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 4e2b50e ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java f7adce4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java fb861a5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java d9ce12d ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java 9c8c836 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java 883b541 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java a90e622 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 1cc70a4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java e3f2026 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java 47e18ee ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java e714c17 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 4eefc8c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java 8ebd727 ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java PRE-CREATION Diff: https://reviews.apache.org/r/17661/diff/ Testing --- New 
unit test cases are added to test all the UDFs regarding decimal input. Thanks, Xuefu Zhang
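Jason's question about the four evaluate() permutations can be sketched in plain Java. This is an illustration only, not Hive's actual UDF code: double and java.math.BigDecimal stand in for Hive's DoubleWritable and HiveDecimalWritable, and the class name is hypothetical. Without the two mixed overloads, a call with one double and one decimal argument has to take the double -> decimal -> double round trip Jason describes:

```java
import java.math.BigDecimal;

// Illustrative sketch only: a two-argument log UDF needs one overload per
// argument-type combination unless callers accept an implicit
// double -> decimal -> double round trip for mixed arguments.
public class LogOverloads {
    // log base 'base' of x
    public static double evaluate(double base, double x) {
        return Math.log(x) / Math.log(base);
    }

    public static double evaluate(BigDecimal base, BigDecimal x) {
        // Decimal args are converted to double for the math itself.
        return evaluate(base.doubleValue(), x.doubleValue());
    }

    // The two mixed overloads Jason asks about. Dropping them would force a
    // mixed call to widen the double to a decimal and convert it right back.
    public static double evaluate(double base, BigDecimal x) {
        return evaluate(base, x.doubleValue());
    }

    public static double evaluate(BigDecimal base, double x) {
        return evaluate(base.doubleValue(), x);
    }

    public static void main(String[] args) {
        System.out.println(evaluate(2.0, new BigDecimal("8")));
    }
}
```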
[jira] [Commented] (HIVE-6327) A few mathematic functions don't take decimal input
[ https://issues.apache.org/jira/browse/HIVE-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889807#comment-13889807 ] Jason Dere commented on HIVE-6327: -- +1 A few mathematic functions don't take decimal input --- Key: HIVE-6327 URL: https://issues.apache.org/jira/browse/HIVE-6327 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0, 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6327.patch A few mathematical functions, such as sin(), cos(), etc., don't take decimal as an argument. {code} hive> show tables; OK Time taken: 0.534 seconds hive> create table test(d decimal(5,2)); OK Time taken: 0.351 seconds hive> select sin(d) from test; FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'd': No matching method for class org.apache.hadoop.hive.ql.udf.UDFSin with (decimal(5,2)). Possible choices: _FUNC_(double) {code} HIVE-6246 covers only the sign() function. The remaining ones include sin, cos, tan, asin, acos, atan, exp, ln, log, log10, log2, radians, and sqrt. These are non-generic UDFs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
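The fix pattern described in the issue (adding a decimal-accepting evaluate() next to the double one, so Hive's per-argument-type method resolution finds a match for decimal columns) can be sketched in plain Java. This is a hypothetical illustration, with java.math.BigDecimal standing in for HiveDecimalWritable; it is not the actual UDFSin code:

```java
import java.math.BigDecimal;

// Hypothetical sketch of the fix pattern: Hive's non-generic UDFs resolve
// evaluate() by argument type, so a class with only evaluate(double) has no
// match for a decimal(5,2) column. Adding a decimal overload that converts
// to double makes the resolution succeed.
public class SinUdfSketch {
    public Double evaluate(Double x) {
        return x == null ? null : Math.sin(x);
    }

    // New overload: accept decimal input by converting to double.
    public Double evaluate(BigDecimal x) {
        return x == null ? null : Math.sin(x.doubleValue());
    }
}
```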
[jira] [Commented] (HIVE-6349) Column name map is broken
[ https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889809#comment-13889809 ] Sergey Shelukhin commented on HIVE-6349: Linking the previous jira. HIVE-5817.00-broken.patch is an unfinished (close-to-finished iirc) patch there that makes the columns in the map be by operator, w/lineage tracked so that they can also be retrieved depending on operator. Column name map is broken -- Key: HIVE-6349 URL: https://issues.apache.org/jira/browse/HIVE-6349 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Following query results in exception at run time in vector mode. {code} explain select n_name from supplier_orc s join ( select n_name, n_nationkey from nation_orc n join region_orc r on n.n_regionkey = r.r_regionkey and r.r_name = 'XYZ') n1 on s.s_nationkey = n1.n_nationkey; {code} Here n_name is a string and all other fields are int. The stack trace: {code} java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:116) at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:280) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:133) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:246) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:253) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:574) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:585) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:234) ... 8 more {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17661: HIVE-6327: A few mathematic functions don't take decimal input
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/#review33478 --- Looks clean. Minor comments to consider. ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java https://reviews.apache.org/r/17661/#comment62920 Is it required? ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java https://reviews.apache.org/r/17661/#comment62921 Is it required? ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java https://reviews.apache.org/r/17661/#comment62923 Please remove this. ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java https://reviews.apache.org/r/17661/#comment62925 Should the base 1.0 be changed to base = 1.0 as done in the old code? ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java https://reviews.apache.org/r/17661/#comment62922 Same here. - Mohammad Islam On Feb. 3, 2014, 2:26 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/ --- (Updated Feb. 3, 2014, 2:26 p.m.) Review request for hive. Bugs: HIVE-6327 https://issues.apache.org/jira/browse/HIVE-6327 Repository: hive-git Description --- Added methods for those UDFs such that decimal type can be accepted and evaluated. 
Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 4e2b50e ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java f7adce4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java fb861a5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java d9ce12d ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java 9c8c836 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java 883b541 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java a90e622 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 1cc70a4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java e3f2026 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java 47e18ee ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java e714c17 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 4eefc8c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java 8ebd727 ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java PRE-CREATION Diff: https://reviews.apache.org/r/17661/diff/ Testing --- New unit test cases are added to test all the UDFs regarding decimal input. Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-6327) A few mathematic functions don't take decimal input
[ https://issues.apache.org/jira/browse/HIVE-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889827#comment-13889827 ] Mohammad Kamrul Islam commented on HIVE-6327: - Left a few minor comments in RB. A few mathematic functions don't take decimal input --- Key: HIVE-6327 URL: https://issues.apache.org/jira/browse/HIVE-6327 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0, 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6327.patch A few mathematical functions, such as sin(), cos(), etc., don't take decimal as an argument. {code} hive> show tables; OK Time taken: 0.534 seconds hive> create table test(d decimal(5,2)); OK Time taken: 0.351 seconds hive> select sin(d) from test; FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'd': No matching method for class org.apache.hadoop.hive.ql.udf.UDFSin with (decimal(5,2)). Possible choices: _FUNC_(double) {code} HIVE-6246 covers only the sign() function. The remaining ones include sin, cos, tan, asin, acos, atan, exp, ln, log, log10, log2, radians, and sqrt. These are non-generic UDFs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6353) Update hadoop-2 golden files after HIVE-6267
[ https://issues.apache.org/jira/browse/HIVE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889854#comment-13889854 ] Hive QA commented on HIVE-6353: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626622/HIVE-6353.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4997 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1168/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1168/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12626622 Update hadoop-2 golden files after HIVE-6267 Key: HIVE-6353 URL: https://issues.apache.org/jira/browse/HIVE-6353 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6353.1.patch HIVE-6267 changed explain with lots of changes to golden files. Separate jira because of number of files changed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17661: HIVE-6327: A few mathematic functions don't take decimal input
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/#review33497 --- ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java https://reviews.apache.org/r/17661/#comment62962 javadoc ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java https://reviews.apache.org/r/17661/#comment62973 Tabs here. Need to convert to spaces. ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java https://reviews.apache.org/r/17661/#comment62974 Tabs. - Swarnim Kulkarni On Feb. 3, 2014, 2:26 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/ --- (Updated Feb. 3, 2014, 2:26 p.m.) Review request for hive. Bugs: HIVE-6327 https://issues.apache.org/jira/browse/HIVE-6327 Repository: hive-git Description --- Added methods for those UDFs such that decimal type can be accepted and evaluated. Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 4e2b50e ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java f7adce4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java fb861a5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java d9ce12d ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java 9c8c836 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java 883b541 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java a90e622 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 1cc70a4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java e3f2026 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java 47e18ee ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java e714c17 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 4eefc8c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java 8ebd727 ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java PRE-CREATION Diff: https://reviews.apache.org/r/17661/diff/ Testing --- New unit test cases are added to test all the UDFs regarding decimal input. 
Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-6325) Enable using multiple concurrent sessions in tez
[ https://issues.apache.org/jira/browse/HIVE-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889923#comment-13889923 ] Gunther Hagleitner commented on HIVE-6325: -- Looks good so far. Left some comments on rb. Enable using multiple concurrent sessions in tez Key: HIVE-6325 URL: https://issues.apache.org/jira/browse/HIVE-6325 Project: Hive Issue Type: Improvement Components: Tez Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6325.1.patch We would like to enable multiple concurrent sessions in tez via hive server 2. This will enable users to make efficient use of the cluster when it has been partitioned using yarn queues. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17471: HIVE-6325: Enable using multiple concurrent sessions in tez
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17471/#review33496 --- ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionState.java https://reviews.apache.org/r/17471/#comment62960 I believe that file is in the wrong location. Should be in ql/test, right? ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62963 Everything is static in this class. I think it'd be better to have a singleton and non-static members. This way we could have multiple pools if desired. Also should make testing easier. ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62964 BlockingQueue should be able to tell you length, right? ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62965 Can't sessionType denote an actual type? Class<?> is extremely general and there are no comments explaining the use. ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62961 nit: some ws issues ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62966 this should come from a site file, not be hard-coded, right? ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62967 don't think this is needed ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62968 is this the right name? shouldn't that be a yarn var? 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62969 comment doesn't match signature ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62976 It doesn't look like you're keeping track of this sessionstate here. I think we should. The user should always get/return sessions and we handle the alloc/dealloc. (why can't return close the session for non default for instance?) service/src/java/org/apache/hive/service/server/HiveServer2.java https://reviews.apache.org/r/17471/#comment62977 need to handle exception properly - Gunther Hagleitner On Jan. 28, 2014, 10:34 p.m., Vikram Dixit Kumaraswamy wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17471/ --- (Updated Jan. 28, 2014, 10:34 p.m.) Review request for hive. Bugs: HIVE-6325 https://issues.apache.org/jira/browse/HIVE-6325 Repository: hive-git Description --- Enable using multiple concurrent sessions in tez. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 84ee78f itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 9ad5986 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionState.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java b8552a3 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionStateFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java c6f431c ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java d7edda1 ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionPool.java PRE-CREATION service/src/java/org/apache/hive/service/server/HiveServer2.java fa13783 Diff: https://reviews.apache.org/r/17471/diff/ Testing --- Added multi-threaded junit tests. Thanks, Vikram Dixit Kumaraswamy
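Gunther's suggestion of a singleton with non-static members, backed by a BlockingQueue (whose size() already reports the pool length), can be sketched as follows. All names here are hypothetical; this is not the actual TezSessionPoolManager code, and a real pool would likely block on take() rather than poll():

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative singleton session pool along the lines suggested in the
// review: instance state behind a singleton accessor instead of static
// members, backed by a BlockingQueue. Hypothetical names throughout.
public class SessionPool {
    /** Placeholder standing in for a Tez session object. */
    public static class Session {}

    private static SessionPool instance;

    private final BlockingQueue<Session> sessions;

    private SessionPool(int capacity) {
        this.sessions = new ArrayBlockingQueue<>(capacity);
    }

    /** Singleton accessor; all pool state stays non-static. */
    public static synchronized SessionPool getInstance(int capacity) {
        if (instance == null) {
            instance = new SessionPool(capacity);
        }
        return instance;
    }

    /** Callers get sessions here; returns null if the pool is empty. */
    public Session getSession() {
        return sessions.poll();
    }

    /** Callers return sessions; the pool handles the bookkeeping. */
    public void returnSession(Session s) {
        sessions.offer(s);
    }

    public int size() {
        return sessions.size();
    }
}
```

Keeping the members non-static also means tests can, in principle, construct independent pools, which is part of the review's motivation.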
[jira] [Commented] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889962#comment-13889962 ] Hive QA commented on HIVE-6358: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626621/HIVE-6358.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4997 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_pushdown {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1169/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1169/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12626621 filterExpr not printed in explain for tablescan operators (ppd) --- Key: HIVE-6358 URL: https://issues.apache.org/jira/browse/HIVE-6358 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6358.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6298) Add config flag to turn off fetching partition stats
[ https://issues.apache.org/jira/browse/HIVE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6298: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~sershe], [~prasanth_j] and [~leftylev] for the reviews! Add config flag to turn off fetching partition stats Key: HIVE-6298 URL: https://issues.apache.org/jira/browse/HIVE-6298 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6298.1.patch, HIVE-6298.2.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17632: HDFS ZeroCopy Shims for Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17632/#review33519 --- pom.xml https://reviews.apache.org/r/17632/#comment63001 I don't think we need another version, do we? for the branch we can just temporarily make the 23 version 2.4.0 until that one is released. then we switch everything over. Is there another reason to keep both? shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java https://reviews.apache.org/r/17632/#comment63004 nit: lots of trailing ws. - Gunther Hagleitner On Feb. 1, 2014, 3:05 a.m., Gopal V wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17632/ --- (Updated Feb. 1, 2014, 3:05 a.m.) Review request for hive, Gunther Hagleitner and Owen O'Malley. Bugs: HIVE-6346 https://issues.apache.org/jira/browse/HIVE-6346 Repository: hive-git Description --- Hive Shims for ZeroCopy FS read and Direct ByteBuffer decompression (hadoop/branch-2 changes) Diffs - pom.xml 41f5337 ql/pom.xml 7087a4c shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java ec1f18e shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java d0ff7d4 shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 54c38ee shims/0.23C/pom.xml PRE-CREATION shims/0.23C/src/main/java/org/apache/hadoop/hive/shims/Hadoop23CShims.java PRE-CREATION shims/aggregator/pom.xml 7aa8c4c shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 2b3c6c1 shims/common/src/main/java/org/apache/hadoop/hive/shims/ShimLoader.java bf9c84f shims/pom.xml 9843836 Diff: https://reviews.apache.org/r/17632/diff/ Testing --- TPC-DS queries. Thanks, Gopal V
[jira] [Commented] (HIVE-6346) Add Hadoop-2.4.0 shims to hive-tez
[ https://issues.apache.org/jira/browse/HIVE-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890015#comment-13890015 ] Gunther Hagleitner commented on HIVE-6346: -- [~t3rmin4t0r] Comment on rb. Looks good so far. Biggest question I have is whether we need another shim version for that. Seems we could just upgrade 23 when 2.4 is out. Also: The test failure is unrelated. Tests have been successful. Add Hadoop-2.4.0 shims to hive-tez -- Key: HIVE-6346 URL: https://issues.apache.org/jira/browse/HIVE-6346 Project: Hive Issue Type: Bug Components: Shims Affects Versions: tez-branch Reporter: Gopal V Assignee: Gopal V Priority: Minor Attachments: HIVE-6346.1.patch, HIVE-6346.2.patch The HadoopShims needs a 0.23C shim to add extra HDFS Caching functionality which is not available in the 2.2.0 branch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6320) Row-based ORC reader with PPD turned on dies on BufferUnderFlowException
[ https://issues.apache.org/jira/browse/HIVE-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890053#comment-13890053 ] Owen O'Malley commented on HIVE-6320: - Actually, you always need the next 2 compression blocks regardless of whether the compression blocks are the same for the two row groups. The rest of the patch looks good. Row-based ORC reader with PPD turned on dies on BufferUnderFlowException - Key: HIVE-6320 URL: https://issues.apache.org/jira/browse/HIVE-6320 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6320.1.patch ORC data reader crashes out on a BufferUnderflowException, while trying to read data row-by-row with the predicate push-down enabled on current trunk. {code} Caused by: java.nio.BufferUnderflowException at java.nio.Buffer.nextGetIndex(Buffer.java:472) at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:117) at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:207) at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:240) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:53) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:288) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:510) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1581) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2707) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:125) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:101) {code} The 
query run is {code} set hive.vectorized.execution.enabled=false; set hive.optimize.index.filter=true; insert overwrite directory '/tmp/foo' select * from lineitem where l_orderkey is not null; {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Status: Patch Available (was: Open) Make ORC InputFormat/OutputFormat usable outside Hive - Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5728-1.patch, HIVE-5728-10.patch, HIVE-5728-2.patch, HIVE-5728-3.patch, HIVE-5728-4.patch, HIVE-5728-5.patch, HIVE-5728-6.patch, HIVE-5728-7.patch, HIVE-5728-8.patch, HIVE-5728-9.patch, HIVE-5728.10.patch, HIVE-5728.11.patch, HIVE-5728.12.patch ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues that need to be solved: 1. Several classes are not public, e.g., OrcStruct 2. There is no InputFormat/OutputFormat for the new API (some tools such as Pig need the new API) 3. There is no way to push WriteOption to OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Status: Open (was: Patch Available) Make ORC InputFormat/OutputFormat usable outside Hive - Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5728-1.patch, HIVE-5728-10.patch, HIVE-5728-2.patch, HIVE-5728-3.patch, HIVE-5728-4.patch, HIVE-5728-5.patch, HIVE-5728-6.patch, HIVE-5728-7.patch, HIVE-5728-8.patch, HIVE-5728-9.patch, HIVE-5728.10.patch, HIVE-5728.11.patch, HIVE-5728.12.patch ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues that need to be solved: 1. Several classes are not public, e.g., OrcStruct 2. There is no InputFormat/OutputFormat for the new API (some tools such as Pig need the new API) 3. There is no way to push WriteOption to OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17678: HIVE-4996 unbalanced calls to openTransaction/commitTransaction
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17678/ --- Review request for hive. Bugs: HIVE-4996 https://issues.apache.org/jira/browse/HIVE-4996 Repository: hive-git Description --- Background: First issue: There are two levels of retrying in case of transient JDO/CP/DB errors: the RetryingHMSHandler and RetryingRawStore. But the RetryingRawStore is flawed in the case of a nested transaction of a larger RetryingHMSHandler transaction (which is the majority of cases). Consider the following sample RetryingHMSHandler call, where variable ms is a RetryingRawStore. HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, then openTx = 1 upon intermediate commit ms.createTable() //openTx = 2, then openTx = 1 upon intermediate commit ms.commit(); //openTx = 0 If there is any transient error in any intermediate operation and RetryingRawStore tries again, there will always be an unbalanced transaction, like: HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, transient error, then openTx=0 upon rollback. After a retry, openTx=1, then openTx=0 upon successful intermediate commit ms.createTable() //openTx = 1, then openTx = 0 upon intermediate commit ms.commit(); //unbalanced transaction! Retrying RawStore operations doesn't make sense in nested transaction cases, as the first part of the transaction is rolled back upon a transient error, and retry logic only saves the second half, which may not make sense without the first. It makes much more sense to retry the entire transaction from the top, which is what RetryingHMSHandler would already be doing if the RetryingRawStore did not interfere. Second issue: The recent upgrade to BoneCP 0.8.0 seemed to cause more transient errors that triggered this problem. 
In these cases, in-use connections are finalized, as follows: WARN bonecp.ConnectionPartition (ConnectionPartition.java:finalizeReferent(162)) - BoneCP detected an unclosed connection and will now attempt to close it for you. You should be closing this connection in your application - enable connectionWatch for additional debugging assistance or set disableConnectionTracking to true to disable this feature entirely. The retry of this operation seems to get a good connection and allow the operation to proceed. Reading forums, it seems some others have hit this issue after the upgrade, and switching back to 0.7.1 in our environment eliminated this issue for us. But that reversion is outside the scope of this JIRA, and would be better done in either the original or a follow-up JIRA that upgraded the version. This fix targets the first issue only, as it is needed anyway for any sort of transient error, not just the BoneCP one that I observed. Changes: 1. Removes RetryingRawStore in favor of RetryingHMSHandler, and removes the configuration property of the former. 2. Addresses the resultant holes in retry, in particular in the RetryingHMSHandler's construction of RawStore (before, RetryingRawStore would have retried failures like in creating the defaultDB). It didn't seem necessary to increase the default RetryingHMSHandler retries to 2 to compensate, but I am open to that as well. 3. Contributes the instrumentation code that helped me to find the issue. This includes printing missing stacks of exceptions that triggered retry, and adding debug-level tracing of ObjectStore calls to give better correlation with other errors/warnings in the hive log. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22bb22d itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java d7854fe itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRawStoreTxn.java 0b87077 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 2d8e483 metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 0715e22 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java fb70589 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java dcf97ec Diff: https://reviews.apache.org/r/17678/diff/ Testing --- Thanks, Szehon Ho
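The nesting-counter behavior walked through in the description can be modeled with a stripped-down sketch. This is an illustration of the failure mode only, not the actual ObjectStore code; everything beyond the openTransaction/commitTransaction names and the quoted error text is hypothetical:

```java
// Stripped-down model of the openTransaction/commitTransaction nesting
// counter described above. A retry inside a nested transaction rolls the
// counter back to 0, so the outer commit() later finds no open transaction
// -- the "unbalanced calls" error. Illustrative names throughout.
public class TxnCounter {
    private int openTxns = 0;

    public void openTransaction() {
        openTxns++;
    }

    public void commitTransaction() {
        if (openTxns <= 0) {
            throw new IllegalStateException(
                "commitTransaction was called but openTransactionCalls = 0. "
                + "This probably indicates that there are unbalanced calls "
                + "to openTransaction/commitTransaction");
        }
        openTxns--;
    }

    public void rollbackTransaction() {
        // Rollback abandons the whole transaction stack, not just one level.
        openTxns = 0;
    }

    public int depth() {
        return openTxns;
    }
}
```

Replaying the second trace from the description: outer open (depth 1), nested open (2), transient error and rollback (0), retried nested open/commit (1 then 0), and finally the outer commit has no open transaction left to commit.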
[jira] [Commented] (HIVE-6002) Create new ORC write version to address the changes to RLEv2
[ https://issues.apache.org/jira/browse/HIVE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890067#comment-13890067 ] Owen O'Malley commented on HIVE-6002: - Rather than introduce a new version, let's add some metadata: add a key named ORC.FIXED.JIRA whose value is a comma-separated list of the fixed JIRAs. So in this case, HIVE-5994. Create new ORC write version to address the changes to RLEv2 Key: HIVE-6002 URL: https://issues.apache.org/jira/browse/HIVE-6002 Project: Hive Issue Type: Bug Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6002.1.patch, HIVE-6002.2.patch HIVE-5994 encodes large negative big integers wrongly. This results in loss of original data that is being written using ORC write version 0.12. Bump up the version number to differentiate the bad writes by 0.12 from the good writes by this new version (0.12.1?). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
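The metadata scheme Owen proposes above can be sketched as a small helper. This is a hypothetical illustration, not real ORC API: it only models parsing the comma-separated value that would be stored under the ORC.FIXED.JIRA key and checking whether a given fix is listed.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch of the proposed fixed-JIRAs metadata check. The key name comes
 * from the comment above; the class and methods are hypothetical helpers,
 * not part of the ORC reader/writer API.
 */
public class OrcFixedJiras {
    static final String KEY = "ORC.FIXED.JIRA";

    /** Parse the comma-separated list stored under ORC.FIXED.JIRA. */
    static Set<String> parseFixedJiras(String metadataValue) {
        Set<String> fixed = new HashSet<>();
        if (metadataValue != null && !metadataValue.isEmpty()) {
            for (String jira : metadataValue.split(",")) {
                fixed.add(jira.trim());
            }
        }
        return fixed;
    }

    /** A reader would trust an encoding only if its fix is listed. */
    static boolean hasFix(String metadataValue, String jira) {
        return parseFixedJiras(metadataValue).contains(jira);
    }
}
```

A file written after the RLEv2 fix would carry "HIVE-5994" in the list; older files would lack the key entirely, letting readers fall back to the workaround path.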
[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-4996: Attachment: HIVE-4996.patch Attaching a fix. unbalanced calls to openTransaction/commitTransaction - Key: HIVE-4996 URL: https://issues.apache.org/jira/browse/HIVE-4996 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0, 0.12.0 Environment: hiveserver1 Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: wangfeng Assignee: Szehon Ho Priority: Critical Labels: hive, metastore Attachments: HIVE-4996.patch, hive-4996.path Original Estimate: 504h Remaining Estimate: 504h when we used hiveserver1 based on hive-0.10.0, we found the Exception thrown. It was: FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask help -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6247) select count(distinct) should be MRR in Tez
[ https://issues.apache.org/jira/browse/HIVE-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890075#comment-13890075 ] Gunther Hagleitner commented on HIVE-6247: -- Dug into this a little bit. I think the idea makes good sense, but the description of MR is not correct. At least I wasn't able to make MR avoid a single reducer for the query cited. You can rewrite the query using a subquery to get the result you want, though. There are two more flags to consider (when rewriting): a) hive.optimize.reducededuplication.min.reducer: If this is set to 1 you will have a single reducer regardless of the rewrite. b) hive.fetch.task.aggr: If this one is true the final count will happen on the client. This is more important in MR than Tez (because it would start a new job in MR; in Tez it's just another stage in the DAG). select count(distinct) should be MRR in Tez --- Key: HIVE-6247 URL: https://issues.apache.org/jira/browse/HIVE-6247 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gunther Hagleitner The MR query plan for select count(distinct) fires off multiple reducers, with a local work task to perform final aggregation. The Tez version fires off exactly 1 reducer for the entire data-set, which chokes and dies/slows down massively. To reproduce on a TPC-DS database (meaningless query) {code} select count(distinct ss_net_profit) from store_sales ss join store s on ss.ss_store_sk = s.s_store_sk; {code} This spins up Map 1, Map 2 (for the dim table + fact table) Reducer 1 which is always 0/1. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
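The subquery rewrite mentioned in the comment above could look something like this (a hedged sketch, not taken from the thread): pre-aggregating the distinct values in an inner query splits the work into a distinct stage that can use many reducers, followed by a cheap final count.

```sql
-- Hypothetical rewrite of the cited TPC-DS query: the inner query
-- deduplicates ss_net_profit (parallelizable), the outer count runs
-- over already-distinct rows.
select count(1)
from (
  select distinct ss_net_profit
  from store_sales ss
  join store s on ss.ss_store_sk = s.s_store_sk
) t;
```

Whether the rewrite actually avoids the single reducer also depends on the two flags above: hive.optimize.reducededuplication.min.reducer=1 collapses it back to one reducer, and hive.fetch.task.aggr moves the final count to the client.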
Re: Review Request 17678: HIVE-4996 unbalanced calls to openTransaction/commitTransaction
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17678/ --- (Updated Feb. 3, 2014, 11:10 p.m.) Review request for hive. Bugs: HIVE-4996 https://issues.apache.org/jira/browse/HIVE-4996 Repository: hive-git Description (updated) --- Background: First issue: There are two levels of retrying in case of transient JDO/CP/DB errors: RetryingHMSHandler and RetryingRawStore. But RetryingRawStore is flawed in the case of a nested transaction within a larger RetryingHMSHandler transaction (which is the majority of cases). Consider the following sample RetryingHMSHandler call, where the variable ms is a RetryingRawStore. HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, then openTx = 1 upon intermediate commit ms.createTable() //openTx = 2, then openTx = 1 upon intermediate commit ms.commit(); //openTx = 0 If there is any transient error in any intermediate operation and RetryingRawStore tries again, there will always be an unbalanced transaction, like: HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, transient error, then openTx=0 upon rollback. After a retry, openTx=1, then openTx=0 upon successful intermediate commit ms.createTable() //openTx = 1, then openTx = 0 upon intermediate commit ms.commit(); //unbalanced transaction! Retrying RawStore operations doesn't make sense in nested-transaction cases, as the first part of the transaction is rolled back upon a transient error, and the retry logic only saves the second half, which may not make sense without the first. It makes much more sense to retry the entire transaction from the top, which is what RetryingHMSHandler would already be doing if RetryingRawStore did not interfere. Second issue: The recent upgrade to BoneCP 0.8.0 seemed to cause more transient errors that triggered this problem. 
In these cases, in-use connections are finalized, as follows: WARN bonecp.ConnectionPartition (ConnectionPartition.java:finalizeReferent(162)) - BoneCP detected an unclosed connection and will now attempt to close it for you. You should be closing this connection in your application - enable connectionWatch for additional debugging assistance or set disableConnectionTracking to true to disable this feature entirely. The retry of this operation seems to get a good connection and allow the operation to proceed. Reading forums, it seems some others have hit this issue after the upgrade, and switching back to 0.7.1 in our environment eliminated this issue for us. But that reversion is outside the scope of this JIRA, and would be better done in either the original or a follow-up JIRA that upgraded the version. This fix targets the first issue only, as it is needed anyway for any sort of transient error, not just the BoneCP one that I observed. Changes: 1. Removes RetryingRawStore in favor of RetryingHMSHandler, and removes the configuration property of the former. 2. Addresses the resultant holes in retry, in particular in the RetryingHMSHandler's construction of RawStore (before, RetryingRawStore would have retried failures such as creating the default DB). It didn't seem necessary to increase the default RetryingHMSHandler retries to 2 to compensate, but I am open to that as well. 3. Contributes the instrumentation code that helped me find the issue. This includes printing previously-missing stacks of exceptions that triggered retry (including 'unbalanced calls' errors) to the hive log, and adding debug-level tracing of ObjectStore calls to give better correlation with other errors/warnings in the hive log. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22bb22d itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java d7854fe itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRawStoreTxn.java 0b87077 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 2d8e483 metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 0715e22 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java fb70589 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java dcf97ec Diff: https://reviews.apache.org/r/17678/diff/ Testing --- Thanks, Szehon Ho
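The openTx trace in the description above can be modeled with a tiny counter. This is a hypothetical sketch, not the real ObjectStore code: it only captures the semantics the description relies on (open increments, commit decrements and fails at zero, rollback resets to zero), so the mid-transaction retry can be replayed step by step.

```java
/**
 * Minimal model of the openTransactionCalls counter described above.
 * Hypothetical class: real behavior lives in ObjectStore/RawStore.
 */
public class TxnCounter {
    private int openTransactionCalls = 0;

    /** ms.open() / nested opens: increment the nesting depth. */
    int openTransaction() { return ++openTransactionCalls; }

    /** ms.commit(): decrement, or fail if the counter is already zero. */
    int commitTransaction() {
        if (openTransactionCalls <= 0) {
            throw new IllegalStateException(
                "commitTransaction was called but openTransactionCalls = 0");
        }
        return --openTransactionCalls;
    }

    /** A transient-error rollback resets the counter to zero. */
    int rollbackTransaction() { openTransactionCalls = 0; return 0; }
}
```

Replaying the failing trace: outer open (1), nested open (2), transient error rolls back to 0; the RawStore-level retry re-runs only the nested operation, open (1), commit (0); the outer commit then fires with the counter already at zero and throws the 'unbalanced calls' error, which is why retrying at the RawStore level inside a larger transaction cannot work.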
[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-4996: Status: Open (was: Patch Available) unbalanced calls to openTransaction/commitTransaction - Key: HIVE-4996 URL: https://issues.apache.org/jira/browse/HIVE-4996 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.12.0, 0.11.0, 0.10.0 Environment: hiveserver1 Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: wangfeng Assignee: Szehon Ho Priority: Critical Labels: hive, metastore Attachments: HIVE-4996.patch, hive-4996.path Original Estimate: 504h Remaining Estimate: 504h when we used hiveserver1 based on hive-0.10.0, we found the Exception thrown. It was: FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask help -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6320) Row-based ORC reader with PPD turned on dies on BufferUnderFlowException
[ https://issues.apache.org/jira/browse/HIVE-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6320: - Attachment: HIVE-6320.2.patch Addressed [~owen.omalley] and [~gopalv]'s code review comments. Row-based ORC reader with PPD turned on dies on BufferUnderFlowException - Key: HIVE-6320 URL: https://issues.apache.org/jira/browse/HIVE-6320 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6320.1.patch, HIVE-6320.2.patch ORC data reader crashes out on a BufferUnderflowException, while trying to read data row-by-row with the predicate push-down enabled on current trunk. {code} Caused by: java.nio.BufferUnderflowException at java.nio.Buffer.nextGetIndex(Buffer.java:472) at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:117) at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:207) at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:240) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:53) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:288) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:510) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1581) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2707) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:125) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:101) {code} The query run is {code} set hive.vectorized.execution.enabled=false; set hive.optimize.index.filter=true; insert 
overwrite directory '/tmp/foo' select * from lineitem where l_orderkey is not null; {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17678: HIVE-4996 unbalanced calls to openTransaction/commitTransaction
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17678/ --- (Updated Feb. 3, 2014, 11:24 p.m.) Review request for hive. Changes --- Fixing redundant logging for NoSuchObjectException. Bugs: HIVE-4996 https://issues.apache.org/jira/browse/HIVE-4996 Repository: hive-git Description --- Background: First issue: There are two levels of retrying in case of transient JDO/CP/DB errors: RetryingHMSHandler and RetryingRawStore. But RetryingRawStore is flawed in the case of a nested transaction within a larger RetryingHMSHandler transaction (which is the majority of cases). Consider the following sample RetryingHMSHandler call, where the variable ms is a RetryingRawStore. HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, then openTx = 1 upon intermediate commit ms.createTable() //openTx = 2, then openTx = 1 upon intermediate commit ms.commit(); //openTx = 0 If there is any transient error in any intermediate operation and RetryingRawStore tries again, there will always be an unbalanced transaction, like: HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, transient error, then openTx=0 upon rollback. After a retry, openTx=1, then openTx=0 upon successful intermediate commit ms.createTable() //openTx = 1, then openTx = 0 upon intermediate commit ms.commit(); //unbalanced transaction! Retrying RawStore operations doesn't make sense in nested-transaction cases, as the first part of the transaction is rolled back upon a transient error, and the retry logic only saves the second half, which may not make sense without the first. It makes much more sense to retry the entire transaction from the top, which is what RetryingHMSHandler would already be doing if RetryingRawStore did not interfere. Second issue: The recent upgrade to BoneCP 0.8.0 seemed to cause more transient errors that triggered this problem. 
In these cases, in-use connections are finalized, as follows: WARN bonecp.ConnectionPartition (ConnectionPartition.java:finalizeReferent(162)) - BoneCP detected an unclosed connection and will now attempt to close it for you. You should be closing this connection in your application - enable connectionWatch for additional debugging assistance or set disableConnectionTracking to true to disable this feature entirely. The retry of this operation seems to get a good connection and allow the operation to proceed. Reading forums, it seems some others have hit this issue after the upgrade, and switching back to 0.7.1 in our environment eliminated this issue for us. But that reversion is outside the scope of this JIRA, and would be better done in either the original or a follow-up JIRA that upgraded the version. This fix targets the first issue only, as it is needed anyway for any sort of transient error, not just the BoneCP one that I observed. Changes: 1. Removes RetryingRawStore in favor of RetryingHMSHandler, and removes the configuration property of the former. 2. Addresses the resultant holes in retry, in particular in the RetryingHMSHandler's construction of RawStore (before, RetryingRawStore would have retried failures such as creating the default DB). It didn't seem necessary to increase the default RetryingHMSHandler retries to 2 to compensate, but I am open to that as well. 3. Contributes the instrumentation code that helped me find the issue. This includes printing previously-missing stacks of exceptions that triggered retry (including 'unbalanced calls' errors) to the hive log, and adding debug-level tracing of ObjectStore calls to give better correlation with other errors/warnings in the hive log. 
Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22bb22d itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java d7854fe itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRawStoreTxn.java 0b87077 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 2d8e483 metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 0715e22 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java fb70589 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java dcf97ec Diff: https://reviews.apache.org/r/17678/diff/ Testing --- Thanks, Szehon Ho
[jira] [Updated] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6358: - Status: Open (was: Patch Available) filterExpr not printed in explain for tablescan operators (ppd) --- Key: HIVE-6358 URL: https://issues.apache.org/jira/browse/HIVE-6358 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6358.1.patch, HIVE-6358.2.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6353) Update hadoop-2 golden files after HIVE-6267
[ https://issues.apache.org/jira/browse/HIVE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890104#comment-13890104 ] Gunther Hagleitner commented on HIVE-6353: -- Failure is unrelated. Update hadoop-2 golden files after HIVE-6267 Key: HIVE-6353 URL: https://issues.apache.org/jira/browse/HIVE-6353 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6353.1.patch HIVE-6267 changed explain with lots of changes to golden files. Separate jira because of number of files changed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-4996: Status: Patch Available (was: Open) unbalanced calls to openTransaction/commitTransaction - Key: HIVE-4996 URL: https://issues.apache.org/jira/browse/HIVE-4996 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.12.0, 0.11.0, 0.10.0 Environment: hiveserver1 Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: wangfeng Assignee: Szehon Ho Priority: Critical Labels: hive, metastore Attachments: HIVE-4996.1.patch, HIVE-4996.patch, hive-4996.path Original Estimate: 504h Remaining Estimate: 504h when we used hiveserver1 based on hive-0.10.0, we found the Exception thrown. It was: FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask help -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6358: - Attachment: HIVE-6358.2.patch .2 fixes test from precommit (missed one golden file) filterExpr not printed in explain for tablescan operators (ppd) --- Key: HIVE-6358 URL: https://issues.apache.org/jira/browse/HIVE-6358 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6358.1.patch, HIVE-6358.2.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-4996: Attachment: HIVE-4996.1.patch unbalanced calls to openTransaction/commitTransaction - Key: HIVE-4996 URL: https://issues.apache.org/jira/browse/HIVE-4996 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0, 0.12.0 Environment: hiveserver1 Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: wangfeng Assignee: Szehon Ho Priority: Critical Labels: hive, metastore Attachments: HIVE-4996.1.patch, HIVE-4996.patch, hive-4996.path Original Estimate: 504h Remaining Estimate: 504h when we used hiveserver1 based on hive-0.10.0, we found the Exception thrown. It was: FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask help -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6354) Some index test golden files produce non-deterministic stats in explain
[ https://issues.apache.org/jira/browse/HIVE-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890105#comment-13890105 ] Gunther Hagleitner commented on HIVE-6354: -- Failed tests are flaky - unrelated to this check-in. Some index test golden files produce non-deterministic stats in explain --- Key: HIVE-6354 URL: https://issues.apache.org/jira/browse/HIVE-6354 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6354.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6255) Change Hive to not pass MRSplitsProto in MRHelpers.createMRInputPayloadWithGrouping
[ https://issues.apache.org/jira/browse/HIVE-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6255: - Assignee: Thejas M Nair Change Hive to not pass MRSplitsProto in MRHelpers.createMRInputPayloadWithGrouping --- Key: HIVE-6255 URL: https://issues.apache.org/jira/browse/HIVE-6255 Project: Hive Issue Type: Task Reporter: Bikas Saha Assignee: Thejas M Nair Attachments: HIVE-6255.1.patch TEZ-650 removed this superfluous parameter since splits don't need to be passed to the AM when doing split calculation on the AM. This is needed after Hive builds against TEZ 0.3. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6360) Hadoop 2.3 + Tez 0.3
Gunther Hagleitner created HIVE-6360: Summary: Hadoop 2.3 + Tez 0.3 Key: HIVE-6360 URL: https://issues.apache.org/jira/browse/HIVE-6360 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner There are some things pending that rely on hadoop 2.3 or tez 0.3. These are not released yet, but will be soon. I'm proposing to collect these in the tez branch and do a merge back once these components have been released at that version. The things depending on 0.3 or hadoop 2.3 are: - Zero Copy read for ORC - Unions in Tez - Tez on secure clusters - Changes to DagUtils to reflect tez 0.2 - 0.3 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6360) Hadoop 2.3 + Tez 0.3
[ https://issues.apache.org/jira/browse/HIVE-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6360: - Description: There are some things pending that rely on hadoop 2.3 or tez 0.3. These are not released yet, but will be soon. I'm proposing to collect these in the tez branch and do a merge back once these components have been released at that version. The things depending on 0.3 or hadoop 2.3 are: - Zero Copy read for ORC - Unions in Tez - Tez on secure clusters - Changes to DagUtils to reflect tez 0.2 - 0.3 - Prewarm containers was: There are some things pending that rely on hadoop 2.3 or tez 0.3. These are not released yet, but will be soon. I'm proposing to collect these in the tez branch and do a merge back once these components have been released at that version. The things depending on 0.3 or hadoop 2.3 are: - Zero Copy read for ORC - Unions in Tez - Tez on secure clusters - Changes to DagUtils to reflect tez 0.2 - 0.3 Hadoop 2.3 + Tez 0.3 Key: HIVE-6360 URL: https://issues.apache.org/jira/browse/HIVE-6360 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner There are some things pending that rely on hadoop 2.3 or tez 0.3. These are not released yet, but will be soon. I'm proposing to collect these in the tez branch and do a merge back once these components have been released at that version. The things depending on 0.3 or hadoop 2.3 are: - Zero Copy read for ORC - Unions in Tez - Tez on secure clusters - Changes to DagUtils to reflect tez 0.2 - 0.3 - Prewarm containers -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17061: HIVE-5783 - Native Parquet Support in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17061/#review33473 --- ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java https://reviews.apache.org/r/17061/#comment62912 This doesn't seem to be used anywhere. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java https://reviews.apache.org/r/17061/#comment62911 This doesn't seem to need to be public. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java https://reviews.apache.org/r/17061/#comment62935 If doubt exists, it's probably better to address it. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62937 Please remove if not used. Make it private otherwise. The same applies to all code. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62943 It's better not to hard-code those string constants here. They are probably defined somewhere. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62951 I don't understand what the conversion is about: string -> path -> URI -> path -> string? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62953 Either put comments or a log msg. Commented-out code isn't a comment. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62954 Same as above. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62957 Could you put more comments about what sort of refactoring? Can we log a JIRA for it also? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java https://reviews.apache.org/r/17061/#comment62991 The if ... else ... here doesn't seem terribly different. 
Please refactor the code. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableGroupConverter.java https://reviews.apache.org/r/17061/#comment62996 It seems that for (int i = 0; i < selectFieldCount; i++) is better for this loop. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableGroupConverter.java https://reviews.apache.org/r/17061/#comment62997 Please remove the extra blank lines, which are not necessary. The same applies to all code changes. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java https://reviews.apache.org/r/17061/#comment62999 Decimal treated as double? I don't think that's acceptable. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java https://reviews.apache.org/r/17061/#comment63000 Please change to public static ... ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java https://reviews.apache.org/r/17061/#comment63003 Please change to public static, across all code changes. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java https://reviews.apache.org/r/17061/#comment63007 Please wrap long lines. Applicable to all code changes. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63011 1. private static? 2. use a string constant instead of a literal ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63021 These string constants should be defined globally and referred to here (and anywhere else). ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63020 If it's not supposed to be called, it's better to throw an exception. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63022 Same as above. 
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63023 Line too long. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java https://reviews.apache.org/r/17061/#comment63024 Long lines. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java https://reviews.apache.org/r/17061/#comment63030 I don't understand why Hive would inspect an inspected result. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java https://reviews.apache.org/r/17061/#comment63029 I don't understand why Hive would inspect an inspected result.
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890138#comment-13890138 ] Xuefu Zhang commented on HIVE-5783: --- Some comments are posted on RB. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17061: HIVE-5783 - Native Parquet Support in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17061/#review33533 --- Thanks for the comments. Justin, if you see this, I can address these tomorrow. - Brock Noland On Jan. 30, 2014, 2:48 p.m., Brock Noland wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17061/ --- (Updated Jan. 30, 2014, 2:48 p.m.) Review request for hive. Bugs: HIVE-5783 https://issues.apache.org/jira/browse/HIVE-5783 Repository: hive-git Description --- Adds native Parquet support to Hive Diffs - data/files/parquet_create.txt PRE-CREATION data/files/parquet_partitioned.txt PRE-CREATION pom.xml 41f5337 ql/pom.xml 7087a4c ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableGroupConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableRecordConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveGroupConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java PRE-CREATION 
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/DeepParquetHiveMapInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveArrayInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/StandardParquetHiveMapInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetByteInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetPrimitiveInspectorFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetShortInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetStringInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/writable/BigDecimalWritable.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/writable/BinaryWritable.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 13d0a56 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f83c15d ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 010e04f ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 538b2b0 ql/src/java/parquet/hive/DeprecatedParquetInputFormat.java PRE-CREATION ql/src/java/parquet/hive/DeprecatedParquetOutputFormat.java PRE-CREATION ql/src/java/parquet/hive/MapredParquetInputFormat.java PRE-CREATION ql/src/java/parquet/hive/MapredParquetOutputFormat.java PRE-CREATION ql/src/java/parquet/hive/serde/ParquetHiveSerDe.java PRE-CREATION 
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetInputFormat.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestAbstractParquetMapInspector.java PRE-CREATION
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890146#comment-13890146 ] Brock Noland commented on HIVE-5783: Thanks Xuefu. Justin, I can address these items tomorrow and have an updated patch. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890151#comment-13890151 ] Navis commented on HIVE-4144: - Cannot reproduce. Seemed not related. Add select database() command to show the current database Key: HIVE-4144 URL: https://issues.apache.org/jira/browse/HIVE-4144 Project: Hive Issue Type: Bug Components: SQL Reporter: Mark Grover Assignee: Navis Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch A recent hive-user mailing list conversation asked about having a command to show the current database. http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E MySQL seems to have a command to do so: {code} select database(); {code} http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database We should look into having something similar in Hive. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890156#comment-13890156 ] Navis commented on HIVE-6356: - Right. I'd forgotten there are two versions of TableMapReduceUtil. HIVE-3603 changed the import of TableMapReduceUtil to the mapred package, which caused this problem. I'll fix this shortly. Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase are not added to tmpjars, which is caused by the change of a method signature (TableMapReduceUtil.addDependencyJars).
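The diagnosis above — the wrong TableMapReduceUtil variant being used, so dependency jars never reach tmpjars — can be illustrated with a small sketch. This is not the HBase or Hadoop API; the class and method below are hypothetical stand-ins showing only how comma-separated jar paths accumulate under the job's "tmpjars" configuration key:

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// Illustrative sketch (not the real HBase API): the mapreduce-package
// TableMapReduceUtil.addDependencyJars appends jar paths to the job's
// "tmpjars" configuration key; if a variant with a different signature is
// picked up instead, this step is silently skipped and the jars never ship.
public class TmpJarsSketch {
    static final String TMPJARS = "tmpjars";

    // Append jars to the comma-separated "tmpjars" value, skipping duplicates.
    static String addToTmpJars(String current, String... jars) {
        LinkedHashSet<String> set = new LinkedHashSet<>();
        if (current != null && !current.isEmpty()) {
            for (String j : current.split(",")) set.add(j);
        }
        for (String j : jars) set.add(j);
        return String.join(",", set);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TMPJARS, addToTmpJars(conf.get(TMPJARS),
                "file:/lib/hbase-client.jar", "file:/lib/zookeeper.jar"));
        System.out.println(conf.get(TMPJARS));
    }
}
```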
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6356: Status: Open (was: Patch Available) Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6356: Attachment: HIVE-6356.2.patch.txt Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt, HIVE-6356.2.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6356: Status: Patch Available (was: Open) Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt, HIVE-6356.2.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890173#comment-13890173 ] Navis commented on HIVE-6204: - I believe security-related metrics always involve timestamps (there are two time metrics in a role, and the first patch is missing 'create time'; that will be fixed in the next patch). So we should find a way to mask the time parts without introducing a test conf var. Any ideas? The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt
{noformat}
hive> show grant role role1 on all;
OK
database	default
table	src
principalName	role1
principalType	ROLE
privilege	Create
grantTime	Wed Dec 18 14:17:56 KST 2013
grantor	navis
database	default
table	srcpart
principalName	role1
principalType	ROLE
privilege	Update
grantTime	Wed Dec 18 14:18:28 KST 2013
grantor	navis
{noformat}
This should be something like below, especially for JDBC clients.
{noformat}
hive> show grant role role1 on all;
OK
default	src	role1	ROLE	Create	false	1387343876000	navis
default	srcpart	role1	ROLE	Update	false	1387343908000	navis
{noformat}
[jira] [Commented] (HIVE-6329) Support column level encryption/decryption
[ https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890182#comment-13890182 ] Navis commented on HIVE-6329: - Failures seem unrelated to this. Support column level encryption/decryption -- Key: HIVE-6329 URL: https://issues.apache.org/jira/browse/HIVE-6329 Project: Hive Issue Type: New Feature Components: Security, Serializers/Deserializers Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6329.1.patch.txt, HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt We have been receiving some requirements on encryption recently, but Hive does not support it. Before the full implementation via HIVE-5207, this might be useful for some cases.
{noformat}
hive> create table encode_test(id int, name STRING, phone STRING, address STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ('column.encode.indices'='2,3', 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') STORED AS TEXTFILE;
OK
Time taken: 0.584 seconds
hive> insert into table encode_test select 100,'navis','010-0000-0000','Seoul, Seocho' from src tablesample (1 rows);
..
OK
Time taken: 5.121 seconds
hive> select * from encode_test;
OK
100	navis	MDEwLTAwMDAtMDAwMA==	U2VvdWwsIFNlb2Nobw==
Time taken: 0.078 seconds, Fetched: 1 row(s)
hive>
{noformat}
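The encode_test example above can be reproduced outside Hive with a minimal sketch of what a write-only Base64 column codec does: encode only the columns listed in the configured indices and pass the rest through. The class and method names here are hypothetical (the real logic lives in the SerDe layer, configured via 'column.encode.indices' and 'column.encode.classname'); only `java.util.Base64` from the JDK is assumed:

```java
import java.util.Base64;

// Sketch of a write-only Base64 column codec: columns at the configured
// indices are Base64-encoded on write, all other columns are untouched.
public class ColumnEncodeSketch {
    // Return a copy of the row with the given column indices Base64-encoded.
    static String[] encodeColumns(String[] row, int... indices) {
        String[] out = row.clone();
        for (int i : indices) {
            out[i] = Base64.getEncoder().encodeToString(row[i].getBytes());
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample row mirroring the JIRA example (id, name, phone, address).
        String[] row = {"100", "navis", "010-0000-0000", "Seoul, Seocho"};
        for (String col : encodeColumns(row, 2, 3)) {
            System.out.println(col);
        }
    }
}
```

Encoding indices 2 and 3 of the sample row yields exactly the Base64 values shown in the select output above.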
[jira] [Commented] (HIVE-6267) Explain explain
[ https://issues.apache.org/jira/browse/HIVE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890199#comment-13890199 ] Navis commented on HIVE-6267: - [~hagleitn] Generally looks good to me. Simple and concise but things like Position of Big Table should not be removed, imho. I think I was a little pissed off yesterday, which was Sunday for you but Monday for me. I apologize for the rudeness. Explain explain --- Key: HIVE-6267 URL: https://issues.apache.org/jira/browse/HIVE-6267 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.13.0 Attachments: HIVE-6267.1.partial, HIVE-6267.2.partial, HIVE-6267.3.partial, HIVE-6267.4.patch, HIVE-6267.5.patch, HIVE-6267.6.patch, HIVE-6267.7.patch.gz, HIVE-6267.8.patch I've gotten feedback over time saying that it's very difficult to grok our explain command. There's supposedly a lot of information that mainly matters to developers or the testing framework. Comparing it to other major DBs it does seem like we're packing way more into explain than other folks. I've gone through the explain checking, what could be done to improve readability. 
Here's a list of things I've found:
- AST (unreadable in its Lisp syntax, not really required for end users)
- Vectorization (enough to display once per task and only when true)
- Expression representation is very lengthy, could be much more compact
- if not exists on DDL (enough to display only when true, or maybe not at all)
- bucketing info (enough if displayed only if the table is actually bucketed)
- external flag (show only if external)
- GlobalTableId (don't need it in plain explain, maybe in extended)
- Position of big table (already clear from the plan)
- Stats always (most DBs only show stats in explain; that gives a sense of what the planner thinks will happen)
- skew join (only if true should be enough)
- limit doesn't show the actual limit
- Alias
- Map Operator tree
- alias is duplicated in the TableScan operator
- tag is only useful at runtime (move to explain extended)
- Some names are camel case or abbreviated; clearer if the full name is used
- Tez is missing the vertex map (aka edges)
- explain formatted (json) is broken right now (swallows some information)

Since changing explain results in many golden file updates, I'd like to take a stab at all of these at once.
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890217#comment-13890217 ] Thejas M Nair commented on HIVE-6204: - I think, ideally, we should not have test-specific conditions in the main code. One way to work around it would be to use a configurable class that returns the timestamp. In the case of tests, we use a class that just returns a hard-coded time. This would be something similar to dependency injection. But this is probably something that can also be done in a separate jira. If Hive had the ability to use metadata statements such as this in a subquery, we could have solved this by selecting the appropriate fields for testing. Navis, can you please create a reviewboard link as well? The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt {noformat} hive show grant role role1 on all; OK database default table src principalName role1 principalType ROLE privilege Create grantTime Wed Dec 18 14:17:56 KST 2013 grantor navis database default table srcpart principalName role1 principalType ROLE privilege Update grantTime Wed Dec 18 14:18:28 KST 2013 grantor navis {noformat} This should be something like below, especially for JDBC clients. {noformat} hive show grant role role1 on all; OK default src role1 ROLECreate false 1387343876000 navis default srcpart role1 ROLEUpdate false 1387343908000 navis {noformat}
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890218#comment-13890218 ] Thejas M Nair commented on HIVE-6204: - I mean, a configurable class could be used instead of System.currentTimeMillis()/1000 in ObjectStore. The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt {noformat} hive show grant role role1 on all; OK database default table src principalName role1 principalType ROLE privilege Create grantTime Wed Dec 18 14:17:56 KST 2013 grantor navis database default table srcpart principalName role1 principalType ROLE privilege Update grantTime Wed Dec 18 14:18:28 KST 2013 grantor navis {noformat} This should be something like below, especially for JDBC clients. {noformat} hive show grant role role1 on all; OK default src role1 ROLECreate false 1387343876000 navis default srcpart role1 ROLEUpdate false 1387343908000 navis {noformat}
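The configurable-class idea in the two comments above might look roughly like the sketch below. `Clock`, `fixed`, and `grantTime` are hypothetical names, not anything that exists in ObjectStore; this only illustrates the dependency-injection shape being proposed:

```java
// Hypothetical sketch of the proposal: ObjectStore would ask a pluggable
// Clock for grant times instead of calling System.currentTimeMillis()/1000
// directly; tests swap in a fixed clock so golden files need no time masking.
public class ClockSketch {
    interface Clock {
        long nowSeconds();
    }

    // Production clock: what ObjectStore effectively computes today.
    static final Clock SYSTEM = () -> System.currentTimeMillis() / 1000;

    // Test clock: always returns the same instant.
    static Clock fixed(long seconds) {
        return () -> seconds;
    }

    // Stand-in for the point in ObjectStore that records a grant time.
    static long grantTime(Clock clock) {
        return clock.nowSeconds();
    }

    public static void main(String[] args) {
        System.out.println(grantTime(fixed(1387343876L)));
    }
}
```

Wiring the clock in through configuration (rather than a test conf var checked in main code) keeps the test-specific behavior out of the production path, which is the point Thejas raises.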
[jira] [Updated] (HIVE-6256) add batch dropping of partitions to Hive metastore (as well as to dropTable)
[ https://issues.apache.org/jira/browse/HIVE-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6256: --- Attachment: HIVE-6256.nogen.patch HIVE-6256.patch Patch. drop* tests seem to pass add batch dropping of partitions to Hive metastore (as well as to dropTable) Key: HIVE-6256 URL: https://issues.apache.org/jira/browse/HIVE-6256 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6256.nogen.patch, HIVE-6256.patch Metastore drop partitions call drops one partition; when many are being dropped this can be slow. Partitions could be dropped in batch instead, if multiple are dropped via one command. Drop table can also use that. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6361) Un-fork Sqlline
Julian Hyde created HIVE-6361: - Summary: Un-fork Sqlline Key: HIVE-6361 URL: https://issues.apache.org/jira/browse/HIVE-6361 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.12.0 Reporter: Julian Hyde I propose to merge the two development forks of sqlline: Hive's beeline module, and the fork at https://github.com/julianhyde/sqlline. How did the forks come about? Hive's SQL command-line interface Beeline was created by forking Sqlline (see HIVE-987, HIVE-3100), which at the time was a useful but low-activity project languishing on SourceForge without an active owner. Around the same time, Julian Hyde independently started a github repo based on the same code base. Now several projects are using Julian Hyde's sqlline, including Apache Drill, Apache Phoenix, Cascading Lingual and Optiq. Merging these two forks will allow us to pool our resources. (Case in point: Drill issue DRILL-327 had already been fixed in a later version of sqlline; it still exists in beeline.) I propose the following steps:
1. Copy Julian Hyde's sqlline as a new Hive module, hive-sqlline.
2. Port fixes to hive-beeline into hive-sqlline.
3. Make hive-beeline depend on hive-sqlline, and remove code that is identical. What remains in the hive-beeline module is Beeline.java (a derived class of Sqlline.java) and Hive-specific extensions.
4. Make hive-sqlline the official successor to Julian Hyde's sqlline.
This achieves continuity for Hive's users, gives the users of the non-Hive sqlline a version with minimal dependencies, unifies the two code lines, and brings everything under the Apache roof.
Re: Proposal to un-fork Sqlline
On Feb 3, 2014, at 11:15 AM, Xuefu Zhang xzh...@cloudera.com wrote:
> I'm wondering whether it makes more sense to fork sqlline directly into Apache. Upon its completion, Hive gets rid of its copy of sqlline and creates a dependency on the forked sqlline instead. I guess this is a top-down approach, and the benefits are immediate across multiple projects.

You're basically suggesting that I do step 3 before 1 and 2. It makes sense, because it reduces risk. I have logged https://issues.apache.org/jira/browse/HIVE-6361 with an updated proposal.

Julian
[jira] [Commented] (HIVE-6256) add batch dropping of partitions to Hive metastore (as well as to dropTable)
[ https://issues.apache.org/jira/browse/HIVE-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890245#comment-13890245 ] Sergey Shelukhin commented on HIVE-6256: Oh, this patch incorporates a heavily changed HIVE-6342. The approach used is sending expressions to the metastore. Unfortunately, to populate results in the semantic analyzer (which seems to be of dubious value) we still need to fetch partitions from the client as well. Also, JDO requires fetching objects in order to delete them, which is sad... we may need a follow-up patch to do direct SQL deletes, but that may not be safe w.r.t. other code going through DN. add batch dropping of partitions to Hive metastore (as well as to dropTable) Key: HIVE-6256 URL: https://issues.apache.org/jira/browse/HIVE-6256 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6256.nogen.patch, HIVE-6256.patch The metastore drop-partitions call drops one partition; when many are being dropped this can be slow. Partitions could be dropped in a batch instead, if multiple are dropped via one command. Drop table can also use that.
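The batching idea behind this issue — one metastore round-trip per group of partitions instead of one per partition — can be sketched as follows. The `dropPartitions` client call named in the comment is hypothetical; only the generic chunking logic is real code here:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of batch dropping: group partition names into fixed-size batches
// and issue one (hypothetical) metastore drop call per batch, instead of
// one call per partition.
public class BatchDropSketch {
    // Split a list into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            out.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> parts = List.of("ds=1", "ds=2", "ds=3", "ds=4", "ds=5");
        for (List<String> batch : batches(parts, 2)) {
            // hypothetical: client.dropPartitions("db", "tbl", batch);
            System.out.println("drop " + batch);
        }
    }
}
```

With N partitions and batch size B this costs ceil(N/B) metastore calls rather than N, which is the speedup the issue description is after.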
[jira] [Commented] (HIVE-5859) Create view does not capture inputs
[ https://issues.apache.org/jira/browse/HIVE-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890244#comment-13890244 ] Thejas M Nair commented on HIVE-5859: - +1 Create view does not capture inputs Key: HIVE-5859 URL: https://issues.apache.org/jira/browse/HIVE-5859 Project: Hive Issue Type: Bug Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: D14235.1.patch, HIVE-5859.2.patch.txt, HIVE-5859.3.patch.txt, HIVE-5859.4.patch.txt, HIVE-5859.5.patch.txt For example, CREATE VIEW view_j5jbymsx8e_1 as SELECT * FROM tbl_j5jbymsx8e; should capture default.tbl_j5jbymsx8e as an input entity for the authorization process, but currently it does not.
[jira] [Commented] (HIVE-6256) add batch dropping of partitions to Hive metastore (as well as to dropTable)
[ https://issues.apache.org/jira/browse/HIVE-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890248#comment-13890248 ] Sergey Shelukhin commented on HIVE-6256: [~hagleitn] [~ashutoshc] can you please review? add batch dropping of partitions to Hive metastore (as well as to dropTable) Key: HIVE-6256 URL: https://issues.apache.org/jira/browse/HIVE-6256 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6256.nogen.patch, HIVE-6256.patch Metastore drop partitions call drops one partition; when many are being dropped this can be slow. Partitions could be dropped in batch instead, if multiple are dropped via one command. Drop table can also use that. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6327) A few mathematic functions don't take decimal input
[ https://issues.apache.org/jira/browse/HIVE-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6327: -- Attachment: HIVE-6327.1.patch Patch #1 updated: 1. removed a tab; 2. if condition, changed to = 1.0. A few mathematic functions don't take decimal input --- Key: HIVE-6327 URL: https://issues.apache.org/jira/browse/HIVE-6327 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0, 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6327.1.patch, HIVE-6327.patch A few mathematical functions, such as sin(), cos(), etc., don't take decimal as an argument.
{code}
hive> show tables;
OK
Time taken: 0.534 seconds
hive> create table test(d decimal(5,2));
OK
Time taken: 0.351 seconds
hive> select sin(d) from test;
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'd': No matching method for class org.apache.hadoop.hive.ql.udf.UDFSin with (decimal(5,2)). Possible choices: _FUNC_(double)
{code}
HIVE-6246 covers only the sign() function. The remaining ones include sin, cos, tan, asin, acos, atan, exp, ln, log, log10, log2, radians, and sqrt. These are non-generic UDFs.
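The failure shown above happens because these non-generic UDFs only declare a double overload, so a decimal argument finds no matching method. The shape of the fix amounts to accepting decimal input and evaluating it as a double. This standalone sketch (hypothetical class and method, not the actual UDF code, which extends Hive's UDF classes) shows the bridging:

```java
import java.math.BigDecimal;

// Sketch of bridging a decimal argument into a double-based math function:
// convert the BigDecimal to double, then apply the existing double overload.
public class DecimalMathSketch {
    static double sinOfDecimal(BigDecimal d) {
        return Math.sin(d.doubleValue());
    }

    public static void main(String[] args) {
        // Mirrors "select sin(d)" for a decimal(5,2) value of 0.00.
        System.out.println(sinOfDecimal(new BigDecimal("0.00")));
    }
}
```

The conversion is lossy for decimals wider than a double's precision, which is acceptable here because the underlying functions are defined on doubles anyway.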
Review Request 17687: HIVE-6256 add batch dropping of partitions to Hive metastore (as well as to dropTable)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17687/ ---

Review request for hive, Ashutosh Chauhan and Gunther Hagleitner.

Repository: hive-git

Description
---

See jira.

Diffs
-

  hcatalog/core/src/main/java/org/apache/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java 8bb4045
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java 75f54e2
  metastore/if/hive_metastore.thrift e327e2a
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 2d8e483
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java bcbb52e
  metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 377709f
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java e18e13f
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 0715e22
  metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 2e3b6da
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java 6998b43
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java f54ae53
  ql/src/java/org/apache/hadoop/hive/ql/exec/ArchiveUtils.java 598be11
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java a926f1e
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java e59decc
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 46f96ce
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java c51e998
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java f4d9a83
  ql/src/java/org/apache/hadoop/hive/ql/plan/DropTableDesc.java 831aefc
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionSpec.java 5a6553f
  ql/src/test/queries/clientnegative/drop_partition_filter_failure2.q 4d238d7
  ql/src/test/results/clientnegative/drop_partition_failure.q.out 5db9d92
  ql/src/test/results/clientnegative/drop_partition_filter_failure.q.out 863d821

Diff: https://reviews.apache.org/r/17687/diff/

Testing
---

Thanks,

Sergey Shelukhin
[jira] [Commented] (HIVE-6256) add batch dropping of partitions to Hive metastore (as well as to dropTable)
[ https://issues.apache.org/jira/browse/HIVE-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890253#comment-13890253 ] Sergey Shelukhin commented on HIVE-6256: actually let me also look at DN delete by query in the same patch after all. This will be a small scope change, so it's still ready for review. add batch dropping of partitions to Hive metastore (as well as to dropTable) Key: HIVE-6256 URL: https://issues.apache.org/jira/browse/HIVE-6256 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6256.nogen.patch, HIVE-6256.patch Metastore drop partitions call drops one partition; when many are being dropped this can be slow. Partitions could be dropped in batch instead, if multiple are dropped via one command. Drop table can also use that. -- This message was sent by Atlassian JIRA (v6.1.5#6160)