[jira] [Updated] (HIVE-6333) Generate vectorized plan for decimal expressions.
[ https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6333:
    Description: Transform non-vector plan to vectorized plan for supported decimal expressions.

Generate vectorized plan for decimal expressions.
    Key: HIVE-6333  URL: https://issues.apache.org/jira/browse/HIVE-6333
    Project: Hive  Issue Type: Sub-task
    Reporter: Jitendra Nath Pandey  Assignee: Jitendra Nath Pandey
    Attachments: HIVE-6333.1.patch

Transform non-vector plan to vectorized plan for supported decimal expressions.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6333) Generate vectorized plan for decimal expressions.
[ https://issues.apache.org/jira/browse/HIVE-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-6333:
    Attachment: HIVE-6333.1.patch

An early version of the patch.

Generate vectorized plan for decimal expressions.
    Key: HIVE-6333  URL: https://issues.apache.org/jira/browse/HIVE-6333
    Project: Hive  Issue Type: Sub-task
    Reporter: Jitendra Nath Pandey  Assignee: Jitendra Nath Pandey
    Attachments: HIVE-6333.1.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-6356:
    Attachment: HIVE-6356.1.patch.txt

Dependency injection in hbase storage handler is broken
    Key: HIVE-6356  URL: https://issues.apache.org/jira/browse/HIVE-6356
    Project: Hive  Issue Type: Bug  Components: HBase Handler
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: HIVE-6356.1.patch.txt

Dependent jars for hbase are not added to tmpjars, which is caused by a change in the method signature (TableMapReduceUtil.addDependencyJars).

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6356) Dependency injection in hbase storage handler is broken
Navis created HIVE-6356:
    Summary: Dependency injection in hbase storage handler is broken
    Key: HIVE-6356  URL: https://issues.apache.org/jira/browse/HIVE-6356
    Project: Hive  Issue Type: Bug  Components: HBase Handler
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: HIVE-6356.1.patch.txt

Dependent jars for hbase are not added to tmpjars, which is caused by a change in the method signature (TableMapReduceUtil.addDependencyJars).

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-6356:
    Status: Patch Available (was: Open)

Dependency injection in hbase storage handler is broken
    Key: HIVE-6356  URL: https://issues.apache.org/jira/browse/HIVE-6356
    Project: Hive  Issue Type: Bug  Components: HBase Handler
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: HIVE-6356.1.patch.txt

Dependent jars for hbase are not added to tmpjars, which is caused by a change in the method signature (TableMapReduceUtil.addDependencyJars).

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
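The HIVE-6356 description attributes the breakage to a changed signature of TableMapReduceUtil.addDependencyJars. A common way to survive such a change is reflection-based dispatch: probe for the newer overload and fall back to the older one. A minimal sketch of that pattern, using stand-in classes rather than the real HBase utility or the actual Hive patch:

```java
import java.lang.reflect.Method;

public class DependencyJarsShim {
    // Stand-ins for two releases of a utility class whose static method
    // signature changed between versions (names are illustrative only).
    public static class UtilV1 {
        public static String addDependencyJars(String conf) { return "v1:" + conf; }
    }
    public static class UtilV2 {
        public static String addDependencyJars(String conf, boolean addHBaseJars) { return "v2:" + conf; }
    }

    // Dispatch to whichever overload the class on the classpath provides:
    // try the newer two-argument signature first, then fall back to the old one.
    public static String call(Class<?> util, String conf) {
        try {
            Method m = util.getMethod("addDependencyJars", String.class, boolean.class);
            return (String) m.invoke(null, conf, true);
        } catch (NoSuchMethodException e) {
            try {
                Method m = util.getMethod("addDependencyJars", String.class);
                return (String) m.invoke(null, conf);
            } catch (ReflectiveOperationException e2) {
                throw new IllegalStateException(e2);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call(UtilV1.class, "job")); // v1:job
        System.out.println(call(UtilV2.class, "job")); // v2:job
    }
}
```

The cost of the reflective lookup is paid once per job setup, which is why this shim style is acceptable where a hard compile-time dependency on one HBase version is not.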
[jira] [Updated] (HIVE-6047) Permanent UDFs in Hive
[ https://issues.apache.org/jira/browse/HIVE-6047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Dere updated HIVE-6047:
    Attachment: PermanentFunctionsinHive.pdf

Updating the doc to change the jar/file management. Rather than the idea of jar sets, each jar/file would be created as a separate resource and referenced by the UDF. This would make the metastore changes a bit simpler.

Permanent UDFs in Hive
    Key: HIVE-6047  URL: https://issues.apache.org/jira/browse/HIVE-6047
    Project: Hive  Issue Type: Bug  Components: UDF
    Reporter: Jason Dere  Assignee: Jason Dere
    Attachments: PermanentFunctionsinHive.pdf, PermanentFunctionsinHive.pdf

Currently Hive only supports temporary UDFs, which must be re-registered when starting up a Hive session. Provide some support to register permanent UDFs with Hive.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889299#comment-13889299 ]

Hive QA commented on HIVE-4144:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12626585/HIVE-4144.12.patch.txt

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 4999 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1159/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1159/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12626585

Add select database() command to show the current database
    Key: HIVE-4144  URL: https://issues.apache.org/jira/browse/HIVE-4144
    Project: Hive  Issue Type: Bug  Components: SQL
    Reporter: Mark Grover  Assignee: Navis
    Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch

A recent hive-user mailing list conversation asked about having a command to show the current database.
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E

MySQL seems to have a command to do so:
{code}
select database();
{code}
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database

We should look into having something similar in Hive.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889308#comment-13889308 ]

Navis commented on HIVE-4144:

Rebased to trunk

Add select database() command to show the current database
    Key: HIVE-4144  URL: https://issues.apache.org/jira/browse/HIVE-4144
    Project: Hive  Issue Type: Bug  Components: SQL
    Reporter: Mark Grover  Assignee: Navis
    Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch

A recent hive-user mailing list conversation asked about having a command to show the current database.
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E

MySQL seems to have a command to do so:
{code}
select database();
{code}
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database

We should look into having something similar in Hive.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-4144:
    Attachment: HIVE-4144.13.patch.txt

Add select database() command to show the current database
    Key: HIVE-4144  URL: https://issues.apache.org/jira/browse/HIVE-4144
    Project: Hive  Issue Type: Bug  Components: SQL
    Reporter: Mark Grover  Assignee: Navis
    Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch

A recent hive-user mailing list conversation asked about having a command to show the current database.
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E

MySQL seems to have a command to do so:
{code}
select database();
{code}
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database

We should look into having something similar in Hive.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6267) Explain explain
[ https://issues.apache.org/jira/browse/HIVE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889311#comment-13889311 ]

Navis commented on HIVE-6267:

It's breaking all of the pending patches. Is this that good?

Explain explain
    Key: HIVE-6267  URL: https://issues.apache.org/jira/browse/HIVE-6267
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Fix For: 0.13.0
    Attachments: HIVE-6267.1.partial, HIVE-6267.2.partial, HIVE-6267.3.partial, HIVE-6267.4.patch, HIVE-6267.5.patch, HIVE-6267.6.patch, HIVE-6267.7.patch.gz, HIVE-6267.8.patch

I've gotten feedback over time saying that it's very difficult to grok our explain command. There's supposedly a lot of information that mainly matters to developers or the testing framework. Comparing it to other major DBs, it does seem like we're packing way more into explain than other folks. I've gone through the explain output, checking what could be done to improve readability. Here's a list of things I've found:

- AST (unreadable in its Lisp syntax, not really required for end users)
- Vectorization (enough to display once per task and only when true)
- Expression representation is very lengthy, could be much more compact
- "if not exists" on DDL (enough to display only when true, or maybe not at all)
- bucketing info (enough if displayed only when the table is actually bucketed)
- external flag (show only if external)
- GlobalTableId (not needed in plain explain, maybe in extended)
- Position of big table (already clear from the plan)
- Stats always (most DBs only show stats in explain; that gives a sense of what the planner thinks will happen)
- skew join (only if true should be enough)
- limit doesn't show the actual limit
- Alias in Map Operator tree (alias is duplicated in the TableScan operator)
- tag is only useful at runtime (move to explain extended)
- Some names are camel case or abbreviated; clearer if spelled out in full
- Tez is missing the vertex map (aka edges)
- explain formatted (json) is broken right now (swallows some information)

Since changing explain results in many golden file updates, I'd like to take a stab at all of these at once.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6357) Extend the orcalltypes test table to include DECIMAL columns
Remus Rusanu created HIVE-6357:
    Summary: Extend the orcalltypes test table to include DECIMAL columns
    Key: HIVE-6357  URL: https://issues.apache.org/jira/browse/HIVE-6357
    Project: Hive  Issue Type: Sub-task
    Reporter: Remus Rusanu  Priority: Minor

The orcalltypes table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
Gunther Hagleitner created HIVE-6358:
    Summary: filterExpr not printed in explain for tablescan operators (ppd)
    Key: HIVE-6358  URL: https://issues.apache.org/jira/browse/HIVE-6358
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-6358:
    Attachment: HIVE-6358.1.patch

filterExpr not printed in explain for tablescan operators (ppd)
    Key: HIVE-6358  URL: https://issues.apache.org/jira/browse/HIVE-6358
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6358.1.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6353) Update hadoop-2 golden files after HIVE-6267
[ https://issues.apache.org/jira/browse/HIVE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-6353:
    Attachment: HIVE-6353.1.patch

Update hadoop-2 golden files after HIVE-6267
    Key: HIVE-6353  URL: https://issues.apache.org/jira/browse/HIVE-6353
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6353.1.patch

HIVE-6267 changed explain output, with lots of changes to golden files. Separate jira because of the number of files changed.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-6358:
    Status: Patch Available (was: Open)

filterExpr not printed in explain for tablescan operators (ppd)
    Key: HIVE-6358  URL: https://issues.apache.org/jira/browse/HIVE-6358
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6358.1.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6353) Update hadoop-2 golden files after HIVE-6267
[ https://issues.apache.org/jira/browse/HIVE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-6353:
    Status: Patch Available (was: Open)

Update hadoop-2 golden files after HIVE-6267
    Key: HIVE-6353  URL: https://issues.apache.org/jira/browse/HIVE-6353
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6353.1.patch

HIVE-6267 changed explain output, with lots of changes to golden files. Separate jira because of the number of files changed.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6357) Extend the orcalltypes test table to include DECIMAL columns
[ https://issues.apache.org/jira/browse/HIVE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-6357:
    Description: The alltypesorc table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases. alltypesorc was introduced with HIVE-5314.
    (was: The orcalltypes table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases.)

Extend the orcalltypes test table to include DECIMAL columns
    Key: HIVE-6357  URL: https://issues.apache.org/jira/browse/HIVE-6357
    Project: Hive  Issue Type: Sub-task
    Reporter: Remus Rusanu  Priority: Minor

The alltypesorc table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases. alltypesorc was introduced with HIVE-5314.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6357) Extend the alltypesorc test table to include DECIMAL columns
[ https://issues.apache.org/jira/browse/HIVE-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-6357:
    Summary: Extend the alltypesorc test table to include DECIMAL columns (was: Extend the orcalltypes test table to include DECIMAL columns)

Extend the alltypesorc test table to include DECIMAL columns
    Key: HIVE-6357  URL: https://issues.apache.org/jira/browse/HIVE-6357
    Project: Hive  Issue Type: Sub-task
    Reporter: Remus Rusanu  Priority: Minor

The alltypesorc table is used in many vectorized clientpositive tests. As we add support for DECIMAL, it would be nice to have the table include a few DECIMAL columns (various scale/precision) to ease writing new test cases. alltypesorc was introduced with HIVE-5314.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5859) Create view does not capture inputs
[ https://issues.apache.org/jira/browse/HIVE-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889354#comment-13889354 ]

Hive QA commented on HIVE-5859:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12626588/HIVE-5859.5.patch.txt

{color:green}SUCCESS:{color} +1 4997 tests passed

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1160/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1160/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12626588

Create view does not capture inputs
    Key: HIVE-5859  URL: https://issues.apache.org/jira/browse/HIVE-5859
    Project: Hive  Issue Type: Bug  Components: Authorization
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: D14235.1.patch, HIVE-5859.2.patch.txt, HIVE-5859.3.patch.txt, HIVE-5859.4.patch.txt, HIVE-5859.5.patch.txt

For example,
{code}
CREATE VIEW view_j5jbymsx8e_1 as SELECT * FROM tbl_j5jbymsx8e;
{code}
should capture default.tbl_j5jbymsx8e as an input entity for the authorization process, but currently it does not.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6267) Explain explain
[ https://issues.apache.org/jira/browse/HIVE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889380#comment-13889380 ]

Gunther Hagleitner commented on HIVE-6267:

Sorry [~navis]. I tried to keep the disruption as minimal as possible. I collected all the things I thought needed fixing before making the changes. Then I waited until the weekend, at a time when the queue was empty, and tried to get everything back in shape before people start working again. I think it's worth it, otherwise I wouldn't have spent so much time on it. As I mentioned above, I've gotten feedback multiple times about seeing if we can improve explain. Unfortunately, that means tons of golden files. If you can think of a better way, I can back out and try again. But it's not clear to me how to avoid changing that many golden files, since we rely on them so heavily in the q files.

Explain explain
    Key: HIVE-6267  URL: https://issues.apache.org/jira/browse/HIVE-6267
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Fix For: 0.13.0
    Attachments: HIVE-6267.1.partial, HIVE-6267.2.partial, HIVE-6267.3.partial, HIVE-6267.4.patch, HIVE-6267.5.patch, HIVE-6267.6.patch, HIVE-6267.7.patch.gz, HIVE-6267.8.patch

I've gotten feedback over time saying that it's very difficult to grok our explain command. There's supposedly a lot of information that mainly matters to developers or the testing framework. Comparing it to other major DBs, it does seem like we're packing way more into explain than other folks. I've gone through the explain output, checking what could be done to improve readability. Here's a list of things I've found:

- AST (unreadable in its Lisp syntax, not really required for end users)
- Vectorization (enough to display once per task and only when true)
- Expression representation is very lengthy, could be much more compact
- "if not exists" on DDL (enough to display only when true, or maybe not at all)
- bucketing info (enough if displayed only when the table is actually bucketed)
- external flag (show only if external)
- GlobalTableId (not needed in plain explain, maybe in extended)
- Position of big table (already clear from the plan)
- Stats always (most DBs only show stats in explain; that gives a sense of what the planner thinks will happen)
- skew join (only if true should be enough)
- limit doesn't show the actual limit
- Alias in Map Operator tree (alias is duplicated in the TableScan operator)
- tag is only useful at runtime (move to explain extended)
- Some names are camel case or abbreviated; clearer if spelled out in full
- Tez is missing the vertex map (aka edges)
- explain formatted (json) is broken right now (swallows some information)

Since changing explain results in many golden file updates, I'd like to take a stab at all of these at once.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf
[ https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889402#comment-13889402 ]

Lefty Leverenz commented on HIVE-6037:

bq. It's not auto-generated. All by hand (was tedious).

Wow, that's a labor of love. Let's not abandon it; let's use it as a one-time fix to get all the parameter descriptions into HiveConf. Then later we can figure out how to generate hive-default.xml.template from HiveConf for each new release.

However, there's a problem with the default values, since HiveConf sets them, so those are the correct values. Also, updating would be needed for changes since December 15th. But that's easier once the two files are synchronized.

Hmm ... if HiveConf already has a comment that's different from the description in hive-default.xml.template, should both be kept? Ideally they'd be merged; sometimes that's an easy edit, but other times it requires expert information.

To muddy the waters further, the wiki has some release information and miscellaneous notes that don't have to be merged with HiveConf but shouldn't get lost if we eventually generate the wiki from one of the code files. I've been wanting to go through the list and fill in the "Added In:" information by grepping the config names in a directory of HiveConf files for all the branches. (Another labor of love, but it's less important than making sure all the parameters are listed in the wiki and hive-default.xml.template.)

[~cwsteinbach] created the Configuration Properties wikidoc in the first place -- how was that done?

Synchronize HiveConf with hive-default.xml.template and support show conf
    Key: HIVE-6037  URL: https://issues.apache.org/jira/browse/HIVE-6037
    Project: Hive  Issue Type: Improvement  Components: Configuration
    Reporter: Navis  Priority: Minor
    Attachments: HIVE-6037.1.patch.txt

see HIVE-5879

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6354) Some index test golden files produce non-deterministic stats in explain
[ https://issues.apache.org/jira/browse/HIVE-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889409#comment-13889409 ]

Hive QA commented on HIVE-6354:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12626589/HIVE-6354.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4994 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1162/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1162/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12626589

Some index test golden files produce non-deterministic stats in explain
    Key: HIVE-6354  URL: https://issues.apache.org/jira/browse/HIVE-6354
    Project: Hive  Issue Type: Bug
    Reporter: Gunther Hagleitner  Assignee: Gunther Hagleitner
    Attachments: HIVE-6354.1.patch

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6329) Support column level encryption/decryption
[ https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889457#comment-13889457 ]

Hive QA commented on HIVE-6329:

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12626591/HIVE-6329.3.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4998 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1163/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1163/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12626591

Support column level encryption/decryption
    Key: HIVE-6329  URL: https://issues.apache.org/jira/browse/HIVE-6329
    Project: Hive  Issue Type: New Feature  Components: Security, Serializers/Deserializers
    Reporter: Navis  Assignee: Navis  Priority: Minor
    Attachments: HIVE-6329.1.patch.txt, HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt

We have been receiving some requirements on encryption recently, but Hive does not support it. Before the full implementation via HIVE-5207, this might be useful for some cases.

{noformat}
hive> create table encode_test(id int, name STRING, phone STRING, address STRING)
    > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    > WITH SERDEPROPERTIES ('column.encode.indices'='2,3', 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly')
    > STORED AS TEXTFILE;
OK
Time taken: 0.584 seconds
hive> insert into table encode_test select 100,'navis','010-0000-0000','Seoul, Seocho' from src tablesample (1 rows);
..
OK
Time taken: 5.121 seconds
hive> select * from encode_test;
OK
100	navis	MDEwLTAwMDAtMDAwMA==	U2VvdWwsIFNlb2Nobw==
Time taken: 0.078 seconds, Fetched: 1 row(s)
hive>
{noformat}

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
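The stored values in the sample session above are plain Base64: decoding MDEwLTAwMDAtMDAwMA== yields 010-0000-0000, and U2VvdWwsIFNlb2Nobw== yields Seoul, Seocho. A quick standalone check with java.util.Base64 (independent of Hive; the class name is just for illustration):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64ColumnCheck {
    // Encode a column value the way a Base64 write-only encoder would store it.
    static String encode(String column) {
        return Base64.getEncoder().encodeToString(column.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a stored value back to the original column text.
    static String decode(String stored) {
        return new String(Base64.getDecoder().decode(stored), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // These match the encoded phone and address columns in the query output.
        System.out.println(encode("010-0000-0000")); // MDEwLTAwMDAtMDAwMA==
        System.out.println(encode("Seoul, Seocho")); // U2VvdWwsIFNlb2Nobw==
        System.out.println(decode("MDEwLTAwMDAtMDAwMA==")); // 010-0000-0000
    }
}
```

Since Base64 is an encoding rather than encryption, this also illustrates why the SerDe approach is positioned as a stopgap before the full HIVE-5207 work.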
[jira] [Commented] (HIVE-6336) Issue is hive 12 datanucleus incompatibility with org.apache.hadoop.hive.contrib.serde2.RegexSerDe
[ https://issues.apache.org/jira/browse/HIVE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13889488#comment-13889488 ]

Andy Jefferson commented on HIVE-6336:

@Nigel Savage, the most recent release is as follows: datanucleus-core-3.2.12, datanucleus-api-jdo-3.2.8, datanucleus-rdbms-3.2.11. Note that HIVE-5218 requires datanucleus-rdbms-3.2.7 or later, and HIVE-6136 requires datanucleus-rdbms-3.2.11 too.

Issue is hive 12 datanucleus incompatibility with org.apache.hadoop.hive.contrib.serde2.RegexSerDe
    Key: HIVE-6336  URL: https://issues.apache.org/jira/browse/HIVE-6336
    Project: Hive  Issue Type: Wish  Components: HiveServer2
    Affects Versions: 0.12.0
    Environment: Hadoop 2.2, local Derby metastore (embedded)
    Reporter: Nigel Savage  Priority: Blocker
    Labels: HADOOP

There is a Hive 12 DataNucleus incompatibility which seems to affect org.apache.hadoop.hive.contrib.serde2.RegexSerDe. The main question: *If Hive 0.12.0 and DataNucleus are compatible, then what version of DataNucleus should I be using with Hive 12 and Hadoop 2.2?* The error I'm getting blocks me from properly running Hive queries invoked from the test phase of a Maven project.

*To reproduce*

I have Hadoop and Hive running as a pseudo cluster in local mode and Derby as the metastore. I have the following environment variables:
{noformat}
HADOOP_HOME=/home/ubu/hadoop
JAVA_HOME=/usr/lib/jvm/java-7-oracle
{noformat}

I have the RegexSerDe declared in the hive-site.xml:
{noformat}
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/ubu/hadoop/lib/hive-contrib-0.12.0.jar</value>
  <description>This JAR file available to all users for all jobs</description>
</property>
{noformat}

If I run with
{noformat}
<datanucleus.version>3.0.2</datanucleus.version>
{noformat}
I get the following exception only: 'java.lang.ClassNotFoundException...org.datanucleus.store.types.backed.Ma'

HOWEVER, if I run with
{noformat}
<datanucleus.version>3.2.0-release</datanucleus.version>
{noformat}
I get the following exception only: java.lang.ClassNotFoundException: org/apache/hadoop/hive/contrib/serde2/RegexSerDe

EXPLANATION: The RegexSerDe class is picked up at run time but the DataNucleus Map class is not available. I have checked in the datanucleus-core 3.0.2 jar and it is missing. Upgrading to the first DataNucleus above 3.0.2 that includes the Map class throws the ClassNotFoundException for RegexSerDe. With the earlier *3.0.2* DataNucleus, the code fails with the missing Map class but the RegexSerDe class is found; when I upgrade to the 3.2.0-release, the Map class is found but for some unknown reason the code/Hive no longer finds the RegexSerDe class.

I started using the same DataNucleus dependencies found in this Hive pom: http://maven-repository.com/artifact/org.apache.hive/hive-metastore/0.12.0/pom. Below are the dependencies from my latest attempts to get a functioning pom:
{noformat}
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-server</artifactId>
  <version>0.96.0-hadoop2</version>
</dependency>
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.96.0-hadoop2</version>
</dependency>
<!-- misc -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-lang3</artifactId>
  <version>3.1</version>
</dependency>
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>${guava.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.derby</groupId>
  <artifactId>derby</artifactId>
  <version>${derby.version}</version>
</dependency>
<dependency>
  <groupId>org.datanucleus</groupId>
  <artifactId>datanucleus-core</artifactId>
  <version>${datanucleus.version}</version>
</dependency>
<dependency>
  <groupId>org.datanucleus</groupId>
  <artifactId>datanucleus-rdbms</artifactId>
  <version>${datanucleus-rdbms.version}</version>
</dependency>
<dependency>
  <groupId>javax.jdo</groupId>
  <artifactId>jdo-api</artifactId>
  <version>3.0.1</version>
</dependency>
<dependency>
  <groupId>org.datanucleus</groupId>
  <artifactId>datanucleus-api-jdo</artifactId>
  <version>${datanucleus.jdo.version}</version>
  <exclusions>
    <exclusion>
{noformat}
[jira] [Commented] (HIVE-5252) Add ql syntax for inline java code creation
[ https://issues.apache.org/jira/browse/HIVE-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889489#comment-13889489 ] Lefty Leverenz commented on HIVE-5252: -- I added compile to the list of values for hive.security.command.whitelist in the wiki, but it needs to be added to *hive-default.xml.template* too. * [Configuration Properties: hive.security.command.whitelist |https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.security.command.whitelist] Also, the wiki needs to explain inline Java code creation with a few examples. (I'd need more to go on than compile_processor.q in the patch.) Does it belong in the SELECT wikidoc or a new wikidoc, or with the UDFs? How about a new wikidoc under the CLI doc? * [Language Manual |https://cwiki.apache.org/confluence/display/Hive/LanguageManual] * [SELECT |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select] * [Operators and UDFs |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF] * [CLI |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] Or should documentation wait until the parent jira HIVE-5250 gets committed? Add ql syntax for inline java code creation --- Key: HIVE-5252 URL: https://issues.apache.org/jira/browse/HIVE-5252 Project: Hive Issue Type: Sub-task Reporter: Edward Capriolo Assignee: Edward Capriolo Fix For: 0.13.0 Attachments: HIVE-5252.1.patch.txt, HIVE-5252.2.patch.txt Something to the effect of compile 'my code here' using 'groovycompiler'. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17661: HIVE-6327: A few mathematic functions don't take decimal input
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/ --- Review request for hive. Bugs: HIVE-6327 https://issues.apache.org/jira/browse/HIVE-6327 Repository: hive-git Description --- Added methods for those UDFs such that decimal type can be accepted and evaluated. Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 4e2b50e ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java f7adce4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java fb861a5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java d9ce12d ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java 9c8c836 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java 883b541 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java a90e622 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 1cc70a4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java e3f2026 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java 47e18ee ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java e714c17 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 4eefc8c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java 8ebd727 ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java PRE-CREATION Diff: https://reviews.apache.org/r/17661/diff/ Testing --- New unit test cases are added to test all the UDFs regarding decimal input. Thanks, Xuefu Zhang
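The approach in this patch, per the description above, is to give each math UDF a decimal overload that converts the input and delegates to the existing double-based evaluation. Below is a standalone sketch of that idea using java.math.BigDecimal in place of Hive's HiveDecimal/DoubleWritable wrappers; the class and method names here are illustrative, not Hive's actual UDF code.

```java
import java.math.BigDecimal;

// Standalone sketch (not Hive's actual UDF code): decimal inputs for math
// functions can be supported by converting to double, applying the
// java.lang.Math function, and returning the result.
public class DecimalMathSketch {

    // Mirrors an evaluate(double) overload.
    public static Double log(double d) {
        if (d <= 0.0) {
            return null; // Hive's convention: NULL for out-of-domain input
        }
        return Math.log(d);
    }

    // Decimal overload: convert, delegate, return. Precision beyond what a
    // double can hold is necessarily lost in the conversion.
    public static Double log(BigDecimal d) {
        if (d == null) {
            return null;
        }
        return log(d.doubleValue());
    }

    public static void main(String[] args) {
        System.out.println(log(new BigDecimal("10.0")));
    }
}
```

The notable design point is returning NULL rather than throwing for out-of-domain input (e.g. log of a non-positive number), which matches how Hive math UDFs behave.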
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889509#comment-13889509 ] Brock Noland commented on HIVE-5783: That test failure is unrelated to the patch. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889527#comment-13889527 ] Hive QA commented on HIVE-6204: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626608/HIVE-6204.1.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 4997 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_keyword_1 org.apache.hadoop.hive.jdbc.TestJdbcDriver.testShowGrant {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1164/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1164/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12626608 The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt {noformat} hive> show grant role role1 on all; OK database default table src principalName role1 principalType ROLE privilege Create grantTime Wed Dec 18 14:17:56 KST 2013 grantor navis database default table srcpart principalName role1 principalType ROLE privilege Update grantTime Wed Dec 18 14:18:28 KST 2013 grantor navis {noformat} This should be something like below, especially for JDBC clients. 
{noformat} hive> show grant role role1 on all; OK default src role1 ROLE Create false 1387343876000 navis default srcpart role1 ROLE Update false 1387343908000 navis {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf
[ https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889529#comment-13889529 ] Brock Noland commented on HIVE-6037: bq. Then would hive-default.xml get used when hive-site.xml doesn't exist? I believe the HiveConf class would then use hive-default.xml to load defaults and descriptions. bq. Wow, that's a labor of love. Let's not abandon it, Agreed, I think we should go forward with this patch bq. let's use it as a one-time fix to get all the parameter descriptions into HiveConf. Then later we can figure out how to generate hive-default.xml.template from HiveConf for each new release. However there's a problem with the default values, since HiveConf sets them so those are the correct values. AFAICT this patch moves the descriptions and defaults to hive-default.xml.template. I think the next step is generating hive-default.xml from the template. Then the HiveConf class uses hive-default.xml to load defaults and descriptions. Synchronize HiveConf with hive-default.xml.template and support show conf - Key: HIVE-6037 URL: https://issues.apache.org/jira/browse/HIVE-6037 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Priority: Minor Attachments: HIVE-6037.1.patch.txt see HIVE-5879 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
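The overlay behavior being discussed (hive-default.xml supplies defaults and descriptions, hive-site.xml overrides them) can be sketched with java.util.Properties defaults chaining. This is a simplified stand-in: HiveConf actually builds on Hadoop's Configuration class, and the two property values below are only examples.

```java
import java.util.Properties;

// Simplified sketch of the defaults-overlay idea: a "site" Properties object
// backed by a "defaults" one, the way HiveConf might load hive-default.xml
// before overlaying hive-site.xml. Hive really uses Hadoop's Configuration;
// java.util.Properties stands in here for illustration.
public class ConfOverlaySketch {
    public static Properties load() {
        Properties defaults = new Properties();
        defaults.setProperty("hive.exec.parallel", "false");
        defaults.setProperty("hive.exec.reducers.max", "999");

        // Site config falls back to defaults for any key it omits.
        Properties site = new Properties(defaults);
        site.setProperty("hive.exec.parallel", "true"); // user override
        return site;
    }

    public static void main(String[] args) {
        Properties conf = load();
        System.out.println(conf.getProperty("hive.exec.parallel"));     // override wins
        System.out.println(conf.getProperty("hive.exec.reducers.max")); // default used
    }
}
```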
[jira] [Commented] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889605#comment-13889605 ] Hive QA commented on HIVE-6356: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626614/HIVE-6356.1.patch.txt {color:green}SUCCESS:{color} +1 4997 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1165/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1165/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12626614 Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889611#comment-13889611 ] Ashutosh Chauhan commented on HIVE-6204: Patch looks good. Failure seems to be related to update of .q.out file. I wonder, for grant time, is there any value in showing that at all. Shall we not display it ever? I want to avoid a test conf variable, since current usage looks fine, but in future once it's in, folks may abuse it. The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt {noformat} hive> show grant role role1 on all; OK database default table src principalName role1 principalType ROLE privilege Create grantTime Wed Dec 18 14:17:56 KST 2013 grantor navis database default table srcpart principalName role1 principalType ROLE privilege Update grantTime Wed Dec 18 14:18:28 KST 2013 grantor navis {noformat} This should be something like below, especially for JDBC clients. {noformat} hive> show grant role role1 on all; OK default src role1 ROLE Create false 1387343876000 navis default srcpart role1 ROLE Update false 1387343908000 navis {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6122) Implement show grant on resource
[ https://issues.apache.org/jira/browse/HIVE-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889618#comment-13889618 ] Ashutosh Chauhan commented on HIVE-6122: Certainly resolving conflicts was easier than writing patch in first place : ) Thanks for all your work! Implement show grant on resource -- Key: HIVE-6122 URL: https://issues.apache.org/jira/browse/HIVE-6122 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6122.1.patch.txt, HIVE-6122.2.patch.txt, HIVE-6122.3.patch.txt, HIVE-6122.4.patch, HIVE-6122.4.patch, HIVE-6122.5.patch, HIVE-6122.6.patch Currently, hive shows privileges owned by a principal. Reverse API is also needed, which shows all principals for a resource. {noformat} show grant user hive_test_user on database default; show grant user hive_test_user on table dummy; show grant user hive_test_user on all; {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889639#comment-13889639 ] Ashutosh Chauhan commented on HIVE-6356: Is htrace strictly required? If so, then don't we need to make sure its jar is available at run-time (currently it seems we don't have it in our lib/ dir of package?) [~ndimiduk] Can you also take a look? Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6359) beeline -f fails on scripts with tabs in them.
Carter Shanklin created HIVE-6359: - Summary: beeline -f fails on scripts with tabs in them. Key: HIVE-6359 URL: https://issues.apache.org/jira/browse/HIVE-6359 Project: Hive Issue Type: Bug Reporter: Carter Shanklin Priority: Minor On a recent trunk build I used beeline -f on a script with tabs in it. Beeline rather unhelpfully attempts to perform tab expansion on the tabs and the query fails. Here's a screendump. {code} Connecting to jdbc:hive2://mymachine:1/mydb Connected to: Apache Hive (version 0.13.0-SNAPSHOT) Driver: Hive JDBC (version 0.13.0-SNAPSHOT) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 0.13.0-SNAPSHOT by Apache Hive 0: jdbc:hive2://mymachine:1/mydb select i_brand_id as brand_id, i_brand as brand, . . . . . . . . . . . . . . . . . . . . . . . Display all 560 possibilities? (y or n) . . . . . . . . . . . . . . . . . . . . . . . ager_id=36 . . . . . . . . . . . . . . . . . . . . . . . Display all 560 possibilities? (y or n) . . . . . . . . . . . . . . . . . . . . . . . d d_moy=12 . . . . . . . . . . . . . . . . . . . . . . . Display all 560 possibilities? (y or n) . . . . . . . . . . . . . . . . . . . . . . . d d_year=2001 . . . . . . . . . . . . . . . . . . . . . . . and ss_sold_date between '2001-12-01' and '2001-12-31' . . . . . . . . . . . . . . . . . . . . . . . group by i_brand, i_brand_id . . . . . . . . . . . . . . . . . . . . . . . order by ext_price desc, brand_id . . . . . . . . . . . . . . . . . . . . . . . limit 100 ; Error: Error while compiling statement: FAILED: ParseException line 1:65 missing FROM at 'd_moy' near 'd' in from source (state=42000,code=4) Closing: org.apache.hive.jdbc.HiveConnection {code} The same query works fine if I replace tabs with some spaces. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
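Until beeline stops performing tab expansion on scripted input, one hypothetical client-side workaround is to rewrite tabs to spaces before the lines reach the interactive reader, keeping tabs inside quoted literals since those may be data. A sketch; the class name and the simple quoting rules are assumptions, not beeline code.

```java
// Hypothetical workaround for the tab-completion problem described above:
// rewrite tabs to spaces outside quoted string literals before handing
// script lines to an interactive shell.
public class TabRewriter {
    public static String rewrite(String line) {
        StringBuilder out = new StringBuilder(line.length());
        boolean inQuote = false;
        char quote = 0;
        for (char c : line.toCharArray()) {
            if (inQuote) {
                out.append(c); // preserve everything inside quotes, tabs included
                if (c == quote) inQuote = false;
            } else if (c == '\'' || c == '"') {
                inQuote = true;
                quote = c;
                out.append(c);
            } else {
                out.append(c == '\t' ? ' ' : c); // tab -> space elsewhere
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite("select\ti_brand\tfrom t where x = 'a\tb';"));
    }
}
```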
Re: Review Request 17622: VectorExpressionWriter for date and decimal datatypes.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/#review33378 --- Looks good to me. See one comment inline. ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java https://reviews.apache.org/r/17622/#comment62885 Please add a comment why you are using decimal.* and why it's different than the others. - Eric Hanson On Jan. 31, 2014, 10:19 p.m., Jitendra Pandey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17622/ --- (Updated Jan. 31, 2014, 10:19 p.m.) Review request for hive and Eric Hanson. Repository: hive-git Description --- VectorExpressionWriter for date and decimal datatypes. Diffs - common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 729908a ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java f513188 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriter.java e5c3aa4 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java a242fef ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java ad96fa5 ql/src/test/queries/clientpositive/vectorization_decimal_date.q PRE-CREATION ql/src/test/results/clientpositive/vectorization_decimal_date.q.out PRE-CREATION Diff: https://reviews.apache.org/r/17622/diff/ Testing --- Thanks, Jitendra Pandey
[jira] [Commented] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889712#comment-13889712 ] Hive QA commented on HIVE-4144: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626617/HIVE-4144.13.patch.txt {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 4999 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 org.apache.hcatalog.hbase.snapshot.lock.TestWriteLock.testRun {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1166/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1166/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12626617 Add select database() command to show the current database Key: HIVE-4144 URL: https://issues.apache.org/jira/browse/HIVE-4144 Project: Hive Issue Type: Bug Components: SQL Reporter: Mark Grover Assignee: Navis Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch A recent hive-user mailing list conversation asked about having a command to show the current database. 
http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E MySQL seems to have a command to do so: {code} select database(); {code} http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database We should look into having something similar in Hive. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889708#comment-13889708 ] Nick Dimiduk commented on HIVE-6356: I stumbled into this recently as well. HTrace is now a required runtime dependency, even when it's not used. This patch is incorrect, however. Because Hive is using the mapred namespace classes, the correct API is to invoke o.a.h.hbase.mapred.TableMapReduceUtil#addDependencyJars(JobConf). This will wire in all of HBase's runtime dependencies for you, and also attempt to auto-detect additional dependencies based on the JobConf (output classes, partitioners, formats, etc). If you want more fine-grained control over these dependencies (as Pig did, see PIG-3285), there are additional static methods in the o.a.h.hbase.mapreduce.TableMapReduceUtil class. For Hive's purpose, I think you'll be fine with just calling mapred.TableMapReduceUtil#addDependencyJars(JobConf). Having a smoke test that runs in pseudo-distributed mode would be helpful in verifying all requirements are met. Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Work stopped] (HIVE-6234) Implement fast vectorized InputFormat extension for text files
[ https://issues.apache.org/jira/browse/HIVE-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-6234 stopped by Eric Hanson. Implement fast vectorized InputFormat extension for text files -- Key: HIVE-6234 URL: https://issues.apache.org/jira/browse/HIVE-6234 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-6234.02.patch, HIVE-6234.03.patch, Vectorized Text InputFormat design.docx, Vectorized Text InputFormat design.pdf, state-diagram.jpg Implement support for vectorized scan input of text files (plain text with configurable record and field separators). This should work for CSV files, tab delimited files, etc. The goal is to provide high-performance reading of these files using vectorized scans, and also to do it as an extension of existing Hive. Then, if vectorized query is enabled, existing tables based on text files will be able to benefit immediately without the need to use a different input format. After upgrading to new Hive bits that support this, faster, vectorized processing over existing text tables should just work, when vectorization is enabled. Another goal is to go beyond a simple layering of vectorized row batch iterator over the top of the existing row iterator. It should be possible to, say, read a chunk of data into a byte buffer (several thousand or even million rows), and then read data from it into vectorized row batches directly. Object creations should be minimized to save allocation time and GC overhead. If it is possible to save CPU for values like dates and numbers by caching the translation from string to the final data type, that should ideally be implemented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
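The "read a chunk into a byte buffer, then fill vectorized row batches directly" idea above can be sketched in plain Java: parse delimited bytes into a primitive array with no per-row object creation. The batch size, the newline separator, and the single integer column are assumptions for illustration; Hive's real VectorizedRowBatch and input format are far more general.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch of the batching idea in this issue: scan a chunk of
// delimited ASCII text held in a byte array and fill a fixed-size primitive
// array ("vectorized row batch") directly, creating no per-row objects.
public class TextBatchReader {
    public static final int BATCH_SIZE = 1024;

    // Parses newline-separated records of a single integer column into
    // batch[0..n); returns the number of rows filled.
    public static int fillBatch(byte[] buf, long[] batch) {
        int rows = 0;
        long value = 0;
        boolean negative = false, inField = false;
        for (byte b : buf) {
            if (b == '\n') {
                if (inField) {
                    batch[rows++] = negative ? -value : value;
                    if (rows == batch.length) break;
                }
                value = 0; negative = false; inField = false;
            } else if (b == '-') {
                negative = true; inField = true;
            } else {
                value = value * 10 + (b - '0'); // digits parsed in place
                inField = true;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        byte[] chunk = "17\n-3\n42\n".getBytes(StandardCharsets.US_ASCII);
        long[] batch = new long[BATCH_SIZE];
        int n = fillBatch(chunk, batch);
        System.out.println(n + " rows, first=" + batch[0]);
    }
}
```

The point of the design is visible even in this toy: the hot loop touches only primitives, so allocation and GC cost per row is zero, which is exactly what the issue asks of the real reader.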
[jira] [Resolved] (HIVE-6232) allow user to control out-of-range values in HCatStorer
[ https://issues.apache.org/jira/browse/HIVE-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman resolved HIVE-6232. -- Resolution: Won't Fix This was rolled into the patch for HIVE-5814 allow user to control out-of-range values in HCatStorer --- Key: HIVE-6232 URL: https://issues.apache.org/jira/browse/HIVE-6232 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Pig values support a wider range than Hive, e.g. Pig BIGDECIMAL vs Hive DECIMAL. When storing Pig data into a Hive table, if the value is out of range there are 2 options: 1. throw an exception. 2. write NULL instead of the value. The 1st has the drawback that it may kill the process that loads 100M rows after 90M rows have been loaded. But the 2nd may not be appropriate for all use cases. Should add support for additional parameters in HCatStorer where the user can specify an option to control this. see org.apache.pig.backend.hadoop.hbase.HBaseStorage for examples -- This message was sent by Atlassian JIRA (v6.1.5#6160)
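The two options described (throw an exception vs. write NULL) amount to a policy switch applied at store time. A hedged sketch, with an illustrative precision bound standing in for the Hive DECIMAL limit; the class and method names are not HCatalog's.

```java
import java.math.BigDecimal;

// Sketch of the two out-of-range policies discussed above, applied to a
// Pig-style BigDecimal being stored into a bounded Hive DECIMAL column.
// The precision limit and names are illustrative, not HCatalog code.
public class OutOfRangePolicy {
    public static final int MAX_PRECISION = 38; // Hive decimal precision limit

    public static BigDecimal store(BigDecimal v, boolean nullOnOverflow) {
        if (v == null || v.precision() <= MAX_PRECISION) {
            return v; // fits; store as-is
        }
        if (nullOnOverflow) {
            return null; // option 2: write NULL instead of the value
        }
        // option 1: fail fast (may kill a load that is 90M rows in)
        throw new IllegalArgumentException(
            "value exceeds DECIMAL(" + MAX_PRECISION + ")");
    }
}
```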
[jira] [Updated] (HIVE-6316) Document support for new types in HCat
[ https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6316: - Description: HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Also, HCatStorer now takes new parameter which is described in the PDF doc. was: HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Document support for new types in HCat -- Key: HIVE-6316 URL: https://issues.apache.org/jira/browse/HIVE-6316 Project: Hive Issue Type: Sub-task Components: Documentation, HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Also, HCatStorer now takes new parameter which is described in the PDF doc. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Proposal to un-fork Sqlline
As you probably know, Hive’s SQL command-line interface Beeline was created by forking Sqlline [1] [2]. At the time it was a useful but low-activity project languishing on SourceForge without an active owner. Around the same time, I independently picked up the Sqlline code, moved it to github [3], put in place a maven build process, and gave it some love. Now several projects are using it, including Apache Drill, Apache Phoenix, Cascading Lingual and Optiq. So, now we have two active forks of Sqlline. I propose to merge these development forks. This will achieve a few things. We should be able to fix more bugs, and add more features, and get more people using sqlline. (Just today, someone ran into a bug that Drill was not saving/restoring command history, then noticed that it was fixed in sqlline-1.1.3 [4] [5]. It seems that that bug still exists in Hive’s beeline.) I propose the following: 1. Move the parts of hive-beeline module that do not depend upon Hive (about 90% of the code) into a new module in the hive repo, hive-sqlline. 2. What remains in the hive-beeline module is Beeline.java (a derived class of Sqlline.java) and Hive-specific extensions. The hive-beeline module depends upon the hive-sqlline module. 3. Make sure that the new Hive sqlline module contains all fixes and useful changes from both forks. 4. Release sqlline as a maven artifact, say {groupId=org.apache.hive, artifactId=hive-sqlline} and tell clients of julianhyde-sqlline to migrate to it. 5. Longer term, consider moving hive-sqlline out of Hive, but still within Apache. This achieves continuity for Hive’s users, gives the users of the non-Hive sqlline a version with minimal dependencies, unifies the two code lines, and brings everything under the Apache roof. Please let me know if this sounds like a good proposal. I’ll log a jira case, then start work on a patch. 
Julian [1] https://issues.apache.org/jira/browse/HIVE-987 [2] https://issues.apache.org/jira/browse/HIVE-3100 [3] https://github.com/julianhyde/sqlline [4] https://github.com/julianhyde/sqlline/issues/19 [5] https://issues.apache.org/jira/browse/DRILL-327
[jira] [Commented] (HIVE-6268) Network resource leak with HiveClientCache when using HCatInputFormat
[ https://issues.apache.org/jira/browse/HIVE-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889741#comment-13889741 ] Sushanth Sowmyan commented on HIVE-6268: Hi Lefty, Yes, I think we should document it in a release note. I'm planning on finishing HIVE-6332 in a week but it's good to have it in a release note as well. Network resource leak with HiveClientCache when using HCatInputFormat - Key: HIVE-6268 URL: https://issues.apache.org/jira/browse/HIVE-6268 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.13.0 Attachments: HIVE-6268.2.patch, HIVE-6268.3.patch, HIVE-6268.patch HCatInputFormat has a cache feature that allows HCat to cache hive client connections to the metastore, so as to not keep reinstantiating a new hive server every single time. This uses a guava cache of hive clients, which only evicts entries from cache on the next write, or by manually managing the cache. So, in a single threaded case, where we reuse the hive client, the cache works well, but in a massively multithreaded case, where each thread might perform one action, and then is never used, there are no more writes to the cache, and all the clients stay alive, thus keeping ports open. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
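The failure mode described above, a cache whose expired entries are only reclaimed when a later write triggers eviction, can be modeled in a few lines: if every thread performs one put and then goes away, nothing else writes, so expired clients (and their open ports) linger until an explicit maintenance call. This toy model is not HiveClientCache's code; entry types and timings are assumptions.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the leak: expiry is enforced only when a write happens,
// so an idle cache never shrinks; an explicit cleanUp() (as guava's Cache
// offers) reclaims expired entries without waiting for the next write.
public class IdleCacheSketch {
    static final long TTL_MS = 1;
    final Map<String, Long> created = new LinkedHashMap<>();

    void put(String key, long now) {
        evictExpired(now);   // eviction happens only on writes
        created.put(key, now);
    }

    void cleanUp(long now) { // explicit maintenance call
        evictExpired(now);
    }

    private void evictExpired(long now) {
        // entries are in insertion order, so oldest come first
        for (Iterator<Long> it = created.values().iterator(); it.hasNext();) {
            if (now - it.next() > TTL_MS) it.remove(); else break;
        }
    }

    int size() { return created.size(); }

    public static void main(String[] args) {
        IdleCacheSketch cache = new IdleCacheSketch();
        cache.put("client-1", 0);
        // client-1 is long expired, but with no further writes it lingers:
        System.out.println(cache.size()); // 1
        cache.cleanUp(1_000);
        System.out.println(cache.size()); // 0
    }
}
```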
[jira] [Commented] (HIVE-6342) hive drop partitions should use standard expr filter instead of some custom class
[ https://issues.apache.org/jira/browse/HIVE-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889751#comment-13889751 ] Sergey Shelukhin commented on HIVE-6342: parallel_orderby appears to be flaky, unrelated failure hive drop partitions should use standard expr filter instead of some custom class -- Key: HIVE-6342 URL: https://issues.apache.org/jira/browse/HIVE-6342 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6342.01.patch, HIVE-6342.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Proposal to un-fork Sqlline
Hi Julian, Thanks for sharing your thoughts. I'm certainly on board with code sharing among projects. However, I don't see immediate benefits for Hive in separating Beeline into two modules. Instead, it requires additional work and potentially creates instability, while code sharing isn't achieved until the proposed hive-sqlline module is promoted to an independent project. On the other hand, I'm wondering if it makes more sense to fork sqlline directly into Apache. Upon its completion, Hive gets rid of its copy of sqlline and creates a dependency on the forked sqlline instead. I guess this is a top-down approach and the benefits are immediate across multiple projects. Thanks, Xuefu On Mon, Feb 3, 2014 at 10:49 AM, Julian Hyde julianh...@gmail.com wrote: As you probably know, Hive's SQL command-line interface Beeline was created by forking Sqlline [1] [2]. At the time it was a useful but low-activity project languishing on SourceForge without an active owner. Around the same time, I independently picked up the Sqlline code, moved it to github [3], put in place a maven build process, and gave it some love. Now several projects are using it, including Apache Drill, Apache Phoenix, Cascading Lingual and Optiq. So, now we have two active forks of Sqlline. I propose to merge these development forks. This will achieve a few things. We should be able to fix more bugs, and add more features, and get more people using sqlline. (Just today, someone ran into a bug that Drill was not saving/restoring command history, then noticed that it was fixed in sqlline-1.1.3 [4] [5]. It seems that that bug still exists in Hive's beeline.) I propose the following: 1. Move the parts of hive-beeline module that do not depend upon Hive (about 90% of the code) into a new module in the hive repo, hive-sqlline. 2. What remains in the hive-beeline module is Beeline.java (a derived class of Sqlline.java) and Hive-specific extensions. The hive-beeline module depends upon the hive-sqlline module. 3. 
Make sure that the new Hive sqlline module contains all fixes and useful changes from both forks. 4. Release sqlline as a maven artifact, say {groupId=org.apache.hive, artifactId=hive-sqlline} and tell clients of julianhyde-sqlline to migrate to it. 5. Longer term, consider moving hive-sqlline out of Hive, but still within Apache. This achieves continuity for Hive's users, gives the users of the non-Hive sqlline a version with minimal dependencies, unifies the two code lines, and brings everything under the Apache roof. Please let me know if this sounds like a good proposal. I'll log a jira case, then start work on a patch. Julian [1] https://issues.apache.org/jira/browse/HIVE-987 [2] https://issues.apache.org/jira/browse/HIVE-3100 [3] https://github.com/julianhyde/sqlline [4] https://github.com/julianhyde/sqlline/issues/19 [5] https://issues.apache.org/jira/browse/DRILL-327
Re: Review Request 17661: HIVE-6327: A few mathematic functions don't take decimal input
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/#review33459 --- I think this looks fine. ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java https://reviews.apache.org/r/17661/#comment62895 Are all 4 permutations of evaluate() with double/decimal args necessary here? Could we just do eval(double, double) and eval(decimal, decimal)? Though if a double and a decimal arg are passed in, that would result in conversion from double -> decimal -> double in the case of the double arg, which I suppose is less than ideal. - Jason Dere On Feb. 3, 2014, 2:26 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/ --- (Updated Feb. 3, 2014, 2:26 p.m.) Review request for hive. Bugs: HIVE-6327 https://issues.apache.org/jira/browse/HIVE-6327 Repository: hive-git Description --- Added methods for those UDFs such that decimal type can be accepted and evaluated. Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 4e2b50e ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java f7adce4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java fb861a5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java d9ce12d ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java 9c8c836 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java 883b541 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java a90e622 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 1cc70a4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java e3f2026 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java 47e18ee ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java e714c17 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 4eefc8c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java 8ebd727 ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java PRE-CREATION Diff: https://reviews.apache.org/r/17661/diff/ Testing --- New 
unit test cases are added to test all the UDFs regarding decimal input. Thanks, Xuefu Zhang
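Jason's question about the four evaluate() permutations can be sketched in plain Java. This is an illustration only, not Hive's actual UDF code: double and java.math.BigDecimal stand in for Hive's DoubleWritable and HiveDecimalWritable, and the class name is hypothetical. Without the two mixed overloads, a call with one double and one decimal argument has to take the double -> decimal -> double round trip Jason describes:

```java
import java.math.BigDecimal;

// Illustrative sketch only: a two-argument log UDF needs one overload per
// argument-type combination unless callers accept an implicit
// double -> decimal -> double round trip for mixed arguments.
public class LogOverloads {
    // log base 'base' of x
    public static double evaluate(double base, double x) {
        return Math.log(x) / Math.log(base);
    }

    public static double evaluate(BigDecimal base, BigDecimal x) {
        // Decimal args are converted to double for the math itself.
        return evaluate(base.doubleValue(), x.doubleValue());
    }

    // The two mixed overloads Jason asks about. Dropping them would force a
    // mixed call to widen the double to a decimal and convert it right back.
    public static double evaluate(double base, BigDecimal x) {
        return evaluate(base, x.doubleValue());
    }

    public static double evaluate(BigDecimal base, double x) {
        return evaluate(base.doubleValue(), x);
    }

    public static void main(String[] args) {
        System.out.println(evaluate(2.0, new BigDecimal("8")));
    }
}
```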
[jira] [Commented] (HIVE-6327) A few mathematic functions don't take decimal input
[ https://issues.apache.org/jira/browse/HIVE-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889807#comment-13889807 ] Jason Dere commented on HIVE-6327: -- +1 A few mathematic functions don't take decimal input --- Key: HIVE-6327 URL: https://issues.apache.org/jira/browse/HIVE-6327 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0, 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6327.patch A few mathematical functions, such as sin(), cos(), etc., don't take decimal as an argument. {code} hive> show tables; OK Time taken: 0.534 seconds hive> create table test(d decimal(5,2)); OK Time taken: 0.351 seconds hive> select sin(d) from test; FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'd': No matching method for class org.apache.hadoop.hive.ql.udf.UDFSin with (decimal(5,2)). Possible choices: _FUNC_(double) {code} HIVE-6246 covers only the sign() function. The remaining ones include sin, cos, tan, asin, acos, atan, exp, ln, log, log10, log2, radians, and sqrt. These are non-generic UDFs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
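The fix pattern described in the issue (adding a decimal-accepting evaluate() next to the double one, so Hive's per-argument-type method resolution finds a match for decimal columns) can be sketched in plain Java. This is a hypothetical illustration, with java.math.BigDecimal standing in for HiveDecimalWritable; it is not the actual UDFSin code:

```java
import java.math.BigDecimal;

// Hypothetical sketch of the fix pattern: Hive's non-generic UDFs resolve
// evaluate() by argument type, so a class with only evaluate(double) has no
// match for a decimal(5,2) column. Adding a decimal overload that converts
// to double makes the resolution succeed.
public class SinUdfSketch {
    public Double evaluate(Double x) {
        return x == null ? null : Math.sin(x);
    }

    // New overload: accept decimal input by converting to double.
    public Double evaluate(BigDecimal x) {
        return x == null ? null : Math.sin(x.doubleValue());
    }
}
```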
[jira] [Commented] (HIVE-6349) Column name map is broken
[ https://issues.apache.org/jira/browse/HIVE-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889809#comment-13889809 ] Sergey Shelukhin commented on HIVE-6349: Linking the previous jira. HIVE-5817.00-broken.patch is an unfinished (close-to-finished iirc) patch there that makes the columns in the map be by operator, w/lineage tracked so that they can also be retrieved depending on operator. Column name map is broken -- Key: HIVE-6349 URL: https://issues.apache.org/jira/browse/HIVE-6349 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Following query results in exception at run time in vector mode. {code} explain select n_name from supplier_orc s join ( select n_name, n_nationkey from nation_orc n join region_orc r on n.n_regionkey = r.r_regionkey and r.r_name = 'XYZ') n1 on s.s_nationkey = n1.n_nationkey; {code} Here n_name is a string and all other fields are int. The stack trace: {code} java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:260) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.LongColumnVector at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:116) at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.processOp(VectorMapJoinOperator.java:280) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:133) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.flushOutput(VectorMapJoinOperator.java:246) at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.closeOp(VectorMapJoinOperator.java:253) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:574) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:585) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:234) ... 8 more {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17661: HIVE-6327: A few mathematic functions don't take decimal input
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/#review33478 --- Looks clean. Minor comments to consider. ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java https://reviews.apache.org/r/17661/#comment62920 Is it required? ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java https://reviews.apache.org/r/17661/#comment62921 Is it required? ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java https://reviews.apache.org/r/17661/#comment62923 Please remove this. ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java https://reviews.apache.org/r/17661/#comment62925 Should the base 1.0 be changed to base = 1.0 as done in the old code? ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java https://reviews.apache.org/r/17661/#comment62922 Same here. - Mohammad Islam On Feb. 3, 2014, 2:26 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/ --- (Updated Feb. 3, 2014, 2:26 p.m.) Review request for hive. Bugs: HIVE-6327 https://issues.apache.org/jira/browse/HIVE-6327 Repository: hive-git Description --- Added methods for those UDFs such that decimal type can be accepted and evaluated. 
Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 4e2b50e ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java f7adce4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java fb861a5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java d9ce12d ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java 9c8c836 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java 883b541 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java a90e622 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 1cc70a4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java e3f2026 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java 47e18ee ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java e714c17 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 4eefc8c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java 8ebd727 ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java PRE-CREATION Diff: https://reviews.apache.org/r/17661/diff/ Testing --- New unit test cases are added to test all the UDFs regarding decimal input. Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-6327) A few mathematic functions don't take decimal input
[ https://issues.apache.org/jira/browse/HIVE-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889827#comment-13889827 ] Mohammad Kamrul Islam commented on HIVE-6327: - Left a few minor comments in RB. A few mathematic functions don't take decimal input --- Key: HIVE-6327 URL: https://issues.apache.org/jira/browse/HIVE-6327 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0, 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6327.patch A few mathematical functions, such as sin(), cos(), etc., don't take decimal as an argument. {code} hive> show tables; OK Time taken: 0.534 seconds hive> create table test(d decimal(5,2)); OK Time taken: 0.351 seconds hive> select sin(d) from test; FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'd': No matching method for class org.apache.hadoop.hive.ql.udf.UDFSin with (decimal(5,2)). Possible choices: _FUNC_(double) {code} HIVE-6246 covers only the sign() function. The remaining ones include sin, cos, tan, asin, acos, atan, exp, ln, log, log10, log2, radians, and sqrt. These are non-generic UDFs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6353) Update hadoop-2 golden files after HIVE-6267
[ https://issues.apache.org/jira/browse/HIVE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889854#comment-13889854 ] Hive QA commented on HIVE-6353: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626622/HIVE-6353.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4997 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16 {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1168/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1168/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12626622 Update hadoop-2 golden files after HIVE-6267 Key: HIVE-6353 URL: https://issues.apache.org/jira/browse/HIVE-6353 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6353.1.patch HIVE-6267 changed explain with lots of changes to golden files. Separate jira because of number of files changed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17661: HIVE-6327: A few mathematic functions don't take decimal input
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/#review33497 --- ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java https://reviews.apache.org/r/17661/#comment62962 javadoc ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java https://reviews.apache.org/r/17661/#comment62973 Tabs here. Need to convert to spaces. ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java https://reviews.apache.org/r/17661/#comment62974 Tabs. - Swarnim Kulkarni On Feb. 3, 2014, 2:26 p.m., Xuefu Zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17661/ --- (Updated Feb. 3, 2014, 2:26 p.m.) Review request for hive. Bugs: HIVE-6327 https://issues.apache.org/jira/browse/HIVE-6327 Repository: hive-git Description --- Added methods for those UDFs such that decimal type can be accepted and evaluated. Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAcos.java 4e2b50e ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAsin.java f7adce4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAtan.java fb861a5 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCos.java d9ce12d ql/src/java/org/apache/hadoop/hive/ql/udf/UDFExp.java 9c8c836 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLn.java 883b541 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog.java a90e622 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog10.java 1cc70a4 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLog2.java e3f2026 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFMath.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRadians.java 47e18ee ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSin.java e714c17 ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSqrt.java 4eefc8c ql/src/java/org/apache/hadoop/hive/ql/udf/UDFTan.java 8ebd727 ql/src/test/org/apache/hadoop/hive/ql/udf/TestUDFMath.java PRE-CREATION Diff: https://reviews.apache.org/r/17661/diff/ Testing --- New unit test cases are added to test all the UDFs regarding decimal input. 
Thanks, Xuefu Zhang
[jira] [Commented] (HIVE-6325) Enable using multiple concurrent sessions in tez
[ https://issues.apache.org/jira/browse/HIVE-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889923#comment-13889923 ] Gunther Hagleitner commented on HIVE-6325: -- Looks good so far. Left some comments on rb. Enable using multiple concurrent sessions in tez Key: HIVE-6325 URL: https://issues.apache.org/jira/browse/HIVE-6325 Project: Hive Issue Type: Improvement Components: Tez Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-6325.1.patch We would like to enable multiple concurrent sessions in tez via hive server 2. This will enable users to make efficient use of the cluster when it has been partitioned using yarn queues. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17471: HIVE-6325: Enable using multiple concurrent sessions in tez
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17471/#review33496 --- ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionState.java https://reviews.apache.org/r/17471/#comment62960 I believe that file is in the wrong location. Should be in ql/test, right? ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62963 Everything is static in this class. I think it'd be better to have a singleton and non-static members. This way we could have multiple pools if desired. Also should make testing easier. ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62964 BlockingQueue should be able to tell you length, right? ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62965 Can't sessionType denote an actual type? Class<?> is extremely general and there are no comments explaining the use. ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62961 nit: some ws issues ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62966 this should come from a site file, not be hard-coded, right? ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62967 don't think this is needed ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62968 is this the right name? shouldn't that be a yarn var? 
ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62969 comment doesn't match signature ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java https://reviews.apache.org/r/17471/#comment62976 It doesn't look like you're keeping track of this sessionstate here. I think we should. The user should always get/return sessions and we handle the alloc/dealloc. (why can't return close the session for non default for instance?) service/src/java/org/apache/hive/service/server/HiveServer2.java https://reviews.apache.org/r/17471/#comment62977 need to handle exception properly - Gunther Hagleitner On Jan. 28, 2014, 10:34 p.m., Vikram Dixit Kumaraswamy wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17471/ --- (Updated Jan. 28, 2014, 10:34 p.m.) Review request for hive. Bugs: HIVE-6325 https://issues.apache.org/jira/browse/HIVE-6325 Repository: hive-git Description --- Enable using multiple concurrent sessions in tez. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 84ee78f itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 9ad5986 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionState.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java b8552a3 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionStateFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java c6f431c ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java d7edda1 ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezSessionPool.java PRE-CREATION service/src/java/org/apache/hive/service/server/HiveServer2.java fa13783 Diff: https://reviews.apache.org/r/17471/diff/ Testing --- Added multi-threaded junit tests. Thanks, Vikram Dixit Kumaraswamy
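Gunther's suggestion of a singleton with non-static members, backed by a BlockingQueue (whose size() already reports the pool length), can be sketched as follows. All names here are hypothetical; this is not the actual TezSessionPoolManager code, and a real pool would likely block on take() rather than poll():

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative singleton session pool along the lines suggested in the
// review: instance state behind a singleton accessor instead of static
// members, backed by a BlockingQueue. Hypothetical names throughout.
public class SessionPool {
    /** Placeholder standing in for a Tez session object. */
    public static class Session {}

    private static SessionPool instance;

    private final BlockingQueue<Session> sessions;

    private SessionPool(int capacity) {
        this.sessions = new ArrayBlockingQueue<>(capacity);
    }

    /** Singleton accessor; all pool state stays non-static. */
    public static synchronized SessionPool getInstance(int capacity) {
        if (instance == null) {
            instance = new SessionPool(capacity);
        }
        return instance;
    }

    /** Callers get sessions here; returns null if the pool is empty. */
    public Session getSession() {
        return sessions.poll();
    }

    /** Callers return sessions; the pool handles the bookkeeping. */
    public void returnSession(Session s) {
        sessions.offer(s);
    }

    public int size() {
        return sessions.size();
    }
}
```

Keeping the members non-static also means tests can, in principle, construct independent pools, which is part of the review's motivation.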
[jira] [Commented] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889962#comment-13889962 ] Hive QA commented on HIVE-6358: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12626621/HIVE-6358.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 4997 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_pushdown {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1169/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1169/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12626621 filterExpr not printed in explain for tablescan operators (ppd) --- Key: HIVE-6358 URL: https://issues.apache.org/jira/browse/HIVE-6358 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6358.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6298) Add config flag to turn off fetching partition stats
[ https://issues.apache.org/jira/browse/HIVE-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6298: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~sershe], [~prasanth_j] and [~leftylev] for the reviews! Add config flag to turn off fetching partition stats Key: HIVE-6298 URL: https://issues.apache.org/jira/browse/HIVE-6298 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6298.1.patch, HIVE-6298.2.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17632: HDFS ZeroCopy Shims for Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17632/#review33519 --- pom.xml https://reviews.apache.org/r/17632/#comment63001 I don't think we need another version, do we? for the branch we can just temporarily make the 23 version 2.4.0 until that one is released. then we switch everything over. Is there another reason to keep both? shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java https://reviews.apache.org/r/17632/#comment63004 nit: lots of trailing ws. - Gunther Hagleitner On Feb. 1, 2014, 3:05 a.m., Gopal V wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17632/ --- (Updated Feb. 1, 2014, 3:05 a.m.) Review request for hive, Gunther Hagleitner and Owen O'Malley. Bugs: HIVE-6346 https://issues.apache.org/jira/browse/HIVE-6346 Repository: hive-git Description --- Hive Shims for ZeroCopy FS read and Direct ByteBuffer decompression (hadoop/branch-2 changes) Diffs - pom.xml 41f5337 ql/pom.xml 7087a4c shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java ec1f18e shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java d0ff7d4 shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 54c38ee shims/0.23C/pom.xml PRE-CREATION shims/0.23C/src/main/java/org/apache/hadoop/hive/shims/Hadoop23CShims.java PRE-CREATION shims/aggregator/pom.xml 7aa8c4c shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 2b3c6c1 shims/common/src/main/java/org/apache/hadoop/hive/shims/ShimLoader.java bf9c84f shims/pom.xml 9843836 Diff: https://reviews.apache.org/r/17632/diff/ Testing --- TPC-DS queries. Thanks, Gopal V
[jira] [Commented] (HIVE-6346) Add Hadoop-2.4.0 shims to hive-tez
[ https://issues.apache.org/jira/browse/HIVE-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890015#comment-13890015 ] Gunther Hagleitner commented on HIVE-6346: -- [~t3rmin4t0r] Comment on rb. Looks good so far. Biggest question I have is whether we need another shim version for that. Seems we could just upgrade 23 when 2.4 is out. Also: The test failure is unrelated. Tests have been successful. Add Hadoop-2.4.0 shims to hive-tez -- Key: HIVE-6346 URL: https://issues.apache.org/jira/browse/HIVE-6346 Project: Hive Issue Type: Bug Components: Shims Affects Versions: tez-branch Reporter: Gopal V Assignee: Gopal V Priority: Minor Attachments: HIVE-6346.1.patch, HIVE-6346.2.patch The HadoopShims needs a 0.23C shim to add extra HDFS Caching functionality which is not available in the 2.2.0 branch. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6320) Row-based ORC reader with PPD turned on dies on BufferUnderFlowException
[ https://issues.apache.org/jira/browse/HIVE-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890053#comment-13890053 ] Owen O'Malley commented on HIVE-6320: - Actually, you always need the next 2 compression blocks regardless of whether the compression blocks are the same for the two row groups. The rest of the patch looks good. Row-based ORC reader with PPD turned on dies on BufferUnderFlowException - Key: HIVE-6320 URL: https://issues.apache.org/jira/browse/HIVE-6320 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6320.1.patch ORC data reader crashes out on a BufferUnderflowException, while trying to read data row-by-row with the predicate push-down enabled on current trunk. {code} Caused by: java.nio.BufferUnderflowException at java.nio.Buffer.nextGetIndex(Buffer.java:472) at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:117) at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:207) at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:240) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:53) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:288) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:510) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1581) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2707) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:125) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:101) {code} The 
query run is {code} set hive.vectorized.execution.enabled=false; set hive.optimize.index.filter=true; insert overwrite directory '/tmp/foo' select * from lineitem where l_orderkey is not null; {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Status: Patch Available (was: Open) Make ORC InputFormat/OutputFormat usable outside Hive - Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5728-1.patch, HIVE-5728-10.patch, HIVE-5728-2.patch, HIVE-5728-3.patch, HIVE-5728-4.patch, HIVE-5728-5.patch, HIVE-5728-6.patch, HIVE-5728-7.patch, HIVE-5728-8.patch, HIVE-5728-9.patch, HIVE-5728.10.patch, HIVE-5728.11.patch, HIVE-5728.12.patch ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues that need to be solved: 1. Several classes are not public, e.g., OrcStruct 2. There is no InputFormat/OutputFormat for the new API (some tools such as Pig need the new API) 3. There is no way to push WriteOption to OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Status: Open (was: Patch Available) Make ORC InputFormat/OutputFormat usable outside Hive - Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5728-1.patch, HIVE-5728-10.patch, HIVE-5728-2.patch, HIVE-5728-3.patch, HIVE-5728-4.patch, HIVE-5728-5.patch, HIVE-5728-6.patch, HIVE-5728-7.patch, HIVE-5728-8.patch, HIVE-5728-9.patch, HIVE-5728.10.patch, HIVE-5728.11.patch, HIVE-5728.12.patch ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues that need to be solved: 1. Several classes are not public, e.g., OrcStruct 2. There is no InputFormat/OutputFormat for the new API (some tools such as Pig need the new API) 3. There is no way to push WriteOption to OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 17678: HIVE-4996 unbalanced calls to openTransaction/commitTransaction
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17678/ --- Review request for hive. Bugs: HIVE-4996 https://issues.apache.org/jira/browse/HIVE-4996 Repository: hive-git Description --- Background: First issue: There are two levels of retrying in case of transient JDO/CP/DB errors: the RetryingHMSHandler and RetryingRawStore. But the RetryingRawStore is flawed in the case of a nested transaction of a larger RetryingHMSHandler transaction (which is the majority of cases). Consider the following sample RetryingHMSHandler call, where variable ms is a RetryingRawStore. HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, then openTx = 1 upon intermediate commit ms.createTable() //openTx = 2, then openTx = 1 upon intermediate commit ms.commit(); //openTx = 0 If there is any transient error in any intermediate operation and RetryingRawStore tries again, there will always be an unbalanced transaction, like: HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, transient error, then openTx=0 upon rollback. After a retry, openTx=1, then openTx=0 upon successful intermediate commit ms.createTable() //openTx = 1, then openTx = 0 upon intermediate commit ms.commit(); //unbalanced transaction! Retrying RawStore operations doesn't make sense in nested transaction cases, as the first part of the transaction is rolled back upon a transient error, and retry logic only saves the second half, which may not make sense without the first. It makes much more sense to retry the entire transaction from the top, which is what RetryingHMSHandler would already be doing if the RetryingRawStore did not interfere. Second issue: The recent upgrade to BoneCP 0.8.0 seemed to cause more transient errors that triggered this problem. 
In these cases, in-use connections are finalized, as follows: WARN bonecp.ConnectionPartition (ConnectionPartition.java:finalizeReferent(162)) - BoneCP detected an unclosed connection and will now attempt to close it for you. You should be closing this connection in your application - enable connectionWatch for additional debugging assistance or set disableConnectionTracking to true to disable this feature entirely. The retry of this operation seems to get a good connection and allow the operation to proceed. Reading forums, it seems some others have hit this issue after the upgrade, and switching back to 0.7.1 in our environment eliminated this issue for us. But that reversion is outside the scope of this JIRA, and would be better done in either the original or a follow-up JIRA that upgraded the version. This fix targets the first issue only, as it is needed anyway for any sort of transient error, not just the BoneCP one that I observed. Changes: 1. Removes RetryingRawStore in favor of RetryingHMSHandler, and removes the configuration property of the former. 2. Addresses the resultant holes in retry, in particular in the RetryingHMSHandler's construction of RawStore (before, RetryingRawStore would have retried failures like in creating the defaultDB). It didn't seem necessary to increase the default RetryingHMSHandler retries to 2 to compensate, but I am open to that as well. 3. Contributes the instrumentation code that helped me to find the issue. This includes printing missing stacks of exceptions that triggered retry, and adding debug-level tracing of ObjectStore calls to give better correlation with other errors/warnings in the hive log. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22bb22d itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java d7854fe itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRawStoreTxn.java 0b87077 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 2d8e483 metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 0715e22 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java fb70589 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java dcf97ec Diff: https://reviews.apache.org/r/17678/diff/ Testing --- Thanks, Szehon Ho
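The nesting-counter behavior walked through in the description can be modeled with a stripped-down sketch. This is an illustration of the failure mode only, not the actual ObjectStore code; everything beyond the openTransaction/commitTransaction names and the quoted error text is hypothetical:

```java
// Stripped-down model of the openTransaction/commitTransaction nesting
// counter described above. A retry inside a nested transaction rolls the
// counter back to 0, so the outer commit() later finds no open transaction
// -- the "unbalanced calls" error. Illustrative names throughout.
public class TxnCounter {
    private int openTxns = 0;

    public void openTransaction() {
        openTxns++;
    }

    public void commitTransaction() {
        if (openTxns <= 0) {
            throw new IllegalStateException(
                "commitTransaction was called but openTransactionCalls = 0. "
                + "This probably indicates that there are unbalanced calls "
                + "to openTransaction/commitTransaction");
        }
        openTxns--;
    }

    public void rollbackTransaction() {
        // Rollback abandons the whole transaction stack, not just one level.
        openTxns = 0;
    }

    public int depth() {
        return openTxns;
    }
}
```

Replaying the second trace from the description: outer open (depth 1), nested open (2), transient error and rollback (0), retried nested open/commit (1 then 0), and finally the outer commit has no open transaction left to commit.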
[jira] [Commented] (HIVE-6002) Create new ORC write version to address the changes to RLEv2
[ https://issues.apache.org/jira/browse/HIVE-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890067#comment-13890067 ] Owen O'Malley commented on HIVE-6002: - Rather than introduce a new version, let's add some metadata: add a key named ORC.FIXED.JIRA whose value is a comma-separated list of the fixed JIRAs. So in this case, HIVE-5994. Create new ORC write version to address the changes to RLEv2 Key: HIVE-6002 URL: https://issues.apache.org/jira/browse/HIVE-6002 Project: Hive Issue Type: Bug Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6002.1.patch, HIVE-6002.2.patch HIVE-5994 encodes large negative big integers wrongly. This results in loss of original data that is being written using ORC write version 0.12. Bump up the version number to differentiate the bad writes by 0.12 from the good writes by this new version (0.12.1?). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
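The metadata scheme Owen proposes above can be sketched as a small helper. This is a hypothetical illustration, not real ORC API: it only models parsing the comma-separated value that would be stored under the ORC.FIXED.JIRA key and checking whether a given fix is listed.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch of the proposed fixed-JIRAs metadata check. The key name comes
 * from the comment above; the class and methods are hypothetical helpers,
 * not part of the ORC reader/writer API.
 */
public class OrcFixedJiras {
    static final String KEY = "ORC.FIXED.JIRA";

    /** Parse the comma-separated list stored under ORC.FIXED.JIRA. */
    static Set<String> parseFixedJiras(String metadataValue) {
        Set<String> fixed = new HashSet<>();
        if (metadataValue != null && !metadataValue.isEmpty()) {
            for (String jira : metadataValue.split(",")) {
                fixed.add(jira.trim());
            }
        }
        return fixed;
    }

    /** A reader would trust an encoding only if its fix is listed. */
    static boolean hasFix(String metadataValue, String jira) {
        return parseFixedJiras(metadataValue).contains(jira);
    }
}
```

A file written after the RLEv2 fix would carry "HIVE-5994" in the list; older files would lack the key entirely, letting readers fall back to the workaround path.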
[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-4996: Attachment: HIVE-4996.patch Attaching a fix. unbalanced calls to openTransaction/commitTransaction - Key: HIVE-4996 URL: https://issues.apache.org/jira/browse/HIVE-4996 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0, 0.12.0 Environment: hiveserver1 Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: wangfeng Assignee: Szehon Ho Priority: Critical Labels: hive, metastore Attachments: HIVE-4996.patch, hive-4996.path Original Estimate: 504h Remaining Estimate: 504h when we used hiveserver1 based on hive-0.10.0, we found the Exception thrown. It was: FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask help -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6247) select count(distinct) should be MRR in Tez
[ https://issues.apache.org/jira/browse/HIVE-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890075#comment-13890075 ] Gunther Hagleitner commented on HIVE-6247: -- Dug into this a little bit. I think the idea makes good sense, but the description of MR is not correct. At least I wasn't able to make MR avoid a single reducer for the query cited. You can rewrite the query using a subquery to get the result you want, though. There are two more flags to consider (when rewriting): a) hive.optimize.reducededuplication.min.reducer: If this is set to 1 you will have a single reducer regardless of the rewrite. b) hive.fetch.task.aggr: If this one is true the final count will happen on the client. This is more important in MR than Tez (because it would start a new job in MR; in Tez it's just another stage in the DAG). select count(distinct) should be MRR in Tez --- Key: HIVE-6247 URL: https://issues.apache.org/jira/browse/HIVE-6247 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Gunther Hagleitner The MR query plan for select count(distinct) fires off multiple reducers, with a local work task to perform final aggregation. The Tez version fires off exactly 1 reducer for the entire data-set, which chokes and dies/slows down massively. To reproduce on a TPC-DS database (meaningless query) {code} select count(distinct ss_net_profit) from store_sales ss join store s on ss.ss_store_sk = s.s_store_sk; {code} This spins up Map 1, Map 2 (for the dim table + fact table) Reducer 1 which is always 0/1. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
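The subquery rewrite mentioned in the comment above could look something like this (a hedged sketch, not taken from the thread): pre-aggregating the distinct values in an inner query splits the work into a distinct stage that can use many reducers, followed by a cheap final count.

```sql
-- Hypothetical rewrite of the cited TPC-DS query: the inner query
-- deduplicates ss_net_profit (parallelizable), the outer count runs
-- over already-distinct rows.
select count(1)
from (
  select distinct ss_net_profit
  from store_sales ss
  join store s on ss.ss_store_sk = s.s_store_sk
) t;
```

Whether the rewrite actually avoids the single reducer also depends on the two flags above: hive.optimize.reducededuplication.min.reducer=1 collapses it back to one reducer, and hive.fetch.task.aggr moves the final count to the client.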
Re: Review Request 17678: HIVE-4996 unbalanced calls to openTransaction/commitTransaction
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17678/ --- (Updated Feb. 3, 2014, 11:10 p.m.) Review request for hive. Bugs: HIVE-4996 https://issues.apache.org/jira/browse/HIVE-4996 Repository: hive-git Description (updated) --- Background: First issue: There are two levels of retrying in case of transient JDO/CP/DB errors: RetryingHMSHandler and RetryingRawStore. But RetryingRawStore is flawed in the case of a nested transaction within a larger RetryingHMSHandler transaction (which is the majority of cases). Consider the following sample RetryingHMSHandler call, where the variable ms is a RetryingRawStore. HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, then openTx = 1 upon intermediate commit ms.createTable() //openTx = 2, then openTx = 1 upon intermediate commit ms.commit(); //openTx = 0 If there is any transient error in any intermediate operation and RetryingRawStore tries again, there will always be an unbalanced transaction, like: HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, transient error, then openTx=0 upon rollback. After a retry, openTx=1, then openTx=0 upon successful intermediate commit ms.createTable() //openTx = 1, then openTx = 0 upon intermediate commit ms.commit(); //unbalanced transaction! Retrying RawStore operations doesn't make sense in nested-transaction cases, as the first part of the transaction is rolled back upon a transient error, and the retry logic only saves the second half, which may not make sense without the first. It makes much more sense to retry the entire transaction from the top, which is what RetryingHMSHandler would already be doing if RetryingRawStore did not interfere. Second issue: The recent upgrade to BoneCP 0.8.0 seemed to cause more transient errors that triggered this problem. 
In these cases, in-use connections are finalized, as follows: WARN bonecp.ConnectionPartition (ConnectionPartition.java:finalizeReferent(162)) - BoneCP detected an unclosed connection and will now attempt to close it for you. You should be closing this connection in your application - enable connectionWatch for additional debugging assistance or set disableConnectionTracking to true to disable this feature entirely. The retry of this operation seems to get a good connection and allow the operation to proceed. Reading forums, it seems some others have hit this issue after the upgrade, and switching back to 0.7.1 in our environment eliminated this issue for us. But that reversion is outside the scope of this JIRA, and would be better done in either the original or a follow-up JIRA that upgraded the version. This fix targets the first issue only, as it is needed anyway for any sort of transient error, not just the BoneCP one that I observed. Changes: 1. Removes RetryingRawStore in favor of RetryingHMSHandler, and removes the configuration property of the former. 2. Addresses the resultant holes in retry, in particular in the RetryingHMSHandler's construction of RawStore (before, RetryingRawStore would have retried failures such as creating the default DB). It didn't seem necessary to increase the default RetryingHMSHandler retries to 2 to compensate, but I am open to that as well. 3. Contributes the instrumentation code that helped me find the issue. This includes printing previously-missing stacks of exceptions that triggered retry (including 'unbalanced calls' errors) to the hive log, and adding debug-level tracing of ObjectStore calls to give better correlation with other errors/warnings in the hive log. 
Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22bb22d itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java d7854fe itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRawStoreTxn.java 0b87077 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 2d8e483 metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 0715e22 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java fb70589 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java dcf97ec Diff: https://reviews.apache.org/r/17678/diff/ Testing --- Thanks, Szehon Ho
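The openTx trace in the description above can be modeled with a tiny counter. This is a hypothetical sketch, not the real ObjectStore code: it only captures the semantics the description relies on (open increments, commit decrements and fails at zero, rollback resets to zero), so the mid-transaction retry can be replayed step by step.

```java
/**
 * Minimal model of the openTransactionCalls counter described above.
 * Hypothetical class: real behavior lives in ObjectStore/RawStore.
 */
public class TxnCounter {
    private int openTransactionCalls = 0;

    /** ms.open() / nested opens: increment the nesting depth. */
    int openTransaction() { return ++openTransactionCalls; }

    /** ms.commit(): decrement, or fail if the counter is already zero. */
    int commitTransaction() {
        if (openTransactionCalls <= 0) {
            throw new IllegalStateException(
                "commitTransaction was called but openTransactionCalls = 0");
        }
        return --openTransactionCalls;
    }

    /** A transient-error rollback resets the counter to zero. */
    int rollbackTransaction() { openTransactionCalls = 0; return 0; }
}
```

Replaying the failing trace: outer open (1), nested open (2), transient error rolls back to 0; the RawStore-level retry re-runs only the nested operation, open (1), commit (0); the outer commit then fires with the counter already at zero and throws the 'unbalanced calls' error, which is why retrying at the RawStore level inside a larger transaction cannot work.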
[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-4996: Status: Open (was: Patch Available) unbalanced calls to openTransaction/commitTransaction - Key: HIVE-4996 URL: https://issues.apache.org/jira/browse/HIVE-4996 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.12.0, 0.11.0, 0.10.0 Environment: hiveserver1 Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: wangfeng Assignee: Szehon Ho Priority: Critical Labels: hive, metastore Attachments: HIVE-4996.patch, hive-4996.path Original Estimate: 504h Remaining Estimate: 504h when we used hiveserver1 based on hive-0.10.0, we found the Exception thrown. It was: FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask help -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6320) Row-based ORC reader with PPD turned on dies on BufferUnderFlowException
[ https://issues.apache.org/jira/browse/HIVE-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6320: - Attachment: HIVE-6320.2.patch Addressed [~owen.omalley] and [~gopalv]'s code review comments. Row-based ORC reader with PPD turned on dies on BufferUnderFlowException - Key: HIVE-6320 URL: https://issues.apache.org/jira/browse/HIVE-6320 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Gopal V Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6320.1.patch, HIVE-6320.2.patch ORC data reader crashes out on a BufferUnderflowException, while trying to read data row-by-row with the predicate push-down enabled on current trunk. {code} Caused by: java.nio.BufferUnderflowException at java.nio.Buffer.nextGetIndex(Buffer.java:472) at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:117) at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:207) at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:240) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:53) at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:288) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:510) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1581) at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2707) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:125) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:101) {code} The query run is {code} set hive.vectorized.execution.enabled=false; set hive.optimize.index.filter=true; insert 
overwrite directory '/tmp/foo' select * from lineitem where l_orderkey is not null; {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17678: HIVE-4996 unbalanced calls to openTransaction/commitTransaction
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17678/ --- (Updated Feb. 3, 2014, 11:24 p.m.) Review request for hive. Changes --- Fixing redundant logging for NoSuchObjectException. Bugs: HIVE-4996 https://issues.apache.org/jira/browse/HIVE-4996 Repository: hive-git Description --- Background: First issue: There are two levels of retrying in case of transient JDO/CP/DB errors: RetryingHMSHandler and RetryingRawStore. But RetryingRawStore is flawed in the case of a nested transaction within a larger RetryingHMSHandler transaction (which is the majority of cases). Consider the following sample RetryingHMSHandler call, where the variable ms is a RetryingRawStore. HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, then openTx = 1 upon intermediate commit ms.createTable() //openTx = 2, then openTx = 1 upon intermediate commit ms.commit(); //openTx = 0 If there is any transient error in any intermediate operation and RetryingRawStore tries again, there will always be an unbalanced transaction, like: HMSHandler.createTable() ms.open() //openTx = 1 ms.getTable() // openTx = 2, transient error, then openTx=0 upon rollback. After a retry, openTx=1, then openTx=0 upon successful intermediate commit ms.createTable() //openTx = 1, then openTx = 0 upon intermediate commit ms.commit(); //unbalanced transaction! Retrying RawStore operations doesn't make sense in nested-transaction cases, as the first part of the transaction is rolled back upon a transient error, and the retry logic only saves the second half, which may not make sense without the first. It makes much more sense to retry the entire transaction from the top, which is what RetryingHMSHandler would already be doing if RetryingRawStore did not interfere. Second issue: The recent upgrade to BoneCP 0.8.0 seemed to cause more transient errors that triggered this problem. 
In these cases, in-use connections are finalized, as follows: WARN bonecp.ConnectionPartition (ConnectionPartition.java:finalizeReferent(162)) - BoneCP detected an unclosed connection and will now attempt to close it for you. You should be closing this connection in your application - enable connectionWatch for additional debugging assistance or set disableConnectionTracking to true to disable this feature entirely. The retry of this operation seems to get a good connection and allow the operation to proceed. Reading forums, it seems some others have hit this issue after the upgrade, and switching back to 0.7.1 in our environment eliminated this issue for us. But that reversion is outside the scope of this JIRA, and would be better done in either the original or a follow-up JIRA that upgraded the version. This fix targets the first issue only, as it is needed anyway for any sort of transient error, not just the BoneCP one that I observed. Changes: 1. Removes RetryingRawStore in favor of RetryingHMSHandler, and removes the configuration property of the former. 2. Addresses the resultant holes in retry, in particular in the RetryingHMSHandler's construction of RawStore (before, RetryingRawStore would have retried failures such as creating the default DB). It didn't seem necessary to increase the default RetryingHMSHandler retries to 2 to compensate, but I am open to that as well. 3. Contributes the instrumentation code that helped me find the issue. This includes printing previously-missing stacks of exceptions that triggered retry (including 'unbalanced calls' errors) to the hive log, and adding debug-level tracing of ObjectStore calls to give better correlation with other errors/warnings in the hive log. 
Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 22bb22d itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java d7854fe itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestRawStoreTxn.java 0b87077 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 2d8e483 metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 0715e22 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java fb70589 metastore/src/java/org/apache/hadoop/hive/metastore/RetryingRawStore.java dcf97ec Diff: https://reviews.apache.org/r/17678/diff/ Testing --- Thanks, Szehon Ho
[jira] [Updated] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6358: - Status: Open (was: Patch Available) filterExpr not printed in explain for tablescan operators (ppd) --- Key: HIVE-6358 URL: https://issues.apache.org/jira/browse/HIVE-6358 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6358.1.patch, HIVE-6358.2.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6353) Update hadoop-2 golden files after HIVE-6267
[ https://issues.apache.org/jira/browse/HIVE-6353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890104#comment-13890104 ] Gunther Hagleitner commented on HIVE-6353: -- Failure is unrelated. Update hadoop-2 golden files after HIVE-6267 Key: HIVE-6353 URL: https://issues.apache.org/jira/browse/HIVE-6353 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6353.1.patch HIVE-6267 changed explain with lots of changes to golden files. Separate jira because of number of files changed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-4996: Status: Patch Available (was: Open) unbalanced calls to openTransaction/commitTransaction - Key: HIVE-4996 URL: https://issues.apache.org/jira/browse/HIVE-4996 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.12.0, 0.11.0, 0.10.0 Environment: hiveserver1 Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: wangfeng Assignee: Szehon Ho Priority: Critical Labels: hive, metastore Attachments: HIVE-4996.1.patch, HIVE-4996.patch, hive-4996.path Original Estimate: 504h Remaining Estimate: 504h when we used hiveserver1 based on hive-0.10.0, we found the Exception thrown. It was: FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask help -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6358) filterExpr not printed in explain for tablescan operators (ppd)
[ https://issues.apache.org/jira/browse/HIVE-6358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6358: - Attachment: HIVE-6358.2.patch .2 fixes test from precommit (missed one golden file) filterExpr not printed in explain for tablescan operators (ppd) --- Key: HIVE-6358 URL: https://issues.apache.org/jira/browse/HIVE-6358 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6358.1.patch, HIVE-6358.2.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-4996) unbalanced calls to openTransaction/commitTransaction
[ https://issues.apache.org/jira/browse/HIVE-4996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-4996: Attachment: HIVE-4996.1.patch unbalanced calls to openTransaction/commitTransaction - Key: HIVE-4996 URL: https://issues.apache.org/jira/browse/HIVE-4996 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0, 0.12.0 Environment: hiveserver1 Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode) Reporter: wangfeng Assignee: Szehon Ho Priority: Critical Labels: hive, metastore Attachments: HIVE-4996.1.patch, HIVE-4996.patch, hive-4996.path Original Estimate: 504h Remaining Estimate: 504h when we used hiveserver1 based on hive-0.10.0, we found the Exception thrown. It was: FAILED: Error in metadata: MetaException(message:java.lang.RuntimeException: commitTransaction was called but openTransactionCalls = 0. This probably indicates that there are unbalanced calls to openTransaction/commitTransaction) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask help -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6354) Some index test golden files produce non-deterministic stats in explain
[ https://issues.apache.org/jira/browse/HIVE-6354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890105#comment-13890105 ] Gunther Hagleitner commented on HIVE-6354: -- Failed tests are flaky - unrelated to this check-in. Some index test golden files produce non-deterministic stats in explain --- Key: HIVE-6354 URL: https://issues.apache.org/jira/browse/HIVE-6354 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-6354.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6255) Change Hive to not pass MRSplitsProto in MRHelpers.createMRInputPayloadWithGrouping
[ https://issues.apache.org/jira/browse/HIVE-6255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6255: - Assignee: Thejas M Nair Change Hive to not pass MRSplitsProto in MRHelpers.createMRInputPayloadWithGrouping --- Key: HIVE-6255 URL: https://issues.apache.org/jira/browse/HIVE-6255 Project: Hive Issue Type: Task Reporter: Bikas Saha Assignee: Thejas M Nair Attachments: HIVE-6255.1.patch TEZ-650 removed this superfluous parameter since splits don't need to be passed to the AM when doing split calculation on the AM. This is needed after Hive builds against TEZ 0.3. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6360) Hadoop 2.3 + Tez 0.3
Gunther Hagleitner created HIVE-6360: Summary: Hadoop 2.3 + Tez 0.3 Key: HIVE-6360 URL: https://issues.apache.org/jira/browse/HIVE-6360 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner There are some things pending that rely on hadoop 2.3 or tez 0.3. These are not released yet, but will be soon. I'm proposing to collect these in the tez branch and do a merge back once these components have been released at that version. The things depending on 0.3 or hadoop 2.3 are: - Zero Copy read for ORC - Unions in Tez - Tez on secure clusters - Changes to DagUtils to reflect tez 0.2 - 0.3 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6360) Hadoop 2.3 + Tez 0.3
[ https://issues.apache.org/jira/browse/HIVE-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-6360: - Description: There are some things pending that rely on hadoop 2.3 or tez 0.3. These are not released yet, but will be soon. I'm proposing to collect these in the tez branch and do a merge back once these components have been released at that version. The things depending on 0.3 or hadoop 2.3 are: - Zero Copy read for ORC - Unions in Tez - Tez on secure clusters - Changes to DagUtils to reflect tez 0.2 - 0.3 - Prewarm containers was: There are some things pending that rely on hadoop 2.3 or tez 0.3. These are not released yet, but will be soon. I'm proposing to collect these in the tez branch and do a merge back once these components have been released at that version. The things depending on 0.3 or hadoop 2.3 are: - Zero Copy read for ORC - Unions in Tez - Tez on secure clusters - Changes to DagUtils to reflect tez 0.2 - 0.3 Hadoop 2.3 + Tez 0.3 Key: HIVE-6360 URL: https://issues.apache.org/jira/browse/HIVE-6360 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner There are some things pending that rely on hadoop 2.3 or tez 0.3. These are not released yet, but will be soon. I'm proposing to collect these in the tez branch and do a merge back once these components have been released at that version. The things depending on 0.3 or hadoop 2.3 are: - Zero Copy read for ORC - Unions in Tez - Tez on secure clusters - Changes to DagUtils to reflect tez 0.2 - 0.3 - Prewarm containers -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17061: HIVE-5783 - Native Parquet Support in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17061/#review33473 --- ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java https://reviews.apache.org/r/17061/#comment62912 This doesn't seem to be used anywhere. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java https://reviews.apache.org/r/17061/#comment62911 This doesn't seem to need to be public. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java https://reviews.apache.org/r/17061/#comment62935 If doubt exists, it's probably better to address it. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62937 Please remove if not used. Make it private otherwise. The same applies to all code. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62943 It's better not to hard-code those string constants here. They are probably defined somewhere. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62951 I don't understand what the conversion is about: string -> path -> URI -> path -> string? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62953 Either put comments or a log msg. Commented-out code isn't a comment. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62954 Same as above. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java https://reviews.apache.org/r/17061/#comment62957 Could you put more comments about what sort of refactoring? Can we log a JIRA for it also? ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java https://reviews.apache.org/r/17061/#comment62991 The if ... else ... here doesn't seem terribly different. 
Please refactor the code. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableGroupConverter.java https://reviews.apache.org/r/17061/#comment62996 It seems that for (int i = 0; i < selectFieldCount; i++) is better for this loop. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableGroupConverter.java https://reviews.apache.org/r/17061/#comment62997 Please remove the extra blank lines, which are not necessary. The same applies to all code changes. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java https://reviews.apache.org/r/17061/#comment62999 Decimal treated as double? I don't think that's acceptable. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java https://reviews.apache.org/r/17061/#comment63000 Please change to public static ... ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java https://reviews.apache.org/r/17061/#comment63003 Please change to public static, across all code changes. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java https://reviews.apache.org/r/17061/#comment63007 Please wrap long lines. Applicable to all code changes. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63011 1. private static? 2. use a string constant instead of a literal ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63021 These string constants should be defined globally and referred to here (and anywhere else). ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63020 If it's not supposed to be called, it's better to throw an exception. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63022 Same as above. 
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java https://reviews.apache.org/r/17061/#comment63023 Line too long. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java https://reviews.apache.org/r/17061/#comment63024 Long lines. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java https://reviews.apache.org/r/17061/#comment63030 I don't understand why Hive would inspect an inspected result. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java https://reviews.apache.org/r/17061/#comment63029 I don't understand why Hive would inspect an inspected result.
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890138#comment-13890138 ] Xuefu Zhang commented on HIVE-5783: --- Some comments are posted on RB. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 17061: HIVE-5783 - Native Parquet Support in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17061/#review33533 --- Thanks for the comments. Justin, if you see this, I can address these tomorrow. - Brock Noland On Jan. 30, 2014, 2:48 p.m., Brock Noland wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17061/ --- (Updated Jan. 30, 2014, 2:48 p.m.) Review request for hive. Bugs: HIVE-5783 https://issues.apache.org/jira/browse/HIVE-5783 Repository: hive-git Description --- Adds native Parquet support to Hive Diffs - data/files/parquet_create.txt PRE-CREATION data/files/parquet_partitioned.txt PRE-CREATION pom.xml 41f5337 ql/pom.xml 7087a4c ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ProjectionPusher.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ArrayWritableGroupConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableGroupConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableRecordConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveGroupConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java PRE-CREATION 
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/DeepParquetHiveMapInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveArrayInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/StandardParquetHiveMapInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetByteInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetPrimitiveInspectorFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetShortInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetStringInspector.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/writable/BigDecimalWritable.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/writable/BinaryWritable.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 13d0a56 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g f83c15d ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 010e04f ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 538b2b0 ql/src/java/parquet/hive/DeprecatedParquetInputFormat.java PRE-CREATION ql/src/java/parquet/hive/DeprecatedParquetOutputFormat.java PRE-CREATION ql/src/java/parquet/hive/MapredParquetInputFormat.java PRE-CREATION ql/src/java/parquet/hive/MapredParquetOutputFormat.java PRE-CREATION ql/src/java/parquet/hive/serde/ParquetHiveSerDe.java PRE-CREATION 
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestHiveSchemaConverter.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetInputFormat.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestAbstractParquetMapInspector.java PRE-CREATION
[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive
[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890146#comment-13890146 ] Brock Noland commented on HIVE-5783: Thanks Xuefu. Justin, I can address these items tomorrow and have an updated patch. Native Parquet Support in Hive -- Key: HIVE-5783 URL: https://issues.apache.org/jira/browse/HIVE-5783 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Justin Coffey Assignee: Justin Coffey Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch, HIVE-5783.patch Problem Statement: Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive. About Parquet: Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration. Changes Details: Parquet was built with dependency management in mind and therefore only a single Parquet jar will be added as a dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4144) Add select database() command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890151#comment-13890151 ] Navis commented on HIVE-4144: - Cannot reproduce. Seemed not related. Add select database() command to show the current database Key: HIVE-4144 URL: https://issues.apache.org/jira/browse/HIVE-4144 Project: Hive Issue Type: Bug Components: SQL Reporter: Mark Grover Assignee: Navis Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch A recent hive-user mailing list conversation asked about having a command to show the current database. http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E MySQL seems to have a command to do so: {code} select database(); {code} http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database We should look into having something similar in Hive. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890156#comment-13890156 ] Navis commented on HIVE-6356: - Right. I'd forgotten there are two versions of TableMapReduceUtil. HIVE-3603 changed the import of TableMapReduceUtil to the mapred package, which caused this problem. I'll fix this shortly. Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase are not added to tmpjars, which is caused by the change of a method signature (TableMapReduceUtil.addDependencyJars).
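The diagnosis above — the wrong TableMapReduceUtil variant being used, so dependency jars never reach tmpjars — can be illustrated with a small sketch. This is not the HBase or Hadoop API; the class and method below are hypothetical stand-ins showing only how comma-separated jar paths accumulate under the job's "tmpjars" configuration key:

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// Illustrative sketch (not the real HBase API): the mapreduce-package
// TableMapReduceUtil.addDependencyJars appends jar paths to the job's
// "tmpjars" configuration key; if a variant with a different signature is
// picked up instead, this step is silently skipped and the jars never ship.
public class TmpJarsSketch {
    static final String TMPJARS = "tmpjars";

    // Append jars to the comma-separated "tmpjars" value, skipping duplicates.
    static String addToTmpJars(String current, String... jars) {
        LinkedHashSet<String> set = new LinkedHashSet<>();
        if (current != null && !current.isEmpty()) {
            for (String j : current.split(",")) set.add(j);
        }
        for (String j : jars) set.add(j);
        return String.join(",", set);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TMPJARS, addToTmpJars(conf.get(TMPJARS),
                "file:/lib/hbase-client.jar", "file:/lib/zookeeper.jar"));
        System.out.println(conf.get(TMPJARS));
    }
}
```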
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6356: Status: Open (was: Patch Available) Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6356: Attachment: HIVE-6356.2.patch.txt Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt, HIVE-6356.2.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6356: Status: Patch Available (was: Open) Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6356.1.patch.txt, HIVE-6356.2.patch.txt Dependent jars for hbase is not added to tmpjars, which is caused by the change of method signature(TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890173#comment-13890173 ] Navis commented on HIVE-6204: - I believe security-related metrics always involve timestamps (there are two time metrics in a role, and the first patch is missing 'create time'; that will be fixed in the next patch). So we should find a way to mask the time parts without introducing a test conf var. Any ideas? The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt
{noformat}
hive> show grant role role1 on all;
OK
database	default
table	src
principalName	role1
principalType	ROLE
privilege	Create
grantTime	Wed Dec 18 14:17:56 KST 2013
grantor	navis
database	default
table	srcpart
principalName	role1
principalType	ROLE
privilege	Update
grantTime	Wed Dec 18 14:18:28 KST 2013
grantor	navis
{noformat}
This should be something like below, especially for JDBC clients.
{noformat}
hive> show grant role role1 on all;
OK
default	src	role1	ROLE	Create	false	1387343876000	navis
default	srcpart	role1	ROLE	Update	false	1387343908000	navis
{noformat}
[jira] [Commented] (HIVE-6329) Support column level encryption/decryption
[ https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890182#comment-13890182 ] Navis commented on HIVE-6329: - Failures seem unrelated to this. Support column level encryption/decryption -- Key: HIVE-6329 URL: https://issues.apache.org/jira/browse/HIVE-6329 Project: Hive Issue Type: New Feature Components: Security, Serializers/Deserializers Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6329.1.patch.txt, HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt We have been receiving some requirements on encryption recently, but Hive does not support it. Before the full implementation via HIVE-5207, this might be useful for some cases.
{noformat}
hive> create table encode_test(id int, name STRING, phone STRING, address STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ('column.encode.indices'='2,3', 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') STORED AS TEXTFILE;
OK
Time taken: 0.584 seconds
hive> insert into table encode_test select 100,'navis','010-0000-0000','Seoul, Seocho' from src tablesample (1 rows);
..
OK
Time taken: 5.121 seconds
hive> select * from encode_test;
OK
100	navis	MDEwLTAwMDAtMDAwMA==	U2VvdWwsIFNlb2Nobw==
Time taken: 0.078 seconds, Fetched: 1 row(s)
hive>
{noformat}
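The encode_test example above can be reproduced outside Hive with a minimal sketch of what a write-only Base64 column codec does: encode only the columns listed in the configured indices and pass the rest through. The class and method names here are hypothetical (the real logic lives in the SerDe layer, configured via 'column.encode.indices' and 'column.encode.classname'); only `java.util.Base64` from the JDK is assumed:

```java
import java.util.Base64;

// Sketch of a write-only Base64 column codec: columns at the configured
// indices are Base64-encoded on write, all other columns are untouched.
public class ColumnEncodeSketch {
    // Return a copy of the row with the given column indices Base64-encoded.
    static String[] encodeColumns(String[] row, int... indices) {
        String[] out = row.clone();
        for (int i : indices) {
            out[i] = Base64.getEncoder().encodeToString(row[i].getBytes());
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample row mirroring the JIRA example (id, name, phone, address).
        String[] row = {"100", "navis", "010-0000-0000", "Seoul, Seocho"};
        for (String col : encodeColumns(row, 2, 3)) {
            System.out.println(col);
        }
    }
}
```

Encoding indices 2 and 3 of the sample row yields exactly the Base64 values shown in the select output above.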
[jira] [Commented] (HIVE-6267) Explain explain
[ https://issues.apache.org/jira/browse/HIVE-6267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890199#comment-13890199 ] Navis commented on HIVE-6267: - [~hagleitn] Generally looks good to me. Simple and concise but things like Position of Big Table should not be removed, imho. I think I was a little pissed off yesterday, which was Sunday for you but Monday for me. I apologize for the rudeness. Explain explain --- Key: HIVE-6267 URL: https://issues.apache.org/jira/browse/HIVE-6267 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.13.0 Attachments: HIVE-6267.1.partial, HIVE-6267.2.partial, HIVE-6267.3.partial, HIVE-6267.4.patch, HIVE-6267.5.patch, HIVE-6267.6.patch, HIVE-6267.7.patch.gz, HIVE-6267.8.patch I've gotten feedback over time saying that it's very difficult to grok our explain command. There's supposedly a lot of information that mainly matters to developers or the testing framework. Comparing it to other major DBs it does seem like we're packing way more into explain than other folks. I've gone through the explain checking, what could be done to improve readability. 
Here's a list of things I've found:
- AST (unreadable in its Lisp syntax, not really required for end users)
- Vectorization (enough to display once per task and only when true)
- Expression representation is very lengthy, could be much more compact
- if not exists on DDL (enough to display only when true, or maybe not at all)
- bucketing info (enough if displayed only if the table is actually bucketed)
- external flag (show only if external)
- GlobalTableId (don't need it in plain explain, maybe in extended)
- Position of big table (already clear from the plan)
- Stats always (most DBs only show stats in explain; that gives a sense of what the planner thinks will happen)
- skew join (only if true should be enough)
- limit doesn't show the actual limit
- Alias
- Map Operator tree
- alias is duplicated in the TableScan operator
- tag is only useful at runtime (move to explain extended)
- Some names are camel case or abbreviated; clearer if the full name is used
- Tez is missing the vertex map (aka edges)
- explain formatted (json) is broken right now (swallows some information)

Since changing explain results in many golden file updates, I'd like to take a stab at all of these at once.
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890217#comment-13890217 ] Thejas M Nair commented on HIVE-6204: - I think, ideally, we should not have test-specific conditions in the main code. One way to work around it would be to use a configurable class that returns the timestamp. In the case of tests, we use a class that just returns a hard-coded time. This would be something similar to dependency injection. But this is probably something that can also be done in a separate jira. If Hive had the ability to use metadata statements such as this in a subquery, we could have solved this by selecting the appropriate fields for testing. Navis, can you please create a reviewboard link as well? The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt {noformat} hive show grant role role1 on all; OK database default table src principalName role1 principalType ROLE privilege Create grantTime Wed Dec 18 14:17:56 KST 2013 grantor navis database default table srcpart principalName role1 principalType ROLE privilege Update grantTime Wed Dec 18 14:18:28 KST 2013 grantor navis {noformat} This should be something like below, especially for JDBC clients. {noformat} hive show grant role role1 on all; OK default src role1 ROLECreate false 1387343876000 navis default srcpart role1 ROLEUpdate false 1387343908000 navis {noformat}
[jira] [Commented] (HIVE-6204) The result of show grant / show role should be tabular format
[ https://issues.apache.org/jira/browse/HIVE-6204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890218#comment-13890218 ] Thejas M Nair commented on HIVE-6204: - I mean, a configurable class could be used instead of System.currentTimeMillis()/1000 in ObjectStore. The result of show grant / show role should be tabular format - Key: HIVE-6204 URL: https://issues.apache.org/jira/browse/HIVE-6204 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6204.1.patch.txt {noformat} hive show grant role role1 on all; OK database default table src principalName role1 principalType ROLE privilege Create grantTime Wed Dec 18 14:17:56 KST 2013 grantor navis database default table srcpart principalName role1 principalType ROLE privilege Update grantTime Wed Dec 18 14:18:28 KST 2013 grantor navis {noformat} This should be something like below, especially for JDBC clients. {noformat} hive show grant role role1 on all; OK default src role1 ROLECreate false 1387343876000 navis default srcpart role1 ROLEUpdate false 1387343908000 navis {noformat}
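The configurable-class idea in the two comments above might look roughly like the sketch below. `Clock`, `fixed`, and `grantTime` are hypothetical names, not anything that exists in ObjectStore; this only illustrates the dependency-injection shape being proposed:

```java
// Hypothetical sketch of the proposal: ObjectStore would ask a pluggable
// Clock for grant times instead of calling System.currentTimeMillis()/1000
// directly; tests swap in a fixed clock so golden files need no time masking.
public class ClockSketch {
    interface Clock {
        long nowSeconds();
    }

    // Production clock: what ObjectStore effectively computes today.
    static final Clock SYSTEM = () -> System.currentTimeMillis() / 1000;

    // Test clock: always returns the same instant.
    static Clock fixed(long seconds) {
        return () -> seconds;
    }

    // Stand-in for the point in ObjectStore that records a grant time.
    static long grantTime(Clock clock) {
        return clock.nowSeconds();
    }

    public static void main(String[] args) {
        System.out.println(grantTime(fixed(1387343876L)));
    }
}
```

Wiring the clock in through configuration (rather than a test conf var checked in main code) keeps the test-specific behavior out of the production path, which is the point Thejas raises.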
[jira] [Updated] (HIVE-6256) add batch dropping of partitions to Hive metastore (as well as to dropTable)
[ https://issues.apache.org/jira/browse/HIVE-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6256: --- Attachment: HIVE-6256.nogen.patch HIVE-6256.patch Patch. drop* tests seem to pass add batch dropping of partitions to Hive metastore (as well as to dropTable) Key: HIVE-6256 URL: https://issues.apache.org/jira/browse/HIVE-6256 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6256.nogen.patch, HIVE-6256.patch Metastore drop partitions call drops one partition; when many are being dropped this can be slow. Partitions could be dropped in batch instead, if multiple are dropped via one command. Drop table can also use that. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6361) Un-fork Sqlline
Julian Hyde created HIVE-6361: - Summary: Un-fork Sqlline Key: HIVE-6361 URL: https://issues.apache.org/jira/browse/HIVE-6361 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.12.0 Reporter: Julian Hyde I propose to merge the two development forks of sqlline: Hive's beeline module, and the fork at https://github.com/julianhyde/sqlline. How did the forks come about? Hive's SQL command-line interface Beeline was created by forking Sqlline (see HIVE-987, HIVE-3100), which at the time was a useful but low-activity project languishing on SourceForge without an active owner. Around the same time, Julian Hyde independently started a github repo based on the same code base. Now several projects are using Julian Hyde's sqlline, including Apache Drill, Apache Phoenix, Cascading Lingual and Optiq. Merging these two forks will allow us to pool our resources. (Case in point: Drill issue DRILL-327 had already been fixed in a later version of sqlline; it still exists in beeline.) I propose the following steps:
1. Copy Julian Hyde's sqlline as a new Hive module, hive-sqlline.
2. Port fixes to hive-beeline into hive-sqlline.
3. Make hive-beeline depend on hive-sqlline, and remove code that is identical. What remains in the hive-beeline module is Beeline.java (a derived class of Sqlline.java) and Hive-specific extensions.
4. Make hive-sqlline the official successor to Julian Hyde's sqlline.
This achieves continuity for Hive's users, gives the users of the non-Hive sqlline a version with minimal dependencies, unifies the two code lines, and brings everything under the Apache roof.
Re: Proposal to un-fork Sqlline
On Feb 3, 2014, at 11:15 AM, Xuefu Zhang xzh...@cloudera.com wrote:
> I'm wondering whether it makes more sense to fork sqlline directly into Apache. Upon its completion, Hive gets rid of its copy of sqlline and creates a dependency on the forked sqlline instead. I guess this is a top-down approach, and the benefits are immediate across multiple projects.

You're basically suggesting that I do step 3 before 1 and 2. It makes sense, because it reduces risk. I have logged https://issues.apache.org/jira/browse/HIVE-6361 with an updated proposal.

Julian
[jira] [Commented] (HIVE-6256) add batch dropping of partitions to Hive metastore (as well as to dropTable)
[ https://issues.apache.org/jira/browse/HIVE-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890245#comment-13890245 ] Sergey Shelukhin commented on HIVE-6256: Oh, this patch incorporates a heavily changed HIVE-6342. The approach used is sending expressions to the metastore. Unfortunately, to populate results in the semantic analyzer (which seems to be of dubious value) we still need to fetch partitions from the client as well. Also, JDO requires fetching objects in order to delete them, which is sad... we may need a follow-up patch to do direct SQL deletes, but that may not be safe w.r.t. other code going through DN. add batch dropping of partitions to Hive metastore (as well as to dropTable) Key: HIVE-6256 URL: https://issues.apache.org/jira/browse/HIVE-6256 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6256.nogen.patch, HIVE-6256.patch The metastore drop-partitions call drops one partition; when many are being dropped this can be slow. Partitions could be dropped in a batch instead, if multiple are dropped via one command. Drop table can also use that.
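The batching idea behind this issue — one metastore round-trip per group of partitions instead of one per partition — can be sketched as follows. The `dropPartitions` client call named in the comment is hypothetical; only the generic chunking logic is real code here:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of batch dropping: group partition names into fixed-size batches
// and issue one (hypothetical) metastore drop call per batch, instead of
// one call per partition.
public class BatchDropSketch {
    // Split a list into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            out.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> parts = List.of("ds=1", "ds=2", "ds=3", "ds=4", "ds=5");
        for (List<String> batch : batches(parts, 2)) {
            // hypothetical: client.dropPartitions("db", "tbl", batch);
            System.out.println("drop " + batch);
        }
    }
}
```

With N partitions and batch size B this costs ceil(N/B) metastore calls rather than N, which is the speedup the issue description is after.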
[jira] [Commented] (HIVE-5859) Create view does not capture inputs
[ https://issues.apache.org/jira/browse/HIVE-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890244#comment-13890244 ] Thejas M Nair commented on HIVE-5859: - +1 Create view does not capture inputs Key: HIVE-5859 URL: https://issues.apache.org/jira/browse/HIVE-5859 Project: Hive Issue Type: Bug Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: D14235.1.patch, HIVE-5859.2.patch.txt, HIVE-5859.3.patch.txt, HIVE-5859.4.patch.txt, HIVE-5859.5.patch.txt For example, CREATE VIEW view_j5jbymsx8e_1 as SELECT * FROM tbl_j5jbymsx8e; should capture default.tbl_j5jbymsx8e as an input entity for the authorization process, but currently it does not.
[jira] [Commented] (HIVE-6256) add batch dropping of partitions to Hive metastore (as well as to dropTable)
[ https://issues.apache.org/jira/browse/HIVE-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890248#comment-13890248 ] Sergey Shelukhin commented on HIVE-6256: [~hagleitn] [~ashutoshc] can you please review? add batch dropping of partitions to Hive metastore (as well as to dropTable) Key: HIVE-6256 URL: https://issues.apache.org/jira/browse/HIVE-6256 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6256.nogen.patch, HIVE-6256.patch Metastore drop partitions call drops one partition; when many are being dropped this can be slow. Partitions could be dropped in batch instead, if multiple are dropped via one command. Drop table can also use that. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6327) A few mathematic functions don't take decimal input
[ https://issues.apache.org/jira/browse/HIVE-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6327: -- Attachment: HIVE-6327.1.patch Patch #1 updated: 1. removed a tab; 2. if condition, changed to = 1.0. A few mathematic functions don't take decimal input --- Key: HIVE-6327 URL: https://issues.apache.org/jira/browse/HIVE-6327 Project: Hive Issue Type: Improvement Affects Versions: 0.11.0, 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6327.1.patch, HIVE-6327.patch A few mathematical functions, such as sin(), cos(), etc., don't take decimal as an argument.
{code}
hive> show tables;
OK
Time taken: 0.534 seconds
hive> create table test(d decimal(5,2));
OK
Time taken: 0.351 seconds
hive> select sin(d) from test;
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'd': No matching method for class org.apache.hadoop.hive.ql.udf.UDFSin with (decimal(5,2)). Possible choices: _FUNC_(double)
{code}
HIVE-6246 covers only the sign() function. The remaining ones include sin, cos, tan, asin, acos, atan, exp, ln, log, log10, log2, radians, and sqrt. These are non-generic UDFs.
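The failure shown above happens because these non-generic UDFs only declare a double overload, so a decimal argument finds no matching method. The shape of the fix amounts to accepting decimal input and evaluating it as a double. This standalone sketch (hypothetical class and method, not the actual UDF code, which extends Hive's UDF classes) shows the bridging:

```java
import java.math.BigDecimal;

// Sketch of bridging a decimal argument into a double-based math function:
// convert the BigDecimal to double, then apply the existing double overload.
public class DecimalMathSketch {
    static double sinOfDecimal(BigDecimal d) {
        return Math.sin(d.doubleValue());
    }

    public static void main(String[] args) {
        // Mirrors "select sin(d)" for a decimal(5,2) value of 0.00.
        System.out.println(sinOfDecimal(new BigDecimal("0.00")));
    }
}
```

The conversion is lossy for decimals wider than a double's precision, which is acceptable here because the underlying functions are defined on doubles anyway.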
Review Request 17687: HIVE-6256 add batch dropping of partitions to Hive metastore (as well as to dropTable)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/17687/ ---

Review request for hive, Ashutosh Chauhan and Gunther Hagleitner.

Repository: hive-git

Description
---

See jira.

Diffs
-

  hcatalog/core/src/main/java/org/apache/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java 8bb4045
  hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/HCatSemanticAnalyzer.java 75f54e2
  metastore/if/hive_metastore.thrift e327e2a
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 2d8e483
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java bcbb52e
  metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 377709f
  metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java e18e13f
  metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 0715e22
  metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 2e3b6da
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java 6998b43
  metastore/src/test/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java f54ae53
  ql/src/java/org/apache/hadoop/hive/ql/exec/ArchiveUtils.java 598be11
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java a926f1e
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java e59decc
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 46f96ce
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java c51e998
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java f4d9a83
  ql/src/java/org/apache/hadoop/hive/ql/plan/DropTableDesc.java 831aefc
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionSpec.java 5a6553f
  ql/src/test/queries/clientnegative/drop_partition_filter_failure2.q 4d238d7
  ql/src/test/results/clientnegative/drop_partition_failure.q.out 5db9d92
  ql/src/test/results/clientnegative/drop_partition_filter_failure.q.out 863d821

Diff: https://reviews.apache.org/r/17687/diff/

Testing
---

Thanks,

Sergey Shelukhin
[jira] [Commented] (HIVE-6256) add batch dropping of partitions to Hive metastore (as well as to dropTable)
[ https://issues.apache.org/jira/browse/HIVE-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890253#comment-13890253 ] Sergey Shelukhin commented on HIVE-6256: actually let me also look at DN delete by query in the same patch after all. This will be a small scope change, so it's still ready for review. add batch dropping of partitions to Hive metastore (as well as to dropTable) Key: HIVE-6256 URL: https://issues.apache.org/jira/browse/HIVE-6256 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Minor Attachments: HIVE-6256.nogen.patch, HIVE-6256.patch Metastore drop partitions call drops one partition; when many are being dropped this can be slow. Partitions could be dropped in batch instead, if multiple are dropped via one command. Drop table can also use that. -- This message was sent by Atlassian JIRA (v6.1.5#6160)