[jira] [Commented] (HIVE-2471) Add timestamp column to the partition stats table.

2012-03-20 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233258#comment-13233258
 ] 

Hudson commented on HIVE-2471:
--

Integrated in Hive-trunk-h0.21 #1322 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1322/])
HIVE-2471 Add timestamp column to the partition stats table.
(Kevin Wilfong via namit) (Revision 1302739)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1302739
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsSetupConstants.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java


 Add timestamp column to the partition stats table.
 --

 Key: HIVE-2471
 URL: https://issues.apache.org/jira/browse/HIVE-2471
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2471.1.patch.txt, HIVE-2471.D2367.1.patch, 
 HIVE-2471.D2367.2.patch, HIVE-2471.D2367.3.patch


 Occasionally, after entries are added to the partition stats table, the program 
 is halted (by an exception, a keyboard interrupt, etc.) before it can delete 
 them.  These entries build up until the table becomes very large, which hurts 
 the performance of the frequently called update statement.  
 To fix this, I am adding a column to the table which is 
 auto-populated with the current timestamp.  This will allow us to create 
 scripts that periodically clean old entries out of the table.
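
A periodic cleanup of the kind described above could be sketched as follows. This is only an illustration: the table name PARTITION_STATS, the column name TS, and the retention period are hypothetical placeholders, not names taken from the patch.

```java
import java.sql.Timestamp;
import java.util.concurrent.TimeUnit;

/** Sketch of a stale-row cleanup for a stats table with a timestamp column. */
final class StatsCleanupSketch {
    // Hypothetical names; the real table and column names are defined by the patch.
    static final String TABLE = "PARTITION_STATS";
    static final String TS_COLUMN = "TS";

    /** SQL that deletes every row older than a cutoff bound at execution time. */
    static String deleteOlderThanSql() {
        return "DELETE FROM " + TABLE + " WHERE " + TS_COLUMN + " < ?";
    }

    /** Cutoff instant: rows written before it are treated as stale. */
    static Timestamp cutoff(long nowMillis, long retentionDays) {
        return new Timestamp(nowMillis - TimeUnit.DAYS.toMillis(retentionDays));
    }

    public static void main(String[] args) {
        System.out.println(deleteOlderThanSql());
        System.out.println(cutoff(System.currentTimeMillis(), 1));
    }
}
```

A script would prepare the statement once and bind the cutoff with PreparedStatement.setTimestamp(1, ...), so that the comparison is done by the database against the auto-populated column.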

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2881) Remove redundant key comparing in SMBMapJoinOperator

2012-03-20 Thread Navis (Created) (JIRA)
Remove redundant key comparing in SMBMapJoinOperator


 Key: HIVE-2881
 URL: https://issues.apache.org/jira/browse/HIVE-2881
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-2881.D2379.1.patch

Currently, SMBJoin compares keys twice in #findSmallestKey and #joinObject.





[jira] [Updated] (HIVE-2881) Remove redundant key comparing in SMBMapJoinOperator

2012-03-20 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2881:
--

Attachment: HIVE-2881.D2379.1.patch

navis requested code review of HIVE-2881 [jira] Remove redundant key comparing 
in SMBMapJoinOperator.
Reviewers: JIRA

  DPAL-988 Remove redundant key comparing in SMBMapJoinOperator

  Currently, SMBJoin compares keys twice in #findSmallestKey and #joinObject.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D2379

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/5331/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.







[jira] [Updated] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1

2012-03-20 Thread Sushanth Sowmyan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-2084:
---

Attachment: HIVE-2084.2.patch.txt

Updated patch, and changed to Datanucleus v3.0.8. Does anyone still have any 
failing tests with this upgrade?

 Upgrade datanucleus from 2.0.3 to 3.0.1
 ---

 Key: HIVE-2084
 URL: https://issues.apache.org/jira/browse/HIVE-2084
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Ning Zhang
Assignee: Carl Steinbach
  Labels: datanucleus
 Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, 
 HIVE-2084.patch


 It seems that datanucleus 2.2.3 does a better job of caching. Fetching the 
 same set of partition objects a second time takes about 1/4 of the time it 
 took the first time, whereas with 2.0.3 the second execution took almost as 
 long as the first. We should retest the test cases mentioned in 
 HIVE-1853 and HIVE-1862.





[jira] [Created] (HIVE-2882) Problem with Hive using JDBC

2012-03-20 Thread Bhavesh Shah (Created) (JIRA)
Problem with Hive using JDBC


 Key: HIVE-2882
 URL: https://issues.apache.org/jira/browse/HIVE-2882
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.7.1
 Environment: Operating System - Ubuntu 11.10
Softwares - Hadoop-0.20.2, Hive-0.7.1
Reporter: Bhavesh Shah
Priority: Critical


I am trying to implement a task in Hive that is similar to a stored procedure 
in SQL (a block of queries).
In SQL, when we write a cursor, we first execute a select query and then, while 
fetching the records, perform some actions on them.

Likewise, I have fired a select query in Hive as follows:

String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";
Class.forName(driverName);
Connection con = 
DriverManager.getConnection("jdbc:hive://localhost:1/default", "", "");
String sql = null;
Statement stmt = con.createStatement();
Statement stmt1 = con.createStatement();
ResultSet res = null;
ResultSet rs1 = null;

sql = "select a,c,b from tbl_name";
res = stmt.executeQuery(sql); // contains 30 records
while (res.next())
{
 sql = "select d,e,f from t1";
 rs1 = stmt1.executeQuery(sql);
 // likewise, many more queries follow
 // ...
}

But the problem is that the while loop executes only once, instead of 30 times, 
when the inner query (inside the while loop) is executed.

If I create two different connections, one for each query, then everything 
works fine:

String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";
Class.forName(driverName);
Connection con = 
DriverManager.getConnection("jdbc:hive://localhost:1/default", "", "");
Connection con1 = 
DriverManager.getConnection("jdbc:hive://localhost:1/default", "", "");
String sql = null;
Statement stmt = con.createStatement();
Statement stmt1 = con1.createStatement();
ResultSet res = null;
ResultSet rs1 = null;

To sum up: when I iterate through a result set, do I need to use a different 
connection (and statement object) to execute other queries?
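
One possible workaround that avoids a second connection is to copy the outer rows into memory before running the inner queries. This assumes that the outer result set is invalidated once another query runs on the same connection, which the behavior reported above suggests but does not prove. The helper below is a sketch only; the class name ResultSetBuffer and the caller-supplied column count are illustrative choices, not part of the Hive JDBC driver.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Sketch: copy a ResultSet into memory so its cursor is no longer needed. */
final class ResultSetBuffer {
    /** Reads every remaining row; columnCount is supplied by the caller. */
    static List<Object[]> drain(ResultSet rs, int columnCount) {
        List<Object[]> rows = new ArrayList<>();
        try {
            while (rs.next()) {
                Object[] row = new Object[columnCount];
                for (int i = 0; i < columnCount; i++) {
                    row[i] = rs.getObject(i + 1); // JDBC columns are 1-based
                }
                rows.add(row);
            }
        } catch (SQLException e) {
            // Unchecked wrapper keeps the sketch short; real code may rethrow.
            throw new IllegalStateException("failed to buffer result set", e);
        }
        return rows;
    }
}
```

With this helper, the loop in the report would first call ResultSetBuffer.drain(res, 3) and then iterate over the returned list, issuing the inner queries through stmt1. For 30 rows the memory cost is negligible; for large outer results, two connections may remain the better option.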





[jira] [Created] (HIVE-2883) Metastore client doesnt close connection properly

2012-03-20 Thread Ashutosh Chauhan (Created) (JIRA)
Metastore client doesnt close connection properly
-

 Key: HIVE-2883
 URL: https://issues.apache.org/jira/browse/HIVE-2883
 Project: Hive
  Issue Type: Bug
Reporter: Ashutosh Chauhan


While closing the connection, it always fails with the following trace. It does 
not seem to have any harmful effects.
{code}
12/03/20 10:55:02 ERROR hive.metastore: Unable to shutdown local metastore client
org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
    at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
    at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
    at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:421)
    at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:415)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:310)
{code}





[jira] [Created] (HIVE-2884) When distinct is used in from of STDDEV, the statement fails with a Java exception error

2012-03-20 Thread Mauro Cazzari (Created) (JIRA)
When distinct is used in from of STDDEV, the statement fails with a Java 
exception error
--

 Key: HIVE-2884
 URL: https://issues.apache.org/jira/browse/HIVE-2884
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.7.1
Reporter: Mauro Cazzari


Given the following statement:

select distinct STDDEV_SAMP(TXT_1.`age`) as AgeAlias, 
   STDDEV_SAMP(TXT_1.`weight`) as WeightAlias 
  from `CLASS` TXT_1;

Hive generates a Java SQL exception error upon execution. If the distinct is 
removed, the statement runs fine. 






[jira] [Created] (HIVE-2885) select distinct string from table returns wrong result

2012-03-20 Thread Mauro Cazzari (Created) (JIRA)
select distinct string from table returns wrong result


 Key: HIVE-2885
 URL: https://issues.apache.org/jira/browse/HIVE-2885
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.8.1
Reporter: Mauro Cazzari


Given the following table:

CREATE TABLE `MYTAB` (`a` DOUBLE)

containing the values 1, 2, 3, and 4, the following SQL fails to produce the 
correct result:

select distinct 'FOO' from `MYTAB`

Note that this issue didn't show up with Hive 0.7. Only 0.8.1 seems to be affected.





[jira] [Created] (HIVE-2886) distinct with order by fails with Java SQL exception.

2012-03-20 Thread Mauro Cazzari (Created) (JIRA)
distinct with order by fails with Java SQL exception.
-

 Key: HIVE-2886
 URL: https://issues.apache.org/jira/browse/HIVE-2886
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.7.1
Reporter: Mauro Cazzari


The following select:

select distinct TXT_1.`a`, TXT_1.`b` from `MYTAB` TXT_1 order by TXT_1.`a` asc

fails with a Java SQL exception. Note that if the distinct or the table alias 
is removed from the SQL, the statement executes fine.





[jira] [Commented] (HIVE-2885) select distinct string from table returns wrong result

2012-03-20 Thread Mauro Cazzari (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233663#comment-13233663
 ] 

Mauro Cazzari commented on HIVE-2885:
-

I just verified that the problem seems to have gone away with the latest 
version of Hive. If anyone knows which Hive issue fixed this, feel free to mark 
it as a DUP.
Thanks!






[jira] [Assigned] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1

2012-03-20 Thread Carl Steinbach (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach reassigned HIVE-2084:


Assignee: Sushanth Sowmyan  (was: Carl Steinbach)

@Sushanth: Can you please submit a review request on phabricator? Thanks. 
https://cwiki.apache.org/Hive/phabricatorcodereview.html

 Upgrade datanucleus from 2.0.3 to 3.0.1
 ---

 Key: HIVE-2084
 URL: https://issues.apache.org/jira/browse/HIVE-2084
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Ning Zhang
Assignee: Sushanth Sowmyan
  Labels: datanucleus
 Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, 
 HIVE-2084.patch


 It seems that datanucleus 2.2.3 does a better job of caching. Fetching the 
 same set of partition objects a second time takes about 1/4 of the time it 
 took the first time, whereas with 2.0.3 the second execution took almost as 
 long as the first. We should retest the test cases mentioned in 
 HIVE-1853 and HIVE-1862.





Hive-trunk-h0.21 - Build # 1323 - Still Failing

2012-03-20 Thread Apache Jenkins Server
Changes for Build #1322
[namit] HIVE-2471 Add timestamp column to the partition stats table.
(Kevin Wilfong via namit)


Changes for Build #1323



1 tests failed.
REGRESSION:  
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore

Error Message:
null

Stack Trace:
java.lang.NullPointerException
    at org.apache.hadoop.hive.metastore.HiveMetaStore.getDelegationToken(HiveMetaStore.java:2749)
    at org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.getDelegationTokenStr(TestHadoop20SAuthBridge.java:296)
    at org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.obtainTokenAndAddIntoUGI(TestHadoop20SAuthBridge.java:303)
    at org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore(TestHadoop20SAuthBridge.java:212)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:243)
    at junit.framework.TestSuite.run(TestSuite.java:238)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1323)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1323/ to 
view the results.

[jira] [Commented] (HIVE-942) use bucketing for group by

2012-03-20 Thread Kevin Wilfong (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233789#comment-13233789
 ] 

Kevin Wilfong commented on HIVE-942:


If the table is bucketed and sorted on the group by keys, the group by operator 
would not need to do a hash, which would help with memory consumption.
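
To illustrate the point, here is a minimal sketch (not Hive's group by operator) of a count-per-key aggregation over input that is already sorted on the key. Because equal keys are adjacent, only the current key and its running count are held in memory, and no hash table is needed. The class and method names are hypothetical, and keys are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: GROUP BY key, COUNT(*) over input that is already sorted by key. */
final class SortedGroupBySketch {
    /** Emits "key=count" per group, holding one group in memory at a time. */
    static List<String> countBySortedKey(Iterable<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String current = null;
        long count = 0;
        for (String key : sortedKeys) {
            if (count > 0 && !key.equals(current)) {
                out.add(current + "=" + count); // key changed: group is complete
                count = 0;
            }
            current = key;
            count++;
        }
        if (count > 0) {
            out.add(current + "=" + count); // flush the last group
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [a=2, b=1, c=3]
        System.out.println(countBySortedKey(Arrays.asList("a", "a", "b", "c", "c", "c")));
    }
}
```

A hash-based aggregation must keep one entry per distinct key until the input ends, so its memory use grows with the number of groups. The streaming version above stays constant regardless of how many groups there are.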

 use bucketing for group by
 --

 Key: HIVE-942
 URL: https://issues.apache.org/jira/browse/HIVE-942
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain

 Group by on a bucketed column can be completely performed on the mapper if 
 the split can be adjusted to span the key boundary.





[jira] [Resolved] (HIVE-2262) mapjoin followed by union all, groupby does not work

2012-03-20 Thread Ashutosh Chauhan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-2262.


Resolution: Fixed

This is no longer reproducible on trunk. Feel free to reopen if there is some 
other variant which can produce this.

 mapjoin followed by union all, groupby does not work
 

 Key: HIVE-2262
 URL: https://issues.apache.org/jira/browse/HIVE-2262
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.7.1
Reporter: yu xiang
Priority: Trivial

 sql:
 CREATE TABLE nulltest2(int_data1 INT, int_data2 INT, boolean_data BOOLEAN, 
 double_data DOUBLE, string_data STRING) ROW FORMAT DELIMITED FIELDS 
 TERMINATED BY ',';
 CREATE TABLE nulltest3(int_data1 INT) ROW FORMAT DELIMITED FIELDS TERMINATED 
 BY ',';
 explain select int_data2,count(1) from (select /*+mapjoin(a)*/ int_data2, 1 
 as c1, 0 as c2 from nulltest2 a join nulltest3 b on(a.int_data1 = 
 b.int_data1) union all select /*+mapjoin(a)*/ int_data2, 1 as c1, 2 as c2 
 from nulltest2 a join nulltest3 b on(a.int_data1 = b.int_data1)) mapjointable 
 group by int_data2;
 exception:
 FAILED: Hive Internal Error: java.lang.NullPointerException(null)
 java.lang.NullPointerException
     at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:156)
     at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:551)
     at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:514)
     at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.initPlan(GenMapRedUtils.java:125)
     at org.apache.hadoop.hive.ql.optimizer.GenMRRedSink1.process(GenMRRedSink1.java:76)
     at org.apache.hadoop.hive.ql.optimizer.GenMRRedSink3.process(GenMRRedSink3.java:64)
     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
 Analysis:
 1. When mapjoin, union, and group by are used together, the optimizer's 
 UnionProcFactory.MapJoinUnion() sets MapJoinSubq to true and sets up the 
 UnionParseContext.
 2. In GenMRUnion1, Hive calls mergeMapJoinUnion, which also sets the task plan.
 3. In GenMRRedSink3, Hive checks uCtx.isMapOnlySubq() and calls 
 GenMRRedSink1.process() to initialize the plan. But the utask's plan has 
 already been set, so only the reducer still needs to be set. In addition, the 
 utask is processing a temporary table, so there is no topOp mapped to a table. 
 This is where the NullPointerException occurs.
 Solutions:
 1. SQL solution: rewrite the query to use a subquery.
 2. Code solution: in mergeMapJoinUnion, after the task plan has been set, 
 set a flag to true to indicate that the plan for this utask has been set. 
 Then, in GenMRRedSink3, if this flag is true, do not call 
 GenMRRedSink1.process() to reinitialize the plan:
 
 if (uCtx.isMapOnlySubq() && !upc.isIssetTaskPlan())
 
 I don't know whether the code solution is suitable.
 Is there a better solution?
 Thanks.





[jira] [Commented] (HIVE-2870) Throw an error when a nonexistent partition is accessed in strict mode

2012-03-20 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233825#comment-13233825
 ] 

Phabricator commented on HIVE-2870:
---

kevinwilfong has commented on the revision HIVE-2870 [jira] Throw an error 
when a nonexistent partition is accessed in strict mode.

  Could you add a test case for this, e.g. look at the files in 
ql/src/test/queries/clientnegative

REVISION DETAIL
  https://reviews.facebook.net/D2319


 Throw an error when a nonexistent partition is accessed in strict mode
 --

 Key: HIVE-2870
 URL: https://issues.apache.org/jira/browse/HIVE-2870
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Lucian Adrian Grijincu
Priority: Minor
 Attachments: HIVE-2870.D2319.1.patch, HIVE-2870.D2319.2.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 When a table does not exist and someone tries to read from it in a query, 
 Hive throws an error.
 But if a partition is directly accessed that does not exist, an error is not 
 thrown. This is inconsistent and also leads to a lot of confused users who 
 get no output.
 This task is to cause Hive to throw an error when the partition pruner for a 
 query eliminates ALL existing partitions for some table when running in 
 strict mode.
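
The rule proposed above can be sketched as a small check. This is not the code from the attached patches; the class name, the parameters, and the exception type are all hypothetical and only illustrate the condition "the table has partitions, pruning kept none of them, and strict mode is on".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch of the proposed strict-mode check; not the code from the patch. */
final class StrictModePartitionCheck {
    /**
     * Throws if the table has partitions but pruning kept none of them
     * while strict mode is on. All parameter names are hypothetical.
     */
    static void check(String table, List<String> existing, List<String> kept,
                      boolean strictMode) {
        if (strictMode && !existing.isEmpty() && kept.isEmpty()) {
            throw new IllegalStateException(
                "No partition of table " + table + " matches the query");
        }
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("ds=2012-03-19", "ds=2012-03-20");
        check("t", all, Collections.singletonList("ds=2012-03-20"), true); // passes
        check("t", all, Collections.<String>emptyList(), false);           // passes
    }
}
```

Outside strict mode the check never fires, so a query on a nonexistent partition would still return no rows, as it does today.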





[jira] [Commented] (HIVE-2870) Throw an error when a nonexistent partition is accessed in strict mode

2012-03-20 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233844#comment-13233844
 ] 

Phabricator commented on HIVE-2870:
---

kevinwilfong has commented on the revision HIVE-2870 [jira] Throw an error 
when a nonexistent partition is accessed in strict mode.

  By a test case, I meant add a query to that list to test that the query 
actually does fail if there are no partitions.

  Also, it might be a good idea to add a test case in client positive to make 
sure you can turn it off.

  See 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-AddaUnitTest

INLINE COMMENTS
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java:246-252
 Sorry to be picky, but could you also add a small comment here just giving a 
quick explanation of what it does.

REVISION DETAIL
  https://reviews.facebook.net/D2319







[jira] [Commented] (HIVE-2877) TABLESAMPLE(x PERCENT) tests fail on 0.22/0.23

2012-03-20 Thread Carl Steinbach (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233888#comment-13233888
 ] 

Carl Steinbach commented on HIVE-2877:
--

There are two distinct problems:

1) Many of the queries in split_sample.q and sample_islocalmode_hook.q are 
nondeterministic. This can be fixed by adding ORDER BY clauses.

2) The second problem is more serious. Both of the tests set 
mapred.max.split.size=300 and hive.merge.smallfiles.avgsize=1 in an effort to 
force the generation of multiple splits and multiple output files. However, 
Hadoop 0.20 is incapable of generating splits smaller than the block size when 
using CombineFileInputFormat, so only one split is generated. This has a 
significant impact on the results of the TABLESAMPLE(x PERCENT). This issue was 
fixed in MAPREDUCE-2046 which is included in 0.23.

Suggested Fixes: 
# Make the queries deterministic
# Restrict these tests to Hadoop versions >= 0.22
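
The effect described in problem 2 can be shown with a deliberately simplified model of split counting. This is not Hadoop's CombineFileInputFormat logic; it only assumes that, when the block-size floor applies, the effective split size is the larger of the requested maximum and the block size. The file size and block size below are hypothetical.

```java
/** Simplified model of split counting; not Hadoop's CombineFileInputFormat code. */
final class SplitCountModel {
    /**
     * @param blockFloor true models the behaviour described above for Hadoop 0.20,
     *                   where a split cannot be smaller than one block
     */
    static long splits(long fileBytes, long maxSplitBytes, long blockBytes,
                       boolean blockFloor) {
        long effective = blockFloor ? Math.max(maxSplitBytes, blockBytes) : maxSplitBytes;
        return (fileBytes + effective - 1) / effective; // ceiling division
    }

    public static void main(String[] args) {
        long file = 1500;               // hypothetical file size in bytes
        long block = 64L * 1024 * 1024; // hypothetical 64 MB block size
        System.out.println(splits(file, 300, block, true));  // prints 1
        System.out.println(splits(file, 300, block, false)); // prints 5
    }
}
```

Under this model the same mapred.max.split.size=300 setting yields one split with the floor and five without it, which is the kind of difference the comment above identifies as changing the TABLESAMPLE(x PERCENT) results.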


 TABLESAMPLE(x PERCENT) tests fail on 0.22/0.23
 --

 Key: HIVE-2877
 URL: https://issues.apache.org/jira/browse/HIVE-2877
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Carl Steinbach







[jira] [Updated] (HIVE-2797) Make the IP address of a Thrift client available to HMSHandler.

2012-03-20 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2797:
--

Attachment: HIVE-2797.D1701.5.patch

kevinwilfong updated the revision HIVE-2797 [jira] Make the IP address of a 
Thrift client available to HMSHandler..
Reviewers: JIRA, njain, ashutoshc

  Really sorry about that, Ashutosh.  TUGIContainingTransport does not 
extend TSocket, which caused the errors you saw.  I added a getSocket method to 
the class which returns the Socket object if the underlying TTransport is an 
instance of TSocket, and null otherwise.  TUGIBasedProcessor's implementation of 
setIpAddress now uses this method and handles the null case.
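
The pattern described above (return the socket only when the wrapped transport has one, and make the caller tolerate null) can be sketched as follows. To keep the example self-contained it uses stand-in types rather than Thrift's TTransport and TSocket; every class name here is hypothetical and this is not the code from the patch.

```java
import java.net.Socket;

/** Stand-ins for Thrift's transport types so the sketch compiles on its own. */
interface TransportStandIn {}

final class SocketTransportStandIn implements TransportStandIn {
    private final Socket socket;
    SocketTransportStandIn(Socket socket) { this.socket = socket; }
    Socket getSocket() { return socket; }
}

/** Sketch of a wrapper that exposes the socket only when one exists. */
final class WrappingTransportSketch {
    private final TransportStandIn wrapped;
    WrappingTransportSketch(TransportStandIn wrapped) { this.wrapped = wrapped; }

    /** Returns the underlying socket, or null for non-socket transports. */
    Socket getSocket() {
        if (wrapped instanceof SocketTransportStandIn) {
            return ((SocketTransportStandIn) wrapped).getSocket();
        }
        return null;
    }

    /** Null-safe caller, mirroring how setIpAddress is described above. */
    static String describeIp(WrappingTransportSketch t) {
        Socket s = t.getSocket();
        if (s == null || s.getInetAddress() == null) {
            return "unknown";
        }
        return s.getInetAddress().getHostAddress();
    }
}
```

Returning null instead of throwing lets the processor keep working with transports that have no socket; the cost is that every caller must check for null, as describeIp does.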

REVISION DETAIL
  https://reviews.facebook.net/D1701

AFFECTED FILES
  
shims/src/common/java/org/apache/hadoop/hive/thrift/TUGIContainingTransport.java
  
metastore/src/test/org/apache/hadoop/hive/metastore/TestRemoteUGIHiveMetaStoreIpAddress.java
  metastore/src/test/org/apache/hadoop/hive/metastore/IpAddressListener.java
  
metastore/src/test/org/apache/hadoop/hive/metastore/TestRemoteHiveMetaStoreIpAddress.java
  metastore/src/java/org/apache/hadoop/hive/metastore/TUGIBasedProcessor.java
  
metastore/src/java/org/apache/hadoop/hive/metastore/TSetIpAddressProcessor.java
  metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java


 Make the IP address of a Thrift client available to HMSHandler.
 ---

 Key: HIVE-2797
 URL: https://issues.apache.org/jira/browse/HIVE-2797
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2797.D1701.1.patch, HIVE-2797.D1701.2.patch, 
 HIVE-2797.D1701.3.patch, HIVE-2797.D1701.4.patch, HIVE-2797.D1701.5.patch


 Currently, in unsecured mode, metastore Thrift calls are, from the 
 HMSHandler's point of view, anonymous.  If we expose the IP address of the 
 Thrift client to the HMSHandler from the Processor, this will help to give 
 some context, in particular for audit logging, of where the call is coming 
 from.





[jira] [Updated] (HIVE-2797) Make the IP address of a Thrift client available to HMSHandler.

2012-03-20 Thread Kevin Wilfong (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-2797:


Status: Patch Available  (was: Open)






[jira] [Commented] (HIVE-2797) Make the IP address of a Thrift client available to HMSHandler.

2012-03-20 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233897#comment-13233897
 ] 

Phabricator commented on HIVE-2797:
---

kevinwilfong has commented on the revision HIVE-2797 [jira] Make the IP 
address of a Thrift client available to HMSHandler..

  I added a test case which verifies that the IP address is accessible when the 
setugi config variable is true.

  In addition, I ran the entire test suite. TestHiveServerSessions timed out, 
but that seems to be an unrelated issue, as it does the same on a fresh checkout.

REVISION DETAIL
  https://reviews.facebook.net/D1701



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1

2012-03-20 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2084:
--

Attachment: HIVE-2084.D2397.1.patch

khorgath requested code review of HIVE-2084 [jira] Upgrade datanucleus from 
2.0.3 to 3.0.1.
Reviewers: JIRA

  Updated HIVE-2084 to work off DataNucleus release 3.0.8

  It seems that DataNucleus 2.2.3 does a better job of caching. Fetching the 
same set of partition objects a second time takes about 1/4 of the time the 
first fetch took, whereas with 2.0.3 the second execution took almost as long 
as the first. We should retest the test cases mentioned in HIVE-1853 and 
HIVE-1862.

TEST PLAN
  existing tests (this is a library dep upgrade)

REVISION DETAIL
  https://reviews.facebook.net/D2397

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  conf/hive-default.xml.template
  ivy/libraries.properties
  metastore/ivy.xml
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java



 Upgrade datanucleus from 2.0.3 to 3.0.1
 ---

 Key: HIVE-2084
 URL: https://issues.apache.org/jira/browse/HIVE-2084
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Ning Zhang
Assignee: Sushanth Sowmyan
  Labels: datanucleus
 Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, 
 HIVE-2084.D2397.1.patch, HIVE-2084.patch


 It seems that DataNucleus 2.2.3 does a better job of caching. Fetching the 
 same set of partition objects a second time takes about 1/4 of the time the 
 first fetch took, whereas with 2.0.3 the second execution took almost as long 
 as the first. We should retest the test cases mentioned in HIVE-1853 and 
 HIVE-1862.





[jira] [Commented] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1

2012-03-20 Thread Sushanth Sowmyan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233928#comment-13233928
 ] 

Sushanth Sowmyan commented on HIVE-2084:


Updated : https://reviews.facebook.net/D2397

(Also, please ignore the patch file I'd attached here before, I'd generated it 
from the hcatalog root dir, so it contains extra directory structure)







[jira] [Updated] (HIVE-2881) Remove redundant key comparing in SMBMapJoinOperator

2012-03-20 Thread Navis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2881:


Status: Patch Available  (was: Open)

Passed all tests.

 Remove redundant key comparing in SMBMapJoinOperator
 

 Key: HIVE-2881
 URL: https://issues.apache.org/jira/browse/HIVE-2881
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-2881.D2379.1.patch


 Currently, SMBJoin compares keys twice in #findSmallestKey and #joinObject.





[jira] [Commented] (HIVE-2797) Make the IP address of a Thrift client available to HMSHandler.

2012-03-20 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233950#comment-13233950
 ] 

Phabricator commented on HIVE-2797:
---

ashutoshc has accepted the revision HIVE-2797 [jira] Make the IP address of a 
Thrift client available to HMSHandler..

  No worries, Kevin. Thanks for making the changes.
  +1 Feel free to commit if tests pass.

REVISION DETAIL
  https://reviews.facebook.net/D1701

BRANCH
  svn






[jira] [Commented] (HIVE-1555) JDBC Storage Handler

2012-03-20 Thread Weihua Jiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233997#comment-13233997
 ] 

Weihua Jiang commented on HIVE-1555:


Hi Andrew,

How is the integration progressing? Where can I find your patch? I am very 
interested in this feature, and I think I can help with your work.


 JDBC Storage Handler
 

 Key: HIVE-1555
 URL: https://issues.apache.org/jira/browse/HIVE-1555
 Project: Hive
  Issue Type: New Feature
  Components: JDBC
Reporter: Bob Robertson
Assignee: Andrew Wilson
 Attachments: JDBCStorageHandler Design Doc.pdf

   Original Estimate: 24h
  Remaining Estimate: 24h

 With the Cassandra and HBase Storage Handlers in place, I thought it would 
 make sense to include a generic JDBC RDBMS Storage Handler so that you could 
 import a standard DB table into Hive. Many people must want to perform HiveQL 
 joins and similar operations against tables in other systems.
