[jira] [Commented] (HIVE-4914) filtering via partition name should be done inside metastore server (implementation)

2013-08-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742037#comment-13742037
 ] 

Hive QA commented on HIVE-4914:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12598355/HIVE-4914.patch

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 2879 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testFilterSinglePartition
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testPartitionFilter
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testFilterSinglePartition
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testFilterLastPartition
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testPartitionFilter
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testPartitionFilter
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testFilterSinglePartition
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testFilterLastPartition
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testFilterLastPartition
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testPartitionFilter
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testFilterLastPartition
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testFilterSinglePartition
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testFilterSinglePartition
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udtf_not_supported2
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testPartitionFilter
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testFilterLastPartition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/460/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/460/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

> filtering via partition name should be done inside metastore server 
> (implementation)
> 
>
> Key: HIVE-4914
> URL: https://issues.apache.org/jira/browse/HIVE-4914
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-4914-only-no-gen.patch, HIVE-4914-only.patch, 
> HIVE-4914.patch, HIVE-4914.patch
>
>
> Currently, if filter pushdown is impossible (which it is in most cases), the 
> client gets all partition names from the metastore, filters them, and asks 
> for the partitions by name for the filtered set.
> The metastore server code should do that instead: it should check whether 
> pushdown is possible and use it if so; otherwise it should fall back to 
> name-based filtering on the server.
> This saves the round trip that ships all partition names from the server to 
> the client, and also removes the need for pushdown-viability checking on 
> both sides.
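To make the restructuring concrete, below is a minimal sketch of the server-side fallback path. It is illustrative only: the class and helper names are hypothetical, and this is not the actual HIVE-4914 patch.

{noformat}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the HIVE-4914 patch: when expression pushdown to
// the datastore is impossible, evaluate the filter against partition names on
// the server and fetch only the matching partitions, so the full name list
// never crosses the wire.
public class ServerSideNameFilterSketch {

  interface NameFilter {              // stand-in for the real filter expression
    boolean matches(String partitionName);
  }

  static List<String> filterNames(List<String> allNames, NameFilter filter) {
    List<String> matching = new ArrayList<String>();
    for (String name : allNames) {
      if (filter.matches(name)) {
        matching.add(name);
      }
    }
    return matching;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("dt=20130805", "dt=20130806");
    NameFilter f = new NameFilter() {
      public boolean matches(String n) { return n.endsWith("20130805"); }
    };
    System.out.println(filterNames(names, f));   // prints [dt=20130805]
  }
}
{noformat}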

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5109) could not get an exists partition

2013-08-16 Thread mylinyuzhi (JIRA)
mylinyuzhi created HIVE-5109:


 Summary: could not get an exists partition
 Key: HIVE-5109
 URL: https://issues.apache.org/jira/browse/HIVE-5109
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
 Environment: ubuntu12.04
Reporter: mylinyuzhi
Priority: Critical


The partition [dt=20130805] exists, but it could not be retrieved.

20130816 17:21:31:709[ERROR] 
org.apache.hadoop.hive.metastore.RetryingHMSHandler 
NoSuchObjectException(message:partition values=[20130805])
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionWithAuth(ObjectStore.java:1424)
at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy8.getPartitionWithAuth(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth(HiveMetaStore.java:2014)
at sun.reflect.GeneratedMethodAccessor160.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
at $Proxy9.get_partition_with_auth(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:822)
at sun.reflect.GeneratedMethodAccessor159.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
at $Proxy10.getPartitionWithAuthInfo(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1580)
at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1543)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:778)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:678)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1203)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1057)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8295)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:443)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:344)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
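This report was later resolved as "Not A Problem" (see the updates below), so the practical first step is to confirm what the metastore has actually registered before debugging the query path. A minimal sketch, assuming a table named mytable in the default database (hypothetical names; the client calls shown are the standard HiveMetaStoreClient API of this era):

{noformat}
import java.util.Arrays;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.NoSuchObjectException;

public class PartitionCheck {
  public static void main(String[] args) throws Exception {
    HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
    // Show every partition name the metastore knows for the table.
    System.out.println(client.listPartitionNames("default", "mytable", (short) -1));
    try {
      // The values list must match the partition columns exactly (dt=20130805).
      client.getPartition("default", "mytable", Arrays.asList("20130805"));
      System.out.println("partition is registered");
    } catch (NoSuchObjectException e) {
      // The exception from the stack trace above: the metastore simply has no
      // partition with these values (often a typo or a missing ADD PARTITION).
      System.out.println("partition is not registered in the metastore");
    }
  }
}
{noformat}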

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5106) HCatFieldSchema overrides equals() but not hashCode()

2013-08-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742068#comment-13742068
 ] 

Hive QA commented on HIVE-5106:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12598352/HIVE-5106.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2880 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udtf_not_supported2
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/461/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/461/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

> HCatFieldSchema overrides equals() but not hashCode()
> -
>
> Key: HIVE-5106
> URL: https://issues.apache.org/jira/browse/HIVE-5106
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.12.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.12.0
>
> Attachments: HIVE-5106.patch
>
>
> It's likely that objects of this type are not hashed today, but this would 
> lead to very nasty bugs if they ever are.
> Looks like it was introduced in HCATALOG-438.
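The contract at issue is the standard java.lang.Object one: objects that compare equal must return equal hash codes, or hash-based collections silently misbehave. A minimal self-contained demonstration of this bug class follows; the Field class here is a stand-in, not HCatFieldSchema itself.

{noformat}
import java.util.HashSet;
import java.util.Set;

public class EqualsWithoutHashCode {
  static final class Field {
    final String name;
    Field(String name) { this.name = name; }
    @Override public boolean equals(Object o) {
      return o instanceof Field && ((Field) o).name.equals(name);
    }
    // BUG: hashCode() is not overridden, so two equal Fields almost always
    // land in different hash buckets.
  }

  public static void main(String[] args) {
    Set<Field> set = new HashSet<Field>();
    set.add(new Field("col1"));
    // Prints false: contains() probes the wrong bucket, which is exactly the
    // "very nasty bug" the report warns about if these objects get hashed.
    System.out.println(set.contains(new Field("col1")));
  }
}
{noformat}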

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5109) could not get an exists partition

2013-08-16 Thread mylinyuzhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mylinyuzhi updated HIVE-5109:
-

Component/s: (was: HCatalog)
 Metastore

> could not get an exists partition
> -
>
> Key: HIVE-5109
> URL: https://issues.apache.org/jira/browse/HIVE-5109
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.11.0
> Environment: ubuntu12.04
>Reporter: mylinyuzhi
>Priority: Critical
>
> The partition [dt=20130805] exists, but it could not be retrieved.
> 20130816 17:21:31:709[ERROR] 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler 
> NoSuchObjectException(message:partition values=[20130805])
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionWithAuth(ObjectStore.java:1424)
>   at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
>   at $Proxy8.getPartitionWithAuth(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth(HiveMetaStore.java:2014)
>   at sun.reflect.GeneratedMethodAccessor160.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at $Proxy9.get_partition_with_auth(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:822)
>   at sun.reflect.GeneratedMethodAccessor159.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
>   at $Proxy10.getPartitionWithAuthInfo(Unknown Source)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1580)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1543)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:778)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:678)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1203)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1057)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8295)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:443)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:344)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4999) Shim class HiveHarFileSystem does not have a hadoop2 counterpart

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742142#comment-13742142
 ] 

Hudson commented on HIVE-4999:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2271 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2271/])
HIVE-4999 Shim class HiveHarFileSystem does not have a hadoop2 counterpart 
(Brock Noland via egc)

Submitted by: Brock Noland  
Reviewed by: Edward Capriolo (ecapriolo: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514277)
* 
/hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/HiveHarFileSystem.java
* 
/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HiveHarFileSystem.java


> Shim class HiveHarFileSystem does not have a hadoop2 counterpart
> 
>
> Key: HIVE-4999
> URL: https://issues.apache.org/jira/browse/HIVE-4999
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Brock Noland
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-4999.patch
>
>
> HiveHarFileSystem only exists in the 0.20 shim.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5090) Remove unwanted file from the trunk.

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742140#comment-13742140
 ] 

Hudson commented on HIVE-5090:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2271 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2271/])
HIVE-5090 - Remove unwanted file from the trunk (Ashutosh Chauhan via Brock 
Noland) (brock: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514437)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.orig


> Remove unwanted file from the trunk.
> 
>
> Key: HIVE-5090
> URL: https://issues.apache.org/jira/browse/HIVE-5090
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>
> Seems like 
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java.orig got 
> accidentally checked in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5055) SessionState temp file gets created in history file directory

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742143#comment-13742143
 ] 

Hudson commented on HIVE-5055:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2271 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2271/])
HIVE-5055 SessionState temp file gets created in history directory (Hari Sankar 
Sivarama Subramaniyan via egc)

Submitted by: Hari Sankar Sivarama Subramaniyan 
Reviewed by: Edward Guy Capriolo (ecapriolo: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514284)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java


> SessionState temp file gets created in history file directory
> -
>
> Key: HIVE-5055
> URL: https://issues.apache.org/jira/browse/HIVE-5055
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Thejas M Nair
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 0.12.0
>
> Attachments: HIVE-5055.1.patch.txt, HIVE-5055.2.patch.txt
>
>
> SessionState.start creates a temp file for temp results, but this file is 
> created in hive.querylog.location, which is supposed to be used only for 
> Hive history log files.
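A minimal sketch of the separation the fix aims for, keeping transient query results out of the directory reserved for history files. The .pipeout suffix is an assumption based on SessionState's temp-results file, and the scratch-directory choice below is illustrative; this is not the actual patch.

{noformat}
import java.io.File;

public class TempFileSketch {
  public static void main(String[] args) throws Exception {
    String historyDir = System.getProperty("hive.querylog.location", "/tmp/hive");
    String scratchDir = System.getProperty("java.io.tmpdir");
    // Wrong (the reported behaviour): the temp-results file lands in the
    // directory meant only for history logs:
    //   File.createTempFile("hive", ".pipeout", new File(historyDir));
    // Right: transient results belong in a scratch directory instead.
    File tmp = File.createTempFile("hive", ".pipeout", new File(scratchDir));
    tmp.deleteOnExit();
    System.out.println("temp results: " + tmp + "; history stays in " + historyDir);
  }
}
{noformat}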

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4246) Implement predicate pushdown for ORC

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742144#comment-13742144
 ] 

Hudson commented on HIVE-4246:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2271 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2271/])
HIVE-4246: Implement predicate pushdown for ORC (Owen O'Malley via Gunther 
Hagleitner) (gunther: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514438)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/BitFieldReader.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/InStream.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RunLengthByteReader.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestBitFieldReader.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestBitPack.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInStream.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestIntegerCompressionReader.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRunLengthByteReader.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRunLengthIntegerReader.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java
* /hive/trunk/ql/src/test/results/compiler/plan/case_sensitivity.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/cast1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/groupby1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/groupby2.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/groupby3.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/groupby4.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/groupby5.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/groupby6.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input2.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input20.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input3.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input4.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input5.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input6.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input7.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input8.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input9.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input_part1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input_testsequencefile.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input_testxpath.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/input_testxpath2.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join2.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join3.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join4.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join5.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join6.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join7.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/join8.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/sample1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/sample2.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/sample3.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/sample4.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/sample5.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/sample6.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/sample7.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/subq.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/udf1.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/udf4.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/udf6.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/udf_case.q.xml
* /hive/trunk/ql/src/test/results/compiler/plan/udf_when.q.xml
* /hive/tr

[jira] [Commented] (HIVE-4611) SMB joins fail based on bigtable selection policy.

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742141#comment-13742141
 ] 

Hudson commented on HIVE-4611:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2271 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2271/])
HIVE-4611 : SMB joins fail based on bigtable selection policy. (Vikram Dixit K 
via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514530)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AvgPartitionSizeBasedBigTableSelectorForAutoSMJ.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/BigTableSelectorForAutoSMJ.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/LeftmostBigTableSelectorForAutoSMJ.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/TableSizeBasedBigTableSelectorForAutoSMJ.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationOptimizer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinTaskDispatcher.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java
* /hive/trunk/ql/src/test/queries/clientnegative/auto_sortmerge_join_1.q
* /hive/trunk/ql/src/test/queries/clientpositive/auto_sortmerge_join_15.q
* /hive/trunk/ql/src/test/results/clientnegative/auto_sortmerge_join_1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/smb_bucketmapjoin.q.out
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_15.q.out


> SMB joins fail based on bigtable selection policy.
> --
>
> Key: HIVE-4611
> URL: https://issues.apache.org/jira/browse/HIVE-4611
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.11.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 0.12.0
>
> Attachments: HIVE-4611.2.patch, HIVE-4611.3.patch, HIVE-4611.4.patch, 
> HIVE-4611.5.patch.txt, HIVE-4611.6.patch.txt, HIVE-4611.patch
>
>
> The default setting for 
> hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the 
> big table as the one with the largest average partition size. However, this 
> can result in a query failing because this policy conflicts with the big 
> table candidates chosen for outer joins. This policy should just be a 
> tie-breaker and not have the ultimate say in the choice of tables.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5019) Use StringBuffer instead of += (issue 1)

2013-08-16 Thread Benjamin Jakobus (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Jakobus updated HIVE-5019:
---

Attachment: HIVE-5019.3.patch.txt

> Use StringBuffer instead of += (issue 1)
> 
>
> Key: HIVE-5019
> URL: https://issues.apache.org/jira/browse/HIVE-5019
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Benjamin Jakobus
>Assignee: Benjamin Jakobus
> Fix For: 0.12.0
>
> Attachments: HIVE-5019.2.patch.txt, HIVE-5019.3.patch.txt
>
>
> Issue 1 - use of StringBuilder over += inside loops. 
> java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
> java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
> java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java
> java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
> java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
> java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java
> java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java
> java/org/apache/hadoop/hive/ql/udf/UDFLike.java
> java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
> java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java
> java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java
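For context: the issue title says StringBuffer, but the body and the release note later in this thread use StringBuilder, the unsynchronized variant that is preferable in single-threaded code. A self-contained before/after sketch of the pattern being replaced:

{noformat}
public class ConcatInLoop {
  // Quadratic: each += copies the entire accumulated string into a new one.
  static String slow(String[] parts) {
    String s = "";
    for (String p : parts) {
      s += p;
    }
    return s;
  }

  // Linear: StringBuilder appends into a growable buffer and builds one
  // String at the end.
  static String fast(String[] parts) {
    StringBuilder sb = new StringBuilder();
    for (String p : parts) {
      sb.append(p);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String[] parts = {"a", "b", "c"};
    System.out.println(slow(parts).equals(fast(parts)));  // true, but fast scales
  }
}
{noformat}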

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5019) Use StringBuffer instead of += (issue 1)

2013-08-16 Thread Benjamin Jakobus (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Jakobus updated HIVE-5019:
---

Release Note: Replaced string concatenation inside loops with StringBuilder
  Status: Patch Available  (was: Open)

> Use StringBuffer instead of += (issue 1)
> 
>
> Key: HIVE-5019
> URL: https://issues.apache.org/jira/browse/HIVE-5019
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Benjamin Jakobus
>Assignee: Benjamin Jakobus
> Fix For: 0.12.0
>
> Attachments: HIVE-5019.2.patch.txt, HIVE-5019.3.patch.txt
>
>
> Issue 1 - use of StringBuilder over += inside loops. 
> java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
> java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
> java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java
> java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
> java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
> java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java
> java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java
> java/org/apache/hadoop/hive/ql/udf/UDFLike.java
> java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
> java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java
> java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5019) Use StringBuffer instead of += (issue 1)

2013-08-16 Thread Benjamin Jakobus (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Jakobus updated HIVE-5019:
---

Attachment: (was: HIVE-5019.3.patch.txt)

> Use StringBuffer instead of += (issue 1)
> 
>
> Key: HIVE-5019
> URL: https://issues.apache.org/jira/browse/HIVE-5019
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Benjamin Jakobus
>Assignee: Benjamin Jakobus
> Fix For: 0.12.0
>
> Attachments: HIVE-5019.2.patch.txt, HIVE-5019.3.patch.txt
>
>
> Issue 1 - use of StringBuilder over += inside loops. 
> java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
> java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
> java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java
> java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
> java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
> java/org/apache/hadoop/hive/ql/security/authorization/BitSetCheckedAuthorizationProvider.java
> java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsUtils.java
> java/org/apache/hadoop/hive/ql/udf/UDFLike.java
> java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
> java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java
> java/org/apache/hadoop/hive/ql/udf/ptf/NPath.java

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5069) Tests on list bucketing are failing again in hadoop2

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742153#comment-13742153
 ] 

Hudson commented on HIVE-5069:
--

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #60 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/60/])
HIVE-5069 : Tests on list bucketing are failing again in hadoop2 (Sergey 
Shelukhin via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514568)
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


> Tests on list bucketing are failing again in hadoop2
> 
>
> Key: HIVE-5069
> URL: https://issues.apache.org/jira/browse/HIVE-5069
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Navis
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-5069.D12201.1.patch, HIVE-5069.D12243.1.patch
>
>
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4583) Make Hive compile and run with JDK7

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742155#comment-13742155
 ] 

Hudson commented on HIVE-4583:
--

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #60 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/60/])
HIVE-4583 : HCatalog test TestPigHCatUtil might fail on JDK7 (Jarek Jarcec 
Cecho via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514567)
* 
/hive/trunk/hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestPigHCatUtil.java


> Make Hive compile and run with JDK7
> ---
>
> Key: HIVE-4583
> URL: https://issues.apache.org/jira/browse/HIVE-4583
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> This is an umbrella ticket to cover all issues related to supporting JDK 7 in 
> Hive. Many such issues are expected. Of course, JDK 6 needs to continue to 
> work.
> The major obstacles on the way to supporting JDK7 are:
> 1. The JDBC component needs to be upgraded because of JDBC interface changes 
> in JDK7.
> 2. DataNucleus needs to be upgraded because the current version doesn't 
> support JDK7.
> 3. Many test failures need to be fixed. The majority of the failures are 
> caused by subtle JDK7 behaviour changes.
> 4. The build needs to be changed to accommodate the JDK7 compiler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5048) StorageBasedAuthorization provider causes an NPE when asked to authorize from client side.

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742154#comment-13742154
 ] 

Hudson commented on HIVE-5048:
--

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #60 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/60/])
HIVE-5048 : StorageBasedAuthorization provider causes an NPE when asked to 
authorize from client side. (Sushanth Sowmyan via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514569)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/HiveAuthorizationProviderBase.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java


> StorageBasedAuthorization provider causes an NPE when asked to authorize from 
> client side.
> --
>
> Key: HIVE-5048
> URL: https://issues.apache.org/jira/browse/HIVE-5048
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Fix For: 0.12.0
>
> Attachments: HIVE-5048.2.patch, HIVE-5048.patch
>
>
> StorageBasedAuthorizationProvider (henceforth referred to as SBAP) is a 
> HiveMetastoreAuthorizationProvider (henceforth referred to as HMAP, and 
> HiveAuthorizationProvider as HAP) that was introduced as part of HIVE-3705.
> As long as it's used as an HMAP, i.e. from the metastore side, as was its 
> initial implementation intent, everything's great. However, HMAP extends HAP, 
> and there is no reason SBAP shouldn't be expected to work as a HAP as well. 
> However, it uses a wh variable that is never initialized if it is called as a 
> HAP, and hence it will always fail when authorize is called on it.
> We should change SBAP so that it correctly initializes wh and can thus be run 
> as a HAP as well.
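A hypothetical sketch of the failure mode and of the fix in spirit (initialize on first use rather than assuming the metastore-side startup path ran). All names below are made up for illustration; the actual patch touches the files listed in the commit message above.

{noformat}
public class LazyInitSketch {
  static class Warehouse {
    boolean isWritable(String path) { return true; }
  }

  static class Provider {
    private Warehouse wh;                    // stays null on the client-side path

    void serverSideInit() { wh = new Warehouse(); }

    void authorize(String path) {
      // Without this guard, authorize() NPEs whenever the provider is used
      // as a plain HAP, i.e. when serverSideInit() never ran.
      if (wh == null) {
        wh = new Warehouse();
      }
      if (!wh.isWritable(path)) {
        throw new SecurityException("not authorized: " + path);
      }
    }
  }

  public static void main(String[] args) {
    new Provider().authorize("/warehouse/t");  // no NPE even without server init
  }
}
{noformat}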

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-5109) could not get an exists partition

2013-08-16 Thread mylinyuzhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mylinyuzhi resolved HIVE-5109.
--

Resolution: Not A Problem

> could not get an exists partition
> -
>
> Key: HIVE-5109
> URL: https://issues.apache.org/jira/browse/HIVE-5109
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.11.0
> Environment: ubuntu12.04
>Reporter: mylinyuzhi
>Priority: Trivial
>
> The partition [dt=20130805] exists, but it could not be retrieved.
> 20130816 17:21:31:709[ERROR] 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler 
> NoSuchObjectException(message:partition values=[20130805])
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionWithAuth(ObjectStore.java:1424)
>   at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
>   at $Proxy8.getPartitionWithAuth(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth(HiveMetaStore.java:2014)
>   at sun.reflect.GeneratedMethodAccessor160.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at $Proxy9.get_partition_with_auth(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:822)
>   at sun.reflect.GeneratedMethodAccessor159.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
>   at $Proxy10.getPartitionWithAuthInfo(Unknown Source)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1580)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1543)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:778)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:678)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1203)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1057)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8295)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:443)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:344)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5109) could not get an exists partition

2013-08-16 Thread mylinyuzhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mylinyuzhi updated HIVE-5109:
-

Priority: Trivial  (was: Critical)

> could not get an exists partition
> -
>
> Key: HIVE-5109
> URL: https://issues.apache.org/jira/browse/HIVE-5109
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.11.0
> Environment: ubuntu12.04
>Reporter: mylinyuzhi
>Priority: Trivial
>
> The partition [dt=20130805] exists, but it could not be retrieved.
> 20130816 17:21:31:709[ERROR] 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler 
> NoSuchObjectException(message:partition values=[20130805])
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionWithAuth(ObjectStore.java:1424)
>   at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
>   at $Proxy8.getPartitionWithAuth(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth(HiveMetaStore.java:2014)
>   at sun.reflect.GeneratedMethodAccessor160.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
>   at $Proxy9.get_partition_with_auth(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:822)
>   at sun.reflect.GeneratedMethodAccessor159.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
>   at $Proxy10.getPartitionWithAuthInfo(Unknown Source)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1580)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1543)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:778)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:678)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1203)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1057)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8295)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:443)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:344)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5109) could not get an exists partition

2013-08-16 Thread mylinyuzhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mylinyuzhi updated HIVE-5109:
-

Description: 
Sorry, I made a mistake.

The partition [dt=20130805] exists, but it could not be retrieved.

20130816 17:21:31:709[ERROR] 
org.apache.hadoop.hive.metastore.RetryingHMSHandler 
NoSuchObjectException(message:partition values=[20130805])
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionWithAuth(ObjectStore.java:1424)
at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy8.getPartitionWithAuth(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth(HiveMetaStore.java:2014)
at sun.reflect.GeneratedMethodAccessor160.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
at $Proxy9.get_partition_with_auth(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:822)
at sun.reflect.GeneratedMethodAccessor159.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
at $Proxy10.getPartitionWithAuthInfo(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1580)
at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1543)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:778)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:678)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1203)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1057)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8295)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:443)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:344)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)

  was:
The partition [dt=20130805] exists, but it could not be retrieved.

20130816 17:21:31:709[ERROR] 
org.apache.hadoop.hive.metastore.RetryingHMSHandler 
NoSuchObjectException(message:partition values=[20130805])
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionWithAuth(ObjectStore.java:1424)
at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy8.getPartitionWithAuth(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition_with_auth(HiveMetaStore.java:2014)
at sun.reflect.GeneratedMethodAccessor160.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:102)
at $Proxy9.get_partition_with_auth(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartitionWithAuthInfo(HiveMetaStoreClient.java:822)
at sun.reflect.GeneratedMethodAccessor159.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
at $Proxy10.getPartitionWithAuthInfo(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1580)
at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1543)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:778)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec

[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability

2013-08-16 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742184#comment-13742184
 ] 

Brock Noland commented on HIVE-4838:


Done, looks like the last build had a connection error to source control.

> Refactor MapJoin HashMap code to improve testability and readability
> 
>
> Key: HIVE-4838
> URL: https://issues.apache.org/jira/browse/HIVE-4838
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, 
> HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch
>
>
> MapJoin is an essential component for high-performance joins in Hive, and the 
> current code has done great service for many years. However, the code is 
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization 
> metadata to the Key and Row classes.
> * The API of a logical "Table Container" is not defined, and therefore it's 
> unclear which APIs HashMapWrapper needs to publicize. Additionally, 
> HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and 
> implement the table container. Ideally these logical units could be separated.
> * HashTableSinkObjectCtx has unused fields and unused methods.
> * CommonJoinOperator and its children use ArrayList on the left-hand side 
> when only List is required.
> * There are unused classes (MRU, DCLLItem) and classes which duplicate 
> functionality (MapJoinSingleKey and MapJoinDoubleKeys).
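On the second bullet, a hypothetical sketch of what a minimal "table container" contract could look like, purely to illustrate the API boundary the description says is undefined; it is not the interface the refactor actually introduced.

{noformat}
import java.util.HashMap;
import java.util.Map;

public class MapJoinContainerSketch {
  // The small (hash) side of a map join, reduced to four operations.
  interface TableContainer<K, V> {
    void put(K key, V rows);   // load phase: insert hashed small-table rows
    V get(K key);              // probe phase: look up per big-table row
    int size();
    void clear();              // release memory once the join finishes
  }

  static final class HashMapContainer<K, V> implements TableContainer<K, V> {
    private final Map<K, V> map = new HashMap<K, V>();
    public void put(K key, V rows) { map.put(key, rows); }
    public V get(K key) { return map.get(key); }
    public int size() { return map.size(); }
    public void clear() { map.clear(); }
  }

  public static void main(String[] args) {
    TableContainer<String, String> tc = new HashMapContainer<String, String>();
    tc.put("k1", "row-a");
    System.out.println(tc.get("k1"));   // probe hits the loaded row
  }
}
{noformat}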

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Tez branch and tez based patches

2013-08-16 Thread Edward Capriolo
I still am not sure we are doing this the ideal way. I am not a believer in
a commit-then-review branch.

This issue is an example.

https://issues.apache.org/jira/browse/HIVE-5108

I ask myself these questions:
Does this currently work? Are there tests? If so, which ones are broken? How
does the patch fix them without tests to validate?

Having a commit-then-review branch just seems subversive to our normal
process, and a quick shortcut to avoid being bothered with writing tests or
involving anyone else.



On Mon, Aug 5, 2013 at 1:54 PM, Alan Gates  wrote:

>
> On Jul 29, 2013, at 9:53 PM, Edward Capriolo wrote:
>
> > Also watched http://www.ustream.tv/recorded/36323173
> >
> > I definitely see the win in being able to stream inter-stage output.
> >
> > I see some cases where small intermediate results can be kept "In
> memory".
> > But I was somewhat under the impression that the map reduce spill
> settings
> > kept stuff in memory, isn't that what spill settings are?
>
> No.  MapReduce always writes shuffle data to local disk.  And intermediate
> results between MR jobs are always persisted to HDFS, as there's no other
> option.  When we talk of being able to keep intermediate results in memory
> we mean getting rid of both of these disk writes/reads when appropriate
> (meaning not always, there's a trade off between speed and error handling
> to be made here, see below for more details).
>
> >
> > There is a few bullet points that came up repeatedly that I do not
> follow:
> >
> > Something was said to the effect of "Container reuse makes X faster".
> > Hadoop has jvm reuse. Not following what the difference is here? Not
> > everyone has a 10K node cluster.
>
> Sharing JVMs across users is inherently insecure (we can't guarantee what
> code the first user left behind that may interfere with later users).  As I
> understand container re-use in Tez it constrains the re-use to one user for
> security reasons, but still avoids additional JVM start up costs.  But this
> is a question that the Tez guys could answer better on the Tez lists (
> d...@tez.incubator.apache.org)
>
> >
> > "Joins in map reduce are hard" Really? I mean some of them are I guess,
> but
> > the typical join is very easy. Just shuffle by the join key. There was
> not
> > really enough low level details here saying why joins are better in tez.
>
> Join is not a natural operation in MapReduce.  MR gives you one input and
> one output.  You end up having to bend the rules to do have multiple
> inputs.  The idea here is that Tez can provide operators that naturally
> work with joins and other operations that don't fit the one input/one
> output model (eg unions, etc.).
>
> >
> > "Chosing the number of maps and reduces is hard" Really? I do not find it
> > that hard, I think there are times when it's not perfect but I do not
> find
> > it hard. The talk did not really offer anything here technical on how tez
> > makes this better other then it could make it better.
>
> Perhaps manual would be a better term here than hard.  In our experience
> it takes quite a bit of engineer trial and error to determine the optimal
> numbers.  This may be ok if you're going to invest the time once and then
> run the same query every day for 6 months.  But obviously it doesn't work
> for the ad hoc case.  Even in the batch case it's not optimal because every
> once in a while an engineer has to go back and re-optimize the query to
> deal with changing data sizes, data characteristics, etc.  We want the
> optimizer to handle this without human intervention.
>
> >
> > The presentations mentioned streaming data, how do two nodes stream data
> > between a tasks and how it it reliable? If the sender or receiver dies
> does
> > the entire process have to start again?
>
> If the sender or receiver dies then the query has to be restarted from
> some previous point where data was persisted to disk.  The idea here is
> that speed vs error recovery trade offs should be made by the optimizer.
>  If the optimizer estimates that a query will complete in 5 seconds it can
> stream everything and if a node fails it just re-runs the whole query.  If
> it estimates that a particular phase of a query will run for an hour it can
> choose to persist the results to HDFS so that in the event of a failure
> downstream the long phase need not be re-run.  Again we want this to be
> done automatically by the system so the user doesn't need to control this
> level of detail.
>
> >
> > Again one of the talks implied there is a prototype out there that
> launches
> > hive jobs into tez. I would like to see that, it might answer more
> > questions then a power point, and I could profile some common queries.
>
> As mentioned in a previous email afaik Gunther's pushed all these changes
> to the Tez branch in Hive.
>
> Alan.
>
> >
> > Random late night thoughts over,
> > Ed
> >
> >
> >
> >
> >
> >
> > On Tue, Jul 30, 2013 at 12:02 AM, Edward Capriolo  >wrote:
> >
> >> At ~25:00
> >>
> >> "T

[jira] [Commented] (HIVE-4925) Modify Hive build to enable compiling and running Hive with JDK7

2013-08-16 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742195#comment-13742195
 ] 

Brock Noland commented on HIVE-4925:


Yeah, but if you look at the build files, the property set by javac.version 
is placed in the source and target JVM versions. AFAIK that means if you 
don't set -Djavac.version=1.7 on the command line, you will be compiling and 
running with JDK7, but the generated code will target JDK6. For example:

https://github.com/apache/hive/blob/trunk/build-common.xml#L301

> Modify Hive build to enable compiling and running Hive with JDK7
> 
>
> Key: HIVE-4925
> URL: https://issues.apache.org/jira/browse/HIVE-4925
> Project: Hive
>  Issue Type: Sub-task
>  Components: Build Infrastructure
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1511) Hive plan serialization is slow

2013-08-16 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-1511:
---

Attachment: HIVE-1511.4.patch

The patch was named incorrectly for the precommit tests to pick it up; 
renamed as HIVE-1511.4.patch.

> Hive plan serialization is slow
> ---
>
> Key: HIVE-1511
> URL: https://issues.apache.org/jira/browse/HIVE-1511
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Ning Zhang
>Assignee: Mohammad Kamrul Islam
> Attachments: HIVE-1511.4.patch, HIVE-1511.patch, 
> HIVE-1511-wip2.patch, HIVE-1511-wip3.patch, HIVE-1511-wip4.patch, 
> HIVE-1511-wip.patch
>
>
> As reported by Edward Capriolo:
> For reference I did this as a test case
> SELECT * FROM src where
> key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
> OR key=0 OR key=0 OR key=0 OR
> key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
> OR key=0 OR key=0 OR key=0 OR
> ...(100 more of these)
> No OOM but I gave up after the test case did not go anywhere for about
> 2 minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4583) Make Hive compile and run with JDK7

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742224#comment-13742224
 ] 

Hudson commented on HIVE-4583:
--

FAILURE: Integrated in Hive-trunk-hadoop1-ptest #129 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/129/])
HIVE-4583 : HCatalog test TestPigHCatUtil might fail on JDK7 (Jarek Jarcec 
Cecho via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514567)
* 
/hive/trunk/hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestPigHCatUtil.java


> Make Hive compile and run with JDK7
> ---
>
> Key: HIVE-4583
> URL: https://issues.apache.org/jira/browse/HIVE-4583
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> This is an umbrella ticket to cover all issues related to supporting JDK 7 in 
> HIVE. Many such issues are expected. Of course, JDK 6 needs to continue to 
> work.
> The major obstacles on the way to supporting JDK7 are:
> 1. The JDBC component needs to be upgraded because of JDBC interface changes in 
> JDK7.
> 2. DataNucleus needs to be upgraded because the current version doesn't 
> support JDK7.
> 3. Many test failures need to be fixed. The majority of the failures are 
> caused by subtle JDK7 behaviour changes.
> 4. Build needs to be changed to accommodate JDK7 compiler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5069) Tests on list bucketing are failing again in hadoop2

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1374#comment-1374
 ] 

Hudson commented on HIVE-5069:
--

FAILURE: Integrated in Hive-trunk-hadoop1-ptest #129 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/129/])
HIVE-5069 : Tests on list bucketing are failing again in hadoop2 (Sergey 
Shelukhin via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514568)
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


> Tests on list bucketing are failing again in hadoop2
> 
>
> Key: HIVE-5069
> URL: https://issues.apache.org/jira/browse/HIVE-5069
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Navis
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-5069.D12201.1.patch, HIVE-5069.D12243.1.patch
>
>
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5048) StorageBasedAuthorization provider causes an NPE when asked to authorize from client side.

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742223#comment-13742223
 ] 

Hudson commented on HIVE-5048:
--

FAILURE: Integrated in Hive-trunk-hadoop1-ptest #129 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/129/])
HIVE-5048 : StorageBasedAuthorization provider causes an NPE when asked to 
authorize from client side. (Sushanth Sowmyan via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514569)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/HiveAuthorizationProviderBase.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java


> StorageBasedAuthorization provider causes an NPE when asked to authorize from 
> client side.
> --
>
> Key: HIVE-5048
> URL: https://issues.apache.org/jira/browse/HIVE-5048
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Fix For: 0.12.0
>
> Attachments: HIVE-5048.2.patch, HIVE-5048.patch
>
>
> StorageBasedAuthorizationProvider(henceforth referred to as SBAP) is a 
> HiveMetastoreAuthorizationProvider (henceforth referred to as HMAP, and 
> HiveAuthorizationProvider as HAP) that was introduced as part of HIVE-3705.
> As long as it's used as a HMAP, i.e. from the metastore-side, as was its 
> initial implementation intent, everything's great. However, HMAP extends HAP, 
> and there is no reason SBAP shouldn't be expected to work as a HAP as well. 
> However, it uses a wh variable that is never initialized if it is called as a 
> HAP, and hence, it will always fail when authorize is called on it.
> We should change SBAP so that it correctly initializes wh so that it can be run 
> as a HAP as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4925) Modify Hive build to enable compiling and running Hive with JDK7

2013-08-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742248#comment-13742248
 ] 

Xuefu Zhang commented on HIVE-4925:
---

Good catch. I was searching for java.version.

I'm not sure why we have that in that target (and a couple of others). It 
doesn't seem to make sense to have it in only part of the build. Regardless, 
we have two choices:

1. Do nothing; people just need to be aware of passing -Djavac.version on the command line when using JDK7.
2. Get rid of javac.version, so the source/target version will default to that 
of the compiler.

Thoughts?





> Modify Hive build to enable compiling and running Hive with JDK7
> 
>
> Key: HIVE-4925
> URL: https://issues.apache.org/jira/browse/HIVE-4925
> Project: Hive
>  Issue Type: Sub-task
>  Components: Build Infrastructure
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4925) Modify Hive build to enable compiling and running Hive with JDK7

2013-08-16 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742254#comment-13742254
 ] 

Brock Noland commented on HIVE-4925:


I don't think we should get rid of javac.version because at some point we'll 
want to use JDK8. It's much easier to use -Djavac.version than to switch out 
your JVM. If there are areas that are not using source/target I think we should 
fix that.

> Modify Hive build to enable compiling and running Hive with JDK7
> 
>
> Key: HIVE-4925
> URL: https://issues.apache.org/jira/browse/HIVE-4925
> Project: Hive
>  Issue Type: Sub-task
>  Components: Build Infrastructure
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4925) Modify Hive build to enable compiling and running Hive with JDK7

2013-08-16 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742256#comment-13742256
 ] 

Ashutosh Chauhan commented on HIVE-4925:


I agree it's an unnecessary detail which every dev has to know, and it is confusing at 
best. I vote for option 2 as well.

> Modify Hive build to enable compiling and running Hive with JDK7
> 
>
> Key: HIVE-4925
> URL: https://issues.apache.org/jira/browse/HIVE-4925
> Project: Hive
>  Issue Type: Sub-task
>  Components: Build Infrastructure
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4925) Modify Hive build to enable compiling and running Hive with JDK7

2013-08-16 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742257#comment-13742257
 ] 

Brock Noland commented on HIVE-4925:


I chatted with [~jarcec] about this a while back. Do you have thoughts, Jarcec?

> Modify Hive build to enable compiling and running Hive with JDK7
> 
>
> Key: HIVE-4925
> URL: https://issues.apache.org/jira/browse/HIVE-4925
> Project: Hive
>  Issue Type: Sub-task
>  Components: Build Infrastructure
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: threading with hive client

2013-08-16 Thread Kristopher Glover
One more question. If the SemanticAnalyzer isn't fully thread safe could
you provide any pointers as to why it may not be thread safe? It's a 9000
line file so any hints as to where to get started would be much
appreciated. I don't see anything very obvious like globally shared member
variables, so I'm guessing it's more subtle than that.

Thanks,
Kris

On 8/15/13 5:29 PM, "Kristopher Glover"  wrote:

>Thanks for all the great insight. I'll poke around a little more to see
>if I could at least start documenting the changes required to make
>everything thread safe as well as remove the synchronization.
>
>@Xuefu-
>I completely understand your points, I was just trying to figure out if
>there was a specific functional reason for making them public when there
>was a known vulnerability. For instance, why not synchronize the compile
>method itself instead of relying on external synchronization? From the
>sound of it there were no specific reasons, other than that no one has gotten
>around to making the improvements yet. Maybe it'll be something I can
>contribute back.
>
>Thanks again,
>Kris
>
>Xuefu Zhang wrote:
>To add,
>
>1. Being public doesn't necessarily guarantee thread-safety. Of course,
>this is no excuse for not documenting thread-safety.
>2. Sometimes a method is made public for testing, which is bad in my
>opinion, but I have seen many instances like this before.
>
>--Xuefu
>
>
>
>On Thu, Aug 15, 2013 at 1:11 PM, Brock Noland  wrote:
>
>> Well you would have probably found the areas we need to fix! :) The hive
>> source is not strict about methods and member visibility. The good
>>news
>> is that we have been making significant improvements in this aspect.
>>
>> Brock
>>
>>
>> On Thu, Aug 15, 2013 at 2:55 PM, Kristopher Glover wrote:
>>
>> > Interesting, I didn't realize that. If that's the case then I suppose
>> it'd
>> > be really bad for me to circumvent the lock by reproducing the
>>Driver#run
>> > method by calling Driver#compile and Driver#execute directly from
>>within
>> > my app.
>> >
>> > If that is the case why make Driver#compile and Driver#execute public
>> > methods? There doesn't seem to be any inheritance that requires them
>>to
>> be
>> > public and the fact that they are public opens up a thread safety
>>issue.
>> >
>> > Thanks,
>> > Kris
>> >
>> > On 8/15/13 1:11 PM, "Brock Noland"  wrote:
>> >
>> > >The hive semantic analyzer is not fully thread safe.  We'd like to
>> remove
>> > >that lock but it will be a large project.
>> > >
>> > >Brock
>> > >
>> > >
>> > >On Thu, Aug 15, 2013 at 11:12 AM, Kristopher Glover
>> > >wrote:
>> > >
>> > >> Hi Everyone,
>> > >>
>> > >> I'm experiencing a threading issue with the Hive client where I
>>want
>> to
>> > >> run multiple queries on the same JVM.
>> > >>
>> > >>  The problem I'm having is that
>>org.apache.hadoop.hive.ql.Driver#run
>> > >>(line
>> > >> 907)  has the following few lines of code :
>> > >>
>> > >>  synchronized (compileMonitor) {
>> > >>
>> > >>   ret = compile(command);
>> > >>
>> > >> }
>> > >>
>> > >>
>> > >> The compileMonitor is static, so it blocks all threads even though
>> I'm
>> > >> using different instances of the Driver class. I could explicitly
>>call
>> > >> Driver#compile then Driver#execute to avoid the synchronized block
>> but I
>> > >> don't know if it's serving a special purpose. Does anyone know why
>> that
>> > >> synchronized block is there and if it's really necessary?
>> > >>
>> > >>
>> > >> Thanks,
>> > >>
>> > >> Kris
>> > >>
>> > >
>> > >
>> > >
>> > >--
>> > >Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>> >
>> >
>>
>>
>> --
>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>>



Re: threading with hive client

2013-08-16 Thread Brock Noland
https://issues.apache.org/jira/browse/HIVE-4239
https://issues.apache.org/jira/browse/HIVE-80

I would guess this comment on HIVE-80 is still applicable:

"there may still be some thread-unsafe code, but no one knows for sure.
Given that, the only approach may be to do as much review as possible (e.g.
grep for statics that shouldn't be there), ask everyone to add any known
issues here, and then set up a testbed and see what turns up."
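
To make the issue concrete, here is a minimal, self-contained sketch (not
Hive's actual Driver class, just the same locking shape) of why a static
monitor serializes compilation across Driver instances and threads:

public class CompileMonitorDemo {
  // static: shared by ALL instances, so compilation is serialized JVM-wide
  private static final Object compileMonitor = new Object();

  private int compile(String command) throws InterruptedException {
    Thread.sleep(1000); // stand-in for expensive semantic analysis
    return 0;
  }

  public int run(String command) throws InterruptedException {
    synchronized (compileMonitor) { // blocks threads using *other* instances too
      return compile(command);
    }
  }

  public static void main(String[] args) {
    // Two threads, two separate instances: the second still waits ~1s.
    for (int i = 0; i < 2; i++) {
      new Thread(() -> {
        try {
          long start = System.nanoTime();
          new CompileMonitorDemo().run("SELECT 1");
          System.out.println("compiled after "
              + (System.nanoTime() - start) / 1_000_000 + " ms");
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }).start();
    }
  }
}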


On Fri, Aug 16, 2013 at 9:28 AM, Kristopher Glover wrote:

> One more question. If the SemanticAnalyzer isn't fully thread safe could
> you provide any pointers as to why it may not be thread safe? It's a 9000
> line file so any hints as to where to get started would be much
> appreciated. I don't see anything very obvious like globally shared member
> variables, so I'm guessing it's more subtle than that.
>
> Thanks,
> Kris
>
> On 8/15/13 5:29 PM, "Kristopher Glover"  wrote:
>
> >Thanks for all the great insight. I'll poke around a little more to see
> >if I could at least start documenting the changes required to make
> >everything thread safe as well as remove the synchronization.
> >
> >@Xuefu-
> >I completely understand your points, I was just trying to figure out if
> >there was a specific functional reason for making them public when there
> >was a known vulnerability. For instance, why not synchronize the compile
> >method itself instead of relying on external synchronization? From the
> >sound of it there were no specific reasons, other than that no one has gotten
> >around to making the improvements yet. Maybe it'll be something I can
> >contribute back.
> >
> >Thanks again,
> >Kris
> >
> >Xuefu Zhang wrote:
> >To add,
> >
> >1. Being public doesn't necessarily guarantee thread-safety. Of course,
> >this is no excuse for not documenting thread-safety.
> >2. Sometimes a method is made public for testing, which is bad in my
> >opinion, but I have seen many instances like this before.
> >
> >--Xuefu
> >
> >
> >
> >On Thu, Aug 15, 2013 at 1:11 PM, Brock Noland  wrote:
> >
> >> Well you would have probably found the areas we need to fix! :) The hive
> >> source is not strict about methods and member visibility. The good
> >>news
> >> is that we have been making significant improvements in this aspect.
> >>
> >> Brock
> >>
> >>
> >> On Thu, Aug 15, 2013 at 2:55 PM, Kristopher Glover <
> kglo...@appnexus.com
> >> >wrote:
> >>
> >> > Interesting, I didn't realize that. If that's the case then I suppose
> >> it'd
> >> > be really bad for me to circumvent the lock by reproducing the
> >>Driver#run
> >> > method by calling Driver#compile and Driver#execute directly from
> >>within
> >> > my app.
> >> >
> >> > If that is the case why make Driver#compile and Driver#execute public
> >> > methods? There doesn't seem to be any inheritance that requires them
> >>to
> >> be
> >> > public and the fact that they are public opens up a thread safety
> >>issue.
> >> >
> >> > Thanks,
> >> > Kris
> >> >
> >> > On 8/15/13 1:11 PM, "Brock Noland"  wrote:
> >> >
> >> > >The hive semantic analyzer is not fully thread safe.  We'd like to
> >> remove
> >> > >that lock but it will be a large project.
> >> > >
> >> > >Brock
> >> > >
> >> > >
> >> > >On Thu, Aug 15, 2013 at 11:12 AM, Kristopher Glover
> >> > >wrote:
> >> > >
> >> > >> Hi Everyone,
> >> > >>
> >> > >> I'm experiencing a threading issue with the Hive client where I
> >>want
> >> to
> >> > >> run multiple queries on the same JVM.
> >> > >>
> >> > >>  The problem I'm having is that
> >>org.apache.hadoop.hive.ql.Driver#run
> >> > >>(line
> >> > >> 907)  has the following few lines of code :
> >> > >>
> >> > >>  synchronized (compileMonitor) {
> >> > >>
> >> > >>   ret = compile(command);
> >> > >>
> >> > >> }
> >> > >>
> >> > >>
> >> > >> The compileMonitor is static, so it blocks all threads even though
> >> I'm
> >> > >> using different instances of the Driver class. I could explicitly
> >>call
> >> > >> Driver#compile then Driver#execute to avoid the synchronized block
> >> but I
> >> > >> don't know if it's serving a special purpose. Does anyone know why
> >> that
> >> > >> synchronized block is there and if it's really necessary?
> >> > >>
> >> > >>
> >> > >> Thanks,
> >> > >>
> >> > >> Kris
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > >--
> >> > >Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
> >> >
> >> >
> >>
> >>
> >> --
> >> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
> >>
>
>


-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org


[jira] [Resolved] (HIVE-5089) Non query PreparedStatements are always failing on remote HiveServer2

2013-08-16 Thread Julien Letrouit (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Letrouit resolved HIVE-5089.
---

   Resolution: Fixed
Fix Version/s: 0.12.0

Fixed on trunk.

> Non query PreparedStatements are always failing on remote HiveServer2
> -
>
> Key: HIVE-5089
> URL: https://issues.apache.org/jira/browse/HIVE-5089
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.11.0
>Reporter: Julien Letrouit
> Fix For: 0.12.0
>
>
> This reproduces the issue systematically:
> {noformat}
> import org.apache.hive.jdbc.HiveDriver;
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.PreparedStatement;
> public class Main {
>   public static void main(String[] args) throws Exception {
> DriverManager.registerDriver(new HiveDriver());
> Connection conn = DriverManager.getConnection("jdbc:hive2://someserver");
> PreparedStatement smt = conn.prepareStatement("SET hivevar:test=1");
> smt.execute(); // Exception here
> conn.close();
>   }
> }
> {noformat}
> It is producing the following stacktrace:
> {noformat}
> Exception in thread "main" java.sql.SQLException: Could not create ResultSet: 
> null
>   at 
> org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:183)
>   at 
> org.apache.hive.jdbc.HiveQueryResultSet.<init>(HiveQueryResultSet.java:134)
>   at 
> org.apache.hive.jdbc.HiveQueryResultSet$Builder.build(HiveQueryResultSet.java:122)
>   at 
> org.apache.hive.jdbc.HivePreparedStatement.executeImmediate(HivePreparedStatement.java:194)
>   at 
> org.apache.hive.jdbc.HivePreparedStatement.execute(HivePreparedStatement.java:137)
>   at Main.main(Main.java:12)
> Caused by: org.apache.thrift.transport.TTransportException
>   at 
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>   at 
> org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
>   at 
> org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
>   at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
>   at 
> org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
>   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>   at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>   at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>   at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Client.recv_GetResultSetMetadata(TCLIService.java:466)
>   at 
> org.apache.hive.service.cli.thrift.TCLIService$Client.GetResultSetMetadata(TCLIService.java:453)
>   at 
> org.apache.hive.jdbc.HiveQueryResultSet.retrieveSchema(HiveQueryResultSet.java:154)
>   ... 5 more
> {noformat}
> I tried to fix it; unfortunately, the standalone server used in unit tests does 
> not reproduce the issue. The following test added to TestJdbcDriver2 is 
> passing:
> {noformat}
>   public void testNonQueryPrepareStatement() throws Exception {
> try {
>   PreparedStatement ps = con.prepareStatement("SET hivevar:test=1");
>   boolean hasResultSet = ps.execute();
>   assertTrue(hasResultSet);
>   ps.close();
> } catch (Exception e) {
>   e.printStackTrace();
>   fail(e.toString());
> }
>   }
> {noformat}
> Any guidance on how to reproduce it in tests would be appreciated.
> Impact: the data analysis tools we are using issue 
> PreparedStatements. The use of custom UDFs forces us to add 'ADD JAR ...' 
> and 'CREATE TEMPORARY FUNCTION ...' statements to our queries. Those statements 
> fail when executed as PreparedStatements.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Tez branch and tez based patches

2013-08-16 Thread Edward Capriolo
Commit-then-review, and self-commit, destroy the good things we get from
our normal system.

http://anna.gs/blog/2013/08/12/code-review-ftw/

I am most worried about silos of knowledge, lax testing policies, and
code quality, which I have now seen on several occasions when something is
happening in a branch. (not calling out the tez branch in particular)



On Fri, Aug 16, 2013 at 9:13 AM, Edward Capriolo wrote:

> I still am not sure we are doing this the ideal way. I am not a believer
> in a commit-then-review branch.
>
> This issue is an example.
>
> https://issues.apache.org/jira/browse/HIVE-5108
>
> I ask myself these questions:
> Does this currently work? Are there tests? If so, which ones are broken?
> How does the patch fix them without tests to validate?
>
> Having a commit-then-review branch just seems subversive to our normal
> process, and a quick shortcut to avoid the bother of writing tests
> or involving anyone else.
>
>
>
> On Mon, Aug 5, 2013 at 1:54 PM, Alan Gates  wrote:
>
>>
>> On Jul 29, 2013, at 9:53 PM, Edward Capriolo wrote:
>>
>> > Also watched http://www.ustream.tv/recorded/36323173
>> >
>> > I definitely see the win in being able to stream inter-stage output.
>> >
>> > I see some cases where small intermediate results can be kept "In
>> memory".
>> > But I was somewhat under the impression that the map reduce spill
>> settings
>> > kept stuff in memory, isn't that what spill settings are?
>>
>> No.  MapReduce always writes shuffle data to local disk.  And
>> intermediate results between MR jobs are always persisted to HDFS, as
>> there's no other option.  When we talk of being able to keep intermediate
>> results in memory we mean getting rid of both of these disk writes/reads
>> when appropriate (meaning not always, there's a trade off between speed and
>> error handling to be made here, see below for more details).
>>
>> >
>> > There is a few bullet points that came up repeatedly that I do not
>> follow:
>> >
>> > Something was said to the effect of "Container reuse makes X faster".
>> > Hadoop has jvm reuse. Not following what the difference is here? Not
>> > everyone has a 10K node cluster.
>>
>> Sharing JVMs across users is inherently insecure (we can't guarantee what
>> code the first user left behind that may interfere with later users).  As I
>> understand container re-use in Tez it constrains the re-use to one user for
>> security reasons, but still avoids additional JVM start up costs.  But this
>> is a question that the Tez guys could answer better on the Tez lists (
>> d...@tez.incubator.apache.org)
>>
>> >
>> > "Joins in map reduce are hard" Really? I mean some of them are I guess,
>> but
>> > the typical join is very easy. Just shuffle by the join key. There was
>> not
>> > really enough low level details here saying why joins are better in tez.
>>
>> Join is not a natural operation in MapReduce.  MR gives you one input and
>> one output.  You end up having to bend the rules to have multiple
>> inputs.  The idea here is that Tez can provide operators that naturally
>> work with joins and other operations that don't fit the one input/one
>> output model (eg unions, etc.).
>>
>> >
>> > "Chosing the number of maps and reduces is hard" Really? I do not find
>> it
>> > that hard, I think there are times when it's not perfect but I do not
>> find
>> > it hard. The talk did not really offer anything here technical on how
>> tez
>> > makes this better other than that it could make it better.
>>
>> Perhaps manual would be a better term here than hard.  In our experience
>> it takes quite a bit of engineering trial and error to determine the optimal
>> numbers.  This may be ok if you're going to invest the time once and then
>> run the same query every day for 6 months.  But obviously it doesn't work
>> for the ad hoc case.  Even in the batch case it's not optimal because every
>> once in a while an engineer has to go back and re-optimize the query to
>> deal with changing data sizes, data characteristics, etc.  We want the
>> optimizer to handle this without human intervention.
>>
>> >
>> > The presentations mentioned streaming data, how do two nodes stream data
>> > between tasks and how is it reliable? If the sender or receiver dies
>> does
>> > the entire process have to start again?
>>
>> If the sender or receiver dies then the query has to be restarted from
>> some previous point where data was persisted to disk.  The idea here is
>> that speed vs error recovery trade offs should be made by the optimizer.
>>  If the optimizer estimates that a query will complete in 5 seconds it can
>> stream everything and if a node fails it just re-runs the whole query.  If
>> it estimates that a particular phase of a query will run for an hour it can
>> choose to persist the results to HDFS so that in the event of a failure
>> downstream the long phase need not be re-run.  Again we want this to be
>> done automatically by the system so the user doesn't need to control this
>> leve

[jira] [Commented] (HIVE-5079) Make Hive compile under Windows

2013-08-16 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742293#comment-13742293
 ] 

Brock Noland commented on HIVE-5079:


Simpler:

tr -d '\r' 

> Make Hive compile under Windows
> ---
>
> Key: HIVE-5079
> URL: https://issues.apache.org/jira/browse/HIVE-5079
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.12.0
>
> Attachments: HIVE-5079-1.patch
>
>
> Hive compilation failed under Windows. Error message:
> {code}
> compile:
>  [echo] Project: common
>  [exec] D:\Program Files (x86)\GnuWin32\bin\xargs.exe: md5sum: No such 
> file
> or directory
>  [exec] md5sum: 
> ../serde/src/java/org/apache/hadoop/hive/serde2/io/Timesta:
> No such file or directory
> [javac] Compiling 25 source files to 
> D:\Users\Administrator\hive\build\common\classes
> [javac] 
> D:\Users\Administrator\hive\common\src\gen\org\apache\hive\common\package-info.java:4:
>  unclosed string literal
> [javac] @HiveVersionAnnotation(version="0.12.0-SNAPSHOT", 
> revision="80eadd8fa2af5eeba61f921318ab8b2c19980ab3", branch="trunk
> [javac]
>   ^
> [javac] 
> D:\Users\Administrator\hive\common\src\gen\org\apache\hive\common\package-info.java:5:
>  unclosed string literal
> [javac] ",
> [javac] ^
> [javac] 
> D:\Users\Administrator\hive\common\src\gen\org\apache\hive\common\package-info.java:6:
>  class, interface, or enum expected
> [javac]  user="Administrator
> [javac]  ^
> [javac] 
> D:\Users\Administrator\hive\common\src\gen\org\apache\hive\common\package-info.java:6:
>  unclosed string literal
> [javac]  user="Administrator
> [javac]   ^
> [javac] 
> D:\Users\Administrator\hive\common\src\gen\org\apache\hive\common\package-info.java:10:
>  unclosed string literal
> [javac] ",
> [javac] ^
> [javac] 
> D:\Users\Administrator\hive\common\src\gen\org\apache\hive\common\package-info.java:11:
>  unclosed string literal
> [javac]  
> srcChecksum="aadceb95c37a1704aaf19501f46f6e84
> [javac]  ^
> [javac] 
> D:\Users\Administrator\hive\common\src\gen\org\apache\hive\common\package-info.java:12:
>  unclosed string literal
> [javac] ")
> [javac] ^
> [javac] 7 errors
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability

2013-08-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742310#comment-13742310
 ] 

Hive QA commented on HIVE-4838:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12598209/HIVE-4838.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2884 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udtf_not_supported2
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/463/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/463/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

> Refactor MapJoin HashMap code to improve testability and readability
> 
>
> Key: HIVE-4838
> URL: https://issues.apache.org/jira/browse/HIVE-4838
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, 
> HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch
>
>
> MapJoin is an essential component for high performance joins in Hive and the 
> current code has done great service for many years. However, the code is 
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization 
> metadata to the Key, Row classes.
> * The api of a logical "Table Container" is not defined and therefore it's 
> unclear what apis HashMapWrapper 
> needs to publicize. Additionally, HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and 
> implement the table container. Ideally these logical units could be separated
> * HashTableSinkObjectCtx has unused fields and unused methods
> * CommonJoinOperator and children use ArrayList on the left hand side when only 
> List is required
> * There are unused classes MRU and DCLLItem, and classes which duplicate 
> functionality: MapJoinSingleKey and MapJoinDoubleKeys

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability

2013-08-16 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742316#comment-13742316
 ] 

Brock Noland commented on HIVE-4838:


That test has been failing since commit. I believe Gunther asked someone to 
look at it.

> Refactor MapJoin HashMap code to improve testability and readability
> 
>
> Key: HIVE-4838
> URL: https://issues.apache.org/jira/browse/HIVE-4838
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, 
> HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch
>
>
> MapJoin is an essential component for high performance joins in Hive and the 
> current code has done great service for many years. However, the code is 
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization 
> metadata to the Key, Row classes.
> * The api of a logical "Table Container" is not defined and therefore it's 
> unclear what apis HashMapWrapper 
> needs to publicize. Additionally, HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and 
> implement the table container. Ideally these logical units could be separated
> * HashTableSinkObjectCtx has unused fields and unused methods
> * CommonJoinOperator and children use ArrayList on the left hand side when only 
> List is required
> * There are unused classes MRU and DCLLItem, and classes which duplicate 
> functionality: MapJoinSingleKey and MapJoinDoubleKeys

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability

2013-08-16 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4838:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Brock, for this massive cleanup!

> Refactor MapJoin HashMap code to improve testability and readability
> 
>
> Key: HIVE-4838
> URL: https://issues.apache.org/jira/browse/HIVE-4838
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 0.12.0
>
> Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, 
> HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch
>
>
> MapJoin is an essential component for high performance joins in Hive and the 
> current code has done great service for many years. However, the code is 
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization 
> metadata to the Key, Row classes.
> * The api of a logical "Table Container" is not defined and therefore it's 
> unclear what apis HashMapWrapper 
> needs to publicize. Additionally, HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and 
> implement the table container. Ideally these logical units could be separated
> * HashTableSinkObjectCtx has unused fields and unused methods
> * CommonJoinOperator and children use ArrayList on the left hand side when only 
> List is required
> * There are unused classes MRU and DCLLItem, and classes which duplicate 
> functionality: MapJoinSingleKey and MapJoinDoubleKeys

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-16 Thread Henry Robinson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742335#comment-13742335
 ] 

Henry Robinson commented on HIVE-4569:
--

As an alternative suggestion, what about considering a 
{{WaitUntilComplete(TOperationStatus)}} call? The benefit is that there 
would immediately be a way to block on the result of every operation (rather than 
adding {{*Async}} APIs to the interface and doubling its size). Then 
{{executeStatement}} doesn't need to change its documented semantics, and Hive 
can immediately be compatible by making {{WaitUntilComplete}} a no-op until 
asynchronous support is completely ready.

I also agree that it might be worth splitting this discussion into a separate 
JIRA.
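
As a rough illustration, the client-side pattern such a call would wrap looks
something like the following (all names here are hypothetical, not the actual
TCLIService API):

{code}
public final class OperationWaiter {
  /** Stand-in for whatever client object can report an operation's state. */
  public interface StatusSource {
    String getOperationState() throws Exception; // e.g. "RUNNING", "FINISHED", "ERROR"
  }

  /** Blocks until the operation leaves the RUNNING state, polling at a fixed interval. */
  public static String waitUntilComplete(StatusSource src, long pollMillis)
      throws Exception {
    String state;
    while ("RUNNING".equals(state = src.getOperationState())) {
      Thread.sleep(pollMillis);
    }
    return state;
  }
}
{code}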

> GetQueryPlan api in Hive Server2
> 
>
> Key: HIVE-4569
> URL: https://issues.apache.org/jira/browse/HIVE-4569
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Amareshwari Sriramadasu
>Assignee: Jaideep Dhok
> Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
> HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch, 
> HIVE-4569.D12333.1.patch
>
>
> It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan 
> api available in HiveServer2, though the wiki 
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
> contains it; not sure why it was not added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [Discuss] project chop up

2013-08-16 Thread Edward Capriolo
Summary from hive-irc channel. Minor edits for spell check/grammar.

The last 10 lines are a summary of the key points.

[10:59:17]  noland: et al. Do you want to talk about hive in
maven?
[11:01:06] smonchi [~
ro...@host34-189-dynamic.23-79-r.retail.telecomitalia.it] has quit IRC:
Quit: ... 'cause there is no patch for human stupidity ...
[11:10:04]  ecapriolo: yeah that sounds good to me!
[11:10:22]  I saw you created the jira but haven't had time to look
[11:10:32]  So I found a few things
[11:10:49]  In common there are one or two tests that actually
fork a process :)
[11:10:56]  and use build.test.resources
[11:11:12]  Some serde code uses some methods from ql in testing
[11:11:27]  and shims really needs a separate hadoop test shim
[11:11:32]  But that is all simple stuff
[11:11:47]  The biggest problem is I do not know how to solve
shims with maven
[11:11:50]  do you have any ideas
[11:11:52]  ?
[11:13:00]  That one is going to be a challenge. It might be that
in that section we have to drop down to ant
[11:14:44]  Is it a requirement that we build both the .20 and .23
shims for a "package" as we do today?
[11:16:46]  I was thinking we can do it like a JDBC driver
[11:16:59]  We separate out the interface of shims
[11:17:22]  And then at runtime we drop in a driver implementing
[11:17:34] Wertax [~wer...@wolfkamp.xs4all.nl] has quit IRC: Remote host
closed the connection
[11:17:36]  That or we could use maven's profile system
[11:18:09]  It seems that everything else can actually link
against hadoop-0.20.2 as a provided dependency
[11:18:37]  Yeah either would work. The driver method would
probably require us to use ant to build both the drivers?
[11:18:44]  I am a fan of mvn profiles
[11:19:05]  I was thinking we kinda separate the shim out into
its own project, not a module
[11:19:10]  to achieve that jdbc thing
[11:19:27]  But I do not have a solution yet, I was looking to
farm that out to someone smart...like you :)
[11:19:33]  :)
[11:19:47]  All I know is that we need a test shim because
HadoopShim requires hadoop-test jars
[11:20:10]  then the Mini stuff is only used in qtest anyway
[11:20:48]  Is this something you want to help with? I was
thinking of spinning up a github
[11:20:50]  I think that the separate projects would work and
perhaps nicely.
[11:21:01]  Yeah I'd be interested in helping!
[11:21:17]  But I am going on vacation starting next week for about
10 days
[11:21:27]  Ah cool where are you going?
[11:21:37]  Netherlands
[11:21:42]  Biking around and such
[11:23:52]  The one thing I was thinking about with regards to a
branch is keeping history. We'll want to keep history for the files but
AFAICT svn doesn't understand git mv.
[11:24:16] Wertax [~wer...@wolfkamp.xs4all.nl] has joined #hive
[11:31:19] jeromatron [~text...@host90-152-1-162.ipv4.regusnet.com] has
quit IRC: Quit: My MacBook Pro has gone to sleep. ZZZzzz…
[11:35:49]  noland: Right, I do not plan to suggest that we will
do this in git
[11:36:11]  I just see that we are going to have to hack stuff
up and it is not the type of work that lends itself well to branches.
[11:36:17]  Ahh ok
[11:36:56]  Once we come up with a solution for the shims, and
we have something that can reasonably build and test hive we can figure out
how to apply that to a branch/trunk
[11:36:58]  yeah so just do a POC on github and then implement on
svn
[11:37:05]  cool
[11:37:29]  Along the way we can probably find things that we
can do like that common test I found and other minor things
[11:37:41]  sounds good
[11:37:50]  Those we can likely just commit into the current
trunk and I will file issues for those now
[11:37:58]  cool
[11:38:41]  But yea man. I just can't take the project as it is
now
[11:38:51]  in eclipse every time I touch a file it rebuilds
everything!
[11:38:53]  It's like WTF
[11:39:09]  Running one tests takes like 3 minutes
[11:39:12]  its out of control
[11:39:23]  LOL
[11:39:29]  I agree 110%
[11:39:32]  eclipse was not always like that; I am not sure how
the hell it happened
[11:39:51]  The eclipse sep thing is so harmful
[11:40:08]  dep thing that is
[11:40:12]  I mean command line ant was always bad, but you used
to be able to work in eclipse without having to rebuild everything every
change/test
[11:40:39]  Yeah the first thing I do these days is disable the ant
builder
[11:40:52]  Ow... I did not really know that was a thing
[11:40:55]  it starts compiling while you are still working and
blocks for minutes
[11:41:02]  Right that is what I mean
[11:41:11]  Everyone has like 10 hacks to work on the project
[11:41:14]  yeah you can remove it in project…one sec
[11:41:17]  perm gen
[11:41:20]  ant builder
[11:41:32]  project -> properties -> builders
[11:41:34]  hive does not build offline anymore
[11:41:37]  yeah
[11:41:47]  I'm not sure when this stuff went bad, but it has
gotten really really bad
[11:42:09]  Also what I plan on doing is stripping out
non-essentials
[11:42:25]  like serde has all this thrift and avro stuff to
suppor
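
The JDBC-driver-style shim idea discussed above amounts to picking an
implementation class at runtime. A rough sketch, with made-up
org.example.shims class names (Hive's real ShimLoader differs in detail):

public class ShimLoaderSketch {
  // Choose a shim implementation by Hadoop major version, the way a JDBC
  // driver is selected by URL. The interface stays in a common project;
  // each implementation ships in its own jar.
  public static Object loadShims(String hadoopVersion) throws Exception {
    String impl = hadoopVersion.startsWith("0.20")
        ? "org.example.shims.Hadoop20Shims"   // hypothetical class name
        : "org.example.shims.Hadoop23Shims";  // hypothetical class name
    return Class.forName(impl).getDeclaredConstructor().newInstance();
  }
}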

[jira] [Created] (HIVE-5110) Investigate memory consumption during map-side join

2013-08-16 Thread Brock Noland (JIRA)
Brock Noland created HIVE-5110:
--

 Summary: Investigate memory consumption during map-side join
 Key: HIVE-5110
 URL: https://issues.apache.org/jira/browse/HIVE-5110
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland


In map-side join we have a check that tracks the JVM memory usage, and if it's over a 
certain limit we fail the task. In HIVE-4838 we discussed removing that and 
letting the JVM run OOM. However, there was a concern that this could cause 
performance problems for queries that will run out of memory.

This jira is to decide whether we can let the JVM run OOM and, if so, remove that check. 
If we cannot, can we raise the default limit?
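
For reference, the kind of check being discussed amounts to something like the
following sketch (illustrative only; the real check lives in the map-join
table container and is shaped differently):

{code}
public class MemoryCheckSketch {
  // Fail fast if used heap exceeds a fraction of the maximum, instead of
  // letting the JVM grind toward an OutOfMemoryError.
  public static void checkMemoryLimit(double maxUsedFraction) {
    Runtime rt = Runtime.getRuntime();
    long used = rt.totalMemory() - rt.freeMemory();
    long max = rt.maxMemory();
    if (used > max * maxUsedFraction) {
      throw new IllegalStateException("Used heap " + used
          + " exceeds " + (int) (maxUsedFraction * 100) + "% of max " + max);
    }
  }

  public static void main(String[] args) {
    checkMemoryLimit(0.9); // e.g. fail the task when over 90% of max heap
  }
}
{code}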

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability

2013-08-16 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742343#comment-13742343
 ] 

Brock Noland commented on HIVE-4838:


Thanks!! I have opened HIVE-5110 to look at the memory consumption stuff we 
discussed.

> Refactor MapJoin HashMap code to improve testability and readability
> 
>
> Key: HIVE-4838
> URL: https://issues.apache.org/jira/browse/HIVE-4838
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 0.12.0
>
> Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, 
> HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch
>
>
> MapJoin is an essential component for high performance joins in Hive and the 
> current code has done great service for many years. However, the code is 
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization 
> metadata to the Key, Row classes.
> * The api of a logical "Table Container" is not defined and therefore it's 
> unclear what apis HashMapWrapper 
> needs to publicize. Additionally, HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and 
> implement the table container. Ideally these logical units could be separated
> * HashTableSinkObjectCtx has unused fields and unused methods
> * CommonJoinOperator and children use ArrayList on the left hand side when only 
> List is required
> * There are unused classes MRU and DCLLItem, and classes which duplicate 
> functionality: MapJoinSingleKey and MapJoinDoubleKeys

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4940) udaf_percentile_approx.q is not deterministic

2013-08-16 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-4940:
---

   Resolution: Fixed
Fix Version/s: 0.12.0
   Status: Resolved  (was: Patch Available)

Thanks for the contribution, Navis! I committed to trunk.

> udaf_percentile_approx.q is not deterministic
> -
>
> Key: HIVE-4940
> URL: https://issues.apache.org/jira/browse/HIVE-4940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-4940.D12189.1.patch
>
>
> Makes different results for 20(S) and 23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5069) Tests on list bucketing are failing again in hadoop2

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742361#comment-13742361
 ] 

Hudson commented on HIVE-5069:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2272 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2272/])
HIVE-5069 : Tests on list bucketing are failing again in hadoop2 (Sergey 
Shelukhin via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514568)
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java


> Tests on list bucketing are failing again in hadoop2
> 
>
> Key: HIVE-5069
> URL: https://issues.apache.org/jira/browse/HIVE-5069
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Navis
>Assignee: Sergey Shelukhin
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-5069.D12201.1.patch, HIVE-5069.D12243.1.patch
>
>
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5048) StorageBasedAuthorization provider causes an NPE when asked to authorize from client side.

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742362#comment-13742362
 ] 

Hudson commented on HIVE-5048:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2272 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2272/])
HIVE-5048 : StorageBasedAuthorization provider causes an NPE when asked to 
authorize from client side. (Sushanth Sowmyan via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514569)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/HiveAuthorizationProviderBase.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java


> StorageBasedAuthorization provider causes an NPE when asked to authorize from 
> client side.
> --
>
> Key: HIVE-5048
> URL: https://issues.apache.org/jira/browse/HIVE-5048
> Project: Hive
>  Issue Type: Bug
>  Components: Security
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Fix For: 0.12.0
>
> Attachments: HIVE-5048.2.patch, HIVE-5048.patch
>
>
> StorageBasedAuthorizationProvider(henceforth referred to as SBAP) is a 
> HiveMetastoreAuthorizationProvider (henceforth referred to as HMAP, and 
> HiveAuthorizationProvider as HAP) that was introduced as part of HIVE-3705.
> As long as it's used as a HMAP, i.e. from the metastore-side, as was its 
> initial implementation intent, everything's great. However, HMAP extends HAP, 
> and there is no reason SBAP shouldn't be expected to work as a HAP as well. 
> However, it uses a wh variable that is never initialized if it is called as a 
> HAP, and hence, it will always fail when authorize is called on it.
> We should change SBAP so that it correctly initializes wh so that it can be run 
> as a HAP as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4583) Make Hive compile and run with JDK7

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742363#comment-13742363
 ] 

Hudson commented on HIVE-4583:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2272 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2272/])
HIVE-4583 : HCatalog test TestPigHCatUtil might fail on JDK7 (Jarek Jarcec 
Cecho via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514567)
* 
/hive/trunk/hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hcatalog/pig/TestPigHCatUtil.java


> Make Hive compile and run with JDK7
> ---
>
> Key: HIVE-4583
> URL: https://issues.apache.org/jira/browse/HIVE-4583
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>
> This is an umbrella ticket to cover all issues related to supporting JDK 7 in 
> HIVE. Many such issues are expected. Of course, JDK 6 needs to continue to 
> work.
> The major obstacles on the way to supporting JDK7 are:
> 1. The JDBC component needs to be upgraded because of JDBC interface changes in 
> JDK7.
> 2. DataNucleus needs to be upgraded because the current version doesn't 
> support JDK7.
> 3. Many test failures need to be fixed. The majority of the failures are 
> caused by subtle JDK7 behaviour changes.
> 4. Build needs to be changed to accommodate JDK7 compiler.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3926) PPD on virtual column of partitioned table is not working

2013-08-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742373#comment-13742373
 ] 

Sergey Shelukhin commented on HIVE-3926:


I got around to this yesterday. The "null" values are actually put in the expr 
by PPRColumnExprProcessor in ExprProcFactory; it thus replaces non-partition 
columns, many functions, etc., and compactExpr in the Pruner then compacts the nulls.
The root problem is that ExprNodeColumnDesc doesn't distinguish partition and 
virtual columns, so virtual columns also get there.
I will try to mitigate that in a separate JIRA; however, the ExprNodeColumnDesc ctor 
with the isPartitionColOrVirtualCol arg is called in many places, so if the calls 
cannot be easily replaced, post-filtering can be done before the expression gets to 
the pruner.


> PPD on virtual column of partitioned table is not working
> -
>
> Key: HIVE-3926
> URL: https://issues.apache.org/jira/browse/HIVE-3926
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-3926.6.patch, HIVE-3926.D8121.1.patch, 
> HIVE-3926.D8121.2.patch, HIVE-3926.D8121.3.patch, HIVE-3926.D8121.4.patch, 
> HIVE-3926.D8121.5.patch
>
>
> {code}
> select * from src where BLOCK__OFFSET__INSIDE__FILE<100;
> {code}
> is working, but
> {code}
> select * from srcpart where BLOCK__OFFSET__INSIDE__FILE<100;
> {code}
> throws SemanticException. Disabling PPD makes it work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5111) ExprNodeColumnDesc doesn't distinguish partition and virtual columns, causing partition pruner to receive the latter

2013-08-16 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-5111:
--

 Summary: ExprNodeColumnDesc doesn't distinguish partition and 
virtual columns, causing partition pruner to receive the latter
 Key: HIVE-5111
 URL: https://issues.apache.org/jira/browse/HIVE-5111
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


See HIVE-3926

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4925) Modify Hive build to enable compiling and running Hive with JDK7

2013-08-16 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742374#comment-13742374
 ] 

Xuefu Zhang commented on HIVE-4925:
---

It appears to me that the javac.version option is there by mistake, given the fact 
that the majority of the code is compiled with no consideration of it. In my 
personal opinion, we shouldn't turn an error into a feature. However, this 
doesn't exclude having such a feature in the future when the need arises. Thus, 
Option #2 seems to make more sense to me.

> Modify Hive build to enable compiling and running Hive with JDK7
> 
>
> Key: HIVE-4925
> URL: https://issues.apache.org/jira/browse/HIVE-4925
> Project: Hive
>  Issue Type: Sub-task
>  Components: Build Infrastructure
>Affects Versions: 0.11.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1511) Hive plan serialization is slow

2013-08-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742430#comment-13742430
 ] 

Hive QA commented on HIVE-1511:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12598462/HIVE-1511.4.patch

{color:red}ERROR:{color} -1 due to 134 failed/errored test(s), 2879 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_to_unix_timestamp
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_cast1
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_ppd_key_range
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join8
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_udf
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join35
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_subq
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join31
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf_npath
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath2
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf_register_tblfn
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_

[jira] [Updated] (HIVE-5072) [WebHCat]Enable directly invoke Sqoop job through Templeton

2013-08-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-5072:
-

  Component/s: HCatalog
Affects Version/s: 0.12.0
  Summary: [WebHCat]Enable directly invoke Sqoop job through 
Templeton  (was: Enable directly invoke Sqoop job through Templeton)

> [WebHCat]Enable directly invoke Sqoop job through Templeton
> ---
>
> Key: HIVE-5072
> URL: https://issues.apache.org/jira/browse/HIVE-5072
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 0.12.0
>Reporter: Shuaishuai Nie
> Attachments: HIVE-5072.1.patch, HIVE-5072.2.patch, 
> Templeton-Sqoop-Action.pdf
>
>
> Now it is hard to invoke a Sqoop job through Templeton. The only way is to 
> use the classpath jar generated by a Sqoop job and use the jar delegator in 
> Templeton. We should implement a Sqoop delegator to enable directly invoking 
> Sqoop jobs through Templeton.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (HIVE-4989) Consolidate and simplify vectorization code and test generation

2013-08-16 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey reopened HIVE-4989:



The patch seems to have a formatting issue such that it doesn't apply all the 
changes. I think we should revert it for now. I am uploading a patch that 
reverts this change. We can get it committed again once the formatting is fixed.

> Consolidate and simplify vectorization code and test generation
> ---
>
> Key: HIVE-4989
> URL: https://issues.apache.org/jira/browse/HIVE-4989
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: vectorization-branch
>Reporter: Tony Murphy
>Assignee: Tony Murphy
> Fix For: vectorization-branch
>
> Attachments: HIVE-4989.revert.patch, HIVE-4989-vectorization.patch
>
>
> The current code generation is unwieldy to use and prone to errors. This 
> change consolidates all the code and test generation into a single location, 
> and removes the need to manually place files which can lead to missing or 
> incomplete code or tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4989) Consolidate and simplify vectorization code and test generation

2013-08-16 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-4989:
---

Attachment: HIVE-4989.revert.patch

The revert patch is attached.

> Consolidate and simplify vectorization code and test generation
> ---
>
> Key: HIVE-4989
> URL: https://issues.apache.org/jira/browse/HIVE-4989
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: vectorization-branch
>Reporter: Tony Murphy
>Assignee: Tony Murphy
> Fix For: vectorization-branch
>
> Attachments: HIVE-4989.revert.patch, HIVE-4989-vectorization.patch
>
>
> The current code generation is unwieldy to use and prone to errors. This 
> change consolidates all the code and test generation into a single location, 
> and removes the need to manually place files which can lead to missing or 
> incomplete code or tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1511) Hive plan serialization is slow

2013-08-16 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742507#comment-13742507
 ] 

Brock Noland commented on HIVE-1511:


I looked at a bunch of the failures and it looks like the issue Ashutosh 
reported. I'll upload a version of the patch with the 2.22-SNAPSHOT version 
shortly.

> Hive plan serialization is slow
> ---
>
> Key: HIVE-1511
> URL: https://issues.apache.org/jira/browse/HIVE-1511
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Ning Zhang
>Assignee: Mohammad Kamrul Islam
> Attachments: HIVE-1511.4.patch, HIVE-1511.patch, 
> HIVE-1511-wip2.patch, HIVE-1511-wip3.patch, HIVE-1511-wip4.patch, 
> HIVE-1511-wip.patch
>
>
> As reported by Edward Capriolo:
> For reference I did this as a test case
> SELECT * FROM src where
> key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
> OR key=0 OR key=0 OR key=0 OR
> key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
> OR key=0 OR key=0 OR key=0 OR
> ...(100 more of these)
> No OOM but I gave up after the test case did not go anywhere for about
> 2 minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Adding WebHCat sub component to Hive project in ASF Jira

2013-08-16 Thread Eugene Koifman
Hi,
could somebody who has permissions to do so create WebHCat component under
Hive?
It will help track things.

Thanks,
Eugene



Re: Adding WebHCat sub component to Hive project in ASF Jira

2013-08-16 Thread Ashutosh Chauhan
Done. Looking forward to contributions in that area!

Thanks,
Ashutosh


On Fri, Aug 16, 2013 at 11:44 AM, Eugene Koifman
wrote:

> Hi,
> could somebody who has permissions to do so create WebHCat component under
> Hive?
> It will help track things.
>
> Thanks,
> Eugene
>


[jira] [Updated] (HIVE-4989) Consolidate and simplify vectorization code and test generation

2013-08-16 Thread Tony Murphy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Murphy updated HIVE-4989:
--

Attachment: HIVE-4989.1-vectorization.patch

> Consolidate and simplify vectorization code and test generation
> ---
>
> Key: HIVE-4989
> URL: https://issues.apache.org/jira/browse/HIVE-4989
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: vectorization-branch
>Reporter: Tony Murphy
>Assignee: Tony Murphy
> Fix For: vectorization-branch
>
> Attachments: HIVE-4989.1-vectorization.patch, HIVE-4989.revert.patch, 
> HIVE-4989-vectorization.patch
>
>
> The current code generation is unwieldy to use and prone to errors. This 
> change consolidates all the code and test generation into a single location, 
> and removes the need to manually place files which can lead to missing or 
> incomplete code or tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4989) Consolidate and simplify vectorization code and test generation

2013-08-16 Thread Tony Murphy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742525#comment-13742525
 ] 

Tony Murphy commented on HIVE-4989:
---

Not sure what happened with the last patch, but the formatting looks bad. I've 
regenerated the patch, manually inspected it, and successfully applied it after 
the revert patch.

> Consolidate and simplify vectorization code and test generation
> ---
>
> Key: HIVE-4989
> URL: https://issues.apache.org/jira/browse/HIVE-4989
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: vectorization-branch
>Reporter: Tony Murphy
>Assignee: Tony Murphy
> Fix For: vectorization-branch
>
> Attachments: HIVE-4989.1-vectorization.patch, HIVE-4989.revert.patch, 
> HIVE-4989-vectorization.patch
>
>
> The current code generation is unwieldy to use and prone to errors. This 
> change consolidates all the code and test generation into a single location, 
> and removes the need to manually place files which can lead to missing or 
> incomplete code or tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742550#comment-13742550
 ] 

Hudson commented on HIVE-4838:
--

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #61 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/61/])
HIVE-4838 : Refactor MapJoin HashMap code to improve testability and 
readability (Brock Noland via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514760)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinMetaData.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin/MapJoinMemoryExhaustionException.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin/MapJoinMemoryExhaustionHandler.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractRowContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/DCLLItem.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MRU.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinDoubleKeys.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectSerDeContext.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinSingleKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestHashMapWrapper.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/mapjoin
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/mapjoin/TestMapJoinMemoryExhaustionHandler.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKeys.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinRowContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java


> Refactor MapJoin HashMap code to improve testability and readability
> 
>
> Key: HIVE-4838
> URL: https://issues.apache.org/jira/browse/HIVE-4838
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 0.12.0
>
> Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, 
> HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch
>
>
> MapJoin is an essential component for high performance joins in Hive and the 
> current code has done great service for many years. However, the code is 
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization 
> metadata to the Key, Row classes.
> * The api of a logical "Table Container" is not defined and therefore it's 
> unclear what apis HashMapWrapper needs to publicize. Additionally, 
> HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and 
> implement the table container. Id

[jira] [Commented] (HIVE-4940) udaf_percentile_approx.q is not deterministic

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742549#comment-13742549
 ] 

Hudson commented on HIVE-4940:
--

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #61 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/61/])
HIVE-4940 udaf_percentile_approx.q is not deterministic (Navis via Brock 
Noland) (brock: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514771)
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx.q
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx_20.q
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx_23.q
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx_20.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out


> udaf_percentile_approx.q is not deterministic
> -
>
> Key: HIVE-4940
> URL: https://issues.apache.org/jira/browse/HIVE-4940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-4940.D12189.1.patch
>
>
> Makes different result for 20(S) and 23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1511) Hive plan serialization is slow

2013-08-16 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-1511:
---

Attachment: HIVE-1511.5.patch

Uploading HIVE-1511.5.patch which uses 2.22-SNAPSHOT for testing purposes.

> Hive plan serialization is slow
> ---
>
> Key: HIVE-1511
> URL: https://issues.apache.org/jira/browse/HIVE-1511
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Ning Zhang
>Assignee: Mohammad Kamrul Islam
> Attachments: HIVE-1511.4.patch, HIVE-1511.5.patch, HIVE-1511.patch, 
> HIVE-1511-wip2.patch, HIVE-1511-wip3.patch, HIVE-1511-wip4.patch, 
> HIVE-1511-wip.patch
>
>
> As reported by Edward Capriolo:
> For reference I did this as a test case
> SELECT * FROM src where
> key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
> OR key=0 OR key=0 OR key=0 OR
> key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
> OR key=0 OR key=0 OR key=0 OR
> ...(100 more of these)
> No OOM but I gave up after the test case did not go anywhere for about
> 2 minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4388) HBase tests fail against Hadoop 2

2013-08-16 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-4388:
---

Attachment: HIVE-4388.patch

Updated patch uses the 0.95.2 release, in addition to removing the system 
property hacks for hbase profile resolution.

> HBase tests fail against Hadoop 2
> -
>
> Key: HIVE-4388
> URL: https://issues.apache.org/jira/browse/HIVE-4388
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Gunther Hagleitner
>Assignee: Brock Noland
> Attachments: HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
> HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
> HIVE-4388.patch, HIVE-4388-wip.txt
>
>
> Currently we're building by default against 0.92. When you run against hadoop 
> 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
> HIVE-3861 upgrades the version of hbase used. This will get you past the 
> problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5112) Upgrade protobuf to 2.5 from 2.4

2013-08-16 Thread Brock Noland (JIRA)
Brock Noland created HIVE-5112:
--

 Summary: Upgrade protobuf to 2.5 from 2.4
 Key: HIVE-5112
 URL: https://issues.apache.org/jira/browse/HIVE-5112
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland


Hadoop and Hbase have both upgraded protobuf. We should as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5113) webhcat should allow configuring memory used by templetoncontroller map job in hadoop2

2013-08-16 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-5113:
---

 Summary: webhcat should allow configuring memory used by 
templetoncontroller map job in hadoop2
 Key: HIVE-5113
 URL: https://issues.apache.org/jira/browse/HIVE-5113
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Reporter: Thejas M Nair
Assignee: Thejas M Nair


Webhcat should allow the following hadoop2 config parameters to be set for the 
templetoncontroller map-only job that actually runs the pig/hive/mr command:

mapreduce.map.memory.mb
yarn.app.mapreduce.am.resource.mb
yarn.app.mapreduce.am.command-opts

They should also be given reasonable defaults.
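
For context, these are standard hadoop2 job properties; a minimal sketch of 
setting them on a job Configuration follows (values are purely illustrative, 
not proposed defaults, and the point of this JIRA is to expose them through 
WebHCat configuration rather than hardcode anything):

{code}
import org.apache.hadoop.conf.Configuration;

// Sketch only: the three hadoop2 properties named above, with
// illustrative values, applied to the conf a controller job would use.
static Configuration controllerConf() {
  Configuration conf = new Configuration();
  conf.set("mapreduce.map.memory.mb", "2048");
  conf.set("yarn.app.mapreduce.am.resource.mb", "1024");
  conf.set("yarn.app.mapreduce.am.command-opts", "-Xmx819m");
  return conf;
}
{code}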


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-4989) Consolidate and simplify vectorization code and test generation

2013-08-16 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-4989.


Resolution: Fixed

Committed the revert patch to branch. Thanks, Jitendra! Can you also review 
Tony's latest patch?

> Consolidate and simplify vectorization code and test generation
> ---
>
> Key: HIVE-4989
> URL: https://issues.apache.org/jira/browse/HIVE-4989
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: vectorization-branch
>Reporter: Tony Murphy
>Assignee: Tony Murphy
> Fix For: vectorization-branch
>
> Attachments: HIVE-4989.1-vectorization.patch, HIVE-4989.revert.patch, 
> HIVE-4989-vectorization.patch
>
>
> The current code generation is unwieldy to use and prone to errors. This 
> change consolidates all the code and test generation into a single location, 
> and removes the need to manually place files which can lead to missing or 
> incomplete code or tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742622#comment-13742622
 ] 

Hudson commented on HIVE-4838:
--

FAILURE: Integrated in Hive-trunk-hadoop1-ptest #130 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/130/])
HIVE-4838 : Refactor MapJoin HashMap code to improve testability and 
readability (Brock Noland via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514760)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinMetaData.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin/MapJoinMemoryExhaustionException.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin/MapJoinMemoryExhaustionHandler.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractRowContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/DCLLItem.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MRU.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinDoubleKeys.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectSerDeContext.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinSingleKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestHashMapWrapper.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/mapjoin
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/mapjoin/TestMapJoinMemoryExhaustionHandler.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKeys.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinRowContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java


> Refactor MapJoin HashMap code to improve testability and readability
> 
>
> Key: HIVE-4838
> URL: https://issues.apache.org/jira/browse/HIVE-4838
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 0.12.0
>
> Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, 
> HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch
>
>
> MapJoin is an essential component for high performance joins in Hive and the 
> current code has done great service for many years. However, the code is 
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization 
> metadata to the Key, Row classes.
> * The api of a logical "Table Container" is not defined and therefore it's 
> unclear what apis HashMapWrapper needs to publicize. Additionally, 
> HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and 
> implement the table container. 

[jira] [Commented] (HIVE-4940) udaf_percentile_approx.q is not deterministic

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742621#comment-13742621
 ] 

Hudson commented on HIVE-4940:
--

FAILURE: Integrated in Hive-trunk-hadoop1-ptest #130 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/130/])
HIVE-4940 udaf_percentile_approx.q is not deterministic (Navis via Brock 
Noland) (brock: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514771)
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx.q
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx_20.q
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx_23.q
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx_20.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out


> udaf_percentile_approx.q is not deterministic
> -
>
> Key: HIVE-4940
> URL: https://issues.apache.org/jira/browse/HIVE-4940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-4940.D12189.1.patch
>
>
> Makes different result for 20(S) and 23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5105) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up fieldPositionMap

2013-08-16 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742626#comment-13742626
 ] 

Sushanth Sowmyan commented on HIVE-5105:


+1, Thanks for the test as well, Eugene. :)

> HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up 
> fieldPositionMap
> -
>
> Key: HIVE-5105
> URL: https://issues.apache.org/jira/browse/HIVE-5105
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.12.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.12.0
>
> Attachments: HIVE-5105.patch
>
>
> org.apache.hcatalog.data.schema.HCatSchema.remove(HCatFieldSchema 
> hcatFieldSchema) makes the following call:
> fieldPositionMap.remove(hcatFieldSchema);
> but fieldPositionMap is of type Map<String, Integer>, so the element is not 
> getting removed.
> Here's a detailed comment from [~sushanth]:
> The result is that the name will not be removed from fieldPositionMap. 
> This results in 2 things:
> a) If anyone tries to append a field to a hcatschema after having removed 
> that field, it shouldn't fail, but it will.
> b) If anyone asks for the position of the removed field by name, it will 
> still give the position.
> Now, there is only one place in hcat code where we remove a field, and that 
> is called from HCatOutputFormat.setSchema, where we try to detect if the user 
> specified partition column names in the schema when they shouldn't have, and 
> if they did, we remove it. Normally, people do not specify this, and this 
> check tends to be superfluous.
> Once we do this, we wind up serializing that new object (after performing 
> some validations), and this does appear to stay through the serialization 
> (and eventual deserialization) which is very worrying.
> However, we are luckily saved by the fact that we do not append that field to 
> it at any time (all appends in hcat code are done on newly initialized 
> HCatSchema objects which have had no removes done on them), and we don't ask 
> for the position of something we do not expect to be there (harder to verify 
> for certain, but it seems to be the case on inspection).
> The main part that gives me worry is that HCatSchema is part of our public 
> interface for HCat, in that M/R programs that use HCat can use it, and thus, 
> they might have more interesting usage patterns that are hitting this bug.
> I can't think of any currently open bugs that are caused by this, because of 
> the rarity of the situation, but it is nevertheless something we should fix 
> immediately.
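
For illustration, a minimal sketch of the mismatch: Map.remove takes an Object, 
so passing the schema object compiles silently but can never match a String 
key; the fix is to key the removal on the field's name.

{code}
import java.util.Map;

// Sketch only: fieldPositionMap maps a field's name to its position.
static void removeField(Map<String, Integer> fieldPositionMap,
    HCatFieldSchema hcatFieldSchema) {
  // Buggy: keyed by the HCatFieldSchema object; Map.remove(Object)
  // accepts it without complaint, but it can never equal a String key,
  // so the stale name-to-position entry survives.
  fieldPositionMap.remove(hcatFieldSchema);

  // Intended: key the removal on the field's name.
  fieldPositionMap.remove(hcatFieldSchema.getName());
}
{code}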

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4441) [HCatalog] WebHCat does not honor user home directory

2013-08-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742653#comment-13742653
 ] 

Eugene Koifman commented on HIVE-4441:
--

HIVE-4601 includes a fix to make TempletonUtils#hadoopFsPath() respect the 
passed-in "user".

[~daijy] Are you sure it's a good idea to place this file in different places 
on different file systems? It seems like that will lead to confusion.

Is it possible to create a test (e2e, I presume) for this?

> [HCatalog] WebHCat does not honor user home directory
> -
>
> Key: HIVE-4441
> URL: https://issues.apache.org/jira/browse/HIVE-4441
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Daniel Dai
> Attachments: HIVE-4441-1.patch, HIVE-4441-2.patch, HIVE-4441-3.patch
>
>
> If I submit a job as user "A" and I specify statusdir as a relative path, I 
> would expect results to be stored in the folder relative to the user A's home 
> folder.
> For example, if I run:
> {code}curl -s -d user.name=hdinsightuser -d execute="show+tables;" -d 
> statusdir="pokes.output" 'http://localhost:50111/templeton/v1/hive'{code}
> I get the results under:
> {code}/user/hdp/pokes.output{code}
> And I expect them to be under:
> {code}/user/hdinsightuser/pokes.output{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call

2013-08-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742670#comment-13742670
 ] 

Eugene Koifman commented on HIVE-4442:
--

Why do Delete/ListDelegator use UserGroupInformation ugi = 
UserGroupInformation.createRemoteUser(user);?
Shouldn't they use UgiFactory? Everything else in WebHCat does.
Otherwise, this looks good.
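
For illustration, a minimal sketch contrasting the two approaches (this assumes 
UgiFactory.getUgi(String), as used elsewhere in WebHCat):

{code}
import org.apache.hadoop.security.UserGroupInformation;

// Sketch only, contrasting the two ways a delegator can obtain a UGI.
static UserGroupInformation ugiFor(String user) throws Exception {
  // Today in Delete/ListDelegator: a remote UGI built directly,
  // bypassing the shared proxy-user handling.
  UserGroupInformation direct = UserGroupInformation.createRemoteUser(user);

  // Suggested (assumes UgiFactory.getUgi(String), as used elsewhere in
  // WebHCat): route through the shared factory for consistency.
  return UgiFactory.getUgi(user);
}
{code}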

> [HCatalog] WebHCat should not override user.name parameter for Queue call
> -
>
> Key: HIVE-4442
> URL: https://issues.apache.org/jira/browse/HIVE-4442
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Daniel Dai
> Attachments: HIVE-4442-1.patch, HIVE-4442-2.patch
>
>
> Currently templeton for the Queue call uses the user.name to filter the 
> results of the call in addition to the default security.
> Ideally the filter would be an optional parameter to the call, independent of 
> the security check.
> I would suggest a parameter in addition to GET queue (jobs) that gives you all 
> the jobs a user has permission to see:
> GET queue?showall=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4586) [HCatalog] WebHCat should return 404 error for undefined resource

2013-08-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742674#comment-13742674
 ] 

Eugene Koifman commented on HIVE-4586:
--

+1

> [HCatalog] WebHCat should return 404 error for undefined resource
> -
>
> Key: HIVE-4586
> URL: https://issues.apache.org/jira/browse/HIVE-4586
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.11.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.12.0
>
> Attachments: HIVE-4586-1.patch, HIVE-4586-2.patch
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2

2013-08-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742718#comment-13742718
 ] 

Hive QA commented on HIVE-4388:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12598521/HIVE-4388.patch

{color:red}ERROR:{color} -1 due to 30 failed/errored test(s), 2868 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_joins
junit.framework.TestSuite.org.apache.hcatalog.hbase.TestSnapshots
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats2
junit.framework.TestSuite.org.apache.hcatalog.hbase.snapshot.TestZNodeSetUp
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_scan_params
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_ppd_key_range
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats
junit.framework.TestSuite.org.apache.hcatalog.hbase.snapshot.TestIDGenerator
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats_empty_partition
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_map_queries_prefix
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_map_queries
org.apache.hadoop.hive.hbase.TestPutResultWritable.testResult
junit.framework.TestSuite.org.apache.hcatalog.hbase.TestHBaseHCatStorageHandler
junit.framework.TestSuite.org.apache.hcatalog.hbase.snapshot.TestRevisionManagerEndpoint
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver_cascade_dbdrop
junit.framework.TestSuite.org.apache.hcatalog.hbase.TestHBaseDirectOutputFormat
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_pushdown
junit.framework.TestSuite.org.apache.hcatalog.hbase.TestHBaseInputFormat
junit.framework.TestSuite.org.apache.hcatalog.hbase.TestHBaseBulkOutputFormat
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_ppd_key_ranges
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_external_table_queries
junit.framework.TestSuite.org.apache.hcatalog.hbase.snapshot.TestRevisionManager
org.apache.hadoop.hive.hbase.TestPutResultWritable.testPut
org.apache.hadoop.hive.cli.TestHBaseMinimrCliDriver.testCliDriver_hbase_bulk
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udtf_not_supported2
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_storage_queries
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats3
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_external_table_ppd
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver_cascade_dbdrop_hadoop20
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/465/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/465/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 30 tests failed
{noformat}

This message is automatically generated.

> HBase tests fail against Hadoop 2
> -
>
> Key: HIVE-4388
> URL: https://issues.apache.org/jira/browse/HIVE-4388
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Reporter: Gunther Hagleitner
>Assignee: Brock Noland
> Attachments: HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
> HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, 
> HIVE-4388.patch, HIVE-4388-wip.txt
>
>
> Currently we're building by default against 0.92. When you run against hadoop 
> 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963.
> HIVE-3861 upgrades the version of hbase used. This will get you past the 
> problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5113) webhcat should allow configuring memory used by templetoncontroller map job in hadoop2

2013-08-16 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5113:


Attachment: HIVE-5113.1.patch

> webhcat should allow configuring memory used by templetoncontroller map job 
> in hadoop2
> --
>
> Key: HIVE-5113
> URL: https://issues.apache.org/jira/browse/HIVE-5113
> Project: Hive
>  Issue Type: Improvement
>  Components: WebHCat
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-5113.1.patch
>
>
> Webhcat should allow the following hadoop2 config parameters to be set for 
> the templetoncontroller map-only job that actually runs the pig/hive/mr 
> command:
> mapreduce.map.memory.mb
> yarn.app.mapreduce.am.resource.mb
> yarn.app.mapreduce.am.command-opts
> They should also be given reasonable defaults.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2

2013-08-16 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742735#comment-13742735
 ] 

Vaibhav Gumashta commented on HIVE-4569:


[~ashutoshc][~henryr][~jaid...@research.iiit.ac.in] [~thejas] I definitely 
think execute async is quite ready, and it would be a good idea to get that in 
while we discuss the concerns about GetQueryPlan/TaskStatus. Without splitting, 
it might be hard to focus on each. While reviewing this patch, I was actually 
trying to group the changes into two sets - I have a document which summarizes 
the changes in each group (1. ExecuteAsync 2. GetQueryPlan + TaskStatus). I can 
upload that if you find it useful (if we decide on splitting, we can use it to 
see what we want in each JIRA).

> GetQueryPlan api in Hive Server2
> 
>
> Key: HIVE-4569
> URL: https://issues.apache.org/jira/browse/HIVE-4569
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Amareshwari Sriramadasu
>Assignee: Jaideep Dhok
> Attachments: git-4569.patch, HIVE-4569.D10887.1.patch, 
> HIVE-4569.D11469.1.patch, HIVE-4569.D12231.1.patch, HIVE-4569.D12237.1.patch, 
> HIVE-4569.D12333.1.patch
>
>
> It would nice to have GetQueryPlan as thrift api. I do not see GetQueryPlan 
> api available in HiveServer2, though the wiki 
> https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API 
> contains, not sure why it was not added.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4601) WebHCat, Templeton need to support proxy users

2013-08-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-4601:
-

Status: Open  (was: Patch Available)

org.apache.hadoop.security.KerberosName in H1 moved to  
org.apache.hadoop.security.authentication.util.KerberosName, so this needs a 
change to Shim layer


> WebHCat, Templeton need to support proxy users
> --
>
> Key: HIVE-4601
> URL: https://issues.apache.org/jira/browse/HIVE-4601
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 0.11.0
>Reporter: Dilli Arumugam
>Assignee: Eugene Koifman
>  Labels: proxy, templeton
> Fix For: 0.12.0
>
> Attachments: HIVE-4601.2.patch, HIVE-4601.3.patch, HIVE-4601.patch
>
>
> We have a use case where a Gateway would provide unified and controlled 
> access to a secure Hadoop cluster.
> The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton 
> with SPNego.
> The Gateway would authenticate the end user with http basic and would assert 
> the end user identity as the douser argument in the calls to downstream 
> WebHDFS, Oozie and Templeton.
> This works fine with WebHDFS and Oozie, but it does not work for Templeton, as 
> Templeton does not support proxy users.
> Hence, the request to add this improvement to Templeton.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4601) WebHCat, Templeton need to support proxy users

2013-08-16 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742737#comment-13742737
 ] 

Eugene Koifman commented on HIVE-4601:
--

previous comment should read

org.apache.hadoop.security.KerberosName in H1 moved to 
org.apache.hadoop.security.authentication.util.KerberosName in H2, so this 
needs a change to Shim layer
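
For illustration, a minimal sketch of how a shim could resolve whichever class 
the running Hadoop provides (class names are the ones above; the helper itself 
is hypothetical):

{code}
// Sketch only: try the H2 location first, fall back to the H1 one.
static Class<?> loadKerberosNameClass() throws ClassNotFoundException {
  try {
    // Hadoop 2 location
    return Class.forName(
        "org.apache.hadoop.security.authentication.util.KerberosName");
  } catch (ClassNotFoundException e) {
    // Hadoop 1 location
    return Class.forName("org.apache.hadoop.security.KerberosName");
  }
}
{code}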

> WebHCat, Templeton need to support proxy users
> --
>
> Key: HIVE-4601
> URL: https://issues.apache.org/jira/browse/HIVE-4601
> Project: Hive
>  Issue Type: Improvement
>  Components: HCatalog
>Affects Versions: 0.11.0
>Reporter: Dilli Arumugam
>Assignee: Eugene Koifman
>  Labels: proxy, templeton
> Fix For: 0.12.0
>
> Attachments: HIVE-4601.2.patch, HIVE-4601.3.patch, HIVE-4601.patch
>
>
> We have a use case where a Gateway would provide unified and controlled 
> access to a secure Hadoop cluster.
> The Gateway itself would authenticate to secure WebHDFS, Oozie and Templeton 
> with SPNego.
> The Gateway would authenticate the end user with http basic and would assert 
> the end user identity as the douser argument in the calls to downstream 
> WebHDFS, Oozie and Templeton.
> This works fine with WebHDFS and Oozie, but it does not work for Templeton, as 
> Templeton does not support proxy users.
> Hence, the request to add this improvement to Templeton.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5114) add a target to run tests without rebuilding them

2013-08-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742742#comment-13742742
 ] 

Sergey Shelukhin commented on HIVE-5114:


I have a patch that seems to work except for hcat, let me clean it up and post, 
next week probably

> add a target to run tests without rebuilding them
> -
>
> Key: HIVE-5114
> URL: https://issues.apache.org/jira/browse/HIVE-5114
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> it is sometimes annoying that each "ant test ..." cleans and rebuilds the 
> tests. It should be relatively easy to add a "testonly" target that would 
> just run the test(s) on the existing build

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-5114) add a target to run tests without rebuilding them

2013-08-16 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-5114:
--

 Summary: add a target to run tests without rebuilding them
 Key: HIVE-5114
 URL: https://issues.apache.org/jira/browse/HIVE-5114
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


it is sometimes annoying that each "ant test ..." cleans and rebuilds the 
tests. It should be relatively easy to add a "testonly" target that would 
just run the test(s) on the existing build

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-5105) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up fieldPositionMap

2013-08-16 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742763#comment-13742763
 ] 

Sushanth Sowmyan commented on HIVE-5105:


Committed to hive svn trunk.

> HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up 
> fieldPositionMap
> -
>
> Key: HIVE-5105
> URL: https://issues.apache.org/jira/browse/HIVE-5105
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.12.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.12.0
>
> Attachments: HIVE-5105.patch
>
>
> org.apache.hcatalog.data.schema.HCatSchema.remove(HCatFieldSchema 
> hcatFieldSchema) makes the following call:
> fieldPositionMap.remove(hcatFieldSchema);
> but fieldPositionMap is of type Map<String, Integer>, so the element is not 
> getting removed.
> Here's a detailed comment from [~sushanth]:
> The result is that the name will not be removed from fieldPositionMap. 
> This results in 2 things:
> a) If anyone tries to append a field to a hcatschema after having removed 
> that field, it shouldn't fail, but it will.
> b) If anyone asks for the position of the removed field by name, it will 
> still give the position.
> Now, there is only one place in hcat code where we remove a field, and that 
> is called from HCatOutputFormat.setSchema, where we try to detect if the user 
> specified partition column names in the schema when they shouldn't have, and 
> if they did, we remove it. Normally, people do not specify this, and this 
> check tends to be superfluous.
> Once we do this, we wind up serializing that new object (after performing 
> some validations), and this does appear to stay through the serialization 
> (and eventual deserialization) which is very worrying.
> However, we are luckily saved by the fact that we do not append that field to 
> it at any time (all appends in hcat code are done on newly initialized 
> HCatSchema objects which have had no removes done on them), and we don't ask 
> for the position of something we do not expect to be there (harder to verify 
> for certain, but it seems to be the case on inspection).
> The main part that gives me worry is that HCatSchema is part of our public 
> interface for HCat, in that M/R programs that use HCat can use it, and thus, 
> they might have more interesting usage patterns that are hitting this bug.
> I can't think of any currently open bugs that are caused by this, because of 
> the rarity of the situation, but it is nevertheless something we should fix 
> immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-5105) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up fieldPositionMap

2013-08-16 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-5105:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up 
> fieldPositionMap
> -
>
> Key: HIVE-5105
> URL: https://issues.apache.org/jira/browse/HIVE-5105
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.12.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.12.0
>
> Attachments: HIVE-5105.patch
>
>
> org.apache.hcatalog.data.schema.HCatSchema.remove(HCatFieldSchema 
> hcatFieldSchema) makes the following call:
> fieldPositionMap.remove(hcatFieldSchema);
> but fieldPositionMap is of type Map<String, Integer>, so the element is not 
> getting removed.
> Here's a detailed comment from [~sushanth]:
> The result is that the name will not be removed from fieldPositionMap. 
> This results in 2 things:
> a) If anyone tries to append a field to a hcatschema after having removed 
> that field, it shouldn't fail, but it will.
> b) If anyone asks for the position of the removed field by name, it will 
> still give the position.
> Now, there is only one place in hcat code where we remove a field, and that 
> is called from HCatOutputFormat.setSchema, where we try to detect if the user 
> specified partition column names in the schema when they shouldn't have, and 
> if they did, we remove it. Normally, people do not specify this, and this 
> check tends to be superfluous.
> Once we do this, we wind up serializing that new object (after performing 
> some validations), and this does appear to stay through the serialization 
> (and eventual deserialization) which is very worrying.
> However, we are luckily saved by the fact that we do not append that field to 
> it at any time(all appends in hcat code are done on newly initialized 
> HCatSchema objects which have had no removes done on them), and we don't ask 
> for the position of something we do not expect to be there(harder to verify 
> for certain, but seems to be the case on inspection).
> The main part that gives me worry is that HCatSchema is part of our public 
> interface for HCat, in that M/R programs that use HCat can use it, and thus, 
> they might have more interesting usage patterns that are hitting this bug.
> I can't think of any currently open bugs that is caused by this because of 
> the rarity of the situation, but nevertheless, something we should fix 
> immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3936) Remote debug failed with hadoop 0.23X, hadoop 2.X

2013-08-16 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742770#comment-13742770
 ] 

Mikhail Antonov commented on HIVE-3936:
---

http://ben-tech.blogspot.com/2012/11/cdh410-hive-debug-option-bug.html

The issue can be temporarily worked around on the target machine by commenting 
out that line in /usr/lib/hive/bin/hive. Looking at HADOOP-9455, will this 
issue be fixed automatically by the fix in Hadoop, or does it require a fix on 
the Hive side?

> Remote debug failed with hadoop 0.23X, hadoop 2.X
> -
>
> Key: HIVE-3936
> URL: https://issues.apache.org/jira/browse/HIVE-3936
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.8.1, 0.9.0
>Reporter: Xie Long
>Priority: Minor
>
> In $HIVE_HOME/bin/hive and $HADOOP_HOME/bin/hadoop, $HADOOP_CLIENT_OPTS is 
> appended to $HADOOP_OPTS, so the jdwp debug options end up on the java 
> command line twice, which leads to the problem:
> hive --debug
> ERROR: Cannot load this JVM TI agent twice, check your java command line for 
> duplicate jdwp options.
> Error occurred during initialization of VM
> agent library failed to init: jdwp
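
As a rough illustration of the double append (a hedged sketch in Java rather than the actual shell scripts; the jdwp option string below is a typical example, not copied from bin/hive):

{noformat}
public class DuplicateJdwpSketch {
    public static void main(String[] args) {
        // A typical debug agent setting; not copied verbatim from the scripts.
        String clientOpts =
            "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000";
        String hadoopOpts = "";
        hadoopOpts += " " + clientOpts; // appended in $HIVE_HOME/bin/hive
        hadoopOpts += " " + clientOpts; // appended again in $HADOOP_HOME/bin/hadoop
        // The java command line now carries two jdwp agents, so the JVM aborts
        // with "Cannot load this JVM TI agent twice".
        System.out.println("java" + hadoopOpts + " ...");
    }
}
{noformat}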

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1511) Hive plan serialization is slow

2013-08-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742782#comment-13742782
 ] 

Hive QA commented on HIVE-1511:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12598520/HIVE-1511.5.patch

{color:red}ERROR:{color} -1 due to 105 failed/errored test(s), 2885 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_19
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_to_unix_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input8
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_cast1
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_ppd_key_range
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input7
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join8
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_udf
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_18
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join35
org.apache.hadoop.hive.ql.parse.TestParse.testParse_join7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_subq
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join31
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_pushdown
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_format_number
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf_when
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testsequencefile
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join29
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_ppd_key_ranges
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample7
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input6
org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_22
org.apache.hive.jdbc.TestJdbcDriver2.testPrepareStatement
org.apache.hadoop.hi

Re: [Discuss] project chop up

2013-08-16 Thread Edward Capriolo
For those interested in pitching in.
https://github.com/edwardcapriolo/hive



On Fri, Aug 16, 2013 at 11:58 AM, Edward Capriolo wrote:

> Summary from hive-irc channel. Minor edits for spell check/grammar.
>
> The last 10 lines are a summary of the key points.
>
> [10:59:17]  noland: et al. Do you want to talk about hive in
> maven?
> [11:01:06] smonchi [~
> ro...@host34-189-dynamic.23-79-r.retail.telecomitalia.it] has quit IRC:
> Quit: ... 'cause there is no patch for human stupidity ...
> [11:10:04]  ecapriolo: yeah that sounds good to me!
> [11:10:22]  I saw you created the jira but haven't had time to look
> [11:10:32]  So I found a few things
> [11:10:49]  In common there are one or two tests that actually
> fork a process :)
> [11:10:56]  and use build.test.resources
> [11:11:12]  Some serde tests use some methods from ql in testing
> [11:11:27]  and shims really needs a separate hadoop test shim
> [11:11:32]  But that is all simple stuff
> [11:11:47]  The biggest problem is I do not know how to solve
> shims with maven
> [11:11:50]  do you have any ideas
> [11:11:52]  ?
> [11:13:00]  That one is going to be a challenge. It might be that
> in that section we have to drop down to ant
> [11:14:44]  Is it a requirement that we build both the .20 and .23
> shims for a "package" as we do today?
> [11:16:46]  I was thinking we can do it like a JDBC driver (see the
> sketch after this log)
> [11:16:59]  We separate out the interface of shims
> [11:17:22]  And then at runtime we drop in a driver implementing it
> [11:17:34] Wertax [~wer...@wolfkamp.xs4all.nl] has quit IRC: Remote host
> closed the connection
> [11:17:36]  That or we could use maven's profile system
> [11:18:09]  It seems that everything else can actually link
> against hadoop-0.20.2 as a provided dependency
> [11:18:37]  Yeah either would work. The driver method would
> probably require us to use ant to build both the drivers?
> [11:18:44]  I am a fan of mvn profiles
> [11:19:05]  I was thinking we kinda separate the shim out into
> its own project, not a module
> [11:19:10]  to achieve that jdbc thing
> [11:19:27]  But I do not have a solution yet, I was looking to
> farm that out to someone smart...like you :)
> [11:19:33]  :)
> [11:19:47]  All I know is that we need a test shim because
> HadoopShim requires hadoop-test jars
> [11:20:10]  then the Mini stuff is only used in qtest anyway
> [11:20:48]  Is this something you want to help with? I was
> thinking of spinning up a github
> [11:20:50]  I think that the separate projects would work and
> perhaps nicely.
> [11:21:01]  Yeah I'd be interested in helping!
> [11:21:17]  But I am going on vacation starting next week for
> about 10 days
> [11:21:27]  Ah cool where are you going?
> [11:21:37]  Netherlands
> [11:21:42]  Biking around and such
> [11:23:52]  The one thing I was thinking about with regards to a
> branch is keeping history. We'll want to keep history for the files but
> AFAICT svn doesn't understand git mv.
> [11:24:16] Wertax [~wer...@wolfkamp.xs4all.nl] has joined #hive
> [11:31:19] jeromatron [~text...@host90-152-1-162.ipv4.regusnet.com] has
> quit IRC: Quit: My MacBook Pro has gone to sleep. ZZZzzz…
> [11:35:49]  noland: Right I do not plan to suggest that we will
> do this in git
> [11:36:11]  I just see that we are going to have to hack stuff
> up and it is not the type of work that lends itself well to branches.
> [11:36:17]  Ahh ok
> [11:36:56]  Once we come up with a solution for the shims, and
> we have something that can reasonably build and test hive we can figure out
> how to apply that to a branch/trunk
> [11:36:58]  yeah so just do a POC on github and then implement on
> svn
> [11:37:05]  cool
> [11:37:29]  Along the way we can probably find things that we
> can do like that common test I found and other minor things
> [11:37:41]  sounds good
> [11:37:50]  Those we can likely just commit into the current
> trunk and I will file issues for those now
> [11:37:58]  cool
> [11:38:41]  But yea man. I just cant take the project as it is
> now
> [11:38:51]  in eclipse every time I touch a file it rebuilds
> everything!
> [11:38:53]  Its like WTF
> [11:39:09]  Running one tests takes like 3 minutes
> [11:39:12]  its out of control
> [11:39:23]  LOL
> [11:39:29]  I agree 110%
> [11:39:32]  eclipse was not always like that I am not sure how
> the hell it happened
> [11:39:51]  The eclipse sep thing is so harmful
> [11:40:08]  dep thing that is
> [11:40:12]  I mean command line ant was always bad, but you
> used to be able to work in eclipse without having to rebuild everything
> every change/test
> [11:40:39]  Yeah the first thing I do these days is disable the
> ant builder
> [11:40:52]  Ow... I did not really know that was a thing
> [11:40:55]  it starts compiling while you are still working and
> blocks for minutes
> [11:41:02]  Right that is what I mean
> [11:41:11]  Everyone has like 10 hacks to work on the project
> [11:41:14]  yeah you can remove it in project…one sec
> [11:41:17]  perm gen
> [11:41:20]  a

[jira] [Commented] (HIVE-4545) HS2 should return describe table results without space padding

2013-08-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742811#comment-13742811
 ] 

Hive QA commented on HIVE-4545:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12596693/HIVE-4545.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 2887 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udtf_not_supported2
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/467/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/467/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

> HS2 should return describe table results without space padding
> --
>
> Key: HIVE-4545
> URL: https://issues.apache.org/jira/browse/HIVE-4545
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
> Attachments: HIVE-4545-1.patch, HIVE-4545.2.patch, HIVE-4545.3.patch
>
>
> HIVE-3140 changed the behavior of 'DESCRIBE table;' to be like 'DESCRIBE 
> FORMATTED table;'. HIVE-3140 introduced changes to not print the header in 
> 'DESCRIBE table;'. But jdbc/odbc calls still get fields padded with spaces 
> for the 'DESCRIBE table;' query.
> As the jdbc/odbc results are not for direct human consumption, the space 
> padding should not be done by HiveServer2.
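
To illustrate the client-side symptom, a hedged sketch using the standard JDBC API (the connection URL, credentials, and table name are hypothetical): until the fix, programmatic clients reading DESCRIBE results have to strip the padding themselves.

{noformat}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DescribePaddingSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "user", "");
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("DESCRIBE some_table");
        while (rs.next()) {
            // Without the fix, the strings come back space-padded to a fixed
            // width, so a client comparing column names must trim() first.
            String colName = rs.getString(1).trim();
            String colType = rs.getString(2).trim();
            System.out.println(colName + "\t" + colType);
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}
{noformat}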

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: [Discuss] project chop up

2013-08-16 Thread Xuefu Zhang
Thanks, Edward.

I'm a big +1 on mavenizing Hive. Hive has long reached a point where it's hard
to manage its build using ant. I'd like to help with this too.

Thanks,
Xuefu


On Fri, Aug 16, 2013 at 7:31 PM, Edward Capriolo wrote:

> For those interested in pitching in.
> https://github.com/edwardcapriolo/hive
>
>
>
> On Fri, Aug 16, 2013 at 11:58 AM, Edward Capriolo wrote:
>
> > Summary from hive-irc channel. Minor edits for spell check/grammar.
> >
> > The last 10 lines are a summary of the key points.
> >
> > [10:59:17]  noland: et al. Do you want to talk about hive in
> > maven?
> > [11:01:06] smonchi [~
> > ro...@host34-189-dynamic.23-79-r.retail.telecomitalia.it] has quit IRC:
> > Quit: ... 'cause there is no patch for human stupidity ...
> > [11:10:04]  ecapriolo: yeah that sounds good to me!
> > [11:10:22]  I saw you created the jira but haven't had time to
> look
> > [11:10:32]  So I found a few things
> > [11:10:49]  In common there are one or two tests that actually
> > fork a process :)
> > [11:10:56]  and use build.test.resources
> > [11:11:12]  Some serde tests use some methods from ql in testing
> > [11:11:27]  and shims really needs a separate hadoop test shim
> > [11:11:32]  But that is all simple stuff
> > [11:11:47]  The biggest problem is I do not know how to solve
> > shims with maven
> > [11:11:50]  do you have any ideas
> > [11:11:52]  ?
> > [11:13:00]  That one is going to be a challenge. It might be that
> > in that section we have to drop down to ant
> > [11:14:44]  Is it a requirement that we build both the .20 and
> .23
> > shims for a "package" as we do today?
> > [11:16:46]  I was thinking we can do it like a JDBC driver
> > [11:16:59]  We separate out the interface of shims
> > [11:17:22]  And then at runtime we drop in a driver implementing it
> > [11:17:34] Wertax [~wer...@wolfkamp.xs4all.nl] has quit IRC: Remote host
> > closed the connection
> > [11:17:36]  That or we could use maven's profile system
> > [11:18:09]  It seems that everything else can actually link
> > against hadoop-0.20.2 as a provided dependency
> > [11:18:37]  Yeah either would work. The driver method would
> > probably require us to use ant to build both the drivers?
> > [11:18:44]  I am a fan of mvn profiles
> > [11:19:05]  I was thinking we kinda separate the shim out into
> > its own project, not a module
> > [11:19:10]  to achieve that jdbc thing
> > [11:19:27]  But I do not have a solution yet, I was looking to
> > farm that out to someone smart...like you :)
> > [11:19:33]  :)
> > [11:19:47]  All I know is that we need a test shim because
> > HadoopShim requires hadoop-test jars
> > [11:20:10]  then the Mini stuff is only used in qtest anyway
> > [11:20:48]  Is this something you want to help with? I was
> > thinking of spinning up a github
> > [11:20:50]  I think that the separate projects would work and
> > perhaps nicely.
> > [11:21:01]  Yeah I'd be interested in helping!
> > [11:21:17]  But I am going on vacation starting next week for
> > about 10 days
> > [11:21:27]  Ah cool where are you going?
> > [11:21:37]  Netherlands
> > [11:21:42]  Biking around and such
> > [11:23:52]  The one thing I was thinking about with regards to a
> > branch is keeping history. We'll want to keep history for the files but
> > AFAICT svn doesn't understand git mv.
> > [11:24:16] Wertax [~wer...@wolfkamp.xs4all.nl] has joined #hive
> > [11:31:19] jeromatron [~text...@host90-152-1-162.ipv4.regusnet.com] has
> > quit IRC: Quit: My MacBook Pro has gone to sleep. ZZZzzz…
> > [11:35:49]  noland: Right I do not plan to suggest that we will
> > do this in git
> > [11:36:11]  I just see that we are going to have to hack stuff
> > up and it is not the type of work that lends itself well to branches.
> > [11:36:17]  Ahh ok
> > [11:36:56]  Once we come up with a solution for the shims, and
> > we have something that can reasonably build and test hive we can figure
> out
> > how to apply that to a branch/trunk
> > [11:36:58]  yeah so just do a POC on github and then implement on
> > svn
> > [11:37:05]  cool
> > [11:37:29]  Along the way we can probably find things that we
> > can do like that common test I found and other minor things
> > [11:37:41]  sounds good
> > [11:37:50]  Those we can likely just commit into the current
> > trunk and I will file issues for those now
> > [11:37:58]  cool
> > [11:38:41]  But yea man. I just cant take the project as it is
> > now
> > [11:38:51]  in eclipse every time I touch a file it rebuilds
> > everything!
> > [11:38:53]  Its like WTF
> > [11:39:09]  Running one tests takes like 3 minutes
> > [11:39:12]  its out of control
> > [11:39:23]  LOL
> > [11:39:29]  I agree 110%
> > [11:39:32]  eclipse was not always like that I am not sure how
> > the hell it happened
> > [11:39:51]  The eclipse sep thing is so harmful
> > [11:40:08]  dep thing that is
> > [11:40:12]  I mean command line ant was always bad, but you
> > used to be able to work in eclipse without having to rebuild everything
> > e

[jira] [Commented] (HIVE-4940) udaf_percentile_approx.q is not deterministic

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742850#comment-13742850
 ] 

Hudson commented on HIVE-4940:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2273 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2273/])
HIVE-4940 udaf_percentile_approx.q is not deterministic (Navis via Brock 
Noland) (brock: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514771)
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx.q
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx_20.q
* /hive/trunk/ql/src/test/queries/clientpositive/udaf_percentile_approx_23.q
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx_20.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out


> udaf_percentile_approx.q is not deterministic
> -
>
> Key: HIVE-4940
> URL: https://issues.apache.org/jira/browse/HIVE-4940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Fix For: 0.12.0
>
> Attachments: HIVE-4940.D12189.1.patch
>
>
> Produces different results on Hadoop 0.20(S) and 0.23.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742851#comment-13742851
 ] 

Hudson commented on HIVE-4838:
--

FAILURE: Integrated in Hive-trunk-h0.21 #2273 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2273/])
HIVE-4838 : Refactor MapJoin HashMap code to improve testability and 
readability (Brock Noland via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514760)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinMetaData.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin/MapJoinMemoryExhaustionException.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin/MapJoinMemoryExhaustionHandler.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractRowContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/DCLLItem.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MRU.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinDoubleKeys.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectSerDeContext.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinSingleKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestHashMapWrapper.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/mapjoin
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/mapjoin/TestMapJoinMemoryExhaustionHandler.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKeys.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinRowContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java


> Refactor MapJoin HashMap code to improve testability and readability
> 
>
> Key: HIVE-4838
> URL: https://issues.apache.org/jira/browse/HIVE-4838
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 0.12.0
>
> Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, 
> HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch
>
>
> MapJoin is an essential component for high performance joins in Hive and the 
> current code has done great service for many years. However, the code is 
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization 
> metadata to the Key, Row classes (see the sketch after this list).
> * The API of a logical "Table Container" is not defined and therefore it's 
> unclear what APIs HashMapWrapper needs to publicize. Additionally 
> HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and 
> implement the table container. Ideally these 
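
A hedged before/after sketch of the first issue in the list above (class and field names are hypothetical, not the actual Hive code): passing serialization metadata through an explicit context object instead of a static registry makes the classes unit-testable.

{noformat}
import java.util.HashMap;
import java.util.Map;

public class StaticStateSketch {

    // Before: a global static registry, mutated from afar and awkward to test.
    static final Map<Integer, String> STATIC_METADATA = new HashMap<Integer, String>();

    // After: serialization metadata travels as an explicit context object, so
    // a unit test can construct exactly the state it needs.
    static class SerDeContext {
        private final Map<Integer, String> metadata;
        SerDeContext(Map<Integer, String> metadata) { this.metadata = metadata; }
        String lookup(int tag) { return metadata.get(tag); }
    }

    public static void main(String[] args) {
        // The static-state style couples every caller to this one shared map:
        STATIC_METADATA.put(0, "key-serde");

        // The context style keeps the same data local and explicit:
        Map<Integer, String> metadata = new HashMap<Integer, String>();
        metadata.put(0, "key-serde");
        SerDeContext ctx = new SerDeContext(metadata);
        System.out.println(ctx.lookup(0)); // prints key-serde
    }
}
{noformat}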

[jira] [Commented] (HIVE-5105) HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up fieldPositionMap

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742857#comment-13742857
 ] 

Hudson commented on HIVE-5105:
--

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #62 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/62/])
HIVE-5105 HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up 
fieldPositionMap (Eugene Koifman via Sushanth Sowmyan) (khorgath: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514929)
* 
/hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/data/schema/HCatSchema.java
* 
/hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/data/schema/TestHCatSchema.java


> HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up 
> fieldPositionMap
> -
>
> Key: HIVE-5105
> URL: https://issues.apache.org/jira/browse/HIVE-5105
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 0.12.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 0.12.0
>
> Attachments: HIVE-5105.patch
>
>
> org.apache.hcatalog.data.schema.HCatSchema.remove(HCatFieldSchema 
> hcatFieldSchema) makes the following call:
> fieldPositionMap.remove(hcatFieldSchema);
> but fieldPositionMap is of type Map<String, Integer>, so the element is not 
> getting removed.
> Here's a detailed comment from [~sushanth]:
> The result is that the name will not be removed from fieldPositionMap. 
> This results in 2 things:
> a) If anyone tries to append a field to an HCatSchema after having removed 
> that field, it shouldn't fail, but it will.
> b) If anyone asks for the position of the removed field by name, it will 
> still give the position.
> Now, there is only one place in hcat code where we remove a field, and that 
> is called from HCatOutputFormat.setSchema, where we try to detect whether the 
> user specified partition column names in the schema when they shouldn't have, 
> and if they did, we remove it. Normally, people do not specify this, and this 
> check tends to be superfluous.
> Once we do this, we wind up serializing that new object (after performing 
> some validations), and this does appear to persist through the serialization 
> (and eventual deserialization), which is very worrying.
> However, we are luckily saved by the fact that we do not append that field to 
> it at any time (all appends in hcat code are done on newly initialized 
> HCatSchema objects which have had no removes done on them), and we don't ask 
> for the position of something we do not expect to be there (harder to verify 
> for certain, but it seems to be the case on inspection).
> The main part that worries me is that HCatSchema is part of our public 
> interface for HCat, in that M/R programs that use HCat can use it, and thus 
> they might have more interesting usage patterns that hit this bug.
> I can't think of any currently open bugs caused by this, given the rarity of 
> the situation, but nevertheless it is something we should fix 
> immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4838) Refactor MapJoin HashMap code to improve testability and readability

2013-08-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742863#comment-13742863
 ] 

Hudson commented on HIVE-4838:
--

ABORTED: Integrated in Hive-trunk-hadoop2 #365 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/365/])
HIVE-4838 : Refactor MapJoin HashMap code to improve testability and 
readability (Brock Noland via Ashutosh Chauhan) (hashutosh: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514760)
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinMetaData.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin/MapJoinMemoryExhaustionException.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mapjoin/MapJoinMemoryExhaustionHandler.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractMapJoinTableContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/AbstractRowContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/DCLLItem.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MRU.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinDoubleKeys.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectSerDeContext.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinRowContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinSingleKey.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainerSerDe.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestHashMapWrapper.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/mapjoin
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/mapjoin/TestMapJoinMemoryExhaustionHandler.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinEqualityTableContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKey.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinKeys.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinRowContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/TestMapJoinTableContainer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/persistence/Utilities.java


> Refactor MapJoin HashMap code to improve testability and readability
> 
>
> Key: HIVE-4838
> URL: https://issues.apache.org/jira/browse/HIVE-4838
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 0.12.0
>
> Attachments: HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, 
> HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch, HIVE-4838.patch
>
>
> MapJoin is an essential component for high performance joins in Hive and the 
> current code has done great service for many years. However, the code is 
> showing its age and currently suffers from the following issues:
> * Uses static state via the MapJoinMetaData class to pass serialization 
> metadata to the Key, Row classes.
> * The API of a logical "Table Container" is not defined and therefore it's 
> unclear what APIs HashMapWrapper needs to publicize. Additionally 
> HashMapWrapper has many unused public methods.
> * HashMapWrapper contains logic to serialize, test memory bounds, and 
> implement the table container. Ideally thes