[jira] [Updated] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4444: - Attachment: HIVE-4444-3.patch Adjust formatting. [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4444-1.patch, HIVE-4444-2.patch, HIVE-4444-3.patch Currently there are no "files" and "args" parameters in Hive. We shall add them to make it similar to Pig. NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: (was: HIVE-4443-4.patch) [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4443-1.patch, HIVE-4443-2.patch, HIVE-4443-3.patch, HIVE-4443-4.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest:
* GET queue - mark as deprecated
* GET queue/jobID - mark as deprecated
* DELETE queue/jobID - mark as deprecated
* GET jobs - return the list of JSON objects containing jobid but no detailed info
* GET jobs/fields=* - return the list of JSON objects containing detailed job info
* GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID)
* DELETE jobs/jobID - equivalent to DELETE queue/jobID
NO PRECOMMIT TESTS
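The proposed migration can be summarized as a mapping from each deprecated queue resource to its jobs replacement. A minimal sketch in plain Java (only the three deprecated resources listed above are included; any other resource spelling would be an assumption):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JobsApiMigration {
    // Deprecated WebHCat resource -> proposed replacement, per the list above.
    static Map<String, String> migration() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("GET queue", "GET jobs");
        m.put("GET queue/:jobid", "GET jobs/:jobid");
        m.put("DELETE queue/:jobid", "DELETE jobs/:jobid");
        return m;
    }

    public static void main(String[] args) {
        // Print the migration table: old resource -> new resource.
        migration().forEach((oldRes, newRes) ->
            System.out.println(oldRes + "  ->  " + newRes));
    }
}
```

The point of the new jobs resources is that one GET jobs call (optionally with detailed fields) replaces the N+1 calls (GET queue plus GET queue/:jobid per job) required today.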
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-4.patch [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4443-1.patch, HIVE-4443-2.patch, HIVE-4443-3.patch, HIVE-4443-4.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest:
* GET queue - mark as deprecated
* GET queue/jobID - mark as deprecated
* DELETE queue/jobID - mark as deprecated
* GET jobs - return the list of JSON objects containing jobid but no detailed info
* GET jobs/fields=* - return the list of JSON objects containing detailed job info
* GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID)
* DELETE jobs/jobID - equivalent to DELETE queue/jobID
NO PRECOMMIT TESTS
[jira] [Commented] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768557#comment-13768557 ] Daniel Dai commented on HIVE-4443: -- Test is in HIVE-5078. [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4443-1.patch, HIVE-4443-2.patch, HIVE-4443-3.patch, HIVE-4443-4.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest:
* GET queue - mark as deprecated
* GET queue/jobID - mark as deprecated
* DELETE queue/jobID - mark as deprecated
* GET jobs - return the list of JSON objects containing jobid but no detailed info
* GET jobs/fields=* - return the list of JSON objects containing detailed job info
* GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID)
* DELETE jobs/jobID - equivalent to DELETE queue/jobID
NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4444: - Attachment: HIVE-4444-5.patch [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4444-1.patch, HIVE-4444-2.patch, HIVE-4444-3.patch, HIVE-4444-4.patch, HIVE-4444-5.patch Currently there are no "files" and "args" parameters in Hive. We shall add them to make it similar to Pig. NO PRECOMMIT TESTS
[jira] [Commented] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768657#comment-13768657 ] Daniel Dai commented on HIVE-4444: -- Fixed. Sorry about that. [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4444-1.patch, HIVE-4444-2.patch, HIVE-4444-3.patch, HIVE-4444-4.patch, HIVE-4444-5.patch Currently there are no "files" and "args" parameters in Hive. We shall add them to make it similar to Pig. NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-9.patch
bq. Should this include e2e tests in addition to (or instead of) unit tests? If (when) Hadoop changes the log file format this will break, but unit tests won't catch this since the data that the tests parse is static.
There are e2e test cases in a separate ticket: HIVE-5078
bq. Here is a bunch of little things/nits:
bq. Server.java has {code}if (enablelog == true && !TempletonUtils.isset(statusdir)) throw new BadParam("enablelog is only applicable when statusdir is set");{code} in 4 different places. Can this be a method?
done
bq. What is the purpose of Server#misc()?
Should not be there, removed
bq. TempletonControllerJob: import org.apache.hive.hcatalog.templeton.Main; - unused import
done
bq. Line 173 - indentation is off?
done
bq. Line 295 - writer.close() - This writer is connected to System.err. What are the implications of closing this? What if something tries to write to it later?
No one after this point is writing to writer. We opened writer, so we need to close it in our code.
bq. TempletonUtils has unused imports - checkstyle needs to be run on the whole patch.
done
bq. TestJobIDParser mixes JUnit3 and JUnit4. It should either not extend TestCase (I vote for this) or not use @Test annotations
done
bq. Can JobIDParser (and all subclasses) be made package scoped since they are not used outside the templeton package? Similarly, can methods be made as private as possible?
done
bq. JobIDParser#parseJobID() has an "fname" param which is not used. What is the intent? Should it be used in the openStatusFile() call? If not, better to remove it.
We shall use it in openStatusFile(). Fixed.
bq. JobIDParser#openStatusFile() creates a Reader. Where/when is it being closed?
It should be closed in parseJobID. Fixed.
bq. Could the 2 member variables in JobIDParser be made private (even final)?
I can make them protected, but since they will be used in subclasses, I cannot make them private/final.
bq. Why is TestJobIDParser using findJobID() directly? Could it not use parseJobID()?
Because parseJobID is hardcoded to the standard output file for that parser, which is stderr in the current directory. In the test, I want to override it to test the input file in the test directory.
bq. Can JobIDParser have 1 line of class-level javadoc about the purpose of this class?
done
[WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does.
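For context on the review discussion, here is a minimal sketch of what a JobIDParser-style class does: open the tool's captured status output (e.g. stderr) and scan it for Hadoop job IDs with a per-tool pattern. The class and method names and the regex below are illustrative assumptions, not the code from the patch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a job-ID parser like the one discussed above.
public class SimpleJobIDParser {
    // Assumed Hadoop job-ID shape: "job_<cluster timestamp>_<sequence>".
    private static final Pattern JOB_ID = Pattern.compile("job_\\d+_\\d+");

    // Rough analogue of findJobID(): extract every job ID from the given text,
    // which in the real class would come from the status file opened by
    // openStatusFile().
    static List<String> findJobIDs(String statusFileContents) {
        List<String> ids = new ArrayList<>();
        Matcher m = JOB_ID.matcher(statusFileContents);
        while (m.find()) {
            ids.add(m.group());
        }
        return ids;
    }

    public static void main(String[] args) {
        String stderr = "Submitted application job_20130814023904691_0001\n"
                      + "Running job: job_20130814023904691_0002\n";
        // Prints: [job_20130814023904691_0001, job_20130814023904691_0002]
        System.out.println(findJobIDs(stderr));
    }
}
```

This also illustrates why the test wants to call the finder directly with a file of its choosing, rather than going through a parse method hardwired to stderr in the current directory.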
[jira] [Updated] (HIVE-5086) Fix scriptfile1.q on Windows
[ https://issues.apache.org/jira/browse/HIVE-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5086: - Attachment: HIVE-5086-2.patch Fixed unit test failure. Fix scriptfile1.q on Windows Key: HIVE-5086 URL: https://issues.apache.org/jira/browse/HIVE-5086 Project: Hive Issue Type: Bug Components: Tests, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5086-1.patch, HIVE-5086-2.patch Test failed with error message:
[junit] Task with the most failures(4):
[junit] -
[junit] Task ID:
[junit] task_20130814023904691_0001_m_00
[junit]
[junit] URL:
[junit] http://localhost:50030/taskdetails.jsp?jobid=job_20130814023904691_0001&tipid=task_20130814023904691_0001_m_00
[junit] -
[junit] Diagnostic Messages for this Task:
[junit] java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
[junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
[junit] at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
[junit] at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
[junit] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
[junit] at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
[junit] at org.apache.hadoop.mapred.Child.main(Child.java:265)
[junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
[junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:538)
[junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
[junit] ... 8 more
[junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 2]: Unable to initialize custom script.
[junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:357)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:528)
[junit] ... 9 more
[junit] Caused by: java.io.IOException: Cannot run program D:\tmp\hadoop-Administrator\mapred\local\3_0\taskTracker\Administrator\jobcache\job_20130814023904691_0001\attempt_20130814023904691_0001_m_00_3\work\.\testgrep: CreateProcess error=193, %1 is not a valid Win32 application
[junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
[junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:316)
[junit] ... 18 more
[junit] Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
[junit] at java.lang.ProcessImpl.create(Native Method)
[junit] at java.lang.ProcessImpl.<init>(ProcessImpl.java:81)
[junit] at java.lang.ProcessImpl.start(ProcessImpl.java:30)
[junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
[junit] ... 19 more
[junit]
[junit]
[junit] Exception: Client Execution failed with error code = 2
[junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
[junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 2
[junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
[junit] at junit.framework.Assert.fail(Assert.java:47)
[junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.runTest(TestMinimrCliDriver.java:122)
[junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_scriptfile1(TestMinimrCliDriver.java:104)
[junit
[jira] [Commented] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769949#comment-13769949 ] Daniel Dai commented on HIVE-4531: -- https://reviews.apache.org/r/14180/ [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-5086) Fix scriptfile1.q on Windows
[ https://issues.apache.org/jira/browse/HIVE-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5086: - Status: Patch Available (was: Open) Fix scriptfile1.q on Windows Key: HIVE-5086 URL: https://issues.apache.org/jira/browse/HIVE-5086 Project: Hive Issue Type: Bug Components: Tests, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5086-1.patch, HIVE-5086-2.patch Test failed with error message:
[junit] Task with the most failures(4):
[junit] -
[junit] Task ID:
[junit] task_20130814023904691_0001_m_00
[junit]
[junit] URL:
[junit] http://localhost:50030/taskdetails.jsp?jobid=job_20130814023904691_0001&tipid=task_20130814023904691_0001_m_00
[junit] -
[junit] Diagnostic Messages for this Task:
[junit] java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
[junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
[junit] at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
[junit] at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
[junit] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
[junit] at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
[junit] at org.apache.hadoop.mapred.Child.main(Child.java:265)
[junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
[junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:538)
[junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
[junit] ... 8 more
[junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 2]: Unable to initialize custom script.
[junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:357)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:528)
[junit] ... 9 more
[junit] Caused by: java.io.IOException: Cannot run program D:\tmp\hadoop-Administrator\mapred\local\3_0\taskTracker\Administrator\jobcache\job_20130814023904691_0001\attempt_20130814023904691_0001_m_00_3\work\.\testgrep: CreateProcess error=193, %1 is not a valid Win32 application
[junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
[junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:316)
[junit] ... 18 more
[junit] Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
[junit] at java.lang.ProcessImpl.create(Native Method)
[junit] at java.lang.ProcessImpl.<init>(ProcessImpl.java:81)
[junit] at java.lang.ProcessImpl.start(ProcessImpl.java:30)
[junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
[junit] ... 19 more
[junit]
[junit]
[junit] Exception: Client Execution failed with error code = 2
[junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
[junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 2
[junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
[junit] at junit.framework.Assert.fail(Assert.java:47)
[junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.runTest(TestMinimrCliDriver.java:122)
[junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_scriptfile1(TestMinimrCliDriver.java:104)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0
[jira] [Created] (HIVE-5303) metastore server OOM when exception happen
Daniel Dai created HIVE-5303: Summary: metastore server OOM when exception happen Key: HIVE-5303 URL: https://issues.apache.org/jira/browse/HIVE-5303 Project: Hive Issue Type: Bug Reporter: Daniel Dai The issue is described in HIVE-5099. HIVE-5099 fixed the issue of the metastore failing under some circumstances, but we still need to investigate why we get an OOM when the exception happens. The test case in HIVE-5099 is enough to reproduce the issue.
[jira] [Updated] (HIVE-5098) Fix metastore for SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5098: - Fix Version/s: (was: 0.12.0) Fix metastore for SQL Server Key: HIVE-5098 URL: https://issues.apache.org/jira/browse/HIVE-5098 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5098-1.patch, HIVE-5098-2.patch We found one problem while testing the SQL Server metastore. In the Hive code, we use the substring function with a single parameter in a datanucleus query (ExpressionTree.java):
{code}
if (partitionColumnIndex == (partitionColumnCount - 1)) {
  valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ")";
} else {
  valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").indexOf(\"/\"))";
}
{code}
SQL Server does not support single-parameter substring, and datanucleus does not fill the gap. The attached patch:
1. creates a new jar hive-datanucleusplugin.jar in $HIVE_HOME/lib
2. hive-datanucleusplugin.jar is a datanucleus plugin (includes plugin.xml, MANIFEST.MF)
3. The plugin provides a SQL Server-specific implementation of substring (which avoids using the single-param SUBSTRING that SQL Server does not support)
4. The plugin code only kicks in when the RDBMS is SQL Server
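The underlying incompatibility: T-SQL's SUBSTRING requires all three arguments (expression, start, length), while the JDOQL above emits a single-argument substring ("rest of the string"). A rough sketch of the equivalence such a plugin has to encode, expressed in plain Java (the actual SQL the plugin generates is not shown in this ticket, so this is illustrative only):

```java
public class SubstringEquivalence {
    // Java's one-argument substring: everything from 'start' to the end.
    // This is the form SQL Server cannot express with a two-argument SUBSTRING.
    static String singleArg(String s, int start) {
        return s.substring(start);
    }

    // The same result with an explicit length, mirroring what a T-SQL rewrite
    // must do: SUBSTRING(s, start + 1, LEN(s) - start) (T-SQL is 1-based).
    static String explicitLength(String s, int start) {
        return s.substring(start, start + (s.length() - start));
    }

    public static void main(String[] args) {
        String partitionName = "c=France/d=4";
        System.out.println(singleArg(partitionName, 2));       // France/d=4
        System.out.println(explicitLength(partitionName, 2));  // France/d=4
    }
}
```

Both calls return the same suffix; the plugin's job is to emit the explicit-length form whenever the datastore is SQL Server.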
[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5099: - Fix Version/s: (was: 0.12.0) Some partition publish operation cause OOM in metastore backed by SQL Server Key: HIVE-5099 URL: https://issues.apache.org/jira/browse/HIVE-5099 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5099-1.patch For certain metastore operation combinations, a metastore operation hangs and the metastore server eventually fails due to OOM. This happens when the metastore is backed by SQL Server. Here is a testcase to reproduce:
{code}
CREATE TABLE tbl_repro_oom1 (a STRING, b INT) PARTITIONED BY (c STRING, d STRING);
CREATE TABLE tbl_repro_oom_2 (a STRING) PARTITIONED BY (e STRING);
ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='France', d=4);
ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='Russia', d=3);
ALTER TABLE tbl_repro_oom_2 ADD PARTITION (e='Russia');
ALTER TABLE tbl_repro_oom1 DROP PARTITION (c = 'India'); --failure
{code}
The code causing the issue is in ExpressionTree.java:
{code}
valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").indexOf(\"/\"))";
{code}
The snapshot of the partition table before the drop partition statement is:
{code}
PART_ID  CREATE_TIME  LAST_ACCESS_TIME  PART_NAME     SD_ID  TBL_ID
93       1376526718   0                 c=France/d=4  127    33
94       1376526718   0                 c=Russia/d=3  128    33
95       1376526718   0                 e=Russia      129    34
{code}
The datanucleus query tries to find the value of a particular key by locating "$key=" as the start and "/" as the end. For example, it finds the value of c in c=France/d=4 by locating "c=" as the start and the following "/" as the end. However, this query fails if we try to find the value of e in e=Russia, since there is no trailing "/". Other databases work because the query plan first filters out the partitions not belonging to tbl_repro_oom1.
Whether this error surfaces or not depends on the query optimizer. When this exception happens, the metastore keeps retrying and throwing exceptions. The memory image of the metastore contains a large number of exception objects:
{code}
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter passed to the LEFT or SUBSTRING function.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet$FetchBuffer.nextRow(SQLServerResultSet.java:4762)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.fetchBufferNext(SQLServerResultSet.java:1682)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:955)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.datanucleus.store.rdbms.query.ForwardQueryResult.<init>(ForwardQueryResult.java:90)
at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:686)
at org.datanucleus.store.query.Query.executeQuery(Query.java:1791)
at org.datanucleus.store.query.Query.executeWithMap(Query.java:1694)
at org.datanucleus.api.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1715)
at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1590)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy4.getPartitionsByFilter(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2163)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at $Proxy5.get_partitions_by_filter(Unknown Source)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult
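The failure mode described above can be reproduced in plain Java with the same indexOf/substring logic the query expresses. This is a sketch for illustration, not the actual metastore code; the method name is hypothetical:

```java
public class PartitionNameParse {
    // Mimics the value-extraction logic of the JDOQL query: find "key=",
    // skip past it, and take everything up to the next '/'.
    static String extractValue(String partitionName, String key) {
        String keyEqual = key + "=";
        int start = partitionName.indexOf(keyEqual) + keyEqual.length();
        String rest = partitionName.substring(start);
        // The fragile step: when the key is the last component there is no
        // trailing '/', indexOf returns -1, and substring(0, -1) throws --
        // the Java analogue of SQL Server's "Invalid length parameter passed
        // to the LEFT or SUBSTRING function".
        return rest.substring(0, rest.indexOf('/'));
    }

    public static void main(String[] args) {
        // Works: 'c' is followed by "/d=4" in the partition name.
        System.out.println(extractValue("c=France/d=4", "c")); // France
        try {
            // Fails: 'e' is the only component, so there is no trailing '/'.
            extractValue("e=Russia", "e");
        } catch (StringIndexOutOfBoundsException ex) {
            System.out.println("failed for e=Russia");
        }
    }
}
```

This is why rows from tbl_repro_oom_2 (e=Russia) break the query when they are not filtered out before the substring runs.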
[jira] [Commented] (HIVE-5167) webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh
[ https://issues.apache.org/jira/browse/HIVE-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770166#comment-13770166 ] Daniel Dai commented on HIVE-5167: -- +1 webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh --- Key: HIVE-5167 URL: https://issues.apache.org/jira/browse/HIVE-5167 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-5167.1.patch, HIVE-5167.2.patch HIVE-4820 introduced checks for env variables, but it does so before sourcing webhcat-env.sh. This order needs to be reversed.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-10.patch Addressed review comments by [~ekoifman]. [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-10.patch, HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-11.patch HIVE-4531-11.patch refines the exception handling code a bit. [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-10.patch, HIVE-4531-11.patch, HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does.
[jira] [Commented] (HIVE-5092) Fix hiveserver2 mapreduce local job on Windows
[ https://issues.apache.org/jira/browse/HIVE-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776811#comment-13776811 ] Daniel Dai commented on HIVE-5092: -- This is only supposed to be used on Windows; we don't have the above-mentioned issue on Linux. Also noticed the cmd script patch HIVE-3129 is not committed; this JIRA depends on it. Fix hiveserver2 mapreduce local job on Windows -- Key: HIVE-5092 URL: https://issues.apache.org/jira/browse/HIVE-5092 Project: Hive Issue Type: Bug Components: HiveServer2, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5092-1.patch HiveServer2 fails when a MapReduce local job fails. For example: {code} select /*+ MAPJOIN(v) */ registration from studenttab10k s join votertab10k v on (s.name = v.name); {code} The root cause is a class-not-found error in the local hadoop job (MapredLocalTask.execute): HADOOP_CLASSPATH does not include $HIVE_HOME/lib. Setting HADOOP_CLASSPATH correctly fixes the issue. However, there is one complexity on Windows. We start HiveServer2 using the Windows service console (services.msc), which takes the hiveserver2.xml generated by hive.cmd. There is no way to pass an environment variable in hiveserver2.xml (weird, but that's the reality). I attach a patch which passes it through command-line arguments and relays it to HADOOP_CLASSPATH in the Hive code.
[jira] [Updated] (HIVE-5031) [WebHCat] GET job/:jobid to return userargs for a job in addition to status information
[ https://issues.apache.org/jira/browse/HIVE-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5031: - Attachment: HIVE-5031-5.patch Resync with trunk. [WebHCat] GET job/:jobid to return userargs for a job in addition to status information -- Key: HIVE-5031 URL: https://issues.apache.org/jira/browse/HIVE-5031 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5031-1.patch, HIVE-5031-2.patch, HIVE-5031-3.patch, HIVE-5031-4.patch, HIVE-5031-5.patch It would be nice to also return any user args that were passed into the job creation API, including job-type-specific information (e.g. mapreduce libjars). NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5274) HCatalog package renaming backward compatibility follow-up
[ https://issues.apache.org/jira/browse/HIVE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776919#comment-13776919 ] Daniel Dai commented on HIVE-5274: -- +1 HCatalog package renaming backward compatibility follow-up -- Key: HIVE-5274 URL: https://issues.apache.org/jira/browse/HIVE-5274 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.12.0 Attachments: HIVE-5274.2.patch, HIVE-5274.3.patch, HIVE-5274.4.patch As part of HIVE-4869, the hbase storage handler in hcat was moved to org.apache.hive.hcatalog, and then put back to org.apache.hcatalog since it was intended to be deprecated as well. However, it imports and uses several org.apache.hive.hcatalog classes. This needs to be changed to use org.apache.hcatalog classes. == Note: The above is a complete description of this issue in and of itself; the following is more detail on the backward-compatibility goal I have (not saying that each of these things is violated): a) People using org.apache.hcatalog packages should continue being able to use that package, and see no difference at compile time or runtime. All code here is considered deprecated, and will be gone by the time hive 0.14 rolls around. Additionally, org.apache.hcatalog should behave as if it were 0.11 for all compatibility purposes. b) People using org.apache.hive.hcatalog packages should never have an org.apache.hcatalog dependency injected in. Thus, it is okay for org.apache.hcatalog to use org.apache.hive.hcatalog packages internally (say HCatUtil, for example), as long as any interfaces only expose org.apache.hcatalog.\* For tests that test org.apache.hcatalog.\*, we must be capable of testing it from a pure org.apache.hcatalog.\* world. It is never okay for org.apache.hive.hcatalog to use org.apache.hcatalog, even in tests. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5035) [WebHCat] Hardening parameters for Windows
[ https://issues.apache.org/jira/browse/HIVE-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5035: - Attachment: HIVE-5035-2.patch Resync with trunk and fix checkstyle warning. [WebHCat] Hardening parameters for Windows -- Key: HIVE-5035 URL: https://issues.apache.org/jira/browse/HIVE-5035 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5035-1.patch, HIVE-5035-2.patch Everything passed to the pig/hive/hadoop command line will be quoted. That includes: mapreducejar (libjars, arg, define); mapreducestream (cmdenv, define, arg); pig (arg, execute); hive (arg, define, execute). NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
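The hardening above boils down to quoting every user-supplied value before it lands on a pig/hive/hadoop command line. A minimal sketch of the idea, using POSIX quoting via Python's shlex (the actual patch targets Windows cmd.exe, whose quoting rules differ, and the `build_command` helper is illustrative, not WebHCat's code):

```python
import shlex

def build_command(prog, user_args):
    """Quote each user-supplied argument so shell metacharacters
    (spaces, semicolons, quotes) are delivered to the program as
    literal characters instead of being interpreted by the shell."""
    return " ".join([prog] + [shlex.quote(a) for a in user_args])

# Unquoted, the semicolon would end the command and start a new one;
# quoted, the whole string arrives as a single literal argument.
print(build_command("hive", ["-e", "select 1; drop table t;"]))
```

The same reasoning is why each of the listed parameters (libjars, arg, define, cmdenv, execute) must be quoted: all of them carry values chosen by the remote caller.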
[jira] [Updated] (HIVE-5066) [WebHCat] Other code fixes for Windows
[ https://issues.apache.org/jira/browse/HIVE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5066: - Attachment: HIVE-5066-4.patch Fixed checkstyle warnings. [WebHCat] Other code fixes for Windows -- Key: HIVE-5066 URL: https://issues.apache.org/jira/browse/HIVE-5066 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5034-1.patch, HIVE-5066-2.patch, HIVE-5066-3.patch, HIVE-5066-4.patch This is equivalent to HCATALOG-526, but updated to sync with latest trunk. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5034) [WebHCat] Make WebHCat work for Windows
[ https://issues.apache.org/jira/browse/HIVE-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-5034. -- Resolution: Fixed All sub-tasks are resolved. Close the ticket. [WebHCat] Make WebHCat work for Windows --- Key: HIVE-5034 URL: https://issues.apache.org/jira/browse/HIVE-5034 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 This is the umbrella Jira to fix WebHCat on Windows. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5458) [WebHCat] Missing test.other.user.name parameter in e2e build.xml
[ https://issues.apache.org/jira/browse/HIVE-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5458: - Status: Patch Available (was: Open) [WebHCat] Missing test.other.user.name parameter in e2e build.xml - Key: HIVE-5458 URL: https://issues.apache.org/jira/browse/HIVE-5458 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5458-1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5458) [WebHCat] Missing test.other.user.name parameter in e2e build.xml
[ https://issues.apache.org/jira/browse/HIVE-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5458: - Attachment: HIVE-5458-1.patch [WebHCat] Missing test.other.user.name parameter in e2e build.xml - Key: HIVE-5458 URL: https://issues.apache.org/jira/browse/HIVE-5458 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5458-1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5458) [WebHCat] Missing test.other.user.name parameter in e2e build.xml
Daniel Dai created HIVE-5458: Summary: [WebHCat] Missing test.other.user.name parameter in e2e build.xml Key: HIVE-5458 URL: https://issues.apache.org/jira/browse/HIVE-5458 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5458-1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5507) [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness
[ https://issues.apache.org/jira/browse/HIVE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5507: - Attachment: HIVE-5507-1.patch [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness - Key: HIVE-5507 URL: https://issues.apache.org/jira/browse/HIVE-5507 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5507-1.patch When we run templeton e2e tests, we need to specify the test.other.user.name parameter for a second templeton user; it is missing from build.xml. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5507) [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness
Daniel Dai created HIVE-5507: Summary: [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness Key: HIVE-5507 URL: https://issues.apache.org/jira/browse/HIVE-5507 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5507-1.patch When we run templeton e2e tests, we need to specify the test.other.user.name parameter for a second templeton user; it is missing from build.xml. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5507) [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness
[ https://issues.apache.org/jira/browse/HIVE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5507: - Status: Patch Available (was: Open) [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness - Key: HIVE-5507 URL: https://issues.apache.org/jira/browse/HIVE-5507 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5507-1.patch When we run templeton e2e tests, we need to specify the test.other.user.name parameter for a second templeton user; it is missing from build.xml. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5508) [WebHCat] ignore log collector e2e tests for Hadoop 2
Daniel Dai created HIVE-5508: Summary: [WebHCat] ignore log collector e2e tests for Hadoop 2 Key: HIVE-5508 URL: https://issues.apache.org/jira/browse/HIVE-5508 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 The log collector currently only works with Hadoop 1; if run under Hadoop 2, no logs are collected. Templeton e2e tests check for the existence of those logs, so they will fail under Hadoop 2 and need to be disabled when running there. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5508) [WebHCat] ignore log collector e2e tests for Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5508: - Attachment: HIVE-5508-1.patch [WebHCat] ignore log collector e2e tests for Hadoop 2 - Key: HIVE-5508 URL: https://issues.apache.org/jira/browse/HIVE-5508 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5508-1.patch The log collector currently only works with Hadoop 1; if run under Hadoop 2, no logs are collected. Templeton e2e tests check for the existence of those logs, so they will fail under Hadoop 2 and need to be disabled when running there. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5509) [WebHCat] TestDriverCurl to use string comparison for jobid
[ https://issues.apache.org/jira/browse/HIVE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5509: - Attachment: HIVE-5509-1.patch [WebHCat] TestDriverCurl to use string comparison for jobid --- Key: HIVE-5509 URL: https://issues.apache.org/jira/browse/HIVE-5509 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5509-1.patch In TestDriverCurl.pm, we sort the job status array returned by templeton using: {code} sort { $a->{id} <=> $b->{id} } {code} However, <=> is used to compare numbers and jobid is a string, so the comparison is wrong. This causes test JOBS_4 to fail in some cases. -- This message was sent by Atlassian JIRA (v6.1#6144)
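The bug above is the classic numeric-vs-string comparison mix-up: Perl's `<=>` numifies a jobid like `job_201310071234_0001` to 0, so every comparison returns 0 and the resulting order is undefined, while `cmp` compares the strings, whose zero-padded fields sort chronologically. The same contrast sketched in Python (the jobids here are made up for illustration):

```python
# Hadoop jobids are strings; their zero-padded fields make plain
# lexicographic (string) order match chronological order.
ids = ["job_201310071234_0010",
       "job_201310071234_0002",
       "job_201310071234_0001"]

# String comparison (Perl's cmp) gives the intended order.
print(sorted(ids))

# A numeric sort would have to parse out the trailing sequence number;
# treating the whole id as one number (Perl's <=>) is simply wrong.
by_seq = sorted(ids, key=lambda j: int(j.rsplit("_", 1)[1]))
print(by_seq)
```

Both sorts agree here, which is exactly why the one-character fix from `<=>` to `cmp` is sufficient.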
[jira] [Updated] (HIVE-5508) [WebHCat] ignore log collector e2e tests for Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5508: - Status: Patch Available (was: Open) [WebHCat] ignore log collector e2e tests for Hadoop 2 - Key: HIVE-5508 URL: https://issues.apache.org/jira/browse/HIVE-5508 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5508-1.patch The log collector currently only works with Hadoop 1; if run under Hadoop 2, no logs are collected. Templeton e2e tests check for the existence of those logs, so they will fail under Hadoop 2 and need to be disabled when running there. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5509) [WebHCat] TestDriverCurl to use string comparison for jobid
Daniel Dai created HIVE-5509: Summary: [WebHCat] TestDriverCurl to use string comparison for jobid Key: HIVE-5509 URL: https://issues.apache.org/jira/browse/HIVE-5509 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5509-1.patch In TestDriverCurl.pm, we sort the job status array returned by templeton using: {code} sort { $a->{id} <=> $b->{id} } {code} However, <=> is used to compare numbers and jobid is a string, so the comparison is wrong. This causes test JOBS_4 to fail in some cases. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5509) [WebHCat] TestDriverCurl to use string comparison for jobid
[ https://issues.apache.org/jira/browse/HIVE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5509: - Status: Patch Available (was: Open) [WebHCat] TestDriverCurl to use string comparison for jobid --- Key: HIVE-5509 URL: https://issues.apache.org/jira/browse/HIVE-5509 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5509-1.patch In TestDriverCurl.pm, we sort the job status array returned by templeton using: {code} sort { $a->{id} <=> $b->{id} } {code} However, <=> is used to compare numbers and jobid is a string, so the comparison is wrong. This causes test JOBS_4 to fail in some cases. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
Daniel Dai created HIVE-5510: Summary: [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch GET job/queue of a TempletonController job returns wrong information: a mix of the child job and the controller job itself. It should only return the information of the controller job itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Status: Patch Available (was: Open) [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch GET job/queue of a TempletonController job returns wrong information: a mix of the child job and the controller job itself. It should only return the information of the controller job itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Attachment: HIVE-5510-1.patch [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch GET job/queue of a TempletonController job returns wrong information: a mix of the child job and the controller job itself. It should only return the information of the controller job itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4441) [WebHCat] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Attachment: HIVE-4441-1.patch [WebHCat] WebHCat does not honor user home directory Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Reporter: Daniel Dai Attachments: HIVE-4441-1.patch If I submit a job as user A and specify statusdir as a relative path, I would expect results to be stored in a folder relative to user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d 'execute=show+tables;' -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
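The expected behavior amounts to resolving a relative statusdir against the requesting user's HDFS home directory (/user/&lt;user.name&gt;) rather than the server user's. A sketch of that resolution rule, where posixpath stands in for HDFS path handling and the function name is illustrative, not WebHCat's actual code:

```python
import posixpath

def resolve_statusdir(statusdir, user_name):
    """Resolve a relative statusdir against /user/<user.name>,
    the requesting user's HDFS home, instead of the server's.
    Absolute paths are left untouched."""
    if posixpath.isabs(statusdir):
        return statusdir
    return posixpath.join("/user", user_name, statusdir)

print(resolve_statusdir("pokes.output", "hdinsightuser"))
print(resolve_statusdir("/tmp/out", "hdinsightuser"))
```

With this rule, the curl example above would land its results under /user/hdinsightuser/pokes.output as expected.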
[jira] [Updated] (HIVE-4441) [HCatalog] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Summary: [HCatalog] WebHCat does not honor user home directory (was: [WebHCat] WebHCat does not honor user home directory) [HCatalog] WebHCat does not honor user home directory - Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Reporter: Daniel Dai Attachments: HIVE-4441-1.patch If I submit a job as user A and specify statusdir as a relative path, I would expect results to be stored in a folder relative to user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d 'execute=show+tables;' -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
Daniel Dai created HIVE-4444: Summary: [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Currently there are no files and args parameters in Hive. We should add them to make Hive similar to Pig. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Component/s: HCatalog [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
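The motivation above is replacing an N+1 request pattern (GET queue, then GET queue/:jobid per job) with a single request. A toy in-memory model of the proposed endpoints — purely illustrative, not WebHCat's implementation:

```python
# Toy server state: per-job details keyed by jobid.
JOBS = {
    "job_0001": {"status": "SUCCEEDED", "user": "alice"},
    "job_0002": {"status": "RUNNING",   "user": "bob"},
}

def get_queue():
    """Deprecated pattern: returns only the jobids."""
    return sorted(JOBS)

def get_queue_job(jobid):
    """Deprecated pattern: returns details for a single job."""
    return JOBS[jobid]

def get_jobs(fields=None):
    """Proposed: one call returns ids, or full details with fields='*'."""
    if fields == "*":
        return [{"id": j, "detail": JOBS[j]} for j in sorted(JOBS)]
    return [{"id": j} for j in sorted(JOBS)]

# Old pattern: 1 + N requests to build a summary.
summary_old = [get_queue_job(j) for j in get_queue()]
# New pattern: one request.
summary_new = get_jobs(fields="*")
print(len(summary_old), len(summary_new))
```

The round-trip savings grow linearly with the number of jobs, which is the whole point of the proposal.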
[jira] [Updated] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call
[ https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4442: - Component/s: HCatalog [HCatalog] WebHCat should not override user.name parameter for Queue call - Key: HIVE-4442 URL: https://issues.apache.org/jira/browse/HIVE-4442 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Currently templeton's Queue call uses user.name to filter the results of the call in addition to the default security. Ideally the filter would be an optional parameter to the call, independent of the security check. I would suggest a parameter, in addition to GET queue (jobs), that gives you all the jobs a user has permission to see: GET queue?showall=true -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4441) [HCatalog] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Component/s: HCatalog [HCatalog] WebHCat does not honor user home directory - Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4441-1.patch If I submit a job as user A and specify statusdir as a relative path, I would expect results to be stored in a folder relative to user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d 'execute=show+tables;' -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call
[ https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4442: - Attachment: HIVE-4442-1.patch [HCatalog] WebHCat should not override user.name parameter for Queue call - Key: HIVE-4442 URL: https://issues.apache.org/jira/browse/HIVE-4442 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4442-1.patch Currently templeton's Queue call uses user.name to filter the results of the call in addition to the default security. Ideally the filter would be an optional parameter to the call, independent of the security check. I would suggest a parameter, in addition to GET queue (jobs), that gives you all the jobs a user has permission to see: GET queue?showall=true -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-1.patch Attach patch. The patch also contains e2e tests for HIVE-4442, because HIVE-4442 and HIVE-4443 are closely intertwined and it is hard to separate the tests. [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: (was: HIVE-4443-1.patch) [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-1.patch [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call
[ https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644674#comment-13644674 ] Daniel Dai commented on HIVE-4442: -- Attach patch. Note the e2e tests are intertwined with HIVE-4443; I include all tests in HIVE-4443. [HCatalog] WebHCat should not override user.name parameter for Queue call - Key: HIVE-4442 URL: https://issues.apache.org/jira/browse/HIVE-4442 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4442-1.patch Currently templeton's Queue call uses user.name to filter the results of the call in addition to the default security. Ideally the filter would be an optional parameter to the call, independent of the security check. I would suggest a parameter, in addition to GET queue (jobs), that gives you all the jobs a user has permission to see: GET queue?showall=true -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4444: - Attachment: HIVE-4444-1.patch [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4444-1.patch Currently there are no files and args parameters in Hive. We should add them to make Hive similar to Pig. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: (was: HIVE-4443-1.patch) [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-1.patch [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects containing jobids but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed Job info * GET jobs/jobID - return the single JSON object containing the detailed Job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID
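The proposed URL scheme above can be sketched as a small helper. The endpoint names come from the proposal; the helper function itself is hypothetical, and the proposal's `GET jobs/fields=*` is rendered here as a query parameter, which is one plausible reading of that shorthand.

```python
# Sketch of the proposed 'jobs' resources that would replace the deprecated
# 'queue' resources. The URL builder is hypothetical; only the resource
# names are taken from the proposal.
BASE = "templeton/v1"

def jobs_url(jobid=None, fields=None):
    """Build a URL for the proposed GET jobs API."""
    if jobid is not None:
        return f"{BASE}/jobs/{jobid}"    # detailed info for one job
    if fields == "*":
        return f"{BASE}/jobs?fields=*"   # detailed info for all jobs
    return f"{BASE}/jobs"                # jobids only, no detail

print(jobs_url())                        # templeton/v1/jobs
print(jobs_url(fields="*"))              # templeton/v1/jobs?fields=*
print(jobs_url("job_201310140001"))      # templeton/v1/jobs/job_201310140001
```

A client that previously issued N+1 calls (`GET queue`, then `GET queue/:jobid` per job) would issue the single `fields=*` form instead.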
[jira] [Updated] (HIVE-4446) [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444
[ https://issues.apache.org/jira/browse/HIVE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4446: - Attachment: HIVE-4446-1.patch [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE- Key: HIVE-4446 URL: https://issues.apache.org/jira/browse/HIVE-4446 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4446-1.patch
[jira] [Commented] (HIVE-4465) webhcat e2e tests succeed regardless of exitvalue
[ https://issues.apache.org/jira/browse/HIVE-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646421#comment-13646421 ] Daniel Dai commented on HIVE-4465: -- +1 webhcat e2e tests succeed regardless of exitvalue - Key: HIVE-4465 URL: https://issues.apache.org/jira/browse/HIVE-4465 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4465.patch Currently the webhcat tests that check job status for Pig, Hive, and MR do not check the exit value of the job. So a job can fail and the test will still succeed.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-1.patch Attaching initial patch. [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: (was: HIVE-4531-1.patch) [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-1.patch [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-2.patch [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Release Note: In a POST pig/hive/jar/streaming request, if statusdir is set and enablelog=true, webhcat will create a $statusdir/logs directory after the task finishes. The attempts here include both completed attempts and failed attempts. The layout of the logs directory is: logs/$job_id (directory for $job_id) logs/$job_id/job.xml.html logs/$job_id/$attempt_id (directory for $attempt_id) logs/$job_id/$attempt_id/stderr logs/$job_id/$attempt_id/stdout logs/$job_id/$attempt_id/syslog [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
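The directory layout in the release note can be reconstructed as a path-building sketch. The job and attempt ids below are made up for illustration; only the `logs/$job_id/$attempt_id/{stderr,stdout,syslog}` shape comes from the release note.

```python
import posixpath

# Rebuild the $statusdir/logs layout from the release note for one attempt.
# Job and attempt ids here are invented examples.
def attempt_log_paths(statusdir, job_id, attempt_id):
    attempt_dir = posixpath.join(statusdir, "logs", job_id, attempt_id)
    return [posixpath.join(attempt_dir, name) for name in ("stderr", "stdout", "syslog")]

paths = attempt_log_paths("hive.output", "job_201310140001", "attempt_201310140001_0001_m_000000_0")
for p in paths:
    print(p)
```

So a client that knows the job's statusdir can locate every attempt's stderr/stdout/syslog without any extra API call, by listing `logs/$job_id` on HDFS.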
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-3.patch Fixed several bugs in HIVE-4531-3.patch [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-4.patch Adding documentation. [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4441) [HCatalog] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Attachment: HIVE-4441-2.patch Found one bug in the original patch. Attaching HIVE-4441-2.patch. [HCatalog] WebHCat does not honor user home directory - Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4441-1.patch, HIVE-4441-2.patch If I submit a job as user A and I specify statusdir as a relative path, I would expect results to be stored in the folder relative to user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d execute=show+tables; -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code}
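The expected behavior in the report can be sketched as a resolution rule: a relative statusdir should resolve against the HDFS home of the `user.name` submitter, not the server user's home. The helper below is hypothetical (the real code is Java working with Hadoop `Path` objects) and assumes the conventional `/user/<name>` HDFS home layout.

```python
import posixpath

# Expected resolution rule from the bug report: relative statusdir paths
# resolve against the submitting user's HDFS home (/user/<user.name> is
# assumed here as the conventional home layout).
def resolve_statusdir(statusdir, user_name):
    if posixpath.isabs(statusdir):
        return statusdir                 # absolute paths are used verbatim
    return posixpath.join("/user", user_name, statusdir)

print(resolve_statusdir("pokes.output", "hdinsightuser"))  # /user/hdinsightuser/pokes.output
print(resolve_statusdir("/tmp/out", "hdinsightuser"))      # /tmp/out
```

The bug is that the server effectively resolved against its own user (`hdp`) instead of the `user.name` carried in the request.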
[jira] [Updated] (HIVE-5453) jobsubmission2.conf should use 'timeout' property
[ https://issues.apache.org/jira/browse/HIVE-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5453: - Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1. Change committed to trunk. jobsubmission2.conf should use 'timeout' property - Key: HIVE-5453 URL: https://issues.apache.org/jira/browse/HIVE-5453 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5453.patch TestDriverCurl.pm used to support timeout_seconds, which got renamed to 'timeout'. This makes the TestHeartbeat test fail. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5448) webhcat duplicate test TestMapReduce_2 should be removed
[ https://issues.apache.org/jira/browse/HIVE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5448: - Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1. Patch committed to trunk. webhcat duplicate test TestMapReduce_2 should be removed Key: HIVE-5448 URL: https://issues.apache.org/jira/browse/HIVE-5448 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 Attachments: HIVE-5448.1.patch TestMapReduce_2 in jobsubmission.conf should be removed, as it is a duplicate of TestHeartbeat_2 in jobsubmission2.conf NO PRECOMMIT TESTS
[jira] [Created] (HIVE-5535) [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022
Daniel Dai created HIVE-5535: Summary: [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 Key: HIVE-5535 URL: https://issues.apache.org/jira/browse/HIVE-5535 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5535-1.patch The test complains about having no permission to the output directory /tmp/templeton_test_out/$runid. This is because /tmp/templeton_test_out/$runid is created with umask 022 by user test.other.user.name (the userid of the first test in the group JOBS_1). Other users cannot write to it (JOBS_2 runs as userid test.user.name).
[jira] [Updated] (HIVE-5535) [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022
[ https://issues.apache.org/jira/browse/HIVE-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5535: - Attachment: HIVE-5535-1.patch [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 --- Key: HIVE-5535 URL: https://issues.apache.org/jira/browse/HIVE-5535 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5535-1.patch The test complains about having no permission to the output directory /tmp/templeton_test_out/$runid. This is because /tmp/templeton_test_out/$runid is created with umask 022 by user test.other.user.name (the userid of the first test in the group JOBS_1). Other users cannot write to it (JOBS_2 runs as userid test.user.name).
[jira] [Updated] (HIVE-5535) [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022
[ https://issues.apache.org/jira/browse/HIVE-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5535: - Status: Patch Available (was: Open) [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 --- Key: HIVE-5535 URL: https://issues.apache.org/jira/browse/HIVE-5535 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5535-1.patch The test complains about having no permission to the output directory /tmp/templeton_test_out/$runid. This is because /tmp/templeton_test_out/$runid is created with umask 022 by user test.other.user.name (the userid of the first test in the group JOBS_1). Other users cannot write to it (JOBS_2 runs as userid test.user.name).
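The permission arithmetic behind this failure is simple to verify: with umask 022, a directory requested with mode 0777 is actually created as 0755, so group and others lose the write bit and a second test user cannot write into it.

```python
# With umask 022, a directory requested as 0777 is created as
# 0777 & ~022 = 0755: owner rwx, group/other r-x only. A different
# user (here JOBS_2's test.user.name) therefore cannot write into
# /tmp/templeton_test_out/$runid created by test.other.user.name.
umask = 0o022
mode = 0o777 & ~umask
print(oct(mode))                    # 0o755
other_can_write = bool(mode & 0o002)
print(other_can_write)              # False
```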
[jira] [Updated] (HIVE-5508) [WebHCat] ignore log collector e2e tests for Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5508: - Attachment: HIVE-5508-2.patch Removed unrelated code (which was borrowed from the Pig harness). [WebHCat] ignore log collector e2e tests for Hadoop 2 - Key: HIVE-5508 URL: https://issues.apache.org/jira/browse/HIVE-5508 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5508-1.patch, HIVE-5508-2.patch The log collector currently only works with Hadoop 1. If run under Hadoop 2, no log will be collected. The Templeton e2e tests check the existence of those logs, so they will fail under Hadoop 2. We need to disable them when running under Hadoop 2.
[jira] [Created] (HIVE-5541) [WebHCat] Log collector does not work since we don't close the hdfs status file
Daniel Dai created HIVE-5541: Summary: [WebHCat] Log collector does not work since we don't close the hdfs status file Key: HIVE-5541 URL: https://issues.apache.org/jira/browse/HIVE-5541 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0
[jira] [Updated] (HIVE-5541) [WebHCat] Log collector does not work since we don't close the hdfs status file
[ https://issues.apache.org/jira/browse/HIVE-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5541: - Resolution: Duplicate Status: Resolved (was: Patch Available) Didn't realize we already opened HIVE-5540 for that. Closing as duplicate. [WebHCat] Log collector does not work since we don't close the hdfs status file --- Key: HIVE-5541 URL: https://issues.apache.org/jira/browse/HIVE-5541 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5541-1.patch
[jira] [Updated] (HIVE-5541) [WebHCat] Log collector does not work since we don't close the hdfs status file
[ https://issues.apache.org/jira/browse/HIVE-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5541: - Status: Patch Available (was: Open) [WebHCat] Log collector does not work since we don't close the hdfs status file --- Key: HIVE-5541 URL: https://issues.apache.org/jira/browse/HIVE-5541 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5541-1.patch
[jira] [Updated] (HIVE-5540) webhcat e2e test failures
[ https://issues.apache.org/jira/browse/HIVE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5540: - Attachment: HIVE-5540-1.patch webhcat e2e test failures - Key: HIVE-5540 URL: https://issues.apache.org/jira/browse/HIVE-5540 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Daniel Dai Attachments: HIVE-5540-1.patch, test_harnesss_1381796256 With current state of trunk repo (see below) WebHCat e2e tests have 4 errors all of the same type. Expect 1 jobs in logs, but get 1 Test run log file attached commit b612a91f7f09f45474f593f99039ec78d2c03b68 Author: Edward Capriolo ecapri...@apache.org Date: Mon Oct 14 21:40:44 2013 + An explode function that includes the item's position in the array (Niko Stahl via egc) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532108 13f79535-47bb-0310-9956-ffa450edef68 commit ad18f0747a3448fc8cda2197df6223e8abc93dc6 Author: Brock Noland br...@apache.org Date: Mon Oct 14 21:22:12 2013 + HIVE-5423 - Speed up testing of scalar UDFS (Edward Capriolo via Brock Noland) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532103 13f79535-47bb-0310-9956-ffa450edef68 commit 4a2152b0f7df5a3f42f8577a27c2a9269697def1 Author: Thejas Madhavan Nair the...@apache.org Date: Mon Oct 14 20:31:22 2013 + HIVE-5508 : [WebHCat] ignore log collector e2e tests for Hadoop 2 (Daniel Dai via Thejas Nair) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532077 13f79535-47bb-0310-9956-ffa450edef68 commit 83865be207bc044c506eb957c01e8fcbf551b7d1 Author: Thejas Madhavan Nair the...@apache.org Date: Mon Oct 14 20:14:34 2013 + HIVE-5535 : [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 (Daniel Dai via Thejas Nair) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532054 13f79535-47bb-0310-9956-ffa450edef68 commit 78da38b50d2264ebdcca0651f7c6f3750eaf1221 Author: Brock Noland br...@apache.org 
Date: Mon Oct 14 19:50:55 2013 + HIVE-5526 - NPE in ConstantVectorExpression.evaluate(vrg) (Remus Rusanu via Brock Noland) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532044 13f79535-47bb-0310-9956-ffa450edef68 commit 58a3275477fc2c4f85dfb0a729150732a8230579 Author: Thejas Madhavan Nair the...@apache.org Date: Mon Oct 14 19:02:22 2013 + HIVE-5509 : [WebHCat] TestDriverCurl to use string comparison for jobid (Daniel Dai via Thejas Nair) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532026 13f79535-47bb-0310-9956-ffa450edef68 commit 95e45ede68be95603c6f43e06d9e68b20218b54f Author: Thejas Madhavan Nair the...@apache.org Date: Mon Oct 14 19:00:44 2013 + HIVE-5507: [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness (Daniel Dai via Thejas Nair) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532025 13f79535-47bb-0310-9956-ffa450edef68 commit 976ece58f134e02384e4f54474c1749b85c03934 Author: Jianyong Dai da...@apache.org Date: Mon Oct 14 18:38:29 2013 + HIVE-5448: webhcat duplicate test TestMapReduce_2 should be removed (Thejas M Nair via Daniel Dai) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532018 13f79535-47bb-0310-9956-ffa450edef68
[jira] [Updated] (HIVE-5541) [WebHCat] Log collector does not work since we don't close the hdfs status file
[ https://issues.apache.org/jira/browse/HIVE-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5541: - Attachment: HIVE-5541-1.patch [WebHCat] Log collector does not work since we don't close the hdfs status file --- Key: HIVE-5541 URL: https://issues.apache.org/jira/browse/HIVE-5541 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5541-1.patch
[jira] [Updated] (HIVE-5540) webhcat e2e test failures
[ https://issues.apache.org/jira/browse/HIVE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5540: - Status: Patch Available (was: Open) webhcat e2e test failures - Key: HIVE-5540 URL: https://issues.apache.org/jira/browse/HIVE-5540 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Daniel Dai Attachments: HIVE-5540-1.patch, test_harnesss_1381796256 With current state of trunk repo (see the commit log in the earlier update) WebHCat e2e tests have 4 errors all of the same type. Expect 1 jobs in logs, but get 1 Test run log file attached
[jira] [Updated] (HIVE-5540) webhcat e2e test failures: Expect 1 jobs in logs, but get 1
[ https://issues.apache.org/jira/browse/HIVE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5540: - Attachment: HIVE-5540-2.patch Should close the file in a finally block instead. Updated the patch. webhcat e2e test failures: Expect 1 jobs in logs, but get 1 - Key: HIVE-5540 URL: https://issues.apache.org/jira/browse/HIVE-5540 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Daniel Dai Attachments: HIVE-5540-1.patch, HIVE-5540-2.patch, test_harnesss_1381796256 With current state of trunk repo (see the commit log in the earlier update) WebHCat e2e tests have 4 errors all of the same type. Expect 1 jobs in logs, but get 1 Test run log file attached
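The "close the file in a finally block" fix follows a standard pattern: the status file must be closed even when the writer throws, otherwise buffered state may never be flushed to HDFS. The real code is Java writing through the HDFS client; the sketch below just shows the try/finally shape with an invented status line.

```python
import os
import tempfile

# Pattern behind the fix: close the status file in finally so it is
# flushed on both the success and failure paths. (Sketch only; the real
# WebHCat code is Java writing an HDFS status file.)
path = os.path.join(tempfile.mkdtemp(), "status")
f = open(path, "w")
try:
    f.write("job_201310140001 SUCCEEDED\n")   # invented status payload
finally:
    f.close()                                 # runs on success and on error alike

print(open(path).read().strip())
```

Without the finally, an exception between open and close leaks the handle and, on HDFS, can leave the file unreadable or empty for the log collector.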
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Attachment: HIVE-5510-2.patch Addressing Eugene's review comments. [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch, HIVE-5510-2.patch, test_harnesss_1381798977 GET job/queue of a TempletonController job returns weird information. It is a mix of the child job and itself. It should only pull the information of the controller job itself.
[jira] [Updated] (HIVE-5540) webhcat e2e test failures: Expect 1 jobs in logs, but get 1
[ https://issues.apache.org/jira/browse/HIVE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5540: - Attachment: HIVE-5540-3.patch Added the comments Eugene suggested. webhcat e2e test failures: Expect 1 jobs in logs, but get 1 - Key: HIVE-5540 URL: https://issues.apache.org/jira/browse/HIVE-5540 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Daniel Dai Attachments: HIVE-5540-1.patch, HIVE-5540-2.patch, HIVE-5540-3.patch, test_harnesss_1381796256 With current state of trunk repo (see the commit log in the earlier update) WebHCat e2e tests have 4 errors all of the same type. Expect 1 jobs in logs, but get 1 Test run log file attached
[jira] [Commented] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13797256#comment-13797256 ] Daniel Dai commented on HIVE-5510: -- bq. build.xml changes concurrency from 5/5 to 1/1. Why? That's unintended; will remove it. bq. TempletonControllerJob.Watcher.run() bq. This also creates a JobState for each child task. What is purpose of this? Just so each child knows its parent. bq. We never (AFAIK) get any notifications from Hadoop when a subtask completes. When is this JobState (for child task) ever updated? If not, then why create it? Conversely, maybe it would be useful to track the state of child task, but I think that would require more changes like registering a callback with JobTracker for each child task that we find. Don't know whether this is really useful. Only the parent field of each child job is meaningful; we don't track other changes for the child job. bq. DeleteDelegator bq. it uses System.err for logging. Why not log4j which will be in webhcat.log with WARN log level Yes, we should; will change. bq. JobState: bq. getChildId() is never used This is used by StatusDelegator. Actually getChildren is never used, but it has been there for a while; I don't want to remove it in this patch. [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch, HIVE-5510-2.patch, test_harnesss_1381798977 GET job/queue for a TempletonController job returns weird information: it is a mix of the child job and the controller itself. It should only pull the information of the controller job itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5133) webhcat jobs that need to access metastore fails in secure mode
[ https://issues.apache.org/jira/browse/HIVE-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5133: - Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Patch committed to trunk. webhcat jobs that need to access metastore fails in secure mode --- Key: HIVE-5133 URL: https://issues.apache.org/jira/browse/HIVE-5133 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5133.1.patch, HIVE-5133.1.test.patch, HIVE-5133.2.patch, HIVE-5133.3.patch, HIVE-5133.5.patch, HIVE-5133.6.patch Webhcat job submission requests result in the pig/hive/mr job being run from a map task that it launches. In secure mode, for the pig/hive/mr job that is run to be authorized to perform actions on metastore, it has to have the delegation tokens from the hive metastore. In case of pig/MR job this is needed if hcatalog is being used in the script/job. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5481) WebHCat e2e test: TestStreaming -ve tests should also check for job completion success
[ https://issues.apache.org/jira/browse/HIVE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802456#comment-13802456 ] Daniel Dai commented on HIVE-5481: -- HIVE-5510 makes similar changes. We can wait for HIVE-5510 to be checked in and then close this ticket. WebHCat e2e test: TestStreaming -ve tests should also check for job completion success -- Key: HIVE-5481 URL: https://issues.apache.org/jira/browse/HIVE-5481 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5481.1.patch The TempletonController job will succeed even for the -ve tests; however, the exit value should be non-zero. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Attachment: HIVE-5510-3.patch Yes, you are right. And setChildren/setChildren too. Removing all those unused methods. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4446) [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444
[ https://issues.apache.org/jira/browse/HIVE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803152#comment-13803152 ] Daniel Dai commented on HIVE-4446: -- Thanks Lefty, the documentation for this JIRA looks good to me. There are additional documentation changes not yet ported to the cwiki, such as HIVE-5031, HIVE-4531, etc. [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444 Key: HIVE-4446 URL: https://issues.apache.org/jira/browse/HIVE-4446 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Lefty Leverenz Attachments: HIVE-4446-1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5696) WebHCat e2e tests/jobsubmission.conf file is malformed and loosing tests
[ https://issues.apache.org/jira/browse/HIVE-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-5696. -- Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed +1. Patch committed to trunk. WebHCat e2e tests/jobsubmission.conf file is malformed and loosing tests Key: HIVE-5696 URL: https://issues.apache.org/jira/browse/HIVE-5696 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5696.patch there is a misplaced bracket and curly brace (see patch file) which causes the last 3 tests in TestHive to not be executed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Attachment: HIVE-5510-4.patch HIVE-5510-4.patch resync with trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-5510. -- Resolution: Fixed Hadoop Flags: Reviewed Patch committed to trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5728) Make ORC InputFormat/OutputFormat available outside Hive
Daniel Dai created HIVE-5728: Summary: Make ORC InputFormat/OutputFormat available outside Hive Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues to solve: 1. Several classes are not public, e.g. OrcStruct 2. There is no InputFormat/OutputFormat for the new API (some tools, such as Pig, need the new API) 3. There is no way to push WriteOptions to the OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat available outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Attachment: HIVE-5728-1.patch Attach HIVE-5728-1.patch. Summary of changes: 1. Create OrcNewInputFormat/OrcNewOutputFormat in the new API 2. Extract common pieces of OrcNewInputFormat/OrcInputFormat into OrcInputFormatUtils 3. Make several classes/methods public 4. Make WriteOptions configurable through Configuration 5. Add unit tests for the newly added InputFormat/OutputFormat -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Summary: Make ORC InputFormat/OutputFormat usable outside Hive (was: Make ORC InputFormat/OutputFormat available outside Hive) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Attachment: HIVE-5728-2.patch Thanks. Fixed the format issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Attachment: HIVE-5728-3.patch Addressing [~owen.omalley]'s review comments. The code movement is not necessary; I retained the code in OrcInputFormat this time. I also left OrcStruct non-public, since it is not absolutely needed here. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5098) Fix metastore for SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-5098. -- Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed The fix is included in datanucleus-rdbms 3.2.9+. Will upgrade the datanucleus-rdbms version to pick it up (patch will be included in HIVE-5099). Fix metastore for SQL Server Key: HIVE-5098 URL: https://issues.apache.org/jira/browse/HIVE-5098 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5098-1.patch, HIVE-5098-2.patch We found one problem while testing the SQL Server metastore. In Hive code, we build a substring call with a single parameter in the datanucleus query (ExpressionTree.java):
{code}
if (partitionColumnIndex == (partitionColumnCount - 1)) {
  valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+"
      + keyEqualLength + ")";
} else {
  valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+"
      + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\""
      + keyEqual + "\")+" + keyEqualLength + ").indexOf(\"/\"))";
}
{code}
SQL Server does not support single-parameter SUBSTRING, and datanucleus does not fill the gap. The attached patch: 1. creates a new jar hive-datanucleusplugin.jar in $HIVE_HOME/lib 2. hive-datanucleusplugin.jar is a datanucleus plugin (includes plugin.xml, MANIFEST.MF) 3. The plugin provides a SQL Server-specific implementation of substring (which avoids the single-parameter SUBSTRING that SQL Server does not support) 4. The plugin code only kicks in when the RDBMS is SQL Server -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HIVE-5098) Fix metastore for SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13844562#comment-13844562 ] Daniel Dai commented on HIVE-5098: -- Here is the fix in datanucleus: http://www.datanucleus.org/servlet/jira/browse/NUCRDBMS-717 -- This message was sent by Atlassian JIRA (v6.1.4#6159)
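The SUBSTRING limitation behind HIVE-5098 is easy to reproduce in miniature. The sketch below is hypothetical illustration code, not Hive source: it mimics in plain Java the partition-name extraction that the generated JDOQL expression performs, and shows the portable rewrite. SQL Server's SUBSTRING takes three arguments (expression, start, length), so a one-argument "rest of the string" substring must be expressed with an explicit end, just as Java's two-argument substring does.

```java
// Hypothetical illustration (not Hive source): extracting the last partition
// key's value from a partition name such as "c=France/d=4", the way the
// generated JDOQL expression does it with substring operations.
public class PartitionNameSubstring {

    // Java's one-argument substring: everything from 'start' to the end.
    // The JDOQL equivalent translates to a single-parameter SUBSTRING,
    // which SQL Server rejects.
    static String tail(String s, int start) {
        return s.substring(start);
    }

    // Portable form: make the end explicit. In SQL terms this corresponds to
    // SUBSTRING(expr, start, LEN(expr) - start + 1), which SQL Server accepts.
    static String tailWithExplicitEnd(String s, int start) {
        return s.substring(start, s.length());
    }

    public static void main(String[] args) {
        String partitionName = "c=France/d=4";
        String keyEqual = "d=";
        int start = partitionName.indexOf(keyEqual) + keyEqual.length();
        // Both forms extract the value of the last partition key.
        System.out.println(tail(partitionName, start));                // 4
        System.out.println(tailWithExplicitEnd(partitionName, start)); // 4
    }
}
```

The hive-datanucleusplugin approach described above plugs a SQL Server-specific substring mapping in at exactly this point: same semantics, expressed with an explicit length.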
[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5099: - Attachment: HIVE-5099-2.patch Update the patch after http://www.datanucleus.org/servlet/jira/browse/NUCRDBMS-718. Need to upgrade the datanucleus version to pick up the change. Some partition publish operation cause OOM in metastore backed by SQL Server Key: HIVE-5099 URL: https://issues.apache.org/jira/browse/HIVE-5099 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5099-1.patch, HIVE-5099-2.patch For certain combinations of metastore operations, the operation hangs and the metastore server eventually fails due to OOM. This happens when the metastore is backed by SQL Server. Here is a testcase to reproduce:
{code}
CREATE TABLE tbl_repro_oom1 (a STRING, b INT) PARTITIONED BY (c STRING, d STRING);
CREATE TABLE tbl_repro_oom_2 (a STRING) PARTITIONED BY (e STRING);
ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='France', d=4);
ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='Russia', d=3);
ALTER TABLE tbl_repro_oom_2 ADD PARTITION (e='Russia');
ALTER TABLE tbl_repro_oom1 DROP PARTITION (c = 'India'); --failure
{code}
The code causing the issue is in ExpressionTree.java:
{code}
valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+"
    + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\""
    + keyEqual + "\")+" + keyEqualLength + ").indexOf(\"/\"))";
{code}
The snapshot of the partition table before the DROP PARTITION statement is:
{code}
PART_ID  CREATE_TIME  LAST_ACCESS_TIME  PART_NAME      SD_ID  TBL_ID
93       1376526718   0                 c=France/d=4   127    33
94       1376526718   0                 c=Russia/d=3   128    33
95       1376526718   0                 e=Russia       129    34
{code}
The datanucleus query tries to find the value of a particular key by locating "$key=" as the start and the following "/" as the end. For example, the value of c in c=France/d=4 is found by locating "c=" as the start and the following "/" as the end. However, this query fails if we try to find the value of e in e=Russia, since there is no trailing "/". Other databases work because the query plan first filters out the partitions not belonging to tbl_repro_oom1; whether this error surfaces or not depends on the query optimizer. When this exception happens, the metastore keeps retrying and throwing exceptions. The memory image of the metastore contains a large number of exception objects:
{code}
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter passed to the LEFT or SUBSTRING function.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet$FetchBuffer.nextRow(SQLServerResultSet.java:4762)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.fetchBufferNext(SQLServerResultSet.java:1682)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:955)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.datanucleus.store.rdbms.query.ForwardQueryResult.init(ForwardQueryResult.java:90)
at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:686)
at org.datanucleus.store.query.Query.executeQuery(Query.java:1791)
at org.datanucleus.store.query.Query.executeWithMap(Query.java:1694)
at org.datanucleus.api.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1715)
at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1590)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy4.getPartitionsByFilter(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2163)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at $Proxy5
{code}
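The HIVE-5099 failure mode can likewise be demonstrated in plain Java. This is a hypothetical sketch, not the actual ExpressionTree code: it extracts a partition key's value by taking everything between "key=" and the next "/", shows how a partition name without a trailing "/" (such as e=Russia) yields a negative end index (the Java analogue of SQL Server's "Invalid length parameter passed to the LEFT or SUBSTRING function"), and shows a guarded variant.

```java
// Hypothetical illustration (not Hive source) of the failure mode described
// above: indexOf("/") returns -1 when the partition name has no trailing "/",
// so the naive substring's end index goes negative.
public class PartitionValueExtraction {

    // Naive extraction, mirroring the generated expression: value runs from
    // just after "key=" up to the next "/".
    static String naiveValue(String partitionName, String key) {
        String keyEqual = key + "=";
        String rest = partitionName.substring(
            partitionName.indexOf(keyEqual) + keyEqual.length());
        return rest.substring(0, rest.indexOf("/")); // -1 when no trailing "/"
    }

    // Guarded extraction: treat a missing "/" as "value runs to the end".
    static String safeValue(String partitionName, String key) {
        String keyEqual = key + "=";
        String rest = partitionName.substring(
            partitionName.indexOf(keyEqual) + keyEqual.length());
        int slash = rest.indexOf("/");
        return slash < 0 ? rest : rest.substring(0, slash);
    }

    public static void main(String[] args) {
        System.out.println(naiveValue("c=France/d=4", "c")); // France
        System.out.println(safeValue("e=Russia", "e"));      // Russia
        try {
            naiveValue("e=Russia", "e"); // no trailing "/" -> negative end index
        } catch (StringIndexOutOfBoundsException ex) {
            System.out.println("failed as expected");
        }
    }
}
```

On SQL Server the same negative end index surfaces as the SUBSTRING/LEFT error in the stack trace above, because the translated SQL computes a negative length instead of throwing client-side.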
[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5099: - Attachment: (was: HCATALOG-48-1.patch)
[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5099: - Attachment: HCATALOG-48-1.patch Attach patch, which: 1. Remove hive-datanucleusplugin 2. Upgrade datanucleus version
[jira] [Commented] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13844746#comment-13844746 ] Daniel Dai commented on HIVE-5099: -- Sorry, please ignore my last comment. Some partition publish operation cause OOM in metastore backed by SQL Server Key: HIVE-5099 URL: https://issues.apache.org/jira/browse/HIVE-5099 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5099-1.patch, HIVE-5099-2.patch For certain metastore operation combination, metastore operation hangs and metastore server eventually fail due to OOM. This happens when metastore is backed by SQL Server. Here is a testcase to reproduce: {code} CREATE TABLE tbl_repro_oom1 (a STRING, b INT) PARTITIONED BY (c STRING, d STRING); CREATE TABLE tbl_repro_oom_2 (a STRING ) PARTITIONED BY (e STRING); ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='France', d=4); ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='Russia', d=3); ALTER TABLE tbl_repro_oom_2 ADD PARTITION (e='Russia'); ALTER TABLE tbl_repro_oom1 DROP PARTITION (c = 'India'); --failure {code} The code cause the issue is in ExpressionTree.java: {code} valString = partitionName.substring(partitionName.indexOf(\ + keyEqual + \)+ + keyEqualLength + ).substring(0, partitionName.substring(partitionName.indexOf(\ + keyEqual + \)+ + keyEqualLength + ).indexOf(\/\)); {code} The snapshot of table partition before the drop partition statement is: {code} PART_ID CREATE_TIMELAST_ACCESS_TIME PART_NAMESD_ID TBL_ID 931376526718 0c=France/d=4 127 33 941376526718 0c=Russia/d=3 128 33 951376526718 0e=Russia 129 34 {code} Datanucleus query try to find the value of a particular key by locating $key= as the start, / as the end. For example, value of c in c=France/d=4 by locating c= as the start, / following as the end. However, this query fail if we try to find value e in e=Russia since there is no tailing /. 
Other databases work because their query plans first filter out the partitions that do not belong to tbl_repro_oom1; whether this error surfaces or not depends on the query optimizer. When this exception happens, the metastore keeps retrying and throwing exceptions, so the memory image of the metastore contains a large number of exception objects:
{code}
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter passed to the LEFT or SUBSTRING function.
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
	at com.microsoft.sqlserver.jdbc.SQLServerResultSet$FetchBuffer.nextRow(SQLServerResultSet.java:4762)
	at com.microsoft.sqlserver.jdbc.SQLServerResultSet.fetchBufferNext(SQLServerResultSet.java:1682)
	at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:955)
	at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
	at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
	at org.datanucleus.store.rdbms.query.ForwardQueryResult.init(ForwardQueryResult.java:90)
	at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:686)
	at org.datanucleus.store.query.Query.executeQuery(Query.java:1791)
	at org.datanucleus.store.query.Query.executeWithMap(Query.java:1694)
	at org.datanucleus.api.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
	at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1715)
	at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1590)
	at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
	at $Proxy4.getPartitionsByFilter(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2163)
	at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
	at $Proxy5.get_partitions_by_filter(Unknown Source
{code}
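The failing extraction can be reproduced in isolation. Below is a hypothetical standalone Java sketch of the same indexOf/substring pattern; the class and method names are illustrative, not part of Hive, and the real failure actually surfaces inside the SQL that DataNucleus generates, where SQL Server's SUBSTRING rejects the negative length:

```java
// Illustrative repro: mirror the generated logic "take everything after
// '<key>=', then cut at the next '/'".
public class PartitionNameRepro {

    static String extractValue(String partitionName, String key) {
        String keyEqual = key + "=";
        String rest = partitionName.substring(
                partitionName.indexOf(keyEqual) + keyEqual.length());
        // When the key is the last (or only) component there is no trailing
        // "/", so indexOf("/") returns -1 and substring(0, -1) throws.
        return rest.substring(0, rest.indexOf("/"));
    }

    public static void main(String[] args) {
        System.out.println(extractValue("c=France/d=4", "c")); // France
        extractValue("e=Russia", "e"); // throws StringIndexOutOfBoundsException
    }
}
```

This is why the snapshot above matters: partition 95 (e=Russia) has no trailing "/", so any scan that evaluates the expression against it fails.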
[jira] [Commented] (HIVE-7072) HCatLoader only loads first region of hbase table
[ https://issues.apache.org/jira/browse/HIVE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090006#comment-14090006 ] Daniel Dai commented on HIVE-7072: -- +1 HCatLoader only loads first region of hbase table - Key: HIVE-7072 URL: https://issues.apache.org/jira/browse/HIVE-7072 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-7072.2.patch, HIVE-7072.3.patch, HIVE-7072.4.patch Pig needs the config parameter 'pig.noSplitCombination' set to 'true' to be able to read HBaseStorageHandler-based tables. HBaseLoader does this at getSplits time, but HCatLoader does not, which results in only a partial data load. Thus, we need one more special-case definition in HCat that sets this parameter in the job properties if we detect that we're loading an HBaseStorageHandler-based table. (Note, also, that we should not depend directly on the HBaseStorageHandler class, and instead depend on the name of the class, since we do not want a mvn dependency on hive-hbase-handler just to compile HCatalog core; it's conceivable that at some point there might be a reverse dependency.) The primary issue is where this code should go: it doesn't belong in Pig (Pig does not know what loader behaviour should be, and this parameter is its interface to a loader), and it doesn't belong in HBaseStorageHandler either, since that implements HiveStorageHandler and connects up the two. Thus, it should belong to HCatLoader. Setting this parameter across the board results in poor performance for HCatLoader, so it must only be set when used with HBase. It therefore belongs in the SpecialCases definition, as that was created specifically for these kinds of odd cases and can be called from within HCatLoader. -- This message was sent by Atlassian JIRA (v6.2#6252)
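The special-case check described above can be sketched as follows. This is a minimal illustration, not the actual SpecialCases API; the class and method names are hypothetical, but it shows the key design choice of matching the storage handler by class-name string so HCatalog core needs no compile-time dependency on hive-hbase-handler:

```java
import java.util.Properties;

public class HBaseSpecialCaseSketch {
    // Compared as a string to avoid a mvn dependency on hive-hbase-handler.
    private static final String HBASE_HANDLER =
            "org.apache.hadoop.hive.hbase.HBaseStorageHandler";

    // Hypothetical hook called while assembling job properties for a load.
    static void addSpecialCaseProperties(String storageHandlerClass,
                                         Properties jobProperties) {
        if (HBASE_HANDLER.equals(storageHandlerClass)) {
            // Pig reads this at getSplits time; combining splits would keep
            // only the first HBase region's data.
            jobProperties.setProperty("pig.noSplitCombination", "true");
        }
    }
}
```

Because the property is set only when the handler name matches, other HCatLoader reads keep split combination and avoid the performance regression mentioned above.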
[jira] [Commented] (HIVE-7771) ORC PPD fails for some decimal predicates
[ https://issues.apache.org/jira/browse/HIVE-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101829#comment-14101829 ] Daniel Dai commented on HIVE-7771: -- Are you changing the search argument to use BigDecimal? If so, shall we change SearchArgumentImpl as well?
{code}
- literal instanceof HiveDecimal) {
+ literal instanceof BigDecimal) {
...
- } else if (literal instanceof HiveDecimal) {
+ } else if (literal instanceof BigDecimal) {
{code}
ORC PPD fails for some decimal predicates - Key: HIVE-7771 URL: https://issues.apache.org/jira/browse/HIVE-7771 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7771.1.patch Queries like
{code}
select * from table where dcol=11.22BD;
{code}
fail when ORC predicate pushdown is enabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7771) ORC PPD fails for some decimal predicates
[ https://issues.apache.org/jira/browse/HIVE-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102564#comment-14102564 ] Daniel Dai commented on HIVE-7771: -- +1, works for me now. ORC PPD fails for some decimal predicates - Key: HIVE-7771 URL: https://issues.apache.org/jira/browse/HIVE-7771 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7771.1.patch, HIVE-7771.2.patch, HIVE-7771.3.patch Queries like
{code}
select * from table where dcol=11.22BD;
{code}
fail when ORC predicate pushdown is enabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7222) Support timestamp column statistics in ORC and extend PPD for timestamp
[ https://issues.apache.org/jira/browse/HIVE-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-7222: - Status: Patch Available (was: Open) Support timestamp column statistics in ORC and extend PPD for timestamp --- Key: HIVE-7222 URL: https://issues.apache.org/jira/browse/HIVE-7222 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Daniel Dai Labels: orcfile Attachments: HIVE-7222-1.patch Add column statistics for timestamp columns in ORC. Also extend predicate pushdown to support timestamp column evaluation. -- This message was sent by Atlassian JIRA (v6.2#6252)
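As a hedged illustration of how timestamp column statistics enable predicate pushdown, the sketch below assumes min/max are tracked as epoch milliseconds; the class and method names are illustrative, not ORC's actual statistics API:

```java
import java.sql.Timestamp;

// Illustrative only: a stripe whose recorded [min, max] timestamp range
// excludes the predicate's literal can be skipped without reading its rows.
public class TimestampStatsSketch {
    final long minMillis;
    final long maxMillis;

    TimestampStatsSketch(Timestamp min, Timestamp max) {
        this.minMillis = min.getTime();
        this.maxMillis = max.getTime();
    }

    // For an equality predicate "col = literal", reduced here to a boolean
    // "might contain" answer.
    boolean mightContain(Timestamp literal) {
        long v = literal.getTime();
        return v >= minMillis && v <= maxMillis;
    }
}
```

When `mightContain` returns false, the reader can safely skip the whole stripe, which is the same row-group elimination ORC already performs for numeric and string min/max statistics.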