[jira] [Updated] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4444: - Attachment: HIVE-4444-3.patch Adjust formatting. [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4444-1.patch, HIVE-4444-2.patch, HIVE-4444-3.patch Currently there are no "files" and "args" parameters in Hive. We shall add them to make it similar to Pig. NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: (was: HIVE-4443-4.patch) [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4443-1.patch, HIVE-4443-2.patch, HIVE-4443-3.patch, HIVE-4443-4.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest:
* GET queue - mark as deprecated
* GET queue/jobID - mark as deprecated
* DELETE queue/jobID - mark as deprecated
* GET jobs - return the list of JSON objects containing jobid but no detailed info
* GET jobs/fields=* - return the list of JSON objects containing detailed job info
* GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID)
* DELETE jobs/jobID - equivalent to DELETE queue/jobID
NO PRECOMMIT TESTS
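The proposed migration can be summarized as a mapping from each deprecated queue resource to its jobs replacement. A minimal sketch in plain Java (only the three deprecated resources listed above are included; any other resource spelling would be an assumption):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JobsApiMigration {
    // Deprecated WebHCat resource -> proposed replacement, per the list above.
    static Map<String, String> migration() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("GET queue", "GET jobs");
        m.put("GET queue/:jobid", "GET jobs/:jobid");
        m.put("DELETE queue/:jobid", "DELETE jobs/:jobid");
        return m;
    }

    public static void main(String[] args) {
        // Print the migration table: old resource -> new resource.
        migration().forEach((oldRes, newRes) ->
            System.out.println(oldRes + "  ->  " + newRes));
    }
}
```

The point of the new jobs resources is that one GET jobs call (optionally with detailed fields) replaces the N+1 calls (GET queue plus GET queue/:jobid per job) required today.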
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-4.patch [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4443-1.patch, HIVE-4443-2.patch, HIVE-4443-3.patch, HIVE-4443-4.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest:
* GET queue - mark as deprecated
* GET queue/jobID - mark as deprecated
* DELETE queue/jobID - mark as deprecated
* GET jobs - return the list of JSON objects containing jobid but no detailed info
* GET jobs/fields=* - return the list of JSON objects containing detailed job info
* GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID)
* DELETE jobs/jobID - equivalent to DELETE queue/jobID
NO PRECOMMIT TESTS
[jira] [Commented] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768557#comment-13768557 ] Daniel Dai commented on HIVE-4443: -- Test is in HIVE-5078. [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4443-1.patch, HIVE-4443-2.patch, HIVE-4443-3.patch, HIVE-4443-4.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest:
* GET queue - mark as deprecated
* GET queue/jobID - mark as deprecated
* DELETE queue/jobID - mark as deprecated
* GET jobs - return the list of JSON objects containing jobid but no detailed info
* GET jobs/fields=* - return the list of JSON objects containing detailed job info
* GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID)
* DELETE jobs/jobID - equivalent to DELETE queue/jobID
NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4444: - Attachment: HIVE-4444-5.patch [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4444-1.patch, HIVE-4444-2.patch, HIVE-4444-3.patch, HIVE-4444-4.patch, HIVE-4444-5.patch Currently there are no "files" and "args" parameters in Hive. We shall add them to make it similar to Pig. NO PRECOMMIT TESTS
[jira] [Commented] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768657#comment-13768657 ] Daniel Dai commented on HIVE-4444: -- Fixed. Sorry about that. [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4444-1.patch, HIVE-4444-2.patch, HIVE-4444-3.patch, HIVE-4444-4.patch, HIVE-4444-5.patch Currently there are no "files" and "args" parameters in Hive. We shall add them to make it similar to Pig. NO PRECOMMIT TESTS
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-9.patch
bq. Should this include e2e tests in addition to (or instead of) unit tests? If (when) Hadoop changes the log file format this will break, but unit tests won't catch this since the data that the tests parse is static.
There are e2e test cases in a separate ticket: HIVE-5078
bq. Here is a bunch of little things/nits:
bq. Server.java has {code}if (enablelog == true && !TempletonUtils.isset(statusdir)) throw new BadParam("enablelog is only applicable when statusdir is set");{code} in 4 different places. Can this be a method?
done
bq. What is the purpose of Server#misc()?
Should not be there, removed
bq. TempletonControllerJob: import org.apache.hive.hcatalog.templeton.Main; - unused import
done
bq. Line 173 - indentation is off?
done
bq. Line 295 - writer.close() - This writer is connected to System.err. What are the implications of closing this? What if something tries to write to it later?
No one after this point is writing to writer. We opened writer, so we need to close it in our code.
bq. TempletonUtils has unused imports - checkstyle needs to be run on the whole patch.
done
bq. TestJobIDParser mixes JUnit3 and JUnit4. It should either not extend TestCase (I vote for this) or not use @Test annotations
done
bq. Can JobIDParser (and all subclasses) be made package scoped since they are not used outside the templeton package? Similarly, can methods be made as private as possible?
done
bq. JobIDParser#parseJobID() has an "fname" param which is not used. What is the intent? Should it be used in the openStatusFile() call? If not, better to remove it.
We shall use it in openStatusFile(). Fixed.
bq. JobIDParser#openStatusFile() creates a Reader. Where/when is it being closed?
It should be closed in parseJobID. Fixed.
bq. Could the 2 member variables in JobIDParser be made private (even final)?
I can make them protected, but since they will be used in subclasses, I cannot make them private/final.
bq. Why is TestJobIDParser using findJobID() directly? Could it not use parseJobID()?
Because parseJobID is hardcoded to the standard output file for that parser, which is stderr in the current directory. In the test, I want to override it to test the input file in the test directory.
bq. Can JobIDParser have 1 line of class-level javadoc about the purpose of this class?
done
[WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does.
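For context on the review discussion, here is a minimal sketch of what a JobIDParser-style class does: open the tool's captured status output (e.g. stderr) and scan it for Hadoop job IDs with a per-tool pattern. The class and method names and the regex below are illustrative assumptions, not the code from the patch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a job-ID parser like the one discussed above.
public class SimpleJobIDParser {
    // Assumed Hadoop job-ID shape: "job_<cluster timestamp>_<sequence>".
    private static final Pattern JOB_ID = Pattern.compile("job_\\d+_\\d+");

    // Rough analogue of findJobID(): extract every job ID from the given text,
    // which in the real class would come from the status file opened by
    // openStatusFile().
    static List<String> findJobIDs(String statusFileContents) {
        List<String> ids = new ArrayList<>();
        Matcher m = JOB_ID.matcher(statusFileContents);
        while (m.find()) {
            ids.add(m.group());
        }
        return ids;
    }

    public static void main(String[] args) {
        String stderr = "Submitted application job_20130814023904691_0001\n"
                      + "Running job: job_20130814023904691_0002\n";
        // Prints: [job_20130814023904691_0001, job_20130814023904691_0002]
        System.out.println(findJobIDs(stderr));
    }
}
```

This also illustrates why the test wants to call the finder directly with a file of its choosing, rather than going through a parse method hardwired to stderr in the current directory.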
[jira] [Updated] (HIVE-5086) Fix scriptfile1.q on Windows
[ https://issues.apache.org/jira/browse/HIVE-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5086: - Attachment: HIVE-5086-2.patch Fixed unit test failure. Fix scriptfile1.q on Windows Key: HIVE-5086 URL: https://issues.apache.org/jira/browse/HIVE-5086 Project: Hive Issue Type: Bug Components: Tests, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5086-1.patch, HIVE-5086-2.patch Test failed with error message:
[junit] Task with the most failures(4):
[junit] -
[junit] Task ID:
[junit] task_20130814023904691_0001_m_00
[junit]
[junit] URL:
[junit] http://localhost:50030/taskdetails.jsp?jobid=job_20130814023904691_0001&tipid=task_20130814023904691_0001_m_00
[junit] -
[junit] Diagnostic Messages for this Task:
[junit] java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
[junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
[junit] at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
[junit] at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
[junit] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
[junit] at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
[junit] at org.apache.hadoop.mapred.Child.main(Child.java:265)
[junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
[junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:538)
[junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
[junit] ... 8 more
[junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 2]: Unable to initialize custom script.
[junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:357)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:528)
[junit] ... 9 more
[junit] Caused by: java.io.IOException: Cannot run program D:\tmp\hadoop-Administrator\mapred\local\3_0\taskTracker\Administrator\jobcache\job_20130814023904691_0001\attempt_20130814023904691_0001_m_00_3\work\.\testgrep: CreateProcess error=193, %1 is not a valid Win32 application
[junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
[junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:316)
[junit] ... 18 more
[junit] Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
[junit] at java.lang.ProcessImpl.create(Native Method)
[junit] at java.lang.ProcessImpl.<init>(ProcessImpl.java:81)
[junit] at java.lang.ProcessImpl.start(ProcessImpl.java:30)
[junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
[junit] ... 19 more
[junit]
[junit]
[junit] Exception: Client Execution failed with error code = 2
[junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
[junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 2
[junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
[junit] at junit.framework.Assert.fail(Assert.java:47)
[junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.runTest(TestMinimrCliDriver.java:122)
[junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_scriptfile1(TestMinimrCliDriver.java:104)
[junit
[jira] [Commented] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769949#comment-13769949 ] Daniel Dai commented on HIVE-4531: -- https://reviews.apache.org/r/14180/ [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-5086) Fix scriptfile1.q on Windows
[ https://issues.apache.org/jira/browse/HIVE-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5086: - Status: Patch Available (was: Open) Fix scriptfile1.q on Windows Key: HIVE-5086 URL: https://issues.apache.org/jira/browse/HIVE-5086 Project: Hive Issue Type: Bug Components: Tests, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5086-1.patch, HIVE-5086-2.patch Test failed with error message:
[junit] Task with the most failures(4):
[junit] -
[junit] Task ID:
[junit] task_20130814023904691_0001_m_00
[junit]
[junit] URL:
[junit] http://localhost:50030/taskdetails.jsp?jobid=job_20130814023904691_0001&tipid=task_20130814023904691_0001_m_00
[junit] -
[junit] Diagnostic Messages for this Task:
[junit] java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
[junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:175)
[junit] at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
[junit] at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
[junit] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
[junit] at org.apache.hadoop.mapred.Child$4.run(Child.java:271)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
[junit] at org.apache.hadoop.mapred.Child.main(Child.java:265)
[junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"key":"238","value":"val_238"}
[junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:538)
[junit] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:157)
[junit] ... 8 more
[junit] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 2]: Unable to initialize custom script.
[junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:357)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:88)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:504)
[junit] at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:848)
[junit] at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:528)
[junit] ... 9 more
[junit] Caused by: java.io.IOException: Cannot run program D:\tmp\hadoop-Administrator\mapred\local\3_0\taskTracker\Administrator\jobcache\job_20130814023904691_0001\attempt_20130814023904691_0001_m_00_3\work\.\testgrep: CreateProcess error=193, %1 is not a valid Win32 application
[junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
[junit] at org.apache.hadoop.hive.ql.exec.ScriptOperator.processOp(ScriptOperator.java:316)
[junit] ... 18 more
[junit] Caused by: java.io.IOException: CreateProcess error=193, %1 is not a valid Win32 application
[junit] at java.lang.ProcessImpl.create(Native Method)
[junit] at java.lang.ProcessImpl.<init>(ProcessImpl.java:81)
[junit] at java.lang.ProcessImpl.start(ProcessImpl.java:30)
[junit] at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
[junit] ... 19 more
[junit]
[junit]
[junit] Exception: Client Execution failed with error code = 2
[junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
[junit] junit.framework.AssertionFailedError: Client Execution failed with error code = 2
[junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
[junit] at junit.framework.Assert.fail(Assert.java:47)
[junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.runTest(TestMinimrCliDriver.java:122)
[junit] at org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_scriptfile1(TestMinimrCliDriver.java:104)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0
[jira] [Created] (HIVE-5303) metastore server OOM when exception happen
Daniel Dai created HIVE-5303: Summary: metastore server OOM when exception happen Key: HIVE-5303 URL: https://issues.apache.org/jira/browse/HIVE-5303 Project: Hive Issue Type: Bug Reporter: Daniel Dai The issue is described in HIVE-5099. HIVE-5099 fixed the issue of the metastore failing under some circumstances, but we still need to investigate why we get an OOM when the exception happens. The test case in HIVE-5099 is enough to reproduce the issue.
[jira] [Updated] (HIVE-5098) Fix metastore for SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5098: - Fix Version/s: (was: 0.12.0) Fix metastore for SQL Server Key: HIVE-5098 URL: https://issues.apache.org/jira/browse/HIVE-5098 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5098-1.patch, HIVE-5098-2.patch We found one problem while testing the SQL Server metastore. In the Hive code, we use the substring function with a single parameter in a datanucleus query (ExpressionTree.java):
{code}
if (partitionColumnIndex == (partitionColumnCount - 1)) {
  valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ")";
} else {
  valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").indexOf(\"/\"))";
}
{code}
SQL Server does not support single-parameter substring, and datanucleus does not fill the gap. The attached patch:
1. creates a new jar hive-datanucleusplugin.jar in $HIVE_HOME/lib
2. hive-datanucleusplugin.jar is a datanucleus plugin (includes plugin.xml, MANIFEST.MF)
3. The plugin provides a SQL Server-specific implementation of substring (which avoids using the single-param SUBSTRING that SQL Server does not support)
4. The plugin code only kicks in when the RDBMS is SQL Server
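The underlying incompatibility: T-SQL's SUBSTRING requires all three arguments (expression, start, length), while the JDOQL above emits a single-argument substring ("rest of the string"). A rough sketch of the equivalence such a plugin has to encode, expressed in plain Java (the actual SQL the plugin generates is not shown in this ticket, so this is illustrative only):

```java
public class SubstringEquivalence {
    // Java's one-argument substring: everything from 'start' to the end.
    // This is the form SQL Server cannot express with a two-argument SUBSTRING.
    static String singleArg(String s, int start) {
        return s.substring(start);
    }

    // The same result with an explicit length, mirroring what a T-SQL rewrite
    // must do: SUBSTRING(s, start + 1, LEN(s) - start) (T-SQL is 1-based).
    static String explicitLength(String s, int start) {
        return s.substring(start, start + (s.length() - start));
    }

    public static void main(String[] args) {
        String partitionName = "c=France/d=4";
        System.out.println(singleArg(partitionName, 2));       // France/d=4
        System.out.println(explicitLength(partitionName, 2));  // France/d=4
    }
}
```

Both calls return the same suffix; the plugin's job is to emit the explicit-length form whenever the datastore is SQL Server.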
[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5099: - Fix Version/s: (was: 0.12.0) Some partition publish operation cause OOM in metastore backed by SQL Server Key: HIVE-5099 URL: https://issues.apache.org/jira/browse/HIVE-5099 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5099-1.patch For certain metastore operation combinations, a metastore operation hangs and the metastore server eventually fails due to OOM. This happens when the metastore is backed by SQL Server. Here is a testcase to reproduce:
{code}
CREATE TABLE tbl_repro_oom1 (a STRING, b INT) PARTITIONED BY (c STRING, d STRING);
CREATE TABLE tbl_repro_oom_2 (a STRING) PARTITIONED BY (e STRING);
ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='France', d=4);
ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='Russia', d=3);
ALTER TABLE tbl_repro_oom_2 ADD PARTITION (e='Russia');
ALTER TABLE tbl_repro_oom1 DROP PARTITION (c = 'India'); --failure
{code}
The code causing the issue is in ExpressionTree.java:
{code}
valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").indexOf(\"/\"))";
{code}
The snapshot of the partition table before the drop partition statement is:
{code}
PART_ID  CREATE_TIME  LAST_ACCESS_TIME  PART_NAME     SD_ID  TBL_ID
93       1376526718   0                 c=France/d=4  127    33
94       1376526718   0                 c=Russia/d=3  128    33
95       1376526718   0                 e=Russia      129    34
{code}
The datanucleus query tries to find the value of a particular key by locating "$key=" as the start and "/" as the end. For example, it finds the value of c in c=France/d=4 by locating "c=" as the start and the following "/" as the end. However, this query fails if we try to find the value of e in e=Russia, since there is no trailing "/". Other databases work because the query plan first filters out the partitions not belonging to tbl_repro_oom1.
Whether this error surfaces or not depends on the query optimizer. When this exception happens, the metastore keeps retrying and throwing exceptions. The memory image of the metastore contains a large number of exception objects:
{code}
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter passed to the LEFT or SUBSTRING function.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet$FetchBuffer.nextRow(SQLServerResultSet.java:4762)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.fetchBufferNext(SQLServerResultSet.java:1682)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:955)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.datanucleus.store.rdbms.query.ForwardQueryResult.<init>(ForwardQueryResult.java:90)
at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:686)
at org.datanucleus.store.query.Query.executeQuery(Query.java:1791)
at org.datanucleus.store.query.Query.executeWithMap(Query.java:1694)
at org.datanucleus.api.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1715)
at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1590)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy4.getPartitionsByFilter(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2163)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at $Proxy5.get_partitions_by_filter(Unknown Source)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult
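The failure mode described above can be reproduced in plain Java with the same indexOf/substring logic the query expresses. This is a sketch for illustration, not the actual metastore code; the method name is hypothetical:

```java
public class PartitionNameParse {
    // Mimics the value-extraction logic of the JDOQL query: find "key=",
    // skip past it, and take everything up to the next '/'.
    static String extractValue(String partitionName, String key) {
        String keyEqual = key + "=";
        int start = partitionName.indexOf(keyEqual) + keyEqual.length();
        String rest = partitionName.substring(start);
        // The fragile step: when the key is the last component there is no
        // trailing '/', indexOf returns -1, and substring(0, -1) throws --
        // the Java analogue of SQL Server's "Invalid length parameter passed
        // to the LEFT or SUBSTRING function".
        return rest.substring(0, rest.indexOf('/'));
    }

    public static void main(String[] args) {
        // Works: 'c' is followed by "/d=4" in the partition name.
        System.out.println(extractValue("c=France/d=4", "c")); // France
        try {
            // Fails: 'e' is the only component, so there is no trailing '/'.
            extractValue("e=Russia", "e");
        } catch (StringIndexOutOfBoundsException ex) {
            System.out.println("failed for e=Russia");
        }
    }
}
```

This is why rows from tbl_repro_oom_2 (e=Russia) break the query when they are not filtered out before the substring runs.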
[jira] [Commented] (HIVE-5167) webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh
[ https://issues.apache.org/jira/browse/HIVE-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770166#comment-13770166 ] Daniel Dai commented on HIVE-5167: -- +1 webhcat_config.sh checks for env variables being set before sourcing webhcat-env.sh --- Key: HIVE-5167 URL: https://issues.apache.org/jira/browse/HIVE-5167 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-5167.1.patch, HIVE-5167.2.patch HIVE-4820 introduced checks for env variables, but it does so before sourcing webhcat-env.sh. This order needs to be reversed.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-10.patch Addressed review comments by [~ekoifman]. [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-10.patch, HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-11.patch HIVE-4531-11.patch refines the exception handling code a bit. [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog, WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-4531-10.patch, HIVE-4531-11.patch, HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, HIVE-4531-6.patch, HIVE-4531-7.patch, HIVE-4531-8.patch, HIVE-4531-9.patch, samplestatusdirwithlist.tar.gz It would be nice if we collected task logs after the job finishes. This is similar to what Amazon EMR does.
[jira] [Commented] (HIVE-5092) Fix hiveserver2 mapreduce local job on Windows
[ https://issues.apache.org/jira/browse/HIVE-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776811#comment-13776811 ] Daniel Dai commented on HIVE-5092: -- This is only supposed to be used on Windows; we don't have the above-mentioned issue on Linux. Also noticed the cmd script patch HIVE-3129 is not committed; this JIRA depends on it. Fix hiveserver2 mapreduce local job on Windows -- Key: HIVE-5092 URL: https://issues.apache.org/jira/browse/HIVE-5092 Project: Hive Issue Type: Bug Components: HiveServer2, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5092-1.patch HiveServer2 fails when a MapReduce local job fails. For example: {code} select /*+ MAPJOIN(v) */ registration from studenttab10k s join votertab10k v on (s.name = v.name); {code} The root cause is a class-not-found error in the local hadoop job (MapredLocalTask.execute): HADOOP_CLASSPATH does not include $HIVE_HOME/lib. Setting HADOOP_CLASSPATH correctly fixes the issue. However, there is one complexity on Windows. We start HiveServer2 using the Windows service console (services.msc), which takes the hiveserver2.xml generated by hive.cmd. There is no way to pass an environment variable in hiveserver2.xml (weird, but that's the reality). I attach a patch which passes it through command-line arguments and relays it to HADOOP_CLASSPATH in the Hive code.
[jira] [Updated] (HIVE-5031) [WebHCat] GET job/:jobid to return userargs for a job in addition to status information
[ https://issues.apache.org/jira/browse/HIVE-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5031: - Attachment: HIVE-5031-5.patch Resync with trunk. [WebHCat] GET job/:jobid to return userargs for a job in addition to status information -- Key: HIVE-5031 URL: https://issues.apache.org/jira/browse/HIVE-5031 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5031-1.patch, HIVE-5031-2.patch, HIVE-5031-3.patch, HIVE-5031-4.patch, HIVE-5031-5.patch It would be nice to also return any user args that were passed into the job creation API, including job-type-specific information (e.g. mapreduce libjars). NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5274) HCatalog package renaming backward compatibility follow-up
[ https://issues.apache.org/jira/browse/HIVE-5274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13776919#comment-13776919 ] Daniel Dai commented on HIVE-5274: -- +1 HCatalog package renaming backward compatibility follow-up -- Key: HIVE-5274 URL: https://issues.apache.org/jira/browse/HIVE-5274 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.12.0 Attachments: HIVE-5274.2.patch, HIVE-5274.3.patch, HIVE-5274.4.patch As part of HIVE-4869, the hbase storage handler in hcat was moved to org.apache.hive.hcatalog, and then put back to org.apache.hcatalog since it was intended to be deprecated as well. However, it imports and uses several org.apache.hive.hcatalog classes. This needs to be changed to use org.apache.hcatalog classes. == Note: The above is a complete description of this issue in and of itself; the following is more detail on the backward-compatibility goal I have (not saying that each of these things is violated): a) People using org.apache.hcatalog packages should continue being able to use that package, and see no difference at compile time or runtime. All code here is considered deprecated, and will be gone by the time hive 0.14 rolls around. Additionally, org.apache.hcatalog should behave as if it were 0.11 for all compatibility purposes. b) People using org.apache.hive.hcatalog packages should never have an org.apache.hcatalog dependency injected in. Thus, it is okay for org.apache.hcatalog to use org.apache.hive.hcatalog packages internally (say HCatUtil, for example), as long as any interfaces only expose org.apache.hcatalog.\* For tests that test org.apache.hcatalog.\*, we must be capable of testing it from a pure org.apache.hcatalog.\* world. It is never okay for org.apache.hive.hcatalog to use org.apache.hcatalog, even in tests. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5035) [WebHCat] Hardening parameters for Windows
[ https://issues.apache.org/jira/browse/HIVE-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5035: - Attachment: HIVE-5035-2.patch Resync with trunk and fix checkstyle warning. [WebHCat] Hardening parameters for Windows -- Key: HIVE-5035 URL: https://issues.apache.org/jira/browse/HIVE-5035 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5035-1.patch, HIVE-5035-2.patch Everything passed to the pig/hive/hadoop command line will be quoted. That includes: mapreducejar (libjars, arg, define); mapreducestream (cmdenv, define, arg); pig (arg, execute); hive (arg, define, execute). NO PRECOMMIT TESTS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
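The hardening above boils down to quoting every user-supplied value before it lands on a pig/hive/hadoop command line. A minimal sketch of the idea, using POSIX quoting via Python's shlex (the actual patch targets Windows cmd.exe, whose quoting rules differ, and the `build_command` helper is illustrative, not WebHCat's code):

```python
import shlex

def build_command(prog, user_args):
    """Quote each user-supplied argument so shell metacharacters
    (spaces, semicolons, quotes) are delivered to the program as
    literal characters instead of being interpreted by the shell."""
    return " ".join([prog] + [shlex.quote(a) for a in user_args])

# Unquoted, the semicolon would end the command and start a new one;
# quoted, the whole string arrives as a single literal argument.
print(build_command("hive", ["-e", "select 1; drop table t;"]))
```

The same reasoning is why each of the listed parameters (libjars, arg, define, cmdenv, execute) must be quoted: all of them carry values chosen by the remote caller.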
[jira] [Updated] (HIVE-5066) [WebHCat] Other code fixes for Windows
[ https://issues.apache.org/jira/browse/HIVE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5066: - Attachment: HIVE-5066-4.patch Fixed checkstyle warnings. [WebHCat] Other code fixes for Windows -- Key: HIVE-5066 URL: https://issues.apache.org/jira/browse/HIVE-5066 Project: Hive Issue Type: Sub-task Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 Attachments: HIVE-5034-1.patch, HIVE-5066-2.patch, HIVE-5066-3.patch, HIVE-5066-4.patch This is equivalent to HCATALOG-526, but updated to sync with latest trunk. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5034) [WebHCat] Make WebHCat work for Windows
[ https://issues.apache.org/jira/browse/HIVE-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-5034. -- Resolution: Fixed All sub-tasks are resolved. Close the ticket. [WebHCat] Make WebHCat work for Windows --- Key: HIVE-5034 URL: https://issues.apache.org/jira/browse/HIVE-5034 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.12.0 This is the umbrella Jira to fix WebHCat on Windows. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5458) [WebHCat] Missing test.other.user.name parameter in e2e build.xml
[ https://issues.apache.org/jira/browse/HIVE-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5458: - Status: Patch Available (was: Open) [WebHCat] Missing test.other.user.name parameter in e2e build.xml - Key: HIVE-5458 URL: https://issues.apache.org/jira/browse/HIVE-5458 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5458-1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5458) [WebHCat] Missing test.other.user.name parameter in e2e build.xml
[ https://issues.apache.org/jira/browse/HIVE-5458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5458: - Attachment: HIVE-5458-1.patch [WebHCat] Missing test.other.user.name parameter in e2e build.xml - Key: HIVE-5458 URL: https://issues.apache.org/jira/browse/HIVE-5458 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5458-1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5458) [WebHCat] Missing test.other.user.name parameter in e2e build.xml
Daniel Dai created HIVE-5458: Summary: [WebHCat] Missing test.other.user.name parameter in e2e build.xml Key: HIVE-5458 URL: https://issues.apache.org/jira/browse/HIVE-5458 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5458-1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5507) [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness
[ https://issues.apache.org/jira/browse/HIVE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5507: - Attachment: HIVE-5507-1.patch [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness - Key: HIVE-5507 URL: https://issues.apache.org/jira/browse/HIVE-5507 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5507-1.patch When we run templeton e2e tests, we need to specify the test.other.user.name parameter for a second templeton user; it is missing from build.xml. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5507) [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness
Daniel Dai created HIVE-5507: Summary: [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness Key: HIVE-5507 URL: https://issues.apache.org/jira/browse/HIVE-5507 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5507-1.patch When we run templeton e2e tests, we need to specify the test.other.user.name parameter for a second templeton user; it is missing from build.xml. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5507) [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness
[ https://issues.apache.org/jira/browse/HIVE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5507: - Status: Patch Available (was: Open) [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness - Key: HIVE-5507 URL: https://issues.apache.org/jira/browse/HIVE-5507 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5507-1.patch When we run templeton e2e tests, we need to specify the test.other.user.name parameter for a second templeton user; it is missing from build.xml. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5508) [WebHCat] ignore log collector e2e tests for Hadoop 2
Daniel Dai created HIVE-5508: Summary: [WebHCat] ignore log collector e2e tests for Hadoop 2 Key: HIVE-5508 URL: https://issues.apache.org/jira/browse/HIVE-5508 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 The log collector currently only works with Hadoop 1; if run under Hadoop 2, no logs are collected. Templeton e2e tests check for the existence of those logs, so they will fail under Hadoop 2 and need to be disabled when running there. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5508) [WebHCat] ignore log collector e2e tests for Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5508: - Attachment: HIVE-5508-1.patch [WebHCat] ignore log collector e2e tests for Hadoop 2 - Key: HIVE-5508 URL: https://issues.apache.org/jira/browse/HIVE-5508 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5508-1.patch The log collector currently only works with Hadoop 1; if run under Hadoop 2, no logs are collected. Templeton e2e tests check for the existence of those logs, so they will fail under Hadoop 2 and need to be disabled when running there. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5509) [WebHCat] TestDriverCurl to use string comparison for jobid
[ https://issues.apache.org/jira/browse/HIVE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5509: - Attachment: HIVE-5509-1.patch [WebHCat] TestDriverCurl to use string comparison for jobid --- Key: HIVE-5509 URL: https://issues.apache.org/jira/browse/HIVE-5509 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5509-1.patch In TestDriverCurl.pm, we sort the job status array returned by templeton using: {code} sort { $a->{id} <=> $b->{id} } {code} However, <=> is used to compare numbers and jobid is a string, so the comparison is wrong. This causes test JOBS_4 to fail in some cases. -- This message was sent by Atlassian JIRA (v6.1#6144)
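The bug above is the classic numeric-vs-string comparison mix-up: Perl's `<=>` numifies a jobid like `job_201310071234_0001` to 0, so every comparison returns 0 and the resulting order is undefined, while `cmp` compares the strings, whose zero-padded fields sort chronologically. The same contrast sketched in Python (the jobids here are made up for illustration):

```python
# Hadoop jobids are strings; their zero-padded fields make plain
# lexicographic (string) order match chronological order.
ids = ["job_201310071234_0010",
       "job_201310071234_0002",
       "job_201310071234_0001"]

# String comparison (Perl's cmp) gives the intended order.
print(sorted(ids))

# A numeric sort would have to parse out the trailing sequence number;
# treating the whole id as one number (Perl's <=>) is simply wrong.
by_seq = sorted(ids, key=lambda j: int(j.rsplit("_", 1)[1]))
print(by_seq)
```

Both sorts agree here, which is exactly why the one-character fix from `<=>` to `cmp` is sufficient.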
[jira] [Updated] (HIVE-5508) [WebHCat] ignore log collector e2e tests for Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5508: - Status: Patch Available (was: Open) [WebHCat] ignore log collector e2e tests for Hadoop 2 - Key: HIVE-5508 URL: https://issues.apache.org/jira/browse/HIVE-5508 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5508-1.patch The log collector currently only works with Hadoop 1; if run under Hadoop 2, no logs are collected. Templeton e2e tests check for the existence of those logs, so they will fail under Hadoop 2 and need to be disabled when running there. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5509) [WebHCat] TestDriverCurl to use string comparison for jobid
Daniel Dai created HIVE-5509: Summary: [WebHCat] TestDriverCurl to use string comparison for jobid Key: HIVE-5509 URL: https://issues.apache.org/jira/browse/HIVE-5509 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5509-1.patch In TestDriverCurl.pm, we sort the job status array returned by templeton using: {code} sort { $a->{id} <=> $b->{id} } {code} However, <=> is used to compare numbers and jobid is a string, so the comparison is wrong. This causes test JOBS_4 to fail in some cases. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5509) [WebHCat] TestDriverCurl to use string comparison for jobid
[ https://issues.apache.org/jira/browse/HIVE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5509: - Status: Patch Available (was: Open) [WebHCat] TestDriverCurl to use string comparison for jobid --- Key: HIVE-5509 URL: https://issues.apache.org/jira/browse/HIVE-5509 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5509-1.patch In TestDriverCurl.pm, we sort the job status array returned by templeton using: {code} sort { $a->{id} <=> $b->{id} } {code} However, <=> is used to compare numbers and jobid is a string, so the comparison is wrong. This causes test JOBS_4 to fail in some cases. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
Daniel Dai created HIVE-5510: Summary: [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch GET job/queue of a TempletonController job returns wrong information: a mix of the child job and the controller job itself. It should only return the information of the controller job itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Status: Patch Available (was: Open) [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch GET job/queue of a TempletonController job returns wrong information: a mix of the child job and the controller job itself. It should only return the information of the controller job itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Attachment: HIVE-5510-1.patch [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch GET job/queue of a TempletonController job returns wrong information: a mix of the child job and the controller job itself. It should only return the information of the controller job itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-4441) [WebHCat] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Attachment: HIVE-4441-1.patch [WebHCat] WebHCat does not honor user home directory Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Reporter: Daniel Dai Attachments: HIVE-4441-1.patch If I submit a job as user A and specify statusdir as a relative path, I would expect results to be stored in a folder relative to user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d 'execute=show+tables;' -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
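The expected behavior amounts to resolving a relative statusdir against the requesting user's HDFS home directory (/user/&lt;user.name&gt;) rather than the server user's. A sketch of that resolution rule, where posixpath stands in for HDFS path handling and the function name is illustrative, not WebHCat's actual code:

```python
import posixpath

def resolve_statusdir(statusdir, user_name):
    """Resolve a relative statusdir against /user/<user.name>,
    the requesting user's HDFS home, instead of the server's.
    Absolute paths are left untouched."""
    if posixpath.isabs(statusdir):
        return statusdir
    return posixpath.join("/user", user_name, statusdir)

print(resolve_statusdir("pokes.output", "hdinsightuser"))
print(resolve_statusdir("/tmp/out", "hdinsightuser"))
```

With this rule, the curl example above would land its results under /user/hdinsightuser/pokes.output as expected.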
[jira] [Updated] (HIVE-4441) [HCatalog] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Summary: [HCatalog] WebHCat does not honor user home directory (was: [WebHCat] WebHCat does not honor user home directory) [HCatalog] WebHCat does not honor user home directory - Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Reporter: Daniel Dai Attachments: HIVE-4441-1.patch If I submit a job as user A and specify statusdir as a relative path, I would expect results to be stored in a folder relative to user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d 'execute=show+tables;' -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
Daniel Dai created HIVE-4444: Summary: [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Currently there are no files and args parameters in Hive. We should add them to make Hive similar to Pig. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Component/s: HCatalog [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
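The motivation above is replacing an N+1 request pattern (GET queue, then GET queue/:jobid per job) with a single request. A toy in-memory model of the proposed endpoints — purely illustrative, not WebHCat's implementation:

```python
# Toy server state: per-job details keyed by jobid.
JOBS = {
    "job_0001": {"status": "SUCCEEDED", "user": "alice"},
    "job_0002": {"status": "RUNNING",   "user": "bob"},
}

def get_queue():
    """Deprecated pattern: returns only the jobids."""
    return sorted(JOBS)

def get_queue_job(jobid):
    """Deprecated pattern: returns details for a single job."""
    return JOBS[jobid]

def get_jobs(fields=None):
    """Proposed: one call returns ids, or full details with fields='*'."""
    if fields == "*":
        return [{"id": j, "detail": JOBS[j]} for j in sorted(JOBS)]
    return [{"id": j} for j in sorted(JOBS)]

# Old pattern: 1 + N requests to build a summary.
summary_old = [get_queue_job(j) for j in get_queue()]
# New pattern: one request.
summary_new = get_jobs(fields="*")
print(len(summary_old), len(summary_new))
```

The round-trip savings grow linearly with the number of jobs, which is the whole point of the proposal.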
[jira] [Updated] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call
[ https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4442: - Component/s: HCatalog [HCatalog] WebHCat should not override user.name parameter for Queue call - Key: HIVE-4442 URL: https://issues.apache.org/jira/browse/HIVE-4442 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Currently templeton's Queue call uses user.name to filter the results of the call in addition to the default security. Ideally the filter would be an optional parameter to the call, independent of the security check. I would suggest a parameter, in addition to GET queue (jobs), that gives you all the jobs a user has permission to see: GET queue?showall=true -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4441) [HCatalog] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Component/s: HCatalog [HCatalog] WebHCat does not honor user home directory - Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4441-1.patch If I submit a job as user A and specify statusdir as a relative path, I would expect results to be stored in a folder relative to user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d 'execute=show+tables;' -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call
[ https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4442: - Attachment: HIVE-4442-1.patch [HCatalog] WebHCat should not override user.name parameter for Queue call - Key: HIVE-4442 URL: https://issues.apache.org/jira/browse/HIVE-4442 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4442-1.patch Currently templeton's Queue call uses user.name to filter the results of the call in addition to the default security. Ideally the filter would be an optional parameter to the call, independent of the security check. I would suggest a parameter, in addition to GET queue (jobs), that gives you all the jobs a user has permission to see: GET queue?showall=true -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-1.patch Attach patch. The patch also contains e2e tests for HIVE-4442, because HIVE-4442 and HIVE-4443 are closely intertwined and it is hard to separate the tests. [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: (was: HIVE-4443-1.patch) [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-1.patch [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call
[ https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644674#comment-13644674 ] Daniel Dai commented on HIVE-4442: -- Attach patch. Note the e2e tests are intertwined with HIVE-4443; I include all tests in HIVE-4443. [HCatalog] WebHCat should not override user.name parameter for Queue call - Key: HIVE-4442 URL: https://issues.apache.org/jira/browse/HIVE-4442 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4442-1.patch Currently templeton's Queue call uses user.name to filter the results of the call in addition to the default security. Ideally the filter would be an optional parameter to the call, independent of the security check. I would suggest a parameter, in addition to GET queue (jobs), that gives you all the jobs a user has permission to see: GET queue?showall=true -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4444: - Attachment: HIVE-4444-1.patch [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4444-1.patch Currently there are no files and args parameters in Hive. We should add them to make Hive similar to Pig. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: (was: HIVE-4443-1.patch) [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed job info * GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-1.patch [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark as deprecated * GET queue/jobID - mark as deprecated * DELETE queue/jobID - mark as deprecated * GET jobs - return the list of JSON objects containing jobids but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed Job info * GET jobs/jobID - return the single JSON object containing the detailed Job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID
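The proposed URL scheme above can be sketched as a small helper. The endpoint names come from the proposal; the helper function itself is hypothetical, and the proposal's `GET jobs/fields=*` is rendered here as a query parameter, which is one plausible reading of that shorthand.

```python
# Sketch of the proposed 'jobs' resources that would replace the deprecated
# 'queue' resources. The URL builder is hypothetical; only the resource
# names are taken from the proposal.
BASE = "templeton/v1"

def jobs_url(jobid=None, fields=None):
    """Build a URL for the proposed GET jobs API."""
    if jobid is not None:
        return f"{BASE}/jobs/{jobid}"    # detailed info for one job
    if fields == "*":
        return f"{BASE}/jobs?fields=*"   # detailed info for all jobs
    return f"{BASE}/jobs"                # jobids only, no detail

print(jobs_url())                        # templeton/v1/jobs
print(jobs_url(fields="*"))              # templeton/v1/jobs?fields=*
print(jobs_url("job_201310140001"))      # templeton/v1/jobs/job_201310140001
```

A client that previously issued N+1 calls (`GET queue`, then `GET queue/:jobid` per job) would issue the single `fields=*` form instead.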
[jira] [Updated] (HIVE-4446) [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444
[ https://issues.apache.org/jira/browse/HIVE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4446: - Attachment: HIVE-4446-1.patch [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE- Key: HIVE-4446 URL: https://issues.apache.org/jira/browse/HIVE-4446 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4446-1.patch
[jira] [Commented] (HIVE-4465) webhcat e2e tests succeed regardless of exitvalue
[ https://issues.apache.org/jira/browse/HIVE-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646421#comment-13646421 ] Daniel Dai commented on HIVE-4465: -- +1 webhcat e2e tests succeed regardless of exitvalue - Key: HIVE-4465 URL: https://issues.apache.org/jira/browse/HIVE-4465 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4465.patch Currently the webhcat tests that check job status for Pig, Hive, and MR do not check the exit value of the job. So a job can fail and the test will still succeed.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-1.patch Attaching initial patch. [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: (was: HIVE-4531-1.patch) [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-1.patch [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-2.patch [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Release Note: In a POST pig/hive/jar/streaming request, if statusdir is set and enablelog=true, webhcat will create a $statusdir/logs directory after the task finishes. The attempts here include both completed attempts and failed attempts. The layout of the logs directory is: logs/$job_id (directory for $job_id) logs/$job_id/job.xml.html logs/$job_id/$attempt_id (directory for $attempt_id) logs/$job_id/$attempt_id/stderr logs/$job_id/$attempt_id/stdout logs/$job_id/$attempt_id/syslog [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
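The directory layout in the release note can be reconstructed as a path-building sketch. The job and attempt ids below are made up for illustration; only the `logs/$job_id/$attempt_id/{stderr,stdout,syslog}` shape comes from the release note.

```python
import posixpath

# Rebuild the $statusdir/logs layout from the release note for one attempt.
# Job and attempt ids here are invented examples.
def attempt_log_paths(statusdir, job_id, attempt_id):
    attempt_dir = posixpath.join(statusdir, "logs", job_id, attempt_id)
    return [posixpath.join(attempt_dir, name) for name in ("stderr", "stdout", "syslog")]

paths = attempt_log_paths("hive.output", "job_201310140001", "attempt_201310140001_0001_m_000000_0")
for p in paths:
    print(p)
```

So a client that knows the job's statusdir can locate every attempt's stderr/stdout/syslog without any extra API call, by listing `logs/$job_id` on HDFS.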
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-3.patch Fixed several bugs in HIVE-4531-3.patch [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: HIVE-4531-4.patch Adding documentation. [WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch It would be nice to collect task logs after a job finishes. This is similar to what Amazon EMR does.
[jira] [Updated] (HIVE-4441) [HCatalog] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Attachment: HIVE-4441-2.patch Found one bug in the original patch. Attaching HIVE-4441-2.patch. [HCatalog] WebHCat does not honor user home directory - Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4441-1.patch, HIVE-4441-2.patch If I submit a job as user A and I specify statusdir as a relative path, I would expect results to be stored in the folder relative to user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d execute=show+tables; -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code}
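The expected behavior in the report can be sketched as a resolution rule: a relative statusdir should resolve against the HDFS home of the `user.name` submitter, not the server user's home. The helper below is hypothetical (the real code is Java working with Hadoop `Path` objects) and assumes the conventional `/user/<name>` HDFS home layout.

```python
import posixpath

# Expected resolution rule from the bug report: relative statusdir paths
# resolve against the submitting user's HDFS home (/user/<user.name> is
# assumed here as the conventional home layout).
def resolve_statusdir(statusdir, user_name):
    if posixpath.isabs(statusdir):
        return statusdir                 # absolute paths are used verbatim
    return posixpath.join("/user", user_name, statusdir)

print(resolve_statusdir("pokes.output", "hdinsightuser"))  # /user/hdinsightuser/pokes.output
print(resolve_statusdir("/tmp/out", "hdinsightuser"))      # /tmp/out
```

The bug is that the server effectively resolved against its own user (`hdp`) instead of the `user.name` carried in the request.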
[jira] [Updated] (HIVE-5453) jobsubmission2.conf should use 'timeout' property
[ https://issues.apache.org/jira/browse/HIVE-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5453: - Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1. Change committed to trunk. jobsubmission2.conf should use 'timeout' property - Key: HIVE-5453 URL: https://issues.apache.org/jira/browse/HIVE-5453 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5453.patch TestDriverCurl.pm used to support timeout_seconds, which got renamed to 'timeout'. This makes the TestHeartbeat test fail. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5448) webhcat duplicate test TestMapReduce_2 should be removed
[ https://issues.apache.org/jira/browse/HIVE-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5448: - Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1. Patch committed to trunk. webhcat duplicate test TestMapReduce_2 should be removed Key: HIVE-5448 URL: https://issues.apache.org/jira/browse/HIVE-5448 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.12.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 Attachments: HIVE-5448.1.patch TestMapReduce_2 in jobsubmission.conf should be removed, as it is a duplicate of TestHeartbeat_2 in jobsubmission2.conf NO PRECOMMIT TESTS
[jira] [Created] (HIVE-5535) [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022
Daniel Dai created HIVE-5535: Summary: [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 Key: HIVE-5535 URL: https://issues.apache.org/jira/browse/HIVE-5535 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5535-1.patch The test complains about having no permission to the output directory /tmp/templeton_test_out/$runid. This is because /tmp/templeton_test_out/$runid is created with umask 022 by user test.other.user.name (the userid of the first test in the group JOBS_1). Other users cannot write to it (JOBS_2 runs as userid test.user.name).
[jira] [Updated] (HIVE-5535) [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022
[ https://issues.apache.org/jira/browse/HIVE-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5535: - Attachment: HIVE-5535-1.patch [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 --- Key: HIVE-5535 URL: https://issues.apache.org/jira/browse/HIVE-5535 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5535-1.patch The test complains about having no permission to the output directory /tmp/templeton_test_out/$runid. This is because /tmp/templeton_test_out/$runid is created with umask 022 by user test.other.user.name (the userid of the first test in the group JOBS_1). Other users cannot write to it (JOBS_2 runs as userid test.user.name).
[jira] [Updated] (HIVE-5535) [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022
[ https://issues.apache.org/jira/browse/HIVE-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5535: - Status: Patch Available (was: Open) [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 --- Key: HIVE-5535 URL: https://issues.apache.org/jira/browse/HIVE-5535 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5535-1.patch The test complains about having no permission to the output directory /tmp/templeton_test_out/$runid. This is because /tmp/templeton_test_out/$runid is created with umask 022 by user test.other.user.name (the userid of the first test in the group JOBS_1). Other users cannot write to it (JOBS_2 runs as userid test.user.name).
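The permission arithmetic behind this failure is simple to verify: with umask 022, a directory requested with mode 0777 is actually created as 0755, so group and others lose the write bit and a second test user cannot write into it.

```python
# With umask 022, a directory requested as 0777 is created as
# 0777 & ~022 = 0755: owner rwx, group/other r-x only. A different
# user (here JOBS_2's test.user.name) therefore cannot write into
# /tmp/templeton_test_out/$runid created by test.other.user.name.
umask = 0o022
mode = 0o777 & ~umask
print(oct(mode))                    # 0o755
other_can_write = bool(mode & 0o002)
print(other_can_write)              # False
```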
[jira] [Updated] (HIVE-5508) [WebHCat] ignore log collector e2e tests for Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5508: - Attachment: HIVE-5508-2.patch Removed unrelated code (which was borrowed from the Pig harness). [WebHCat] ignore log collector e2e tests for Hadoop 2 - Key: HIVE-5508 URL: https://issues.apache.org/jira/browse/HIVE-5508 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5508-1.patch, HIVE-5508-2.patch The log collector currently only works with Hadoop 1. If run under Hadoop 2, no log will be collected. The Templeton e2e tests check the existence of those logs, so they will fail under Hadoop 2. We need to disable them when running under Hadoop 2.
[jira] [Created] (HIVE-5541) [WebHCat] Log collector does not work since we don't close the hdfs status file
Daniel Dai created HIVE-5541: Summary: [WebHCat] Log collector does not work since we don't close the hdfs status file Key: HIVE-5541 URL: https://issues.apache.org/jira/browse/HIVE-5541 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0
[jira] [Updated] (HIVE-5541) [WebHCat] Log collector does not work since we don't close the hdfs status file
[ https://issues.apache.org/jira/browse/HIVE-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5541: - Resolution: Duplicate Status: Resolved (was: Patch Available) Didn't realize we already opened HIVE-5540 for that. Closing as duplicate. [WebHCat] Log collector does not work since we don't close the hdfs status file --- Key: HIVE-5541 URL: https://issues.apache.org/jira/browse/HIVE-5541 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5541-1.patch
[jira] [Updated] (HIVE-5541) [WebHCat] Log collector does not work since we don't close the hdfs status file
[ https://issues.apache.org/jira/browse/HIVE-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5541: - Status: Patch Available (was: Open) [WebHCat] Log collector does not work since we don't close the hdfs status file --- Key: HIVE-5541 URL: https://issues.apache.org/jira/browse/HIVE-5541 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5541-1.patch
[jira] [Updated] (HIVE-5540) webhcat e2e test failures
[ https://issues.apache.org/jira/browse/HIVE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5540: - Attachment: HIVE-5540-1.patch webhcat e2e test failures - Key: HIVE-5540 URL: https://issues.apache.org/jira/browse/HIVE-5540 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Daniel Dai Attachments: HIVE-5540-1.patch, test_harnesss_1381796256 With current state of trunk repo (see below) WebHCat e2e tests have 4 errors all of the same type. Expect 1 jobs in logs, but get 1 Test run log file attached commit b612a91f7f09f45474f593f99039ec78d2c03b68 Author: Edward Capriolo ecapri...@apache.org Date: Mon Oct 14 21:40:44 2013 + An explode function that includes the item's position in the array (Niko Stahl via egc) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532108 13f79535-47bb-0310-9956-ffa450edef68 commit ad18f0747a3448fc8cda2197df6223e8abc93dc6 Author: Brock Noland br...@apache.org Date: Mon Oct 14 21:22:12 2013 + HIVE-5423 - Speed up testing of scalar UDFS (Edward Capriolo via Brock Noland) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532103 13f79535-47bb-0310-9956-ffa450edef68 commit 4a2152b0f7df5a3f42f8577a27c2a9269697def1 Author: Thejas Madhavan Nair the...@apache.org Date: Mon Oct 14 20:31:22 2013 + HIVE-5508 : [WebHCat] ignore log collector e2e tests for Hadoop 2 (Daniel Dai via Thejas Nair) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532077 13f79535-47bb-0310-9956-ffa450edef68 commit 83865be207bc044c506eb957c01e8fcbf551b7d1 Author: Thejas Madhavan Nair the...@apache.org Date: Mon Oct 14 20:14:34 2013 + HIVE-5535 : [WebHCat] Webhcat e2e test JOBS_2 fail due to permission when hdfs umask setting is 022 (Daniel Dai via Thejas Nair) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532054 13f79535-47bb-0310-9956-ffa450edef68 commit 78da38b50d2264ebdcca0651f7c6f3750eaf1221 Author: Brock Noland br...@apache.org 
Date: Mon Oct 14 19:50:55 2013 + HIVE-5526 - NPE in ConstantVectorExpression.evaluate(vrg) (Remus Rusanu via Brock Noland) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532044 13f79535-47bb-0310-9956-ffa450edef68 commit 58a3275477fc2c4f85dfb0a729150732a8230579 Author: Thejas Madhavan Nair the...@apache.org Date: Mon Oct 14 19:02:22 2013 + HIVE-5509 : [WebHCat] TestDriverCurl to use string comparison for jobid (Daniel Dai via Thejas Nair) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532026 13f79535-47bb-0310-9956-ffa450edef68 commit 95e45ede68be95603c6f43e06d9e68b20218b54f Author: Thejas Madhavan Nair the...@apache.org Date: Mon Oct 14 19:00:44 2013 + HIVE-5507: [WebHCat] test.other.user.name parameter is missing from build.xml in e2e harness (Daniel Dai via Thejas Nair) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532025 13f79535-47bb-0310-9956-ffa450edef68 commit 976ece58f134e02384e4f54474c1749b85c03934 Author: Jianyong Dai da...@apache.org Date: Mon Oct 14 18:38:29 2013 + HIVE-5448: webhcat duplicate test TestMapReduce_2 should be removed (Thejas M Nair via Daniel Dai) git-svn-id: https://svn.apache.org/repos/asf/hive/trunk@1532018 13f79535-47bb-0310-9956-ffa450edef68
[jira] [Updated] (HIVE-5541) [WebHCat] Log collector does not work since we don't close the hdfs status file
[ https://issues.apache.org/jira/browse/HIVE-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5541: - Attachment: HIVE-5541-1.patch [WebHCat] Log collector does not work since we don't close the hdfs status file --- Key: HIVE-5541 URL: https://issues.apache.org/jira/browse/HIVE-5541 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5541-1.patch
[jira] [Updated] (HIVE-5540) webhcat e2e test failures
[ https://issues.apache.org/jira/browse/HIVE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5540: - Status: Patch Available (was: Open) webhcat e2e test failures - Key: HIVE-5540 URL: https://issues.apache.org/jira/browse/HIVE-5540 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Daniel Dai Attachments: HIVE-5540-1.patch, test_harnesss_1381796256 With current state of trunk repo (see the commit log in the earlier update) WebHCat e2e tests have 4 errors all of the same type. Expect 1 jobs in logs, but get 1 Test run log file attached
[jira] [Updated] (HIVE-5540) webhcat e2e test failures: Expect 1 jobs in logs, but get 1
[ https://issues.apache.org/jira/browse/HIVE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5540: - Attachment: HIVE-5540-2.patch Should close the file in a finally block instead. Updated the patch. webhcat e2e test failures: Expect 1 jobs in logs, but get 1 - Key: HIVE-5540 URL: https://issues.apache.org/jira/browse/HIVE-5540 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Daniel Dai Attachments: HIVE-5540-1.patch, HIVE-5540-2.patch, test_harnesss_1381796256 With current state of trunk repo (see the commit log in the earlier update) WebHCat e2e tests have 4 errors all of the same type. Expect 1 jobs in logs, but get 1 Test run log file attached
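The "close the file in a finally block" fix follows a standard pattern: the status file must be closed even when the writer throws, otherwise buffered state may never be flushed to HDFS. The real code is Java writing through the HDFS client; the sketch below just shows the try/finally shape with an invented status line.

```python
import os
import tempfile

# Pattern behind the fix: close the status file in finally so it is
# flushed on both the success and failure paths. (Sketch only; the real
# WebHCat code is Java writing an HDFS status file.)
path = os.path.join(tempfile.mkdtemp(), "status")
f = open(path, "w")
try:
    f.write("job_201310140001 SUCCEEDED\n")   # invented status payload
finally:
    f.close()                                 # runs on success and on error alike

print(open(path).read().strip())
```

Without the finally, an exception between open and close leaks the handle and, on HDFS, can leave the file unreadable or empty for the log collector.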
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Attachment: HIVE-5510-2.patch Addressing Eugene's review comments. [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch, HIVE-5510-2.patch, test_harnesss_1381798977 GET job/queue of a TempletonController job returns weird information. It is a mix of the child job and itself. It should only pull the information of the controller job itself.
[jira] [Updated] (HIVE-5540) webhcat e2e test failures: Expect 1 jobs in logs, but get 1
[ https://issues.apache.org/jira/browse/HIVE-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5540: - Attachment: HIVE-5540-3.patch Added the comments Eugene suggested. webhcat e2e test failures: Expect 1 jobs in logs, but get 1 - Key: HIVE-5540 URL: https://issues.apache.org/jira/browse/HIVE-5540 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Daniel Dai Attachments: HIVE-5540-1.patch, HIVE-5540-2.patch, HIVE-5540-3.patch, test_harnesss_1381796256 With current state of trunk repo (see the commit log in the earlier update) WebHCat e2e tests have 4 errors all of the same type. Expect 1 jobs in logs, but get 1 Test run log file attached
[jira] [Commented] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13797256#comment-13797256 ] Daniel Dai commented on HIVE-5510: -- bq. build.xml changes concurrency from 5/5 to 1/1. Why? That's unintended; will remove it. bq. TempletonControllerJob.Watcher.run() bq. This also creates a JobState for each child task. What is purpose of this? Just so each child knows its parent. bq. We never (AFAIK) get any notifications from Hadoop when a subtask completes. When is this JobState (for child task) ever updated? If not, then why create it? Conversely, maybe it would be useful to track the state of child task, but I think that would require more changes like registering a callback with JobTracker for each child task that we find. Don't know whether this is really useful. Only the parent field of each child job is meaningful; we don't track other changes for the child job. bq. DeleteDelegator bq. it uses System.err for logging. Why not log4j which will be in webhcat.log with WARN log level Yes, we should; will change. bq. JobState: bq. getChildId() is never used This is used by StatusDelegator. Actually getChildren is never used, but it has been there for a while; I don't want to remove it in this patch. [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch, HIVE-5510-2.patch, test_harnesss_1381798977 GET job/queue for a TempletonController job returns weird information: it is a mix of the child job and the controller itself. It should only pull the information of the controller job itself. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5133) webhcat jobs that need to access metastore fails in secure mode
[ https://issues.apache.org/jira/browse/HIVE-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5133: - Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Patch committed to trunk. webhcat jobs that need to access metastore fails in secure mode --- Key: HIVE-5133 URL: https://issues.apache.org/jira/browse/HIVE-5133 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5133.1.patch, HIVE-5133.1.test.patch, HIVE-5133.2.patch, HIVE-5133.3.patch, HIVE-5133.5.patch, HIVE-5133.6.patch Webhcat job submission requests result in the pig/hive/mr job being run from a map task that it launches. In secure mode, for the pig/hive/mr job that is run to be authorized to perform actions on metastore, it has to have the delegation tokens from the hive metastore. In case of pig/MR job this is needed if hcatalog is being used in the script/job. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5481) WebHCat e2e test: TestStreaming -ve tests should also check for job completion success
[ https://issues.apache.org/jira/browse/HIVE-5481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13802456#comment-13802456 ] Daniel Dai commented on HIVE-5481: -- HIVE-5510 makes similar changes. We can wait for HIVE-5510 to be checked in and then close this ticket. WebHCat e2e test: TestStreaming -ve tests should also check for job completion success -- Key: HIVE-5481 URL: https://issues.apache.org/jira/browse/HIVE-5481 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Minor Fix For: 0.13.0 Attachments: HIVE-5481.1.patch The TempletonController job will succeed even for the -ve tests; however, the exit value should be non-zero. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Attachment: HIVE-5510-3.patch Yes, you are right. And setChildren/setChildren too. Removing all those unused methods. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4446) [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444
[ https://issues.apache.org/jira/browse/HIVE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803152#comment-13803152 ] Daniel Dai commented on HIVE-4446: -- Thanks Lefty, the documentation for this JIRA looks good to me. There are additional documentation changes not yet ported to the cwiki, such as HIVE-5031, HIVE-4531, etc. [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444 Key: HIVE-4446 URL: https://issues.apache.org/jira/browse/HIVE-4446 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Assignee: Lefty Leverenz Attachments: HIVE-4446-1.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5696) WebHCat e2e tests/jobsubmission.conf file is malformed and loosing tests
[ https://issues.apache.org/jira/browse/HIVE-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-5696. -- Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed +1. Patch committed to trunk. WebHCat e2e tests/jobsubmission.conf file is malformed and loosing tests Key: HIVE-5696 URL: https://issues.apache.org/jira/browse/HIVE-5696 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HIVE-5696.patch there is a misplaced bracket and curly brace (see patch file) which causes the last 3 tests in TestHive to not be executed. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5510: - Attachment: HIVE-5510-4.patch HIVE-5510-4.patch resync with trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-5510. -- Resolution: Fixed Hadoop Flags: Reviewed Patch committed to trunk. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5728) Make ORC InputFormat/OutputFormat available outside Hive
Daniel Dai created HIVE-5728: Summary: Make ORC InputFormat/OutputFormat available outside Hive Key: HIVE-5728 URL: https://issues.apache.org/jira/browse/HIVE-5728 Project: Hive Issue Type: Improvement Components: File Formats Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 ORC InputFormat/OutputFormat is currently not usable outside Hive. There are several issues to solve: 1. Several classes are not public, e.g. OrcStruct 2. There is no InputFormat/OutputFormat for the new API (some tools, such as Pig, need the new API) 3. There is no way to push WriteOptions to the OutputFormat outside Hive -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat available outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Attachment: HIVE-5728-1.patch Attach HIVE-5728-1.patch. Summary of changes: 1. Create OrcNewInputFormat/OrcNewOutputFormat in the new API 2. Extract common pieces of OrcNewInputFormat/OrcInputFormat into OrcInputFormatUtils 3. Make several classes/methods public 4. Make WriteOptions configurable through Configuration 5. Add unit tests for the newly added InputFormat/OutputFormat -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Summary: Make ORC InputFormat/OutputFormat usable outside Hive (was: Make ORC InputFormat/OutputFormat available outside Hive) -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Attachment: HIVE-5728-2.patch Thanks. Fixed the format issue. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5728) Make ORC InputFormat/OutputFormat usable outside Hive
[ https://issues.apache.org/jira/browse/HIVE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5728: - Attachment: HIVE-5728-3.patch Addressing [~owen.omalley]'s review comments. The code movement is not necessary; I retained the code in OrcInputFormat this time. I also left OrcStruct non-public, since it is not absolutely needed here. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Resolved] (HIVE-5098) Fix metastore for SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved HIVE-5098. -- Resolution: Fixed Fix Version/s: 0.13.0 Hadoop Flags: Reviewed The fix is included in datanucleus-rdbms 3.2.9+. Will upgrade the datanucleus-rdbms version to pick it up (patch will be included in HIVE-5099). Fix metastore for SQL Server Key: HIVE-5098 URL: https://issues.apache.org/jira/browse/HIVE-5098 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5098-1.patch, HIVE-5098-2.patch We found one problem while testing the SQL Server metastore. In Hive code, we build a substring call with a single parameter in the datanucleus query (ExpressionTree.java):
{code}
if (partitionColumnIndex == (partitionColumnCount - 1)) {
  valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+"
      + keyEqualLength + ")";
} else {
  valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+"
      + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\""
      + keyEqual + "\")+" + keyEqualLength + ").indexOf(\"/\"))";
}
{code}
SQL Server does not support single-parameter SUBSTRING, and datanucleus does not fill the gap. The attached patch: 1. creates a new jar hive-datanucleusplugin.jar in $HIVE_HOME/lib 2. hive-datanucleusplugin.jar is a datanucleus plugin (includes plugin.xml, MANIFEST.MF) 3. The plugin provides a SQL Server-specific implementation of substring (which avoids the single-parameter SUBSTRING that SQL Server does not support) 4. The plugin code only kicks in when the RDBMS is SQL Server -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HIVE-5098) Fix metastore for SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13844562#comment-13844562 ] Daniel Dai commented on HIVE-5098: -- Here is the fix in datanucleus: http://www.datanucleus.org/servlet/jira/browse/NUCRDBMS-717 -- This message was sent by Atlassian JIRA (v6.1.4#6159)
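The SUBSTRING limitation behind HIVE-5098 is easy to reproduce in miniature. The sketch below is hypothetical illustration code, not Hive source: it mimics in plain Java the partition-name extraction that the generated JDOQL expression performs, and shows the portable rewrite. SQL Server's SUBSTRING takes three arguments (expression, start, length), so a one-argument "rest of the string" substring must be expressed with an explicit end, just as Java's two-argument substring does.

```java
// Hypothetical illustration (not Hive source): extracting the last partition
// key's value from a partition name such as "c=France/d=4", the way the
// generated JDOQL expression does it with substring operations.
public class PartitionNameSubstring {

    // Java's one-argument substring: everything from 'start' to the end.
    // The JDOQL equivalent translates to a single-parameter SUBSTRING,
    // which SQL Server rejects.
    static String tail(String s, int start) {
        return s.substring(start);
    }

    // Portable form: make the end explicit. In SQL terms this corresponds to
    // SUBSTRING(expr, start, LEN(expr) - start + 1), which SQL Server accepts.
    static String tailWithExplicitEnd(String s, int start) {
        return s.substring(start, s.length());
    }

    public static void main(String[] args) {
        String partitionName = "c=France/d=4";
        String keyEqual = "d=";
        int start = partitionName.indexOf(keyEqual) + keyEqual.length();
        // Both forms extract the value of the last partition key.
        System.out.println(tail(partitionName, start));                // 4
        System.out.println(tailWithExplicitEnd(partitionName, start)); // 4
    }
}
```

The hive-datanucleusplugin approach described above plugs a SQL Server-specific substring mapping in at exactly this point: same semantics, expressed with an explicit length.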
[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5099: - Attachment: HIVE-5099-2.patch Update the patch after http://www.datanucleus.org/servlet/jira/browse/NUCRDBMS-718. Need to upgrade the datanucleus version to pick up the change. Some partition publish operation cause OOM in metastore backed by SQL Server Key: HIVE-5099 URL: https://issues.apache.org/jira/browse/HIVE-5099 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5099-1.patch, HIVE-5099-2.patch For certain combinations of metastore operations, the operation hangs and the metastore server eventually fails due to OOM. This happens when the metastore is backed by SQL Server. Here is a testcase to reproduce:
{code}
CREATE TABLE tbl_repro_oom1 (a STRING, b INT) PARTITIONED BY (c STRING, d STRING);
CREATE TABLE tbl_repro_oom_2 (a STRING) PARTITIONED BY (e STRING);
ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='France', d=4);
ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='Russia', d=3);
ALTER TABLE tbl_repro_oom_2 ADD PARTITION (e='Russia');
ALTER TABLE tbl_repro_oom1 DROP PARTITION (c = 'India'); --failure
{code}
The code causing the issue is in ExpressionTree.java:
{code}
valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+"
    + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\""
    + keyEqual + "\")+" + keyEqualLength + ").indexOf(\"/\"))";
{code}
The snapshot of the partition table before the DROP PARTITION statement is:
{code}
PART_ID  CREATE_TIME  LAST_ACCESS_TIME  PART_NAME      SD_ID  TBL_ID
93       1376526718   0                 c=France/d=4   127    33
94       1376526718   0                 c=Russia/d=3   128    33
95       1376526718   0                 e=Russia       129    34
{code}
The datanucleus query tries to find the value of a particular key by locating "$key=" as the start and the following "/" as the end. For example, the value of c in c=France/d=4 is found by locating "c=" as the start and the following "/" as the end. However, this query fails if we try to find the value of e in e=Russia, since there is no trailing "/". Other databases work because the query plan first filters out the partitions not belonging to tbl_repro_oom1; whether this error surfaces or not depends on the query optimizer. When this exception happens, the metastore keeps retrying and throwing exceptions. The memory image of the metastore contains a large number of exception objects:
{code}
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter passed to the LEFT or SUBSTRING function.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet$FetchBuffer.nextRow(SQLServerResultSet.java:4762)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.fetchBufferNext(SQLServerResultSet.java:1682)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:955)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.datanucleus.store.rdbms.query.ForwardQueryResult.init(ForwardQueryResult.java:90)
at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:686)
at org.datanucleus.store.query.Query.executeQuery(Query.java:1791)
at org.datanucleus.store.query.Query.executeWithMap(Query.java:1694)
at org.datanucleus.api.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1715)
at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1590)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy4.getPartitionsByFilter(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2163)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at $Proxy5
{code}
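The HIVE-5099 failure mode can likewise be demonstrated in plain Java. This is a hypothetical sketch, not the actual ExpressionTree code: it extracts a partition key's value by taking everything between "key=" and the next "/", shows how a partition name without a trailing "/" (such as e=Russia) yields a negative end index (the Java analogue of SQL Server's "Invalid length parameter passed to the LEFT or SUBSTRING function"), and shows a guarded variant.

```java
// Hypothetical illustration (not Hive source) of the failure mode described
// above: indexOf("/") returns -1 when the partition name has no trailing "/",
// so the naive substring's end index goes negative.
public class PartitionValueExtraction {

    // Naive extraction, mirroring the generated expression: value runs from
    // just after "key=" up to the next "/".
    static String naiveValue(String partitionName, String key) {
        String keyEqual = key + "=";
        String rest = partitionName.substring(
            partitionName.indexOf(keyEqual) + keyEqual.length());
        return rest.substring(0, rest.indexOf("/")); // -1 when no trailing "/"
    }

    // Guarded extraction: treat a missing "/" as "value runs to the end".
    static String safeValue(String partitionName, String key) {
        String keyEqual = key + "=";
        String rest = partitionName.substring(
            partitionName.indexOf(keyEqual) + keyEqual.length());
        int slash = rest.indexOf("/");
        return slash < 0 ? rest : rest.substring(0, slash);
    }

    public static void main(String[] args) {
        System.out.println(naiveValue("c=France/d=4", "c")); // France
        System.out.println(safeValue("e=Russia", "e"));      // Russia
        try {
            naiveValue("e=Russia", "e"); // no trailing "/" -> negative end index
        } catch (StringIndexOutOfBoundsException ex) {
            System.out.println("failed as expected");
        }
    }
}
```

On SQL Server the same negative end index surfaces as the SUBSTRING/LEFT error in the stack trace above, because the translated SQL computes a negative length instead of throwing client-side.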
[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5099: - Attachment: (was: HCATALOG-48-1.patch)
[jira] [Updated] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-5099: - Attachment: HCATALOG-48-1.patch Attach patch, which: 1. Remove hive-datanucleusplugin 2. Upgrade datanucleus version
[jira] [Commented] (HIVE-5099) Some partition publish operation cause OOM in metastore backed by SQL Server
[ https://issues.apache.org/jira/browse/HIVE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13844746#comment-13844746 ] Daniel Dai commented on HIVE-5099: -- Sorry, please ignore my last comment. Some partition publish operation cause OOM in metastore backed by SQL Server Key: HIVE-5099 URL: https://issues.apache.org/jira/browse/HIVE-5099 Project: Hive Issue Type: Bug Components: Metastore, Windows Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-5099-1.patch, HIVE-5099-2.patch For certain metastore operation combination, metastore operation hangs and metastore server eventually fail due to OOM. This happens when metastore is backed by SQL Server. Here is a testcase to reproduce: {code} CREATE TABLE tbl_repro_oom1 (a STRING, b INT) PARTITIONED BY (c STRING, d STRING); CREATE TABLE tbl_repro_oom_2 (a STRING ) PARTITIONED BY (e STRING); ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='France', d=4); ALTER TABLE tbl_repro_oom1 ADD PARTITION (c='Russia', d=3); ALTER TABLE tbl_repro_oom_2 ADD PARTITION (e='Russia'); ALTER TABLE tbl_repro_oom1 DROP PARTITION (c = 'India'); --failure {code} The code cause the issue is in ExpressionTree.java: {code} valString = partitionName.substring(partitionName.indexOf(\ + keyEqual + \)+ + keyEqualLength + ).substring(0, partitionName.substring(partitionName.indexOf(\ + keyEqual + \)+ + keyEqualLength + ).indexOf(\/\)); {code} The snapshot of table partition before the drop partition statement is: {code} PART_ID CREATE_TIMELAST_ACCESS_TIME PART_NAMESD_ID TBL_ID 931376526718 0c=France/d=4 127 33 941376526718 0c=Russia/d=3 128 33 951376526718 0e=Russia 129 34 {code} Datanucleus query try to find the value of a particular key by locating $key= as the start, / as the end. For example, value of c in c=France/d=4 by locating c= as the start, / following as the end. However, this query fail if we try to find value e in e=Russia since there is no tailing /. 
Other databases work because their query plans first filter out the partitions that do not belong to tbl_repro_oom1; whether this error surfaces or not depends on the query optimizer. When this exception happens, the metastore keeps retrying and throwing exceptions, so the memory image of the metastore contains a large number of exception objects:
{code}
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter passed to the LEFT or SUBSTRING function.
	at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
	at com.microsoft.sqlserver.jdbc.SQLServerResultSet$FetchBuffer.nextRow(SQLServerResultSet.java:4762)
	at com.microsoft.sqlserver.jdbc.SQLServerResultSet.fetchBufferNext(SQLServerResultSet.java:1682)
	at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:955)
	at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
	at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
	at org.datanucleus.store.rdbms.query.ForwardQueryResult.init(ForwardQueryResult.java:90)
	at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:686)
	at org.datanucleus.store.query.Query.executeQuery(Query.java:1791)
	at org.datanucleus.store.query.Query.executeWithMap(Query.java:1694)
	at org.datanucleus.api.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
	at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1715)
	at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1590)
	at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
	at $Proxy4.getPartitionsByFilter(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2163)
	at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
	at $Proxy5.get_partitions_by_filter(Unknown Source
{code}
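The failing extraction can be reproduced in isolation. Below is a hypothetical standalone Java sketch of the same indexOf/substring pattern; the class and method names are illustrative, not part of Hive, and the real failure actually surfaces inside the SQL that DataNucleus generates, where SQL Server's SUBSTRING rejects the negative length:

```java
// Illustrative repro: mirror the generated logic "take everything after
// '<key>=', then cut at the next '/'".
public class PartitionNameRepro {

    static String extractValue(String partitionName, String key) {
        String keyEqual = key + "=";
        String rest = partitionName.substring(
                partitionName.indexOf(keyEqual) + keyEqual.length());
        // When the key is the last (or only) component there is no trailing
        // "/", so indexOf("/") returns -1 and substring(0, -1) throws.
        return rest.substring(0, rest.indexOf("/"));
    }

    public static void main(String[] args) {
        System.out.println(extractValue("c=France/d=4", "c")); // France
        extractValue("e=Russia", "e"); // throws StringIndexOutOfBoundsException
    }
}
```

This is why the snapshot above matters: partition 95 (e=Russia) has no trailing "/", so any scan that evaluates the expression against it fails.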
[jira] [Commented] (HIVE-7072) HCatLoader only loads first region of hbase table
[ https://issues.apache.org/jira/browse/HIVE-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090006#comment-14090006 ] Daniel Dai commented on HIVE-7072: -- +1 HCatLoader only loads first region of hbase table - Key: HIVE-7072 URL: https://issues.apache.org/jira/browse/HIVE-7072 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-7072.2.patch, HIVE-7072.3.patch, HIVE-7072.4.patch Pig needs the config parameter 'pig.noSplitCombination' set to 'true' to be able to read HBaseStorageHandler-based tables. HBaseLoader does this at getSplits time, but HCatLoader does not, which results in only a partial data load. Thus, we need one more special-case definition in HCat that sets this parameter in the job properties if we detect that we're loading an HBaseStorageHandler-based table. (Note, also, that we should not depend directly on the HBaseStorageHandler class, and instead depend on the name of the class, since we do not want a mvn dependency on hive-hbase-handler just to compile HCatalog core; it's conceivable that at some point there might be a reverse dependency.) The primary issue is where this code should go: it doesn't belong in Pig (Pig does not know what loader behaviour should be, and this parameter is its interface to a loader), and it doesn't belong in HBaseStorageHandler either, since that implements HiveStorageHandler and connects up the two. Thus, it should belong to HCatLoader. Setting this parameter across the board results in poor performance for HCatLoader, so it must only be set when used with HBase. It therefore belongs in the SpecialCases definition, as that was created specifically for these kinds of odd cases and can be called from within HCatLoader. -- This message was sent by Atlassian JIRA (v6.2#6252)
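The special-case check described above can be sketched as follows. This is a minimal illustration, not the actual SpecialCases API; the class and method names are hypothetical, but it shows the key design choice of matching the storage handler by class-name string so HCatalog core needs no compile-time dependency on hive-hbase-handler:

```java
import java.util.Properties;

public class HBaseSpecialCaseSketch {
    // Compared as a string to avoid a mvn dependency on hive-hbase-handler.
    private static final String HBASE_HANDLER =
            "org.apache.hadoop.hive.hbase.HBaseStorageHandler";

    // Hypothetical hook called while assembling job properties for a load.
    static void addSpecialCaseProperties(String storageHandlerClass,
                                         Properties jobProperties) {
        if (HBASE_HANDLER.equals(storageHandlerClass)) {
            // Pig reads this at getSplits time; combining splits would keep
            // only the first HBase region's data.
            jobProperties.setProperty("pig.noSplitCombination", "true");
        }
    }
}
```

Because the property is set only when the handler name matches, other HCatLoader reads keep split combination and avoid the performance regression mentioned above.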
[jira] [Commented] (HIVE-7771) ORC PPD fails for some decimal predicates
[ https://issues.apache.org/jira/browse/HIVE-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101829#comment-14101829 ] Daniel Dai commented on HIVE-7771: -- Are you changing the search argument to use BigDecimal? If so, shall we change SearchArgumentImpl as well?
{code}
- literal instanceof HiveDecimal) {
+ literal instanceof BigDecimal) {
...
- } else if (literal instanceof HiveDecimal) {
+ } else if (literal instanceof BigDecimal) {
{code}
ORC PPD fails for some decimal predicates - Key: HIVE-7771 URL: https://issues.apache.org/jira/browse/HIVE-7771 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7771.1.patch Queries like
{code}
select * from table where dcol=11.22BD;
{code}
fail when ORC predicate pushdown is enabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7771) ORC PPD fails for some decimal predicates
[ https://issues.apache.org/jira/browse/HIVE-7771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102564#comment-14102564 ] Daniel Dai commented on HIVE-7771: -- +1, works for me now. ORC PPD fails for some decimal predicates - Key: HIVE-7771 URL: https://issues.apache.org/jira/browse/HIVE-7771 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7771.1.patch, HIVE-7771.2.patch, HIVE-7771.3.patch Queries like
{code}
select * from table where dcol=11.22BD;
{code}
fail when ORC predicate pushdown is enabled. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7222) Support timestamp column statistics in ORC and extend PPD for timestamp
[ https://issues.apache.org/jira/browse/HIVE-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-7222: - Status: Patch Available (was: Open) Support timestamp column statistics in ORC and extend PPD for timestamp --- Key: HIVE-7222 URL: https://issues.apache.org/jira/browse/HIVE-7222 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Daniel Dai Labels: orcfile Attachments: HIVE-7222-1.patch Add column statistics for timestamp columns in ORC. Also extend predicate pushdown to support timestamp column evaluation. -- This message was sent by Atlassian JIRA (v6.2#6252)
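As a hedged illustration of how timestamp column statistics enable predicate pushdown, the sketch below assumes min/max are tracked as epoch milliseconds; the class and method names are illustrative, not ORC's actual statistics API:

```java
import java.sql.Timestamp;

// Illustrative only: a stripe whose recorded [min, max] timestamp range
// excludes the predicate's literal can be skipped without reading its rows.
public class TimestampStatsSketch {
    final long minMillis;
    final long maxMillis;

    TimestampStatsSketch(Timestamp min, Timestamp max) {
        this.minMillis = min.getTime();
        this.maxMillis = max.getTime();
    }

    // For an equality predicate "col = literal", reduced here to a boolean
    // "might contain" answer.
    boolean mightContain(Timestamp literal) {
        long v = literal.getTime();
        return v >= minMillis && v <= maxMillis;
    }
}
```

When `mightContain` returns false, the reader can safely skip the whole stripe, which is the same row-group elimination ORC already performs for numeric and string min/max statistics.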