[jira] [Updated] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml
[ https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-8307:
---
Status: Open (was: Patch Available)

null character in columns.comments schema property breaks jobconf.xml
-
Key: HIVE-8307
URL: https://issues.apache.org/jira/browse/HIVE-8307
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.13.1, 0.14.0, 0.13.0
Reporter: Carl Laird
Assignee: Ashutosh Chauhan
Attachments: HIVE-8307.1.patch, HIVE-8307.patch

It would appear that the fix for https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character to show up in job config xml files: I get the following when trying to insert into an elasticsearch backed table:

[Fatal Error] :336:51: Character reference "&#0"
14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:416)
    at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604)
    at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181)
    ... 11 more
Execution failed with exit status: 1

Line 336 of jobconf.xml:
<property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property>

See https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ for more discussion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
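The parse failure described above is inherent to XML 1.0: U+0000 is not a legal character even as a numeric character reference, so a jobconf.xml whose value contains &#0; can never be parsed back. A minimal sketch of the failure mode (the class name is illustrative, not part of Hive):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class NullCharRefDemo {
    // Returns true when the snippet parses as well-formed XML.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            // Xerces rejects &#0; with "Character reference ... is an invalid XML character"
            return false;
        }
    }

    public static void main(String[] args) {
        // A jobconf property whose value is a NUL character reference,
        // like the one HIVE-6681 introduced into columns.comments.
        String bad = "<property><name>columns.comments</name>"
                   + "<value>&#0;</value></property>";
        String ok  = "<property><name>columns.comments</name>"
                   + "<value>no comment</value></property>";
        System.out.println("bad parses: " + parses(bad));
        System.out.println("ok parses: " + parses(ok));
    }
}
```

Since XML offers no legal encoding of NUL at all, a fix has to keep the character out of the property value itself rather than try to escape it on write.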
[jira] [Updated] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml
[ https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-8307:
---
Attachment: HIVE-8307.2.patch

updated encrypted test.

null character in columns.comments schema property breaks jobconf.xml
-
Key: HIVE-8307
URL: https://issues.apache.org/jira/browse/HIVE-8307
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Carl Laird
Assignee: Ashutosh Chauhan
Attachments: HIVE-8307.1.patch, HIVE-8307.2.patch, HIVE-8307.patch

It would appear that the fix for https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character to show up in job config xml files: I get the following when trying to insert into an elasticsearch backed table:

[Fatal Error] :336:51: Character reference "&#0"
14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:416)
    at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604)
    at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181)
    ... 11 more
Execution failed with exit status: 1

Line 336 of jobconf.xml:
<property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property>

See https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ for more discussion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9511) Switch Tez to 0.6.0
[ https://issues.apache.org/jira/browse/HIVE-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300112#comment-14300112 ] Hive QA commented on HIVE-9511: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695354/HIVE-9511.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7411 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2604/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2604/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2604/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695354 - PreCommit-HIVE-TRUNK-Build Switch Tez to 0.6.0 --- Key: HIVE-9511 URL: https://issues.apache.org/jira/browse/HIVE-9511 Project: Hive Issue Type: Improvement Reporter: Damien Carol Assignee: Damien Carol Attachments: HIVE-9511.patch.txt Tez 0.6.0 has been released. Research to switch to version 0.6.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml
[ https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-8307:
---
Status: Patch Available (was: Open)

null character in columns.comments schema property breaks jobconf.xml
-
Key: HIVE-8307
URL: https://issues.apache.org/jira/browse/HIVE-8307
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.13.1, 0.14.0, 0.13.0
Reporter: Carl Laird
Assignee: Ashutosh Chauhan
Attachments: HIVE-8307.1.patch, HIVE-8307.2.patch, HIVE-8307.patch

It would appear that the fix for https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character to show up in job config xml files: I get the following when trying to insert into an elasticsearch backed table:

[Fatal Error] :336:51: Character reference "&#0"
14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:416)
    at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604)
    at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181)
    ... 11 more
Execution failed with exit status: 1

Line 336 of jobconf.xml:
<property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property>

See https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ for more discussion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9538) Exclude thirdparty directory from tarballs
[ https://issues.apache.org/jira/browse/HIVE-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300131#comment-14300131 ] Hive QA commented on HIVE-9538: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695792/HIVE-9538.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7411 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2605/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2605/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2605/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695792 - PreCommit-HIVE-TRUNK-Build Exclude thirdparty directory from tarballs -- Key: HIVE-9538 URL: https://issues.apache.org/jira/browse/HIVE-9538 Project: Hive Issue Type: Improvement Affects Versions: spark-branch, 1.1.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Attachments: HIVE-9538.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-9539:
---
Description: The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

Wrong check of version format in TestWebHCatE2e.getHiveVersion()

Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Reporter: Damien Carol
Priority: Minor

The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9538) Exclude thirdparty directory from tarballs
[ https://issues.apache.org/jira/browse/HIVE-9538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300215#comment-14300215 ] Damien Carol commented on HIVE-9538: Failed test org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion is related to HIVE-9539. It's not related to this patch. Exclude thirdparty directory from tarballs -- Key: HIVE-9538 URL: https://issues.apache.org/jira/browse/HIVE-9538 Project: Hive Issue Type: Improvement Affects Versions: spark-branch, 1.1.0 Reporter: Brock Noland Assignee: Brock Noland Priority: Minor Attachments: HIVE-9538.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-9539:
---
Description: Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

was: The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

Wrong check of version format in TestWebHCatE2e.getHiveVersion()

Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Damien Carol
Priority: Minor

Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9525) Enable constant propagation optimization in few existing tests where it was disabled.
[ https://issues.apache.org/jira/browse/HIVE-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300214#comment-14300214 ]

Damien Carol commented on HIVE-9525:
---
The last failed test relies on HIVE-9539. It's not related to this patch.

Enable constant propagation optimization in few existing tests where it was disabled.

Key: HIVE-9525
URL: https://issues.apache.org/jira/browse/HIVE-9525
Project: Hive
Issue Type: Test
Components: Logical Optimizer
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Attachments: HIVE-9525.1.patch, HIVE-9525.patch

We disabled it previously because of issues, but on testing again those issues appear to have gone away. We should re-enable the optimization for these tests.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
Damien Carol created HIVE-9539: -- Summary: Wrong check of version format in TestWebHCatE2e.getHiveVersion() Key: HIVE-9539 URL: https://issues.apache.org/jira/browse/HIVE-9539 Project: Hive Issue Type: Bug Reporter: Damien Carol Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-9539:
---
Affects Version/s: 1.2.0

Wrong check of version format in TestWebHCatE2e.getHiveVersion()

Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Damien Carol
Priority: Minor

The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-9539:
---
Component/s: HCatalog

Wrong check of version format in TestWebHCatE2e.getHiveVersion()

Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Damien Carol
Priority: Minor

The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9525) Enable constant propagation optimization in few existing tests where it was disabled.
[ https://issues.apache.org/jira/browse/HIVE-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300156#comment-14300156 ] Hive QA commented on HIVE-9525: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695796/HIVE-9525.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7411 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2606/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2606/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2606/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695796 - PreCommit-HIVE-TRUNK-Build Enable constant propagation optimization in few existing tests where it was disabled. - Key: HIVE-9525 URL: https://issues.apache.org/jira/browse/HIVE-9525 Project: Hive Issue Type: Test Components: Logical Optimizer Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-9525.1.patch, HIVE-9525.patch We have disabled it previously because of issues. But testing again those issues looks like have gone away. We should reenable optimization for these tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300206#comment-14300206 ]

Damien Carol commented on HIVE-9539:
---
It's a one-line patch in this method:
{code}
@Test
public void getHiveVersion() throws Exception {
  MethodCallRetVal p = doHttpCall(templetonBaseUrl + "/version/hive", HTTP_METHOD_TYPE.GET);
  Assert.assertEquals(HttpStatus.OK_200, p.httpStatusCode);
  Map<String, Object> props = JsonBuilder.jsonToMap(p.responseBody);
  Assert.assertEquals("hive", props.get("module"));
  Assert.assertTrue(p.getAssertMsg(), ((String) props.get("version")).matches("0.[0-9]+.[0-9]+.*"));
}
{code}
Line 244 should be:
{code}((String) props.get("version")).matches("[0-9]+.[0-9]+.[0-9]+.*"));{code}
instead of
{code}((String) props.get("version")).matches("0.[0-9]+.[0-9]+.*"));{code}

Wrong check of version format in TestWebHCatE2e.getHiveVersion()

Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Reporter: Damien Carol
Priority: Minor

The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
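The mismatch is easy to verify outside the test harness. This sketch (the class name is illustrative) contrasts the pattern currently in the test with the proposed one:

```java
public class VersionRegexDemo {
    // Pattern currently in TestWebHCatE2e.getHiveVersion(): requires a leading "0."
    static boolean oldCheck(String v) { return v.matches("0.[0-9]+.[0-9]+.*"); }

    // Proposed replacement: accepts any numeric major version
    static boolean newCheck(String v) { return v.matches("[0-9]+.[0-9]+.[0-9]+.*"); }

    public static void main(String[] args) {
        System.out.println(oldCheck("0.13.1"));         // pre-1.0 versions pass both checks
        System.out.println(oldCheck("1.2.0-SNAPSHOT")); // fails the current check
        System.out.println(newCheck("1.2.0-SNAPSHOT")); // passes the proposed check
    }
}
```

Note that the dots in both patterns are unescaped, so they match any character; that looseness is already present in the original test and the proposed fix keeps it.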
[jira] [Assigned] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol reassigned HIVE-9539:
---
Assignee: Damien Carol

Wrong check of version format in TestWebHCatE2e.getHiveVersion()

Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor

Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300212#comment-14300212 ]

Damien Carol commented on HIVE-9539:
---
For the record:
{noformat}
GET http://localhost:52505/templeton/v1/version/hive?user.name=johndoe
{noformat}
returns:
{noformat}
{"module":"hive","version":"1.2.0-SNAPSHOT"}
{noformat}

Wrong check of version format in TestWebHCatE2e.getHiveVersion()

Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Damien Carol
Priority: Minor

Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-8307) null character in columns.comments schema property breaks jobconf.xml
[ https://issues.apache.org/jira/browse/HIVE-8307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300213#comment-14300213 ] Hive QA commented on HIVE-8307: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695798/HIVE-8307.2.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7411 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2607/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2607/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2607/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12695798 - PreCommit-HIVE-TRUNK-Build

null character in columns.comments schema property breaks jobconf.xml
-
Key: HIVE-8307
URL: https://issues.apache.org/jira/browse/HIVE-8307
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Carl Laird
Assignee: Ashutosh Chauhan
Attachments: HIVE-8307.1.patch, HIVE-8307.2.patch, HIVE-8307.patch

It would appear that the fix for https://issues.apache.org/jira/browse/HIVE-6681 is causing the null character to show up in job config xml files: I get the following when trying to insert into an elasticsearch backed table:

[Fatal Error] :336:51: Character reference "&#0"
14/06/17 14:40:11 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1263)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1129)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1063)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:416)
    at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:604)
    at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:1273)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:667)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.xml.sax.SAXParseException; lineNumber: 336; columnNumber: 51; Character reference "&#0"
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:251)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:300)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1181)
    ... 11 more
Execution failed with exit status: 1

Line 336 of jobconf.xml:
<property><name>columns.comments</name><value>&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</value></property>

See https://groups.google.com/forum/#!msg/mongodb-user/lKbha0SzMP8/jvE8ZrJom4AJ for more discussion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-9539:
---
Status: Patch Available (was: Open)

Wrong check of version format in TestWebHCatE2e.getHiveVersion()

Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor
Attachments: HIVE-9539.patch

Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-9539:
---
Attachment: HIVE-9539.patch

Wrong check of version format in TestWebHCatE2e.getHiveVersion()

Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor
Attachments: HIVE-9539.patch

Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version matches the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300240#comment-14300240 ] Hive QA commented on HIVE-9539: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695813/HIVE-9539.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2608/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2608/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2608/ Messages: {noformat} This message was trimmed, see log for full details Reverted 'ql/src/test/results/clientpositive/bucketmapjoin5.q.out' [... long list of reverted clientpositive .q.out files trimmed, see console log ...] {noformat}
[jira] [Updated] (HIVE-9399) ppd_multi_insert.q generate same output in different order, when mapred.reduce.tasks is set to larger than 1
[ https://issues.apache.org/jira/browse/HIVE-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9399: -- Resolution: Fixed Fix Version/s: 1.2.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Chao. ppd_multi_insert.q generate same output in different order, when mapred.reduce.tasks is set to larger than 1 Key: HIVE-9399 URL: https://issues.apache.org/jira/browse/HIVE-9399 Project: Hive Issue Type: Test Reporter: Chao Assignee: Chao Fix For: 1.2.0 Attachments: HIVE-9399.1-spark.patch, HIVE-9399.1.patch, HIVE-9399.2-spark.patch If running ppd_multi_insert.q with {{set mapred.reduce.tasks=3}}, the output order is different, even with {{SORT_QUERY_RESULTS}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9143) select user(), current_user()
[ https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9143: -- Status: In Progress (was: Patch Available) select user(), current_user() - Key: HIVE-9143 URL: https://issues.apache.org/jira/browse/HIVE-9143 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Hari Sekhon Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9143.1.patch, HIVE-9143.2.patch Feature request to add support for determining in HQL session which user I am currently connected as - an old MySQL ability:
{code}
mysql> select user(), current_user();
+----------------+----------------+
| user()         | current_user() |
+----------------+----------------+
| root@localhost | root@localhost |
+----------------+----------------+
1 row in set (0.00 sec)
{code}
which doesn't seem to have a counterpart in Hive at this time:
{code}
0: jdbc:hive2://host:100> select user();
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid function 'user' (state=42000,code=4)
0: jdbc:hive2://host:100> select current_user();
Error: Error while compiling statement: FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'current_user' (state=42000,code=10011)
{code}
Regards, Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9143) select user(), current_user()
[ https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9143: -- Status: Patch Available (was: In Progress) select user(), current_user() - Key: HIVE-9143 URL: https://issues.apache.org/jira/browse/HIVE-9143 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Hari Sekhon Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9143.1.patch, HIVE-9143.2.patch Feature request to add support for determining in HQL session which user I am currently connected as - an old MySQL ability:
{code}
mysql> select user(), current_user();
+----------------+----------------+
| user()         | current_user() |
+----------------+----------------+
| root@localhost | root@localhost |
+----------------+----------------+
1 row in set (0.00 sec)
{code}
which doesn't seem to have a counterpart in Hive at this time:
{code}
0: jdbc:hive2://host:100> select user();
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid function 'user' (state=42000,code=4)
0: jdbc:hive2://host:100> select current_user();
Error: Error while compiling statement: FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'current_user' (state=42000,code=10011)
{code}
Regards, Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9143) select user(), current_user()
[ https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9143: -- Attachment: HIVE-9143.2.patch patch #2 select user(), current_user() - Key: HIVE-9143 URL: https://issues.apache.org/jira/browse/HIVE-9143 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Hari Sekhon Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9143.1.patch, HIVE-9143.2.patch Feature request to add support for determining in HQL session which user I am currently connected as - an old MySQL ability:
{code}
mysql> select user(), current_user();
+----------------+----------------+
| user()         | current_user() |
+----------------+----------------+
| root@localhost | root@localhost |
+----------------+----------------+
1 row in set (0.00 sec)
{code}
which doesn't seem to have a counterpart in Hive at this time:
{code}
0: jdbc:hive2://host:100> select user();
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid function 'user' (state=42000,code=4)
0: jdbc:hive2://host:100> select current_user();
Error: Error while compiling statement: FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'current_user' (state=42000,code=10011)
{code}
Regards, Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9535) add support for NULL ORDER specification in windowed aggregates
[ https://issues.apache.org/jira/browse/HIVE-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9535: -- Description: Cannot explicitly state how nulls order in windowed aggregates. {code} select rnum, c1, c2, c3, rank() over(partition by c1,c2 order by c3 desc nulls last), rank() over(partition by c1,c2 order by c3 desc nulls first) from tolap; {code} While faking it with an expression may simulate the intent, supporting the ISO standard would be preferred. {code} select rnum, c1, c2, c3, rank() over(partition by c1,c2 order by c3 desc nulls last), rank() over(partition by c1,c2 order by c3 desc nulls first) from tolap; {code} was: Cannot explicitly state how nulls order in windowed aggregates. select rnum, c1, c2, c3, rank() over(partition by c1,c2 order by c3 desc nulls last), rank() over(partition by c1,c2 order by c3 desc nulls first) from tolap While faking it with an expression may simulate the intent, supporting the ISO standard would be preferred. select rnum, c1, c2, c3, rank() over(partition by c1,c2 order by c3 desc nulls last), rank() over(partition by c1,c2 order by c3 desc nulls first) from tolap add support for NULL ORDER specification in windowed aggregates --- Key: HIVE-9535 URL: https://issues.apache.org/jira/browse/HIVE-9535 Project: Hive Issue Type: Improvement Components: SQL Reporter: N Campbell Cannot explicitly state how nulls order in windowed aggregates. {code} select rnum, c1, c2, c3, rank() over(partition by c1,c2 order by c3 desc nulls last), rank() over(partition by c1,c2 order by c3 desc nulls first) from tolap; {code} While faking it with an expression may simulate the intent, supporting the ISO standard would be preferred. {code} select rnum, c1, c2, c3, rank() over(partition by c1,c2 order by c3 desc nulls last), rank() over(partition by c1,c2 order by c3 desc nulls first) from tolap; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
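Until NULL ordering is supported natively, the expression-based workaround mentioned above boils down to sorting by a null-indicator flag ahead of the real key (in HiveQL that would be something like a leading {{CASE WHEN c3 IS NULL THEN 1 ELSE 0 END}} sort key; that expression is a hypothetical sketch, not part of the issue). The same idea, illustrated in standalone Python rather than HiveQL:

```python
rows = [3, None, 1, None, 2]

# "NULLS LAST": prefix each sort key with a flag that is True only for
# nulls, so null rows compare after all non-null rows.
nulls_last = sorted(rows, key=lambda v: (v is None, v))

# "NULLS FIRST": invert the flag so null rows compare before the rest.
nulls_first = sorted(rows, key=lambda v: (v is not None, v))

print(nulls_last)   # [1, 2, 3, None, None]
print(nulls_first)  # [None, None, 1, 2, 3]
```

Because the flag is compared before the value, the real key never gets compared against a null, which is exactly what the CASE-expression trick achieves in SQL.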
[jira] [Updated] (HIVE-9533) remove restriction that windowed aggregate ORDER BY can only have one key
[ https://issues.apache.org/jira/browse/HIVE-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9533: -- Description: current restriction makes the existing support very limited. {code} select rnum, c1, c2, c3, sum( c3 ) over(partition by c1 order by c2 , c3) from tolap Error: Error while compiling statement: FAILED: SemanticException Range based Window Frame can have only 1 Sort Key SQLState: 42000 ErrorCode: 4 {code} was: current restriction makes the existing support very limited. select rnum, c1, c2, c3, sum( c3 ) over(partition by c1 order by c2 , c3) from tolap Error: Error while compiling statement: FAILED: SemanticException Range based Window Frame can have only 1 Sort Key SQLState: 42000 ErrorCode: 4 remove restriction that windowed aggregate ORDER BY can only have one key - Key: HIVE-9533 URL: https://issues.apache.org/jira/browse/HIVE-9533 Project: Hive Issue Type: Improvement Components: SQL Reporter: N Campbell current restriction makes the existing support very limited. {code} select rnum, c1, c2, c3, sum( c3 ) over(partition by c1 order by c2 , c3) from tolap Error: Error while compiling statement: FAILED: SemanticException Range based Window Frame can have only 1 Sort Key SQLState: 42000 ErrorCode: 4 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference
[ https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300785#comment-14300785 ] Hive QA commented on HIVE-9468: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695829/HIVE-9468.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7412 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2610/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2610/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2610/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12695829 - PreCommit-HIVE-TRUNK-Build Test groupby3_map_skew.q fails due to decimal precision difference -- Key: HIVE-9468 URL: https://issues.apache.org/jira/browse/HIVE-9468 Project: Hive Issue Type: Bug Components: Tests Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-9468.patch From test run, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport:
{code}
Running: diff -a /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out
162c162
< 130091.0 260.182 256.10355987055016 98.0 0.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109
---
> 130091.0 260.182 256.10355987055016 98.0 0.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
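For context on why the number of reduce tasks can flip the low-order digits above: double-precision addition is not associative, so combining partial aggregates in a different order changes the final bits of the result. A standalone Python illustration (not Hive code):

```python
# Floating-point addition is not associative: the grouping of partial
# sums (which depends on how rows are split across reducers) can change
# the low-order digits of the final aggregate.
left_first = (0.1 + 0.2) + 0.3   # one combine order
right_first = 0.1 + (0.2 + 0.3)  # another combine order

print(left_first)                 # 0.6000000000000001
print(right_first)                # 0.6
print(left_first == right_first)  # False
```

This is why the expected .q.out and the parallel run disagree only in the last one or two digits of the double-valued columns.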
Review Request 30489: HIVE-9539 Wrong check of version format in TestWebHCatE2e.getHiveVersion()
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30489/ --- Review request for hive. Bugs: HIVE-9539 https://issues.apache.org/jira/browse/HIVE-9539 Repository: hive-git Description --- HIVE-9539 Wrong check of version format in TestWebHCatE2e.getHiveVersion() Diffs - hcatalog/webhcat/svr/src/test/java/org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java 3bd8705bcd2466214369febd6a3a4bcd9c15416b Diff: https://reviews.apache.org/r/30489/diff/ Testing --- Thanks, Alexander Pivovarov
[jira] [Updated] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9539: -- Status: Patch Available (was: In Progress) Wrong check of version format in TestWebHCatE2e.getHiveVersion() Key: HIVE-9539 URL: https://issues.apache.org/jira/browse/HIVE-9539 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.2.0 Reporter: Damien Carol Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9539.2.patch, HIVE-9539.patch Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version is in the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
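The mismatch is easy to reproduce outside the test harness. A quick sketch (Python is used here only to demonstrate the patterns; the actual test is Java, and the exact pattern in the committed patch may differ):

```python
import re

# Pattern the test originally checked, as quoted in the issue
# (the dots are unescaped there, but that doesn't matter here:
# the leading literal "0" is what rejects 1.x versions).
old_pattern = r"0.[0-9]+.[0-9]+.*"

# A relaxed sketch that accepts any leading major version number.
new_pattern = r"[0-9]+\.[0-9]+\.[0-9]+.*"

assert re.match(old_pattern, "0.13.1")              # old-style versions pass
assert not re.match(old_pattern, "1.2.0-SNAPSHOT")  # current versions fail
assert re.match(new_pattern, "1.2.0-SNAPSHOT")      # relaxed pattern passes
```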
[jira] [Assigned] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov reassigned HIVE-9539: - Assignee: Alexander Pivovarov (was: Damien Carol) Wrong check of version format in TestWebHCatE2e.getHiveVersion() Key: HIVE-9539 URL: https://issues.apache.org/jira/browse/HIVE-9539 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.2.0 Reporter: Damien Carol Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9539.2.patch, HIVE-9539.patch Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version is in the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference
[ https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300722#comment-14300722 ] Xuefu Zhang commented on HIVE-9468: --- This appears to be a different problem, and thus is not covered in this fix. Test groupby3_map_skew.q fails due to decimal precision difference -- Key: HIVE-9468 URL: https://issues.apache.org/jira/browse/HIVE-9468 Project: Hive Issue Type: Bug Components: Tests Reporter: Xuefu Zhang Assignee: Xuefu Zhang From test run, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport:
{code}
Running: diff -a /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out
162c162
< 130091.0 260.182 256.10355987055016 98.0 0.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109
---
> 130091.0 260.182 256.10355987055016 98.0 0.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference
[ https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9468: -- Status: Patch Available (was: Open) Test groupby3_map_skew.q fails due to decimal precision difference -- Key: HIVE-9468 URL: https://issues.apache.org/jira/browse/HIVE-9468 Project: Hive Issue Type: Bug Components: Tests Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-9468.patch From test run, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport:
{code}
Running: diff -a /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out
162c162
< 130091.0 260.182 256.10355987055016 98.0 0.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109
---
> 130091.0 260.182 256.10355987055016 98.0 0.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference
[ https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9468: -- Attachment: HIVE-9468.patch Test groupby3_map_skew.q fails due to decimal precision difference -- Key: HIVE-9468 URL: https://issues.apache.org/jira/browse/HIVE-9468 Project: Hive Issue Type: Bug Components: Tests Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-9468.patch From test run, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport:
{code}
Running: diff -a /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out
162c162
< 130091.0 260.182 256.10355987055016 98.0 0.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109
---
> 130091.0 260.182 256.10355987055016 98.0 0.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9143) select user(), current_user()
[ https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov reassigned HIVE-9143: - Assignee: Alexander Pivovarov select user(), current_user() - Key: HIVE-9143 URL: https://issues.apache.org/jira/browse/HIVE-9143 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Hari Sekhon Assignee: Alexander Pivovarov Priority: Minor Feature request to add support for determining in HQL session which user I am currently connected as - an old MySQL ability:
{code}
mysql> select user(), current_user();
+----------------+----------------+
| user()         | current_user() |
+----------------+----------------+
| root@localhost | root@localhost |
+----------------+----------------+
1 row in set (0.00 sec)
{code}
which doesn't seem to have a counterpart in Hive at this time:
{code}
0: jdbc:hive2://host:100> select user();
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid function 'user' (state=42000,code=4)
0: jdbc:hive2://host:100> select current_user();
Error: Error while compiling statement: FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'current_user' (state=42000,code=10011)
{code}
Regards, Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9143) select user(), current_user()
[ https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9143: -- Status: Patch Available (was: Open) select user(), current_user() - Key: HIVE-9143 URL: https://issues.apache.org/jira/browse/HIVE-9143 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Hari Sekhon Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9143.1.patch Feature request to add support for determining in HQL session which user I am currently connected as - an old MySQL ability:
{code}
mysql> select user(), current_user();
+----------------+----------------+
| user()         | current_user() |
+----------------+----------------+
| root@localhost | root@localhost |
+----------------+----------------+
1 row in set (0.00 sec)
{code}
which doesn't seem to have a counterpart in Hive at this time:
{code}
0: jdbc:hive2://host:100> select user();
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid function 'user' (state=42000,code=4)
0: jdbc:hive2://host:100> select current_user();
Error: Error while compiling statement: FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'current_user' (state=42000,code=10011)
{code}
Regards, Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9143) select user(), current_user()
[ https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9143: -- Attachment: HIVE-9143.1.patch patch #1 select user(), current_user() - Key: HIVE-9143 URL: https://issues.apache.org/jira/browse/HIVE-9143 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Hari Sekhon Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9143.1.patch Feature request to add support for determining in HQL session which user I am currently connected as - an old MySQL ability:
{code}
mysql> select user(), current_user();
+----------------+----------------+
| user()         | current_user() |
+----------------+----------------+
| root@localhost | root@localhost |
+----------------+----------------+
1 row in set (0.00 sec)
{code}
which doesn't seem to have a counterpart in Hive at this time:
{code}
0: jdbc:hive2://host:100> select user();
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid function 'user' (state=42000,code=4)
0: jdbc:hive2://host:100> select current_user();
Error: Error while compiling statement: FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'current_user' (state=42000,code=10011)
{code}
Regards, Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 30487: HIVE-9143 impl current_user() udf
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30487/ --- (Updated Feb. 1, 2015, 8:20 p.m.) Review request for hive. Changes --- fixed show_functions.q.out Bugs: HIVE-9143 https://issues.apache.org/jira/browse/HIVE-9143 Repository: hive-git Description --- HIVE-9143 impl current_user() udf Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca4cc2e2a44b62f62ddbd4826df092bcfe8 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentUser.java PRE-CREATION ql/src/test/queries/clientpositive/udf_current_user.q PRE-CREATION ql/src/test/results/clientpositive/show_functions.q.out 36c8743a61c55a714352d358a5d9cc0deb4cef2c ql/src/test/results/clientpositive/udf_current_user.q.out PRE-CREATION shims/common/src/main/java/org/apache/hadoop/hive/shims/Utils.java c851dc2cb28876aef77811ead397429a2338cde4 Diff: https://reviews.apache.org/r/30487/diff/ Testing --- Thanks, Alexander Pivovarov
Do we still need --no-prefix to generate patch using git?
I noticed that HIVE-9538.patch has a and b prefixes. https://issues.apache.org/jira/secure/attachment/12695792/HIVE-9538.patch It was built by Jenkins successfully... https://issues.apache.org/jira/browse/HIVE-9538 Also, Review Board now only allows uploading patches with a/b prefixes. If a/b prefixes are now required, the How to Contribute wiki should be updated: https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-CreatingaPatch
[jira] [Commented] (HIVE-9143) select user(), current_user()
[ https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300763#comment-14300763 ] Hive QA commented on HIVE-9143: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695828/HIVE-9143.2.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7413 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2609/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2609/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2609/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12695828 - PreCommit-HIVE-TRUNK-Build select user(), current_user() - Key: HIVE-9143 URL: https://issues.apache.org/jira/browse/HIVE-9143 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Hari Sekhon Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9143.1.patch, HIVE-9143.2.patch Feature request to add support for determining in HQL session which user I am currently connected as - an old MySQL ability:
{code}
mysql> select user(), current_user();
+----------------+----------------+
| user()         | current_user() |
+----------------+----------------+
| root@localhost | root@localhost |
+----------------+----------------+
1 row in set (0.00 sec)
{code}
which doesn't seem to have a counterpart in Hive at this time:
{code}
0: jdbc:hive2://host:100> select user();
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid function 'user' (state=42000,code=4)
0: jdbc:hive2://host:100> select current_user();
Error: Error while compiling statement: FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'current_user' (state=42000,code=10011)
{code}
Regards, Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 30487: HIVE-9143 impl current_user() udf
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30487/ --- Review request for hive. Bugs: HIVE-9143 https://issues.apache.org/jira/browse/HIVE-9143 Repository: hive-git Description --- HIVE-9143 impl current_user() udf Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 23d77ca4cc2e2a44b62f62ddbd4826df092bcfe8 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCurrentUser.java PRE-CREATION ql/src/test/queries/clientpositive/udf_current_user.q PRE-CREATION ql/src/test/results/clientpositive/show_functions.q.out 36c8743a61c55a714352d358a5d9cc0deb4cef2c ql/src/test/results/clientpositive/udf_current_user.q.out PRE-CREATION shims/common/src/main/java/org/apache/hadoop/hive/shims/Utils.java c851dc2cb28876aef77811ead397429a2338cde4 Diff: https://reviews.apache.org/r/30487/diff/ Testing --- Thanks, Alexander Pivovarov
Re: Do we still need --no-prefix to generate patch using git?
The pre-commit tests work both with and without prefixes. On Feb 1, 2015 12:30 PM, Alexander Pivovarov apivova...@gmail.com wrote: I noticed that HIVE-9538.patch has a and b prefixes. https://issues.apache.org/jira/secure/attachment/12695792/HIVE-9538.patch It was built by Jenkins successfully... https://issues.apache.org/jira/browse/HIVE-9538 Also, Review Board allows to upload patches only with a,b prefixes now. If a,b prefixes are required now How to Contribute wiki should be updated https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-CreatingaPatch
[jira] [Updated] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9539: -- Status: In Progress (was: Patch Available) Wrong check of version format in TestWebHCatE2e.getHiveVersion() Key: HIVE-9539 URL: https://issues.apache.org/jira/browse/HIVE-9539 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.2.0 Reporter: Damien Carol Assignee: Damien Carol Priority: Minor Attachments: HIVE-9539.2.patch, HIVE-9539.patch Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version is in the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9539: -- Attachment: HIVE-9539.2.patch patch #2 Wrong check of version format in TestWebHCatE2e.getHiveVersion() Key: HIVE-9539 URL: https://issues.apache.org/jira/browse/HIVE-9539 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.2.0 Reporter: Damien Carol Assignee: Damien Carol Priority: Minor Attachments: HIVE-9539.2.patch, HIVE-9539.patch Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version is in the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is now like {{1.2.0-SNAPSHOT}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9143) select user(), current_user()
[ https://issues.apache.org/jira/browse/HIVE-9143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300794#comment-14300794 ] Alexander Pivovarov commented on HIVE-9143: --- One test failed (TestWebHCatE2e.getHiveVersion). The failed test is unrelated to the patch; the test was fixed in HIVE-9539. select user(), current_user() - Key: HIVE-9143 URL: https://issues.apache.org/jira/browse/HIVE-9143 Project: Hive Issue Type: Improvement Affects Versions: 0.13.0 Reporter: Hari Sekhon Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9143.1.patch, HIVE-9143.2.patch Feature request to add support for determining in HQL session which user I am currently connected as - an old MySQL ability:
{code}
mysql> select user(), current_user();
+----------------+----------------+
| user()         | current_user() |
+----------------+----------------+
| root@localhost | root@localhost |
+----------------+----------------+
1 row in set (0.00 sec)
{code}
which doesn't seem to have a counterpart in Hive at this time:
{code}
0: jdbc:hive2://host:100> select user();
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid function 'user' (state=42000,code=4)
0: jdbc:hive2://host:100> select current_user();
Error: Error while compiling statement: FAILED: SemanticException [Error 10011]: Line 1:7 Invalid function 'current_user' (state=42000,code=10011)
{code}
Regards, Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9468) Test groupby3_map_skew.q fails due to decimal precision difference
[ https://issues.apache.org/jira/browse/HIVE-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-9468: - Assignee: Xuefu Zhang Test groupby3_map_skew.q fails due to decimal precision difference -- Key: HIVE-9468 URL: https://issues.apache.org/jira/browse/HIVE-9468 Project: Hive Issue Type: Bug Components: Tests Reporter: Xuefu Zhang Assignee: Xuefu Zhang From test run, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/682/testReport:
{code}
Running: diff -a /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/groupby3_map_skew.q.out /home/hiveptest/54.177.132.58-hiveptest-1/apache-svn-spark-source/itests/qtest/../../ql/src/test/results/clientpositive/groupby3_map_skew.q.out
162c162
< 130091.0 260.182 256.10355987055016 98.0 0.0 142.92680950752379 143.06995106518903 20428.07288 20469.0109
---
> 130091.0 260.182 256.10355987055016 98.0 0.0 142.9268095075238 143.06995106518906 20428.07288 20469.0109
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9492) Enable caching in MapInput for Spark
[ https://issues.apache.org/jira/browse/HIVE-9492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-9492: -- Attachment: HIVE-9492.2-spark.patch Attached patch v2 that adds some unit tests. Enable caching in MapInput for Spark Key: HIVE-9492 URL: https://issues.apache.org/jira/browse/HIVE-9492 Project: Hive Issue Type: Bug Components: Spark Reporter: Xuefu Zhang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-9492.1-spark.patch, HIVE-9492.2-spark.patch, prototype.patch Because of the IOContext problem (HIVE-8920, HIVE-9084), RDD caching is currently disabled in MapInput. Prototyping shows that the problem can be solved. Thus, we should formalize the prototype and enable the caching. A good query to test this is: {code} from (select * from dec union all select * from dec2) s insert overwrite table dec3 select s.name, sum(s.value) group by s.name insert overwrite table dec4 select s.name, s.value order by s.value; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9492) Enable caching in MapInput for Spark
[ https://issues.apache.org/jira/browse/HIVE-9492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300798#comment-14300798 ] Jimmy Xiang commented on HIVE-9492: --- Having some problems putting it on RB now; will try again later. As for the configuration parameter, its purpose is to allow disabling the caching to avoid the overhead if it doesn't help. Enable caching in MapInput for Spark Key: HIVE-9492 URL: https://issues.apache.org/jira/browse/HIVE-9492 Project: Hive Issue Type: Bug Components: Spark Reporter: Xuefu Zhang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-9492.1-spark.patch, HIVE-9492.2-spark.patch, prototype.patch Because of the IOContext problem (HIVE-8920, HIVE-9084), RDD caching is currently disabled in MapInput. Prototyping shows that the problem can be solved. Thus, we should formalize the prototype and enable the caching. A good query to test this is: {code} from (select * from dec union all select * from dec2) s insert overwrite table dec3 select s.name, sum(s.value) group by s.name insert overwrite table dec4 select s.name, s.value order by s.value; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9528) SemanticException: Ambiguous column reference
[ https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300800#comment-14300800 ] Navis commented on HIVE-9528: - [~ychena], before HIVE-7733, column information was overwritten by the last column with the same name, which could produce invalid results. Anyway, the query you've mentioned does not work in MySQL either (it works in psql, though). Can we resolve this as Not A Problem? SemanticException: Ambiguous column reference - Key: HIVE-9528 URL: https://issues.apache.org/jira/browse/HIVE-9528 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Yongzhi Chen When running the following query: {code} SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select * from sim a join sim2 b on a.simstr=b.simstr) app Error: Error while compiling statement: FAILED: SemanticException [Error 10007]: Ambiguous column reference simstr in app (state=42000,code=10007) {code} This query works fine in Hive 0.10. In Apache trunk, the following workaround works: {code} SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim a join sim2 b on a.simstr=b.simstr) app; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9528) SemanticException: Ambiguous column reference
[ https://issues.apache.org/jira/browse/HIVE-9528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis reassigned HIVE-9528: --- Assignee: Navis SemanticException: Ambiguous column reference - Key: HIVE-9528 URL: https://issues.apache.org/jira/browse/HIVE-9528 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Yongzhi Chen Assignee: Navis When running the following query: {code} SELECT if( COUNT(*) = 0, 'true', 'false' ) as RESULT FROM ( select * from sim a join sim2 b on a.simstr=b.simstr) app Error: Error while compiling statement: FAILED: SemanticException [Error 10007]: Ambiguous column reference simstr in app (state=42000,code=10007) {code} This query works fine in hive 0.10 In the apache trunk, following workaround will work: {code} SELECT if(COUNT(*) = 0, 'true', 'false') as RESULT FROM (select a.* from sim a join sim2 b on a.simstr=b.simstr) app; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 30151: Remove Extract Operator and its friends from codebase.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30151/#review70534 --- Ship it! Ship It! - Navis Ryu On Jan. 30, 2015, 7:46 p.m., Ashutosh Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30151/ --- (Updated Jan. 30, 2015, 7:46 p.m.) Review request for hive and Navis Ryu. Bugs: HIVE-9416 https://issues.apache.org/jira/browse/HIVE-9416 Repository: hive-git Description --- Remove Extract Operator its friends from codebase. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/ExtractOperator.java c299d3a ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java f3c382a ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java 2e6a880 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExtractOperator.java 7f4bb64 ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketingSortingReduceSinkOptimizer.java 24ca89f ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java f36f843 ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOptimizer.java 137956c ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/CorrelationUtilities.java 630a9eb ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplication.java 3fead79 ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/OpProcFactory.java adca50d ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingInferenceOptimizer.java 7954767 ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingOpProcFactory.java cf02bec ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 94b4621 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java c9a5ce5 ql/src/java/org/apache/hadoop/hive/ql/plan/ExtractDesc.java 6762155 ql/src/java/org/apache/hadoop/hive/ql/plan/SelectDesc.java fa6b548 ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 41862e6 ql/src/test/results/clientpositive/bucket1.q.out 
13ec735 ql/src/test/results/clientpositive/bucket2.q.out 32a77c3 ql/src/test/results/clientpositive/bucket3.q.out ff7173e ql/src/test/results/clientpositive/bucket4.q.out b99d12f ql/src/test/results/clientpositive/bucket5.q.out 5992d6d ql/src/test/results/clientpositive/bucket6.q.out 5b23d7d ql/src/test/results/clientpositive/bucketsortoptimize_insert_1.q.out 75de953 ql/src/test/results/clientpositive/bucketsortoptimize_insert_2.q.out 599b8b9 ql/src/test/results/clientpositive/bucketsortoptimize_insert_3.q.out 7456ab0 ql/src/test/results/clientpositive/bucketsortoptimize_insert_4.q.out fd99597 ql/src/test/results/clientpositive/bucketsortoptimize_insert_5.q.out 8130ab9 ql/src/test/results/clientpositive/bucketsortoptimize_insert_6.q.out 627aba0 ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 9b058c8 ql/src/test/results/clientpositive/dynpart_sort_opt_vectorization.q.out 0baa446 ql/src/test/results/clientpositive/dynpart_sort_optimization.q.out 494bfa3 ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out b6e7b88 ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_static.q.out fc6d2ae ql/src/test/results/clientpositive/load_dyn_part2.q.out 26f318a ql/src/test/results/clientpositive/ptf.q.out 2317347 ql/src/test/results/clientpositive/ptf_streaming.q.out 427e635 ql/src/test/results/clientpositive/smb_mapjoin_20.q.out 999dabd ql/src/test/results/clientpositive/smb_mapjoin_21.q.out 539b70e ql/src/test/results/clientpositive/spark/bucket2.q.out 5eb28fa ql/src/test/results/clientpositive/spark/bucket3.q.out 1b1010a ql/src/test/results/clientpositive/spark/bucket4.q.out 7dd49ac ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_2.q.out 365306e ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_4.q.out 3846de7 ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_6.q.out 5b559c4 ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_7.q.out 
cefc6aa ql/src/test/results/clientpositive/spark/bucketsortoptimize_insert_8.q.out ca44d7c ql/src/test/results/clientpositive/spark/disable_merge_for_bucketing.q.out 3864c44 ql/src/test/results/clientpositive/spark/load_dyn_part2.q.out a8cef34 ql/src/test/results/clientpositive/spark/ptf.q.out deebf3a ql/src/test/results/clientpositive/spark/ptf_streaming.q.out cd77c5f ql/src/test/results/clientpositive/spark/smb_mapjoin_20.q.out
[jira] [Commented] (HIVE-9416) Get rid of Extract Operator
[ https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300801#comment-14300801 ] Navis commented on HIVE-9416: - +1 Get rid of Extract Operator --- Key: HIVE-9416 URL: https://issues.apache.org/jira/browse/HIVE-9416 Project: Hive Issue Type: Task Components: Query Processor Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-9416.1.patch, HIVE-9416.2.patch, HIVE-9416.3.patch, HIVE-9416.4.patch, HIVE-9416.5.patch, HIVE-9416.6.patch, HIVE-9416.7.patch, HIVE-9416.patch {{Extract Operator}} has been there for legacy reasons. But there is no functionality it provides which can't be provided by {{Select Operator}}. Instead of having two operators, one being a subset of the other, we should just get rid of {{Extract}} and simplify our codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300802#comment-14300802 ] Hive QA commented on HIVE-9539: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695846/HIVE-9539.2.patch {color:green}SUCCESS:{color} +1 7412 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2611/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2611/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2611/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12695846 - PreCommit-HIVE-TRUNK-Build Wrong check of version format in TestWebHCatE2e.getHiveVersion() Key: HIVE-9539 URL: https://issues.apache.org/jira/browse/HIVE-9539 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.2.0 Reporter: Damien Carol Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9539.2.patch, HIVE-9539.patch Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version is in the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is like {{1.2.0-SNAPSHOT}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
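The version-check mismatch is easy to reproduce with plain regexes. The first pattern below is the one quoted in the report (taking its dots as literal); the relaxed pattern is a hypothetical fix for illustration, not necessarily the exact change in HIVE-9539's patch:

```python
import re

# Pattern quoted in the report: only accepts versions starting with "0."
old_pattern = re.compile(r"0\.[0-9]+\.[0-9]+.*")

# Hypothetical relaxed pattern: accepts any major version plus suffixes
# such as -SNAPSHOT (assumption: this mirrors the intent of the fix).
new_pattern = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+.*")

print(bool(old_pattern.match("1.2.0-SNAPSHOT")))  # False
print(bool(new_pattern.match("1.2.0-SNAPSHOT")))  # True
print(bool(new_pattern.match("0.14.0")))          # True
```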
[jira] [Commented] (HIVE-9496) Sl4j warning in hive command
[ https://issues.apache.org/jira/browse/HIVE-9496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300807#comment-14300807 ] Alexander Pivovarov commented on HIVE-9496: --- slf4j-log4j12-1.7.5.jar is required for jdbc clients. This is why it's included in hive-jdbc-standalone.jar. Currently hive/lib contains both hive-jdbc.jar and hive-jdbc-standalone.jar {code} hive]$ ls packaging/target/apache-hive-1.2.0-SNAPSHOT-bin/apache-hive-1.2.0-SNAPSHOT-bin/lib/ | grep jdbc hive-jdbc-1.2.0-SNAPSHOT.jar hive-jdbc-1.2.0-SNAPSHOT-standalone.jar {code} Probably hive-jdbc-standalone.jar should not be placed in hive/lib. The question is, where to put hive-jdbc-1.2.0-SNAPSHOT-standalone.jar? Maybe just in the root of hive? {code} $ ls apache-hive-1.2.0-SNAPSHOT-bin/ bin conf derby.log examples hcatalog lib LICENSE NOTICE README.txt RELEASE_NOTES.txt scripts {code} Sl4j warning in hive command Key: HIVE-9496 URL: https://issues.apache.org/jira/browse/HIVE-9496 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.14.0 Environment: HDP 2.2.0 on CentOS. With Horton Sand Box and my own cluster. Reporter: Philippe Kernevez Priority: Minor Each time the 'hive' command is run, we get an SLF4J warning about multiple jars containing SLF4J classes. This bug is similar to HIVE-6162, but doesn't seem to be solved. Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-1084/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-1084/hive/lib/hive-jdbc-0.14.0.2.2.0.0-1084-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300861#comment-14300861 ] Xuefu Zhang commented on HIVE-9211: --- [~chengxiang li], are the remaining test failures related to the patch somehow? Research on build mini HoS cluster on YARN for unit test[Spark Branch] -- Key: HIVE-9211 URL: https://issues.apache.org/jira/browse/HIVE-9211 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M5 Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch, HIVE-9211.5-spark.patch, HIVE-9211.6-spark.patch HoS on YARN is a common use case in production environments; we'd better enable unit tests for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 30281: Move parquet serialize implementation to DataWritableWriter to improve write speeds
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30281/#review70538 --- Sorry for late review... The patch looks good, and I see there already were a lot of great discussions! Thanks. I left just one minor comments below. ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java https://reviews.apache.org/r/30281/#comment115753 This seems duplicate, since it has been checked before invoking writeMap(...) - Dong Chen On Jan. 29, 2015, 5:12 p.m., Sergio Pena wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30281/ --- (Updated Jan. 29, 2015, 5:12 p.m.) Review request for hive, Ryan Blue, cheng xu, and Dong Chen. Bugs: HIVE-9333 https://issues.apache.org/jira/browse/HIVE-9333 Repository: hive-git Description --- This patch moves the ParquetHiveSerDe.serialize() implementation to DataWritableWriter class in order to save time in materializing data on serialize(). Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java ea4109d358f7c48d1e2042e5da299475de4a0a29 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9caa4ed169ba92dbd863e4a2dc6d06ab226a4465 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java 060b1b722d32f3b2f88304a1a73eb249e150294b ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 41b5f1c3b0ab43f734f8a211e3e03d5060c75434 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java a693aff18516d133abf0aae4847d3fe00b9f1c96 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java 667d3671547190d363107019cd9a2d105d26d336 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 007a665529857bcec612f638a157aa5043562a15 
serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetWritable.java PRE-CREATION Diff: https://reviews.apache.org/r/30281/diff/ Testing --- The tests run were the following: 1. JMH (Java microbenchmark) This benchmark called parquet serialize/write methods using text writable objects. Class.method Before Change (ops/s) After Change (ops/s) --- ParquetHiveSerDe.serialize: 19,113 249,528 - 19x speed increase DataWritableWriter.write: 5,033 5,201 - 3.34% speed increase 2. Write 20 million rows (~1GB file) from Text to Parquet I wrote a ~1Gb file in Textfile format, then converted it to Parquet format using the following statement: CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text; Time (s) it took to write the whole file BEFORE changes: 93.758 s Time (s) it took to write the whole file AFTER changes: 83.903 s It got a 10% speed increase. Thanks, Sergio Pena
[jira] [Created] (HIVE-9540) Enable infer_bucket_sort_dyn_part.q for TestMiniSparkOnYarnCliDriver test. [Spark Branch]
Chengxiang Li created HIVE-9540: --- Summary: Enable infer_bucket_sort_dyn_part.q for TestMiniSparkOnYarnCliDriver test. [Spark Branch] Key: HIVE-9540 URL: https://issues.apache.org/jira/browse/HIVE-9540 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li The infer_bucket_sort_dyn_part.q output changes on the TestMiniSparkOnYarnCliDriver test; we should figure out why and try to enable it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300858#comment-14300858 ] Hive QA commented on HIVE-9211: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695857/HIVE-9211.6-spark.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 7407 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/703/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/703/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-703/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12695857 - PreCommit-HIVE-SPARK-Build Research on build mini HoS cluster on YARN for unit test[Spark Branch] -- Key: HIVE-9211 URL: https://issues.apache.org/jira/browse/HIVE-9211 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M5 Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch, HIVE-9211.5-spark.patch, HIVE-9211.6-spark.patch HoS on YARN is a common use case in product environment, we'd better enable unit test for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9495) Map Side aggregation affecting map performance
[ https://issues.apache.org/jira/browse/HIVE-9495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9495: -- Description: When trying to run a simple aggregation query with hive.map.aggr=true, map tasks take a lot of time in Hive 0.14 as against with hive.map.aggr=false. e.g. Consider the query: {code} INSERT OVERWRITE TABLE lineitem_tgt_agg select alias.a0 as a0, alias.a2 as a1, alias.a1 as a2, alias.a3 as a3, alias.a4 as a4 from ( select alias.a0 as a0, SUM(alias.a1) as a1, SUM(alias.a2) as a2, SUM(alias.a3) as a3, SUM(alias.a4) as a4 from ( select lineitem_sf500.l_orderkey as a0, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1, lineitem_sf500.l_quantity as a2, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount as double) as a3, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax as double) as a4 from lineitem_sf500 ) alias group by alias.a0 ) alias; {code} The above query was run with ~376GB of data / ~3billion records in the source. It takes ~10 minutes with hive.map.aggr=false. With map side aggregation set to true, the map tasks don't complete even after an hour. was: When trying to run a simple aggregation query with hive.map.aggr=true, map tasks take a lot of time in Hive 0.14 as against with hive.map.aggr=false. e.g. 
Consider the query: INSERT OVERWRITE TABLE lineitem_tgt_agg SELECT alias.a0 as a0, alias.a2 as a1, alias.a1 as a2, alias.a3 as a3, alias.a4 as a4 FROM (SELECT alias.a0 as a0, SUM(alias.a1) as a1, SUM(alias.a2) as a2, SUM(alias.a3) as a3, SUM(alias.a4) as a4 FROM (SELECT lineitem_sf500.l_orderkey as a0, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) AS DOUBLE) as a1, lineitem_sf500.l_quantity as a2, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount AS DOUBLE) as a3, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax AS DOUBLE) as a4 FROM lineitem_sf500) alias GROUP BY alias.a0) alias; The above query was run with ~376GB of data / ~3billion records in the source. It takes ~10 minutes with hive.map.aggr=false. With map side aggregation set to true, the map tasks don't complete even after an hour. Map Side aggregation affecting map performance -- Key: HIVE-9495 URL: https://issues.apache.org/jira/browse/HIVE-9495 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Environment: RHEL 6.4 Hortonworks Hadoop 2.2 Reporter: Anand Sridharan Attachments: profiler_screenshot.PNG When trying to run a simple aggregation query with hive.map.aggr=true, map tasks take a lot of time in Hive 0.14 as against with hive.map.aggr=false. e.g. 
Consider the query: {code} INSERT OVERWRITE TABLE lineitem_tgt_agg select alias.a0 as a0, alias.a2 as a1, alias.a1 as a2, alias.a3 as a3, alias.a4 as a4 from ( select alias.a0 as a0, SUM(alias.a1) as a1, SUM(alias.a2) as a2, SUM(alias.a3) as a3, SUM(alias.a4) as a4 from ( select lineitem_sf500.l_orderkey as a0, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * (1 - lineitem_sf500.l_discount) * (1 + lineitem_sf500.l_tax) as double) as a1, lineitem_sf500.l_quantity as a2, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_discount as double) as a3, CAST(lineitem_sf500.l_quantity * lineitem_sf500.l_extendedprice * lineitem_sf500.l_tax as double) as a4 from lineitem_sf500 ) alias group by alias.a0 ) alias; {code} The above query was run with ~376GB of data / ~3billion records in the source. It takes ~10 minutes with hive.map.aggr=false. With map side aggregation set to true, the map tasks don't complete even after an hour. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
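Conceptually, hive.map.aggr=true makes each map task maintain an in-memory hash table of partial aggregates keyed by the group-by key, so reducers receive pre-combined partials instead of raw rows; with very many distinct keys (such as l_orderkey above) the hash table grows large and the mappers slow down. A rough Python sketch of the map-side partial-SUM idea, illustrative only and not Hive's actual implementation:

```python
from collections import defaultdict

def map_side_partial_sums(rows):
    """Accumulate partial SUMs per group-by key inside a single 'mapper'."""
    partial = defaultdict(float)
    for key, value in rows:
        partial[key] += value
    # Each (key, partial_sum) pair would be shuffled to a reducer, which
    # merges the partials from all mappers into the final aggregate.
    return dict(partial)

rows = [("a", 1.0), ("b", 2.0), ("a", 3.0)]
print(map_side_partial_sums(rows))  # {'a': 4.0, 'b': 2.0}
```

When nearly every row has a distinct key, this table collapses little data yet still pays hashing and memory costs, which is consistent with map-side aggregation being slower than hive.map.aggr=false for this query.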
[jira] [Updated] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-9211: Attachment: HIVE-9211.6-spark.patch [~xuefuz], the output of infer_bucket_sort_dyn_part.q changes during the test, so I removed it from miniSparkOnYarn.query.files and created HIVE-9540 to track it. Research on build mini HoS cluster on YARN for unit test[Spark Branch] -- Key: HIVE-9211 URL: https://issues.apache.org/jira/browse/HIVE-9211 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M5 Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch, HIVE-9211.5-spark.patch, HIVE-9211.6-spark.patch HoS on YARN is a common use case in production environments; we'd better enable unit tests for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9492) Enable caching in MapInput for Spark
[ https://issues.apache.org/jira/browse/HIVE-9492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300820#comment-14300820 ] Hive QA commented on HIVE-9492: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695848/HIVE-9492.2-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7363 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/702/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/702/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-702/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695848 - PreCommit-HIVE-SPARK-Build Enable caching in MapInput for Spark Key: HIVE-9492 URL: https://issues.apache.org/jira/browse/HIVE-9492 Project: Hive Issue Type: Bug Components: Spark Reporter: Xuefu Zhang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-9492.1-spark.patch, HIVE-9492.2-spark.patch, prototype.patch Because of the IOContext problem (HIVE-8920, HIVE-9084), RDD caching is currently disabled in MapInput. Prototyping shows that the problem can solved. Thus, we should formalize the prototype and enable the caching. 
A good query to test this is: {code} from (select * from dec union all select * from dec2) s insert overwrite table dec3 select s.name, sum(s.value) group by s.name insert overwrite table dec4 select s.name, s.value order by s.value; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6131) New columns after table alter result in null values despite data
[ https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300839#comment-14300839 ] Liao, Xiaoge commented on HIVE-6131: how did this bug fix? New columns after table alter result in null values despite data Key: HIVE-6131 URL: https://issues.apache.org/jira/browse/HIVE-6131 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.12.0, 0.13.0 Reporter: James Vaughan Priority: Minor Attachments: HIVE-6131.1.patch Hi folks, I found and verified a bug on our CDH 4.0.3 install of Hive when adding columns to tables with Partitions using 'REPLACE COLUMNS'. I dug through the Jira a little bit and didn't see anything for it so hopefully this isn't just noise on the radar. Basically, when you alter a table with partitions and then reupload data to that partition, it doesn't seem to recognize the extra data that actually exists in HDFS- as in, returns NULL values on the new column despite having the data and recognizing the new column in the metadata. Here's some steps to reproduce using a basic table: 1. Run this hive command: CREATE TABLE jvaughan_test (col1 string) partitioned by (day string); 2. Create a simple file on the system with a couple of entries, something like hi and hi2 separated by newlines. 3. Run this hive command, pointing it at the file: LOAD DATA LOCAL INPATH 'FILEDIR' OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02'); 4. Confirm the data with: SELECT * FROM jvaughan_test WHERE day = '2014-01-02'; 5. Alter the column definitions: ALTER TABLE jvaughan_test REPLACE COLUMNS (col1 string, col2 string); 6. Edit your file and add a second column using the default separator (ctrl+v, then ctrl+a in Vim) and add two more entries, such as hi3 on the first row and hi4 on the second 7. Run step 3 again 8. 
Check the data again as in step 4. For me, these are the results that get returned: hive> select * from jvaughan_test where day = '2014-01-01'; OK hi NULL 2014-01-02 hi2 NULL 2014-01-02 This is despite the fact that there is data in the file stored by the partition in HDFS. Let me know if you need any other information. The only workaround for me currently is to drop partitions for any I'm replacing data in and THEN reupload the new data file. Thanks, -James -- This message was sent by Atlassian JIRA (v6.3.4#6332)
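Step 6 of the repro above relies on Hive's default field delimiter, Ctrl-A (byte 0x01), which is what ctrl+v then ctrl+a inserts in Vim. A small Python helper can generate the same two-column test file; the file name is my own choice for illustration, and the values are the ones from the repro steps:

```python
# Write the repro's two-column data file using Hive's default
# field separator, \x01 (Ctrl-A), one record per line.
rows = [("hi", "hi3"), ("hi2", "hi4")]

with open("testdata.txt", "w") as f:
    for col1, col2 in rows:
        f.write(col1 + "\x01" + col2 + "\n")
```

The resulting file can be used as 'FILEDIR' in step 3 of the repro.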
[jira] [Commented] (HIVE-9492) Enable caching in MapInput for Spark
[ https://issues.apache.org/jira/browse/HIVE-9492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14300846#comment-14300846 ] Lefty Leverenz commented on HIVE-9492: -- Doc review: Please capitalize Hive and Spark in the parameter description, and change initial If to Whether. And shouldn't mapinput be MapInput? {code} +HIVE_CACHE_MAPINPUT(hive.exec.cache.mapinput, false, +Whether Hive (in Spark mode only) should cache MapInput if it applies.), {code} Enable caching in MapInput for Spark Key: HIVE-9492 URL: https://issues.apache.org/jira/browse/HIVE-9492 Project: Hive Issue Type: Bug Components: Spark Reporter: Xuefu Zhang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-9492.1-spark.patch, HIVE-9492.2-spark.patch, prototype.patch Because of the IOContext problem (HIVE-8920, HIVE-9084), RDD caching is currently disabled in MapInput. Prototyping shows that the problem can solved. Thus, we should formalize the prototype and enable the caching. A good query to test this is: {code} from (select * from dec union all select * from dec2) s insert overwrite table dec3 select s.name, sum(s.value) group by s.name insert overwrite table dec4 select s.name, s.value order by s.value; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9416) Get rid of Extract Operator
[ https://issues.apache.org/jira/browse/HIVE-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-9416: --- Resolution: Fixed Fix Version/s: 1.2.0 Status: Resolved (was: Patch Available) Committed to trunk. Get rid of Extract Operator --- Key: HIVE-9416 URL: https://issues.apache.org/jira/browse/HIVE-9416 Project: Hive Issue Type: Task Components: Query Processor Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.2.0 Attachments: HIVE-9416.1.patch, HIVE-9416.2.patch, HIVE-9416.3.patch, HIVE-9416.4.patch, HIVE-9416.5.patch, HIVE-9416.6.patch, HIVE-9416.7.patch, HIVE-9416.patch {{Extract Operator}} has been there for legacy reasons. But there is no functionality it provides which can't be provided by {{Select Operator}}. Instead of having two operators, one being a subset of the other, we should just get rid of {{Extract}} and simplify our codebase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9542) SparkSessionImpl calculates wrong cores number in TestSparkCliDriver [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-9542: Summary: SparkSessionImpl calcualte wrong cores number in TestSparkCliDriver [Spark Branch] (was: SparkSessionImpl calcualte wrong number of cores number in TestSparkCliDriver [Spark Branch]) SparkSessionImpl calcualte wrong cores number in TestSparkCliDriver [Spark Branch] -- Key: HIVE-9542 URL: https://issues.apache.org/jira/browse/HIVE-9542 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li TestSparkCliDriver launches a local Spark cluster with [2,2,1024], which means 2 executors with 2 cores each, but HoS gets the core count as 2 instead of 4. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
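The arithmetic behind the bug is simple: a local-cluster spec [executors, coresPerExecutor, memoryMb] provides executors × coresPerExecutor cores in total, so [2,2,1024] has 4 cores, while reading only the per-executor value (2) under-counts. The method below is a hypothetical illustration of that calculation, not Hive's actual SparkSessionImpl API.

```java
public class ClusterCores {
    // Parse a local-cluster spec like "[2,2,1024]" and return total cores.
    public static int totalCores(String spec) {
        String[] parts = spec.replaceAll("[\\[\\]\\s]", "").split(",");
        int executors = Integer.parseInt(parts[0]);       // number of executors
        int coresPerExecutor = Integer.parseInt(parts[1]); // spark.executor.cores
        return executors * coresPerExecutor;               // total usable cores
    }

    public static void main(String[] args) {
        System.out.println(totalCores("[2,2,1024]")); // prints 4
    }
}
```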
[jira] [Updated] (HIVE-9211) Research on build mini HoS cluster on YARN for unit test[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-9211: Attachment: HIVE-9211.7-spark.patch TestSparkCliDriver launches a local Spark cluster with \[2,2,1024\], which means 2 executors with 2 cores each, while HoS uses the spark.executor.cores value to calculate the total number of cores, so TestSparkCliDriver sets the reduce partition number to 2 instead of 4. Currently the core-count calculation logic reaches into Spark internals and is easy to break; we may handle it in a better way after SPARK-5080 is resolved. groupby2.q and join1.q failed for the previous reason during EXPLAIN queries, and HIVE-9542 was created for this issue. ql_rewrite_gbtoidx_cbo_2.q failed on TestMinimrCliDriver because I added a result order tag to the qfile earlier and did not update the TestMinimrCliDriver output. The encryption_join_with_different_encryption_keys.q failure should not be related to this patch, judging from the log file. Research on build mini HoS cluster on YARN for unit test[Spark Branch] -- Key: HIVE-9211 URL: https://issues.apache.org/jira/browse/HIVE-9211 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M5 Attachments: HIVE-9211.1-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.2-spark.patch, HIVE-9211.3-spark.patch, HIVE-9211.4-spark.patch, HIVE-9211.5-spark.patch, HIVE-9211.6-spark.patch, HIVE-9211.7-spark.patch HoS on YARN is a common use case in production environments; we'd better enable unit tests for this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9541) Update people page with new PMC members
[ https://issues.apache.org/jira/browse/HIVE-9541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-9541: Attachment: HIVE-9541.1.patch Update people page with new PMC members --- Key: HIVE-9541 URL: https://issues.apache.org/jira/browse/HIVE-9541 Project: Hive Issue Type: Improvement Components: Website Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Priority: Trivial Attachments: HIVE-9541.1.patch Move [~jdere], [~owen.omalley], [~prasanth_j], [~vikram.dixit] and [~szehon] from committer list to PMC list. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9541) Update people page with new PMC members
Prasanth Jayachandran created HIVE-9541: --- Summary: Update people page with new PMC members Key: HIVE-9541 URL: https://issues.apache.org/jira/browse/HIVE-9541 Project: Hive Issue Type: Improvement Components: Website Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Priority: Trivial Move [~jdere], [~owen.omalley], [~prasanth_j], [~vikram.dixit] and [~szehon] from committer list to PMC list. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9333) Move parquet serialize implementation to DataWritableWriter to improve write speeds
[ https://issues.apache.org/jira/browse/HIVE-9333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300930#comment-14300930 ] Hive QA commented on HIVE-9333: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12695870/HIVE-9333.6.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7407 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2612/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2612/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2612/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12695870 - PreCommit-HIVE-TRUNK-Build Move parquet serialize implementation to DataWritableWriter to improve write speeds --- Key: HIVE-9333 URL: https://issues.apache.org/jira/browse/HIVE-9333 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9333.5.patch, HIVE-9333.6.patch The serialize process on ParquetHiveSerDe parses a Hive object to a Writable object by looping through all the Hive object children, and creating new Writable objects per child. 
These final Writable objects are passed to the Parquet writing function, and parsed again in the DataWritableWriter class by looping through the ArrayWritable object. These two loops (ParquetHiveSerDe.serialize() and DataWritableWriter.write()) can be reduced to a single loop in the DataWritableWriter.write() method in order to speed up the write process for Hive Parquet. To achieve this, we can wrap the Hive object and object inspector in the ParquetHiveSerDe.serialize() method into an object that implements the Writable interface, thus avoiding the loop that serialize() does, and leave the loop parsing to the DataWritableWriter.write() method. We can see how ORC does this with the OrcSerde.OrcSerdeRow class. Writable objects are organized differently across storage formats, so I don't think it is necessary to create and keep the Writable objects in the serialize() method, as they won't be used until the writing process starts (DataWritableWriter.write()). This performance issue was found using microbenchmark tests from HIVE-8121. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
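The wrapper idea described above, modeled on ORC's OrcSerde.OrcSerdeRow, can be sketched as a Writable that simply holds the row and its object inspector until write time. In this sketch the Writable interface is a local stand-in for org.apache.hadoop.io.Writable and the inspector is a plain Object, so it compiles without Hadoop/Hive on the classpath; the class and method names are illustrative, not the patch's actual ones.

```java
import java.io.DataOutput;
import java.io.IOException;

public class SerDeRowSketch {
    /** Local stand-in for org.apache.hadoop.io.Writable. */
    public interface Writable {
        void write(DataOutput out) throws IOException;
    }

    /** Wraps the row and its inspector; no per-field Writables are built here. */
    public static final class ParquetRow implements Writable {
        private final Object row;
        private final Object inspector; // stand-in for an ObjectInspector

        public ParquetRow(Object row, Object inspector) {
            this.row = row;
            this.inspector = inspector;
        }

        public Object getRow() { return row; }
        public Object getInspector() { return inspector; }

        @Override
        public void write(DataOutput out) {
            // In the real patch, DataWritableWriter.write() walks the row via
            // the inspector at this point -- the single remaining loop.
            throw new UnsupportedOperationException("handled by the writer");
        }
    }

    public static void main(String[] args) {
        ParquetRow r = new ParquetRow(new Object[] {"name", 1}, "rowInspector");
        System.out.println(r.getRow() != null && r.getInspector() != null);
    }
}
```

The design point is that serialize() becomes a cheap wrap operation, and all field traversal happens exactly once, inside the writer.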
[jira] [Created] (HIVE-9542) SparkSessionImpl calcualte wrong number of cores number in TestSparkCliDriver [Spark Branch]
Chengxiang Li created HIVE-9542: --- Summary: SparkSessionImpl calcualte wrong number of cores number in TestSparkCliDriver [Spark Branch] Key: HIVE-9542 URL: https://issues.apache.org/jira/browse/HIVE-9542 Project: Hive Issue Type: Sub-task Reporter: Chengxiang Li TestSparkCliDriver launches a local Spark cluster with [2,2,1024], which means 2 executors with 2 cores each, but HoS gets the core count as 2 instead of 4. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 30281: Move parquet serialize implementation to DataWritableWriter to improve write speeds
On Feb. 2, 2015, 2:27 a.m., Dong Chen wrote: ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java, line 215 https://reviews.apache.org/r/30281/diff/4/?file=840163#file840163line215 This seems duplicate, since it has been checked before invoking writeMap(...) Thanks Dong. I did not see this extra validation. - Sergio --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30281/#review70538 --- On Jan. 29, 2015, 5:12 p.m., Sergio Pena wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30281/ --- (Updated Jan. 29, 2015, 5:12 p.m.) Review request for hive, Ryan Blue, cheng xu, and Dong Chen. Bugs: HIVE-9333 https://issues.apache.org/jira/browse/HIVE-9333 Repository: hive-git Description --- This patch moves the ParquetHiveSerDe.serialize() implementation to the DataWritableWriter class in order to save time in materializing data on serialize(). Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java ea4109d358f7c48d1e2042e5da299475de4a0a29 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 9caa4ed169ba92dbd863e4a2dc6d06ab226a4465 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriteSupport.java 060b1b722d32f3b2f88304a1a73eb249e150294b ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/DataWritableWriter.java 41b5f1c3b0ab43f734f8a211e3e03d5060c75434 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java e52c4bc0b869b3e60cb4bfa9e11a09a0d605ac28 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java a693aff18516d133abf0aae4847d3fe00b9f1c96 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetOutputFormat.java 667d3671547190d363107019cd9a2d105d26d336 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 007a665529857bcec612f638a157aa5043562a15 
serde/src/java/org/apache/hadoop/hive/serde2/io/ParquetWritable.java PRE-CREATION Diff: https://reviews.apache.org/r/30281/diff/ Testing --- The tests run were the following: 1. JMH (Java microbenchmark) This benchmark called parquet serialize/write methods using text writable objects. Class.method: Before Change (ops/s) / After Change (ops/s) --- ParquetHiveSerDe.serialize: 19,113 / 249,528 - a ~13x speed increase DataWritableWriter.write: 5,033 / 5,201 - a 3.34% speed increase 2. Write 20 million rows (~1GB file) from Text to Parquet I wrote a ~1GB file in Textfile format, then converted it to Parquet format using the following statement: CREATE TABLE parquet STORED AS parquet AS SELECT * FROM text; Time (s) it took to write the whole file BEFORE changes: 93.758 s Time (s) it took to write the whole file AFTER changes: 83.903 s That is about a 10% speed increase. Thanks, Sergio Pena
[jira] [Updated] (HIVE-9333) Move parquet serialize implementation to DataWritableWriter to improve write speeds
[ https://issues.apache.org/jira/browse/HIVE-9333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9333: -- Attachment: HIVE-9333.6.patch Move parquet serialize implementation to DataWritableWriter to improve write speeds --- Key: HIVE-9333 URL: https://issues.apache.org/jira/browse/HIVE-9333 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9333.5.patch, HIVE-9333.6.patch
[jira] [Updated] (HIVE-9333) Move parquet serialize implementation to DataWritableWriter to improve write speeds
[ https://issues.apache.org/jira/browse/HIVE-9333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9333: -- Status: Patch Available (was: Open) Removed a repeated validation from writeMap(). Move parquet serialize implementation to DataWritableWriter to improve write speeds --- Key: HIVE-9333 URL: https://issues.apache.org/jira/browse/HIVE-9333 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9333.5.patch, HIVE-9333.6.patch
[jira] [Updated] (HIVE-9333) Move parquet serialize implementation to DataWritableWriter to improve write speeds
[ https://issues.apache.org/jira/browse/HIVE-9333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9333: -- Attachment: (was: HIVE-9333.4.patch) Move parquet serialize implementation to DataWritableWriter to improve write speeds --- Key: HIVE-9333 URL: https://issues.apache.org/jira/browse/HIVE-9333 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9333.5.patch, HIVE-9333.6.patch
[jira] [Updated] (HIVE-9333) Move parquet serialize implementation to DataWritableWriter to improve write speeds
[ https://issues.apache.org/jira/browse/HIVE-9333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9333: -- Attachment: (was: HIVE-9333.3.patch) Move parquet serialize implementation to DataWritableWriter to improve write speeds --- Key: HIVE-9333 URL: https://issues.apache.org/jira/browse/HIVE-9333 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9333.5.patch, HIVE-9333.6.patch
[jira] [Updated] (HIVE-9333) Move parquet serialize implementation to DataWritableWriter to improve write speeds
[ https://issues.apache.org/jira/browse/HIVE-9333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9333: -- Attachment: (was: HIVE-9333.2.patch) Move parquet serialize implementation to DataWritableWriter to improve write speeds --- Key: HIVE-9333 URL: https://issues.apache.org/jira/browse/HIVE-9333 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9333.5.patch, HIVE-9333.6.patch
[jira] [Updated] (HIVE-9333) Move parquet serialize implementation to DataWritableWriter to improve write speeds
[ https://issues.apache.org/jira/browse/HIVE-9333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-9333: -- Status: Open (was: Patch Available) Move parquet serialize implementation to DataWritableWriter to improve write speeds --- Key: HIVE-9333 URL: https://issues.apache.org/jira/browse/HIVE-9333 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9333.5.patch, HIVE-9333.6.patch
[jira] [Commented] (HIVE-9234) HiveServer2 leaks FileSystem objects in FileSystem.CACHE
[ https://issues.apache.org/jira/browse/HIVE-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300915#comment-14300915 ] Thejas M Nair commented on HIVE-9234: - The leak was in FileSystem.CACHE; it would keep growing. The Key in the CACHE includes the UGI object. To clear the entries in CACHE corresponding to the UGI object, closeAllForUGI should be called once the UGI is no longer going to be used. The fix is in changing the sequence - to do ShimLoader.getHadoopShims().closeAllForUGI(sessionUgi) as the final step of HiveSessionImplwithUGI.close, after super.close(). super.close() was indirectly adding another entry into the CACHE. HiveServer2 leaks FileSystem objects in FileSystem.CACHE Key: HIVE-9234 URL: https://issues.apache.org/jira/browse/HIVE-9234 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0, 0.12.1, 0.14.0, 0.13.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.15.0, 1.0.0 Attachments: HIVE-9234.1.patch, HIVE-9234.2.patch, HIVE-9234.2.patch, HIVE-9234.branch-14.patch Running over an extended period (48+ hrs), we've noticed HiveServer2 leaking FileSystem objects in FileSystem.CACHE. Linked jiras were previous attempts to fix it, but the issue still seems to be there. A workaround is to disable the caching (by setting {{fs.hdfs.impl.disable.cache}} and {{fs.file.impl.disable.cache}} to {{true}}), but creating new FileSystem objects is expensive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
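The ordering bug described in the comment can be modeled with a toy cache keyed by a "UGI": if the entries for a UGI are cleared before the final close (which itself does work under that UGI and re-populates the cache), one entry leaks; clearing them as the last step leaks nothing. This is a self-contained model, not HiveServer2 or Hadoop code.

```java
import java.util.HashMap;
import java.util.Map;

public class UgiCacheModel {
    final Map<String, Integer> cache = new HashMap<>();

    void doWorkAs(String ugi) {          // e.g. a FileSystem.get() under the UGI
        cache.merge(ugi, 1, Integer::sum);
    }

    void closeAllForUgi(String ugi) {    // drop every cache entry for the UGI
        cache.remove(ugi);
    }

    // Returns the number of cache entries leaked for this UGI after close.
    int closeSession(String ugi, boolean closeAllLast) {
        if (!closeAllLast) closeAllForUgi(ugi); // buggy order: clear first...
        doWorkAs(ugi);                          // ...then super.close() re-adds an entry
        if (closeAllLast) closeAllForUgi(ugi);  // fixed order: clear as the final step
        return cache.getOrDefault(ugi, 0);
    }

    public static void main(String[] args) {
        System.out.println(new UgiCacheModel().closeSession("alice", false)); // 1 (leak)
        System.out.println(new UgiCacheModel().closeSession("alice", true));  // 0 (fixed)
    }
}
```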
[jira] [Commented] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300917#comment-14300917 ] Thejas M Nair commented on HIVE-9539: - +1 Thanks for the patch [~damien.carol] and [~apivovarov] Wrong check of version format in TestWebHCatE2e.getHiveVersion() Key: HIVE-9539 URL: https://issues.apache.org/jira/browse/HIVE-9539 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.2.0 Reporter: Damien Carol Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9539.2.patch, HIVE-9539.patch Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version is in the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is like {{1.2.0-SNAPSHOT}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
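The failing check can be reproduced with java.util.regex: the old pattern only accepts versions whose major component is 0, so 1.2.0-SNAPSHOT fails it. The relaxed pattern below is a plausible fix that allows any major version; it is shown for illustration and is not necessarily the exact pattern committed in the patch.

```java
import java.util.regex.Pattern;

public class VersionCheck {
    // Old pattern from the test (dots escaped): only matches 0.x.y versions.
    public static final Pattern OLD = Pattern.compile("0\\.[0-9]+\\.[0-9]+.*");
    // Relaxed pattern: any numeric major.minor.patch, optional suffix.
    public static final Pattern FIXED = Pattern.compile("[0-9]+\\.[0-9]+\\.[0-9]+.*");

    public static void main(String[] args) {
        System.out.println(OLD.matcher("1.2.0-SNAPSHOT").matches());   // false
        System.out.println(FIXED.matcher("1.2.0-SNAPSHOT").matches()); // true
        System.out.println(FIXED.matcher("0.14.0").matches());         // true
    }
}
```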