[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier
[ https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7056:
Affects Version/s: 0.13.0

TestPig_11 fails with Pig 12.1 and earlier

Key: HIVE-7056
URL: https://issues.apache.org/jira/browse/HIVE-7056
Project: Hive
Issue Type: Bug
Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

On trunk, the pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) looks for *hcatalog-core-*.jar etc. In Pig 12.1 it looks for hcatalog-core-*.jar, which doesn't work with Hive 0.13. The TestPig_11 job fails with:

{noformat}
2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Failed to parse: Pig script failed to parse: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
	at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
	at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
	at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
	at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
	at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
	at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
	at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
	at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
	at org.apache.pig.Main.run(Main.java:478)
	at org.apache.pig.Main.main(Main.java:156)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
	at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
	at org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
	at org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
	at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
	at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
	at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
	at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
	at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
	at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
	... 16 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
	at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
	at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
	... 24 more
{noformat}

The key to this is:

{noformat}
ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar: No such file or directory
ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar: No such file or directory
ls:
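The glob mismatch described above can be reproduced with shell-style pattern matching. This is a Python sketch for illustration only; the jar name below is an assumption reflecting Hive 0.13's "hive-" prefixed hcatalog jar naming, not a file name taken from the report.

```python
from fnmatch import fnmatch

# Assumed jar name following Hive 0.13's "hive-" prefix convention.
jar = "hive-hcatalog-core-0.13.0.jar"

trunk_pattern = "*hcatalog-core-*.jar"   # what trunk bin/pig looks for
pig121_pattern = "hcatalog-core-*.jar"   # what Pig 12.1's bin/pig looks for

trunk_match = fnmatch(jar, trunk_pattern)    # leading * absorbs the "hive-" prefix
pig121_match = fnmatch(jar, pig121_pattern)  # no leading wildcard, so no match
```

With no leading wildcard, Pig 12.1's pattern never matches the prefixed jar, which is why the `ls` calls above come back empty and HCatStorer cannot be resolved.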
[jira] [Updated] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set
[ https://issues.apache.org/jira/browse/HIVE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7057: - Status: Patch Available (was: Open) webhcat e2e deployment scripts don't have x bit set --- Key: HIVE-7057 URL: https://issues.apache.org/jira/browse/HIVE-7057 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7057.patch also, update env.sh to use latest Pig release NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6965) Transaction manager should use RDBMS time instead of machine time
[ https://issues.apache.org/jira/browse/HIVE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-6965: - Attachment: HIVE-6965.patch Transaction manager should use RDBMS time instead of machine time - Key: HIVE-6965 URL: https://issues.apache.org/jira/browse/HIVE-6965 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-6965.patch Current TxnHandler and CompactionTxnHandler use System.currentTimeMillis() when they need to determine the time (such as heartbeating transactions). In situations where there are multiple Thrift metastore services or users are using an embedded metastore this will lead to issues. We should instead be using time from the RDBMS, which is guaranteed to be the same for all users. -- This message was sent by Atlassian JIRA (v6.2#6252)
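The idea in HIVE-6965 — ask the RDBMS for "now" instead of trusting each client's clock — can be sketched as follows. This uses Python with SQLite purely for illustration; the actual TxnHandler code talks to the metastore RDBMS over JDBC and the query text will differ by backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Every heartbeat reads the database's clock, so multiple metastore
# instances agree on the time even when their machine clocks drift.
(db_now,) = conn.execute("SELECT strftime('%s', 'now')").fetchone()
heartbeat_time = int(db_now)  # seconds since the epoch, per the RDBMS
```

The key property is that the timestamp is produced by the single shared database, not by `System.currentTimeMillis()` on whichever host happens to run the code.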
Re: [VOTE] Apache Hive 0.13.1 Release Candidate 1
TestHive_7 is explained by https://issues.apache.org/jira/browse/HIVE-6521, which is in trunk but not 13.1.

On Tue, May 13, 2014 at 6:50 PM, Eugene Koifman ekoif...@hortonworks.com wrote:

I downloaded the src tar, built it and ran webhcat e2e tests. I see 2 failures (which I don't see on trunk).

TestHive_7 fails with "got percentComplete map 100% reduce 0%, expected map 100% reduce 100%".

TestHeartbeat_1 fails to even launch the job. This looks like the root cause:

ERROR | 13 May 2014 18:24:00,394 | org.apache.hive.hcatalog.templeton.CatchallExceptionMapper |
java.lang.NullPointerException
	at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:312)
	at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:479)
	at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)
	at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.hive.hcatalog.templeton.LauncherDelegator$1.run(LauncherDelegator.java:107)
	at org.apache.hive.hcatalog.templeton.LauncherDelegator$1.run(LauncherDelegator.java:103)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
	at org.apache.hive.hcatalog.templeton.LauncherDelegator.queueAsUser(LauncherDelegator.java:103)
	at org.apache.hive.hcatalog.templeton.LauncherDelegator.enqueueController(LauncherDelegator.java:81)
	at org.apache.hive.hcatalog.templeton.JarDelegator.run(JarDelegator.java:55)
	at org.apache.hive.hcatalog.templeton.Server.mapReduceJar(Server.java:711)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
	at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
	at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
	at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
	at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
	at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1480)
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1411)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360)
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350)
	at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538)
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1360)
	at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:392)
	at org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:87)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1331)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:477)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:965)
	at
[jira] [Updated] (HIVE-6187) Add test to verify that DESCRIBE TABLE works with quoted table names
[ https://issues.apache.org/jira/browse/HIVE-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6187: --- Assignee: Carl Steinbach Add test to verify that DESCRIBE TABLE works with quoted table names Key: HIVE-6187 URL: https://issues.apache.org/jira/browse/HIVE-6187 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Andy Mok Assignee: Carl Steinbach Fix For: 0.14.0 Attachments: HIVE-6187.1.patch Backticks around tables named after special keywords, such as items, allow us to create, drop, and alter the table. For example {code:sql} CREATE TABLE foo.`items` (bar INT); DROP TABLE foo.`items`; ALTER TABLE `items` RENAME TO `items_`; {code} However, we cannot call {code:sql} DESCRIBE foo.`items`; DESCRIBE `items`; {code} The DESCRIBE query does not permit backticks to surround table names. The error returned is {code:sql} FAILED: SemanticException [Error 10001]: Table not found `items` {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set
[ https://issues.apache.org/jira/browse/HIVE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7057: - Description: also, update env.sh to use latest Pig release NO PRECOMMIT TESTS was:also, update env.sh to use latest Pig release webhcat e2e deployment scripts don't have x bit set --- Key: HIVE-7057 URL: https://issues.apache.org/jira/browse/HIVE-7057 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Eugene Koifman Assignee: Eugene Koifman also, update env.sh to use latest Pig release NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5664) Drop cascade database fails when the db has any tables with indexes
[ https://issues.apache.org/jira/browse/HIVE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti updated HIVE-5664:
Attachment: HIVE-5664.2.patch.txt

Rebased on latest trunk and fixed the .out changes.

Drop cascade database fails when the db has any tables with indexes

Key: HIVE-5664
URL: https://issues.apache.org/jira/browse/HIVE-5664
Project: Hive
Issue Type: Bug
Components: Indexing, Metastore
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
Fix For: 0.14.0
Attachments: HIVE-5664.1.patch.txt, HIVE-5664.2.patch.txt

{code}
CREATE DATABASE db2;
USE db2;
CREATE TABLE tab1 (id int, name string);
CREATE INDEX idx1 ON TABLE tab1(id) AS 'COMPACT' WITH DEFERRED REBUILD IN TABLE tab1_indx;
DROP DATABASE db2 CASCADE;
{code}

The last DDL fails with the following error:

{code}
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Database does not exist: db2
{code}

hive.log has the following exception:

{code}
2013-10-27 20:46:16,629 ERROR exec.DDLTask (DDLTask.java:execute(434)) - org.apache.hadoop.hive.ql.metadata.HiveException: Database does not exist: db2
	at org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3473)
	at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:231)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1441)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1219)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1047)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:915)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: NoSuchObjectException(message:db2.tab1_indx table not found)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1376)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
	at com.sun.proxy.$Proxy7.get_table(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:890)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:660)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:652)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:546)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
	at com.sun.proxy.$Proxy8.dropDatabase(Unknown Source)
	at org.apache.hadoop.hive.ql.metadata.Hive.dropDatabase(Hive.java:284)
	at org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3470)
	... 18 more
{code}

-- This message was sent by Atlassian JIRA (v6.2#6252)
Re: How to remote debug WebHCat?
Should this be documented in the wiki?

-- Lefty

On Tue, May 13, 2014 at 3:17 PM, Eugene Koifman ekoif...@hortonworks.com wrote:

If you take webhcat_server.sh as currently in trunk, it supports a startDebug option that will let you attach a debugger to the process.

On Mon, May 12, 2014 at 11:13 PM, Na Yang ny...@maprtech.com wrote:

Hi Folks, Is there a way to remote debug webhcat? If so, how do I enable remote debugging? Thanks, Na

-- Thanks, Eugene
[jira] [Updated] (HIVE-7050) Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE
[ https://issues.apache.org/jira/browse/HIVE-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-7050: - Attachment: HIVE-7050.2.patch Addressed [~xuefuz]'s review comments. Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE - Key: HIVE-7050 URL: https://issues.apache.org/jira/browse/HIVE-7050 Project: Hive Issue Type: Bug Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7050.1.patch, HIVE-7050.2.patch There is currently no way to display the column level stats from hive CLI. It will be good to show them in DESCRIBE EXTENDED/FORMATTED TABLE -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7050) Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE
[ https://issues.apache.org/jira/browse/HIVE-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997303#comment-13997303 ] Prasanth J commented on HIVE-7050: -- Column stats are stored only when a column is specified and only when FORMATTED is specified. It does NOT show for EXTENDED because extended output does not show the column names at the top which makes it difficult to comprehend the column stats output. Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE - Key: HIVE-7050 URL: https://issues.apache.org/jira/browse/HIVE-7050 Project: Hive Issue Type: Bug Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7050.1.patch, HIVE-7050.2.patch There is currently no way to display the column level stats from hive CLI. It will be good to show them in DESCRIBE EXTENDED/FORMATTED TABLE -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7054) Support ELT UDF in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepesh Khandelwal updated HIVE-7054: - Status: Patch Available (was: Open) Support ELT UDF in vectorized mode -- Key: HIVE-7054 URL: https://issues.apache.org/jira/browse/HIVE-7054 Project: Hive Issue Type: New Feature Components: Vectorization Affects Versions: 0.14.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Attachments: HIVE-7054.patch Implement support for ELT udf in vectorized execution mode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7049) Unable to deserialize AVRO data when file schema and record schema are different and nullable
[ https://issues.apache.org/jira/browse/HIVE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996047#comment-13996047 ] Hive QA commented on HIVE-7049:

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12644526/HIVE-7049.1.patch
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/188/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/188/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12644526

Unable to deserialize AVRO data when file schema and record schema are different and nullable

Key: HIVE-7049
URL: https://issues.apache.org/jira/browse/HIVE-7049
Project: Hive
Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Attachments: HIVE-7049.1.patch

It mainly happens when:
1) file schema and record schema are not the same
2) record schema is nullable but file schema is not

The potential code location is in class AvroDeserializer:

{noformat}
if(AvroSerdeUtils.isNullableType(recordSchema)) {
  return deserializeNullableUnion(datum, fileSchema, recordSchema, columnType);
}
{noformat}

In the above code snippet, recordSchema is verified to be nullable, but the file schema is not checked. I tested with these values:

{noformat}
recordSchema = [null,string]
fileSchema = string
{noformat}

And I got the following exception (line numbers might not be the same due to my debugged code version):

{noformat}
org.apache.avro.AvroRuntimeException: Not a union: string
	at org.apache.avro.Schema.getTypes(Schema.java:272)
	at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserializeNullableUnion(AvroDeserializer.java:275)
	at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.worker(AvroDeserializer.java:205)
	at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.workerBase(AvroDeserializer.java:188)
	at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:174)
	at org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.verifyNullableType(TestAvroDeserializer.java:487)
	at org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeNullableTypes(TestAvroDeserializer.java:407)
{noformat}

-- This message was sent by Atlassian JIRA (v6.2#6252)
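The reported condition can be modeled with plain data. This is a Python sketch for illustration only; `is_nullable_union` is a hypothetical stand-in for `AvroSerdeUtils.isNullableType`, not the real implementation.

```python
def is_nullable_union(schema):
    # Hypothetical stand-in for AvroSerdeUtils.isNullableType:
    # treats a union (modeled as a list) containing "null" as nullable.
    return isinstance(schema, list) and "null" in schema

record_schema = ["null", "string"]  # nullable union
file_schema = "string"              # plain, non-union type

# Checking only the record schema (the reported bug) would take the
# union-resolution path and blow up on the non-union file schema;
# guarding on both schemas avoids that path here.
safe_to_resolve_union = is_nullable_union(record_schema) and is_nullable_union(file_schema)
```

With the values from the report, the guard evaluates to false, so the `Not a union: string` failure path is never entered.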
[jira] [Created] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set
Eugene Koifman created HIVE-7057: Summary: webhcat e2e deployment scripts don't have x bit set Key: HIVE-7057 URL: https://issues.apache.org/jira/browse/HIVE-7057 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Eugene Koifman Assignee: Eugene Koifman also, update env.sh to use latest Pig release -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7054) Support ELT UDF in vectorized mode
Deepesh Khandelwal created HIVE-7054: Summary: Support ELT UDF in vectorized mode Key: HIVE-7054 URL: https://issues.apache.org/jira/browse/HIVE-7054 Project: Hive Issue Type: New Feature Components: Vectorization Affects Versions: 0.14.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Fix For: 0.14.0 Implement support for ELT udf in vectorized execution mode. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-860) Persistent distributed cache
[ https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-860: --- Fix Version/s: (was: 0.13.0) 0.14.0 Persistent distributed cache Key: HIVE-860 URL: https://issues.apache.org/jira/browse/HIVE-860 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0 Reporter: Zheng Shao Assignee: Brock Noland Fix For: 0.14.0 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch The DistributedCache is shared across multiple jobs if the hdfs file name is the same. We need to make sure Hive puts the same file into the same location every time and does not overwrite it if the file content is the same. We can achieve 2 different results: A1. Files added with the same name, timestamp, and md5 in the same session will have a single copy in the distributed cache. A2. Files added with the same name, timestamp, and md5 will have a single copy in the distributed cache. A2 has a bigger benefit in sharing but may raise the question of when Hive should clean it up in hdfs. -- This message was sent by Atlassian JIRA (v6.2#6252)
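The sharing rule described in HIVE-860 (one cached copy per identical name, timestamp, and md5) can be sketched as a content-derived cache location. This is a Python illustration only; the path layout and helper name are assumptions, not Hive's actual scheme.

```python
import hashlib

def cache_path(name: str, timestamp: int, md5: str) -> str:
    # Hypothetical layout: identical (name, timestamp, md5) triples map to
    # the same HDFS location, so re-adding the same file reuses the cached copy.
    key = hashlib.sha1(f"{name}|{timestamp}|{md5}".encode()).hexdigest()
    return f"/tmp/hive-cache/{key}/{name}"

a = cache_path("udf.jar", 1400000000, "d41d8cd9")  # first add
b = cache_path("udf.jar", 1400000000, "d41d8cd9")  # same file, same location
c = cache_path("udf.jar", 1400000000, "ffffffff")  # changed content, new location
```

Because the location is a pure function of the triple, nothing is overwritten when the content is unchanged, which is exactly the property A1/A2 rely on.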
[jira] [Commented] (HIVE-5810) create a function add_date as exists in mysql
[ https://issues.apache.org/jira/browse/HIVE-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997273#comment-13997273 ] Anandha L Ranganathan commented on HIVE-5810: - I could have implemented this in an existing function, but that would break the existing implementation already in production. create a function add_date as exists in mysql Key: HIVE-5810 URL: https://issues.apache.org/jira/browse/HIVE-5810 Project: Hive Issue Type: Improvement Reporter: Anandha L Ranganathan Assignee: Anandha L Ranganathan Attachments: HIVE-5810.2.patch, HIVE-5810.patch Original Estimate: 40h Remaining Estimate: 40h MySQL has ADDDATE(date, INTERVAL expr unit). Similarly, in Hive we can have add_date(date, unit, expr), where unit is DAY/MONTH/YEAR. For example, add_date('2013-11-09','DAY',2) will return 2013-11-11, add_date('2013-11-09','MONTH',2) will return 2014-01-09, and add_date('2013-11-09','YEAR',2) will return 2015-11-09. -- This message was sent by Atlassian JIRA (v6.2#6252)
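The proposed semantics can be sketched like this. The Python `add_date` below is a hypothetical model of the UDF for illustration, not the patch's Java implementation, and it ignores day-of-month overflow (e.g. Jan 31 + 1 month).

```python
from datetime import date, timedelta

def add_date(d: str, unit: str, n: int) -> str:
    # Hypothetical model of the proposed add_date(date, unit, expr) UDF.
    y, m, day = map(int, d.split("-"))
    u = unit.upper()
    if u == "DAY":
        return (date(y, m, day) + timedelta(days=n)).isoformat()
    if u == "MONTH":
        months = (m - 1) + n  # months since year start, then roll years
        return date(y + months // 12, months % 12 + 1, day).isoformat()
    if u == "YEAR":
        return date(y + n, m, day).isoformat()
    raise ValueError(f"unsupported unit: {unit}")
```

Adding 2 DAYs to 2013-11-09 gives 2013-11-11, 2 MONTHs gives 2014-01-09, and 2 YEARs gives 2015-11-09, matching the MySQL ADDDATE behavior the issue is modeled on.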
[jira] [Commented] (HIVE-7033) grant statements should check if the role exists
[ https://issues.apache.org/jira/browse/HIVE-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996018#comment-13996018 ] Hive QA commented on HIVE-7033: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12644480/HIVE-7033.4.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 5506 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHadoopVersion org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getPigVersion org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getStatus org.apache.hive.hcatalog.templeton.TestWebHCatE2e.invalidPath {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/183/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/183/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12644480 grant statements should check if the role exists Key: HIVE-7033 URL: https://issues.apache.org/jira/browse/HIVE-7033 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7033.1.patch, HIVE-7033.2.patch, HIVE-7033.3.patch, HIVE-7033.4.patch The following grant statement that grants to a role that does not exist succeeds, but it should result in an error. grant all on t1 to role nosuchrole; -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6986) MatchPath fails with small resultExprString
[ https://issues.apache.org/jira/browse/HIVE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997507#comment-13997507 ] Furcy Pin commented on HIVE-6986: - Hi Ashutosh, r.startsWith("select") wouldn't match upper case, and unfortunately there is no built-in String.startsWithIgnoreCase in Java, I believe. r.toLowerCase().equals("select") would work of course, but I wanted to preserve the initial (futile) optimisation with the r.substring(0,6), so that we don't cast the whole query to lowercase, but just its prefix. MatchPath fails with small resultExprString --- Key: HIVE-6986 URL: https://issues.apache.org/jira/browse/HIVE-6986 Project: Hive Issue Type: Bug Components: UDF Reporter: Furcy Pin Priority: Trivial Attachments: HIVE-6986.1.patch When using MatchPath, a query like this: select year from matchpath(on flights_tiny sort by fl_num, year, month, day_of_month arg1('LATE.LATE+'), arg2('LATE'), arg3(arr_delay 15), arg4('year') ) ; will fail with error message FAILED: StringIndexOutOfBoundsException String index out of range: 6 -- This message was sent by Atlassian JIRA (v6.2#6252)
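The safe prefix check under discussion can be sketched as follows. This is a Python illustration of the idea (the actual fix is to MatchPath's Java code); `starts_with_select` is a hypothetical helper name.

```python
def starts_with_select(query: str) -> bool:
    # Lowercase only the 6-character prefix rather than the whole query,
    # preserving the original optimisation; slicing never raises, even
    # when the string is shorter than 6 characters.
    return query[:6].lower() == "select"
```

Unlike `substring(0, 6)` in Java, the slice is safe on short inputs, which is exactly the `StringIndexOutOfBoundsException` case the issue reports.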
[jira] [Updated] (HIVE-7048) CompositeKeyHBaseFactory should not use FamilyFilter
[ https://issues.apache.org/jira/browse/HIVE-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7048: -- Priority: Blocker (was: Major) CompositeKeyHBaseFactory should not use FamilyFilter Key: HIVE-7048 URL: https://issues.apache.org/jira/browse/HIVE-7048 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Swarnim Kulkarni Priority: Blocker HIVE-6411 introduced a more generic way to provide composite key implementations via custom factory implementations. However it seems like the CompositeHBaseKeyFactory implementation uses a FamilyFilter for row key scans which doesn't seem appropriate. This should be investigated further and if possible replaced with a RowRangeScanFilter. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 21430: HIVE-6994 - parquet-hive createArray strips null elements
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21430/ --- Review request for hive. Repository: hive-git Description --- The createArray method in ParquetHiveSerDe strips null values from resultant ArrayWritables. This patch: - removes an incorrect if null check in createArray - simplifies ParquetHiveSerDe - total refactor of TestParquetHiveSerDe for better test coverage and easier regression testing Diffs - data/files/parquet_create.txt ccd48ee ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java b689336 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 3b56fc7 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetHiveSerDe.java PRE-CREATION ql/src/test/queries/clientpositive/parquet_create.q 0b976bd ql/src/test/results/clientpositive/parquet_create.q.out 3220be5 Diff: https://reviews.apache.org/r/21430/diff/ Testing --- Thanks, justin coffey
[jira] [Commented] (HIVE-6290) Add support for hbase filters for composite keys
[ https://issues.apache.org/jira/browse/HIVE-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997376#comment-13997376 ] Lefty Leverenz commented on HIVE-6290: -- Does this need any user doc? Add support for hbase filters for composite keys Key: HIVE-6290 URL: https://issues.apache.org/jira/browse/HIVE-6290 Project: Hive Issue Type: Sub-task Components: HBase Handler Affects Versions: 0.12.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Fix For: 0.14.0 Attachments: HIVE-6290.1.patch.txt, HIVE-6290.2.patch.txt, HIVE-6290.3.patch.txt Add support for filters to be provided via the composite key class -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7059) Hive 13 shell slow start
Emil A. Siemes created HIVE-7059: Summary: Hive 13 shell slow start Key: HIVE-7059 URL: https://issues.apache.org/jira/browse/HIVE-7059 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.13.0 Reporter: Emil A. Siemes Priority: Minor The shell startup time can be reduced by 2-4s by removing HBase jar files from the classpath in /usr/lib/hive/bin/hive. For interactive queries, saving a couple of seconds is a big gain. Somehow the CLI seems to connect eagerly to ZooKeeper even when it isn't needed, as with hive --version. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table
[ https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-6473: --- Attachment: HIVE-6473.1.patch Attaching same file with different extension, see which one buildbot picks up. Allow writing HFiles via HBaseStorageHandler table -- Key: HIVE-6473 URL: https://issues.apache.org/jira/browse/HIVE-6473 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: HIVE-6473.0.patch.txt, HIVE-6473.1.patch, HIVE-6473.1.patch.txt Generating HFiles for bulkload into HBase could be more convenient. Right now we require the user to register a new table with the appropriate output format. This patch allows the exact same functionality, but through an existing table managed by the HBaseStorageHandler. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator
[ https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995386#comment-13995386 ] Xuefu Zhang commented on HIVE-6901: --- [~ashutoshc] Would you mind reviewing the one-line change? Thanks. Explain plan doesn't show operator tree for the fetch operator -- Key: HIVE-6901 URL: https://issues.apache.org/jira/browse/HIVE-6901 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.12.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Priority: Minor Attachments: HIVE-6109.10.patch, HIVE-6901.1.patch, HIVE-6901.2.patch, HIVE-6901.3.patch, HIVE-6901.4.patch, HIVE-6901.5.patch, HIVE-6901.6.patch, HIVE-6901.7.patch, HIVE-6901.8.patch, HIVE-6901.9.patch, HIVE-6901.patch Explaining a simple select query that involves a MR phase doesn't show processor tree for the fetch operator. {code} hive explain select d from test; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: ... Stage: Stage-0 Fetch Operator limit: -1 {code} It would be nice if the operator tree is shown even if there is only one node. Please note that in local execution, the operator tree is complete: {code} hive explain select * from test; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: TableScan alias: test Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d (type: int) outputColumnNames: _col0 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column stats: NONE ListSink {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7049) Unable to deserialize AVRO data when file schema and record schema are different and nullable
[ https://issues.apache.org/jira/browse/HIVE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997378#comment-13997378 ] Mohammad Kamrul Islam commented on HIVE-7049: - Thanks [~xuefuz] for the comments. I believe it is a valid Avro schema evolution. Please see the following comments copied from the link: http://avro.apache.org/docs/1.7.6/spec.html#Schema+Resolution {noformat} * if reader's is a union, but writer's is not The first schema in the reader's union that matches the writer's schema is recursively resolved against it. If none match, an error is signalled. * if writer's is a union, but reader's is not If the reader's schema matches the selected writer's schema, it is recursively resolved against it. If they do not match, an error is signalled. {noformat} Moreover, I tested a similar scenario using pure Avro code, where I wrote using schema string and read it using [null,string]. Unable to deserialize AVRO data when file schema and record schema are different and nullable - Key: HIVE-7049 URL: https://issues.apache.org/jira/browse/HIVE-7049 Project: Hive Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: HIVE-7049.1.patch It mainly happens when: 1) file schema and record schema are not the same; 2) record schema is nullable but file schema is not. The potential code location is in class AvroDeserializer {noformat} if(AvroSerdeUtils.isNullableType(recordSchema)) { return deserializeNullableUnion(datum, fileSchema, recordSchema, columnType); } {noformat} In the above code snippet, recordSchema is checked for nullability, but the file schema is not. I tested with these values: {noformat} recordSchema= [null,string] fileSchema= string {noformat} And I got the following exception (line numbers might not be the same due to my debugged code version). 
{noformat} org.apache.avro.AvroRuntimeException: Not a union: string at org.apache.avro.Schema.getTypes(Schema.java:272) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserializeNullableUnion(AvroDeserializer.java:275) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.worker(AvroDeserializer.java:205) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.workerBase(AvroDeserializer.java:188) at org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:174) at org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.verifyNullableType(TestAvroDeserializer.java:487) at org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeNullableTypes(TestAvroDeserializer.java:407) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
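[Editor's note] The schema-resolution rule quoted in the comment above can be illustrated with a minimal sketch. This is hypothetical Python (not Hive's actual Java code, and it does not use the Avro library): it models the check the report says is missing — when the reader's (record) schema is a nullable union but the writer's (file) schema is a plain type, the reader must resolve against the matching union branch rather than treating the datum as a union. The function and schema representations are illustrative assumptions.

```python
def is_nullable_union(schema):
    """Treat a schema like ["null", "string"] as a nullable union."""
    return isinstance(schema, list) and "null" in schema

def resolve_reader_branch(file_schema, record_schema):
    """Pick the reader (record) branch that matches the writer (file) schema.

    Mirrors the Avro spec rule quoted above: if the reader's schema is a
    union but the writer's is not, resolve against the first matching
    branch; if none match, signal an error.
    """
    if is_nullable_union(record_schema) and not is_nullable_union(file_schema):
        if file_schema in record_schema:
            return file_schema
        raise ValueError("no branch in %r matches writer schema %r"
                         % (record_schema, file_schema))
    return record_schema

# The failing case from the report: recordSchema=[null,string], fileSchema=string.
print(resolve_reader_branch("string", ["null", "string"]))  # string
```

In this sketch the reported crash corresponds to skipping the `is_nullable_union(file_schema)` test and calling union-handling code (`Schema.getTypes`) on the plain `string` writer schema.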
[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead
[ https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997185#comment-13997185 ] Gopal V commented on HIVE-6430: --- I can confirm that if I do an mvn install once, this problem goes away for a day (always fails exactly only on the first build of the day with the patch). If I had to guess, that's because my maven update interval is once-a-day for snapshots. Once you commit this, the .m2/ version from apache-snapshots will match up and my builds won't break anymore (hopefully). Commit this and if it breaks again for me, I'll post an addendum as a new patch. MapJoin hash table has large memory overhead Key: HIVE-6430 URL: https://issues.apache.org/jira/browse/HIVE-6430 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, HIVE-6430.patch Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 for row) can take several hundred bytes, which is ridiculous. I am reducing the size of MJKey and MJRowContainer in other jiras, but in general we don't need to have java hash table there. We can either use primitive-friendly hashtable like the one from HPPC (Apache-licenced), or some variation, to map primitive keys to single row storage structure without an object per row (similar to vectorization). -- This message was sent by Atlassian JIRA (v6.2#6252)
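[Editor's note] The overhead described in HIVE-6430 — boxed objects costing far more than the primitives they hold — can be demonstrated with a small Python analogy. This is not Hive code; it simply contrasts "one object per value" storage with a flat primitive array, the same trade-off the issue describes for Java hash tables versus a primitive-friendly structure.

```python
import sys
from array import array

# Four ints stored the "one object per value" way: a tuple of boxed ints.
boxed = (1_000_001, 1_000_002, 1_000_003, 1_000_004)
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(v) for v in boxed)

# The same four ints packed into a flat primitive array: 8 bytes each,
# no per-object header and no per-slot pointer.
flat = array("q", boxed)
flat_bytes = flat.itemsize * len(flat)

print(boxed_bytes, flat_bytes)  # boxed storage is several times larger
```

The exact numbers are interpreter-specific, but the packed form is always 32 bytes here while the boxed form carries object headers and references on top of the payload — the same reason the issue proposes mapping primitive keys into a single flat row-storage structure.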
[jira] [Updated] (HIVE-6768) remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6768: Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Patch committed to trunk. Thanks for the contribution Eugene! remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties --- Key: HIVE-6768 URL: https://issues.apache.org/jira/browse/HIVE-6768 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-6768.patch now that MAPREDUCE-5806 is fixed we can remove override-container-log4j.properties and and all the logic around this which was introduced in HIVE-5511 to work around MAPREDUCE-5806 NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7055) config not propagating for PTFOperator
[ https://issues.apache.org/jira/browse/HIVE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997701#comment-13997701 ] Harish Butani commented on HIVE-7055: - +1 config not propagating for PTFOperator -- Key: HIVE-7055 URL: https://issues.apache.org/jira/browse/HIVE-7055 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 0.12.0, 0.13.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-7055.patch e.g. setting hive.join.cache.size has no effect and task nodes always got default value of 25000 -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 21430: HIVE-6994 - parquet-hive createArray strips null elements
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21430/#review42983 --- Ship it! - Szehon Ho On May 14, 2014, 9:16 a.m., justin coffey wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21430/ --- (Updated May 14, 2014, 9:16 a.m.) Review request for hive. Repository: hive-git Description --- The createArray method in ParquetHiveSerDe strips null values from resultant ArrayWritables. This patch: - removes an incorrect if null check in createArray - simplifies ParquetHiveSerDe - total refactor of TestParquetHiveSerDe for better test coverage and easier regression testing Diffs - data/files/parquet_create.txt ccd48ee ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java b689336 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 3b56fc7 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetHiveSerDe.java PRE-CREATION ql/src/test/queries/clientpositive/parquet_create.q 0b976bd ql/src/test/results/clientpositive/parquet_create.q.out 3220be5 Diff: https://reviews.apache.org/r/21430/diff/ Testing --- Thanks, justin coffey
[jira] [Updated] (HIVE-7033) grant statements should check if the role exists
[ https://issues.apache.org/jira/browse/HIVE-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-7033: Attachment: HIVE-7033.1.patch grant statements should check if the role exists Key: HIVE-7033 URL: https://issues.apache.org/jira/browse/HIVE-7033 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7033.1.patch, HIVE-7033.2.patch The following grant statement that grants to a role that does not exist succeeds, but it should result in an error. grant all on t1 to role nosuchrole; -- This message was sent by Atlassian JIRA (v6.2#6252)
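[Editor's note] The behavior HIVE-7033 asks for can be sketched in a few lines. This is a hypothetical model (the `RoleStore` class and its methods are illustrative, not Hive's actual authorization classes): a grant should first verify that the target role exists and fail fast, instead of silently recording a grant to a nonexistent role.

```python
class RoleStore:
    """Toy model of role-based grants with the missing existence check."""

    def __init__(self):
        self.roles = set()
        self.grants = []

    def create_role(self, name):
        # Role names treated case-insensitively, as in SQL-standard auth.
        self.roles.add(name.lower())

    def grant(self, privilege, table, role):
        # The check the issue asks for: error out on unknown roles.
        if role.lower() not in self.roles:
            raise ValueError("role does not exist: " + role)
        self.grants.append((privilege, table, role.lower()))

store = RoleStore()
store.create_role("analyst")
store.grant("all", "t1", "analyst")       # succeeds
# store.grant("all", "t1", "nosuchrole")  # would raise ValueError
```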
[jira] [Updated] (HIVE-7060) Column stats give incorrect min and distinct_count
[ https://issues.apache.org/jira/browse/HIVE-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7060: -- Description: It seems that the result from column statistics isn't correct on two measures for numeric columns: min (which is always 0) and distinct count. Here is an example: {code} select count(distinct avgTimeOnSite), min(avgTimeOnSite) from UserVisits_web_text_none; ... OK 9 1 Time taken: 9.747 seconds, Fetched: 1 row(s) {code} The statistics for the column: {code} desc formatted UserVisits_web_text_none avgTimeOnSite ... # col_name data_type min max num_nulls distinct_count avg_col_len max_col_len num_trues num_falses comment avgTimeOnSite int 0 9 0 11 null nullnull {code} was: It seems that the result from column statistics isn't correct on two measures for numeric columns: min (which is always 0) and distinct count. Here is an example: {code} select count(distinct avgTimeOnSite), min(avgTimeOnSite) from UserVisits_web_text_none; ... OK 9 1 Time taken: 9.747 seconds, Fetched: 1 row(s) {code} The statistics for the column: {code} PREHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite PREHOOK: type: DESCTABLE PREHOOK: Input: default@uservisits_web_text_none POSTHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite POSTHOOK: type: DESCTABLE POSTHOOK: Input: default@uservisits_web_text_none # col_name data_type min max num_nulls distinct_count avg_col_len max_col_len num_trues num_falses comment avgTimeOnSite int 0 9 0 11 null nullnull {code} Column stats give incorrect min and distinct_count -- Key: HIVE-7060 URL: https://issues.apache.org/jira/browse/HIVE-7060 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.13.0 Reporter: Xuefu Zhang It seems that the result from column statistics isn't correct on two measures for numeric columns: min (which is always 0) and distinct count. 
Here is an example: {code} select count(distinct avgTimeOnSite), min(avgTimeOnSite) from UserVisits_web_text_none; ... OK 9 1 Time taken: 9.747 seconds, Fetched: 1 row(s) {code} The statistics for the column: {code} desc formatted UserVisits_web_text_none avgTimeOnSite ... # col_name data_type min max num_nulls distinct_count avg_col_len max_col_len num_trues num_falses comment avgTimeOnSite int 0 9 0 11 null nullnull {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7060) Column stats give incorrect min and distinct_count
Xuefu Zhang created HIVE-7060: - Summary: Column stats give incorrect min and distinct_count Key: HIVE-7060 URL: https://issues.apache.org/jira/browse/HIVE-7060 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.13.0 Reporter: Xuefu Zhang It seems that the result from column statistics isn't correct on two measures for numeric columns: min (which is always 0) and distinct count. Here is an example: {code} select count(distinct avgTimeOnSite), min(avgTimeOnSite) from UserVisits_web_text_none; ... OK 9 1 Time taken: 9.747 seconds, Fetched: 1 row(s) {code} The statistics for the column: {code} PREHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite PREHOOK: type: DESCTABLE PREHOOK: Input: default@uservisits_web_text_none POSTHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite POSTHOOK: type: DESCTABLE POSTHOOK: Input: default@uservisits_web_text_none # col_name data_type min max num_nulls distinct_count avg_col_len max_col_len num_trues num_falses comment avgTimeOnSite int 0 9 0 11 null nullnull {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7055) config not propagating for PTFOperator
Ashutosh Chauhan created HIVE-7055: -- Summary: config not propagating for PTFOperator Key: HIVE-7055 URL: https://issues.apache.org/jira/browse/HIVE-7055 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 0.13.0, 0.12.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan e.g. setting hive.join.cache.size has no effect and task nodes always got default value of 25000 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3159: - Status: Open (was: Patch Available) Looks like a couple tests failed. Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.12.0 Reporter: Jakob Homan Assignee: Mohammad Kamrul Islam Attachments: HIVE-3159.10.patch, HIVE-3159.4.patch, HIVE-3159.5.patch, HIVE-3159.6.patch, HIVE-3159.7.patch, HIVE-3159.9.patch, HIVE-3159v1.patch Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7042) Fix stats_partscan_1_23.q and orc_createas1.q for hadoop-2
[ https://issues.apache.org/jira/browse/HIVE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7042: --- Status: Patch Available (was: Open) Fix stats_partscan_1_23.q and orc_createas1.q for hadoop-2 -- Key: HIVE-7042 URL: https://issues.apache.org/jira/browse/HIVE-7042 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Attachments: HIVE-7042.1.patch, HIVE-7042.1.patch.txt stats_partscan_1_23.q and orc_createas1.q should use HiveInputFormat as opposed to CombineHiveInputFormat. RCFile uses DefaultCodec for compression (uses DEFLATE) which is not splittable. Hence using CombineHiveIF will yield different results for these tests. ORC should use HiveIF to generate ORC splits. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7060) Column stats give incorrect min and distinct_count
[ https://issues.apache.org/jira/browse/HIVE-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997824#comment-13997824 ] Ashutosh Chauhan commented on HIVE-7060: HIVE-4561 and this seem to have the same root cause Column stats give incorrect min and distinct_count -- Key: HIVE-7060 URL: https://issues.apache.org/jira/browse/HIVE-7060 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.13.0 Reporter: Xuefu Zhang It seems that the result from column statistics isn't correct on two measures for numeric columns: min (which is always 0) and distinct count. Here is an example: {code} select count(distinct avgTimeOnSite), min(avgTimeOnSite) from UserVisits_web_text_none; ... OK 9 1 Time taken: 9.747 seconds, Fetched: 1 row(s) {code} The statistics for the column: {code} desc formatted UserVisits_web_text_none avgTimeOnSite ... # col_name data_type min max num_nulls distinct_count avg_col_len max_col_len num_trues num_falses comment avgTimeOnSite int 0 9 0 11 null nullnull {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier
Eugene Koifman created HIVE-7056: Summary: TestPig_11 fails with Pig 12.1 and earlier Key: HIVE-7056 URL: https://issues.apache.org/jira/browse/HIVE-7056 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Eugene Koifman on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is looking for *hcatalog-core-*.jar etc. In Pig 12.1 it's looking for hcatalog-core-*.jar, which doesn't work with Hive 0.13. The TestPig_11 job fails with {noformat} 2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] Failed to parse: Pig script failed to parse: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] 
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678) at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411) at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344) at org.apache.pig.PigServer.executeBatch(PigServer.java:369) at org.apache.pig.PigServer.executeBatch(PigServer.java:355) at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) at org.apache.pig.Main.run(Main.java:478) at org.apache.pig.Main.main(Main.java:156) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] 
at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299) at org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284) at org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158) at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756) at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669) at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102) at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560) at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421) at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188) ... 16 more Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653) at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296) ... 
24 more {noformat} the key to this is {noformat} ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar: No such file or directory ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar: No such file or directory ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-*.jar: No such file or directory ls:
[jira] [Commented] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set
[ https://issues.apache.org/jira/browse/HIVE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997797#comment-13997797 ] Thejas M Nair commented on HIVE-7057: - +1 webhcat e2e deployment scripts don't have x bit set --- Key: HIVE-7057 URL: https://issues.apache.org/jira/browse/HIVE-7057 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7057.patch also, update env.sh to use latest Pig release NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead
[ https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6430: --- Attachment: HIVE-6430.14.patch Reproed it on SVN. It is not related to this patch, fixing anyway. I'm assuming +1 stands... MapJoin hash table has large memory overhead Key: HIVE-6430 URL: https://issues.apache.org/jira/browse/HIVE-6430 Project: Hive Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, HIVE-6430.14.patch, HIVE-6430.patch Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 for row) can take several hundred bytes, which is ridiculous. I am reducing the size of MJKey and MJRowContainer in other jiras, but in general we don't need to have java hash table there. We can either use primitive-friendly hashtable like the one from HPPC (Apache-licenced), or some variation, to map primitive keys to single row storage structure without an object per row (similar to vectorization). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename
[ https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Weeks updated HIVE-6938: --- Attachment: HIVE-6938.3.patch Updated to use global switch until HIVE-6936 is resolved. This means all tables will be treated the same until input formats have access to table properties. Add Support for Parquet Column Rename - Key: HIVE-6938 URL: https://issues.apache.org/jira/browse/HIVE-6938 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.0 Reporter: Daniel Weeks Assignee: Daniel Weeks Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch, HIVE-6938.2.patch, HIVE-6938.3.patch Parquet was originally introduced without 'replace columns' support in ql. In addition, the default behavior for parquet is to access columns by name as opposed to by index by the Serde. Parquet should allow for either columnar (index based) access or name based access because it can support either. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6304) Update HCatReader/Writer docs to reflect recent changes
[ https://issues.apache.org/jira/browse/HIVE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6304: Fix Version/s: (was: 0.13.0) 0.14.0 Update HCatReader/Writer docs to reflect recent changes --- Key: HIVE-6304 URL: https://issues.apache.org/jira/browse/HIVE-6304 Project: Hive Issue Type: Improvement Components: Documentation Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.14.0 HIVE-6248 made changes to the HCatReader and HCatWriter classes. Those changes need to be reflect in the [HCatReader/Writer docs|https://cwiki.apache.org/confluence/display/Hive/HCatalog+ReaderWriter] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7034) Explain result of TezWork is not deterministic
[ https://issues.apache.org/jira/browse/HIVE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993850#comment-13993850 ] Hive QA commented on HIVE-7034: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12643888/HIVE-7034.1.patch.txt {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5433 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/154/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/154/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12643888 Explain result of TezWork is not deterministic -- Key: HIVE-7034 URL: https://issues.apache.org/jira/browse/HIVE-7034 Project: Hive Issue Type: Task Components: Tests Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.14.0 Attachments: HIVE-7034.1.patch.txt Recent failures in Tez tests are caused by the differing iteration order of HashMap implementations. Let's fix that. -- This message was sent by Atlassian JIRA (v6.2#6252)
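[Editor's note] The nondeterminism behind HIVE-7034 is easy to model: a hash map's iteration order is not part of its contract, so any golden-file test that prints a work graph straight from the map can flake. A minimal sketch (illustrative names, not Hive's actual classes) is to emit vertices in a canonical sorted order:

```python
# Two logically identical work graphs built in different insertion orders,
# standing in for Java HashMap's undefined iteration order.
work_a = {"Map 1": (), "Reducer 2": (), "Map 3": ()}
work_b = {"Map 3": (), "Map 1": (), "Reducer 2": ()}

def explain_vertices(work):
    """Emit vertex names in a canonical order so explain diffs are stable."""
    return sorted(work)

assert list(work_a) != list(work_b)  # raw iteration order differs
print(explain_vertices(work_a))      # ['Map 1', 'Map 3', 'Reducer 2']
print(explain_vertices(work_b))      # ['Map 1', 'Map 3', 'Reducer 2']
```

Sorting (or any other fixed traversal order, such as topological order) makes the explain output independent of how the map happened to be populated.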
[jira] [Updated] (HIVE-7015) Failing to inherit group/permission should not fail the operation
[ https://issues.apache.org/jira/browse/HIVE-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7015: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Failing to inherit group/permission should not fail the operation - Key: HIVE-7015 URL: https://issues.apache.org/jira/browse/HIVE-7015 Project: Hive Issue Type: Bug Components: Security Affects Versions: 0.14.0 Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 0.14.0 Attachments: HIVE-7015.patch In the previous changes, chgrp and chmod were put on the critical path of directory creation and file copy/mv. These should not be: for instance, existing users may not have hive users in the same group as the hive group, so chgrp would fail if they turn on the flag hive.warehouse.subdir.inherit.perms. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5370) format_number udf should take user specified format as argument
[ https://issues.apache.org/jira/browse/HIVE-5370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5370: Fix Version/s: (was: 0.13.0) 0.14.0 format_number udf should take user specified format as argument -- Key: HIVE-5370 URL: https://issues.apache.org/jira/browse/HIVE-5370 Project: Hive Issue Type: Improvement Components: UDF Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Priority: Minor Fix For: 0.14.0 Attachments: D13185.1.patch, D13185.2.patch, HIVE-5370.patch, HIVE-5370.patch Currently, format_number udf formats the number to #,###,###.##, but it should also take a user specified format as optional input. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler
[ https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992856#comment-13992856 ] Swarnim Kulkarni commented on HIVE-6411: RB updated. Support more generic way of using composite key for HBaseHandler Key: HIVE-6411 URL: https://issues.apache.org/jira/browse/HIVE-6411 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6411.1.patch.txt, HIVE-6411.10.patch.txt, HIVE-6411.11.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt, HIVE-6411.4.patch.txt, HIVE-6411.5.patch.txt, HIVE-6411.6.patch.txt, HIVE-6411.7.patch.txt, HIVE-6411.8.patch.txt, HIVE-6411.9.patch.txt HIVE-2599 introduced using custom object for the row key. But it forces key objects to extend HBaseCompositeKey, which is again extension of LazyStruct. If user provides proper Object and OI, we can replace internal key and keyOI with those. Initial implementation is based on factory interface. {code} public interface HBaseKeyFactory { void init(SerDeParameters parameters, Properties properties) throws SerDeException; ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException; LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException; } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5342) Remove pre hadoop-0.20.0 related codes
[ https://issues.apache.org/jira/browse/HIVE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-5342: - Attachment: HIVE-5342.2.patch The test report link didn't look like it was for this patch. Uploading patch again. Remove pre hadoop-0.20.0 related codes -- Key: HIVE-5342 URL: https://issues.apache.org/jira/browse/HIVE-5342 Project: Hive Issue Type: Task Reporter: Navis Assignee: Jason Dere Priority: Trivial Attachments: D13047.1.patch, HIVE-5342.1.patch, HIVE-5342.2.patch, HIVE-5342.2.patch Recently, we discussed dropping support for hadoop-0.20.0. Whether or not that happens, the 0.17-related code can be removed first. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6846) allow safe set commands with sql standard authorization
[ https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993106#comment-13993106 ] Thejas M Nair commented on HIVE-6846: - That failure does not indicate a product problem. In fact there is no reason to set local scratch dirs to 777. That change was part of HIVE-5486. The idea is that in HS2, with doAs enabled, creation of files/subdirs under the scratch dir happens as the end user. But in the case of the local file system this is not true; all file creation happens as the HS2 server user. There are some changes that Vaibhav and Vikram have been working on to create the base scratch dir as the actual user running the query. That will help address this issue. The test issue is already gone in trunk. I don't think this should block the 0.13.1 release. allow safe set commands with sql standard authorization --- Key: HIVE-6846 URL: https://issues.apache.org/jira/browse/HIVE-6846 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.13.0 Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch, HIVE-6846.3.patch HIVE-6827 disables all set commands when SQL standard authorization is turned on, but not all set commands are unsafe. We should allow safe set commands. -- This message was sent by Atlassian JIRA (v6.2#6252)
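The "safe set commands" idea amounts to a whitelist check before a set command is applied. A minimal sketch, assuming a placeholder whitelist; the actual list of safe parameters is decided by the patch, not by this example.

```python
# Placeholder whitelist; the real safe list is defined by the patch.
SAFE_SET_PARAMS = {
    "hive.exec.parallel",
    "hive.exec.reducers.max",
    "mapreduce.job.queuename",
}

def is_set_allowed(param, sql_std_auth_enabled=True):
    """Allow `set param=...` only for whitelisted parameters when
    SQL standard authorization is enabled."""
    return (not sql_std_auth_enabled) or param in SAFE_SET_PARAMS

print(is_set_allowed("mapreduce.job.queuename"))              # True
print(is_set_allowed("hive.security.authorization.manager"))  # False
```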
[jira] [Commented] (HIVE-5664) Drop cascade database fails when the db has any tables with indexes
[ https://issues.apache.org/jira/browse/HIVE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994148#comment-13994148 ] Hive QA commented on HIVE-5664: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12610898/HIVE-5664.1.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5436 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database_drop org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/157/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/157/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12610898 Drop cascade database fails when the db has any tables with indexes --- Key: HIVE-5664 URL: https://issues.apache.org/jira/browse/HIVE-5664 Project: Hive Issue Type: Bug Components: Indexing, Metastore Affects Versions: 0.10.0, 0.11.0, 0.12.0 Reporter: Venki Korukanti Assignee: Venki Korukanti Fix For: 0.14.0 Attachments: HIVE-5664.1.patch.txt {code} CREATE DATABASE db2; USE db2; CREATE TABLE tab1 (id int, name string); CREATE INDEX idx1 ON TABLE tab1(id) as 'COMPACT' with DEFERRED REBUILD IN TABLE tab1_indx; DROP DATABASE db2 CASCADE; {code} Last DDL fails with the following error: {code} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. 
Database does not exist: db2 Hive.log has following exception 2013-10-27 20:46:16,629 ERROR exec.DDLTask (DDLTask.java:execute(434)) - org.apache.hadoop.hive.ql.metadata.HiveException: Database does not exist: db2 at org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3473) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:231) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1441) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1219) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1047) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:915) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:160) Caused by: NoSuchObjectException(message:db2.tab1_indx table not found) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1376) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103) at com.sun.proxy.$Proxy7.get_table(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:890) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:660) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:652) at
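The trace shows the cascade failing because tab1_indx was already dropped together with its parent table and is then looked up again. A toy model of a cascade drop that tolerates tables removed as a side effect; this is an illustration of the failure mode and its remedy, not the actual metastore fix.

```python
def drop_table(store, db, name):
    """Dropping an indexed table also drops its index tables,
    mirroring the side effect seen in the exception above."""
    tbl = store[db].pop(name)
    for idx in tbl.get("index_tables", []):
        store[db].pop(idx, None)

def drop_database_cascade(store, db):
    for name in list(store[db]):
        if name in store[db]:  # may already be gone mid-iteration
            drop_table(store, db, name)
    del store[db]

# Toy metastore: db2 holds tab1 and its index table tab1_indx.
store = {"db2": {"tab1": {"index_tables": ["tab1_indx"]}, "tab1_indx": {}}}
drop_database_cascade(store, "db2")
print(store)  # {}
```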
[jira] [Commented] (HIVE-6900) HostUtil.getTaskLogUrl signature change causes compilation to fail
[ https://issues.apache.org/jira/browse/HIVE-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993250#comment-13993250 ] Szehon Ho commented on HIVE-6900: - [~jdere] We were looking at this change and were wondering, why is the log talking about MR1 when it is a Hadoop23Shims? Isn't the condition of the else statement still MR2, but to handle local mode case? Seems like TaskServlet is in MR1. HostUtil.getTaskLogUrl signature change causes compilation to fail -- Key: HIVE-6900 URL: https://issues.apache.org/jira/browse/HIVE-6900 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 0.13.0, 0.14.0 Reporter: Chris Drome Assignee: Jason Dere Fix For: 0.14.0 Attachments: HIVE-6900.1.patch.txt, HIVE-6900.2.patch The signature for HostUtil.getTaskLogUrl has changed between Hadoop-2.3 and Hadoop-2.4. Code in shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java works with Hadoop-2.3 method and causes compilation failure with Hadoop-2.4. -- This message was sent by Atlassian JIRA (v6.2#6252)
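A signature change like this is typically absorbed in a shim that detects at runtime which form of the method the linked library exposes. A Python sketch of that probing idea; the two stand-in functions are invented, and the real Hadoop signatures differ in other details.

```python
import inspect

# Stand-ins for the two incompatible library versions (illustrative).
def get_task_log_url_v23(host, port, task_id):
    return f"http://{host}:{port}/tasklog?id={task_id}"

def get_task_log_url_v24(scheme, host, port, task_id):
    return f"{scheme}://{host}:{port}/tasklog?id={task_id}"

def call_get_task_log_url(fn, host, port, task_id, scheme="http"):
    """Adapt to whichever signature the linked version exposes, the
    way a Java shim probes methods via reflection."""
    n_params = len(inspect.signature(fn).parameters)
    if n_params == 4:
        return fn(scheme, host, port, task_id)
    return fn(host, port, task_id)

print(call_get_task_log_url(get_task_log_url_v23, "nm1", 8042, "t_01"))
print(call_get_task_log_url(get_task_log_url_v24, "nm1", 8042, "t_01"))
```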
[jira] [Updated] (HIVE-6985) sql std auth - privileges grants to public role not being honored
[ https://issues.apache.org/jira/browse/HIVE-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6985: --- Fix Version/s: 0.13.1 sql std auth - privileges grants to public role not being honored - Key: HIVE-6985 URL: https://issues.apache.org/jira/browse/HIVE-6985 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Affects Versions: 0.13.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Priority: Critical Fix For: 0.14.0, 0.13.1 Attachments: HIVE-6985.1.patch, HIVE-6985.2.patch, HIVE-6985.3.patch When a privilege is granted to the public role, the privilege is supposed to be applicable to all users. However, the privilege check fails for users even if they have the public role in the list of current roles. Note that the issue is only with the public role. Grants of privileges to other roles are not affected. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6826) Hive-tez has issues when different partitions work off of different input types
[ https://issues.apache.org/jira/browse/HIVE-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992672#comment-13992672 ] Sushanth Sowmyan commented on HIVE-6826: Yup, on re-testing, it seems to pass on my setup as well. I'm going to ignore the initial failure report. Thanks for checking up on it! Hive-tez has issues when different partitions work off of different input types --- Key: HIVE-6826 URL: https://issues.apache.org/jira/browse/HIVE-6826 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.14.0 Attachments: HIVE-6826.1.patch, HIVE-6826.2.patch create table test (key int, value string) partitioned by (p int) stored as textfile; insert into table test partition (p=1) select * from src limit 10; alter table test set fileformat orc; insert into table test partition (p=2) select * from src limit 10; describe test; select * from test where p=1 and key > 0; select * from test where p=2 and key > 0; select * from test where key > 0; throws a ClassCastException -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6910) Invalid column access info for partitioned table
[ https://issues.apache.org/jira/browse/HIVE-6910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994033#comment-13994033 ] Hive QA commented on HIVE-6910: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12643891/HIVE-6910.4.patch.txt {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5433 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/155/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/155/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12643891 Invalid column access info for partitioned table Key: HIVE-6910 URL: https://issues.apache.org/jira/browse/HIVE-6910 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.11.0, 0.12.0, 0.13.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-6910.1.patch.txt, HIVE-6910.2.patch.txt, HIVE-6910.3.patch.txt, HIVE-6910.4.patch.txt From http://www.mail-archive.com/user@hive.apache.org/msg11324.html neededColumnIDs in TS is only for non-partition columns. But ColumnAccessAnalyzer is calculating it on all columns. -- This message was sent by Atlassian JIRA (v6.2#6252)
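The bug description can be made concrete: neededColumnIDs in a TableScan index into the non-partition columns only, so they must be resolved against that narrowed list; resolving them against all columns garbles the access info. A sketch with illustrative names:

```python
def needed_column_names(all_columns, partition_columns, needed_ids):
    """Resolve TableScan neededColumnIDs, which index into the
    NON-partition columns, back to column names."""
    data_columns = [c for c in all_columns if c not in partition_columns]
    return [data_columns[i] for i in needed_ids]

cols = ["key", "value", "ds"]  # ds is a partition column
print(needed_column_names(cols, {"ds"}, [1]))  # ['value']
```

Resolving id 1 against all columns would have (wrongly) returned `value` here too, but with more partition columns the ids shift and the reported column access becomes wrong, which is the behavior the JIRA describes.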
[jira] [Commented] (HIVE-7030) Remove hive.hadoop.classpath from hiveserver2.cmd
[ https://issues.apache.org/jira/browse/HIVE-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993815#comment-13993815 ] Vaibhav Gumashta commented on HIVE-7030: Committed to trunk. Thanks for the contribution [~hsubramaniyan]! Thanks for pointing out the issue [~leftylev]! Remove hive.hadoop.classpath from hiveserver2.cmd - Key: HIVE-7030 URL: https://issues.apache.org/jira/browse/HIVE-7030 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 0.14.0 Attachments: HIVE-7030.1.patch This parameter is not used anywhere and should be removed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5315) Cannot attach debugger to Hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994197#comment-13994197 ] Hive QA commented on HIVE-5315: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12605156/HIVE-5315.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5436 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/161/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/161/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12605156 Cannot attach debugger to Hiveserver2 -- Key: HIVE-5315 URL: https://issues.apache.org/jira/browse/HIVE-5315 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta Fix For: 0.14.0 Attachments: HIVE-5315.patch In the current implementation, bin/hive retrieves HADOOP_VERSION as follows {code} HADOOP_VERSION=$($HADOOP version | awk '{if (NR == 1) {print $2;}}'); {code} But sometimes hadoop version doesn't show the version information on the first line. If HADOOP_VERSION is not retrieved correctly, Hive or related processes will not start. 
I faced this situation when I tried to debug Hiveserver2 with debug options like the following {code} -Xdebug -Xrunjdwp:transport=dt_socket,suspend=n,server=y,address=9876 {code} Then hadoop version shows -Xdebug... on the first line. -- This message was sent by Atlassian JIRA (v6.2#6252)
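A more robust retrieval would search all of the `hadoop version` output for the line that actually starts with "Hadoop", instead of assuming it is line one. A Python sketch of that parsing; the eventual bin/hive fix may differ, and the sample debug line below is illustrative.

```python
import re

def parse_hadoop_version(version_output):
    """Pick out the 'Hadoop X.Y.Z' line wherever it appears instead
    of assuming it is the first line of `hadoop version` output."""
    for line in version_output.splitlines():
        m = re.match(r"Hadoop\s+(\S+)", line)
        if m:
            return m.group(1)
    return None

# With JDWP options set, an extra line precedes the version line.
sample = ("Listening for transport dt_socket at address: 9876\n"
          "Hadoop 2.4.0\n"
          "Subversion ...\n")
print(parse_hadoop_version(sample))  # 2.4.0
```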
[jira] [Updated] (HIVE-4867) Deduplicate columns appearing in both the key list and value list of ReduceSinkOperator
[ https://issues.apache.org/jira/browse/HIVE-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4867: Status: Patch Available (was: Open) Deduplicate columns appearing in both the key list and value list of ReduceSinkOperator --- Key: HIVE-4867 URL: https://issues.apache.org/jira/browse/HIVE-4867 Project: Hive Issue Type: Improvement Reporter: Yin Huai Assignee: Yin Huai Attachments: HIVE-4867.1.patch.txt A ReduceSinkOperator emits data in the format of keys and values. Right now, a column may appear in both the key list and value list, which results in unnecessary overhead for shuffling. Example: We have a query shown below ... {code:sql} explain select ss_ticket_number from store_sales cluster by ss_ticket_number; {code} The plan is ... {code} STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias - Map Operator Tree: store_sales TableScan alias: store_sales Select Operator expressions: expr: ss_ticket_number type: int outputColumnNames: _col0 Reduce Output Operator key expressions: expr: _col0 type: int sort order: + Map-reduce partition columns: expr: _col0 type: int tag: -1 value expressions: expr: _col0 type: int Reduce Operator Tree: Extract File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Stage: Stage-0 Fetch Operator limit: -1 {code} The column 'ss_ticket_number' is in both the key list and value list of the ReduceSinkOperator. The type of ss_ticket_number is int. For this case, BinarySortableSerDe will introduce 1 byte more for every int in the key. LazyBinarySerDe will also introduce overhead when recording the length of an int. For every int, 10 bytes should be a rough estimation of the size of data emitted from the Map phase. -- This message was sent by Atlassian JIRA (v6.2#6252)
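The deduplication can be pictured as rewriting the value list so that columns already present in the key are emitted as references into the key rather than serialized a second time. A toy sketch; the data structures are invented, not Hive's actual operator plan representation.

```python
def deduplicate_value_columns(key_cols, value_cols):
    """Rewrite the value list so columns already in the key become
    references into the key and are serialized only once."""
    out = []
    for col in value_cols:
        if col in key_cols:
            out.append(("KEY", key_cols.index(col)))  # reference key slot
        else:
            out.append(("VALUE", col))                # shuffle as before
    return out

# _col0 (ss_ticket_number) appears in both lists in the plan above.
print(deduplicate_value_columns(["_col0"], ["_col0"]))  # [('KEY', 0)]
```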
[jira] [Updated] (HIVE-7043) When using the tez session pool via hive, once sessions time out, all queries go to the default queue
[ https://issues.apache.org/jira/browse/HIVE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-7043: - Status: Patch Available (was: Open) When using the tez session pool via hive, once sessions time out, all queries go to the default queue - Key: HIVE-7043 URL: https://issues.apache.org/jira/browse/HIVE-7043 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.14.0 Attachments: HIVE-7043.1.patch When using a tez session pool to run multiple queries, once the sessions time out, we always end up using the default queue to launch queries. The load balancing doesn't work in this case. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-2365) SQL support for bulk load into HBase
[ https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2365: Fix Version/s: (was: 0.13.0) 0.14.0 SQL support for bulk load into HBase Key: HIVE-2365 URL: https://issues.apache.org/jira/browse/HIVE-2365 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: John Sichi Assignee: Nick Dimiduk Fix For: 0.14.0 Attachments: HIVE-2365.2.patch.txt, HIVE-2365.WIP.00.patch, HIVE-2365.WIP.01.patch, HIVE-2365.WIP.01.patch Support SQL as simple as this for bulk load from Hive into HBase. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5150) UnsatisfiedLinkError when running hive unit tests on Windows
[ https://issues.apache.org/jira/browse/HIVE-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5150: Fix Version/s: (was: 0.13.0) 0.14.0 UnsatisfiedLinkError when running hive unit tests on Windows Key: HIVE-5150 URL: https://issues.apache.org/jira/browse/HIVE-5150 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.12.0 Environment: Windows Reporter: shanyu zhao Assignee: shanyu zhao Fix For: 0.14.0 Attachments: HIVE-5150.patch When running any hive unit tests against hadoop 2.0, it will fail with error like this: [junit] Exception in thread main java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z [junit] at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method) [junit] at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:423) [junit] at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:933) [junit] at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:177) [junit] at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:164) This is due to the test process failed to find hadoop.dll. This is related to YARN-729. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7000) Several issues with javadoc generation
[ https://issues.apache.org/jira/browse/HIVE-7000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7000: --- Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Harish! Several issues with javadoc generation -- Key: HIVE-7000 URL: https://issues.apache.org/jira/browse/HIVE-7000 Project: Hive Issue Type: Improvement Reporter: Harish Butani Assignee: Harish Butani Fix For: 0.14.0 Attachments: HIVE-7000.1.patch 1. Ran 'mvn javadoc:javadoc -Phadoop-2'. Encountered several issues - Generated classes are included in the javadoc - generation fails in the top level hcatalog folder because its src folder contains no java files. Patch attached to fix these issues. 2. Tried mvn javadoc:aggregate -Phadoop-2 - cannot get an aggregated javadoc for all of hive - tried setting 'aggregate' parameter to true. Didn't work There are several questions in StackOverflow about multiple project javadoc. Seems like this is broken. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6549) remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6549: - Attachment: HIVE-6549.2.patch addressed [~thejas]'s comments remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh - Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor Attachments: HIVE-6549.2.patch, HIVE-6549.patch This property is no longer used; the corresponding AppConfig.TEMPLETON_JAR_NAME is also removed. hcatalog/bin/hive-config.sh is not used. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5810) create a function add_date as exists in mysql
[ https://issues.apache.org/jira/browse/HIVE-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997272#comment-13997272 ] Anandha L Ranganathan commented on HIVE-5810: - The existing DATE_ADD function does not support adding day, month, or year as the unit; the expr unit is always days. But this function supports DAY, MONTH, and YEAR: ADD_DATE(date, unit, interval). create a function add_date as exists in mysql Key: HIVE-5810 URL: https://issues.apache.org/jira/browse/HIVE-5810 Project: Hive Issue Type: Improvement Reporter: Anandha L Ranganathan Assignee: Anandha L Ranganathan Attachments: HIVE-5810.2.patch, HIVE-5810.patch Original Estimate: 40h Remaining Estimate: 40h MySQL has ADDDATE(date,INTERVAL expr unit). Similarly in Hive we can have (date,unit,expr). Here the unit is DAY/MONTH/YEAR. For example, add_date('2013-11-09','DAY',2) will return 2013-11-11. add_date('2013-11-09','Month',2) will return 2014-01-09. add_date('2013-11-09','Year',2) will return 2015-11-09. -- This message was sent by Atlassian JIRA (v6.2#6252)
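The proposed semantics can be sketched with the standard library. One assumption is made that the JIRA does not specify: month and year arithmetic clamps the day to the last day of the target month (e.g. Jan 31 + 1 month = Feb 28).

```python
import calendar
import datetime

def add_date(date_str, unit, interval):
    """Illustrative add_date(date, unit, interval) with DAY/MONTH/YEAR
    support, unlike DATE_ADD which only adds days."""
    d = datetime.date.fromisoformat(date_str)
    unit = unit.upper()
    if unit == "DAY":
        d = d + datetime.timedelta(days=interval)
    elif unit in ("MONTH", "YEAR"):
        months = interval * (12 if unit == "YEAR" else 1)
        total = d.month - 1 + months
        year, month = d.year + total // 12, total % 12 + 1
        # Clamp the day to the end of the target month (assumption).
        day = min(d.day, calendar.monthrange(year, month)[1])
        d = datetime.date(year, month, day)
    else:
        raise ValueError(f"unsupported unit: {unit}")
    return d.isoformat()

print(add_date("2013-11-09", "DAY", 2))    # 2013-11-11
print(add_date("2013-11-09", "MONTH", 2))  # 2014-01-09
print(add_date("2013-11-09", "YEAR", 2))   # 2015-11-09
```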
[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler
[ https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996823#comment-13996823 ] Swarnim Kulkarni commented on HIVE-6411: Thanks for the review Xuefu. I think we can now mark HIVE-6290 as resolved as well, since the patch for it was included as part of this JIRA. Support more generic way of using composite key for HBaseHandler Key: HIVE-6411 URL: https://issues.apache.org/jira/browse/HIVE-6411 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6411.1.patch.txt, HIVE-6411.10.patch.txt, HIVE-6411.11.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt, HIVE-6411.4.patch.txt, HIVE-6411.5.patch.txt, HIVE-6411.6.patch.txt, HIVE-6411.7.patch.txt, HIVE-6411.8.patch.txt, HIVE-6411.9.patch.txt HIVE-2599 introduced using a custom object for the row key, but it forces key objects to extend HBaseCompositeKey, which is again an extension of LazyStruct. If the user provides a proper Object and OI, we can replace the internal key and keyOI with those. The initial implementation is based on a factory interface. {code} public interface HBaseKeyFactory { void init(SerDeParameters parameters, Properties properties) throws SerDeException; ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException; LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException; } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7056) WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13
[ https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7056: - Summary: WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13 (was: TestPig_11 fails with Pig 12.1 and earlier) WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13 --- Key: HIVE-7056 URL: https://issues.apache.org/jira/browse/HIVE-7056 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is looking for *hcatalog-core-*.jar etc. In Pig 12.1 it's looking for hcatalog-core-*.jar, which doesn't work with Hive 0.13. The TestPig_11 job fails with {noformat} 2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] Failed to parse: Pig script failed to parse: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] 
at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678) at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411) at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344) at org.apache.pig.PigServer.executeBatch(PigServer.java:369) at org.apache.pig.PigServer.executeBatch(PigServer.java:355) at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) at org.apache.pig.Main.run(Main.java:478) at org.apache.pig.Main.main(Main.java:156) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] 
at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299) at org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284) at org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158) at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756) at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669) at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102) at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560) at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421) at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188) ... 16 more Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653) at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296) ... 24 more {noformat} the key to this is {noformat} ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar: No such file or directory ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar:
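The two glob patterns behave differently because the Hive 0.13 jar carries a prefix before "hcatalog-core" (assumed here to be hive-hcatalog-core-&lt;version&gt;.jar). fnmatch demonstrates the mismatch between trunk's pattern and Pig 12.1's pattern:

```python
import fnmatch

jar = "hive-hcatalog-core-0.13.0.jar"  # assumed Hive 0.13 jar name

# Trunk's pig script pattern matches; Pig 12.1's pattern anchors at
# the start of the filename and misses the 'hive-' prefix.
print(fnmatch.fnmatch(jar, "*hcatalog-core-*.jar"))  # True
print(fnmatch.fnmatch(jar, "hcatalog-core-*.jar"))   # False
```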
[jira] [Updated] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found
[ https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7035: - Status: Patch Available (was: Open) Templeton returns 500 for user errors - when job cannot be found Key: HIVE-7035 URL: https://issues.apache.org/jira/browse/HIVE-7035 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7035.patch curl -i 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman' should return HTTP Status code 4xx when no such job exists; it currently returns 500. {noformat} {error:org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_201304291205_0015' doesn't exist in RM.\r\n\tat org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)\r\n\tat org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)\r\n\tat org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\r\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\r\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\r\n\tat java.security.AccessController.doPrivileged(Native Method)\r\n\tat javax.security.auth.Subject.doAs(Subject.java:415)\r\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\r\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\r\n} {noformat} NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
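The fix amounts to translating backend "not found" exceptions into a 4xx response (e.g. 404) instead of letting them surface as a generic 500. A sketch using the exception name from the trace; the handler and mapping here are illustrative, not Templeton's actual code.

```python
class ApplicationNotFoundException(Exception):
    """Stand-in for the YARN exception seen in the error above."""

def job_status(fetch_report, job_id):
    """Map a missing application to 404 instead of a blanket 500."""
    try:
        return 200, fetch_report(job_id)
    except ApplicationNotFoundException as e:
        return 404, {"error": str(e)}
    except Exception as e:
        return 500, {"error": str(e)}

def fetch_report(job_id):
    raise ApplicationNotFoundException(
        f"Application with id '{job_id}' doesn't exist in RM.")

status, body = job_status(fetch_report, "application_201304291205_0015")
print(status)  # 404
```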
[jira] [Commented] (HIVE-6965) Transaction manager should use RDBMS time instead of machine time
[ https://issues.apache.org/jira/browse/HIVE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998111#comment-13998111 ] Ashutosh Chauhan commented on HIVE-6965: +1 Transaction manager should use RDBMS time instead of machine time - Key: HIVE-6965 URL: https://issues.apache.org/jira/browse/HIVE-6965 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-6965.patch Current TxnHandler and CompactionTxnHandler use System.currentTimeMillis() when they need to determine the time (such as heartbeating transactions). In situations where there are multiple Thrift metastore services or users are using an embedded metastore this will lead to issues. We should instead be using time from the RDBMS, which is guaranteed to be the same for all users. -- This message was sent by Atlassian JIRA (v6.2#6252)
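The change is to ask the database for "now" over the existing connection rather than reading the local clock. A sketch with sqlite3 standing in for the metastore RDBMS; a production backend would use its own CURRENT_TIMESTAMP equivalent.

```python
import sqlite3

def db_now_epoch(conn):
    """Read 'now' from the RDBMS over the shared connection, so every
    metastore instance agrees, instead of System.currentTimeMillis()."""
    (epoch,) = conn.execute("SELECT strftime('%s', 'now')").fetchone()
    return int(epoch)

conn = sqlite3.connect(":memory:")
print(db_now_epoch(conn) > 0)  # True
```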
[jira] [Created] (HIVE-7066) hive-exec jar is missing avro-mapred
David Chen created HIVE-7066:
--------------------------------
Summary: hive-exec jar is missing avro-mapred
Key: HIVE-7066
URL: https://issues.apache.org/jira/browse/HIVE-7066
Project: Hive
Issue Type: Bug
Reporter: David Chen

Running a simple query that reads an Avro table caused the following exception to be thrown on the cluster side:

{code}
java.lang.RuntimeException: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Serialization trace:
outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365)
	at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:191)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:394)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Serialization trace:
outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
	at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942)
	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850)
	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864)
	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334)
	... 13 more
Caused by: java.lang.IllegalArgumentException: Unable to create serializer org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
	at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45)
	at org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336)
	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476)
	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:148)
	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
	at
{code}
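The trace suggests AvroContainerOutputFormat cannot be instantiated on the cluster because its Avro MapReduce dependencies are not packaged with hive-exec. One plausible remedy, stated as an assumption rather than the committed fix, is declaring avro-mapred so it gets bundled into the hive-exec assembly; the Maven coordinates are real, but the version property and classifier here are illustrative:

```xml
<!-- Sketch: declare avro-mapred so the shaded hive-exec jar carries it.
     The version property and hadoop2 classifier are assumptions. -->
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-mapred</artifactId>
  <version>${avro.version}</version>
  <classifier>hadoop2</classifier>
</dependency>
```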
[jira] [Commented] (HIVE-6846) allow safe set commands with sql standard authorization
[ https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993126#comment-13993126 ]

Sushanth Sowmyan commented on HIVE-6846:
----------------------------------------
Awesome, good to hear. I will not consider this a blocker for 0.13.1. Thanks!

allow safe set commands with sql standard authorization
-------------------------------------------------------
Key: HIVE-6846
URL: https://issues.apache.org/jira/browse/HIVE-6846
Project: Hive
Issue Type: Bug
Components: Authorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Fix For: 0.13.0
Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch, HIVE-6846.3.patch

HIVE-6827 disables all set commands when SQL standard authorization is turned on, but not all set commands are unsafe. We should allow safe set commands.
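The likely shape of the fix is a whitelist of parameters that remain settable under SQL standard authorization. Assuming the controlling property is hive.security.authorization.sqlstd.confwhitelist (treat the name and value below as illustrative, not taken from the patch), a deployment might configure something like:

```xml
<!-- Hypothetical hive-site.xml fragment: a regex of parameters users may
     still "set" under SQL standard authorization; value is illustrative. -->
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist</name>
  <value>hive\.exec\.reducers\..*|mapred\.reduce\.tasks</value>
</property>
```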
[jira] [Commented] (HIVE-7025) TTL on hive tables
[ https://issues.apache.org/jira/browse/HIVE-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992808#comment-13992808 ]

Edward Capriolo commented on HIVE-7025:
---------------------------------------
We do something similar; however, we also have the ability to delete partitions over a certain age. Hive already has a property inside every table called retention that we could consider using. This code is a good first step, but I have one question: isn't this code rather racy? If we have multiple CLIs running threads, they could all be simultaneously deleting tables, and a CLI on a system with a misconfigured clock could potentially delete all the tables. I think if we do this it should be a stand-alone piece.

TTL on hive tables
------------------
Key: HIVE-7025
URL: https://issues.apache.org/jira/browse/HIVE-7025
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-7025.1.patch.txt

Add self-destruction properties for temporary tables.
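The TTL predicate itself is simple; the hard part is the coordination the comment raises. A hypothetical sketch of the age check (names are illustrative, not Hive's API):

```python
import time

# Hypothetical sketch of a TTL check, not Hive's actual code. The commenter's
# concern is visible here: if `now` comes from a machine with a misconfigured
# clock, every table can look expired, which is why a single coordinated
# sweeper (or RDBMS time, cf. HIVE-6965) is safer than per-CLI deletion.
def is_expired(created_at, retention_seconds, now=None):
    """True once a table's age exceeds its retention; 0 means keep forever."""
    if now is None:
        now = time.time()
    return retention_seconds > 0 and (now - created_at) > retention_seconds
```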
[jira] [Updated] (HIVE-5538) Turn on vectorization by default.
[ https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-5538:
---------------------------------------
    Attachment: HIVE-5538.5.patch

Turn on vectorization by default.
---------------------------------
Key: HIVE-5538
URL: https://issues.apache.org/jira/browse/HIVE-5538
Project: Hive
Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch, HIVE-5538.3.patch, HIVE-5538.4.patch, HIVE-5538.5.patch

Vectorization should be turned on by default so that users don't have to enable it explicitly. The vectorization code validates each query and falls back to row mode if the query is not supported on the vectorized code path.
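Until the default flips, vectorized execution remains opt-in per session via the existing configuration flag:

```sql
-- opt in to vectorized execution; unsupported queries fall back to row mode
set hive.vectorized.execution.enabled = true;
```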