[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier

2014-05-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7056:
-

Affects Version/s: 0.13.0

 TestPig_11 fails with Pig 12.1 and earlier
 --

 Key: HIVE-7056
 URL: https://issues.apache.org/jira/browse/HIVE-7056
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 On trunk, the pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) 
 looks for *hcatalog-core-*.jar etc.  In Pig 12.1 it looks for 
 hcatalog-core-*.jar, which doesn't work with Hive 0.13.
 The TestPig_11 job fails with
 {noformat}
 2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception 
 during parsing: Error during parsing. Could not resolve 
 org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
 org.apache.pig.builtin., org.apache.pig.impl.builtin.]
 Failed to parse: Pig script failed to parse: 
 file hcatloadstore.pig, line 19, column 34 pig script failed to validate: 
 org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
 resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
 org.apache.pig.builtin., org.apache.pig.impl.builtin.]
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
   at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
   at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
   at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
   at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
   at 
 org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
   at org.apache.pig.Main.run(Main.java:478)
   at org.apache.pig.Main.main(Main.java:156)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: 
 file hcatloadstore.pig, line 19, column 34 pig script failed to validate: 
 org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
 resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
 org.apache.pig.builtin., org.apache.pig.impl.builtin.]
   at 
 org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
   at 
 org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
   ... 16 more
 Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: 
 Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, 
 java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
   at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
   at 
 org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
   ... 24 more
 {noformat}
 The key to this is 
 {noformat}
 ls: 
 /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar:
  No such file or directory
 ls: 
 /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar:
  No such file or directory
 ls: 
 

[jira] [Updated] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set

2014-05-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7057:
-

Status: Patch Available  (was: Open)

 webhcat e2e deployment scripts don't have x bit set
 ---

 Key: HIVE-7057
 URL: https://issues.apache.org/jira/browse/HIVE-7057
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-7057.patch


 also, update env.sh to use latest Pig release
 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6965) Transaction manager should use RDBMS time instead of machine time

2014-05-14 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-6965:
-

Attachment: HIVE-6965.patch

 Transaction manager should use RDBMS time instead of machine time
 -

 Key: HIVE-6965
 URL: https://issues.apache.org/jira/browse/HIVE-6965
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-6965.patch


 Current TxnHandler and CompactionTxnHandler use System.currentTimeMillis() 
 when they need to determine the time (such as heartbeating transactions).  In 
 situations where there are multiple Thrift metastore services or users are 
 using an embedded metastore this will lead to issues.  We should instead be 
 using time from the RDBMS, which is guaranteed to be the same for all users.
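
 To illustrate why a single time source matters, here is a minimal sketch. It is not the TxnHandler code; the class name, the isExpired helper, and the fixed timestamps are all hypothetical, and the "RDBMS clock" is simulated with a constant rather than a real database query.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch: heartbeat expiry decided against a caller-supplied clock. */
public class SharedClockSketch {

    /** Returns true when the last heartbeat is older than timeoutMs on the given clock. */
    static boolean isExpired(long lastHeartbeatMs, long timeoutMs, LongSupplier clock) {
        return clock.getAsLong() - lastHeartbeatMs > timeoutMs;
    }

    public static void main(String[] args) {
        long dbNow = 1_000_000L;                            // stand-in for a timestamp read from the RDBMS
        LongSupplier dbClock = () -> dbNow;                 // every metastore instance would see this value
        LongSupplier skewedLocal = () -> dbNow + 120_000L;  // a host whose clock runs 2 minutes fast

        long lastHeartbeat = dbNow - 30_000L;               // heartbeat recorded 30 s ago in DB time
        long timeout = 60_000L;

        System.out.println(isExpired(lastHeartbeat, timeout, dbClock));     // false
        System.out.println(isExpired(lastHeartbeat, timeout, skewedLocal)); // true
    }
}
```

 With the shared clock the transaction is still alive; the host with the skewed local clock would wrongly time it out.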





Re: [VOTE] Apache Hive 0.13.1 Release Candidate 1

2014-05-14 Thread Eugene Koifman
TestHive_7 is explained by https://issues.apache.org/jira/browse/HIVE-6521,
which is in trunk but not 13.1


On Tue, May 13, 2014 at 6:50 PM, Eugene Koifman ekoif...@hortonworks.com wrote:

 I downloaded src tar, built it and ran webhcat e2e tests.
 I see 2 failures (which I don't see on trunk)

 TestHive_7 fails with
 got percentComplete map 100% reduce 0%,  expected  map 100% reduce 100%

 TestHeartbeat_1 fails to even launch the job.  This looks like the root
 cause

 ERROR | 13 May 2014 18:24:00,394 |
 org.apache.hive.hcatalog.templeton.CatchallExceptionMapper |
 java.lang.NullPointerException
 at
 org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:312)
 at
 org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:479)
 at
 org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:170)
 at
 org.apache.hadoop.util.GenericOptionsParser.init(GenericOptionsParser.java:153)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
 at
 org.apache.hive.hcatalog.templeton.LauncherDelegator$1.run(LauncherDelegator.java:107)
 at
 org.apache.hive.hcatalog.templeton.LauncherDelegator$1.run(LauncherDelegator.java:103)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
 at
 org.apache.hive.hcatalog.templeton.LauncherDelegator.queueAsUser(LauncherDelegator.java:103)
 at
 org.apache.hive.hcatalog.templeton.LauncherDelegator.enqueueController(LauncherDelegator.java:81)
 at
 org.apache.hive.hcatalog.templeton.JarDelegator.run(JarDelegator.java:55)
 at
 org.apache.hive.hcatalog.templeton.Server.mapReduceJar(Server.java:711)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
 com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
 at
 com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
 at
 com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
 at
 com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
 at
 com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
 at
 com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
 at
 com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
 at
 com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
 at
 com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1480)
 at
 com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1411)
 at
 com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1360)
 at
 com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1350)
 at
 com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
 at
 com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:538)
 at
 com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:716)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at
 org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
 at
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1360)
 at
 org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:392)
 at
 org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:87)
 at
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1331)
 at
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:477)
 at
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031)
 at
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
 at
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:965)
 at
 

[jira] [Updated] (HIVE-6187) Add test to verify that DESCRIBE TABLE works with quoted table names

2014-05-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6187:
---

Assignee: Carl Steinbach

 Add test to verify that DESCRIBE TABLE works with quoted table names
 

 Key: HIVE-6187
 URL: https://issues.apache.org/jira/browse/HIVE-6187
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Andy Mok
Assignee: Carl Steinbach
 Fix For: 0.14.0

 Attachments: HIVE-6187.1.patch


 Backticks around tables named after special keywords, such as items, allow us 
 to create, drop, and alter the table. For example
 {code:sql}
 CREATE TABLE foo.`items` (bar INT);
 DROP TABLE foo.`items`;
 ALTER TABLE `items` RENAME TO `items_`;
 {code}
 However, we cannot call
 {code:sql}
 DESCRIBE foo.`items`;
 DESCRIBE `items`;
 {code}
 The DESCRIBE query does not permit backticks to surround table names. The 
 error returned is
 {code:sql}
 FAILED: SemanticException [Error 10001]: Table not found `items`
 {code} 





[jira] [Updated] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set

2014-05-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7057:
-

Description: 
also, update env.sh to use latest Pig release

NO PRECOMMIT TESTS


  was:also, update env.sh to use latest Pig release


 webhcat e2e deployment scripts don't have x bit set
 ---

 Key: HIVE-7057
 URL: https://issues.apache.org/jira/browse/HIVE-7057
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 also, update env.sh to use latest Pig release
 NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-5664) Drop cascade database fails when the db has any tables with indexes

2014-05-14 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-5664:
--

Attachment: HIVE-5664.2.patch.txt

Rebased on latest trunk and fixed the .out changes.

 Drop cascade database fails when the db has any tables with indexes
 ---

 Key: HIVE-5664
 URL: https://issues.apache.org/jira/browse/HIVE-5664
 Project: Hive
  Issue Type: Bug
  Components: Indexing, Metastore
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: 0.14.0

 Attachments: HIVE-5664.1.patch.txt, HIVE-5664.2.patch.txt


 {code}
 CREATE DATABASE db2; 
 USE db2; 
 CREATE TABLE tab1 (id int, name string); 
 CREATE INDEX idx1 ON TABLE tab1(id) as 'COMPACT' with DEFERRED REBUILD IN 
 TABLE tab1_indx; 
 DROP DATABASE db2 CASCADE;
 {code}
 Last DDL fails with the following error:
 {code}
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Database does not exist: db2
 Hive.log has following exception
 2013-10-27 20:46:16,629 ERROR exec.DDLTask (DDLTask.java:execute(434)) - 
 org.apache.hadoop.hive.ql.metadata.HiveException: Database does not exist: db2
 at 
 org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3473)
 at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:231)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1441)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1219)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1047)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:915)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
 Caused by: NoSuchObjectException(message:db2.tab1_indx table not found)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1376)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
 at com.sun.proxy.$Proxy7.get_table(Unknown Source)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:890)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:660)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:652)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:546)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
 at com.sun.proxy.$Proxy8.dropDatabase(Unknown Source)
 at org.apache.hadoop.hive.ql.metadata.Hive.dropDatabase(Hive.java:284)
 at 
 org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3470)
 ... 18 more
 {code}





Re: How to remote debug WebHCat?

2014-05-14 Thread Lefty Leverenz
Should this be documented in the wiki?

-- Lefty


On Tue, May 13, 2014 at 3:17 PM, Eugene Koifman ekoif...@hortonworks.com wrote:

 If you take webhcat_server.sh as currently in trunk, it supports a startDebug
 option that lets you attach a debugger to the process.


 On Mon, May 12, 2014 at 11:13 PM, Na Yang ny...@maprtech.com wrote:

  Hi Folks,
 
  Is there a way to remote debug webhcat? If so, how to enable the remote
  debug?
 
  Thanks,
  Na
 



 --

 Thanks,
 Eugene




[jira] [Updated] (HIVE-7050) Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE

2014-05-14 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-7050:
-

Attachment: HIVE-7050.2.patch

Addressed [~xuefuz]'s review comments.

 Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE
 -

 Key: HIVE-7050
 URL: https://issues.apache.org/jira/browse/HIVE-7050
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-7050.1.patch, HIVE-7050.2.patch


 There is currently no way to display the column level stats from hive CLI. It 
 will be good to show them in DESCRIBE EXTENDED/FORMATTED TABLE





[jira] [Commented] (HIVE-7050) Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE

2014-05-14 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997303#comment-13997303
 ] 

Prasanth J commented on HIVE-7050:
--

Column stats are displayed only when a column is specified and only when FORMATTED 
is specified. They are NOT shown for EXTENDED because extended output does not 
show the column names at the top, which makes it difficult to comprehend the 
column stats output.

 Display table level column stats in DESCRIBE EXTENDED/FORMATTED TABLE
 -

 Key: HIVE-7050
 URL: https://issues.apache.org/jira/browse/HIVE-7050
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-7050.1.patch, HIVE-7050.2.patch


 There is currently no way to display the column level stats from hive CLI. It 
 will be good to show them in DESCRIBE EXTENDED/FORMATTED TABLE





[jira] [Updated] (HIVE-7054) Support ELT UDF in vectorized mode

2014-05-14 Thread Deepesh Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepesh Khandelwal updated HIVE-7054:
-

Status: Patch Available  (was: Open)

 Support ELT UDF in vectorized mode
 --

 Key: HIVE-7054
 URL: https://issues.apache.org/jira/browse/HIVE-7054
 Project: Hive
  Issue Type: New Feature
  Components: Vectorization
Affects Versions: 0.14.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.14.0

 Attachments: HIVE-7054.patch


 Implement support for ELT udf in vectorized execution mode.





[jira] [Commented] (HIVE-7049) Unable to deserialize AVRO data when file schema and record schema are different and nullable

2014-05-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996047#comment-13996047
 ] 

Hive QA commented on HIVE-7049:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12644526/HIVE-7049.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/188/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/188/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12644526

 Unable to deserialize AVRO data when file schema and record schema are 
 different and nullable
 -

 Key: HIVE-7049
 URL: https://issues.apache.org/jira/browse/HIVE-7049
 Project: Hive
  Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-7049.1.patch


 It mainly happens when 
 1) the file schema and the record schema are not the same
 2) the record schema is nullable but the file schema is not.
 The likely code location is in class AvroDeserializer
  
 {noformat}
 if (AvroSerdeUtils.isNullableType(recordSchema)) {
   return deserializeNullableUnion(datum, fileSchema, recordSchema, columnType);
 }
 {noformat}
 The snippet above checks whether recordSchema is nullable, but it never 
 checks the file schema.
 I tested with these values:
 {noformat}
 recordSchema= [null,string]
 fileSchema= string
 {noformat}
 And I got the following exception (line numbers might not be the same because 
 of my debugging changes to the code).
 {noformat}
 org.apache.avro.AvroRuntimeException: Not a union: string 
 at org.apache.avro.Schema.getTypes(Schema.java:272)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserializeNullableUnion(AvroDeserializer.java:275)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.worker(AvroDeserializer.java:205)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.workerBase(AvroDeserializer.java:188)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:174)
 at 
 org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.verifyNullableType(TestAvroDeserializer.java:487)
 at 
 org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeNullableTypes(TestAvroDeserializer.java:407)
 {noformat}





[jira] [Created] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set

2014-05-14 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7057:


 Summary: webhcat e2e deployment scripts don't have x bit set
 Key: HIVE-7057
 URL: https://issues.apache.org/jira/browse/HIVE-7057
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Eugene Koifman
Assignee: Eugene Koifman


also, update env.sh to use latest Pig release





[jira] [Created] (HIVE-7054) Support ELT UDF in vectorized mode

2014-05-14 Thread Deepesh Khandelwal (JIRA)
Deepesh Khandelwal created HIVE-7054:


 Summary: Support ELT UDF in vectorized mode
 Key: HIVE-7054
 URL: https://issues.apache.org/jira/browse/HIVE-7054
 Project: Hive
  Issue Type: New Feature
  Components: Vectorization
Affects Versions: 0.14.0
Reporter: Deepesh Khandelwal
Assignee: Deepesh Khandelwal
 Fix For: 0.14.0


Implement support for ELT udf in vectorized execution mode.





[jira] [Updated] (HIVE-860) Persistent distributed cache

2014-05-14 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-860:
---

Fix Version/s: (was: 0.13.0)
   0.14.0

 Persistent distributed cache
 

 Key: HIVE-860
 URL: https://issues.apache.org/jira/browse/HIVE-860
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.12.0
Reporter: Zheng Shao
Assignee: Brock Noland
 Fix For: 0.14.0

 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
 HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch


 DistributedCache is shared across multiple jobs if the HDFS file name is the 
 same.
 We need to make sure Hive puts the same file into the same location every time 
 and does not overwrite it if the file content is the same.
 We can achieve 2 different results:
 A1. Files added with the same name, timestamp, and md5 in the same session 
 will have a single copy in the distributed cache.
 A2. Files added with the same name, timestamp, and md5 will have a single 
 copy in the distributed cache.
 A2 has a bigger benefit in sharing but raises the question of when Hive 
 should clean it up in HDFS.
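
 One way to get "same file, same location" is to derive the cache path from the name, timestamp, and md5 named above. The sketch below only illustrates that idea and is not taken from the attached patches; the class name, the cachePath helper, and the /tmp/hive-cache prefix are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: a deterministic cache location built from name, timestamp and md5. */
public class CacheKeySketch {

    /** Hex-encoded MD5 digest of the file content. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(content)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Identical (name, timestamp, content) always maps to the same path; the prefix is made up. */
    static String cachePath(String name, long timestamp, byte[] content) {
        return "/tmp/hive-cache/" + md5Hex(content) + "/" + timestamp + "/" + name;
    }

    public static void main(String[] args) {
        byte[] jar = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(cachePath("udfs.jar", 1400000000L, jar));
        // /tmp/hive-cache/900150983cd24fb0d6963f7d28e17f72/1400000000/udfs.jar
    }
}
```

 Because the path depends only on the three attributes, a second job adding the same file can detect the existing copy and skip the upload.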





[jira] [Commented] (HIVE-5810) create a function add_date as exists in mysql

2014-05-14 Thread Anandha L Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997273#comment-13997273
 ] 

Anandha L Ranganathan commented on HIVE-5810:
-

I could have implemented this feature in the existing function, but that would 
break existing usages already in production. 

 create a function add_date   as exists in mysql 
 

 Key: HIVE-5810
 URL: https://issues.apache.org/jira/browse/HIVE-5810
 Project: Hive
  Issue Type: Improvement
Reporter: Anandha L Ranganathan
Assignee: Anandha L Ranganathan
 Attachments: HIVE-5810.2.patch, HIVE-5810.patch

   Original Estimate: 40h
  Remaining Estimate: 40h

 MySQL has ADDDATE(date,INTERVAL expr unit).
 Similarly, in Hive we can have add_date(date,unit,expr).
 Here unit is DAY/Month/Year.
 For example,
 add_date('2013-11-09','DAY',2) will return 2013-11-11.
 add_date('2013-11-09','Month',2) will return 2014-01-09.
 add_date('2013-11-09','Year',2) will return 2015-11-09.
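
 The proposed semantics can be sketched with plain calendar arithmetic. This is not the code in the attached patches; the class and the addDate helper are hypothetical and use java.time rather than Hive's UDF API. Under calendar arithmetic, adding 2 years to 2013-11-09 gives 2015-11-09.

```java
import java.time.LocalDate;
import java.util.Locale;

/** Hypothetical sketch of add_date(date, unit, expr) using java.time. */
public class AddDateSketch {

    /** unit is DAY, MONTH or YEAR (case-insensitive); date is ISO yyyy-MM-dd. */
    static String addDate(String date, String unit, int amount) {
        LocalDate d = LocalDate.parse(date);
        switch (unit.toUpperCase(Locale.ROOT)) {
            case "DAY":
                return d.plusDays(amount).toString();
            case "MONTH":
                return d.plusMonths(amount).toString();
            case "YEAR":
                return d.plusYears(amount).toString();
            default:
                throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(addDate("2013-11-09", "DAY", 2));   // 2013-11-11
        System.out.println(addDate("2013-11-09", "Month", 2)); // 2014-01-09
        System.out.println(addDate("2013-11-09", "Year", 2));  // 2015-11-09
    }
}
```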





[jira] [Commented] (HIVE-7033) grant statements should check if the role exists

2014-05-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996018#comment-13996018
 ] 

Hive QA commented on HIVE-7033:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12644480/HIVE-7033.4.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 5506 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_load_dyn_part1
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHadoopVersion
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getPigVersion
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getStatus
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.invalidPath
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/183/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/183/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12644480

 grant statements should check if the role exists
 

 Key: HIVE-7033
 URL: https://issues.apache.org/jira/browse/HIVE-7033
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-7033.1.patch, HIVE-7033.2.patch, HIVE-7033.3.patch, 
 HIVE-7033.4.patch


 The following grant statement that grants to a role that does not exist 
 succeeds, but it should result in an error.
  grant all on t1 to role nosuchrole;





[jira] [Commented] (HIVE-6986) MatchPath fails with small resultExprString

2014-05-14 Thread Furcy Pin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997507#comment-13997507
 ] 

Furcy Pin commented on HIVE-6986:
-

Hi Ashutosh,

r.startsWith("select") wouldn't match upper case, and unfortunately there is 
no built-in String.startsWithIgnoreCase in Java, I believe.

r.toLowerCase().startsWith("select") would work of course, but I wanted to preserve 
the initial (futile) optimisation with the r.substring(0,6), so that
we don't convert the whole query to lowercase, but just its prefix.
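
One alternative not raised in this comment is String.regionMatches with ignoreCase set to true: it compares only the prefix, ignores case, and returns false instead of throwing when the string is shorter than six characters. The sketch below is an illustration only, not the attached patch; the class and method names are hypothetical.

```java
/** Hypothetical sketch: case-insensitive "select" prefix test that is safe for short strings. */
public class SelectPrefixSketch {

    private static final String SELECT = "select";

    /** True when r begins with "select" in any letter case; never throws for short input. */
    static boolean startsWithSelect(String r) {
        return r.regionMatches(true, 0, SELECT, 0, SELECT.length());
    }

    public static void main(String[] args) {
        System.out.println(startsWithSelect("SELECT year")); // true
        System.out.println(startsWithSelect("year"));        // false, and no exception
    }
}
```

Only the first six characters are examined, so the cost does not depend on the length of the query.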

 MatchPath fails with small resultExprString
 ---

 Key: HIVE-6986
 URL: https://issues.apache.org/jira/browse/HIVE-6986
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Furcy Pin
Priority: Trivial
 Attachments: HIVE-6986.1.patch


 When using MatchPath, a query like this:
 select year
 from matchpath(on 
   flights_tiny 
   sort by fl_num, year, month, day_of_month  
   arg1('LATE.LATE+'), 
   arg2('LATE'), arg3(arr_delay > 15), 
   arg4('year') 
 )
 ;
 will fail with error message 
 FAILED: StringIndexOutOfBoundsException String index out of range: 6





[jira] [Updated] (HIVE-7048) CompositeKeyHBaseFactory should not use FamilyFilter

2014-05-14 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-7048:
--

Priority: Blocker  (was: Major)

 CompositeKeyHBaseFactory should not use FamilyFilter
 

 Key: HIVE-7048
 URL: https://issues.apache.org/jira/browse/HIVE-7048
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Swarnim Kulkarni
Priority: Blocker

 HIVE-6411 introduced a more generic way to provide composite key 
 implementations via custom factory implementations. However it seems like the 
 CompositeHBaseKeyFactory implementation uses a FamilyFilter for row key scans 
 which doesn't seem appropriate. This should be investigated further and if 
 possible replaced with a RowRangeScanFilter.





Review Request 21430: HIVE-6994 - parquet-hive createArray strips null elements

2014-05-14 Thread justin coffey

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21430/
---

Review request for hive.


Repository: hive-git


Description
---

The createArray method in ParquetHiveSerDe strips null values from resultant 
ArrayWritables.

This patch:
- removes an incorrect if null check in createArray
- simplifies ParquetHiveSerDe
- total refactor of TestParquetHiveSerDe for better test coverage and easier 
regression testing
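
To illustrate the kind of bug described above, here is a minimal sketch using 
plain Java lists and arrays as stand-ins for the Parquet and Writable types. 
The class and method names are hypothetical and this is not the actual 
ParquetHiveSerDe code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NullPreservingCopy {
    // Problematic pattern: skipping nulls shortens the array and shifts
    // later elements to lower positions.
    static Object[] copyDroppingNulls(List<Object> in) {
        List<Object> out = new ArrayList<>();
        for (Object o : in) {
            if (o != null) {
                out.add(o);
            }
        }
        return out.toArray();
    }

    // Null-preserving pattern: every element, null or not, keeps its index.
    static Object[] copyKeepingNulls(List<Object> in) {
        return in.toArray();
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.<Object>asList("a", null, "c");
        System.out.println(Arrays.toString(copyDroppingNulls(row))); // [a, c]
        System.out.println(Arrays.toString(copyKeepingNulls(row)));  // [a, null, c]
    }
}
```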


Diffs
-

  data/files/parquet_create.txt ccd48ee 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
b689336 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
3b56fc7 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetHiveSerDe.java
 PRE-CREATION 
  ql/src/test/queries/clientpositive/parquet_create.q 0b976bd 
  ql/src/test/results/clientpositive/parquet_create.q.out 3220be5 

Diff: https://reviews.apache.org/r/21430/diff/


Testing
---


Thanks,

justin coffey



[jira] [Commented] (HIVE-6290) Add support for hbase filters for composite keys

2014-05-14 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997376#comment-13997376
 ] 

Lefty Leverenz commented on HIVE-6290:
--

Does this need any user doc?

 Add support for hbase filters for composite keys
 

 Key: HIVE-6290
 URL: https://issues.apache.org/jira/browse/HIVE-6290
 Project: Hive
  Issue Type: Sub-task
  Components: HBase Handler
Affects Versions: 0.12.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
 Fix For: 0.14.0

 Attachments: HIVE-6290.1.patch.txt, HIVE-6290.2.patch.txt, 
 HIVE-6290.3.patch.txt


 Add support for filters to be provided via the composite key class





[jira] [Created] (HIVE-7059) Hive 13 shell slow start

2014-05-14 Thread Emil A. Siemes (JIRA)
Emil A. Siemes created HIVE-7059:


 Summary: Hive 13 shell slow start
 Key: HIVE-7059
 URL: https://issues.apache.org/jira/browse/HIVE-7059
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 0.13.0
Reporter: Emil A. Siemes
Priority: Minor


The shell startup time can be reduced by 2-4s by removing the HBase jar files from 
the classpath in /usr/lib/hive/bin/hive.

For interactive queries, saving a couple of seconds is a big gain. The CLI seems 
to connect eagerly to ZooKeeper even when it is not needed, as with hive --version.







[jira] [Updated] (HIVE-6473) Allow writing HFiles via HBaseStorageHandler table

2014-05-14 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-6473:
---

Attachment: HIVE-6473.1.patch

Attaching the same file with a different extension to see which one buildbot picks up.

 Allow writing HFiles via HBaseStorageHandler table
 --

 Key: HIVE-6473
 URL: https://issues.apache.org/jira/browse/HIVE-6473
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
 Attachments: HIVE-6473.0.patch.txt, HIVE-6473.1.patch, 
 HIVE-6473.1.patch.txt


 Generating HFiles for bulkload into HBase could be more convenient. Right now 
 we require the user to register a new table with the appropriate output 
 format. This patch allows the exact same functionality, but through an 
 existing table managed by the HBaseStorageHandler.





[jira] [Commented] (HIVE-6901) Explain plan doesn't show operator tree for the fetch operator

2014-05-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13995386#comment-13995386
 ] 

Xuefu Zhang commented on HIVE-6901:
---

[~ashutoshc] Would you mind reviewing the one-line change? Thanks.

 Explain plan doesn't show operator tree for the fetch operator
 --

 Key: HIVE-6901
 URL: https://issues.apache.org/jira/browse/HIVE-6901
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.12.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
Priority: Minor
 Attachments: HIVE-6109.10.patch, HIVE-6901.1.patch, 
 HIVE-6901.2.patch, HIVE-6901.3.patch, HIVE-6901.4.patch, HIVE-6901.5.patch, 
 HIVE-6901.6.patch, HIVE-6901.7.patch, HIVE-6901.8.patch, HIVE-6901.9.patch, 
 HIVE-6901.patch


 Explaining a simple select query that involves a MR phase doesn't show 
 processor tree for the fetch operator.
 {code}
 hive> explain select d from test;
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Map Operator Tree:
 ...
   Stage: Stage-0
 Fetch Operator
   limit: -1
 {code}
 It would be nice if the operator tree is shown even if there is only one node.
 Please note that in local execution, the operator tree is complete:
 {code}
 hive> explain select * from test;
 OK
 STAGE DEPENDENCIES:
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-0
 Fetch Operator
   limit: -1
   Processor Tree:
 TableScan
   alias: test
   Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE Column 
 stats: NONE
   Select Operator
 expressions: d (type: int)
 outputColumnNames: _col0
 Statistics: Num rows: 8 Data size: 34 Basic stats: COMPLETE 
 Column stats: NONE
 ListSink
 {code}





[jira] [Commented] (HIVE-7049) Unable to deserialize AVRO data when file schema and record schema are different and nullable

2014-05-14 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997378#comment-13997378
 ] 

Mohammad Kamrul Islam commented on HIVE-7049:
-

Thanks [~xuefuz] for the comments.

I believe it is a valid Avro schema evolution.
Please see the following rules copied from the spec:
http://avro.apache.org/docs/1.7.6/spec.html#Schema+Resolution
{noformat}
* if reader's is a union, but writer's is not
The first schema in the reader's union that matches the writer's schema is 
recursively resolved against it. If none match, an error is signalled.
* if writer's is a union, but reader's is not
If the reader's schema matches the selected writer's schema, it is recursively 
resolved against it. If they do not match, an error is signalled.
{noformat}

Moreover, I tested a similar scenario using pure Avro code, where I wrote the data 
using schema string and read it using [null,string].
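
The first rule quoted above (reader is a union, writer is not) can be sketched 
roughly as follows. This is only an illustration under simplifying 
assumptions: MiniSchema is a hypothetical stand-in that compares type names, 
not the Avro Schema API, and it is not the code from the attached patch.

```java
import java.util.Arrays;
import java.util.List;

class UnionResolutionSketch {
    // Hypothetical stand-in for a schema: one branch means a plain type,
    // several branches mean a union.
    static final class MiniSchema {
        final List<String> branches;

        MiniSchema(String... branches) {
            this.branches = Arrays.asList(branches);
        }

        boolean isUnion() {
            return branches.size() > 1;
        }
    }

    // Reader is a union, writer is not: return the first reader branch that
    // matches the writer's type, or signal an error if none matches.
    static String resolveReaderBranch(MiniSchema writer, MiniSchema reader) {
        if (writer.isUnion() || !reader.isUnion()) {
            throw new IllegalArgumentException("sketch only covers plain writer + union reader");
        }
        String writerType = writer.branches.get(0);
        for (String branch : reader.branches) {
            if (branch.equals(writerType)) {
                return branch;
            }
        }
        throw new IllegalStateException("no reader branch matches " + writerType);
    }

    public static void main(String[] args) {
        MiniSchema fileSchema = new MiniSchema("string");           // writer
        MiniSchema recordSchema = new MiniSchema("null", "string"); // reader
        System.out.println(resolveReaderBranch(fileSchema, recordSchema)); // string
    }
}
```

The point of the sketch is that the writer (file) schema is never asked for 
its union branches when it is a plain type.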

 Unable to deserialize AVRO data when file schema and record schema are 
 different and nullable
 -

 Key: HIVE-7049
 URL: https://issues.apache.org/jira/browse/HIVE-7049
 Project: Hive
  Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-7049.1.patch


 It mainly happens when 
 1) the file schema and record schema are not the same
 2) the record schema is nullable but the file schema is not.
 The potential code location is in class AvroDeserializer
  
 {noformat}
  if(AvroSerdeUtils.isNullableType(recordSchema)) {
   return deserializeNullableUnion(datum, fileSchema, recordSchema, 
 columnType);
 }
 {noformat}
 In the above code snippet, recordSchema is checked for nullability, but 
 the file schema is not checked.
 I tested with these values:
 {noformat}
 recordSchema= [null,string]
 fileSchema= string
 {noformat}
 And I got the following exception (line numbers might not be the same due to 
 my debugged code version):
 {noformat}
 org.apache.avro.AvroRuntimeException: Not a union: string 
 at org.apache.avro.Schema.getTypes(Schema.java:272)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserializeNullableUnion(AvroDeserializer.java:275)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.worker(AvroDeserializer.java:205)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.workerBase(AvroDeserializer.java:188)
 at 
 org.apache.hadoop.hive.serde2.avro.AvroDeserializer.deserialize(AvroDeserializer.java:174)
 at 
 org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.verifyNullableType(TestAvroDeserializer.java:487)
 at 
 org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeNullableTypes(TestAvroDeserializer.java:407)
 {noformat}





[jira] [Commented] (HIVE-6430) MapJoin hash table has large memory overhead

2014-05-14 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997185#comment-13997185
 ] 

Gopal V commented on HIVE-6430:
---

I can confirm that if I do an mvn install once, this problem goes away for a 
day (always fails exactly only on the first build of the day with the patch). 

If I had to guess, that's because my maven update interval is once-a-day for 
snapshots. Once you commit this, the .m2/ version from apache-snapshots will 
match up and my builds won't break anymore (hopefully).

Commit this and if it breaks again for me, I'll post an addendum as a new 
patch. 

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
 HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, 
 HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, 
 HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, 
 HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
 for row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need to have java hash table there.  We can either use primitive-friendly 
 hashtable like the one from HPPC (Apache-licenced), or some variation, to map 
 primitive keys to single row storage structure without an object per row 
 (similar to vectorization).
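
 As a rough illustration of the idea above, the following is a minimal sketch 
 of an open-addressing map from primitive int keys to int values (for example, 
 offsets into a row buffer) that allocates no object per entry. The class is 
 hypothetical: it is not HPPC's API and not the implementation in the attached 
 patches, and it omits resizing and removal.

```java
import java.util.Arrays;

class IntIntOpenHashMap {
    // Reserved marker for an unused slot; this key cannot be stored.
    private static final int EMPTY = Integer.MIN_VALUE;

    private final int[] keys;
    private final int[] values;
    private final int mask;
    private int size;

    IntIntOpenHashMap(int capacity) {
        // A power-of-two capacity lets (hash & mask) select a slot.
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        keys = new int[capacity];
        values = new int[capacity];
        Arrays.fill(keys, EMPTY);
        mask = capacity - 1;
    }

    void put(int key, int value) {
        if (key == EMPTY) {
            throw new IllegalArgumentException("reserved key");
        }
        int slot = findSlot(key);
        if (keys[slot] == EMPTY) {
            // Always keep one slot free so that probing terminates.
            if (size == keys.length - 1) {
                throw new IllegalStateException("map is full");
            }
            keys[slot] = key;
            size++;
        }
        values[slot] = value;
    }

    int getOrDefault(int key, int defaultValue) {
        if (key == EMPTY) {
            return defaultValue;
        }
        int slot = findSlot(key);
        return keys[slot] == key ? values[slot] : defaultValue;
    }

    // Linear probing: walk forward until the key or an empty slot is found.
    private int findSlot(int key) {
        int slot = (key * 0x9E3779B9) & mask;
        while (keys[slot] != EMPTY && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    public static void main(String[] args) {
        IntIntOpenHashMap map = new IntIntOpenHashMap(16);
        map.put(42, 7);
        System.out.println(map.getOrDefault(42, -1)); // 7
        System.out.println(map.getOrDefault(43, -1)); // -1
    }
}
```

 The whole table is two int arrays, so the per-entry cost is 8 bytes plus the 
 unused slots, regardless of how many entries are stored.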





[jira] [Updated] (HIVE-6768) remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties

2014-05-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6768:


   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks for the contribution Eugene!


 remove 
 hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties
 ---

 Key: HIVE-6768
 URL: https://issues.apache.org/jira/browse/HIVE-6768
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Fix For: 0.14.0

 Attachments: HIVE-6768.patch


 now that MAPREDUCE-5806 is fixed we can remove 
 override-container-log4j.properties and all the logic around this which 
 was introduced in HIVE-5511 to work around MAPREDUCE-5806
 NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-7055) config not propagating for PTFOperator

2014-05-14 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997701#comment-13997701
 ] 

Harish Butani commented on HIVE-7055:
-

+1

 config not propagating for PTFOperator
 --

 Key: HIVE-7055
 URL: https://issues.apache.org/jira/browse/HIVE-7055
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.12.0, 0.13.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-7055.patch


 e.g. setting hive.join.cache.size has no effect and task nodes always got 
 default value of 25000





Re: Review Request 21430: HIVE-6994 - parquet-hive createArray strips null elements

2014-05-14 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21430/#review42983
---

Ship it!


- Szehon Ho


On May 14, 2014, 9:16 a.m., justin coffey wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/21430/
 ---
 
 (Updated May 14, 2014, 9:16 a.m.)
 
 
 Review request for hive.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 The createArray method in ParquetHiveSerDe strips null values from resultant 
 ArrayWritables.
 
 This patch:
 - removes an incorrect if null check in createArray
 - simplifies ParquetHiveSerDe
 - total refactor of TestParquetHiveSerDe for better test coverage and easier 
 regression testing
 
 
 Diffs
 -
 
   data/files/parquet_create.txt ccd48ee 
   
 ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
 b689336 
   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
 3b56fc7 
   
 ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetHiveSerDe.java
  PRE-CREATION 
   ql/src/test/queries/clientpositive/parquet_create.q 0b976bd 
   ql/src/test/results/clientpositive/parquet_create.q.out 3220be5 
 
 Diff: https://reviews.apache.org/r/21430/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 justin coffey
 




[jira] [Updated] (HIVE-7033) grant statements should check if the role exists

2014-05-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-7033:


Attachment: HIVE-7033.1.patch

 grant statements should check if the role exists
 

 Key: HIVE-7033
 URL: https://issues.apache.org/jira/browse/HIVE-7033
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-7033.1.patch, HIVE-7033.2.patch


 The following grant statement that grants to a role that does not exist 
 succeeds, but it should result in an error.
  grant all on t1 to role nosuchrole;





[jira] [Updated] (HIVE-7060) Column stats give incorrect min and distinct_count

2014-05-14 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-7060:
--

Description: 
It seems that the result from column statistics isn't correct on two measures 
for numeric columns: min (which is always 0) and distinct count. Here is an 
example:

{code}
select count(distinct avgTimeOnSite), min(avgTimeOnSite) from 
UserVisits_web_text_none;
...
OK
9   1
Time taken: 9.747 seconds, Fetched: 1 row(s)
{code}

The statistics for the column:
{code}
desc formatted UserVisits_web_text_none avgTimeOnSite
...
# col_name  data_type   min max 
num_nulls   distinct_count  avg_col_len 
max_col_len num_trues   num_falses  
comment

avgTimeOnSite   int 0   9   
0   11  null
nullnull   
{code}

  was:
It seems that the result from column statistics isn't correct on two measures 
for numeric columns: min (which is always 0) and distinct count. Here is an 
example:

{code}
select count(distinct avgTimeOnSite), min(avgTimeOnSite) from 
UserVisits_web_text_none;
...
OK
9   1
Time taken: 9.747 seconds, Fetched: 1 row(s)
{code}

The statistics for the column:
{code}
PREHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite
PREHOOK: type: DESCTABLE
PREHOOK: Input: default@uservisits_web_text_none
POSTHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite
POSTHOOK: type: DESCTABLE
POSTHOOK: Input: default@uservisits_web_text_none
# col_name  data_type   min max 
num_nulls   distinct_count  avg_col_len 
max_col_len num_trues   num_falses  
comment

avgTimeOnSite   int 0   9   
0   11  null
nullnull   
{code}


 Column stats give incorrect min and distinct_count
 --

 Key: HIVE-7060
 URL: https://issues.apache.org/jira/browse/HIVE-7060
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.13.0
Reporter: Xuefu Zhang

 It seems that the result from column statistics isn't correct on two measures 
 for numeric columns: min (which is always 0) and distinct count. Here is an 
 example:
 {code}
 select count(distinct avgTimeOnSite), min(avgTimeOnSite) from 
 UserVisits_web_text_none;
 ...
 OK
 9 1
 Time taken: 9.747 seconds, Fetched: 1 row(s)
 {code}
 The statistics for the column:
 {code}
 desc formatted UserVisits_web_text_none avgTimeOnSite
 ...
 # col_name  data_type   min max   
   num_nulls   distinct_count  avg_col_len 
 max_col_len num_trues   num_falses
   comment
 avgTimeOnSite   int 0   9 
   0   11  null
 nullnull   
 {code}





[jira] [Created] (HIVE-7060) Column stats give incorrect min and distinct_count

2014-05-14 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created HIVE-7060:
-

 Summary: Column stats give incorrect min and distinct_count
 Key: HIVE-7060
 URL: https://issues.apache.org/jira/browse/HIVE-7060
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.13.0
Reporter: Xuefu Zhang


It seems that the result from column statistics isn't correct on two measures 
for numeric columns: min (which is always 0) and distinct count. Here is an 
example:

{code}
select count(distinct avgTimeOnSite), min(avgTimeOnSite) from 
UserVisits_web_text_none;
...
OK
9   1
Time taken: 9.747 seconds, Fetched: 1 row(s)
{code}

The statistics for the column:
{code}
PREHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite
PREHOOK: type: DESCTABLE
PREHOOK: Input: default@uservisits_web_text_none
POSTHOOK: query: desc formatted UserVisits_web_text_none avgTimeOnSite
POSTHOOK: type: DESCTABLE
POSTHOOK: Input: default@uservisits_web_text_none
# col_name  data_type   min max 
num_nulls   distinct_count  avg_col_len 
max_col_len num_trues   num_falses  
comment

avgTimeOnSite   int 0   9   
0   11  null
nullnull   
{code}





[jira] [Created] (HIVE-7055) config not propagating for PTFOperator

2014-05-14 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-7055:
--

 Summary: config not propagating for PTFOperator
 Key: HIVE-7055
 URL: https://issues.apache.org/jira/browse/HIVE-7055
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 0.13.0, 0.12.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


e.g. setting hive.join.cache.size has no effect and task nodes always got 
default value of 25000





[jira] [Updated] (HIVE-3159) Update AvroSerde to determine schema of new tables

2014-05-14 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3159:
-

Status: Open  (was: Patch Available)

Looks like a couple tests failed.

 Update AvroSerde to determine schema of new tables
 --

 Key: HIVE-3159
 URL: https://issues.apache.org/jira/browse/HIVE-3159
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Affects Versions: 0.12.0
Reporter: Jakob Homan
Assignee: Mohammad Kamrul Islam
 Attachments: HIVE-3159.10.patch, HIVE-3159.4.patch, 
 HIVE-3159.5.patch, HIVE-3159.6.patch, HIVE-3159.7.patch, HIVE-3159.9.patch, 
 HIVE-3159v1.patch


 Currently when writing tables to Avro one must manually provide an Avro 
 schema that matches what is being delivered by Hive. It'd be better to have 
 the serde infer this schema by converting the table's TypeInfo into an 
 appropriate AvroSchema.





[jira] [Updated] (HIVE-7042) Fix stats_partscan_1_23.q and orc_createas1.q for hadoop-2

2014-05-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7042:
---

Status: Patch Available  (was: Open)

 Fix stats_partscan_1_23.q and orc_createas1.q for hadoop-2
 --

 Key: HIVE-7042
 URL: https://issues.apache.org/jira/browse/HIVE-7042
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
 Attachments: HIVE-7042.1.patch, HIVE-7042.1.patch.txt


 stats_partscan_1_23.q and orc_createas1.q should use HiveInputFormat as 
 opposed to CombineHiveInputFormat. RCFile uses DefaultCodec for compression 
 (uses DEFLATE) which is not splittable. Hence using CombineHiveIF will yield 
 different results for these tests. ORC should use HiveIF to generate ORC 
 splits.





[jira] [Commented] (HIVE-7060) Column stats give incorrect min and distinct_count

2014-05-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997824#comment-13997824
 ] 

Ashutosh Chauhan commented on HIVE-7060:


HIVE-4561 and this issue seem to have the same root cause.

 Column stats give incorrect min and distinct_count
 --

 Key: HIVE-7060
 URL: https://issues.apache.org/jira/browse/HIVE-7060
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 0.13.0
Reporter: Xuefu Zhang

 It seems that the result from column statistics isn't correct on two measures 
 for numeric columns: min (which is always 0) and distinct count. Here is an 
 example:
 {code}
 select count(distinct avgTimeOnSite), min(avgTimeOnSite) from 
 UserVisits_web_text_none;
 ...
 OK
 9 1
 Time taken: 9.747 seconds, Fetched: 1 row(s)
 {code}
 The statistics for the column:
 {code}
 desc formatted UserVisits_web_text_none avgTimeOnSite
 ...
 # col_name  data_type   min max   
   num_nulls   distinct_count  avg_col_len 
 max_col_len num_trues   num_falses
   comment
 avgTimeOnSite   int 0   9 
   0   11  null
 nullnull   
 {code}





[jira] [Created] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier

2014-05-14 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-7056:


 Summary: TestPig_11 fails with Pig 12.1 and earlier
 Key: HIVE-7056
 URL: https://issues.apache.org/jira/browse/HIVE-7056
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Eugene Koifman


on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is 
looking for *hcatalog-core-*.jar etc.  In Pig 12.1 it's looking for 
hcatalog-core-*.jar, which doesn't work with Hive 0.13.
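
The difference between the two patterns can be checked with a small sketch 
using java.nio glob matching. Two assumptions are made here: that the Hive 
0.13 jar is named hive-hcatalog-core-0.13.0.jar, and that Java's glob syntax 
is a close enough approximation of the shell globbing done by the pig script. 
The class and method names are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class JarGlobCheck {
    // Returns true if the file name matches the glob pattern as a whole.
    static boolean matches(String glob, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        String jar = "hive-hcatalog-core-0.13.0.jar"; // assumed jar name
        System.out.println(matches("hcatalog-core-*.jar", jar));  // false
        System.out.println(matches("*hcatalog-core-*.jar", jar)); // true
    }
}
```

Without the leading wildcard the pattern has to match from the first 
character of the file name, so the "hive-" prefix prevents a match.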

The TestPig_11 job fails with
{noformat}
2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception 
during parsing: Error during parsing. Could not resolve 
org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Failed to parse: Pig script failed to parse: 
file hcatloadstore.pig, line 19, column 34 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at 
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:478)
at org.apache.pig.Main.main(Main.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: 
file hcatloadstore.pig, line 19, column 34 pig script failed to validate: 
org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
at 
org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
at 
org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
at 
org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
at 
org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
at 
org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
at 
org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
at 
org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
at 
org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
... 16 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: 
Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, 
java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
at 
org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
... 24 more
{noformat}

the key to this is 
{noformat}
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar:
 No such file or directory
ls: 
/private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-*.jar:
 No such file or directory
ls: 

[jira] [Commented] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set

2014-05-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13997797#comment-13997797
 ] 

Thejas M Nair commented on HIVE-7057:
-

+1


 webhcat e2e deployment scripts don't have x bit set
 ---

 Key: HIVE-7057
 URL: https://issues.apache.org/jira/browse/HIVE-7057
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-7057.patch


 also, update env.sh to use latest Pig release
 NO PRECOMMIT TESTS





[jira] [Updated] (HIVE-6430) MapJoin hash table has large memory overhead

2014-05-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-6430:
---

Attachment: HIVE-6430.14.patch

Reproed it on SVN. It is not related to this patch, fixing anyway. I'm assuming 
+1 stands...

 MapJoin hash table has large memory overhead
 

 Key: HIVE-6430
 URL: https://issues.apache.org/jira/browse/HIVE-6430
 Project: Hive
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-6430.01.patch, HIVE-6430.02.patch, 
 HIVE-6430.03.patch, HIVE-6430.04.patch, HIVE-6430.05.patch, 
 HIVE-6430.06.patch, HIVE-6430.07.patch, HIVE-6430.08.patch, 
 HIVE-6430.09.patch, HIVE-6430.10.patch, HIVE-6430.11.patch, 
 HIVE-6430.12.patch, HIVE-6430.12.patch, HIVE-6430.13.patch, 
 HIVE-6430.14.patch, HIVE-6430.patch


 Right now, in some queries, I see that storing e.g. 4 ints (2 for key and 2 
 for row) can take several hundred bytes, which is ridiculous. I am reducing 
 the size of MJKey and MJRowContainer in other jiras, but in general we don't 
 need to have java hash table there.  We can either use primitive-friendly 
 hashtable like the one from HPPC (Apache-licenced), or some variation, to map 
 primitive keys to single row storage structure without an object per row 
 (similar to vectorization).





[jira] [Updated] (HIVE-6938) Add Support for Parquet Column Rename

2014-05-14 Thread Daniel Weeks (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Weeks updated HIVE-6938:
---

Attachment: HIVE-6938.3.patch

Updated to use global switch until HIVE-6936 is resolved.  This means all 
tables will be treated the same until input formats have access to table 
properties.

 Add Support for Parquet Column Rename
 -

 Key: HIVE-6938
 URL: https://issues.apache.org/jira/browse/HIVE-6938
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.13.0
Reporter: Daniel Weeks
Assignee: Daniel Weeks
 Attachments: HIVE-6938.1.patch, HIVE-6938.2.patch, HIVE-6938.2.patch, 
 HIVE-6938.3.patch


 Parquet was originally introduced without 'replace columns' support in ql.  
 In addition, the default behavior for parquet is to access columns by name as 
 opposed to by index by the Serde.  
 Parquet should allow for either columnar (index based) access or name based 
 access because it can support either.





[jira] [Updated] (HIVE-6304) Update HCatReader/Writer docs to reflect recent changes

2014-05-14 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6304:


Fix Version/s: (was: 0.13.0)
   0.14.0

 Update HCatReader/Writer docs to reflect recent changes
 ---

 Key: HIVE-6304
 URL: https://issues.apache.org/jira/browse/HIVE-6304
 Project: Hive
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 0.14.0


 HIVE-6248 made changes to the HCatReader and HCatWriter classes.  Those 
 changes need to be reflected in the [HCatReader/Writer 
 docs|https://cwiki.apache.org/confluence/display/Hive/HCatalog+ReaderWriter]





[jira] [Commented] (HIVE-7034) Explain result of TezWork is not deterministic

2014-05-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993850#comment-13993850
 ] 

Hive QA commented on HIVE-7034:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12643888/HIVE-7034.1.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5433 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/154/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/154/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12643888

 Explain result of TezWork is not deterministic
 --

 Key: HIVE-7034
 URL: https://issues.apache.org/jira/browse/HIVE-7034
 Project: Hive
  Issue Type: Task
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.14.0

 Attachments: HIVE-7034.1.patch.txt


 Recent failures in the Tez tests are caused by the differing iteration order 
 of HashMap implementations. Let's fix that.
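One common way to make such output deterministic is to stop depending on the container's iteration order and to sort before rendering. The following is a sketch of that idea in Python, not the attached patch. The function `render_edges` and the vertex names are hypothetical. In Java, the analogous choice would be a LinkedHashMap or a sorted map instead of a plain HashMap.

```python
from typing import Dict, List


def render_edges(edges: Dict[str, List[str]]) -> str:
    """Render vertex <- parents pairs in sorted order so the output is stable."""
    lines = []
    for vertex in sorted(edges):
        parents = ", ".join(sorted(edges[vertex]))
        lines.append(f"{vertex} <- {parents}")
    return "\n".join(lines)


# The same graph, built in two different insertion orders.
plan_a = {"Reducer 2": ["Map 1", "Map 3"], "Reducer 4": ["Reducer 2"]}
plan_b = {"Reducer 4": ["Reducer 2"], "Reducer 2": ["Map 3", "Map 1"]}

rendered_a = render_edges(plan_a)
rendered_b = render_edges(plan_b)
```

Both renderings are identical even though the two dictionaries iterate in different orders, which is the property an explain test needs.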





[jira] [Updated] (HIVE-7015) Failing to inherit group/permission should not fail the operation

2014-05-14 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7015:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

 Failing to inherit group/permission should not fail the operation
 -

 Key: HIVE-7015
 URL: https://issues.apache.org/jira/browse/HIVE-7015
 Project: Hive
  Issue Type: Bug
  Components: Security
Affects Versions: 0.14.0
Reporter: Szehon Ho
Assignee: Szehon Ho
 Fix For: 0.14.0

 Attachments: HIVE-7015.patch


 In the previous changes, chgrp and chmod were put on the critical path of 
 directory creation and file copy/mv.
 They should not be there. For instance, existing users may not have hive-users 
 in the same group as the hive group, so chgrp would fail if they turn on the 
 flag hive.warehouse.subdir.inherit.perms.
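Taking chgrp/chmod off the critical path amounts to treating inheritance as best effort: attempt it, log a warning on failure, and let the main operation succeed. A minimal sketch under that assumption follows. The names `mkdir_inheriting` and `failing_chgrp` are hypothetical and this is not the code from the patch.

```python
import logging
import os
import tempfile
from typing import Callable

log = logging.getLogger("warehouse")


def mkdir_inheriting(path: str, inherit: Callable[[str], None]) -> bool:
    """Create `path`, then try to inherit group/permissions.

    Returns True if inheritance succeeded and False if it was skipped after an
    error. The directory is kept either way.
    """
    os.makedirs(path, exist_ok=True)
    try:
        inherit(path)
        return True
    except OSError as exc:  # PermissionError is a subclass of OSError
        log.warning("Could not inherit group/permissions for %s: %s", path, exc)
        return False


def failing_chgrp(path: str) -> None:
    """Stand-in for a chgrp that the current user is not allowed to perform."""
    raise PermissionError(f"chgrp not permitted on {path}")


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "warehouse", "t1")
    inherited = mkdir_inheriting(target, failing_chgrp)
    created = os.path.isdir(target)
```

Here the directory exists afterwards even though the simulated chgrp failed. The caller can still inspect the return value if it wants to report the skipped step.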





[jira] [Updated] (HIVE-5370) format_number udf should take user specified format as argument

2014-05-14 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5370:


Fix Version/s: (was: 0.13.0)
   0.14.0

 format_number udf should take user specified format as argument
 --

 Key: HIVE-5370
 URL: https://issues.apache.org/jira/browse/HIVE-5370
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
Priority: Minor
 Fix For: 0.14.0

 Attachments: D13185.1.patch, D13185.2.patch, HIVE-5370.patch, 
 HIVE-5370.patch


 Currently, format_number udf formats the number to #,###,###.##, but it 
 should also take a user specified format as optional input.
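To illustrate the requested behaviour, here is a small sketch in which the second argument is either a decimal-place count or a caller-supplied format. It is written in Python and uses Python format specs as a stand-in for whatever pattern syntax the UDF would accept, so the format strings shown are assumptions, not the UDF's syntax.

```python
from typing import Union


def format_number(value: float, fmt: Union[int, str]) -> str:
    """Format `value` with a decimal-place count or a caller-supplied format.

    An int keeps a fixed grouped style with that many decimal places. A str is
    treated as a Python format spec (for example ".1f"), standing in for a
    user-specified pattern.
    """
    if isinstance(fmt, int):
        return f"{value:,.{fmt}f}"
    return format(value, fmt)


grouped = format_number(1234567.891, 2)      # fixed grouped style
custom = format_number(1234567.891, ".1f")   # user-specified format, no grouping
```

Dispatching on the argument type keeps existing calls that pass a decimal-place count working while allowing the optional format.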





[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler

2014-05-14 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992856#comment-13992856
 ] 

Swarnim Kulkarni commented on HIVE-6411:


RB updated.

 Support more generic way of using composite key for HBaseHandler
 

 Key: HIVE-6411
 URL: https://issues.apache.org/jira/browse/HIVE-6411
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6411.1.patch.txt, HIVE-6411.10.patch.txt, 
 HIVE-6411.11.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt, 
 HIVE-6411.4.patch.txt, HIVE-6411.5.patch.txt, HIVE-6411.6.patch.txt, 
 HIVE-6411.7.patch.txt, HIVE-6411.8.patch.txt, HIVE-6411.9.patch.txt


 HIVE-2599 introduced the use of a custom object for the row key. But it forces 
 key objects to extend HBaseCompositeKey, which is itself an extension of 
 LazyStruct. If the user provides a proper Object and OI, we can replace the 
 internal key and keyOI with those. 
 The initial implementation is based on a factory interface.
 {code}
 public interface HBaseKeyFactory {
   void init(SerDeParameters parameters, Properties properties) throws SerDeException;
   ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
   LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
 }
 {code}





[jira] [Updated] (HIVE-5342) Remove pre hadoop-0.20.0 related codes

2014-05-14 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-5342:
-

Attachment: HIVE-5342.2.patch

The test report link didn't look like it was for this patch.  Uploading patch 
again.

 Remove pre hadoop-0.20.0 related codes
 --

 Key: HIVE-5342
 URL: https://issues.apache.org/jira/browse/HIVE-5342
 Project: Hive
  Issue Type: Task
Reporter: Navis
Assignee: Jason Dere
Priority: Trivial
 Attachments: D13047.1.patch, HIVE-5342.1.patch, HIVE-5342.2.patch, 
 HIVE-5342.2.patch


 Recently, we discussed dropping support for hadoop-0.20.0. Whether or not that 
 happens, the 0.17-related code should be removed first.





[jira] [Commented] (HIVE-6846) allow safe set commands with sql standard authorization

2014-05-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993106#comment-13993106
 ] 

Thejas M Nair commented on HIVE-6846:
-

That failure does not indicate a product problem. In fact, there is no reason 
to set local scratch dirs to 777. That change was part of HIVE-5486. The idea 
is that in HS2, with doAs enabled, creation of files/subdirs under the scratch 
dir happens as the end user. But on the local file system this is not true: 
all file creation happens as the HS2 server user.
Vaibhav and Vikram have been working on changes to create the base scratch dir 
as the actual user running the query. That will help address this issue. The 
test issue is already gone in trunk.

I don't think this should block 0.13.1 release.


 allow safe set commands with sql standard authorization
 ---

 Key: HIVE-6846
 URL: https://issues.apache.org/jira/browse/HIVE-6846
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.13.0

 Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch, HIVE-6846.3.patch


 HIVE-6827 disables all set commands when SQL standard authorization is turned 
 on, but not all set commands are unsafe. We should allow safe set commands.
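Allowing only safe set commands can be pictured as a whitelist check on the parameter name. The sketch below is illustrative only: the patterns in `SAFE_PATTERNS` are a hypothetical whitelist chosen for the example, not the list used by Hive, and `is_safe_set` is an assumed helper name.

```python
import re
from typing import Iterable

# Hypothetical whitelist for illustration only; not the list used by Hive.
SAFE_PATTERNS = (
    r"hive\.exec\.reducers\..*",
    r"hive\.auto\.convert\.join",
    r"mapred\.reduce\.tasks",
)


def is_safe_set(param: str, patterns: Iterable[str] = SAFE_PATTERNS) -> bool:
    """True if `param` fully matches one of the whitelisted patterns."""
    return any(re.fullmatch(p, param) for p in patterns)


allowed = is_safe_set("hive.exec.reducers.max")
blocked = is_safe_set("hive.security.authorization.manager")
```

Using a full match rather than a prefix match means that a whitelisted name does not accidentally admit longer, unrelated parameters that merely start with it.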





[jira] [Commented] (HIVE-5664) Drop cascade database fails when the db has any tables with indexes

2014-05-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994148#comment-13994148
 ] 

Hive QA commented on HIVE-5664:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12610898/HIVE-5664.1.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5436 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_database_drop
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/157/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/157/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12610898

 Drop cascade database fails when the db has any tables with indexes
 ---

 Key: HIVE-5664
 URL: https://issues.apache.org/jira/browse/HIVE-5664
 Project: Hive
  Issue Type: Bug
  Components: Indexing, Metastore
Affects Versions: 0.10.0, 0.11.0, 0.12.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
 Fix For: 0.14.0

 Attachments: HIVE-5664.1.patch.txt


 {code}
 CREATE DATABASE db2; 
 USE db2; 
 CREATE TABLE tab1 (id int, name string); 
 CREATE INDEX idx1 ON TABLE tab1(id) as 'COMPACT' with DEFERRED REBUILD IN 
 TABLE tab1_indx; 
 DROP DATABASE db2 CASCADE;
 {code}
 Last DDL fails with the following error:
 {code}
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. Database does not exist: db2
 Hive.log has following exception
 2013-10-27 20:46:16,629 ERROR exec.DDLTask (DDLTask.java:execute(434)) - 
 org.apache.hadoop.hive.ql.metadata.HiveException: Database does not exist: db2
 at 
 org.apache.hadoop.hive.ql.exec.DDLTask.dropDatabase(DDLTask.java:3473)
 at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:231)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1441)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1219)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1047)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:915)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:790)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:623)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
 Caused by: NoSuchObjectException(message:db2.tab1_indx table not found)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1376)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103)
 at com.sun.proxy.$Proxy7.get_table(Unknown Source)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:890)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:660)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:652)
 at 
 

[jira] [Commented] (HIVE-6900) HostUtil.getTaskLogUrl signature change causes compilation to fail

2014-05-14 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993250#comment-13993250
 ] 

Szehon Ho commented on HIVE-6900:
-

[~jdere] We were looking at this change and were wondering: why does the log 
talk about MR1 when this is Hadoop23Shims?  Isn't the else branch still MR2, 
just handling the local-mode case?  It seems that TaskServlet is in MR1.

 HostUtil.getTaskLogUrl signature change causes compilation to fail
 --

 Key: HIVE-6900
 URL: https://issues.apache.org/jira/browse/HIVE-6900
 Project: Hive
  Issue Type: Bug
  Components: Shims
Affects Versions: 0.13.0, 0.14.0
Reporter: Chris Drome
Assignee: Jason Dere
 Fix For: 0.14.0

 Attachments: HIVE-6900.1.patch.txt, HIVE-6900.2.patch


 The signature for HostUtil.getTaskLogUrl has changed between Hadoop-2.3 and 
 Hadoop-2.4.
 Code in 
 shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 
 works with Hadoop-2.3 method and causes compilation failure with Hadoop-2.4.





[jira] [Updated] (HIVE-6985) sql std auth - privileges grants to public role not being honored

2014-05-14 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6985:
---

Fix Version/s: 0.13.1

 sql std auth - privileges grants to public role not being honored
 -

 Key: HIVE-6985
 URL: https://issues.apache.org/jira/browse/HIVE-6985
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Priority: Critical
 Fix For: 0.14.0, 0.13.1

 Attachments: HIVE-6985.1.patch, HIVE-6985.2.patch, HIVE-6985.3.patch


 When a privilege is granted to the public role, the privilege is supposed to 
 apply to all users.
 However, the privilege check fails for users even if they have the public role 
 in their list of current roles.
 Note that the issue affects only the public role. Grants of privileges to other 
 roles are not affected.





[jira] [Commented] (HIVE-6826) Hive-tez has issues when different partitions work off of different input types

2014-05-14 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13992672#comment-13992672
 ] 

Sushanth Sowmyan commented on HIVE-6826:


Yup, on re-testing, it seems to pass on my setup as well. I'm going to ignore 
the initial failure report. Thanks for checking up on it!

 Hive-tez has issues when different partitions work off of different input 
 types
 ---

 Key: HIVE-6826
 URL: https://issues.apache.org/jira/browse/HIVE-6826
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 0.13.0, 0.14.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.14.0

 Attachments: HIVE-6826.1.patch, HIVE-6826.2.patch


 create table test (key int, value string) partitioned by (p int) stored as 
 textfile;
 insert into table test partition (p=1) select * from src limit 10;
 alter table test set fileformat orc;
 insert into table test partition (p=2) select * from src limit 10;
 describe test;
 select * from test where p=1 and key  0;
 select * from test where p=2 and key  0;
 select * from test where key  0;
 throws a classcast exception





[jira] [Commented] (HIVE-6910) Invalid column access info for partitioned table

2014-05-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994033#comment-13994033
 ] 

Hive QA commented on HIVE-6910:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12643891/HIVE-6910.4.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5433 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/155/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/155/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12643891

 Invalid column access info for partitioned table
 

 Key: HIVE-6910
 URL: https://issues.apache.org/jira/browse/HIVE-6910
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0, 0.12.0, 0.13.0
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6910.1.patch.txt, HIVE-6910.2.patch.txt, 
 HIVE-6910.3.patch.txt, HIVE-6910.4.patch.txt


 From http://www.mail-archive.com/user@hive.apache.org/msg11324.html
 neededColumnIDs in TS is only for non-partition columns. But 
 ColumnAccessAnalyzer is calculating it on all columns.





[jira] [Commented] (HIVE-7030) Remove hive.hadoop.classpath from hiveserver2.cmd

2014-05-14 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993815#comment-13993815
 ] 

Vaibhav Gumashta commented on HIVE-7030:


Committed to trunk.

Thanks for the contribution [~hsubramaniyan]! Thanks for pointing out the issue 
[~leftylev]!

 Remove hive.hadoop.classpath from hiveserver2.cmd
 -

 Key: HIVE-7030
 URL: https://issues.apache.org/jira/browse/HIVE-7030
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.14.0
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 0.14.0

 Attachments: HIVE-7030.1.patch


 This parameter is not used anywhere and should be removed.





[jira] [Commented] (HIVE-5315) Cannot attach debugger to Hiveserver2

2014-05-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13994197#comment-13994197
 ] 

Hive QA commented on HIVE-5315:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12605156/HIVE-5315.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5436 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/161/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/161/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12605156

 Cannot attach debugger to Hiveserver2 
 --

 Key: HIVE-5315
 URL: https://issues.apache.org/jira/browse/HIVE-5315
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta
 Fix For: 0.14.0

 Attachments: HIVE-5315.patch


 In the current implementation, bin/hive retrieves HADOOP_VERSION as follows:
 {code}
 HADOOP_VERSION=$($HADOOP version | awk '{if (NR == 1) {print $2;}}');
 {code}
 But sometimes hadoop version does not show the version information on the 
 first line.
 If HADOOP_VERSION is not retrieved correctly, Hive and related processes will 
 not start.
 I hit this situation when I tried to debug Hiveserver2 with a debug option 
 such as the following:
 {code}
 -Xdebug -Xrunjdwp:transport=dt_socket,suspend=n,server=y,address=9876
 {code}
 Then hadoop version shows -Xdebug... on the first line.
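A more robust way to extract the version is to search for the line that looks like a version line instead of assuming it is the first one. The sketch below shows the idea in Python and is not the attached patch. The function `parse_hadoop_version` is an assumed name and the sample output, including the debug banner on its first line, is hypothetical.

```python
import re
from typing import Optional

VERSION_LINE = re.compile(r"^Hadoop\s+(\S+)\s*$")


def parse_hadoop_version(output: str) -> Optional[str]:
    """Return the version from the first line shaped like 'Hadoop <version>'."""
    for line in output.splitlines():
        match = VERSION_LINE.match(line.strip())
        if match:
            return match.group(1)
    return None


# Hypothetical output: a JVM debug banner precedes the version line.
sample = (
    "Listening for transport dt_socket at address: 9876\n"
    "Hadoop 2.4.0\n"
    "Subversion http://svn.apache.org/repos/asf/hadoop/common -r 1583262\n"
)
version = parse_hadoop_version(sample)
```

For this sample, taking the second field of the first line would yield the word "for", while the pattern search still finds the version. The same idea can be expressed in awk by matching on a line that starts with "Hadoop" instead of testing NR == 1.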





[jira] [Updated] (HIVE-4867) Deduplicate columns appearing in both the key list and value list of ReduceSinkOperator

2014-05-14 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4867:


Status: Patch Available  (was: Open)

 Deduplicate columns appearing in both the key list and value list of 
 ReduceSinkOperator
 ---

 Key: HIVE-4867
 URL: https://issues.apache.org/jira/browse/HIVE-4867
 Project: Hive
  Issue Type: Improvement
Reporter: Yin Huai
Assignee: Yin Huai
 Attachments: HIVE-4867.1.patch.txt


 A ReduceSinkOperator emits data in the format of keys and values. Right now, 
 a column may appear in both the key list and the value list, which results in 
 unnecessary overhead for shuffling. 
 Example:
 We have a query shown below ...
 {code:sql}
 explain select ss_ticket_number from store_sales cluster by ss_ticket_number;
 {code}
 The plan is ...
 {code}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
     Map Reduce
       Alias -> Map Operator Tree:
         store_sales 
           TableScan
             alias: store_sales
             Select Operator
               expressions:
                     expr: ss_ticket_number
                     type: int
               outputColumnNames: _col0
               Reduce Output Operator
                 key expressions:
                       expr: _col0
                       type: int
                 sort order: +
                 Map-reduce partition columns:
                       expr: _col0
                       type: int
                 tag: -1
                 value expressions:
                       expr: _col0
                       type: int
       Reduce Operator Tree:
         Extract
           File Output Operator
             compressed: false
             GlobalTableId: 0
             table:
                 input format: org.apache.hadoop.mapred.TextInputFormat
                 output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
   Stage: Stage-0
     Fetch Operator
       limit: -1
 {code}
 The column 'ss_ticket_number' is in both the key list and the value list of 
 the ReduceSinkOperator. The type of ss_ticket_number is int. In this case, 
 BinarySortableSerDe will introduce 1 more byte for every int in the key. 
 LazyBinarySerDe will also introduce overhead when recording the length of an 
 int. For every int, 10 bytes should be a rough estimate of the size of the 
 data emitted from the Map phase. 
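The deduplication can be sketched as follows: drop every value column that already appears in the key list and remember where the reduce side can find each original column. This is an illustration in Python, not the attached patch. The function `dedupe_value_columns` and the "KEY"/"VALUE" tags are hypothetical names.

```python
from typing import Dict, List, Tuple


def dedupe_value_columns(
    keys: List[str], values: List[str]
) -> Tuple[List[str], Dict[str, Tuple[str, int]]]:
    """Drop value columns that already appear in the key list.

    Returns the trimmed value list and, for every original value column, where
    the reducer can find it: ("KEY", i) or ("VALUE", j).
    """
    key_pos = {col: i for i, col in enumerate(keys)}
    trimmed: List[str] = []
    source: Dict[str, Tuple[str, int]] = {}
    for col in values:
        if col in key_pos:
            # Already shuffled as part of the key; do not emit it again.
            source[col] = ("KEY", key_pos[col])
        elif col not in source:
            source[col] = ("VALUE", len(trimmed))
            trimmed.append(col)
    return trimmed, source


# _col0 is both a key and a value column, as in the plan above.
trimmed, source = dedupe_value_columns(["_col0"], ["_col0", "_col1"])
```

In this example only `_col1` is left in the value list, and the mapping records that `_col0` has to be read back from the key on the reduce side.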





[jira] [Updated] (HIVE-7043) When using the tez session pool via hive, once sessions time out, all queries go to the default queue

2014-05-14 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7043:
-

Status: Patch Available  (was: Open)

 When using the tez session pool via hive, once sessions time out, all queries 
 go to the default queue
 -

 Key: HIVE-7043
 URL: https://issues.apache.org/jira/browse/HIVE-7043
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Fix For: 0.14.0

 Attachments: HIVE-7043.1.patch


 When using a tez session pool to run multiple queries, once the sessions time 
 out, we always end up using the default queue to launch queries. The load 
 balancing doesn't work in this case.





[jira] [Updated] (HIVE-2365) SQL support for bulk load into HBase

2014-05-14 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2365:


Fix Version/s: (was: 0.13.0)
   0.14.0

 SQL support for bulk load into HBase
 

 Key: HIVE-2365
 URL: https://issues.apache.org/jira/browse/HIVE-2365
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: John Sichi
Assignee: Nick Dimiduk
 Fix For: 0.14.0

 Attachments: HIVE-2365.2.patch.txt, HIVE-2365.WIP.00.patch, 
 HIVE-2365.WIP.01.patch, HIVE-2365.WIP.01.patch


 Support the "as simple as this" SQL for bulk load from Hive into HBase.





[jira] [Updated] (HIVE-5150) UnsatisfiedLinkError when running hive unit tests on Windows

2014-05-14 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5150:


Fix Version/s: (was: 0.13.0)
   0.14.0

 UnsatisfiedLinkError when running hive unit tests on Windows
 

 Key: HIVE-5150
 URL: https://issues.apache.org/jira/browse/HIVE-5150
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.12.0
 Environment: Windows
Reporter: shanyu zhao
Assignee: shanyu zhao
 Fix For: 0.14.0

 Attachments: HIVE-5150.patch


 When running any hive unit test against hadoop 2.0, it fails with an error 
 like this:
 [junit] Exception in thread main java.lang.UnsatisfiedLinkError: 
 org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
 [junit]   at 
 org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
 [junit]   at 
 org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:423)
 [junit]   at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:933)
 [junit]   at 
 org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:177)
 [junit]   at 
 org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:164)
 This happens because the test process fails to find hadoop.dll. It is related 
 to YARN-729.





[jira] [Updated] (HIVE-7000) Several issues with javadoc generation

2014-05-14 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7000:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Harish!

 Several issues with javadoc generation
 --

 Key: HIVE-7000
 URL: https://issues.apache.org/jira/browse/HIVE-7000
 Project: Hive
  Issue Type: Improvement
Reporter: Harish Butani
Assignee: Harish Butani
 Fix For: 0.14.0

 Attachments: HIVE-7000.1.patch


 1.
 Ran 'mvn javadoc:javadoc -Phadoop-2' and encountered several issues:
 - Generated classes are included in the javadoc.
 - Generation fails in the top-level hcatalog folder because its src folder 
 contains no java files.
 A patch is attached to fix these issues.
 2.
 Tried 'mvn javadoc:aggregate -Phadoop-2':
 - Cannot get an aggregated javadoc for all of hive.
 - Tried setting the 'aggregate' parameter to true. It didn't work.
 There are several questions on StackOverflow about multi-project javadoc. 
 This seems to be broken. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-6549) remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh

2014-05-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-6549:
-

Attachment: HIVE-6549.2.patch

Addressed [~thejas]'s comments.

 remove templeton.jar from webhcat-default.xml, remove 
 hcatalog/bin/hive-config.sh
 -

 Key: HIVE-6549
 URL: https://issues.apache.org/jira/browse/HIVE-6549
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.12.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
Priority: Minor
 Attachments: HIVE-6549.2.patch, HIVE-6549.patch


 this property is no longer used
 also removed corresponding AppConfig.TEMPLETON_JAR_NAME
 hcatalog/bin/hive-config.sh is not used
 NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-5810) create a function add_date as exists in mysql

2014-05-14 Thread Anandha L Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997272#comment-13997272
 ] 

Anandha L Ranganathan commented on HIVE-5810:
-

The existing DATE_ADD function does not support day, month, and year as the 
unit. Its unit is always days. 

This function, however, supports DAY, MONTH, and YEAR:
ADD_DATE(date, expr unit, INTERVAL).

 create a function add_date   as exists in mysql 
 

 Key: HIVE-5810
 URL: https://issues.apache.org/jira/browse/HIVE-5810
 Project: Hive
  Issue Type: Improvement
Reporter: Anandha L Ranganathan
Assignee: Anandha L Ranganathan
 Attachments: HIVE-5810.2.patch, HIVE-5810.patch

   Original Estimate: 40h
  Remaining Estimate: 40h

 MySQL has ADDDATE(date,INTERVAL expr unit).
 Similarly, in Hive we can have add_date(date,unit,expr). 
 Here unit is DAY/Month/Year.
 For example,
 add_date('2013-11-09','DAY',2) will return 2013-11-11.
 add_date('2013-11-09','Month',2) will return 2014-01-09.
 add_date('2013-11-09','Year',2) will return 2015-11-09.
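The intended arithmetic can be sketched in a few lines. This is an illustration in Python of the (date, unit, expr) signature, not the UDF from the patch. Clamping the day of month when the target month is shorter is an assumption of this sketch. Under ordinary calendar arithmetic, adding two years to 2013-11-09 gives 2015-11-09.

```python
import calendar
from datetime import date, timedelta


def add_date(day: str, unit: str, amount: int) -> str:
    """Add `amount` DAY/MONTH/YEAR units to an ISO date string."""
    d = date.fromisoformat(day)
    unit = unit.upper()
    if unit == "DAY":
        return (d + timedelta(days=amount)).isoformat()
    if unit == "MONTH":
        months = d.month - 1 + amount
    elif unit == "YEAR":
        months = d.month - 1 + 12 * amount
    else:
        raise ValueError(f"unsupported unit: {unit}")
    year = d.year + months // 12
    month = months % 12 + 1
    # Clamp the day so that e.g. Jan 31 + 1 month lands on the last day of Feb.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day)).isoformat()


plus_days = add_date("2013-11-09", "DAY", 2)
plus_months = add_date("2013-11-09", "Month", 2)
plus_years = add_date("2013-11-09", "Year", 2)
```

Treating a year as twelve months lets the MONTH and YEAR cases share one code path, including the end-of-month clamp.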





[jira] [Commented] (HIVE-6411) Support more generic way of using composite key for HBaseHandler

2014-05-14 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996823#comment-13996823
 ] 

Swarnim Kulkarni commented on HIVE-6411:


Thanks for the review, Xuefu. I think we can now mark HIVE-6290 as resolved as 
well, since the patch for it was included as part of this JIRA.

 Support more generic way of using composite key for HBaseHandler
 

 Key: HIVE-6411
 URL: https://issues.apache.org/jira/browse/HIVE-6411
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-6411.1.patch.txt, HIVE-6411.10.patch.txt, 
 HIVE-6411.11.patch.txt, HIVE-6411.2.patch.txt, HIVE-6411.3.patch.txt, 
 HIVE-6411.4.patch.txt, HIVE-6411.5.patch.txt, HIVE-6411.6.patch.txt, 
 HIVE-6411.7.patch.txt, HIVE-6411.8.patch.txt, HIVE-6411.9.patch.txt


 HIVE-2599 introduced the use of a custom object for the row key. But it forces 
 key objects to extend HBaseCompositeKey, which is itself an extension of 
 LazyStruct. If the user provides a proper Object and OI, we can replace the 
 internal key and keyOI with those. 
 The initial implementation is based on a factory interface.
 {code}
 public interface HBaseKeyFactory {
   void init(SerDeParameters parameters, Properties properties) throws SerDeException;
   ObjectInspector createObjectInspector(TypeInfo type) throws SerDeException;
   LazyObjectBase createObject(ObjectInspector inspector) throws SerDeException;
 }
 {code}





[jira] [Updated] (HIVE-7056) WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13

2014-05-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7056:
-

Summary: WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13  
(was: TestPig_11 fails with Pig 12.1 and earlier)

 WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13
 ---

 Key: HIVE-7056
 URL: https://issues.apache.org/jira/browse/HIVE-7056
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman

 On trunk, the pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) 
 looks for *hcatalog-core-*.jar etc.  In Pig 12.1 it looks for 
 hcatalog-core-*.jar, which doesn't work with Hive 0.13.
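
 The difference between the two patterns can be sketched in Java with the 
 standard glob matcher. This is only an illustration: the jar name 
 hive-hcatalog-core-0.13.0.jar is an assumption about how the Hive 0.13 
 artifact is named, and the class HCatJarGlob is hypothetical (the pig script 
 itself does this matching in shell, not Java).

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

class HCatJarGlob {
    // Returns true if fileName matches the shell-style glob pattern.
    static boolean matches(String glob, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        // Assumed jar name for Hive 0.13; not taken from the report.
        String jar = "hive-hcatalog-core-0.13.0.jar";

        // Pattern without a leading wildcard: the "hive-" prefix is not covered.
        System.out.println(matches("hcatalog-core-*.jar", jar));

        // Pattern with a leading wildcard: the prefix is absorbed by the first '*'.
        System.out.println(matches("*hcatalog-core-*.jar", jar));
    }
}
```

 Under that naming assumption, only the pattern with the leading wildcard finds 
 the jar.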
 The TestPig_11 job fails with
 {noformat}
 2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception 
 during parsing: Error during parsing. Could not resolve 
 org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
 org.apache.pig.builtin., org.apache.pig.impl.builtin.]
 Failed to parse: Pig script failed to parse: 
 file hcatloadstore.pig, line 19, column 34 pig script failed to validate: 
 org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
 resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
 org.apache.pig.builtin., org.apache.pig.impl.builtin.]
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196)
   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678)
   at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411)
   at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344)
   at org.apache.pig.PigServer.executeBatch(PigServer.java:369)
   at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
   at 
 org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
   at 
 org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
   at org.apache.pig.Main.run(Main.java:478)
   at org.apache.pig.Main.main(Main.java:156)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 Caused by: 
 file hcatloadstore.pig, line 19, column 34 pig script failed to validate: 
 org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not 
 resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., 
 org.apache.pig.builtin., org.apache.pig.impl.builtin.]
   at 
 org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299)
   at 
 org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
   at 
 org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
   at 
 org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188)
   ... 16 more
 Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: 
 Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, 
 java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
   at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653)
   at 
 org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296)
   ... 24 more
 {noformat}
 The key to this is:
 {noformat}
 ls: 
 /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar:
  No such file or directory
 ls: 
 /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar:
  

[jira] [Updated] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found

2014-05-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-7035:
-

Status: Patch Available  (was: Open)

 Templeton returns 500 for user errors - when job cannot be found
 

 Key: HIVE-7035
 URL: https://issues.apache.org/jira/browse/HIVE-7035
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 0.13.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-7035.patch


 curl -i 
 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman'
  should return HTTP Status code 4xx when no such job exists; it currently 
 returns 500.
 {noformat}
 {error:org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_201304291205_0015' doesn't exist in RM.\r\n\tat 
 org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)\r\n\tat 
 org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)\r\n\tat 
 org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\r\n\tat 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\r\n\tat 
 org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\r\n\tat 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)\r\n\tat 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\r\n\tat 
 java.security.AccessController.doPrivileged(Native Method)\r\n\tat 
 javax.security.auth.Subject.doAs(Subject.java:415)\r\n\tat 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\r\n\tat 
 org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\r\n}
 {noformat}
 NO PRECOMMIT TESTS
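
 One way to get a 4xx for this case is to inspect the cause chain of the 
 failure before choosing a status code. The sketch below is only an 
 illustration of that idea and is not taken from HIVE-7035.patch; 
 JobNotFoundException and ErrorStatusMapper are hypothetical names, not 
 WebHCat or Hadoop classes.

```java
// Hypothetical exception type standing in for "the requested job does not exist".
class JobNotFoundException extends RuntimeException {
    JobNotFoundException(String message, Throwable cause) {
        super(message, cause);
    }
}

class ErrorStatusMapper {
    // Walks the cause chain and picks an HTTP status for the first
    // recognizable user error; anything else remains a server error.
    static int toHttpStatus(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof JobNotFoundException) {
                return 404; // user asked for a job that does not exist
            }
            if (t instanceof IllegalArgumentException) {
                return 400; // malformed request
            }
        }
        return 500; // genuine server-side failure
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException("wrapper",
                new JobNotFoundException("no such job", null));
        System.out.println(toHttpStatus(wrapped));
        System.out.println(toHttpStatus(new IllegalStateException("unexpected")));
    }
}
```

 Checking the whole chain matters here because the not-found condition 
 typically arrives wrapped in another exception, as in the trace above.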





[jira] [Commented] (HIVE-6965) Transaction manager should use RDBMS time instead of machine time

2014-05-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998111#comment-13998111
 ] 

Ashutosh Chauhan commented on HIVE-6965:


+1

 Transaction manager should use RDBMS time instead of machine time
 -

 Key: HIVE-6965
 URL: https://issues.apache.org/jira/browse/HIVE-6965
 Project: Hive
  Issue Type: Bug
  Components: Locking
Affects Versions: 0.13.0
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-6965.patch


 Currently, TxnHandler and CompactionTxnHandler use System.currentTimeMillis() 
 when they need to determine the time (for example, when heartbeating 
 transactions).  In situations where there are multiple Thrift metastore 
 services, or where users are using an embedded metastore, this will lead to 
 issues, because the clocks on different machines can disagree.  We should 
 instead use time from the RDBMS, which is guaranteed to be the same for all 
 users.
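
 The effect of clock skew on a heartbeat timeout can be sketched as follows. 
 This is a simulation under stated assumptions, not the TxnHandler code: 
 HeartbeatDemo, TIMEOUT_MS, and the three clocks are hypothetical, and the 
 shared clock merely stands in for a timestamp that would be read from the 
 RDBMS.

```java
import java.util.function.LongSupplier;

class HeartbeatDemo {
    // Hypothetical timeout: a transaction is aborted after 5 minutes without a heartbeat.
    static final long TIMEOUT_MS = 300_000L;

    // A transaction is timed out when the checker's notion of "now" is more
    // than TIMEOUT_MS past the recorded heartbeat.
    static boolean timedOut(long lastHeartbeatMs, LongSupplier now) {
        return now.getAsLong() - lastHeartbeatMs > TIMEOUT_MS;
    }

    public static void main(String[] args) {
        long trueTime = 1_000_000_000L;

        LongSupplier metastoreA = () -> trueTime;            // local clock, accurate
        LongSupplier metastoreB = () -> trueTime + 600_000L; // local clock, 10 minutes ahead
        LongSupplier sharedClock = () -> trueTime;           // stand-in for RDBMS time

        // Heartbeat stamped by A, checked by B: skew alone makes it look expired.
        long stampedByA = metastoreA.getAsLong();
        System.out.println(timedOut(stampedByA, metastoreB));

        // Heartbeat stamped and checked against the same shared clock: still alive.
        long stampedByShared = sharedClock.getAsLong();
        System.out.println(timedOut(stampedByShared, sharedClock));
    }
}
```

 With a single time source, the comparison no longer depends on which 
 metastore instance wrote the heartbeat and which one checks it.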





[jira] [Created] (HIVE-7066) hive-exec jar is missing avro-mapred

2014-05-14 Thread David Chen (JIRA)
David Chen created HIVE-7066:


 Summary: hive-exec jar is missing avro-mapred
 Key: HIVE-7066
 URL: https://issues.apache.org/jira/browse/HIVE-7066
 Project: Hive
  Issue Type: Bug
Reporter: David Chen


Running a simple query that reads an Avro table caused the following exception 
to be thrown on the cluster side:

{code}
java.lang.RuntimeException: 
org.apache.hive.com.esotericsoftware.kryo.KryoException: 
java.lang.IllegalArgumentException: Unable to create serializer 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Serialization trace:
outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
at 
org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:365)
at 
org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:276)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:254)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:445)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:438)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:587)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:191)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:394)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hive.com.esotericsoftware.kryo.KryoException: 
java.lang.IllegalArgumentException: Unable to create serializer 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Serialization trace:
outputFileFormatClass (org.apache.hadoop.hive.ql.plan.PartitionDesc)
aliasToPartnInfo (org.apache.hadoop.hive.ql.plan.MapWork)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
at 
org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:942)
at 
org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:850)
at 
org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:864)
at 
org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:334)
... 13 more
Caused by: java.lang.IllegalArgumentException: Unable to create serializer 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer for 
class: org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
at 
org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:45)
at 
org.apache.hive.com.esotericsoftware.kryo.factories.ReflectionSerializerFactory.makeSerializer(ReflectionSerializerFactory.java:26)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.newDefaultSerializer(Kryo.java:343)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:336)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.registerImplicit(DefaultClassResolver.java:56)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:476)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:148)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
at 

[jira] [Commented] (HIVE-6846) allow safe set commands with sql standard authorization

2014-05-14 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993126#comment-13993126
 ] 

Sushanth Sowmyan commented on HIVE-6846:


Awesome, good to hear. I will not consider this a blocker for 0.13.1.

Thanks!

 allow safe set commands with sql standard authorization
 ---

 Key: HIVE-6846
 URL: https://issues.apache.org/jira/browse/HIVE-6846
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 0.13.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.13.0

 Attachments: HIVE-6846.1.patch, HIVE-6846.2.patch, HIVE-6846.3.patch


 HIVE-6827 disables all set commands when SQL standard authorization is turned 
 on, but not all set commands are unsafe. We should allow safe set commands.





[jira] [Commented] (HIVE-7025) TTL on hive tables

2014-05-14 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992808#comment-13992808
 ] 

Edward Capriolo commented on HIVE-7025:
---

We do something similar; however, we also have the ability to delete partitions 
over a certain age. Hive already has a property inside every table called 
retention that we could consider using.

This code is a good first step, but I have one question: isn't this code rather 
racy? If we have multiple CLIs running threads, they could all be 
simultaneously deleting tables, and a CLI on a system with a misconfigured 
clock could potentially delete all the tables. I think if we do this, it should 
be a standalone piece. 

 TTL on hive tables
 --

 Key: HIVE-7025
 URL: https://issues.apache.org/jira/browse/HIVE-7025
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-7025.1.patch.txt


 Add self-destruction properties for temporary tables.





[jira] [Updated] (HIVE-5538) Turn on vectorization by default.

2014-05-14 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-5538:
---

Attachment: HIVE-5538.5.patch

 Turn on vectorization by default.
 -

 Key: HIVE-5538
 URL: https://issues.apache.org/jira/browse/HIVE-5538
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-5538.1.patch, HIVE-5538.2.patch, HIVE-5538.3.patch, 
 HIVE-5538.4.patch, HIVE-5538.5.patch


   Vectorization should be turned on by default, so that users don't have to 
 enable it explicitly. 
   The vectorization code validates each query and ensures that it falls back 
 to row mode if it is not supported on the vectorized code path. 


