[jira] [Commented] (HIVE-5132) Can't access to hwi due to "No Java compiler available"

2014-07-16 Thread Shengjun Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063336#comment-14063336
 ] 

Shengjun Xin commented on HIVE-5132:


We hit the same issue, and copying jasper-compiler-jdt.jar to $HIVE_HOME/lib 
resolves it. The error log is:
{code}
Unable to find a javac compiler;
com.sun.tools.javac.Main is not on the classpath.
Perhaps JAVA_HOME does not point to the JDK.
It is currently set to "/usr/java/jdk1.7.0_45/jre"
at org.apache.tools.ant.taskdefs.compilers.CompilerAdapterFactory.getCompiler(CompilerAdapterFactory.java:129)
at org.apache.tools.ant.taskdefs.Javac.findSupportedFileExtensions(Javac.java:979)
at org.apache.tools.ant.taskdefs.Javac.scanDir(Javac.java:956)
at org.apache.tools.ant.taskdefs.Javac.execute(Javac.java:927)
at org.apache.jasper.compiler.AntCompiler.generateClass(AntCompiler.java:220)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:298)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:277)
at org.apache.jasper.compiler.Compiler.compile(Compiler.java:265)
at org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:564)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:299)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:327)
at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126)
at org.mortbay.jetty.servlet.DefaultServlet.doGet(DefaultServlet.java:503)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.handler.RequestLogHandler.handle(RequestLogHandler.java:49)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
{code}
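Two workarounds follow from the log above: point JAVA_HOME at the JDK itself rather than its embedded jre/ directory, or drop an Eclipse JDT compiler jar into $HIVE_HOME/lib as described in the comment. The sketch below shows both; the jar's source path is hypothetical, so adjust it for your installation.

```shell
# Option 1: strip a trailing /jre so JAVA_HOME points at the JDK itself,
# which puts com.sun.tools.javac.Main on the classpath for Jasper.
JAVA_HOME="/usr/java/jdk1.7.0_45/jre"   # the broken setting from the log
JAVA_HOME="${JAVA_HOME%/jre}"           # strip the trailing /jre
echo "JAVA_HOME=$JAVA_HOME"

# Option 2: give Jasper the Eclipse JDT compiler so no JDK is needed at all.
# jasper-compiler-jdt.jar ships with Tomcat 5.x; this source path is hypothetical.
if [ -f /opt/tomcat/common/lib/jasper-compiler-jdt.jar ]; then
  cp /opt/tomcat/common/lib/jasper-compiler-jdt.jar "$HIVE_HOME/lib/"
fi

# Restart hwi afterwards:
# bin/hive --config "$HIVE_CONF_DIR" --service hwi
```

Either option should stop Jasper from falling back to the missing javac compiler when it compiles the hwi JSP pages.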

> Can't access to hwi due to "No Java compiler available"
> ---
>
> Key: HIVE-5132
> URL: https://issues.apache.org/jira/browse/HIVE-5132
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0, 0.11.0
> Environment: JDK1.6, hadoop 2.0.4-alpha
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Critical
> Fix For: 0.13.0
>
> Attachments: HIVE-5132-01.patch
>
>
> I want to use hwi to submit hive queries, but after starting hwi successfully, I 
> can't open its web page.
> I noticed that someone also hit the same issue in hive-0.10.
> Reproduce steps:
> --
> 1. start hwi
> bin/hive --config $HIVE_CONF_DIR --service hwi
> 2. Access http://:/hwi in a browser.
> The following error message appears:
> HTTP ERROR 500
> Problem accessing /hwi/. Reason: 
> No Java compiler available
> Caused by:
> java.lang.IllegalStateException: No Java compile

[jira] [Updated] (HIVE-6866) Hive server2 jdbc driver connection leak with namenode

2014-04-08 Thread Shengjun Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengjun Xin updated HIVE-6866:
---

Description: 
1. Set 'ipc.client.connection.maxidletime' to 360 in core-site.xml and 
start hive-server2.
2. Connect to hive server2 repeatedly in a 'while true' loop.
3. The number of TCP connections keeps growing until the process runs out of 
memory; it seems that hive server2 will not close the connections until they 
time out. The error message is as follows:
{code}
2014-03-18 23:30:36,873 ERROR ql.Driver (SessionState.java:printError(386)) - FAILED: RuntimeException java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
java.lang.RuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:190)
at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:231)
at org.apache.hadoop.hive.ql.Context.getMRTmpFileURI(Context.java:288)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1274)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1059)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8676)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:95)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:181)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:40)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:37)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:524)
at org.apache.hive.service.auth.TUGIContainingProcessor.process(TUGIContainingProcessor.java:37)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:761)
at org.apache.hadoop.ipc.Client.call(Client.java:1239)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:483)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2259)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.jav

[jira] [Updated] (HIVE-6866) Hive server2 jdbc driver connection leak with namenode

2014-04-08 Thread Shengjun Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengjun Xin updated HIVE-6866:
---

Description: 
1. Set 'ipc.client.connection.maxidletime' to 360 in core-site.xml and 
start hive-server2.
2. Connect to hive server2 repeatedly in a 'while true' loop.
3. It seems that hive server2 will not close the connections until they time 
out. The error message is as follows:
{code}
2014-03-18 23:30:36,873 ERROR ql.Driver (SessionState.java:printError(386)) - FAILED: RuntimeException java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
java.lang.RuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:190)
at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:231)
at org.apache.hadoop.hive.ql.Context.getMRTmpFileURI(Context.java:288)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1274)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1059)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8676)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:95)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:181)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:40)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:37)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:524)
at org.apache.hive.service.auth.TUGIContainingProcessor.process(TUGIContainingProcessor.java:37)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:761)
at org.apache.hadoop.ipc.Client.call(Client.java:1239)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:483)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2259)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2230)
at org.apache.hadoop.hdfs.DistributedFileSyst

[jira] [Created] (HIVE-6866) Hive server2 jdbc driver connection leak with namenode

2014-04-08 Thread Shengjun Xin (JIRA)
Shengjun Xin created HIVE-6866:
--

 Summary: Hive server2 jdbc driver connection leak with namenode
 Key: HIVE-6866
 URL: https://issues.apache.org/jira/browse/HIVE-6866
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Shengjun Xin


1. Set 'ipc.client.connection.maxidletime' to 360 in core-site.xml and 
start hive-server2.
2. Connect to hive server2 continuously.
3. It seems that hive server2 will not close the connections until they time 
out. The error message is as follows:
{code}
2014-03-18 23:30:36,873 ERROR ql.Driver (SessionState.java:printError(386)) - FAILED: RuntimeException java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
java.lang.RuntimeException: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:190)
at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:231)
at org.apache.hadoop.hive.ql.Context.getMRTmpFileURI(Context.java:288)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1274)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1059)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8676)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:278)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:433)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:95)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:181)
at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:148)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:203)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:40)
at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:37)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:524)
at org.apache.hive.service.auth.TUGIContainingProcessor.process(TUGIContainingProcessor.java:37)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "hdm1.hadoop.local/192.168.2.101"; destination host is: "hdm1.hadoop.local":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:761)
at org.apache.hadoop.ipc.Client.call(Client.java:1239)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at com.sun.proxy.$Proxy11.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:483)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java

[jira] [Commented] (HIVE-5850) Multiple table join error for avro

2013-11-26 Thread Shengjun Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833425#comment-13833425
 ] 

Shengjun Xin commented on HIVE-5850:


So could you please give me an example where the parent cannot get the latest 
schema?

> Multiple table join error for avro 
> ---
>
> Key: HIVE-5850
> URL: https://issues.apache.org/jira/browse/HIVE-5850
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Shengjun Xin
> Attachments: part.tar.gz, partsupp.tar.gz, schema.tar.gz
>
>
> Reproduce step:
> {code}
> -- Create table Part.
> CREATE EXTERNAL TABLE part
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION 'hdfs:///user/hadoop/tpc-h/data/part'
> TBLPROPERTIES 
> ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/part.avsc');
> -- Create table Part Supplier.
> CREATE EXTERNAL TABLE partsupp
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION 'hdfs:///user/hadoop/tpc-h/data/partsupp'
> TBLPROPERTIES 
> ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/partsupp.avsc');
> --- Query
> select * from partsupp ps join part p on ps.ps_partkey = p.p_partkey where 
> p.p_partkey=1;
> {code}
> {code}
> Error message is:
> Error: java.io.IOException: java.io.IOException: 
> org.apache.avro.AvroTypeException: Found {
>   "type" : "record",
>   "name" : "partsupp",
>   "namespace" : "com.gs.sdst.pl.avro.tpch",
>   "fields" : [ {
> "name" : "ps_partkey",
> "type" : "long"
>   }, {
> "name" : "ps_suppkey",
> "type" : "long"
>   }, {
> "name" : "ps_availqty",
> "type" : "long"
>   }, {
> "name" : "ps_supplycost",
> "type" : "double"
>   }, {
> "name" : "ps_comment",
> "type" : "string"
>   }, {
> "name" : "systimestamp",
> "type" : "long"
>   } ]
> }, expecting {
>   "type" : "record",
>   "name" : "part",
>   "namespace" : "com.gs.sdst.pl.avro.tpch",
>   "fields" : [ {
> "name" : "p_partkey",
> "type" : "long"
>   }, {
> "name" : "p_name",
> "type" : "string"
>   }, {
> "name" : "p_mfgr",
> "type" : "string"
>   }, {
> "name" : "p_brand",
> "type" : "string"
>   }, {
> "name" : "p_type",
> "type" : "string"
>   }, {
> "name" : "p_size",
> "type" : "int"
>   }, {
> "name" : "p_container",
> "type" : "string"
>   }, {
> "name" : "p_retailprice",
> "type" : "double"
>   }, {
> "name" : "p_comment",
> "type" : "string"
>   }, {
> "name" : "systimestamp",
> "type" : "long"
>   } ]
> }
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HIVE-5850) Multiple table join error for avro

2013-11-20 Thread Shengjun Xin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827445#comment-13827445
 ] 

Shengjun Xin commented on HIVE-5850:


This issue is caused by using the wrong schema when processing a split.

In the getSchema function of AvroGenericRecordReader.java, if a partition path 
is a string prefix of a split path, the reader uses that partition's schema to 
parse the split, but this is not always correct.

For example, the partition '/user/hadoop/tpc-h/data/part' is a prefix of 
'/user/hadoop/tpc-h/data/partsupp/good_2013-01_partsupp_tbl_0002.avro', but we 
cannot use the schema of '/user/hadoop/tpc-h/data/part' to parse 
'/user/hadoop/tpc-h/data/partsupp/good_2013-01_partsupp_tbl_0002.avro'.

In my opinion, we should only use a partition's schema to parse a split when 
the partition path is the parent directory of the split path. 
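The prefix-vs-parent distinction above can be sketched in a few lines. The method names here are illustrative, not Hive's actual code; Hive works with org.apache.hadoop.fs.Path, but plain strings show the same failure mode.

```java
public class SchemaPathCheck {
    // Buggy check: is the partition path a plain string prefix of the split?
    static boolean isStringPrefix(String partition, String split) {
        return split.startsWith(partition);
    }

    // Safer check: is the partition path an ancestor *directory* of the split,
    // i.e. does the split continue with a path separator after the partition?
    static boolean isParentDir(String partition, String split) {
        String dir = partition.endsWith("/") ? partition : partition + "/";
        return split.startsWith(dir);
    }

    public static void main(String[] args) {
        String part = "/user/hadoop/tpc-h/data/part";
        String split =
            "/user/hadoop/tpc-h/data/partsupp/good_2013-01_partsupp_tbl_0002.avro";
        // The string-prefix test wrongly matches, so the 'part' schema would be
        // used to parse a 'partsupp' file; the directory test rejects it.
        System.out.println(isStringPrefix(part, split)); // true  (wrong match)
        System.out.println(isParentDir(part, split));    // false (correct)
    }
}
```

With the directory check, only '/user/hadoop/tpc-h/data/partsupp' would match the partsupp split, which is the behavior the comment argues for.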

> Multiple table join error for avro 
> ---
>
> Key: HIVE-5850
> URL: https://issues.apache.org/jira/browse/HIVE-5850
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Shengjun Xin
> Attachments: part.tar.gz, partsupp.tar.gz, schema.tar.gz
>
>
> Reproduce step:
> {code}
> -- Create table Part.
> CREATE EXTERNAL TABLE part
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION 'hdfs:///user/hadoop/tpc-h/data/part'
> TBLPROPERTIES 
> ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/part.avsc');
> -- Create table Part Supplier.
> CREATE EXTERNAL TABLE partsupp
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION 'hdfs:///user/hadoop/tpc-h/data/partsupp'
> TBLPROPERTIES 
> ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/partsupp.avsc');
> --- Query
> select * from partsupp ps join part p on ps.ps_partkey = p.p_partkey where 
> p.p_partkey=1;
> {code}
> {code}
> Error message is:
> Error: java.io.IOException: java.io.IOException: 
> org.apache.avro.AvroTypeException: Found {
>   "type" : "record",
>   "name" : "partsupp",
>   "namespace" : "com.gs.sdst.pl.avro.tpch",
>   "fields" : [ {
> "name" : "ps_partkey",
> "type" : "long"
>   }, {
> "name" : "ps_suppkey",
> "type" : "long"
>   }, {
> "name" : "ps_availqty",
> "type" : "long"
>   }, {
> "name" : "ps_supplycost",
> "type" : "double"
>   }, {
> "name" : "ps_comment",
> "type" : "string"
>   }, {
> "name" : "systimestamp",
> "type" : "long"
>   } ]
> }, expecting {
>   "type" : "record",
>   "name" : "part",
>   "namespace" : "com.gs.sdst.pl.avro.tpch",
>   "fields" : [ {
> "name" : "p_partkey",
> "type" : "long"
>   }, {
> "name" : "p_name",
> "type" : "string"
>   }, {
> "name" : "p_mfgr",
> "type" : "string"
>   }, {
> "name" : "p_brand",
> "type" : "string"
>   }, {
> "name" : "p_type",
> "type" : "string"
>   }, {
> "name" : "p_size",
> "type" : "int"
>   }, {
> "name" : "p_container",
> "type" : "string"
>   }, {
> "name" : "p_retailprice",
> "type" : "double"
>   }, {
> "name" : "p_comment",
> "type" : "string"
>   }, {
> "name" : "systimestamp",
> "type" : "long"
>   } ]
> }
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
> {code}




[jira] [Updated] (HIVE-5850) Multiple table join error for avro

2013-11-19 Thread Shengjun Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengjun Xin updated HIVE-5850:
---

Attachment: schema.tar.gz
part.tar.gz
partsupp.tar.gz

> Multiple table join error for avro 
> ---
>
> Key: HIVE-5850
> URL: https://issues.apache.org/jira/browse/HIVE-5850
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0
>Reporter: Shengjun Xin
> Attachments: part.tar.gz, partsupp.tar.gz, schema.tar.gz
>
>
> Reproduce step:
> {code}
> -- Create table Part.
> CREATE EXTERNAL TABLE part
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION 'hdfs:///user/hadoop/tpc-h/data/part'
> TBLPROPERTIES 
> ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/part.avsc');
> -- Create table Part Supplier.
> CREATE EXTERNAL TABLE partsupp
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION 'hdfs:///user/hadoop/tpc-h/data/partsupp'
> TBLPROPERTIES 
> ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/partsupp.avsc');
> --- Query
> select * from partsupp ps join part p on ps.ps_partkey = p.p_partkey where 
> p.p_partkey=1;
> {code}
> {code}
> Error message is:
> Error: java.io.IOException: java.io.IOException: 
> org.apache.avro.AvroTypeException: Found {
>   "type" : "record",
>   "name" : "partsupp",
>   "namespace" : "com.gs.sdst.pl.avro.tpch",
>   "fields" : [ {
> "name" : "ps_partkey",
> "type" : "long"
>   }, {
> "name" : "ps_suppkey",
> "type" : "long"
>   }, {
> "name" : "ps_availqty",
> "type" : "long"
>   }, {
> "name" : "ps_supplycost",
> "type" : "double"
>   }, {
> "name" : "ps_comment",
> "type" : "string"
>   }, {
> "name" : "systimestamp",
> "type" : "long"
>   } ]
> }, expecting {
>   "type" : "record",
>   "name" : "part",
>   "namespace" : "com.gs.sdst.pl.avro.tpch",
>   "fields" : [ {
> "name" : "p_partkey",
> "type" : "long"
>   }, {
> "name" : "p_name",
> "type" : "string"
>   }, {
> "name" : "p_mfgr",
> "type" : "string"
>   }, {
> "name" : "p_brand",
> "type" : "string"
>   }, {
> "name" : "p_type",
> "type" : "string"
>   }, {
> "name" : "p_size",
> "type" : "int"
>   }, {
> "name" : "p_container",
> "type" : "string"
>   }, {
> "name" : "p_retailprice",
> "type" : "double"
>   }, {
> "name" : "p_comment",
> "type" : "string"
>   }, {
> "name" : "systimestamp",
> "type" : "long"
>   } ]
> }
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
> at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
> at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
> {code}





[jira] [Updated] (HIVE-5850) Multiple table join error for avro

2013-11-19 Thread Shengjun Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengjun Xin updated HIVE-5850:
---

Description: 
Reproduce step:
{code}
-- Create table Part.
CREATE EXTERNAL TABLE part
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hadoop/tpc-h/data/part'
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/part.avsc');

-- Create table Part Supplier.
CREATE EXTERNAL TABLE partsupp
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hadoop/tpc-h/data/partsupp'
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/hadoop/tpc-h/schema/partsupp.avsc');

-- Query
select * from partsupp ps join part p on ps.ps_partkey = p.p_partkey where p.p_partkey=1;
{code}
Error message is:
{code}
Error: java.io.IOException: java.io.IOException: org.apache.avro.AvroTypeException: Found {
  "type" : "record",
  "name" : "partsupp",
  "namespace" : "com.gs.sdst.pl.avro.tpch",
  "fields" : [ {
"name" : "ps_partkey",
"type" : "long"
  }, {
"name" : "ps_suppkey",
"type" : "long"
  }, {
"name" : "ps_availqty",
"type" : "long"
  }, {
"name" : "ps_supplycost",
"type" : "double"
  }, {
"name" : "ps_comment",
"type" : "string"
  }, {
"name" : "systimestamp",
"type" : "long"
  } ]
}, expecting {
  "type" : "record",
  "name" : "part",
  "namespace" : "com.gs.sdst.pl.avro.tpch",
  "fields" : [ {
"name" : "p_partkey",
"type" : "long"
  }, {
"name" : "p_name",
"type" : "string"
  }, {
"name" : "p_mfgr",
"type" : "string"
  }, {
"name" : "p_brand",
"type" : "string"
  }, {
"name" : "p_type",
"type" : "string"
  }, {
"name" : "p_size",
"type" : "int"
  }, {
"name" : "p_container",
"type" : "string"
  }, {
"name" : "p_retailprice",
"type" : "double"
  }, {
"name" : "p_comment",
"type" : "string"
  }, {
"name" : "systimestamp",
"type" : "long"
  } ]
}
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:218)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
{code}
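The exception itself shows the root cause: a split belonging to partsupp was handed to a reader that expected part's schema, and the two records have no resolvable fields in common. A stand-alone sketch (plain Python, schemas copied to name/type form from the error text above) that makes the mismatch explicit:

```python
import json

# Record schemas transcribed (names and types only) from the
# AvroTypeException in the error log above.
found = json.loads("""
{"type": "record", "name": "partsupp", "namespace": "com.gs.sdst.pl.avro.tpch",
 "fields": [{"name": "ps_partkey",    "type": "long"},
            {"name": "ps_suppkey",    "type": "long"},
            {"name": "ps_availqty",   "type": "long"},
            {"name": "ps_supplycost", "type": "double"},
            {"name": "ps_comment",    "type": "string"},
            {"name": "systimestamp",  "type": "long"}]}
""")
expected = json.loads("""
{"type": "record", "name": "part", "namespace": "com.gs.sdst.pl.avro.tpch",
 "fields": [{"name": "p_partkey",     "type": "long"},
            {"name": "p_name",        "type": "string"},
            {"name": "p_mfgr",        "type": "string"},
            {"name": "p_brand",       "type": "string"},
            {"name": "p_type",        "type": "string"},
            {"name": "p_size",        "type": "int"},
            {"name": "p_container",   "type": "string"},
            {"name": "p_retailprice", "type": "double"},
            {"name": "p_comment",     "type": "string"},
            {"name": "systimestamp",  "type": "long"}]}
""")

def field_names(schema):
    """Return the set of field names of an Avro record schema."""
    return {f["name"] for f in schema["fields"]}

# Avro schema resolution matches record fields by name; only "systimestamp"
# exists in both records, so a partsupp datum cannot be read as a part record.
common = field_names(found) & field_names(expected)
print(sorted(common))  # ['systimestamp']
```

A workaround sometimes suggested for this class of problem (an assumption, not something confirmed in this thread) is `set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;`, which stops splits from different tables being combined into one record reader.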


[jira] [Created] (HIVE-5850) Multiple table join error for avro

2013-11-19 Thread Shengjun Xin (JIRA)
Shengjun Xin created HIVE-5850:
--

 Summary: Multiple table join error for avro 
 Key: HIVE-5850
 URL: https://issues.apache.org/jira/browse/HIVE-5850
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Shengjun Xin






[jira] [Updated] (HIVE-5811) Multiple view alias join error

2013-11-12 Thread Shengjun Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengjun Xin updated HIVE-5811:
---

Description: 
Steps to reproduce:
{code}
-- Create HBase table.
create 'il_emp', {NAME => 'dt', VERSIONS => 1000}, {NAME => 'md', VERSIONS => 1000}

-- Populate data.
put 'il_emp', 'EMP001', 'dt:nm', 'Employee 001'
put 'il_emp', 'EMP001', 'dt:sl', '100'
put 'il_emp', 'EMP001', 'dt:dp', 'DEP01'

put 'il_emp', 'EMP002', 'dt:nm', 'Employee 002'
put 'il_emp', 'EMP002', 'dt:sl', '200'
put 'il_emp', 'EMP002', 'dt:dp', 'DEP01'

put 'il_emp', 'EMP003', 'dt:nm', 'Employee 003'
put 'il_emp', 'EMP003', 'dt:sl', '300'
put 'il_emp', 'EMP003', 'dt:dp', 'DEP02'

put 'il_emp', 'EMP004', 'dt:nm', 'Employee 004'
put 'il_emp', 'EMP004', 'dt:sl', '400'
put 'il_emp', 'EMP004', 'dt:dp', 'DEP02'


-- Create external Hive table referring to the HBase table.
CREATE EXTERNAL TABLE il_emp
(
key string, name string, salary bigint, department string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES("hbase.columns.mapping" = ":key,dt:nm,dt:sl,dt:dp")
TBLPROPERTIES ("hbase.table.name" = "il_emp");

-- Create hive view. (dept_min_salary)
CREATE VIEW il_dept_min_salary
(
department,
min_salary
)
AS
SELECT
department,
MIN(salary) AS min_salary
FROM
il_emp
GROUP BY
department;

-- Create hive view. (dept_max_salary)
CREATE VIEW il_dept_max_salary
(
department,
max_salary
)
AS
SELECT
department,
MAX(salary) AS max_salary
FROM
il_emp
GROUP BY
department;


set hive.auto.convert.join=true;

-- Two different views (JOIN).
SELECT
e.name,
e.salary,
e.department,
v1.min_salary AS mns,
v2.max_salary AS mxs
FROM
il_emp e
JOIN il_dept_min_salary v1 ON e.department = v1.department
JOIN il_dept_max_salary v2 ON e.department = v2.department;
{code}
Error log is:
{code}
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1377)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1381)
at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1381)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:616)
at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
... 8 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:186)
... 14 more
{code}
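The trace dies in MapJoinOperator.loadHashTable, a path that is only taken because the script enables map-join auto conversion. A hedged workaround (an assumption based on the trace, not confirmed in this issue) is to fall back to the ordinary shuffle join:

```sql
-- Hypothetical workaround, not verified in this thread: with auto
-- conversion off, Hive runs a common (shuffle) join and never enters
-- the MapJoinOperator.loadHashTable path that throws the NPE.
set hive.auto.convert.join=false;
```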

[jira] [Created] (HIVE-5811) Multiple view alias join error

2013-11-12 Thread Shengjun Xin (JIRA)
Shengjun Xin created HIVE-5811:
--

 Summary: Multiple view alias join error
 Key: HIVE-5811
 URL: https://issues.apache.org/jira/browse/HIVE-5811
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Shengjun Xin




