Re: Hive Druid SQL
Hi, thanks for checking. Our Druid installation has SQL Server as its metadata store, and it uses Azure for deep storage. It is working fine. That is why I was wondering why Hive would require MySQL or Postgres for Druid integration. I'm using Hive 2.2.0. Rebuilding Druid's metadata storage just for the Hive integration would be a big effort, I would guess.

On Tue, Apr 10, 2018 at 12:46 AM, Ashutosh Chauhan <hashut...@apache.org> wrote:
> Hi Amit,
>
> Yes, only MySQL and Postgres are supported for Druid metadata storage.
> That's because Druid only supports these. You mentioned that Hive and Druid
> are working independently. Which metadata storage is your Druid install using?
>
> Thanks,
> Ashutosh
>
> On Mon, Apr 9, 2018 at 7:39 PM, Lefty Leverenz <leftylever...@gmail.com> wrote:
>
>> > Does it mean I cannot use SQL Server as the Druid metastore for Hive to
>> work with Druid?
>>
>> Apparently so.
>>
>> - In Hive 2.2.0 *hive.druid.metadata.db.type* was introduced with
>>   values "mysql" and "postgres" (HIVE-15277
>>   <https://issues.apache.org/jira/browse/HIVE-15277>).
>> - In Hive 2.3.0 the value "postgres" was changed to "postgresql"
>>   (HIVE-15809 <https://issues.apache.org/jira/browse/HIVE-15809>).
>> - In Hive 3.0.0 (upcoming release) the value "derby" is added
>>   (HIVE-18196 <https://issues.apache.org/jira/browse/HIVE-18196>).
>>
>> -- Lefty
>>
>> On Fri, Apr 6, 2018 at 10:09 AM Amit <ami...@gmail.com> wrote:
>>
>>> Hive Druid integration:
>>> I have Hive and Druid working independently,
>>> but I am having trouble connecting the two together.
>>> I don't have Hortonworks.
>>>
>>> I have Druid using SQL Server as its metadata store database.
>>>
>>> When I try setting this property in Beeline,
>>>
>>> set hive.druid.metadata.db.type=sqlserver;
>>>
>>> I get a message:
>>> Error: Error while processing statement: 'SET
>>> hive.druid.metadata.db.type=sqlserver' FAILED
>>> in validation : Invalid value.. expects one of patterns [mysql, postgres].
>>> (state=42000,code=1)
>>>
>>> Does it mean I cannot use SQL Server as the Druid metastore for Hive to
>>> work with Druid?
Hive Druid SQL
Hive Druid integration: I have Hive and Druid working independently, but I am having trouble connecting the two together. I don't have Hortonworks.

I have Druid using SQL Server as its metadata store database.

When I try setting this property in Beeline:

set hive.druid.metadata.db.type=sqlserver;

I get the message:

Error: Error while processing statement: 'SET hive.druid.metadata.db.type=sqlserver' FAILED in validation : Invalid value.. expects one of patterns [mysql, postgres]. (state=42000,code=1)

Does it mean I cannot use SQL Server as the Druid metastore for Hive to work with Druid?
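For reference, the values this property accepts depend on the Hive release, per the JIRAs cited in the reply above; a minimal Beeline sketch of settings that pass validation:

-- Hive 2.2.0 validates against [mysql, postgres] (HIVE-15277):
set hive.druid.metadata.db.type=mysql;
set hive.druid.metadata.db.type=postgres;
-- Hive 2.3.0 renamed "postgres" to "postgresql" (HIVE-15809):
set hive.druid.metadata.db.type=postgresql;
-- Hive 3.0.0 adds "derby" (HIVE-18196); "sqlserver" is not an accepted
-- value in any of these releases.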
LLAP Query Failed with no such method exception
Hi,

I have configured Hadoop 2.7.3 and Hive 2.1.1 with LLAP. Tez queries are running fine, but after the LLAP daemon is launched using Slider, any insert or count(*) LLAP query throws an exception:

java.lang.Exception: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.hadoop.tracing.SpanReceiverHost.getInstance(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/tracing/SpanReceiverHost;
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:271)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69)
at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.hadoop.tracing.SpanReceiverHost.getInstance(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/tracing/SpanReceiverHost;
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:265)
... 12 more

I suppose it might be because of missing htrace configuration, but after adding the configuration below to core-site.xml, it is still throwing the same exception.

<property>
  <name>hadoop.htrace.spanreceiver.classes</name>
  <value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
</property>
<property>
  <name>hadoop.htrace.local-file-span-receiver.path</name>
  <value>/usr/local/hadoop/logs/htrace.out</value>
</property>

Thanks & Regards,
Amit Kumar, Scientist B,
Mob: 9910611621
Re: trouble starting hiveserver2 with hive2.1.1
Hi, when running "hive --service llap" on Hive 2.3, it throws the error below:

Failed: java.io.IOException: Target /tmp/staging-slider-hpJkzz/lib/tez is a directory
java.util.concurrent.ExecutionException: java.io.IOException: Target /tmp/staging-slider-hpJkzz/lib/tez is a directory
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.llap.cli.LlapServiceDriver.run(LlapServiceDriver.java:556)
at org.apache.hadoop.hive.llap.cli.LlapServiceDriver.main(LlapServiceDriver.java:116)
Caused by: java.io.IOException: Target /tmp/staging-slider-hpJkzz/lib/tez is a directory
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:500)
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:502)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:348)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1965)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1933)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1898)
at org.apache.hadoop.hive.llap.cli.LlapServiceDriver$3.call(LlapServiceDriver.java:450)
at org.apache.hadoop.hive.llap.cli.LlapServiceDriver$3.call(LlapServiceDriver.java:404)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
INFO cli.LlapServiceDriver: LLAP service driver finished

Thanks & Regards,
Amit Kumar,
Mob: 9910611621

On Sat, Jul 22, 2017 at 5:00 PM, Amit Kumar <delhiam...@gmail.com> wrote:
> Hi,
>
> I have installed Hadoop 2.7.2 and Hive 2.1.1, successfully configured
> MySQL as the metastore, and I am also able to connect to Hive using the Hive CLI.
>
> But on starting HiveServer2, an exception is thrown as below:
>
> [hadoop@master bin]$ hiveserver2
> which: no hbase in (/opt/hadoop/hive/bin:/opt/hadoop/hive/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/hadoop/.local/bin:/home/hadoop/bin:/usr/java/default/bin:/opt/hadoop/sbin:/opt/hadoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/slider/bin:/home/hadoop/.local/bin:/home/hadoop/bin:/usr/java/default/bin:/opt/hadoop/sbin:/opt/hadoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/slider/bin)
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hive.common.util.HiveStringUtils.startupShutdownMessage(Ljava/lang/Class;[Ljava/lang/String;Lorg/apache/commons/logging/Log;)V
>         at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:455)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
> Thanks & Regards,
> Amit Kumar,
> Mob: 9910611621
trouble starting hiveserver2 with hive2.1.1
Hi,

I have installed Hadoop 2.7.2 and Hive 2.1.1, successfully configured MySQL as the metastore, and I am also able to connect to Hive using the Hive CLI.

But on starting HiveServer2, an exception is thrown as below:

[hadoop@master bin]$ hiveserver2
which: no hbase in (/opt/hadoop/hive/bin:/opt/hadoop/hive/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/hadoop/.local/bin:/home/hadoop/bin:/usr/java/default/bin:/opt/hadoop/sbin:/opt/hadoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/slider/bin:/home/hadoop/.local/bin:/home/hadoop/bin:/usr/java/default/bin:/opt/hadoop/sbin:/opt/hadoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/slider/bin)
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hive.common.util.HiveStringUtils.startupShutdownMessage(Ljava/lang/Class;[Ljava/lang/String;Lorg/apache/commons/logging/Log;)V
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:455)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Thanks & Regards,
Amit Kumar,
Mob: 9910611621
Hive Insert query is failing
Hi,

A Hive query running on Tez to insert from one table to another is failing with the error below. Both tables use the ORC file format, and all the columns in both tables are string. Can anyone help with how this can be fixed?

Hive version: 1.2.1.2.4

Error message:
Vertex failed, vertexName=Map 1, vertexId=vertex_1483552897173_0276_1_00, diagnostics=[Task failed, taskId=task_1483552897173_0276_1_00_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.IOException: ORC does not support type conversion from VARCHAR to STRING
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Query:
insert into table tableB select col1, col2, col3, col4, col5, col6, col7, col8 from tableA

Thanks
Amit
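The exception is raised while reading tableA, which suggests the ORC files on disk still carry varchar column metadata even though both tables now declare string columns, for example after an ALTER TABLE ... CHANGE COLUMN on an existing table; that is an assumption, not something confirmed in this thread. A diagnostic sketch for comparing the table-level schema with the partition-level schemas, which can drift apart on this Hive version (the partition spec below is a placeholder):

-- Column types as the metastore currently sees the table:
describe formatted tableA;
-- Partitions keep their own column types, which may still be varchar
-- even after the table itself was altered to string:
describe formatted tableA partition (dt='2017-01-01');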
Reading hive-site.xml
Hi,

I am trying to understand how Hive reads its configuration from hive-site.xml. Where is the structure of the XML file defined, and which code is used to read hive-site.xml?

Thanks
Amit
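For what it's worth, hive-site.xml follows Hadoop's standard <configuration>/<property>/<name>/<value> format, and the parsing lives in org.apache.hadoop.conf.Configuration, which Hive's HiveConf class extends; a minimal sketch of loading it from client code (the property name below is only an illustration):

import org.apache.hadoop.hive.conf.HiveConf;

public class ReadHiveSite {
    public static void main(String[] args) {
        // HiveConf registers hive-site.xml as a default resource, so
        // constructing it loads the file from the classpath (HIVE_CONF_DIR).
        HiveConf conf = new HiveConf();
        // Read a property exactly as it is spelled in hive-site.xml.
        System.out.println("hive.metastore.uris = " + conf.get("hive.metastore.uris"));
    }
}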
RE: Error running SQL query through Hive JDBC
Below is the code snippet with the SQL query which I am running. The same query runs fine through the Hive CLI.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.hive.jdbc.HiveStatement;

String sql = " SELECT TBL_CODE FROM DB.CODE_MAP WHERE SYSTEM_NAME='TDS' AND TABLE_NAME=TRIM('XYZ')";
System.out.println("New SQL: " + sql);
String driverName = "org.apache.hive.jdbc.HiveDriver";
try {
    Class.forName(driverName);
    Connection con = DriverManager.getConnection("jdbc:hive2://hiveservername:1/default", "username", "");
    HiveStatement stmt = (HiveStatement) con.createStatement();
    ResultSet res = stmt.executeQuery(sql);
    while (res.next()) {
        Object ret_obj = res.getObject(1);
        System.out.println(res.getString(1));
    }
    stmt.close();
    con.close();
} catch (ClassNotFoundException e) {
    e.printStackTrace();
} catch (SQLException e) {
    e.printStackTrace();
}

From: Markovitz, Dudu [mailto:dmarkov...@paypal.com]
Sent: Friday, August 05, 2016 3:04 PM
To: user@hive.apache.org
Subject: RE: Error running SQL query through Hive JDBC

Can you please share the query?

From: Amit Bajpai [mailto:amit.baj...@flextronics.com]
Sent: Friday, August 05, 2016 10:40 PM
To: user@hive.apache.org
Subject: Error running SQL query through Hive JDBC

Hi,

I am getting the error below when running the SQL query through Hive JDBC. Can you suggest how to fix it?

org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException UDF = is not allowed
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392)
at com.flex.hdp.logs.test.main(test.java:84)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException UDF = is not allowed
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:314)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:111)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:180)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:256)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:376)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
at com.sun.proxy.$Proxy32.executeStatementAsync(Unknown Source)
at org.apache.hiv
Error running SQL query through Hive JDBC
Hi,

I am getting the error below when running the SQL query through Hive JDBC. Can you suggest how to fix it?

org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException UDF = is not allowed
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392)
at com.flex.hdp.logs.test.main(test.java:84)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException UDF = is not allowed
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:314)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:111)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:180)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:256)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:376)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
at com.sun.proxy.$Proxy32.executeStatementAsync(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:401)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.parse.SemanticException:UDF = is not allowed
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:677)
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:810)
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1152)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:189)
at
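The "SemanticException UDF = is not allowed" text matches the check HiveServer2 performs against its built-in UDF whitelist/blacklist, so one possible cause (an assumption, not confirmed anywhere in this thread) is that the server configuration blocks the "=" operator for JDBC sessions, while the Hive CLI, which does not go through HiveServer2, is unaffected. A sketch of the relevant hive-site.xml settings on the HiveServer2 host:

<!-- Hypothetical fragment: an overly narrow whitelist (or a blacklist
     entry) makes HiveServer2 reject built-in operators such as "="
     at compile time. An empty whitelist allows all built-in UDFs. -->
<property>
  <name>hive.server2.builtin.udf.whitelist</name>
  <value></value>
</property>
<property>
  <name>hive.server2.builtin.udf.blacklist</name>
  <value></value>
</property>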
RE: hive concurrency not working
You need to increase the value of the Hive property below in Ambari:

hive.server2.tez.sessions.per.default.queue

If this does not fix the issue, then you need to update the capacity scheduler property values.

From: Raj hadoop [mailto:raj.had...@gmail.com]
Sent: Wednesday, August 03, 2016 8:15 AM
To: user@hive.apache.org
Subject: hive concurrency not working

Dear All,

In need of your help: we have a Hortonworks 4-node cluster, and the problem is that Hive is allowing only one user at a time. If a second user needs to log in, Hive is not working. Could someone please help me with this?

Thanks,
Rajesh
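A sketch of the property named above as it would appear in hive-site.xml (Ambari exposes the same key under the Hive configs; the value 4 is only an illustration):

<property>
  <name>hive.server2.tez.sessions.per.default.queue</name>
  <!-- Number of Tez sessions HiveServer2 keeps per YARN queue; with the
       default of 1, concurrent users wait on a single session. -->
  <value>4</value>
</property>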
RE: Yarn Application ID for Hive query
I am running Hive on Tez. I am able to get the YARN application ID for the Hive query by submitting the query through Hive JDBC and using HiveStatement:

Connection con = DriverManager.getConnection("jdbc:hive2://abc:1/default", "xyz", "");
HiveStatement stmt = (HiveStatement) con.createStatement();
String sql = " SELECT COMP_ID, COUNT(1) FROM tableA GROUP BY COMP_ID ";
ResultSet res = stmt.executeQuery(sql);
String yarn_app_id = new String();
for (String log : stmt.getQueryLog()) {
    if (log.contains("App id")) {
        yarn_app_id = log.substring(log.indexOf("App id") + 7, log.length() - 1);
    }
}
System.out.println("YARN Application ID: " + yarn_app_id);

Now I am trying to find the Tez DAG ID for the query.

From: Gerber, Bryan W [mailto:bryan.ger...@pnnl.gov]
Sent: Monday, July 18, 2016 1:47 PM
To: user@hive.apache.org
Subject: RE: Yarn Application ID for Hive query

Making Hive look like a normal SQL database is the goal of libraries like this, so it makes sense that the abstraction wouldn't leak a concept like application ID, especially because not all Hive queries generate a YARN application.

That said, we went through this with JDBC access to Hive a while back to allow our user interface to cancel a query. The only relevant discussion I found was here:
http://grokbase.com/t/cloudera/hue-user/1373c258xg/how-hue-beeswax-is-able-to-read-the-hadoop-job-id-that-gets-generated-by-hiveserver2

We are using this method, plus a background task that polls the YARN ResourceManager API to find the job with the corresponding hive.session.id. It is a lot of work for something that seems very simple. It would be nice to have access to a command or API call in HiveServer2 similar to MySQL's "SHOW PROCESSLIST" (and equivalent commands in most other databases).

From: Amit Bajpai [mailto:amit.baj...@flextronics.com]
Sent: Thursday, July 14, 2016 10:22 PM
To: user@hive.apache.org
Subject: Yarn Application ID for Hive query

Hi,

I am using the Python program below to run a Hive query. How can I get the YARN application ID from the Python program for the Hive query execution?

import pyhs2

with pyhs2.connect(host='abc.sac.com',
                   port=1,
                   authMechanism="PLAIN",
                   user='amit',
                   password='amit',
                   database='default') as conn:
    with conn.cursor() as cur:
        # Execute query
        cur.execute("SELECT COMP_ID, COUNT(1) FROM tableA GROUP BY COMP_ID")
        # Fetch table results
        for i in cur.fetch():
            print i

Thanks
Amit
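On the remaining question: a Tez DAG ID is the application ID's two numeric components plus a per-session DAG counter (the vertex IDs elsewhere in this archive, e.g. vertex_1483552897173_0276_1_00, show the same structure). A hypothetical sketch, assuming the application ID has already been scraped from the query log as above and that the query was the first DAG submitted in its Tez session:

// Hypothetical: derive a Tez DAG ID string from a YARN application ID.
String appId = "application_1468888888888_0042"; // illustrative value from stmt.getQueryLog()
int dagIndex = 1; // assumption: first DAG submitted in this Tez session
String dagId = "dag_" + appId.substring("application_".length()) + "_" + dagIndex;
System.out.println(dagId); // prints dag_1468888888888_0042_1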
Yarn Application ID for Hive query
Hi,

I am using the Python program below to run a Hive query. How can I get the YARN application ID from the Python program for the Hive query execution?

import pyhs2

with pyhs2.connect(host='abc.sac.com',
                   port=1,
                   authMechanism="PLAIN",
                   user='amit',
                   password='amit',
                   database='default') as conn:
    with conn.cursor() as cur:
        # Execute query
        cur.execute("SELECT COMP_ID, COUNT(1) FROM tableA GROUP BY COMP_ID")
        # Fetch table results
        for i in cur.fetch():
            print i

Thanks
Amit
Date parsing Exception at Hadoop server
Hi Gurus,

I am facing a ParseException in my Hive UDF with the following code (just the evaluate method):

private final SimpleDateFormat sdf = new SimpleDateFormat("dd-MM-yyyy", Locale.US);

public Object evaluate(DeferredObject[] arguments) throws HiveException {
    String result = "0";
    assert (arguments.length == 1);
    List<Text> list = (List<Text>) this.listOI.getList(arguments[0].get());
    if (list == null) {
        return null;
    }
    System.out.println("- Size :" + list.get(0));
    if (list.size() > 0) {
        List<Date> listDates = new ArrayList<Date>();
        // result = compareDates(list);
        for (Text dateTxt : list) {
            try {
                System.out.println(dateTxt.toString());
                String dt = new String(dateTxt.toString().trim());
                Date transDate = sdf.parse(dt);
                listDates.add(transDate);
                System.out.println(listDates.size());
            } catch (ParseException e) {
                System.err.println(e.getMessage());
                e.printStackTrace();
            }
        }
        if (listDates.size() > 0) {
            Date resultDate = Collections.min(listDates);
            result = sdf.format(resultDate);
        }
    }
    return result;
}

The same code passes the test perfectly. The following is the test method:

public void testGetMinDate() throws HiveException {
    // set up the models we need
    GetMinDate example = new GetMinDate();
    ObjectInspector stringOI = PrimitiveObjectInspectorFactory.javaStringObjectInspector;
    ObjectInspector listOI = ObjectInspectorFactory.getStandardListObjectInspector(stringOI);
    StringObjectInspector resultInspector = (StringObjectInspector) example.initialize(new ObjectInspector[]{listOI});
    // create the actual UDF arguments
    List<Text> list = new ArrayList<Text>();
    list.add(new Text("01-01-2015"));
    list.add(new Text("01-03-2014"));
    list.add(new Text("04-01-2015"));
    // test our results
    // the value exists
    Object result = example.evaluate(new DeferredObject[]{new DeferredJavaObject(list)});
    System.out.println(result);
    Assert.assertEquals("01-03-2014", result);
    // the value doesn't exist
    // Object result2 = example.evaluate(new DeferredObject[]{new DeferredJavaObject(list)});
    // Assert.assertEquals("Success", result2);
    // arguments are null
    Object result3 = example.evaluate(new DeferredObject[]{new DeferredJavaObject(null)});
    Assert.assertNull(result3);
}

The following is the error:

java.text.ParseException: Unparseable date: "23-05-2015"
at java.text.DateFormat.parse(DateFormat.java:357)
at com.vzw.mct.GetMinDate.evaluate(GetMinDate.java:44)
at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1064)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:875)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:737)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:262)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

I am not sure why it is failing on the server. If anyone can kindly point it out, it will be great.

Thanks,
Amit
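One generic way to debug "parses locally but not on the cluster" (a diagnostic sketch, not a diagnosis of this exact failure; invisible characters or encoding differences in the real data, which hand-typed unit-test strings never contain, are a common culprit):

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class DateParseDebug {
    public static void main(String[] args) {
        SimpleDateFormat sdf = new SimpleDateFormat("dd-MM-yyyy", Locale.US);
        String dt = "23-05-2015"; // substitute the failing value from the cluster
        try {
            System.out.println(sdf.parse(dt));
        } catch (ParseException e) {
            // Dump length and code points to expose invisible characters.
            StringBuilder sb = new StringBuilder("unparseable, len=" + dt.length() + ":");
            for (char c : dt.toCharArray()) {
                sb.append(' ').append((int) c);
            }
            System.err.println(sb);
        }
    }
}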
Re: user matching query does not exist
I am using CDH 5.2.1. Any pointers will be of immense help.

Thanks

On Fri, May 15, 2015 at 9:43 AM, amit kumar <ak3...@gmail.com> wrote:

Hi,
After re-creating my account in Hue, I receive "User matching query does not exist" when attempting to perform a Hive query. The query succeeds in the Hive command line.
Please suggest on this.
Thank you,
Amit
Re: user matching query does not exist
Thank you Nitin. The user runs the query via the Hive command line and the query succeeds, e.g. a query like "select * from railway;". As per the link you provided, I ran the command ./manage.py clearsessions and I get the error.

On Fri, May 15, 2015 at 12:32 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:

This is related to django. See this on how to clear sessions from django:
http://www.opencsw.org/community/questions/289/how-to-clear-the-django-session-cache

On Fri, May 15, 2015 at 12:24 PM, amit kumar <ak3...@gmail.com> wrote:

Yes, it is happening for Hue only. Can you please suggest how I can clean up Hue sessions from the server? The query succeeds in the Hive command line.

On Fri, May 15, 2015 at 11:52 AM, Nitin Pawar <nitinpawar...@gmail.com> wrote:

Is this happening for Hue? If yes, maybe you can try cleaning up Hue sessions from the server. (This may clean all users' active sessions from Hue, so be careful while doing it.)

On Fri, May 15, 2015 at 11:31 AM, amit kumar <ak3...@gmail.com> wrote:

I am using CDH 5.2.1. Any pointers will be of immense help.

Thanks

On Fri, May 15, 2015 at 9:43 AM, amit kumar <ak3...@gmail.com> wrote:

Hi,
After re-creating my account in Hue, I receive "User matching query does not exist" when attempting to perform a Hive query. The query succeeds in the Hive command line.
Please suggest on this.
Thank you,
Amit

--
Nitin Pawar

--
Nitin Pawar
Re: user matching query does not exist
Yes, it is happening for Hue only. Can you please suggest how I can clean up Hue sessions from the server? The query succeeds in the Hive command line.

On Fri, May 15, 2015 at 11:52 AM, Nitin Pawar <nitinpawar...@gmail.com> wrote:

Is this happening for Hue? If yes, maybe you can try cleaning up Hue sessions from the server. (This may clean all users' active sessions from Hue, so be careful while doing it.)

On Fri, May 15, 2015 at 11:31 AM, amit kumar <ak3...@gmail.com> wrote:

I am using CDH 5.2.1. Any pointers will be of immense help.

Thanks

On Fri, May 15, 2015 at 9:43 AM, amit kumar <ak3...@gmail.com> wrote:

Hi,
After re-creating my account in Hue, I receive "User matching query does not exist" when attempting to perform a Hive query. The query succeeds in the Hive command line.
Please suggest on this.
Thank you,
Amit

--
Nitin Pawar
user matching query does not exist
Hi,
After re-creating my account in Hue, I receive "User matching query does not exist" when attempting to perform a Hive query. The query succeeds in the Hive command line.
Please suggest on this.
Thank you,
Amit
Re: Hive : plan serialization format option
What error are you getting after specifying javaXML in place of kryo?

On Wed, May 6, 2015 at 12:44 AM, Bhagwan S. Soni <bhgwnsson...@gmail.com> wrote:

Please find attached the error log for the same.

On Tue, May 5, 2015 at 11:36 PM, Jason Dere <jd...@hortonworks.com> wrote:

Looks like you are running into https://issues.apache.org/jira/browse/HIVE-8321, fixed in Hive 0.14. You might be stuck having to use Kryo. What are the issues you are having with Kryo?

Thanks,
Jason

On May 5, 2015, at 4:28 AM, Bhagwan S. Soni <bhgwnsson...@gmail.com> wrote:

Bottom of the log:

at java.beans.Encoder.writeObject(Encoder.java:74)
at java.beans.XMLEncoder.writeObject(XMLEncoder.java:327)
at java.beans.Encoder.writeExpression(Encoder.java:330)
at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:454)
at java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:194)
at java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)
... 98 more
Caused by: java.lang.NullPointerException
at java.lang.StringBuilder.<init>(StringBuilder.java:109)
at org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:49)
at org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:45)
at org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo.getTypeName(VarcharTypeInfo.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
at java.beans.Statement.invokeInternal(Statement.java:292)
at java.beans.Statement.access$000(Statement.java:58)
at java.beans.Statement$2.run(Statement.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at java.beans.Statement.invoke(Statement.java:182)
at java.beans.Expression.getValue(Expression.java:153)
at java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:193)
at java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)
... 111 more
Job Submission failed with exception 'java.lang.RuntimeException(java.lang.RuntimeException: Cannot serialize object)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

On Tue, May 5, 2015 at 3:10 PM, Jason Dere <jd...@hortonworks.com> wrote:

kryo/javaXML are the only available options. What are the errors you see with each setting?

On May 1, 2015, at 9:41 AM, Bhagwan S. Soni <bhgwnsson...@gmail.com> wrote:

Hi Hive Users,

I'm using Cloudera's Hive 0.13 version, which by default provides the Kryo plan serialization format:

<property>
  <name>hive.plan.serialization.format</name>
  <value>kryo</value>
</property>

As I'm facing issues with Kryo, can anyone help me identify the other open options in place of Kryo for the Hive plan serialization format? I know one option, javaXML, but in my case it is not working.
Re: Hive : plan serialization format option
Jason, the last comment on that JIRA is "This has been fixed in 0.14 release. Please open new jira if you see any issues." Is this issue resolved in Hive 0.14?

On Tue, May 5, 2015 at 11:36 PM, Jason Dere <jd...@hortonworks.com> wrote:

Looks like you are running into https://issues.apache.org/jira/browse/HIVE-8321, fixed in Hive 0.14. You might be stuck having to use Kryo. What are the issues you are having with Kryo?

Thanks,
Jason

On May 5, 2015, at 4:28 AM, Bhagwan S. Soni <bhgwnsson...@gmail.com> wrote:

Bottom of the log:

at java.beans.Encoder.writeObject(Encoder.java:74)
at java.beans.XMLEncoder.writeObject(XMLEncoder.java:327)
at java.beans.Encoder.writeExpression(Encoder.java:330)
at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:454)
at java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:194)
at java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)
... 98 more
Caused by: java.lang.NullPointerException
at java.lang.StringBuilder.<init>(StringBuilder.java:109)
at org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:49)
at org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:45)
at org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo.getTypeName(VarcharTypeInfo.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
at java.beans.Statement.invokeInternal(Statement.java:292)
at java.beans.Statement.access$000(Statement.java:58)
at java.beans.Statement$2.run(Statement.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at java.beans.Statement.invoke(Statement.java:182)
at java.beans.Expression.getValue(Expression.java:153)
at java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:193)
at java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)
... 111 more
Job Submission failed with exception 'java.lang.RuntimeException(java.lang.RuntimeException: Cannot serialize object)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

On Tue, May 5, 2015 at 3:10 PM, Jason Dere <jd...@hortonworks.com> wrote:

kryo/javaXML are the only available options. What are the errors you see with each setting?

On May 1, 2015, at 9:41 AM, Bhagwan S. Soni <bhgwnsson...@gmail.com> wrote:

Hi Hive Users,

I'm using Cloudera's Hive 0.13 version, which by default provides the Kryo plan serialization format:

<property>
  <name>hive.plan.serialization.format</name>
  <value>kryo</value>
</property>

As I'm facing issues with Kryo, can anyone help me identify the other open options in place of Kryo for the Hive plan serialization format? I know one option, javaXML, but in my case it is not working.
Re: Hive : plan serialization format option
Thank you Jason, I will upgrade to Hive 0.14 and try out the fix for the bug.

On Fri, May 8, 2015 at 1:43 AM, Jason Dere <jd...@hortonworks.com> wrote:

The javaXML issue referenced by that bug should be fixed by Hive 0.14. Note the original poster was using Hive 0.13.

On May 7, 2015, at 12:48 PM, amit kumar <ak3...@gmail.com> wrote:

Jason, the last comment on that JIRA is "This has been fixed in 0.14 release. Please open new jira if you see any issues." Is this issue resolved in Hive 0.14?

On Tue, May 5, 2015 at 11:36 PM, Jason Dere <jd...@hortonworks.com> wrote:

Looks like you are running into https://issues.apache.org/jira/browse/HIVE-8321, fixed in Hive 0.14. You might be stuck having to use Kryo. What are the issues you are having with Kryo?

Thanks,
Jason

On May 5, 2015, at 4:28 AM, Bhagwan S. Soni <bhgwnsson...@gmail.com> wrote:

Bottom of the log:

at java.beans.Encoder.writeObject(Encoder.java:74)
at java.beans.XMLEncoder.writeObject(XMLEncoder.java:327)
at java.beans.Encoder.writeExpression(Encoder.java:330)
at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:454)
at java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:194)
at java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)
... 98 more
Caused by: java.lang.NullPointerException
at java.lang.StringBuilder.<init>(StringBuilder.java:109)
at org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:49)
at org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:45)
at org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo.getTypeName(VarcharTypeInfo.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)
at java.beans.Statement.invokeInternal(Statement.java:292)
at java.beans.Statement.access$000(Statement.java:58)
at java.beans.Statement$2.run(Statement.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at java.beans.Statement.invoke(Statement.java:182)
at java.beans.Expression.getValue(Expression.java:153)
at java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:193)
at java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)
... 111 more
Job Submission failed with exception 'java.lang.RuntimeException(java.lang.RuntimeException: Cannot serialize object)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

On Tue, May 5, 2015 at 3:10 PM, Jason Dere <jd...@hortonworks.com> wrote:

kryo/javaXML are the only available options. What are the errors you see with each setting?

On May 1, 2015, at 9:41 AM, Bhagwan S. Soni <bhgwnsson...@gmail.com> wrote:

Hi Hive Users,

I'm using Cloudera's Hive 0.13 version, which by default provides the Kryo plan serialization format:

<property>
  <name>hive.plan.serialization.format</name>
  <value>kryo</value>
</property>

As I'm facing issues with Kryo, can anyone help me identify the other open options in place of Kryo for the Hive plan serialization format? I know one option, javaXML, but in my case it is not working.
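For quick experiments, a sketch of toggling the format at the session level (assuming the property is not on the server's restricted list; per the thread, kryo and javaXML are the only two values, and javaXML is only reliable once the HIVE-8321 fix in 0.14 is in place):

-- Default plan serialization format:
set hive.plan.serialization.format=kryo;
-- The only alternative; broken for some plans on Hive 0.13 (HIVE-8321):
set hive.plan.serialization.format=javaXML;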
Re: Unable to move files on Hive/Hdfs
Hi Doug,

I am using CDH 5.2.1. I had made the following change:

Disable ACLs on Name Nodes
Set Enable Access Control Lists = False
Save Changes
Restart Hadoop Cluster

Stack trace:

2015-05-04 10:38:18,820 INFO [main]: exec.Task (SessionState.java:printInfo(537)) - Moving data to: hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1 from hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
2015-05-04 10:38:18,857 ERROR [main]: exec.Task (SessionState.java:printError(546)) - Failed with exception Unable to move sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002 to destination hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002 to destination hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
at org.apache.hadoop.hive.ql.metadata.Hive.renameFile(Hive.java:2269)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:89)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:200)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:914)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.hadoop.hdfs.protocol.AclException: The ACL operation has been rejected. Support for ACLs has been disabled by setting dfs.namenode.acls.enabled to false.
at org.apache.hadoop.hdfs.server.namenode.NNConf.checkAclsConfigFlag(NNConf.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAclStatus(FSNamesystem.java:8553)

After rolling those same changes out, the problem resolved itself.

On Tue, May 5, 2015 at 4:28 AM, Moore, Douglas <douglas.mo...@thinkbiganalytics.com> wrote:

Hi Amit,
We've seen the same error on MoveTask with the Hive 0.14 / HDP 2.2 release. There are lots of reasons for this, though. Can you provide more details about the stack trace and version so we can compare? For our problem we've seen some relief with SET hive.metastore.client.socket.timeout=60s, but the problem still happens from time to time.
Thanks, Douglas

From: amit kumar <ak3...@gmail.com>
Reply-To: user@hive.apache.org
Date: Tue, 5 May 2015 03:12:15 +0530
To: user@hive.apache.org
Subject: Unable to move files on Hive/Hdfs

While moving the data from Hive/HDFS we get the error below. Please suggest on this.

Moving data to: hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-1
Failed with exception Unable to move sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-10002 to destination hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-1
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 5.83 sec HDFS Read: 553081 HDFS Write: 489704 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 830 msec
Error (1). Execution Failed.
2015-05-04 10:03:13 ERROR (1) in run_hive

Thanks,
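Since the AclException above states that dfs.namenode.acls.enabled was set to false, a sketch of the hdfs-site.xml setting that re-enables ACL support on the NameNodes (matching the rollback that resolved the issue; changing it requires a NameNode restart):

<property>
  <name>dfs.namenode.acls.enabled</name>
  <!-- Must be true for clients (here, Hive's MoveTask rename path,
       which calls getAclStatus) to issue ACL operations. -->
  <value>true</value>
</property>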
Re: Unable to move files on Hive/Hdfs
Hi Doug,

I am using CDH 5.2.1. I performed the task below and got the error, but after rolling back the changes the issue resolved itself:

Disable ACLs on Name Nodes
Set Enable Access Control Lists = False
Save Changes
Restart Hadoop Cluster

Thanks,

On Tue, May 5, 2015 at 4:36 AM, amit kumar <ak3...@gmail.com> wrote:

Hi Doug,

I am using CDH 5.2.1. I had made the following change:

Disable ACLs on Name Nodes
Set Enable Access Control Lists = False
Save Changes
Restart Hadoop Cluster

Stack trace:

2015-05-04 10:38:18,820 INFO [main]: exec.Task (SessionState.java:printInfo(537)) - Moving data to: hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1 from hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
2015-05-04 10:38:18,857 ERROR [main]: exec.Task (SessionState.java:printError(546)) - Failed with exception Unable to move sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002 to destination hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002 to destination hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
at org.apache.hadoop.hive.ql.metadata.Hive.renameFile(Hive.java:2269)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:89)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:200)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:914)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.hadoop.hdfs.protocol.AclException: The ACL operation has been rejected. Support for ACLs has been disabled by setting dfs.namenode.acls.enabled to false.
at org.apache.hadoop.hdfs.server.namenode.NNConf.checkAclsConfigFlag(NNConf.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAclStatus(FSNamesystem.java:8553)

After rolling those same changes out, the problem resolved itself.

On Tue, May 5, 2015 at 4:28 AM, Moore, Douglas <douglas.mo...@thinkbiganalytics.com> wrote:

Hi Amit,
We've seen the same error on MoveTask with the Hive 0.14 / HDP 2.2 release. There are lots of reasons for this, though. Can you provide more details about the stack trace and version so we can compare? For our problem we've seen some relief with SET hive.metastore.client.socket.timeout=60s, but the problem still happens from time to time.
Thanks, Douglas

From: amit kumar <ak3...@gmail.com>
Reply-To: user@hive.apache.org
Date: Tue, 5 May 2015 03:12:15 +0530
To: user@hive.apache.org
Subject: Unable to move files on Hive/Hdfs

While moving the data from Hive/HDFS we get the error below. Please suggest on this.

Moving data to: hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-1
Failed with exception Unable to move sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-10002 to destination hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-1
FAILED: Execution Error, return
Re: Unable to move files on Hive/Hdfs
Doug, do I need any changes in configuration or elsewhere to resolve this issue?

Thanks

On Tue, May 5, 2015 at 4:46 AM, amit kumar <ak3...@gmail.com> wrote:

Do you have any suggestion to resolve this issue? I am looking for a resolution.

On Tue, May 5, 2015 at 4:42 AM, Moore, Douglas <douglas.mo...@thinkbiganalytics.com> wrote:

Yep, permission problem. Weird though, it seems to be moving a file within the same dir. Thanks for the update!
- Douglas

From: amit kumar <ak3...@gmail.com>
Reply-To: user@hive.apache.org
Date: Tue, 5 May 2015 04:40:18 +0530
To: user@hive.apache.org
Subject: Re: Unable to move files on Hive/Hdfs

Hi Doug,

I am using CDH 5.2.1. I performed the task below and got the error, but after rolling back the changes the issue resolved itself:

Disable ACLs on Name Nodes
Set Enable Access Control Lists = False
Save Changes
Restart Hadoop Cluster

Thanks,

On Tue, May 5, 2015 at 4:36 AM, amit kumar <ak3...@gmail.com> wrote:

Hi Doug,

I am using CDH 5.2.1. I had made the following change:

Disable ACLs on Name Nodes
Set Enable Access Control Lists = False
Save Changes
Restart Hadoop Cluster

Stack trace:

2015-05-04 10:38:18,820 INFO [main]: exec.Task (SessionState.java:printInfo(537)) - Moving data to: hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1 from hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
2015-05-04 10:38:18,857 ERROR [main]: exec.Task (SessionState.java:printError(546)) - Failed with exception Unable to move sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002 to destination hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002 to destination hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
at org.apache.hadoop.hive.ql.metadata.Hive.renameFile(Hive.java:2269)
at org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:89)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:200)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:914)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.hadoop.hdfs.protocol.AclException: The ACL operation has been rejected. Support for ACLs has been disabled by setting dfs.namenode.acls.enabled to false.
at org.apache.hadoop.hdfs.server.namenode.NNConf.checkAclsConfigFlag(NNConf.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAclStatus(FSNamesystem.java:8553)

After rolling those same changes out, the problem resolved itself.

On Tue, May 5, 2015 at 4:28 AM, Moore, Douglas <douglas.mo...@thinkbiganalytics.com> wrote:

Hi Amit,
We've seen the same error on MoveTask with the Hive 0.14 / HDP 2.2 release. There are lots of reasons for this, though. Can you provide more details about the stack trace and version so we can compare? For our problem we've seen some relief with SET hive.metastore.client.socket.timeout=60s, but the problem still happens from time to time.
Thanks, Douglas

From: amit kumar ak3
Unable to move files on Hive/Hdfs
While moving data in Hive on HDFS we get the error below. Please advise.
Moving data to: hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-1
Failed with exception Unable to move sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-10002 to destination hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-1
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
MapReduce Jobs Launched: Stage-Stage-1: Map: 1 Cumulative CPU: 5.83 sec HDFS Read: 553081 HDFS Write: 489704 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 830 msec
Error (1). Execution Failed. 2015-05-04 10:03:13 ERROR (1) in run_hive
Thanks,
what is the benchmark using SSD for HDFS over HDD
Hi User, I want to know the difference in Hive query execution time between using SSDs for HDFS and using HDDs for HDFS. Thanks, Amit
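There is no single published number for this; the usual approach is to time the same query against identical tables on each storage type and compare wall-clock times. A minimal sketch, with the table name illustrative (repeat several runs so OS and HDFS caching effects average out):

# run the identical query on the SSD-backed and the HDD-backed cluster
# and compare the elapsed times reported by the shell
time hive -e "SELECT COUNT(*) FROM some_table;"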
Query is stuck in middle
Hi User, I am running a join query between a 100GB table and a 10GB table. The query gets stuck without giving any error, like below:
2014-11-26 20:19:53,893 Stage-1 map = 99%, reduce = 10%, Cumulative CPU 29443.21 sec
2014-11-26 20:20:53,920 Stage-1 map = 99%, reduce = 10%, Cumulative CPU 29480.04 sec
2014-11-26 20:21:53,923 Stage-1 map = 99%, reduce = 10%, Cumulative CPU 29516.21 sec
2014-11-26 20:22:53,935 Stage-1 map = 99%, reduce = 10%, Cumulative CPU 29552.95 sec
Please help me find a solution. Thanks Amit
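A reduce phase that sits at the same percentage for minutes while the map side is nearly done is often, though not necessarily here, a symptom of join key skew (a few reducers receive most of the rows). A speculative sketch of settings sometimes worth trying before re-running; this is not a confirmed diagnosis for this query:

-- let Hive handle skewed join keys in a follow-up job
set hive.optimize.skewjoin=true;
-- rows per key beyond which a key is treated as skewed (default 100000)
set hive.skewjoin.key=100000;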
Container launch failed Error
Hi Users,
*my cluster (1+8) configuration*:
RAM : 32 GB each
HDFS : 1.5 TB SSD
CPU : 8 cores each
---
I am trying to query a 300GB table, but only plain select queries succeed; for every other query I get the following exception:
Total jobs = 1
Stage-1 is selected by condition resolver.
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 183
In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number
In order to limit the maximum number of reducers: set hive.exec.reducers.max=number
In order to set a constant number of reducers: set mapreduce.job.reduces=number
Starting Job = job_1416831990090_0005, Tracking URL = http://master:8088/proxy/application_1416831990090_0005/
Kill Command = /root/hadoop/bin/hadoop job -kill job_1416831990090_0005
Hadoop job information for Stage-1: number of mappers: 679; number of reducers: 183
2014-11-24 19:43:01,523 Stage-1 map = 0%, reduce = 0%
2014-11-24 19:43:22,730 Stage-1 map = 53%, reduce = 0%, Cumulative CPU 625.19 sec
2014-11-24 19:43:23,778 Stage-1 map = 100%, reduce = 100%
MapReduce Total cumulative CPU time: 10 minutes 25 seconds 190 msec
Ended Job = job_1416831990090_0005 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1416831990090_0005_m_05 (and more) from job job_1416831990090_0005
Examining task ID: task_1416831990090_0005_m_42 (and more) from job job_1416831990090_0005
Examining task ID: task_1416831990090_0005_m_35 (and more) from job job_1416831990090_0005
Examining task ID: task_1416831990090_0005_m_65 (and more) from job job_1416831990090_0005
Examining task ID: task_1416831990090_0005_m_02 (and more) from job job_1416831990090_0005
Examining task ID: task_1416831990090_0005_m_07 (and more) from job job_1416831990090_0005
Examining task ID: task_1416831990090_0005_m_58 (and more) from job job_1416831990090_0005
Examining task ID: task_1416831990090_0005_m_43 (and more) from job job_1416831990090_0005
Task with the most failures(4):
- Task ID: task_1416831990090_0005_m_05
URL: http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005&tipid=task_1416831990090_0005_m_05
- Diagnostic Messages for this Task:
Container launch failed for container_1416831990090_0005_01_000112 : java.lang.IllegalArgumentException: java.net.UnknownHostException: slave6
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)
at org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)
at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)
at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:189)
at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: slave6
... 12 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: Job 0: Map: 679 Reduce: 183 Cumulative CPU: 625.19 sec HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 10 minutes 25 seconds 190 msec
Please help me to fix the issue. Thanks Amit
Re: Container launch failed Error
Hi Daniel, the stack trace is the same for other queries; on different runs I sometimes get slave7, sometimes slave8. And I have registered all the machine IPs in /etc/hosts. Regards Amit

On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv daniel.ha...@veracity-group.com wrote:
It seems that the application master can't resolve slave6's name to an IP. Daniel
Re: Container launch failed Error
I did not modify it on all the slaves, except slave. Will it be a problem? For small data (up to a 20 GB table) it runs, and on the 300GB table even count(*) sometimes runs and sometimes fails. Thanks Amit

On Mon, Nov 24, 2014 at 10:37 PM, Daniel Haviv daniel.ha...@veracity-group.com wrote:
Did you copy the hosts file to all the nodes? Daniel
Re: Container launch failed Error
* except slave6, slave7, slave8
Re: Container launch failed Error
Hi Daniel, thanks a lot, I will do that and rerun the query. :)

On Mon, Nov 24, 2014 at 10:59 PM, Daniel Haviv daniel.ha...@veracity-group.com wrote:
It is a problem, as the application master needs to contact the other nodes. Try updating the hosts file on all the machines and try again. Daniel
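Daniel's fix amounts to keeping an identical hosts file on the master and every slave, so each node can resolve all the others by name. A sketch with hypothetical addresses (the real IPs are not in the thread):

# /etc/hosts -- distribute the same copy to the master and all 8 slaves
# (addresses below are illustrative)
10.0.0.1   master
10.0.0.6   slave6
10.0.0.7   slave7
10.0.0.8   slave8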
Re: Container launch failed Error
Hi Daniel, thank you, it's running fine now. *Another question:* could you please tell me what to do if I get a *Shuffle Error*? I got that type of error once while running a join query on 300GB of data with 20GB of data. Thanks Amit

On Mon, Nov 24, 2014 at 11:13 PM, Daniel Haviv daniel.ha...@veracity-group.com wrote:
Good luck. Share your results with us. Daniel
How to do single user multiple access in hive
Hi users, I have Hive set up on a multi-node Hadoop cluster. I want to run multiple queries on top of a table from different machines. Please help me understand how to achieve concurrent access to Hive so that multiple queries can run simultaneously. Thanks Amit
Re: How to do single user multiple access in hive
hi Devopam, thank you for replying. I am using Hue on top of Hive. Can you please explain how Oozie would help me, and how I can integrate Oozie with this setup? Thanks Amit

On Fri, Nov 7, 2014 at 7:58 PM, Devopam Mittra devo...@gmail.com wrote:
hi Amit, please try to see if the Hive CLI (client) installed on the 'different' machines helps you achieve your goal at the minimalist end. If you use any other program like Oozie (to submit your queries) etc., then you can fire queries through the respective interfaces safely enough. regards Devopam -- Devopam Mittra Life and Relations are not binary
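Beyond the Hive CLI and Oozie options Devopam mentions, a common way to give many users concurrent access to the same tables, and what Hue itself typically talks to, is to point every client at a single HiveServer2 instance over JDBC. A sketch, with the hostname and user name hypothetical:

# each machine connects to the same HiveServer2 endpoint via beeline
# (hs2-host and the user name are illustrative)
beeline -u jdbc:hive2://hs2-host:10000/default -n amit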
RE: PIG heart beat freeze using hue + cdh 5.1
Thanks for the link, but I am still unable to find how to resolve the heart beat issue.

Date: Wed, 10 Sep 2014 09:52:19 -0400 Subject: Re: PIG heart beat freeze using hue + cdh 5.1 From: zenon...@gmail.com To: user@hive.apache.org
Take a look at this link: http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/ Thanks

On Tue, Sep 9, 2014 at 8:53 PM, Amit Dutta amitkrdu...@outlook.com wrote:
Thanks a lot for your reply. I changed the following parameters from Cloudera Manager:
mapred.tasktracker.map.tasks.maximum = 2 (it was 1 before)
mapred.tasktracker.reduce.tasks.maximum = 2 (it was 1 before)
Could you please mention which parameters to set and how I change them? Regards, Amit

Subject: Re: PIG heart beat freeze using hue + cdh 5.1 From: zenon...@gmail.com Date: Tue, 9 Sep 2014 20:34:19 -0400 To: user@hive.apache.org
It uses YARN now: you need to set your container resource memory and CPU, then set the MapReduce physical memory and CPU cores. The number of mappers and reducers is calculated from the resources you give to your mappers and reducers. Pengcheng
Sent from my iPhone

On Sep 9, 2014, at 7:55 PM, Amit Dutta amitkrdu...@outlook.com wrote:
I think one of the issues is the number of MapReduce slots for the cluster... Can anyone please let me know how I increase the MapReduce slots?
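The mapred.tasktracker.*.tasks.maximum knobs changed above are MR1 TaskTracker slot settings and have no effect under YARN; the parameters Pengcheng is pointing at are along these lines. The values are illustrative and must be sized to the actual nodes:

<!-- yarn-site.xml: resources each NodeManager offers to containers -->
<property><name>yarn.nodemanager.resource.memory-mb</name><value>8192</value></property>
<property><name>yarn.nodemanager.resource.cpu-vcores</name><value>4</value></property>

<!-- mapred-site.xml: per-task container sizes; concurrent tasks per node
     is roughly node memory divided by container memory -->
<property><name>mapreduce.map.memory.mb</name><value>1024</value></property>
<property><name>mapreduce.reduce.memory.mb</name><value>2048</value></property>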
PIG heart beat freeze using hue + cdh 5.1
Hi, I have only 604 rows in the Hive table. While running
A = LOAD 'revenue' USING org.apache.hcatalog.pig.HCatLoader();
DUMP A;
it starts spouting "Heart beat" repeatedly and never leaves this state. Can someone please help? I am getting the following output:
2014-09-09 17:27:45,844 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Kind: RM_DELEGATION_TOKEN, Service: 10.215.204.182:8032, Ident: (owner=cloudera, renewer=oozie mr token, realUser=oozie, issueDate=1410301632571, maxDate=1410906432571, sequenceNumber=14, masterKeyId=2)
2014-09-09 17:27:46,709 [JobControl] WARN org.apache.hadoop.mapreduce.v2.util.MRApps - cache file (mapreduce.job.cache.files) hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/pig/commons-httpclient-3.1.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/hcatalog/commons-httpclient-3.1.jar This will be an error in Hadoop 2.0
2014-09-09 17:27:46,712 [JobControl] WARN org.apache.hadoop.mapreduce.v2.util.MRApps - cache file (mapreduce.job.cache.files) hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/pig/commons-io-2.1.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/hcatalog/commons-io-2.1.jar This will be an error in Hadoop 2.0
2014-09-09 17:27:46,894 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1410291186220_0006
2014-09-09 17:27:46,968 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://txwlcloud2:8088/proxy/application_1410291186220_0006/
2014-09-09 17:27:46,969 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1410291186220_0006
2014-09-09 17:27:46,969 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A
2014-09-09 17:27:46,969 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[1,4] C: R:
2014-09-09 17:27:46,969 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://txwlcloud2:50030/jobdetails.jsp?jobid=job_1410291186220_0006
2014-09-09 17:27:47,019 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
Heart beat Heart beat Heart beat Heart beat Heart beat
Increase mapreduce slots
Hi, can anyone please let me know how to increase the MapReduce slots? I am getting an infinite heartbeat when I run a Pig script from Hue (Cloudera CDH 5.1). Thanks, Amit
Re: hbase importtsv
Make sure there are no row key clashes. HBase will overwrite a row if you upload data with the same row key. That's one reason you could get fewer rows than you uploaded. Sent from my mobile device, please excuse the typos

On May 1, 2014, at 3:34 PM, Kennedy, Sean C. sean.kenn...@merck.com wrote:
I ran the following command to import an Excel .csv file into HBase. Everything looked OK; however, when I ran a scan on the table in HBase I did not see as many rows as were in the .csv file. Any help appreciated.
/hd/hadoop/bin/hadoop jar /hbase/hbase-0.94.15/hbase-0.94.15.jar importtsv '-Dimporttsv.separator=,' -Dimporttsv.columns=HBASE_ROW_KEY,ROOT,NODE,VALUE,X_PATH,IMG,NODE_URL,LFLAG,SORT_ORDER,SITE V_MES_INPUT_TREE /ma/segwhdfs/hpp/hbase/MES/csv/MES_INPUT_TREE
The csv file had over 200,000 rows; however, my HBase scan returned only 3500 or so rows.
Output from scan 'MES_INPUT_TREE': 3855 row(s) in 5.6090 seconds
Output from job:
14/05/01 17:58:53 INFO mapred.JobClient: Job complete: job_201405011721_0001
14/05/01 17:58:53 INFO mapred.JobClient: Counters: 20
14/05/01 17:58:53 INFO mapred.JobClient: Job Counters
14/05/01 17:58:53 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=1208423
14/05/01 17:58:53 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/05/01 17:58:53 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/05/01 17:58:53 INFO mapred.JobClient: Rack-local map tasks=1
14/05/01 17:58:53 INFO mapred.JobClient: Launched map tasks=4
14/05/01 17:58:53 INFO mapred.JobClient: Data-local map tasks=3
14/05/01 17:58:53 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=1427
14/05/01 17:58:53 INFO mapred.JobClient: ImportTsv
14/05/01 17:58:53 INFO mapred.JobClient: Bad Lines=3
14/05/01 17:58:53 INFO mapred.JobClient: File Output Format Counters
14/05/01 17:58:53 INFO mapred.JobClient: Bytes Written=0
14/05/01 17:58:53 INFO mapred.JobClient: FileSystemCounters
14/05/01 17:58:53 INFO mapred.JobClient: HDFS_BYTES_READ=5243015
14/05/01 17:58:53 INFO mapred.JobClient: FILE_BYTES_WRITTEN=80374
14/05/01 17:58:53 INFO mapred.JobClient: File Input Format Counters
14/05/01 17:58:53 INFO mapred.JobClient: Bytes Read=5242880
14/05/01 17:58:53 INFO mapred.JobClient: Map-Reduce Framework
14/05/01 17:58:53 INFO mapred.JobClient: Map input records=22494
14/05/01 17:58:53 INFO mapred.JobClient: Physical memory (bytes) snapshot=112275456
14/05/01 17:58:53 INFO mapred.JobClient: Spilled Records=0
14/05/01 17:58:53 INFO mapred.JobClient: CPU time spent (ms)=2430
14/05/01 17:58:53 INFO mapred.JobClient: Total committed heap usage (bytes)=145752064
14/05/01 17:58:53 INFO mapred.JobClient: Virtual memory (bytes) snapshot=769548288
14/05/01 17:58:53 INFO mapred.JobClient: Map output records=22491
14/05/01 17:58:53 INFO mapred.JobClient: SPLIT_RAW_BYTES=135
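The row-key hypothesis is easy to test before re-importing: compare the total line count of the source file against the number of distinct row-key (first column) values. A sketch, with the local file name assumed since it is not given in the thread:

# total data rows vs distinct row keys (file name is illustrative)
cut -d',' -f1 MES_INPUT_TREE.csv | wc -l
cut -d',' -f1 MES_INPUT_TREE.csv | sort -u | wc -l
# if the second number is close to the 3855 rows the scan returned,
# duplicate keys explain the shrink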
Re: Error using ORC Format with Hive
Thanks for the reply. I did solve the protobuf issue by upgrading to protobuf 2.5, but then Hive 0.12 also started showing the same issue as 0.13 and 0.14. I was working through the CLI. It turns out the issue was due to the space available (not) to the data node. Let me elaborate for others on the list: I had about 2GB available on the partition where the data node directory was configured (the name node and data node were on the same directory tree but in different directories, of course). I inserted kv1.txt (a few KBs) into table #1 (stored as textfile) and then tried insert into table#2 select * from table#1, where table #2 was stored as ORC. It was difficult to guess that the converted ORC data would be too big to fit in 2GB, especially since the data node logs did not show any error, nor was any reserve configured for HDFS. I still don't know why it needs so much space, but I could reproduce the error simply by pushing a 300MB file to HDFS with hdfs dfs -put, thus realizing it was a space issue. I migrated the datanode to a bigger partition and everything is fine now.
On a separate note, I am not seeing any significant query-time improvement from pushing data into ORC: about 25%, but nowhere close to the multiples I was hoping for. I changed the stripe size to 4MB and tried creating an index entry every 10k rows. I inserted 6 million rows and ran many different types of queries. Any ideas, people, what I might be missing? Amit
Sent from my mobile device, please excuse the typos

On Apr 4, 2014, at 8:21 PM, Bryan Jeffrey bryan.jeff...@gmail.com wrote:
Amit, are you executing your select for conversion to ORC via beeline or the hive CLI? From looking at your logs, it appears that you do not have permissions in HDFS to write the resultant ORC data. Check permissions in HDFS to ensure that your user has write permission to the Hive warehouse. I forwarded you a previous thread regarding Hive 12 protobuf issues. Regards, Bryan Jeffrey
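For reference, the stripe size and index stride Amit mentions are set per table through ORC table properties. A sketch reusing the pokes_orc schema that appears later in this thread; the table name and values are illustrative, not the ones Amit actually used:

-- orc.compress: ZLIB, or SNAPPY for faster, lighter compression
-- orc.stripe.size: stripe size in bytes (64 MB here)
-- orc.row.index.stride: row-index entry every 10,000 rows
CREATE TABLE pokes_orc_tuned (foo INT, bar STRING)
STORED AS ORC
TBLPROPERTIES (
  "orc.compress" = "ZLIB",
  "orc.stripe.size" = "67108864",
  "orc.row.index.stride" = "10000"
);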
Error using ORC format
Hi All, I am just running some simple tests to see the speedup of Hive queries with Hive 0.14 (trunk version as of this morning), starting from the sample test case. First I wanted to see how much I can speed things up using the ORC format, but for some reason I can't insert data into a table stored as ORC. It fails with the exception File <filename> could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation. I can, however, insert data into a text table without any issue. I have included the steps below. Any pointers would be appreciated. Amit
I have a single-node setup with minimal settings. jps output is as follows:
$ jps
9823 NameNode
12172 JobHistoryServer
9903 DataNode
14895 Jps
11796 ResourceManager
12034 NodeManager
*Running Hadoop 2.2.0 with YARN.*
Step 1: CREATE TABLE pokes (foo INT, bar STRING);
Step 2: LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;
Step 3: CREATE TABLE pokes_1 (foo INT, bar STRING);
Step 4: Insert into table pokes_1 select * from pokes;
Step 5: CREATE TABLE pokes_orc (foo INT, bar STRING) stored as orc;
Step 6: insert into pokes_orc select * from pokes; __FAILED__ with the exception below
eRpcServer.addBlock(NameNodeRpcServer.java:555) at File /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.00_3 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
at org.apache.hadoop.hdfs.server.namenode.NameNodorg.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:168)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:843)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:577)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)
... 8 more
Step 7: Insert overwrite table pokes_1 select * from pokes; Success
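A "could only be replicated to 0 nodes instead of minReplication (=1)" error with a live datanode frequently means the datanode has no usable capacity, which is how this thread is eventually resolved. A quick check from the shell:

# per-datanode configured vs remaining capacity
hdfs dfsadmin -report
# free space as HDFS sees it
hdfs dfs -df -h /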
Re: Error using ORC Format with Hive
I checked out and build hive 0.13. Tried with same results. i.e. eRpcServer.addBlock(NameNodeRpcServer.java:555) at File /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.00_3 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation. I also tried it with the release version of hive 0.12 and that gave me a different error. Related to protobuffer incompatibility (pasted below) So at this point I can't run even the basic use case with ORC storage.. Any pointers would be very helpful. Amit Error: java.lang.RuntimeException: Hive Runtime Error while closing operators at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157) Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses. at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180) at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046) at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749) at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530) at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129) at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749) at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530) at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641) at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75) at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548) at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328) at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699) at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868) at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95) at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207) Amit On 4/4/14 2:28 PM, Amit Tewari wrote: Hi All, I am just trying to do some simple tests to see speedup in hive query with Hive 0.14 (trunk version this morning). Just tried to use sample test case to start with. First wanted to see how much I can speed up using ORC format. However for some reason I can't insert data into the table with ORC format. 
It fails with the exception: File <filename> could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.

I can, however, insert data into a text table without any issue. I have included the steps below. Any pointers would be appreciated.

Amit

I have a single-node setup with minimal settings. jps output is as follows:

$ jps
9823 NameNode
12172 JobHistoryServer
9903 DataNode
14895 Jps
11796 ResourceManager
12034 NodeManager

Running Hadoop 2.2.0 with YARN.

Step 1: CREATE TABLE pokes (foo INT, bar STRING);
Step 2: LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;
Step 3: CREATE TABLE pokes_1 (foo INT, bar STRING);
Step 4: INSERT INTO TABLE pokes_1 SELECT * FROM pokes;
Step 5: CREATE TABLE pokes_orc (foo INT, bar STRING) STORED AS ORC;
Step 6: INSERT INTO TABLE pokes_orc SELECT * FROM pokes; __FAILED__ with the exception below:

eRpcServer.addBlock(NameNodeRpcServer.java:555): File /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.00_3 could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation
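One way to narrow this down is a minimal control experiment, assuming the text-table insert in steps 3 and 4 really does succeed against the same HDFS. The table name pokes_rc below is made up for illustration; RCFile exercises a different columnar writer than ORC:

-- Hypothetical control experiment: push the same rows through a non-ORC columnar writer.
-- If this also fails with "replicated to 0 nodes", suspect HDFS itself (datanode disk
-- space, or client/datanode hostname and connectivity issues) rather than ORC.
-- If it succeeds while the ORC insert fails, the ORC writer path becomes the suspect;
-- a protobuf version mismatch on the classpath (2.4-generated code against a 2.5 runtime)
-- is the classic source of the "This is supposed to be overridden by subclasses" error above.
CREATE TABLE pokes_rc (foo INT, bar STRING) STORED AS RCFILE;
INSERT INTO TABLE pokes_rc SELECT * FROM pokes;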
Re: Re: hive 0.11 auto convert join bug report
Hi Navis, I was trying to look at this email thread as well as the JIRA to understand the scope of this issue. Does it get triggered only when using aliases that happen to map to the same value upon hashing? Or can it be triggered under other conditions as well? What if no aliases are used and the table names themselves happen to map to similar hashcode values? Also, is changing the alias the only workaround for this problem, or is another workaround possible? Thanks, Amit

On Sun, Aug 11, 2013 at 9:22 PM, Navis류승우 navis@nexr.com wrote: Hi, Hive is notorious for producing different results with different aliases. Changing the alias was a last resort for avoiding bugs in desperate situations. I think the patch in the issue is ready; I hope it's helpful. Thanks.

2013/8/11 wzc1...@gmail.com: Hi Navis, My colleague chenchun found that the hashcodes of 'deal' and 'dim_pay_date' are the same, and that the code in MapJoinProcessor.java ignores the order of the rowschema. I looked at your patch and it is exactly the same place we were working on. Thanks for your patch.

On Sunday, August 11, 2013, at 9:38 PM, Navis류승우 wrote: Hi, I've filed this as https://issues.apache.org/jira/browse/HIVE-5056 and attached a patch for it. It needs a full test run for confirmation, but you can try it. Thanks.

2013/8/11 wzc1...@gmail.com: Hi all: When I change the table alias dim_pay_date to A, the query passes in hive 0.11 (https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_change_alias_pass):

use test;
create table if not exists src (`key` int, `val` string);
load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite into table src;
drop table if exists orderpayment_small;
create table orderpayment_small (`dealid` int, `date` string, `time` string, `cityid` int, `userid` int);
insert overwrite table orderpayment_small select 748, '2011-03-24', '2011-03-24', 55, 5372613 from src limit 1;
drop table if exists user_small;
create table user_small (userid int);
insert overwrite table user_small select key from src limit 100;
set hive.auto.convert.join.noconditionaltask.size = 200;
SELECT `A`.`date`, `deal`.`dealid`
FROM `orderpayment_small` `orderpayment`
JOIN `orderpayment_small` `A` ON `A`.`date` = `orderpayment`.`date`
JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
limit 5;

It's quite strange and interesting now. I will keep searching for the answer to this issue.
On Friday, August 9, 2013, at 3:32 AM, wzc1...@gmail.com wrote: Hi all: I'm currently testing hive 0.11 and have run into a bug with hive.auto.convert.join. I constructed a testcase so everyone can reproduce it (or you can get the testcase here: https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_bug):

use test;
create table src (`key` int, `val` string);
load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite into table src;
drop table if exists orderpayment_small;
create table orderpayment_small (`dealid` int, `date` string, `time` string, `cityid` int, `userid` int);
insert overwrite table orderpayment_small select 748, '2011-03-24', '2011-03-24', 55, 5372613 from src limit 1;
drop table if exists user_small;
create table user_small (userid int);
insert overwrite table user_small select key from src limit 100;
set hive.auto.convert.join.noconditionaltask.size = 200;
SELECT `dim_pay_date`.`date`, `deal`.`dealid`
FROM `orderpayment_small` `orderpayment`
JOIN `orderpayment_small` `dim_pay_date` ON `dim_pay_date`.`date` = `orderpayment`.`date`
JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
limit 5;

You should replace the path to kv1.txt yourself. If you run the above query in hive 0.11, it fails with an ArrayIndexOutOfBoundsException. You can see the explain result and the console output of the query here: https://gist.github.com/code6/6187569

I compiled the trunk code, but it doesn't work with this query either. I can run this query in hive 0.9 with hive.auto.convert.join turned on. I tried to dig into this problem and I think it may be caused by the map join optimization: some adjacent operators don't match on their input/output tableinfo (the column positions differ). I'm not able to fix this bug myself, and I would appreciate it if someone would look into it. Thanks.
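For anyone stuck on a stock hive 0.11 before the HIVE-5056 patch is applied, a possible session-level workaround beyond renaming the alias, sketched under the assumption that the regression is confined to the automatic map-join conversion path that the subject line and the JIRA both point at:

-- Fall back to plain shuffle joins for the affected query only.
-- Slower, but it bypasses the MapJoinProcessor conversion path entirely.
set hive.auto.convert.join = false;
set hive.auto.convert.join.noconditionaltask = false;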
Re: Hive 0.9.0 with hadoop 0.20.2 (fair scheduler mode)
On Thu, Sep 27, 2012 at 10:56 AM, Amit Sangroya sangroyaa...@gmail.com wrote: Hello everyone, I am finding that Hive v-0.9.0 works with hadoop 0.20.0 only in the default scheduling mode. When I try to use the Fair Scheduler with this configuration, the map reduce jobs do not progress and the hive log shows a table not found exception. I am using a MySQL database. This is very strange behaviour. I tried to work out whether the issue is in hive, or whether there is something to configure in hadoop. I also tried a few other combinations:

1. For me, Hive v-0.9.0 works with hadoop 0.20.0 in default scheduling mode.
2. Hive v-0.7.0 works with hadoop 0.20.0 with both the Fair Scheduler and the default scheduler.
3. Hive v-0.9.0 works with hadoop 1.0.0 in both default and fair scheduling modes.

Has anyone tried to run Hive v-0.9.0 with hadoop 0.20.0 and the Fair Scheduler? Is there any extra setting/parameter needed for this? Thanks in advance, Amit
Re: Question on bucketed map join
Hi Bejoy, I am joining two tables which are both bucketed 64 ways, and I want to do a bucketed map join on them. I set the flag set hive.optimize.bucketmapjoin = true;. auto.convert.join is always false on our cluster. When I run the following query:

select /*+ MAPJOIN(b) */ a.visitor_id
FROM amit_merchinteraction a
join amit_dse_test_cell_allocation_f b ON a.visitor_id == b.account_id
where a.country_id = 1 and a.dateint <= 20120322 and a.dateint >= 20120315;

Hive sequentially creates a hash map from the contents of the mapjoin table b on the client, one at a time. Is that expected behaviour? Should it not create these hash maps on the corresponding mappers in parallel? Thanks, Amit

On Thu, Jan 19, 2012 at 9:22 AM, Bejoy Ks bejoy...@yahoo.com wrote: Hi Avrilia, AFAIK the bucketed map join is not the default in hive, and it happens only when the value is set to true. It could be that the same value is already set in the hive configuration xml file. To cross-confirm, could you explicitly set this to false (set hive.optimize.bucketmapjoin = false;) and get the query execution plan from the explain command? Please see some pointers inline.

1. Should I see sth different in the explain extended output if I set and unset the hive.optimize.bucketmapjoin option?
[Bejoy] You should be seeing the same. Try EXPLAIN on your join query after setting set hive.optimize.bucketmapjoin = false;

2. Should I see something different in the output of hive while running the query if again I set and unset the hive.optimize.bucketmapjoin?
[Bejoy] No, Hive output should be the same. Whatever the execution plan for a join, optimally the end result should be the same.

3. Is it possible that even though I set bucketmapjoin to true, Hive will still perform a normal map-side join for some reason? How can I check if this has actually happened?
[Bejoy] Hive would perform a plain map-side join only if the following parameter is enabled (by default it is disabled): set hive.auto.convert.join = true; You need to check this value in your configurations. If it is enabled, then irrespective of the table size hive would always try a map join; it would fall back to a normal join only after the map join attempt fails. AFAIK, if the number of buckets is the same or a multiple between the two tables involved in a join, and if the join is on the same columns that are bucketed, then with bucketmapjoin enabled it shouldn't execute a plain map-side join; a bucketed map-side join would be triggered. Hope it helps!

Regards, Bejoy.K.S

*From:* Avrilia Floratou flora...@cs.wisc.edu *To:* user@hive.apache.org *Sent:* Thursday, January 19, 2012 9:23 PM *Subject:* Question on bucketed map join

Hi, I have two tables with 8 buckets each on the same key and want to join them. I ran explain extended and got the plan produced by Hive, which shows that a map-side join is a possible plan. I then set the hive.optimize.bucketmapjoin option to true in my script and reran the explain extended query. I get the exact same plans as output. I ran the query with and without the bucketmapjoin optimization and saw no difference in the running time. I have the following questions:

1. Should I see sth different in the explain extended output if I set and unset the hive.optimize.bucketmapjoin option?
2. Should I see something different in the output of hive while running the query if again I set and unset the hive.optimize.bucketmapjoin?
3. Is it possible that even though I set bucketmapjoin to true, Hive will still perform a normal map-side join for some reason? How can I check if this has actually happened?

Thanks, Avrilia
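On Avrilia's question 3, one way to check, sketched here with Amit's table names from above: diff the EXPLAIN EXTENDED output with the flag on and off. To the best of my knowledge the Map Join Operator in the plan carries a BucketMapJoin annotation when the bucketed variant is actually chosen, though the exact wording may vary across versions:

set hive.optimize.bucketmapjoin = true;
explain extended
select /*+ MAPJOIN(b) */ a.visitor_id
from amit_merchinteraction a
join amit_dse_test_cell_allocation_f b on a.visitor_id = b.account_id;
-- Look for a "BucketMapJoin: true" marker (or similar) on the Map Join Operator.
-- If it is absent with the flag set, the plain map join was chosen despite the flag.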
Re: Hive Security
Toad uses JDBC only while connecting as a direct Hive connection from Eclipse. In most other cases, where a Hub is involved, it uses Thrift. On the other hand, in the current release of Toad for Cloud there is no way to specify which hive user you want to connect as, so it always connects as the user the hive server is running as, and connects to the default database. What version of Toad for Cloud are you using? Thanks, Amit

On Tue, Jan 31, 2012 at 10:59 AM, Sriram Krishnan skrish...@netflix.com wrote: I was under the impression that Toad uses JDBC – and AFAIK there is no way to authenticate users via JDBC using the HiveServer. FYI - https://issues.apache.org/jira/browse/HIVE-2539. BTW if anyone has a solution to this, I would be very interested to know as well. Sriram

From: Shantian Purkad shantian_pur...@yahoo.com Reply-To: user@hive.apache.org, Shantian Purkad shantian_pur...@yahoo.com Date: Tue, 31 Jan 2012 10:20:04 -0800 To: user@hive.apache.org Subject: Hive Security

Hi, We are running a Hive server and connecting to it through Toad. Everything works fine. Now we want to enable authentication on the hive server. The hive server indicates that we can specify a user id and password while connecting to it. Can someone please guide us on how we can create and set users on the hive server? Regards, Shantian
Re: Hive Custom UDF - hive.aux.jars.path not working
Do you know any way in which this can be done in Hive Server? Amit

On Tue, Aug 23, 2011 at 11:21 AM, Chinna Rao Lalam 72745 chinna...@huawei.com wrote: Hi Amit, Please check issue HIVE-1405; it will help you. That issue targets the same scenario. Thanks, Chinna Rao Lalam

Hi Chinna, That worked, thanks a lot. So once the jar is picked up, is there a way to create a temporary function that is retained even if I quit the interactive shell and start it again? Or do I have to use the create command to register the function every time? Thanks, Amit

On Mon, Aug 22, 2011 at 10:00 PM, Chinna chinna...@huawei.com wrote: Hi, You need to mention the jar like this:

~/Documents/workspace/Hive_0_7_1/build/dist/conf$ grep aux hive-site.xml
<property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist/{URJARNAME}.jar</value></property>

You are using CLI mode, so after changing the value, starting the shell is fine. Hive can also be started in another mode, hiveserver; in that case, after changing the value you need to restart the hive server. Thanks, Chinna Rao Lalam

*From:* Amit Sharma [mailto:amitsharma1...@gmail.com] *Sent:* Tuesday, August 23, 2011 3:35 AM *To:* user@hive.apache.org *Subject:* Re: Hive Custom UDF - hive.aux.jars.path not working

Hi Vaibhav, Excuse my ignorance, as I'm a little new to Hive. What do you mean by restart the Hive Server? I am using the Hive interactive shell for my work, so I start the shell after modifying the config variable. Which server do I need to restart? Amit

On Mon, Aug 22, 2011 at 2:49 PM, Aggarwal, Vaibhav vagg...@amazon.com wrote: Did you restart the hive server after modifying the hive-site.xml settings? I think you need to restart the server to pick up the latest settings in the config file. Thanks, Vaibhav

*From:* Amit Sharma [mailto:amitsharma1...@gmail.com] *Sent:* Monday, August 22, 2011 2:42 PM *To:* user@hive.apache.org *Subject:* Hive Custom UDF - hive.aux.jars.path not working

Hi, I built custom UDFs for hive and they seem to work fine when I explicitly register the jars using the add jar <jarname> command or put them in the environment variable HIVE_AUX_JARS_PATH. But if I add the path as a configuration variable in the hive-site.xml file and try to register the function using create temporary function <functionname> as '<function>', it cannot find the jar. Any idea what's going on here? Here is the snippet from hive-site.xml:

~/Documents/workspace/Hive_0_7_1/build/dist/conf$ grep aux hive-site.xml
<property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist</value></property>

Amit
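On Amit's follow-up about keeping the function registered across shell sessions: as far as I know, the Hive CLI replays any commands placed in a .hiverc file ($HOME/.hiverc, or $HIVE_HOME/bin/.hiverc) at startup, so the registration can be automated even though temporary functions themselves never outlive a session. A sketch, where the jar file name and UDF class are placeholders rather than anything taken from this thread:

-- $HOME/.hiverc: ordinary Hive commands, run by the CLI on every startup.
ADD JAR /Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist/myudfs.jar;  -- placeholder jar name
CREATE TEMPORARY FUNCTION my_func AS 'com.example.hive.udf.MyUDF';  -- placeholder class name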
Re: Hive Custom UDF - hive.aux.jars.path not working
Hi Chinna, That worked, thanks a lot. So once the jar is picked up, is there a way to create a temporary function that is retained even if I quit the interactive shell and start it again? Or do I have to use the create command to register the function every time? Thanks, Amit

On Mon, Aug 22, 2011 at 10:00 PM, Chinna chinna...@huawei.com wrote: Hi, You need to mention the jar like this:

~/Documents/workspace/Hive_0_7_1/build/dist/conf$ grep aux hive-site.xml
<property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist/{URJARNAME}.jar</value></property>

You are using CLI mode, so after changing the value, starting the shell is fine. Hive can also be started in another mode, hiveserver; in that case, after changing the value you need to restart the hive server. Thanks, Chinna Rao Lalam

*From:* Amit Sharma [mailto:amitsharma1...@gmail.com] *Sent:* Tuesday, August 23, 2011 3:35 AM *To:* user@hive.apache.org *Subject:* Re: Hive Custom UDF - hive.aux.jars.path not working

Hi Vaibhav, Excuse my ignorance, as I'm a little new to Hive. What do you mean by restart the Hive Server? I am using the Hive interactive shell for my work, so I start the shell after modifying the config variable. Which server do I need to restart? Amit

On Mon, Aug 22, 2011 at 2:49 PM, Aggarwal, Vaibhav vagg...@amazon.com wrote: Did you restart the hive server after modifying the hive-site.xml settings? I think you need to restart the server to pick up the latest settings in the config file. Thanks, Vaibhav

*From:* Amit Sharma [mailto:amitsharma1...@gmail.com] *Sent:* Monday, August 22, 2011 2:42 PM *To:* user@hive.apache.org *Subject:* Hive Custom UDF - hive.aux.jars.path not working

Hi, I built custom UDFs for hive and they seem to work fine when I explicitly register the jars using the add jar <jarname> command or put them in the environment variable HIVE_AUX_JARS_PATH. But if I add the path as a configuration variable in the hive-site.xml file and try to register the function using create temporary function <functionname> as '<function>', it cannot find the jar. Any idea what's going on here? Here is the snippet from hive-site.xml:

~/Documents/workspace/Hive_0_7_1/build/dist/conf$ grep aux hive-site.xml
<property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist</value></property>

Amit
Hive Custom UDF - hive.aux.jars.path not working
Hi, I built custom UDFs for hive and they seem to work fine when I explicitly register the jars using the add jar <jarname> command or put them in the environment variable HIVE_AUX_JARS_PATH. But if I add the path as a configuration variable in the hive-site.xml file and try to register the function using create temporary function <functionname> as '<function>', it cannot find the jar. Any idea what's going on here? Here is the snippet from hive-site.xml:

~/Documents/workspace/Hive_0_7_1/build/dist/conf$ grep aux hive-site.xml
<property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist</value></property>

Amit
Re: Hive/hbase integration - Rebuild the Storage Handler
Hi, I am also trying the same but don't know the exact build steps. Could someone please share them? -regards, Amit

From: Jean-Charles Thomas jctho...@autoscout24.com To: Hive mailing list user@hive.apache.org Sent: Tue, 22 March, 2011 11:40:18 AM Subject: Hive/hbase integration - Rebuild the Storage Handler

Hi, I am using hbase 0.90 and Hive 0.7 and would like to try the hive/hbase integration. From the wiki doc I could see that I have to rebuild the handler: "If you are not using hbase-0.89.0, you will need to rebuild the handler with the HBase jar matching your version, and change the --auxpath above accordingly". Can someone explain in more detail how this can be done? I unfortunately have only basic java knowledge. Thanks in advance, JC
Dynamic Configuration support in Hive SQL
Hi, Does hive support dynamic configuration? For example, is it possible to write a hive script with some ${PARAM} variables and have hive replace these parameters with their values at runtime?

E.g. the original hive script:
select * from person where age > ${MIN_AGE};

Config file:
MIN_AGE=18

And hive replaces the MIN_AGE parameter automatically. -amit
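For what it's worth, Hive's variable substitution feature covers exactly this use case (controlled by hive.variable.substitute, enabled by default on releases that have it). A minimal sketch; the script file name and the shell invocation shown in the comments are assumptions about the setup, not Hive requirements:

-- script.hql: reference the parameter through the hivevar namespace
-- (in my experience a bare ${MIN_AGE} also resolves).
select * from person where age > ${hivevar:MIN_AGE};

-- Invoked from the shell as, for example:
--   hive --hivevar MIN_AGE=18 -f script.hql
-- or set inside an existing session with:
--   set hivevar:MIN_AGE=18;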