Re: Hive Druid SQL

2018-04-09 Thread Amit
Hi, thanks for checking.

Our Druid installation uses SQL Server as its metastore and Azure for deep
storage, and it is working fine.
That is why I was wondering why Hive would accept only MySQL or Postgres for
Druid integration.

I'm using Hive 2.2.0

Dismantling the Druid setup just for Hive integration would be a big effort, I
would guess.
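
For reference, a minimal JDBC sketch (connection details below are placeholders, not taken from this thread) of setting the property to one of the values Hive 2.2.0 does accept:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class SetDruidDbType {
    public static void main(String[] args) throws Exception {
        // Placeholder HiveServer2 connection details.
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://hiveserver-host:10000/default", "user", "");
        try (Statement stmt = con.createStatement()) {
            // Hive 2.2.0 validates this value against [mysql, postgres];
            // "sqlserver" is rejected, as the error in this thread shows.
            stmt.execute("SET hive.druid.metadata.db.type=mysql");
        }
        con.close();
    }
}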



On Tue, Apr 10, 2018 at 12:46 AM, Ashutosh Chauhan <hashut...@apache.org>
wrote:

> Hi Amit,
>
> Yes, only mysql and postgres are supported for Druid metadata storage.
> That's because Druid only supports these. You mentioned that Hive and Druid
> are working independently. Which metadata storage is your Druid install
> using?
>
> Thanks,
> Ashutosh
>
> On Mon, Apr 9, 2018 at 7:39 PM, Lefty Leverenz <leftylever...@gmail.com>
> wrote:
>
>> >  Does it mean, I cannot use SQLserver as Druid metastore for Hive to
>> work with Druid?
>>
>> Apparently so.
>>
>>- In Hive 2.2.0 *hive.druid.metadata.db.type* was introduced with
>>values "mysql" and "postgres" (HIVE-15277
>><https://issues.apache.org/jira/browse/HIVE-15277>).
>>- In Hive 2.3.0 the value "postgres" was changed to "postgresql" (
>>HIVE-15809 <https://issues.apache.org/jira/browse/HIVE-15809>).
>>- In Hive 3.0.0 (upcoming release) the value "derby" is added (
>>HIVE-18196 <https://issues.apache.org/jira/browse/HIVE-18196>).
>>
>> -- Lefty
>>
>>
>> On Fri, Apr 6, 2018 at 10:09 AM Amit <ami...@gmail.com> wrote:
>>
>>> Hive Druid Integration:
>>> I have Hive and Druid working independently.
>>> But having trouble connecting the two together.
>>> I don't have Hortonworks.
>>>
>>> I have Druid using sqlserver as metadata store database.
>>>
>>> When I try setting this property in Beeline,
>>>
>>> set hive.druid.metadata.db.type=sqlserver;
>>>
>>>  I get a message:
>>> Error: Error while processing statement: 'SET
>>> hive.druid.metadata.db.type=sqlserver' FAILED
>>> in validation : Invalid value.. expects one of patterns [mysql, postgres].
>>> (state=42000,code=1)
>>>
>>> Does it mean, I cannot use SQLserver as Druid metastore for Hive to work
>>> with Druid?
>>>
>>>
>>>
>>>
>


Hive Druid SQL

2018-04-06 Thread Amit
Hive Druid Integration:
I have Hive and Druid working independently, but I am having trouble
connecting the two together.
I don't have Hortonworks.

I have Druid using sqlserver as metadata store database.

When I try setting this property in Beeline,

set hive.druid.metadata.db.type=sqlserver;

 I get a message:
Error: Error while processing statement: 'SET
hive.druid.metadata.db.type=sqlserver' FAILED in
validation : Invalid value.. expects one of patterns [mysql, postgres].
(state=42000,code=1)

Does this mean I cannot use SQL Server as the Druid metastore if I want Hive
to work with Druid?


test

2018-04-06 Thread Amit
test


LLAP Query Failed with no such method exception

2017-08-01 Thread Amit Kumar
Hi,

I have configured Hadoop 2.7.3 and Hive 2.1.1 with LLAP.

Tez queries are running fine, but after the LLAP daemon is launched using
Slider, any insert or count(*) LLAP query throws this exception:

java.lang.Exception: java.util.concurrent.ExecutionException:
java.lang.NoSuchMethodError:
org.apache.hadoop.tracing.SpanReceiverHost.getInstance(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/tracing/SpanReceiverHost;
at
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:271)
at
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69)
at
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at
org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException:
java.lang.NoSuchMethodError:
org.apache.hadoop.tracing.SpanReceiverHost.getInstance(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/tracing/SpanReceiverHost;
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:265)
... 12 more

I supposed this might be because of missing htrace configuration, but after
adding the configuration below to core-site.xml, it is still throwing the
same exception.



<property>
  <name>hadoop.htrace.spanreceiver.classes</name>
  <value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
</property>
<property>
  <name>hadoop.htrace.local-file-span-receiver.path</name>
  <value>/usr/local/hadoop/logs/htrace.out</value>
</property>
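
A NoSuchMethodError like this usually points to mismatched Hadoop jars on the LLAP/Tez classpath rather than missing htrace configuration. A generic diagnostic sketch (not from this thread) for checking which jar the class is actually loaded from, run with the same classpath the LLAP daemon uses:

public class WhichJar {
    public static void main(String[] args) throws Exception {
        // Prints the jar that provides the class whose method signature is missing.
        Class<?> c = Class.forName("org.apache.hadoop.tracing.SpanReceiverHost");
        System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
    }
}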


Thanks & Regards,

Amit Kumar,
Scientist B,
Mob: 9910611621


Re: trouble starting hiveserver2 with hive2.1.1

2017-07-26 Thread Amit Kumar
Hi,

When running hive --service llap on Hive 2.3, it throws the error below:

Failed: java.io.IOException: Target /tmp/staging-slider-hpJkzz/lib/tez is a
directory
java.util.concurrent.ExecutionException: java.io.IOException: Target
/tmp/staging-slider-hpJkzz/lib/tez is a directory
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at
org.apache.hadoop.hive.llap.cli.LlapServiceDriver.run(LlapServiceDriver.java:556)
at
org.apache.hadoop.hive.llap.cli.LlapServiceDriver.main(LlapServiceDriver.java:116)
Caused by: java.io.IOException: Target /tmp/staging-slider-hpJkzz/lib/tez
is a directory
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:500)
at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:502)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:348)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
at
org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1965)
at
org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1933)
at
org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1898)
at
org.apache.hadoop.hive.llap.cli.LlapServiceDriver$3.call(LlapServiceDriver.java:450)
at
org.apache.hadoop.hive.llap.cli.LlapServiceDriver$3.call(LlapServiceDriver.java:404)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
INFO cli.LlapServiceDriver: LLAP service driver finished



Thanks & Regards,

Amit Kumar,
Mob: 9910611621


On Sat, Jul 22, 2017 at 5:00 PM, Amit Kumar <delhiam...@gmail.com> wrote:

> Hi,
>
> I have installed hadoop 2.7.2 and hive 2.1.1. Successfully configured
> mysql as metastore and also able to connect to hive using hive cli.
>
> But on starting hiveserver2, exception is thrown as below:
>
> [hadoop@master bin]$ hiveserver2
> which: no hbase in (/opt/hadoop/hive/bin:/opt/
> hadoop/hive/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/
> usr/bin:/usr/local/sbin:/usr/sbin:/home/hadoop/.local/bin:/
> home/hadoop/bin:/usr/java/default/bin:/opt/hadoop/sbin:/
> opt/hadoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/sqoop/
> bin:/opt/hadoop/slider/bin:/home/hadoop/.local/bin:/home/
> hadoop/bin:/usr/java/default/bin:/opt/hadoop/sbin:/opt/
> hadoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/sqoop/bin:/
> opt/hadoop/slider/bin)
> Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.hive.common.util.HiveStringUtils.startupShutdownMessage(Ljava/
> lang/Class;[Ljava/lang/String;Lorg/apache/commons/logging/Log;)V
> at org.apache.hive.service.server.HiveServer2.main(
> HiveServer2.java:455)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
>
> Thanks & Regards,
>
> Amit Kumar,
> Mob: 9910611621 <099106%2011621>
>
>


trouble starting hiveserver2 with hive2.1.1

2017-07-22 Thread Amit Kumar
Hi,

I have installed Hadoop 2.7.2 and Hive 2.1.1. I successfully configured MySQL
as the metastore and am able to connect to Hive using the Hive CLI.

But on starting hiveserver2, the exception below is thrown:

[hadoop@master bin]$ hiveserver2
which: no hbase in
(/opt/hadoop/hive/bin:/opt/hadoop/hive/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/hadoop/.local/bin:/home/hadoop/bin:/usr/java/default/bin:/opt/hadoop/sbin:/opt/hadoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/slider/bin:/home/hadoop/.local/bin:/home/hadoop/bin:/usr/java/default/bin:/opt/hadoop/sbin:/opt/hadoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/sqoop/bin:/opt/hadoop/slider/bin)
Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.hive.common.util.HiveStringUtils.startupShutdownMessage(Ljava/lang/Class;[Ljava/lang/String;Lorg/apache/commons/logging/Log;)V
at
org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:455)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)


Thanks & Regards,

Amit Kumar,
Mob: 9910611621


Hive Insert query is failing

2017-01-07 Thread Amit Bajpai
Hi,

A Hive query running on Tez that inserts from one table into another is failing
with the error below. Both tables use the ORC file format and all columns in
both tables are string. Can anyone help with how this can be fixed?

Hive version: 1.2.1.2.4

Error message:
Vertex failed, vertexName=Map 1, vertexId=vertex_1483552897173_0276_1_00, 
diagnostics=[Task failed, taskId=task_1483552897173_0276_1_00_00, 
diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: java.lang.RuntimeException: 
java.io.IOException: java.io.IOException: ORC does not support type conversion 
from VARCHAR to STRING
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Query:
insert into table tableB  select col1, col2, col3, col4, col5, col6, col7, col8 
from tableA

Thanks
Amit


Reading hive-site.xml

2016-09-18 Thread Amit Bajpai
Hi,

I am trying to understand how Hive reads its configuration from hive-site.xml:
where the structure of the XML file is defined, and which code is used to
read hive-site.xml.
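
For what it's worth, hive-site.xml follows the standard Hadoop Configuration XML format, and it is loaded by org.apache.hadoop.hive.conf.HiveConf, whose ConfVars enum defines the known properties and their defaults. A minimal sketch, assuming the Hive jars and a hive-site.xml are on the classpath:

import org.apache.hadoop.hive.conf.HiveConf;

public class ReadHiveSite {
    public static void main(String[] args) {
        // Constructing HiveConf loads hive-site.xml found on the classpath,
        // layered over the built-in defaults defined in HiveConf.ConfVars.
        HiveConf conf = new HiveConf();
        System.out.println("hive-site.xml: " + HiveConf.getHiveSiteLocation());
        System.out.println(conf.getVar(HiveConf.ConfVars.METASTOREURIS));
        System.out.println(conf.get("hive.exec.scratchdir"));
    }
}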

Thanks
Amit


RE: Error running SQL query through Hive JDBC

2016-08-05 Thread Amit Bajpai
Below is the code snippet with the SQL query I am running. The same query runs
fine through the Hive CLI.

String sql = " SELECT TBL_CODE FROM DB.CODE_MAP WHERE SYSTEM_NAME='TDS' AND TABLE_NAME=TRIM('XYZ')";

System.out.println("New SQL: " + sql);

String driverName = "org.apache.hive.jdbc.HiveDriver";
try {
    Class.forName(driverName);
    Connection con = DriverManager.getConnection(
            "jdbc:hive2://hiveservername:1/default", "username", "");
    HiveStatement stmt = (HiveStatement) con.createStatement();
    ResultSet res = stmt.executeQuery(sql);

    while (res.next()) {
        Object ret_obj = res.getObject(1);
        System.out.println(res.getString(1));
    }

    stmt.close();
    con.close();
} catch (ClassNotFoundException e) {
    e.printStackTrace();
} catch (SQLException e) {
    e.printStackTrace();
}
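
One known source of "SemanticException UDF = is not allowed" when the same query works in the Hive CLI is HiveServer2's built-in UDF blacklist/whitelist settings (hive.server2.builtin.udf.blacklist and hive.server2.builtin.udf.whitelist), which apply to HiveServer2 sessions. A sketch (connection details are placeholders) of checking those settings over the same JDBC connection:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CheckUdfLists {
    public static void main(String[] args) throws Exception {
        // Placeholder HiveServer2 connection details.
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://hiveservername:10000/default", "username", "");
        try (Statement stmt = con.createStatement()) {
            for (String prop : new String[] {
                    "hive.server2.builtin.udf.whitelist",
                    "hive.server2.builtin.udf.blacklist" }) {
                // "SET <property>" returns a single row of the form property=value.
                ResultSet res = stmt.executeQuery("SET " + prop);
                while (res.next()) {
                    System.out.println(res.getString(1));
                }
            }
        }
        con.close();
    }
}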

From: Markovitz, Dudu [mailto:dmarkov...@paypal.com]
Sent: Friday, August 05, 2016 3:04 PM
To: user@hive.apache.org
Subject: RE: Error running SQL query through Hive JDBC

Can you please share the query?

From: Amit Bajpai [mailto:amit.baj...@flextronics.com]
Sent: Friday, August 05, 2016 10:40 PM
To: user@hive.apache.org<mailto:user@hive.apache.org>
Subject: Error running SQL query through Hive JDBC

Hi,

I am getting the below error when running the SQL query through Hive JDBC. Can 
suggestion how to fix it.

org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: SemanticException UDF = is not allowed
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
at 
org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
at 
org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
at 
org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392)
at com.flex.hdp.logs.test.main(test.java:84)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling 
statement: FAILED: SemanticException UDF = is not allowed
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:314)
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:111)
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:180)
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:256)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:376)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
at com.sun.proxy.$Proxy32.executeStatementAsync(Unknown Source)
at 
org.apache.hiv

Error running SQL query through Hive JDBC

2016-08-05 Thread Amit Bajpai
Hi,

I am getting the below error when running the SQL query through Hive JDBC. Can
you suggest how to fix it?

org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: SemanticException UDF = is not allowed
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
at 
org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
at 
org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
at 
org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:392)
at com.flex.hdp.logs.test.main(test.java:84)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling 
statement: FAILED: SemanticException UDF = is not allowed
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:314)
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:111)
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:180)
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:256)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:376)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
at com.sun.proxy.$Proxy32.executeStatementAsync(Unknown Source)
at 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:271)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:401)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
at 
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at 
org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.parse.SemanticException:UDF = is not allowed
at 
org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:677)
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:810)
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1152)
at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:189)
at 

RE: hive concurrency not working

2016-08-03 Thread Amit Bajpai
You need to increase the value of the Hive property below in Ambari:

hive.server2.tez.sessions.per.default.queue

If this does not fix the issue, then you need to update the capacity scheduler
property values.

From: Raj hadoop [mailto:raj.had...@gmail.com]
Sent: Wednesday, August 03, 2016 8:15 AM
To: user@hive.apache.org
Subject: hive concurrency not working

Dear All,

I am in need of your help.

We have a Hortonworks 4-node cluster, and the problem is that Hive is allowing
only one user at a time.

If a second user needs to log in, Hive does not work.

Could someone please help me with this?

Thanks,
Rajesh



RE: Yarn Application ID for Hive query

2016-07-18 Thread Amit Bajpai
I am running Hive on Tez. I am able to get the YARN application ID for the Hive
query by submitting the query through Hive JDBC and using HiveStatement.

Connection con = DriverManager.getConnection("jdbc:hive2://abc:1/default", "xyz", "");
HiveStatement stmt = (HiveStatement) con.createStatement();
String sql = " SELECT COMP_ID, COUNT(1) FROM tableA GROUP BY COMP_ID ";
ResultSet res = stmt.executeQuery(sql);
String yarn_app_id = new String();

for (String log : stmt.getQueryLog()) {
    if (log.contains("App id")) {
        yarn_app_id = log.substring(log.indexOf("App id") + 7, log.length() - 1);
    }
}

System.out.println("YARN Application ID: " + yarn_app_id);

Now I am trying to find the Tez DAG ID for the query.


From: Gerber, Bryan W [mailto:bryan.ger...@pnnl.gov]
Sent: Monday, July 18, 2016 1:47 PM
To: user@hive.apache.org
Subject: RE: Yarn Application ID for Hive query

Making Hive look like a normal SQL database is the goal of libraries like this,
so it makes sense that the abstraction wouldn't leak a concept like an
application ID, especially because not all Hive queries generate a YARN application.

That said, we went through this with JDBC access to Hive a while back to allow 
our user interface to cancel a query. Only relevant discussion I found was 
here: 
http://grokbase.com/t/cloudera/hue-user/1373c258xg/how-hue-beeswax-is-able-to-read-the-hadoop-job-id-that-gets-generated-by-hiveserver2

We are using this method, plus a background task that polls the YARN resource 
manager API to find the job with the corresponding hive.session.id. It is a lot 
of work for something that seems very simple. It would be nice to have access 
to a command or API call in HiveServer2 similar to MySQL's "SHOW PROCESSLIST" 
(and equivalent commands in most other databases).
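
A rough sketch of that polling approach (the ResourceManager address, the default port 8088, and the assumption that the Tez AM's application name contains the hive.session.id are assumptions, not details from this message; a real implementation should use a JSON parser):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class FindAppBySessionId {
    public static String findApp(String rmHost, String hiveSessionId) throws Exception {
        // Ask the YARN ResourceManager REST API for running applications.
        URL url = new URL("http://" + rmHost + ":8088/ws/v1/cluster/apps?states=RUNNING");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
        }
        // Naive scan: locate the app whose "name" field mentions the session id,
        // then read back to the preceding "id" field.
        String json = body.toString();
        int namePos = json.indexOf(hiveSessionId);
        if (namePos < 0) {
            return null;
        }
        int idPos = json.lastIndexOf("\"id\":\"", namePos);
        if (idPos < 0) {
            return null;
        }
        int start = idPos + "\"id\":\"".length();
        return json.substring(start, json.indexOf('"', start));
    }
}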

From: Amit Bajpai [mailto:amit.baj...@flextronics.com]
Sent: Thursday, July 14, 2016 10:22 PM
To: user@hive.apache.org<mailto:user@hive.apache.org>
Subject: Yarn Application ID for Hive query

Hi,

I am using the below python program to run a hive query. How can I get the Yarn 
application ID using the python program for the hive query execution.

import pyhs2

with pyhs2.connect(host='abc.sac.com',
                   port=1,
                   authMechanism="PLAIN",
                   user='amit',
                   password='amit',
                   database='default') as conn:
    with conn.cursor() as cur:
        # Execute query
        cur.execute("SELECT COMP_ID, COUNT(1) FROM tableA GROUP BY COMP_ID")

        # Fetch table results
        for i in cur.fetch():
            print i

Thanks
Amit



Yarn Application ID for Hive query

2016-07-14 Thread Amit Bajpai
Hi,

I am using the Python program below to run a Hive query. How can I get the YARN
application ID for the Hive query execution from the Python program?

import pyhs2

with pyhs2.connect(host='abc.sac.com',
                   port=1,
                   authMechanism="PLAIN",
                   user='amit',
                   password='amit',
                   database='default') as conn:
    with conn.cursor() as cur:
        # Execute query
        cur.execute("SELECT COMP_ID, COUNT(1) FROM tableA GROUP BY COMP_ID")

        # Fetch table results
        for i in cur.fetch():
            print i

Thanks
Amit



Date parsing Exception at Hadoop server

2015-09-01 Thread Amit Dutta
Hi Gurus,

I am facing a ParseException with the following code in my Hive UDF (just the
evaluate method).

private final SimpleDateFormat sdf = new SimpleDateFormat("dd-MM-yyyy", Locale.US);

public Object evaluate(DeferredObject[] arguments) throws HiveException {
    String result = "0";
    assert (arguments.length == 1);

    List<Text> list = (List<Text>) this.listOI.getList(arguments[0].get());
    if (list == null) {
        return null;
    }
    System.out.println("- Size :" + list.get(0));
    if (list.size() > 0) {
        List<Date> listDates = new ArrayList<Date>();
        // result = compareDates(list);
        for (Text dateTxt : list) {
            try {
                System.out.println(dateTxt.toString());
                String dt = new String(dateTxt.toString().trim());
                Date transDate = sdf.parse(dt);
                listDates.add(transDate);
                System.out.println(listDates.size());
            } catch (ParseException e) {
                System.err.println(e.getMessage());
                e.printStackTrace();
            }
        }
        if (listDates.size() > 0) {
            Date resultDate = Collections.min(listDates);
            result = sdf.format(resultDate);
        }
    }
    return result;
}

The same code is passing the test perfectly. Following is the test method.

public void testGetMinDate() throws HiveException {

    // set up the models we need
    GetMinDate example = new GetMinDate();
    ObjectInspector stringOI = PrimitiveObjectInspectorFactory.javaStringObjectInspector;
    ObjectInspector listOI = ObjectInspectorFactory.getStandardListObjectInspector(stringOI);
    StringObjectInspector resultInspector =
            (StringObjectInspector) example.initialize(new ObjectInspector[]{listOI});

    // create the actual UDF arguments
    List<Text> list = new ArrayList<Text>();
    list.add(new Text("01-01-2015"));
    list.add(new Text("01-03-2014"));
    list.add(new Text("04-01-2015"));

    // test our results

    // the value exists
    Object result = example.evaluate(new DeferredObject[]{new DeferredJavaObject(list)});
    System.out.println(result);
    Assert.assertEquals("01-03-2014", result);

    // the value doesn't exist
    // Object result2 = example.evaluate(new DeferredObject[]{new DeferredJavaObject(list)});
    // Assert.assertEquals("Success", result2);

    // arguments are null
    Object result3 = example.evaluate(new DeferredObject[]{new DeferredJavaObject(null)});
    Assert.assertNull(result3);
}

Following is the error

java.text.ParseException: Unparseable date: "23-05-2015"
at java.text.DateFormat.parse(DateFormat.java:357)
at com.vzw.mct.GetMinDate.evaluate(GetMinDate.java:44)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:166)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at 
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:79)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1064)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:875)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:737)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:803)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:262)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

I am not sure why it is failing on the server. If anyone could kindly point it
out, that would be great.


Thanks,

Amit


Re: user matching query does not exist

2015-05-15 Thread amit kumar
I am using CDH 5.2.1.

Any pointers will be of immense help.



Thanks



On Fri, May 15, 2015 at 9:43 AM, amit kumar ak3...@gmail.com wrote:

 Hi,

 After re-create my account in Hue, i receives “User matching query does
 not exist” when attempting to perform hive query.

 The query is succeed in hive command line.

 Please suggest on this,

 
 Thanks you
 Amit



Re: user matching query does not exist

2015-05-15 Thread amit kumar
Thank you Nitin,

When the user runs the query via the Hive command line, the query succeeds
(a query like select * from railway;).

As per the link you provided, I ran the command ./manage.py clearsessions and
I get the error.



On Fri, May 15, 2015 at 12:32 PM, Nitin Pawar nitinpawar...@gmail.com
wrote:

 this is related to djnago
 see this on how to clear sessions from django

 http://www.opencsw.org/community/questions/289/how-to-clear-the-django-session-cache

 On Fri, May 15, 2015 at 12:24 PM, amit kumar ak3...@gmail.com wrote:

 Yes it is happening  for hue only, can u plz suggest how i cleaning up
 hue session from server ?

 The query is succeed in hive command line.

 On Fri, May 15, 2015 at 11:52 AM, Nitin Pawar nitinpawar...@gmail.com
 wrote:

 Is this happening for Hue?

 If yes, may be you can try cleaning up hue sessions from server. (this
 may clean all users active sessions from hue so be careful while doing it)



 On Fri, May 15, 2015 at 11:31 AM, amit kumar ak3...@gmail.com wrote:

 i am using CDH 5.2.1,

 Any pointers will be of immense help.



 Thanks



 On Fri, May 15, 2015 at 9:43 AM, amit kumar ak3...@gmail.com wrote:

 Hi,

 After re-create my account in Hue, i receives “User matching query
 does not exist” when attempting to perform hive query.

 The query is succeed in hive command line.

 Please suggest on this,

 
 Thanks you
 Amit





 --
 Nitin Pawar





 --
 Nitin Pawar



Re: user matching query does not exist

2015-05-15 Thread amit kumar
Yes, it is happening for Hue only. Can you please suggest how I can clean up
the Hue sessions from the server?

The query succeeds in the Hive command line.

On Fri, May 15, 2015 at 11:52 AM, Nitin Pawar nitinpawar...@gmail.com
wrote:

 Is this happening for Hue?

 If yes, may be you can try cleaning up hue sessions from server. (this may
 clean all users active sessions from hue so be careful while doing it)



 On Fri, May 15, 2015 at 11:31 AM, amit kumar ak3...@gmail.com wrote:

 i am using CDH 5.2.1,

 Any pointers will be of immense help.



 Thanks



 On Fri, May 15, 2015 at 9:43 AM, amit kumar ak3...@gmail.com wrote:

 Hi,

 After re-create my account in Hue, i receives “User matching query does
 not exist” when attempting to perform hive query.

 The query is succeed in hive command line.

 Please suggest on this,

 
 Thanks you
 Amit





 --
 Nitin Pawar



user matching query does not exist

2015-05-14 Thread amit kumar
Hi,

After re-creating my account in Hue, I receive “User matching query does not
exist” when attempting to run a Hive query.

The query succeeds in the Hive command line.

Please advise on this.


Thank you
Amit


Re: Hive : plan serialization format option

2015-05-07 Thread amit kumar
What error are you getting after specifying javaXML in place of kryo?

On Wed, May 6, 2015 at 12:44 AM, Bhagwan S. Soni bhgwnsson...@gmail.com
wrote:

 Please find attached error log for the same.

 On Tue, May 5, 2015 at 11:36 PM, Jason Dere jd...@hortonworks.com wrote:

  Looks like you are running into
 https://issues.apache.org/jira/browse/HIVE-8321, fixed in Hive-0.14.
 You might be stuck having to use Kryo, what are the issues you are having
 with Kryo?


  Thanks,
 Jason

  On May 5, 2015, at 4:28 AM, Bhagwan S. Soni bhgwnsson...@gmail.com
 wrote:

  Bottom on the log:

 at java.beans.Encoder.writeObject(Encoder.java:74)

 at java.beans.XMLEncoder.writeObject(XMLEncoder.java:327)

 at java.beans.Encoder.writeExpression(Encoder.java:330)

 at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:454)

 at
 java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:194)

 at
 java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)

 ... 98 more

 Caused by: java.lang.NullPointerException

 at java.lang.StringBuilder.init(StringBuilder.java:109)

 at
 org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:49)

 at
 org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:45)

 at
 org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo.getTypeName(VarcharTypeInfo.java:37)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)

 at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)

 at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)

 at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)

 at java.beans.Statement.invokeInternal(Statement.java:292)

 at java.beans.Statement.access$000(Statement.java:58)

 at java.beans.Statement$2.run(Statement.java:185)

 at java.security.AccessController.doPrivileged(Native Method)

 at java.beans.Statement.invoke(Statement.java:182)

 at java.beans.Expression.getValue(Expression.java:153)

 at
 java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:193)

 at
 java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)

 ... 111 more

 Job Submission failed with exception
 'java.lang.RuntimeException(java.lang.RuntimeException: Cannot serialize
 object)'

 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask

 On Tue, May 5, 2015 at 3:10 PM, Jason Dere jd...@hortonworks.com wrote:

 kryo/javaXML are the only available options. What are the errors you see
 with each setting?


  On May 1, 2015, at 9:41 AM, Bhagwan S. Soni bhgwnsson...@gmail.com
 wrote:

   Hi Hive Users,

  I'm using cloudera's hive 0.13 version which by default provide Kryo
 plan serialization format.
  <property>
  <name>hive.plan.serialization.format</name>
  <value>kryo</value>
  </property>

  As i'm facing issues with Kryo, can anyone help me identify the other
 open options in place of Kryo for hive plan serialization format.

  I know one option javaXML, but in my case it is not working.











Re: Hive : plan serialization format option

2015-05-07 Thread amit kumar
Jason,

The last comment is "This has been fixed in 0.14 release. Please open new
jira if you see any issues."

Is this issue resolved in Hive 0.14?



On Tue, May 5, 2015 at 11:36 PM, Jason Dere jd...@hortonworks.com wrote:

  Looks like you are running into
 https://issues.apache.org/jira/browse/HIVE-8321, fixed in Hive-0.14.
 You might be stuck having to use Kryo, what are the issues you are having
 with Kryo?


  Thanks,
 Jason

  On May 5, 2015, at 4:28 AM, Bhagwan S. Soni bhgwnsson...@gmail.com
 wrote:

  Bottom on the log:

 at java.beans.Encoder.writeObject(Encoder.java:74)

 at java.beans.XMLEncoder.writeObject(XMLEncoder.java:327)

 at java.beans.Encoder.writeExpression(Encoder.java:330)

 at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:454)

 at
 java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:194)

 at
 java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)

 ... 98 more

 Caused by: java.lang.NullPointerException

 at java.lang.StringBuilder.init(StringBuilder.java:109)

 at
 org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:49)

 at
 org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:45)

 at
 org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo.getTypeName(VarcharTypeInfo.java:37)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)

 at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)

 at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)

 at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)

 at java.beans.Statement.invokeInternal(Statement.java:292)

 at java.beans.Statement.access$000(Statement.java:58)

 at java.beans.Statement$2.run(Statement.java:185)

 at java.security.AccessController.doPrivileged(Native Method)

 at java.beans.Statement.invoke(Statement.java:182)

 at java.beans.Expression.getValue(Expression.java:153)

 at
 java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:193)

 at
 java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)

 ... 111 more

 Job Submission failed with exception
 'java.lang.RuntimeException(java.lang.RuntimeException: Cannot serialize
 object)'

 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask

 On Tue, May 5, 2015 at 3:10 PM, Jason Dere jd...@hortonworks.com wrote:

 kryo/javaXML are the only available options. What are the errors you see
 with each setting?


  On May 1, 2015, at 9:41 AM, Bhagwan S. Soni bhgwnsson...@gmail.com
 wrote:

   Hi Hive Users,

  I'm using cloudera's hive 0.13 version which by default provide Kryo
 plan serialization format.
  <property>
  <name>hive.plan.serialization.format</name>
  <value>kryo</value>
  </property>

  As i'm facing issues with Kryo, can anyone help me identify the other
 open options in place of Kryo for hive plan serialization format.

  I know one option javaXML, but in my case it is not working.










Re: Hive : plan serialization format option

2015-05-07 Thread amit kumar
Thank you Jason, I will upgrade to Hive 0.14 and try it out.

On Fri, May 8, 2015 at 1:43 AM, Jason Dere jd...@hortonworks.com wrote:

  The javaXML issue referenced by that bug should be fixed by hive-0.14 ..
 note the original poster was using hive-0.13


  On May 7, 2015, at 12:48 PM, amit kumar ak3...@gmail.com wrote:

  Jason,

  The last comment is This has been fixed in 0.14 release. Please open
 new jira if you see any issues.

  is this issue resolved in hive 0.14 ?



 On Tue, May 5, 2015 at 11:36 PM, Jason Dere jd...@hortonworks.com wrote:

  Looks like you are running into
 https://issues.apache.org/jira/browse/HIVE-8321, fixed in Hive-0.14.
 You might be stuck having to use Kryo, what are the issues you are having
 with Kryo?


  Thanks,
 Jason

  On May 5, 2015, at 4:28 AM, Bhagwan S. Soni bhgwnsson...@gmail.com
 wrote:

  Bottom on the log:

 at java.beans.Encoder.writeObject(Encoder.java:74)

 at java.beans.XMLEncoder.writeObject(XMLEncoder.java:327)

 at java.beans.Encoder.writeExpression(Encoder.java:330)

 at java.beans.XMLEncoder.writeExpression(XMLEncoder.java:454)

 at
 java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:194)

 at
 java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)

 ... 98 more

 Caused by: java.lang.NullPointerException

 at java.lang.StringBuilder.init(StringBuilder.java:109)

 at
 org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:49)

 at
 org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo.getQualifiedName(BaseCharTypeInfo.java:45)

 at
 org.apache.hadoop.hive.serde2.typeinfo.VarcharTypeInfo.getTypeName(VarcharTypeInfo.java:37)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)

 at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75)

 at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)

 at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279)

 at java.beans.Statement.invokeInternal(Statement.java:292)

 at java.beans.Statement.access$000(Statement.java:58)

 at java.beans.Statement$2.run(Statement.java:185)

 at java.security.AccessController.doPrivileged(Native Method)

 at java.beans.Statement.invoke(Statement.java:182)

 at java.beans.Expression.getValue(Expression.java:153)

 at
 java.beans.DefaultPersistenceDelegate.doProperty(DefaultPersistenceDelegate.java:193)

 at
 java.beans.DefaultPersistenceDelegate.initBean(DefaultPersistenceDelegate.java:256)

 ... 111 more

 Job Submission failed with exception
 'java.lang.RuntimeException(java.lang.RuntimeException: Cannot serialize
 object)'

 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask

 On Tue, May 5, 2015 at 3:10 PM, Jason Dere jd...@hortonworks.com wrote:

 kryo/javaXML are the only available options. What are the errors you see
 with each setting?


  On May 1, 2015, at 9:41 AM, Bhagwan S. Soni bhgwnsson...@gmail.com
 wrote:

   Hi Hive Users,

  I'm using cloudera's hive 0.13 version which by default provide Kryo
 plan serialization format.
  <property>
  <name>hive.plan.serialization.format</name>
  <value>kryo</value>
  </property>

  As i'm facing issues with Kryo, can anyone help me identify the other
 open options in place of Kryo for hive plan serialization format.

  I know one option javaXML, but in my case it is not working.












Re: Unable to move files on Hive/Hdfs

2015-05-04 Thread amit kumar
Hi Doug,

I am using CDH 5.2.1. The steps I had performed were:

Disable ACLs on Name Nodes
Set Enable Access Control Lists = False
Save Changes
Restart Hadoop Cluster



Stack trace:

2015-05-04 10:38:18,820 INFO  [main]: exec.Task
(SessionState.java:printInfo(537)) - Moving data to:
hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
from
hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002

2015-05-04 10:38:18,857 ERROR [main]: exec.Task
(SessionState.java:printError(546)) - Failed with exception Unable to move
sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
to destination
hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1

org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move
sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
to destination
hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1

at
org.apache.hadoop.hive.ql.metadata.Hive.renameFile(Hive.java:2269)

at
org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:89)

at
org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:200)

at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)

at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)

at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)

at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)

at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:914)

at
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)

at
org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)

at
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)

at
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)

at
org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)

at
org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)

at
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)

at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)

at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Caused by: org.apache.hadoop.hdfs.protocol.AclException: The ACL operation
has been rejected.  Support for ACLs has been disabled by setting
dfs.namenode.acls.enabled to false.

at
org.apache.hadoop.hdfs.server.namenode.NNConf.checkAclsConfigFlag(NNConf.java:85)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAclStatus(FSNamesystem.java:8553)


After rolling those same changes back, the problem resolved itself.


On Tue, May 5, 2015 at 4:28 AM, Moore, Douglas 
douglas.mo...@thinkbiganalytics.com wrote:

   Hi Amit,

  We've seen the same error on MoveTask with Hive 0.14 / HDP 2.2 release.
 There are lots of reasons for this though.
 Can you provide more details about the stack trace and version so we can
 compare?

  For our problem we've seen some relief with SET
 hive.metastore.client.socket.timeout=60s
 but the problem still happens from time to time.

  Thanks,
 Douglas

   From: amit kumar ak3...@gmail.com
 Reply-To: user@hive.apache.org
 Date: Tue, 5 May 2015 03:12:15 +0530
 To: user@hive.apache.org
 Subject: Unable to move files on Hive/Hdfs

   While moving the data from hive/hdfs we get below error,

 Please suggest on this.

 Moving data to:
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-1
 Failed with exception Unable to move
 sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911\
 235-1/-ext-10002 to destination
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-\
 1
 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.MoveTask
 MapReduce Jobs Launched:
 Stage-Stage-1: Map: 1 Cumulative CPU: 5.83 sec HDFS Read: 553081 HDFS
 Write: 489704 SUCCESS
 Total MapReduce CPU Time Spent: 5 seconds 830 msec
 Error (1). Execution Failed.
 2015-05-04 10:03:13 ERROR (1) in run_hive

 Thanks,



Re: Unable to move files on Hive/Hdfs

2015-05-04 Thread amit kumar
Hi Doug,

I am using CDH 5.2.1.

I performed the task below and was getting the error, but after rolling back
those changes the issue resolved itself.

Disable ACLs on Name Nodes
Set Enable Access Control Lists = False
Save Changes
Restart Hadoop Cluster


Thanks,


On Tue, May 5, 2015 at 4:36 AM, amit kumar ak3...@gmail.com wrote:

 Hi Doug,

 I have use CDH 5.2.1

 Disable ACLs on Name Nodes


 Set Enable Access Control Lists = False

 Save Changes

 Restart Hadoop Cluster



 Stack trace:

 2015-05-04 10:38:18,820 INFO  [main]: exec.Task
 (SessionState.java:printInfo(537)) - Moving data to:
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
 from
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002

 2015-05-04 10:38:18,857 ERROR [main]: exec.Task
 (SessionState.java:printError(546)) - Failed with exception Unable to move
 sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
 to destination
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1

 org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move
 sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
 to destination
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1

 at
 org.apache.hadoop.hive.ql.metadata.Hive.renameFile(Hive.java:2269)

 at
 org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:89)

 at
 org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:200)

 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)

 at
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)

 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)

 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)

 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)

 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)

 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:914)

 at
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)

 at
 org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)

 at
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)

 at
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)

 at
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)

 at
 org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)

 at
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)

 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)

 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)

 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 Caused by: org.apache.hadoop.hdfs.protocol.AclException: The ACL operation
 has been rejected.  Support for ACLs has been disabled by setting
 dfs.namenode.acls.enabled to false.

 at
 org.apache.hadoop.hdfs.server.namenode.NNConf.checkAclsConfigFlag(NNConf.java:85)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAclStatus(FSNamesystem.java:8553)


 After rolling those same changes out, the problem resolved itself.


 On Tue, May 5, 2015 at 4:28 AM, Moore, Douglas 
 douglas.mo...@thinkbiganalytics.com wrote:

   Hi Amit,

  We've seen the same error on MoveTask with Hive 0.14 / HDP 2.2 release.
 There are lots of reasons for this though.
 Can you provide more details about the stack trace and version so we can
 compare?

  For our problem we've seen some relief with SET
 hive.metastore.client.socket.timeout=60s
 but the problem still happens from time to time.

  Thanks,
 Douglas

   From: amit kumar ak3...@gmail.com
 Reply-To: user@hive.apache.org
 Date: Tue, 5 May 2015 03:12:15 +0530
 To: user@hive.apache.org
 Subject: Unable to move files on Hive/Hdfs

   While moving the data from hive/hdfs we get below error,

 Please suggest on this.

 Moving data to:
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-1
 Failed with exception Unable to move
 sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911\
 235-1/-ext-10002 to destination
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-\
 1
 FAILED: Execution Error, return

Re: Unable to move files on Hive/Hdfs

2015-05-04 Thread amit kumar
Doug,

Do I need to change any configuration, or do anything else, to resolve this issue?

Thanks

On Tue, May 5, 2015 at 4:46 AM, amit kumar ak3...@gmail.com wrote:

 Do you have any suggestion to resolve this issue,

 I am looking for a resolution.

 On Tue, May 5, 2015 at 4:42 AM, Moore, Douglas 
 douglas.mo...@thinkbiganalytics.com wrote:

   Yep, permission problem. Weird though it seems to be moving a file
 within the same dir.

 Thanks for the update!

 - Douglas
From: amit kumar ak3...@gmail.com
 Reply-To: user@hive.apache.org
 Date: Tue, 5 May 2015 04:40:18 +0530
 To: user@hive.apache.org
 Subject: Re: Unable to move files on Hive/Hdfs

  Hi Doug,

  I have use CDH 5.2.1

  I performed the below task, and getting the error, but after rolling
 back the below changes issue has been resolved itself.

  Disable ACLs on Name Nodes
 Set Enable Access Control Lists = False
 Save Changes
 Restart Hadoop Cluster


  Thanks,


 On Tue, May 5, 2015 at 4:36 AM, amit kumar ak3...@gmail.com wrote:

 Hi Doug,

  I have use CDH 5.2.1

  Disable ACLs on Name Nodes


  Set Enable Access Control Lists = False

 Save Changes

 Restart Hadoop Cluster



  Stack trace:

 2015-05-04 10:38:18,820 INFO  [main]: exec.Task
 (SessionState.java:printInfo(537)) - Moving data to:
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
 from
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002

 2015-05-04 10:38:18,857 ERROR [main]: exec.Task
 (SessionState.java:printError(546)) - Failed with exception Unable to move
 sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
 to destination
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1

 org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move
 sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
 to destination
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1

 at
 org.apache.hadoop.hive.ql.metadata.Hive.renameFile(Hive.java:2269)

 at
 org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:89)

 at
 org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:200)

 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)

 at
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)

 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)

 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)

 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)

 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)

 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:914)

 at
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)

 at
 org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)

 at
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)

 at
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)

 at
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)

 at
 org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)

 at
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)

 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)

 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)

 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 Caused by: org.apache.hadoop.hdfs.protocol.AclException: The ACL
 operation has been rejected.  Support for ACLs has been disabled by setting
 dfs.namenode.acls.enabled to false.

 at
 org.apache.hadoop.hdfs.server.namenode.NNConf.checkAclsConfigFlag(NNConf.java:85)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAclStatus(FSNamesystem.java:8553)


  After rolling those same changes out, the problem resolved itself.
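
  For reference, the stack trace shows the rename path calling getAclStatus on the
  NameNode, so the move only works while ACL support is enabled. The property
  involved lives in hdfs-site.xml (the same thing the Cloudera Manager checkbox
  above toggles); a minimal sketch of the setting that keeps the move working:

    dfs.namenode.acls.enabled = true

  followed by a NameNode restart.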


 On Tue, May 5, 2015 at 4:28 AM, Moore, Douglas 
 douglas.mo...@thinkbiganalytics.com wrote:

   Hi Amit,

  We've seen the same error on MoveTask with Hive 0.14 / HDP 2.2
 release. There are lots of reasons for this though.
 Can you provide more details about the stack trace and version so we
 can compare?

  For our problem we've seen some relief with SET
 hive.metastore.client.socket.timeout=60s
 but the problem still happens from time to time.

  Thanks,
 Douglas

   From: amit kumar ak3

Re: Unable to move files on Hive/Hdfs

2015-05-04 Thread amit kumar
Do you have any suggestions for resolving this issue?

I am looking for a resolution.

On Tue, May 5, 2015 at 4:42 AM, Moore, Douglas 
douglas.mo...@thinkbiganalytics.com wrote:

   Yep, permission problem. Weird though it seems to be moving a file
 within the same dir.

 Thanks for the update!

 - Douglas
From: amit kumar ak3...@gmail.com
 Reply-To: user@hive.apache.org
 Date: Tue, 5 May 2015 04:40:18 +0530
 To: user@hive.apache.org
 Subject: Re: Unable to move files on Hive/Hdfs

  Hi Doug,

  I am using CDH 5.2.1.

  I performed the task below and was getting the error, but after rolling
 back those changes the issue resolved itself.

  Disable ACLs on Name Nodes
 Set Enable Access Control Lists = False
 Save Changes
 Restart Hadoop Cluster


  Thanks,


 On Tue, May 5, 2015 at 4:36 AM, amit kumar ak3...@gmail.com wrote:

 Hi Doug,

  I have use CDH 5.2.1

  Disable ACLs on Name Nodes


  Set Enable Access Control Lists = False

 Save Changes

 Restart Hadoop Cluster



  Stack trace:

 2015-05-04 10:38:18,820 INFO  [main]: exec.Task
 (SessionState.java:printInfo(537)) - Moving data to:
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1
 from
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002

 2015-05-04 10:38:18,857 ERROR [main]: exec.Task
 (SessionState.java:printError(546)) - Failed with exception Unable to move
 sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
 to destination
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1

 org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move
 sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-10002
 to destination
 hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-37-43_010_529731467724830376-1/-ext-1

 at
 org.apache.hadoop.hive.ql.metadata.Hive.renameFile(Hive.java:2269)

 at
 org.apache.hadoop.hive.ql.exec.MoveTask.moveFile(MoveTask.java:89)

 at
 org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:200)

 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)

 at
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)

 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)

 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)

 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)

 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)

 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:914)

 at
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:269)

 at
 org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:221)

 at
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:431)

 at
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:367)

 at
 org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:464)

 at
 org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:474)

 at
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)

 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:694)

 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:633)

 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

 at java.lang.reflect.Method.invoke(Method.java:606)

 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

 Caused by: org.apache.hadoop.hdfs.protocol.AclException: The ACL
 operation has been rejected.  Support for ACLs has been disabled by setting
 dfs.namenode.acls.enabled to false.

 at
 org.apache.hadoop.hdfs.server.namenode.NNConf.checkAclsConfigFlag(NNConf.java:85)
 at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAclStatus(FSNamesystem.java:8553)


  After rolling those same changes out, the problem resolved itself.


 On Tue, May 5, 2015 at 4:28 AM, Moore, Douglas 
 douglas.mo...@thinkbiganalytics.com wrote:

   Hi Amit,

  We've seen the same error on MoveTask with Hive 0.14 / HDP 2.2
 release. There are lots of reasons for this though.
 Can you provide more details about the stack trace and version so we can
 compare?

  For our problem we've seen some relief with SET
 hive.metastore.client.socket.timeout=60s
 but the problem still happens from time to time.

  Thanks,
 Douglas

   From: amit kumar ak3...@gmail.com
 Reply-To: user@hive.apache.org
 Date: Tue, 5 May 2015 03:12:15 +0530
 To: user@hive.apache.org
 Subject: Unable to move files on Hive/Hdfs

   While moving

Unable to move files on Hive/Hdfs

2015-05-04 Thread amit kumar
While moving the data from hive/hdfs we get below error,

Please suggest on this.

Moving data to:
hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-1
Failed with exception Unable to move
sourcehdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911\
235-1/-ext-10002 to destination
hdfs://nameservice1/tmp/hive-srv-hdp-edh-d/hive_2015-05-04_10-02-39_841_5305383954203911235-1/-ext-\
1
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.MoveTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Cumulative CPU: 5.83 sec HDFS Read: 553081 HDFS
Write: 489704 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 830 msec
Error (1). Execution Failed.
2015-05-04 10:03:13 ERROR (1) in run_hive

Thanks,


what is the bench mark using SSD for HDFS over HDD

2014-12-02 Thread Amit Behera
Hi users,

I want to know the difference in query execution time in Hive if I use SSDs
for HDFS versus HDDs for HDFS.


Thanks,
Amit
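
A rough way to measure it yourself (just a sketch; the query, table name and repeat
count below are placeholders) is to time the same query a few times on each storage
setup and compare the wall-clock numbers:

  for i in 1 2 3; do
    /usr/bin/time -f "%e seconds" hive -e "SELECT COUNT(*) FROM my_table" 2>> hdd_times.txt
  done

Run it once with the HDFS data directories on HDD and once on SSD. As a general
rule the gap is largest for shuffle-heavy and random-read work, while big sequential
scans are often limited by CPU or decompression rather than the disks.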


Query is stuck in middle

2014-11-26 Thread Amit Behera
Hi users,

I am running a join query between a 100 GB table and a 10 GB table.
My query got stuck, without giving any error, as shown below.

2014-11-26 20:19:53,893 Stage-1 map = 99%,  reduce = 10%, Cumulative CPU
29443.21 sec
2014-11-26 20:20:53,920 Stage-1 map = 99%,  reduce = 10%, Cumulative CPU
29480.04 sec
2014-11-26 20:21:53,923 Stage-1 map = 99%,  reduce = 10%, Cumulative CPU
29516.21 sec
2014-11-26 20:22:53,935 Stage-1 map = 99%,  reduce = 10%, Cumulative CPU
29552.95 sec


Please help me find a solution.

Thanks
Amit
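
For what it's worth, a long stall like this on a big join often means a handful of
reducers are stuck on skewed keys. A minimal sketch of settings to try (values are
illustrative only, not tuned for this cluster):

  set hive.optimize.skewjoin=true;
  set hive.skewjoin.key=100000;
  set mapreduce.reduce.memory.mb=4096;

It is also worth opening the job tracking URL and checking whether one reduce task
is doing all the work while the rest have finished.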


Container launch failed Error

2014-11-24 Thread Amit Behera
Hi Users,

*my cluster(1+8) configuration*:

RAM  : 32 GB each
HDFS : 1.5 TB SSD
CPU   : 8 core each

---

I am trying to query a 300 GB table, but I am only able to run a plain select
query.

For every query other than a plain select, I am getting the following exception.





Total jobs = 1

Stage-1 is selected by condition resolver.

Launching Job 1 out of 1

Number of reduce tasks not specified. Estimated
from input data size: 183

In order to change the average load for a
reducer (in bytes):

  set
hive.exec.reducers.bytes.per.reducer=number

In order to limit the maximum number of
reducers:

  set hive.exec.reducers.max=number

In order to set a constant number of reducers:

  set mapreduce.job.reduces=number

Starting Job = job_1416831990090_0005, Tracking
URL = http://master:8088/proxy/application_1416831990090_0005/

Kill Command = /root/hadoop/bin/hadoop job
-kill job_1416831990090_0005

Hadoop job information for Stage-1: number of
mappers: 679; number of reducers: 183

2014-11-24 19:43:01,523 Stage-1 map = 0%,
reduce = 0%

2014-11-24 19:43:22,730 Stage-1 map = 53%,
reduce = 0%, Cumulative CPU 625.19 sec

2014-11-24 19:43:23,778 Stage-1 map = 100%,
reduce = 100%

MapReduce Total cumulative CPU time: 10 minutes
25 seconds 190 msec

Ended Job = job_1416831990090_0005 with errors

Error during job, obtaining debugging
information...

Examining task ID:
task_1416831990090_0005_m_05 (and more) from job
job_1416831990090_0005

Examining task ID:
task_1416831990090_0005_m_42 (and more) from job
job_1416831990090_0005

Examining task ID:
task_1416831990090_0005_m_35 (and more) from job
job_1416831990090_0005

Examining task ID:
task_1416831990090_0005_m_65 (and more) from job
job_1416831990090_0005

Examining task ID:
task_1416831990090_0005_m_02 (and more) from job
job_1416831990090_0005

Examining task ID:
task_1416831990090_0005_m_07 (and more) from job
job_1416831990090_0005

Examining task ID:
task_1416831990090_0005_m_58 (and more) from job
job_1416831990090_0005

Examining task ID:
task_1416831990090_0005_m_43 (and more) from job
job_1416831990090_0005


 Task with the most failures(4):

-

Task ID:

  task_1416831990090_0005_m_05


 URL:

 
http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005tipid=task_1416831990090_0005_m_05

-

Diagnostic Messages for this Task:

Container launch failed for
container_1416831990090_0005_01_000112 :
java.lang.IllegalArgumentException: java.net.UnknownHostException:
slave6

at
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)

at
org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)

at
org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)

at
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)

at
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:189)

at
org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)

at
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)

at
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)

at
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:745)

Caused by: java.net.UnknownHostException: slave6

... 12 more



 FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask

MapReduce Jobs Launched:

Job 0: Map: 679  Reduce: 183   Cumulative CPU:
625.19 sec   HDFS Read: 0 HDFS Write: 0 FAIL

Total MapReduce CPU Time Spent: 10 minutes 25
seconds 190 mse




Please help me to fix the issue.

Thanks
Amit


Re: Container launch failed Error

2014-11-24 Thread Amit Behera
Hi Daniel,


The stack trace is the same for other queries.
On different runs I get slave7, sometimes slave8...

Also, I registered all the machines' IPs in /etc/hosts.

Regards
Amit



On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv 
daniel.ha...@veracity-group.com wrote:

 It seems that the application master can't resolve slave6's name to an IP

 Daniel

 On 24 בנוב׳ 2014, at 18:49, Amit Behera amit.bd...@gmail.com wrote:

 Hi Users,

 *my cluster(1+8) configuration*:

 RAM  : 32 GB each
 HDFS : 1.5 TB SSD
 CPU   : 8 core each

 ---

 I am trying to query on 300GB of table but I am able to run only select
 query.

 Except select query , for all other query I am getting following exception.





 Total jobs = 1

 Stage-1 is selected by condition resolver.

 Launching Job 1 out of 1

 Number of reduce tasks not specified. Estimated
 from input data size: 183

 In order to change the average load for a
 reducer (in bytes):

   set
 hive.exec.reducers.bytes.per.reducer=number

 In order to limit the maximum number of
 reducers:

   set hive.exec.reducers.max=number

 In order to set a constant number of reducers:

   set mapreduce.job.reduces=number

 Starting Job = job_1416831990090_0005, Tracking
 URL = http://master:8088/proxy/application_1416831990090_0005/

 Kill Command = /root/hadoop/bin/hadoop job
 -kill job_1416831990090_0005

 Hadoop job information for Stage-1: number of
 mappers: 679; number of reducers: 183

 2014-11-24 19:43:01,523 Stage-1 map = 0%,
 reduce = 0%

 2014-11-24 19:43:22,730 Stage-1 map = 53%,
 reduce = 0%, Cumulative CPU 625.19 sec

 2014-11-24 19:43:23,778 Stage-1 map = 100%,
 reduce = 100%

 MapReduce Total cumulative CPU time: 10 minutes
 25 seconds 190 msec

 Ended Job = job_1416831990090_0005 with errors

 Error during job, obtaining debugging
 information...

 Examining task ID:
 task_1416831990090_0005_m_05 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_42 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_35 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_65 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_02 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_07 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_58 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_43 (and more) from job
 job_1416831990090_0005


  Task with the most failures(4):

 -

 Task ID:

   task_1416831990090_0005_m_05


  URL:

  
 http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005tipid=task_1416831990090_0005_m_05

 -

 Diagnostic Messages for this Task:

 Container launch failed for
 container_1416831990090_0005_01_000112 :
 java.lang.IllegalArgumentException: java.net.UnknownHostException:
 slave6

   at
 org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)

   at
 org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)

   at
 org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)

   at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)

   at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:189)

   at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)

   at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)

   at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)

   at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)

   at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

   at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

   at java.lang.Thread.run(Thread.java:745)

 Caused by: java.net.UnknownHostException: slave6

   ... 12 more



  FAILED: Execution Error, return code 2 from
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask

 MapReduce Jobs Launched:

 Job 0: Map: 679  Reduce: 183   Cumulative CPU:
 625.19 sec   HDFS Read: 0 HDFS Write: 0 FAIL

 Total MapReduce CPU Time Spent: 10 minutes 25
 seconds 190 mse




 Please help me to fix the issue.

 Thanks
 Amit




Re: Container launch failed Error

2014-11-24 Thread Amit Behera
I did not modify it on all the slaves, except slave6, slave7 and slave8.

Will that be a problem?

But for small data (tables up to 20 GB) it runs fine, and for the 300 GB table
only count(*) runs, sometimes succeeding and sometimes failing.

Thanks
Amit

On Mon, Nov 24, 2014 at 10:37 PM, Daniel Haviv 
daniel.ha...@veracity-group.com wrote:

 did you copy the hosts file to all the nodes?

 Daniel

 On 24 בנוב׳ 2014, at 19:04, Amit Behera amit.bd...@gmail.com wrote:

 hi Daniel,


 this stacktrace same for other query .
 for different run I am getting slave7 sometime slave8...

 And also I registered all machine IPs in /etc/hosts

 Regards
 Amit



 On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv 
 daniel.ha...@veracity-group.com wrote:

 It seems that the application master can't resolve slave6's name to an IP

 Daniel

 On 24 בנוב׳ 2014, at 18:49, Amit Behera amit.bd...@gmail.com wrote:

 Hi Users,

 *my cluster(1+8) configuration*:

 RAM  : 32 GB each
 HDFS : 1.5 TB SSD
 CPU   : 8 core each

 ---

 I am trying to query on 300GB of table but I am able to run only select
 query.

 Except select query , for all other query I am getting following
 exception.





 Total jobs = 1

 Stage-1 is selected by condition resolver.

 Launching Job 1 out of 1

 Number of reduce tasks not specified. Estimated
 from input data size: 183

 In order to change the average load for a
 reducer (in bytes):

   set
 hive.exec.reducers.bytes.per.reducer=number

 In order to limit the maximum number of
 reducers:

   set hive.exec.reducers.max=number

 In order to set a constant number of reducers:

   set mapreduce.job.reduces=number

 Starting Job = job_1416831990090_0005, Tracking
 URL = http://master:8088/proxy/application_1416831990090_0005/

 Kill Command = /root/hadoop/bin/hadoop job
 -kill job_1416831990090_0005

 Hadoop job information for Stage-1: number of
 mappers: 679; number of reducers: 183

 2014-11-24 19:43:01,523 Stage-1 map = 0%,
 reduce = 0%

 2014-11-24 19:43:22,730 Stage-1 map = 53%,
 reduce = 0%, Cumulative CPU 625.19 sec

 2014-11-24 19:43:23,778 Stage-1 map = 100%,
 reduce = 100%

 MapReduce Total cumulative CPU time: 10 minutes
 25 seconds 190 msec

 Ended Job = job_1416831990090_0005 with errors

 Error during job, obtaining debugging
 information...

 Examining task ID:
 task_1416831990090_0005_m_05 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_42 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_35 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_65 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_02 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_07 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_58 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_43 (and more) from job
 job_1416831990090_0005


  Task with the most failures(4):

 -

 Task ID:

   task_1416831990090_0005_m_05


  URL:

  
 http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005tipid=task_1416831990090_0005_m_05

 -

 Diagnostic Messages for this Task:

 Container launch failed for
 container_1416831990090_0005_01_000112 :
 java.lang.IllegalArgumentException: java.net.UnknownHostException:
 slave6

  at
 org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)

  at
 org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)

  at
 org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)

  at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)

  at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:189)

  at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)

  at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)

  at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)

  at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)

  at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

  at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

  at java.lang.Thread.run(Thread.java:745)

 Caused by: java.net.UnknownHostException: slave6

  ... 12 more



  FAILED: Execution Error, return code 2 from

Re: Container launch failed Error

2014-11-24 Thread Amit Behera
* except slave6, slave7, slave8

On Mon, Nov 24, 2014 at 10:56 PM, Amit Behera amit.bd...@gmail.com wrote:

 I did not modify in all the slaves. except slave

 will it be a problem ?

 But for small data (up to 20 GB table) it is running and for 300GB table
 only count(*) running sometimes and sometimes failed

 Thanks
 Amit

 On Mon, Nov 24, 2014 at 10:37 PM, Daniel Haviv 
 daniel.ha...@veracity-group.com wrote:

 did you copy the hosts file to all the nodes?

 Daniel

 On 24 בנוב׳ 2014, at 19:04, Amit Behera amit.bd...@gmail.com wrote:

 hi Daniel,


 this stacktrace same for other query .
 for different run I am getting slave7 sometime slave8...

 And also I registered all machine IPs in /etc/hosts

 Regards
 Amit



 On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv 
 daniel.ha...@veracity-group.com wrote:

 It seems that the application master can't resolve slave6's name to an IP

 Daniel

 On 24 בנוב׳ 2014, at 18:49, Amit Behera amit.bd...@gmail.com wrote:

 Hi Users,

 *my cluster(1+8) configuration*:

 RAM  : 32 GB each
 HDFS : 1.5 TB SSD
 CPU   : 8 core each

 ---

 I am trying to query on 300GB of table but I am able to run only select
 query.

 Except select query , for all other query I am getting following
 exception.





 Total jobs = 1

 Stage-1 is selected by condition resolver.

 Launching Job 1 out of 1

 Number of reduce tasks not specified. Estimated
 from input data size: 183

 In order to change the average load for a
 reducer (in bytes):

   set
 hive.exec.reducers.bytes.per.reducer=number

 In order to limit the maximum number of
 reducers:

   set hive.exec.reducers.max=number

 In order to set a constant number of reducers:

   set mapreduce.job.reduces=number

 Starting Job = job_1416831990090_0005, Tracking
 URL = http://master:8088/proxy/application_1416831990090_0005/

 Kill Command = /root/hadoop/bin/hadoop job
 -kill job_1416831990090_0005

 Hadoop job information for Stage-1: number of
 mappers: 679; number of reducers: 183

 2014-11-24 19:43:01,523 Stage-1 map = 0%,
 reduce = 0%

 2014-11-24 19:43:22,730 Stage-1 map = 53%,
 reduce = 0%, Cumulative CPU 625.19 sec

 2014-11-24 19:43:23,778 Stage-1 map = 100%,
 reduce = 100%

 MapReduce Total cumulative CPU time: 10 minutes
 25 seconds 190 msec

 Ended Job = job_1416831990090_0005 with errors

 Error during job, obtaining debugging
 information...

 Examining task ID:
 task_1416831990090_0005_m_05 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_42 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_35 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_65 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_02 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_07 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_58 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_43 (and more) from job
 job_1416831990090_0005


  Task with the most failures(4):

 -

 Task ID:

   task_1416831990090_0005_m_05


  URL:

  
 http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005tipid=task_1416831990090_0005_m_05

 -

 Diagnostic Messages for this Task:

 Container launch failed for
 container_1416831990090_0005_01_000112 :
 java.lang.IllegalArgumentException: java.net.UnknownHostException:
 slave6

 at
 org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)

 at
 org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)

 at
 org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)

 at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)

 at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:189)

 at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)

 at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)

 at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)

 at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369)

 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

 at java.lang.Thread.run(Thread.java:745)

 Caused by: java.net.UnknownHostException

Re: Container launch failed Error

2014-11-24 Thread Amit Behera
Hi Daniel,

Thanks a lot,


I will do that and rerun the query. :)
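
(For reference, "updating the hosts file" here just means that every node, master and
slaves alike, carries the same name-to-IP mappings; a sketch with made-up addresses:

  192.168.1.10  master
  192.168.1.16  slave6
  192.168.1.17  slave7
  192.168.1.18  slave8

copied into /etc/hosts on each machine, or better, served from DNS.)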

On Mon, Nov 24, 2014 at 10:59 PM, Daniel Haviv 
daniel.ha...@veracity-group.com wrote:

 It is a problem as the application master needs to contact the other nodes

 Try updating the hosts file on all the machines and try again.

 Daniel

 On 24 בנוב׳ 2014, at 19:26, Amit Behera amit.bd...@gmail.com wrote:

 I did not modify in all the slaves. except slave

 will it be a problem ?

 But for small data (up to 20 GB table) it is running and for 300GB table
 only count(*) running sometimes and sometimes failed

 Thanks
 Amit

 On Mon, Nov 24, 2014 at 10:37 PM, Daniel Haviv 
 daniel.ha...@veracity-group.com wrote:

 did you copy the hosts file to all the nodes?

 Daniel

 On 24 בנוב׳ 2014, at 19:04, Amit Behera amit.bd...@gmail.com wrote:

 hi Daniel,


 this stacktrace same for other query .
 for different run I am getting slave7 sometime slave8...

 And also I registered all machine IPs in /etc/hosts

 Regards
 Amit



 On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv 
 daniel.ha...@veracity-group.com wrote:

 It seems that the application master can't resolve slave6's name to an IP

 Daniel

 On 24 בנוב׳ 2014, at 18:49, Amit Behera amit.bd...@gmail.com wrote:

 Hi Users,

 *my cluster(1+8) configuration*:

 RAM  : 32 GB each
 HDFS : 1.5 TB SSD
 CPU   : 8 core each

 ---

 I am trying to query on 300GB of table but I am able to run only select
 query.

 Except select query , for all other query I am getting following
 exception.





 Total jobs = 1

 Stage-1 is selected by condition resolver.

 Launching Job 1 out of 1

 Number of reduce tasks not specified. Estimated
 from input data size: 183

 In order to change the average load for a
 reducer (in bytes):

   set
 hive.exec.reducers.bytes.per.reducer=number

 In order to limit the maximum number of
 reducers:

   set hive.exec.reducers.max=number

 In order to set a constant number of reducers:

   set mapreduce.job.reduces=number

 Starting Job = job_1416831990090_0005, Tracking
 URL = http://master:8088/proxy/application_1416831990090_0005/

 Kill Command = /root/hadoop/bin/hadoop job
 -kill job_1416831990090_0005

 Hadoop job information for Stage-1: number of
 mappers: 679; number of reducers: 183

 2014-11-24 19:43:01,523 Stage-1 map = 0%,
 reduce = 0%

 2014-11-24 19:43:22,730 Stage-1 map = 53%,
 reduce = 0%, Cumulative CPU 625.19 sec

 2014-11-24 19:43:23,778 Stage-1 map = 100%,
 reduce = 100%

 MapReduce Total cumulative CPU time: 10 minutes
 25 seconds 190 msec

 Ended Job = job_1416831990090_0005 with errors

 Error during job, obtaining debugging
 information...

 Examining task ID:
 task_1416831990090_0005_m_05 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_42 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_35 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_65 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_02 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_07 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_58 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_43 (and more) from job
 job_1416831990090_0005


  Task with the most failures(4):

 -

 Task ID:

   task_1416831990090_0005_m_05


  URL:

  
 http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005tipid=task_1416831990090_0005_m_05

 -

 Diagnostic Messages for this Task:

 Container launch failed for
 container_1416831990090_0005_01_000112 :
 java.lang.IllegalArgumentException: java.net.UnknownHostException:
 slave6

 at
 org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)

 at
 org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)

 at
 org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)

 at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)

 at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:189)

 at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:110)

 at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403)

 at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)

 at
 org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369

Re: Container launch failed Error

2014-11-24 Thread Amit Behera
Hi Daniel,

Thank you, it's running fine.

*Another question:*
 Could you please tell me what to do if I get a *Shuffle Error*?
I got this type of error once while running a join query on 300 GB of data
with 20 GB of data.


Thanks
Amit
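
(As a reference for the shuffle question: shuffle failures on joins of that size are
usually reducers running out of memory or dropping too many fetches. A sketch of
the knobs typically involved, with illustrative values only:

  set mapreduce.reduce.memory.mb=4096;
  set mapreduce.reduce.java.opts=-Xmx3276m;
  set mapreduce.reduce.shuffle.input.buffer.percent=0.5;
  set mapreduce.reduce.shuffle.parallelcopies=10;

Raising the reducer count via hive.exec.reducers.bytes.per.reducer also spreads the
shuffle load across more tasks.)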

On Mon, Nov 24, 2014 at 11:13 PM, Daniel Haviv 
daniel.ha...@veracity-group.com wrote:

 Good luck
 Share your results with us

 Daniel

 On 24 בנוב׳ 2014, at 19:36, Amit Behera amit.bd...@gmail.com wrote:

 Hi Daniel,

 Thanks a lot,


 I will do that and rerun the query. :)

 On Mon, Nov 24, 2014 at 10:59 PM, Daniel Haviv 
 daniel.ha...@veracity-group.com wrote:

 It is a problem as the application master needs to contact the other nodes

 Try updating the hosts file on all the machines and try again.

 Daniel

 On 24 בנוב׳ 2014, at 19:26, Amit Behera amit.bd...@gmail.com wrote:

 I did not modify in all the slaves. except slave

 will it be a problem ?

 But for small data (up to 20 GB table) it is running and for 300GB table
 only count(*) running sometimes and sometimes failed

 Thanks
 Amit

 On Mon, Nov 24, 2014 at 10:37 PM, Daniel Haviv 
 daniel.ha...@veracity-group.com wrote:

 did you copy the hosts file to all the nodes?

 Daniel

 On 24 בנוב׳ 2014, at 19:04, Amit Behera amit.bd...@gmail.com wrote:

 hi Daniel,


 this stacktrace same for other query .
 for different run I am getting slave7 sometime slave8...

 And also I registered all machine IPs in /etc/hosts

 Regards
 Amit



 On Mon, Nov 24, 2014 at 10:22 PM, Daniel Haviv 
 daniel.ha...@veracity-group.com wrote:

 It seems that the application master can't resolve slave6's name to an
 IP

 Daniel

 On 24 בנוב׳ 2014, at 18:49, Amit Behera amit.bd...@gmail.com wrote:

 Hi Users,

 *my cluster(1+8) configuration*:

 RAM  : 32 GB each
 HDFS : 1.5 TB SSD
 CPU   : 8 core each

 ---

 I am trying to query on 300GB of table but I am able to run only select
 query.

 Except select query , for all other query I am getting following
 exception.





 Total jobs = 1

 Stage-1 is selected by condition resolver.

 Launching Job 1 out of 1

 Number of reduce tasks not specified. Estimated
 from input data size: 183

 In order to change the average load for a
 reducer (in bytes):

   set
 hive.exec.reducers.bytes.per.reducer=number

 In order to limit the maximum number of
 reducers:

   set hive.exec.reducers.max=number

 In order to set a constant number of reducers:

   set mapreduce.job.reduces=number

 Starting Job = job_1416831990090_0005, Tracking
 URL = http://master:8088/proxy/application_1416831990090_0005/

 Kill Command = /root/hadoop/bin/hadoop job
 -kill job_1416831990090_0005

 Hadoop job information for Stage-1: number of
 mappers: 679; number of reducers: 183

 2014-11-24 19:43:01,523 Stage-1 map = 0%,
 reduce = 0%

 2014-11-24 19:43:22,730 Stage-1 map = 53%,
 reduce = 0%, Cumulative CPU 625.19 sec

 2014-11-24 19:43:23,778 Stage-1 map = 100%,
 reduce = 100%

 MapReduce Total cumulative CPU time: 10 minutes
 25 seconds 190 msec

 Ended Job = job_1416831990090_0005 with errors

 Error during job, obtaining debugging
 information...

 Examining task ID:
 task_1416831990090_0005_m_05 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_42 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_35 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_65 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_02 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_07 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_58 (and more) from job
 job_1416831990090_0005

 Examining task ID:
 task_1416831990090_0005_m_43 (and more) from job
 job_1416831990090_0005


  Task with the most failures(4):

 -

 Task ID:

   task_1416831990090_0005_m_05


  URL:

  
 http://master:8088/taskdetails.jsp?jobid=job_1416831990090_0005tipid=task_1416831990090_0005_m_05

 -

 Diagnostic Messages for this Task:

 Container launch failed for
 container_1416831990090_0005_01_000112 :
 java.lang.IllegalArgumentException: java.net.UnknownHostException:
 slave6

at
 org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)

at
 org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:397)

at
 org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:233)

at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:211)

at
 org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.init(ContainerManagementProtocolProxy.java:189

How to do single user multiple access in hive

2014-11-07 Thread Amit Behera
Hi users,

I have Hive set up on a multi-node Hadoop cluster.
I want to run multiple queries on top of a table from different machines.

So please help me understand how to achieve concurrent access to Hive, so that
multiple queries can run simultaneously.

Thanks
Amit
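
(One common way to get concurrent access from several machines, beyond the Hive CLI
and Oozie suggestions in the reply thread, is to run HiveServer2 and point every
client at it over JDBC; a sketch, with the gateway host name as a placeholder:

  hive --service hiveserver2 &
  beeline -u jdbc:hive2://hive-gateway:10000 -n amit

HiveServer2 handles many concurrent sessions, and Hue also talks to Hive through it.)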


Re: How to do single user multiple access in hive

2014-11-07 Thread Amit Behera
Hi Devopam,

Thank you for replying.

I am using Hue on top of Hive. Could you please explain how Oozie would help
me, and how I can integrate Oozie with this setup?

Thanks
Amit



On Fri, Nov 7, 2014 at 7:58 PM, Devopam Mittra devo...@gmail.com wrote:

 hi Amit,
 At the minimal end, please check whether the Hive CLI (client) installed on the
 'different' machines helps you achieve your goal.
 If you use another program such as Oozie (to submit your queries), then you can
 fire queries through the respective interfaces safely enough.

 regards
 Devopam


 On Fri, Nov 7, 2014 at 7:29 PM, Amit Behera amit.bd...@gmail.com wrote:

 Hi users,

 I have hive set up at multi node hadoop cluster.
 I want to run multiple queries on top of a table from different machines.

 So please help how to achieve multiple access on hive to run multiple
 queries simultaneously.

 Thanks
 Amit




 --
 Devopam Mittra
 Life and Relations are not binary





RE: PIG heart beat freeze using hue + cdh 5.1

2014-09-11 Thread Amit Dutta
Thanks for the link ... but I am still unable to find out how to resolve the
heartbeat issue ...

Date: Wed, 10 Sep 2014 09:52:19 -0400
Subject: Re: PIG heart beat freeze using hue + cdh 5.1
From: zenon...@gmail.com
To: user@hive.apache.org

Take a look at this link
http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/

Thanks
On Tue, Sep 9, 2014 at 8:53 PM, Amit Dutta amitkrdu...@outlook.com wrote:



Thanks a lot for your reply. I changed the following parameters from Cloudera
Manager:
mapred.tasktracker.map.tasks.maximum = 2 (it was 1 before)
mapred.tasktracker.reduce.tasks.maximum = 2 (it was 1 before)
Could you please mention which parameters I should change, and how to change them?
Regards, Amit
Subject: Re: PIG heart beat freeze using hue + cdh 5.1
From: zenon...@gmail.com
Date: Tue, 9 Sep 2014 20:34:19 -0400
To: user@hive.apache.org

It uses YARN now. You need to set your container resource memory and CPU, then set
the MapReduce physical memory and CPU cores; the number of mappers and reducers
is calculated based on the resources you give to your mappers and reducers.

Pengcheng
Sent from my iPhone
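
Concretely, the old tasktracker "slot" settings are ignored under YARN; what now
bounds the number of concurrent mappers and reducers per node is roughly the
node-manager memory divided by the per-task memory. The properties involved look
like this (values are only an example for a modest node, not a recommendation):

  yarn.nodemanager.resource.memory-mb = 8192
  yarn.nodemanager.resource.cpu-vcores = 4
  mapreduce.map.memory.mb = 1024
  mapreduce.reduce.memory.mb = 2048
  mapreduce.map.java.opts = -Xmx819m
  mapreduce.reduce.java.opts = -Xmx1638m

With the numbers above, memory allows about eight 1 GB map containers (or four
2 GB reduce containers) per node at a time.
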
On Sep 9, 2014, at 7:55 PM, Amit Dutta amitkrdu...@outlook.com wrote:




I think one of the issues is the number of MapReduce slots for the cluster... Can 
anyone please let me know how to increase the MapReduce slots?

From: amitkrdu...@outlook.com
To: user@hive.apache.org
Subject: PIG heart beat freeze using hue + cdh 5.1
Date: Tue, 9 Sep 2014 17:55:01 -0500




Hi, I have only 604 rows in the Hive table.
While using A = LOAD 'revenue' USING org.apache.hcatalog.pig.HCatLoader(); DUMP 
A; it starts printing Heart beat repeatedly and does not leave this state. Can 
someone please help? I am getting the following exception:
  2014-09-09 17:27:45,844 [JobControl] INFO  
org.apache.hadoop.mapreduce.JobSubmitter  - Kind: RM_DELEGATION_TOKEN, Service: 
10.215.204.182:8032, Ident: (owner=cloudera, renewer=oozie mr token, 
realUser=oozie, issueDate=1410301632571, maxDate=1410906432571, 
sequenceNumber=14, masterKeyId=2)
  2014-09-09 17:27:46,709 [JobControl] WARN  
org.apache.hadoop.mapreduce.v2.util.MRApps  - cache file 
(mapreduce.job.cache.files) 
hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/pig/commons-httpclient-3.1.jar
 conflicts with cache file (mapreduce.job.cache.files) 
hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/hcatalog/commons-httpclient-3.1.jar
 This will be an error in Hadoop 2.0
  2014-09-09 17:27:46,712 [JobControl] WARN  
org.apache.hadoop.mapreduce.v2.util.MRApps  - cache file 
(mapreduce.job.cache.files) 
hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/pig/commons-io-2.1.jar
 conflicts with cache file (mapreduce.job.cache.files) 
hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/hcatalog/commons-io-2.1.jar
 This will be an error in Hadoop 2.0
  2014-09-09 17:27:46,894 [JobControl] INFO  
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl  - Submitted application 
application_1410291186220_0006
  2014-09-09 17:27:46,968 [JobControl] INFO  org.apache.hadoop.mapreduce.Job  - 
The url to track the job: 
http://txwlcloud2:8088/proxy/application_1410291186220_0006/
  2014-09-09 17:27:46,969 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- HadoopJobId: job_1410291186220_0006
  2014-09-09 17:27:46,969 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- Processing aliases A
  2014-09-09 17:27:46,969 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- detailed locations: M: A[1,4] C:  R:
  2014-09-09 17:27:46,969 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- More information at: 
http://txwlcloud2:50030/jobdetails.jsp?jobid=job_1410291186220_0006
  2014-09-09 17:27:47,019 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- 0% complete
  Heart beat
  Heart beat
  Heart beat
  Heart beat
  Heart beat
  
  

  

PIG heart beat freeze using hue + cdh 5.1

2014-09-09 Thread Amit Dutta
Hi, I have only 604 rows in the Hive table.
While using A = LOAD 'revenue' USING org.apache.hcatalog.pig.HCatLoader(); DUMP 
A; it starts printing Heart beat repeatedly and does not leave this state. Can 
someone please help? I am getting the following exception:
  2014-09-09 17:27:45,844 [JobControl] INFO  
org.apache.hadoop.mapreduce.JobSubmitter  - Kind: RM_DELEGATION_TOKEN, Service: 
10.215.204.182:8032, Ident: (owner=cloudera, renewer=oozie mr token, 
realUser=oozie, issueDate=1410301632571, maxDate=1410906432571, 
sequenceNumber=14, masterKeyId=2)
  2014-09-09 17:27:46,709 [JobControl] WARN  
org.apache.hadoop.mapreduce.v2.util.MRApps  - cache file 
(mapreduce.job.cache.files) 
hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/pig/commons-httpclient-3.1.jar
 conflicts with cache file (mapreduce.job.cache.files) 
hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/hcatalog/commons-httpclient-3.1.jar
 This will be an error in Hadoop 2.0
  2014-09-09 17:27:46,712 [JobControl] WARN  
org.apache.hadoop.mapreduce.v2.util.MRApps  - cache file 
(mapreduce.job.cache.files) 
hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/pig/commons-io-2.1.jar
 conflicts with cache file (mapreduce.job.cache.files) 
hdfs://txwlcloud2:8020/user/oozie/share/lib/lib_20140820161455/hcatalog/commons-io-2.1.jar
 This will be an error in Hadoop 2.0
  2014-09-09 17:27:46,894 [JobControl] INFO  
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl  - Submitted application 
application_1410291186220_0006
  2014-09-09 17:27:46,968 [JobControl] INFO  org.apache.hadoop.mapreduce.Job  - 
The url to track the job: 
http://txwlcloud2:8088/proxy/application_1410291186220_0006/
  2014-09-09 17:27:46,969 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- HadoopJobId: job_1410291186220_0006
  2014-09-09 17:27:46,969 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- Processing aliases A
  2014-09-09 17:27:46,969 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- detailed locations: M: A[1,4] C:  R:
  2014-09-09 17:27:46,969 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- More information at: 
http://txwlcloud2:50030/jobdetails.jsp?jobid=job_1410291186220_0006
  2014-09-09 17:27:47,019 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher  
- 0% complete
  Heart beat
  Heart beat
  Heart beat
  Heart beat
  Heart beat  

Re: Pig jobs run forever with PigEditor in Hue

2014-09-09 Thread Amit Dutta
Hi,
Could anyone please let me know how to increase the MapReduce slots? I am 
getting an infinite heartbeat when I run a Pig script from Hue on Cloudera CDH 5.1.
Thanks, Amit

Increase mapreduce slots

2014-09-09 Thread Amit Dutta
Hi,
Could anyone please let me know how to increase the MapReduce slots? I am 
getting an infinite heartbeat when I run a Pig script from Hue on Cloudera CDH 5.1.
Thanks, Amit
  

Re: hbase importtsv

2014-05-01 Thread Amit Tewari
Make sure there are no row key clashes. HBase will overwrite a row if you 
upload data with the same row key. That's one reason you can end up with fewer 
rows than you uploaded.
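
A quick way to check that theory before re-importing (a sketch; the key column
position and file name below are placeholders) is to count how many distinct row
keys occur more than once straight from the CSV:

  cut -d',' -f1 MES_INPUT_TREE.csv | sort | uniq -d | wc -l

Also note the job counters below: 22,494 map input records with 3 bad lines, against
the roughly 200,000 rows claimed for the file, so it may be worth confirming the
whole CSV actually made it into HDFS before chasing key clashes.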
 

Sent from my mobile device, please excuse the typos

 On May 1, 2014, at 3:34 PM, Kennedy, Sean C. sean.kenn...@merck.com wrote:
 
 I ran the following command to import an excel.csv file into hbase. 
 Everything looked ok however when I ran a scan on the table in hbase I did 
 not see as many rows as were in excel.csv file.
  
 Any help appreciated….
  
  
  
 /hd/hadoop/bin/hadoop jar /hbase/hbase-0.94.15/hbase-0.94.15.jar importtsv 
 '-Dimporttsv.separator=,' 
 -Dimporttsv.columns=HBASE_ROW_KEY,ROOT,NODE,VALUE,X_PATH,IMG,NODE_URL,LFLAG,SORT_ORDER,SITE
  V_MES_INPUT_TREE /ma/segwhdfs/hpp/hbase/MES/csv/MES_INPUT_TREE
  
  
 The csv file had over 200,000 rows, however my hbase scan returned only 3500 
 or so rows.  
  
 Output from scan ‘MES_INPUT_TREE’
  
 3855 row(s) in 5.6090 seconds
  
  
 Output from job:
  
 4/05/01 17:58:53 INFO mapred.JobClient: Job complete: job_201405011721_0001
 14/05/01 17:58:53 INFO mapred.JobClient: Counters: 20
 14/05/01 17:58:53 INFO mapred.JobClient:   Job Counters
 14/05/01 17:58:53 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=1208423
 14/05/01 17:58:53 INFO mapred.JobClient: Total time spent by all reduces 
 waiting after reserving slots (ms)=0
 14/05/01 17:58:53 INFO mapred.JobClient: Total time spent by all maps 
 waiting after reserving slots (ms)=0
 14/05/01 17:58:53 INFO mapred.JobClient: Rack-local map tasks=1
 14/05/01 17:58:53 INFO mapred.JobClient: Launched map tasks=4
 14/05/01 17:58:53 INFO mapred.JobClient: Data-local map tasks=3
 14/05/01 17:58:53 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=1427
 14/05/01 17:58:53 INFO mapred.JobClient:   ImportTsv
 14/05/01 17:58:53 INFO mapred.JobClient: Bad Lines=3
 14/05/01 17:58:53 INFO mapred.JobClient:   File Output Format Counters
 14/05/01 17:58:53 INFO mapred.JobClient: Bytes Written=0
 14/05/01 17:58:53 INFO mapred.JobClient:   FileSystemCounters
 14/05/01 17:58:53 INFO mapred.JobClient: HDFS_BYTES_READ=5243015
 14/05/01 17:58:53 INFO mapred.JobClient: FILE_BYTES_WRITTEN=80374
 14/05/01 17:58:53 INFO mapred.JobClient:   File Input Format Counters
 14/05/01 17:58:53 INFO mapred.JobClient: Bytes Read=5242880
 14/05/01 17:58:53 INFO mapred.JobClient:   Map-Reduce Framework
 14/05/01 17:58:53 INFO mapred.JobClient: Map input records=22494
 14/05/01 17:58:53 INFO mapred.JobClient: Physical memory (bytes) 
 snapshot=112275456
 14/05/01 17:58:53 INFO mapred.JobClient: Spilled Records=0
 14/05/01 17:58:53 INFO mapred.JobClient: CPU time spent (ms)=2430
 14/05/01 17:58:53 INFO mapred.JobClient: Total committed heap usage 
 (bytes)=145752064
 14/05/01 17:58:53 INFO mapred.JobClient: Virtual memory (bytes) 
 snapshot=769548288
 14/05/01 17:58:53 INFO mapred.JobClient: Map output records=22491
 14/05/01 17:58:53 INFO mapred.JobClient: SPLIT_RAW_BYTES=135


Re: Error using ORC Format with Hive

2014-04-05 Thread Amit Tewari
Thanks for the reply. I did solve the protobuf issue by upgrading to 2.5, but then 
Hive 0.12 also started showing the same issue as 0.13 and 0.14.

I was working through the CLI.

It turns out the issue was due to the space available (or rather, not available) to 
the data node. Let me elaborate for others on the list.

I had about 2 GB available on the partition where the data node directory was 
configured (the name node and data node storage were on the same directory tree 
but in different directories, of course). I inserted kv1.txt (a few KB) into table #1 
(stored as textfile) and then tried to insert into table #2 select * from table #1. 
Table #2 was stored as ORC. It was difficult for me to guess that the converted ORC 
data would be too big to fit in 2 GB, especially when the data node logs did not 
have any error, nor was there a reserve configured for HDFS. I still don't know 
why it needs so much space; however, I could reproduce the error simply by 
pushing a 300 MB file to HDFS with hdfs dfs -put, thus realizing that it's a space 
issue. I migrated the datanode to a bigger partition and everything is fine now.
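
For anyone else chasing the same symptom, the quickest checks (the path here is a
placeholder for whatever dfs.datanode.data.dir points at) are:

  hdfs dfsadmin -report        # look at "DFS Remaining" per datanode
  df -h /data/dfs/dn           # free space on the partition backing the datanode

A datanode with no usable space tends to surface exactly like this: the write fails
with "could only be replicated to 0 nodes" while the datanode logs stay quiet.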

On a separate note, I am not seeing any significant query time improvement from 
pushing data into ORC. About 25%, yes, but nowhere close to the multiples I was 
hoping for. I changed the stripe size to 4 MB, tried creating an index every 10k rows, 
inserted 6 million rows, and ran many different types of queries. Any ideas, people, 
what I might be missing?

Amit 

Sent from my mobile device, please excuse the typos
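
For the archives, the knobs mentioned above map onto ORC table properties plus
session settings for predicate pushdown and vectorization; a sketch only, with a
made-up column list and illustrative values (note that a 4 MB stripe is unusually
small, the default is in the tens to hundreds of MB depending on version):

  CREATE TABLE events_orc (foo INT, bar STRING)
  STORED AS ORC
  TBLPROPERTIES ("orc.compress"="ZLIB",
                 "orc.stripe.size"="67108864",
                 "orc.row.index.stride"="10000");

  set hive.optimize.index.filter=true;
  set hive.vectorized.execution.enabled=true;

ORC tends to pay off most when queries read a few columns and filter on them;
full scans that touch every column benefit least.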

 On Apr 4, 2014, at 8:21 PM, Bryan Jeffrey bryan.jeff...@gmail.com wrote:
 
 Amit,
 
 Are you executing your select for conversion to orc via beeline, or hive cli? 
 From looking at your logs, it appears that you do not have permissions in 
 hdfs to write the resultant orc data. Check permissions in hdfs to ensure 
 that your user has write permissions to write to hive warehouse.
 
 I forwarded you a previous thread regarding hive 12 protobuf issues.
 
 Regards,
 
 Bryan Jeffrey
 
 On Apr 4, 2014 8:14 PM, Amit Tewari amittew...@gmail.com wrote:
  I checked out and built Hive 0.13. Tried it, with the same results, i.e. 
 eRpcServer.addBlock(NameNodeRpcServer.java:555)
 at File 
 /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.00_3
  could only be replicated to 0 nodes instead of minReplication (=1).  There 
 are 1 datanode(s) running and no node(s) are excluded in this operation.
 
 
 
  I also tried it with the release version of Hive 0.12, and that gave me a 
  different error, related to a protobuf incompatibility (pasted below).
  
  So at this point I can't run even the basic use case with ORC storage.
 
 Any pointers would be very helpful.
 
 Amit
 
 Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
 at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
 
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
 Caused by: java.lang.UnsupportedOperationException: This is supposed to be 
 overridden by subclasses.
 at 
 com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
 at 
 org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
 at 
 com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
 at 
 com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
 at 
 org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
 at 
 com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
 at 
 com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
 at 
 org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
 at 
 com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
 at 
 org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
 at 
 org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
 at 
 org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
 at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
 at 
 org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
 at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181

Error using ORC format

2014-04-04 Thread Amit Tewari

Hi All,

I am just trying to do some simple tests to see the speedup in Hive queries 
with Hive 0.14 (trunk version as of this morning). I just tried the sample 
test case to start with; first I wanted to see how much I can speed things up 
using the ORC format.


However, for some reason I can't insert data into the table with the ORC 
format. It fails with the exception "File filename could only be 
replicated to 0 nodes instead of minReplication (=1).  There are 1 
datanode(s) running and no node(s) are excluded in this operation".


I can, however, insert data into a text table without any issue.

I have included the step below.

Any pointers would be appreciated.

Amit



I have a single node setup with minimal settings. JPS output is as follows
$ jps
9823 NameNode
12172 JobHistoryServer
9903 DataNode
14895 Jps
11796 ResourceManager
12034 NodeManager
*Running Hadoop 2.2.0 with YARN.*



Step 1

CREATE TABLE pokes (foo INT, bar STRING);

Step 2

LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE 
pokes;


Step 3
CREATE TABLE pokes_1 (foo INT, bar STRING);

Step 4

Insert into table pokes_1 select * from pokes;

Step 5.

CREATE TABLE pokes_orc (foo INT, bar STRING) stored as orc;

Step 6.

insert into pokes_orc select * from pokes; __FAILED__ with Exception 
below 


eRpcServer.addBlock(NameNodeRpcServer.java:555)
at File 
/tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.00_3 
could only be replicated to 0 nodes instead of minReplication (=1). 
There are 1 datanode(s) running and no node(s) are excluded in this 
operation.
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodorg.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:168)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:843)

at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:577)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)

... 8 more


Step 7

Insert overwrite table pokes_1 select * from pokes; Success




Error using ORC Format with Hive

2014-04-04 Thread Amit Tewari

Hi All,

I am just trying to do some simple tests to see speedup in hive query 
with Hive 0.14 (trunk version this morning). Just tried to use sample 
test case to start with. First wanted to see how much I can speed up 
using ORC format.


However, for some reason I can't insert data into the table with ORC 
format. It fails with the exception "File <filename> could only be 
replicated to 0 nodes instead of minReplication (=1). There are 1 
datanode(s) running and no node(s) are excluded in this operation"


I can however run inserting data into text table without any issue.

I have included the step below.

Any pointers would be appreciated.

Amit



I have a single node setup with minimal settings. JPS output is as follows
$ jps
9823 NameNode
12172 JobHistoryServer
9903 DataNode
14895 Jps
11796 ResourceManager
12034 NodeManager
*Running Hadoop 0.2.2 with Yarn.*



Step 1

CREATE TABLE pokes (foo INT, bar STRING);

Step 2

LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE 
pokes;


Step 3
CREATE TABLE pokes_1 (foo INT, bar STRING);

Step 4

Insert into table pokes_1 select * from pokes;

Step 5.

CREATE TABLE pokes_orc (foo INT, bar STRING) stored as orc;

Step 6.

insert into pokes_orc select * from pokes; __FAILED__ with Exception 
below 


File /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.00_3 
could only be replicated to 0 nodes instead of minReplication (=1). 
There are 1 datanode(s) running and no node(s) are excluded in this 
operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)

at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)

at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:168)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:843)

at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:577)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)

... 8 more


Step 7

Insert overwrite table pokes_1 select * from pokes; Success



Re: Error using ORC Format with Hive

2014-04-04 Thread Amit Tewari

I checked out and built hive 0.13. Tried with the same results, i.e.:
File /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.00_3 
could only be replicated to 0 nodes instead of minReplication (=1). 
There are 1 datanode(s) running and no node(s) are excluded in this 
operation.
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)




I also tried it with the release version of hive 0.12 and that gave me a 
different error, related to a protobuf incompatibility (pasted below).


So at this point I can't run even the basic use case with ORC storage..

Any pointers would be very helpful.

Amit

Error: java.lang.RuntimeException: Hive Runtime Error while closing 
operators
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)

at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)

at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.UnsupportedOperationException: This is supposed to 
be overridden by subclasses.
at 
com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
at 
com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
at 
com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
at 
com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
at 
com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
at 
org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
at 
com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
at 
org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
at 
org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)

at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)


Amit

On 4/4/14 2:28 PM, Amit Tewari wrote:

Hi All,

I am just trying to do some simple tests to see speedup in hive query 
with Hive 0.14 (trunk version this morning). Just tried to use sample 
test case to start with. First wanted to see how much I can speed up 
using ORC format.


However for some reason I can't insert data into the table with ORC 
format. It fails with Exception File filename could only be 
replicated to 0 nodes instead of minReplication (=1). There are 1 
datanode(s) running and no node(s) are excluded in this operation


I can however run inserting data into text table without any issue.

I have included the step below.

Any pointers would be appreciated.

Amit



I have a single node setup with minimal settings. JPS output is as 
follows

$ jps
9823 NameNode
12172 JobHistoryServer
9903 DataNode
14895 Jps
11796 ResourceManager
12034 NodeManager
*Running Hadoop 0.2.2 with Yarn.*



Step1

CREATE TABLE pokes (foo INT, bar STRING);

Step 2

LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE 
pokes;


Step 3
CREATE TABLE pokes_1 (foo INT, bar STRING)

Step 4

Insert into table pokes_1 select * from pokes;

Step 5.

CREATE TABLE pokes_orc (foo INT, bar STRING) stored as orc;

Step 6.

insert into pokes_orc select * from pokes; __FAILED__ with Exception 
below 


File /tmp/hive-hduser/hive_2014-04-04_20-34-43_550_7470522328893486504-1/_task_tmp.-ext-10002/_tmp.00_3 
could only be replicated to 0 nodes instead of minReplication (=1).  
There are 1 datanode(s) running and no node(s) are excluded in this 
operation.
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)

Re: Reply: hive 0.11 auto convert join bug report

2013-09-12 Thread Amit Sharma
Hi Navis,

I was trying to look at this email thread as well as the jira to understand
the scope of this issue. Does this get triggered only in cases of using
aliases which end up mapping to the same value upon hashing? Or can this be
triggered under other conditions as well? What if the aliases are not used
and the table names somehow map to similar hashcode values?

Also is changing the alias the only workaround for this problem or is there
any other workaround possible?

Thanks,
Amit
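
A possible alternative workaround (an untested assumption on my part, not something
confirmed in this thread) is to switch off automatic map-join conversion for the
affected query, so the conversion code path in MapJoinProcessor is never taken:

set hive.auto.convert.join = false;
set hive.auto.convert.join.noconditionaltask = false;
-- then run the original query unchanged; it should fall back to a common (shuffle) join

The obvious cost is losing the map-join speedup for that one query.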


On Sun, Aug 11, 2013 at 9:22 PM, Navis류승우 navis@nexr.com wrote:

 Hi,

 Hive is notorious for producing different results with different aliases.
 Changing the alias was a last-resort way to avoid the bug in a desperate situation.

 I think the patch in the issue is ready, wish it's helpful.

 Thanks.

 2013/8/11  wzc1...@gmail.com:
  Hi Navis,
 
  My colleague chenchun finds that hashcode of 'deal' and 'dim_pay_date'
 are
  the same and the code in MapJoinProcessor.java ignores the order of
  rowschema.
  I look at your patch and it's exactly the same place we are working on.
  Thanks for your patch.
 
  On Sunday, August 11, 2013, at 9:38 PM, Navis류승우 wrote:
 
  Hi,
 
  I've booked this on https://issues.apache.org/jira/browse/HIVE-5056
  and attached patch for it.
 
  It needs full test for confirmation but you can try it.
 
  Thanks.
 
  2013/8/11 wzc1...@gmail.com:
 
  Hi all:
  when I change the table alias dim_pay_date to A, the query passes in hive
  0.11(
 https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_change_alias_pass
 ):
 
  use test;
  create table if not exists src ( `key` int,`val` string);
  load data local inpath '/Users/code6/git/hive/data/files/kv1.txt'
 overwrite
  into table src;
  drop table if exists orderpayment_small;
  create table orderpayment_small (`dealid` int,`date` string,`time`
 string,
  `cityid` int, `userid` int);
  insert overwrite table orderpayment_small select 748, '2011-03-24',
  '2011-03-24', 55 ,5372613 from src limit 1;
  drop table if exists user_small;
  create table user_small( userid int);
  insert overwrite table user_small select key from src limit 100;
  set hive.auto.convert.join.noconditionaltask.size = 200;
  SELECT
  `A`.`date`
  , `deal`.`dealid`
  FROM `orderpayment_small` `orderpayment`
  JOIN `orderpayment_small` `A` ON `A`.`date` = `orderpayment`.`date`
  JOIN `orderpayment_small` `deal` ON `deal`.`dealid` =
  `orderpayment`.`dealid`
  JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` =
  `orderpayment`.`cityid`
  JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
  limit 5;
 
 
  It's quite strange and interesting now. I will keep searching for the
 answer
  to this issue.
 
 
 
  On Friday, August 9, 2013, at 3:32 AM, wzc1...@gmail.com wrote:
 
  Hi all:
  I'm currently testing hive11 and encounter one bug with
  hive.auto.convert.join, I construct a testcase so everyone can reproduce
  it(or you can reach the testcase
  here:
 https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_bug):
 
  use test;
  create table src ( `key` int,`val` string);
  load data local inpath '/Users/code6/git/hive/data/files/kv1.txt'
 overwrite
  into table src;
  drop table if exists orderpayment_small;
  create table orderpayment_small (`dealid` int,`date` string,`time`
 string,
  `cityid` int, `userid` int);
  insert overwrite table orderpayment_small select 748, '2011-03-24',
  '2011-03-24', 55 ,5372613 from src limit 1;
  drop table if exists user_small;
  create table user_small( userid int);
  insert overwrite table user_small select key from src limit 100;
  set hive.auto.convert.join.noconditionaltask.size = 200;
  SELECT
  `dim_pay_date`.`date`
  , `deal`.`dealid`
  FROM `orderpayment_small` `orderpayment`
  JOIN `orderpayment_small` `dim_pay_date` ON `dim_pay_date`.`date` =
  `orderpayment`.`date`
  JOIN `orderpayment_small` `deal` ON `deal`.`dealid` =
  `orderpayment`.`dealid`
  JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` =
  `orderpayment`.`cityid`
  JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
  limit 5;
 
 
  You should replace the path of kv1.txt by yourself. You can run the above
  query in hive 0.11 and it will fail with ArrayIndexOutOfBoundsException,
 You
  can see the explain result and the console output of the query here :
  https://gist.github.com/code6/6187569
 
  I compile the trunk code but it doesn't work with this query. I can run
 this
  query in hive 0.9 with hive.auto.convert.join turns on.
 
  I try to dig into this problem and I think it may be caused by the map
 join
  optimization. Some adjacent operators aren't match for the input/output
  tableinfo(column positions diff).
 
  I'm not able to fix this bug and I would appreciate it if someone would
 like
  to look into this problem.
 
  Thanks.
 
 



Re: Hive 0.9.0 with hadoop 0.20.2 (fair scheduler mode)

2012-09-27 Thread Amit Sangroya
On Thu, Sep 27, 2012 at 10:56 AM, Amit Sangroya sangroyaa...@gmail.com wrote:

 Hello everyone,

 I am experiencing that Hive v-0.9.0 works with hadoop 0.20.0 only in
 default scheduling mode. But when I try to use the Fair scheduler using
 this configuration, I see that map reduce jobs do not progress and the hive log
 shows a table not found exception. I am using a MySql database.

 This is very strange behavior. I tried to figure out whether there is any issue in
 hive, or whether there is anything to configure in hadoop.

 I also tried few other combinations:

1. For me, Hive v-0.9.0 works with hadoop 0.20.0 in default scheduling
 mode.
2. I can also observe that Hive v-0.7.0 works with hadoop 0.20.0 even
with Fair scheduler and default scheduler.
3. Hive v-0.9.0 works with hadoop 1.0.0 in both default  and fair
scheduling mode.


 Did anyone try to run Hive v-0.9.0 with hadoop 0.20.0 with the
 fair scheduler? Is there any extra setting/parameter for this?


 Thanks in advance,
 Amit



Re: Question on bucketed map join

2012-03-26 Thread Amit Sharma
Hi Bejoy,
  I am joining two tables which are both bucketed 64 ways and i want to do
a bucketed map join on them. I set the flag set hive.optimize.bucketmapjoin
= true;. The auto.convert.join is always false on our cluster. When i run
the following query:

select /*+ MAPJOIN(b) */ a.visitor_id FROM
amit_merchinteraction a join amit_dse_test_cell_allocation_f b
ON a.visitor_id == b.account_id
where a.country_id = 1 and a.dateint <= 20120322 and a.dateint >= 20120315
;

Hive sequentially creates a hash map using the contents of the mapjoin
table b, on the client, one at a time. Is that expected behaviour? Should
it not create these hash maps on the corresponding mappers in parallel?

Thanks,
Amit
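
For reference, a minimal sketch of a setup that should qualify for the bucketed map
join (table and column names here are hypothetical, and it assumes a release where
hive.enforce.bucketing is available so inserts actually produce the bucket files):

-- both tables bucketed the same number of ways on the join key
create table visits (visitor_id bigint, dateint int, country_id int)
clustered by (visitor_id) into 64 buckets;
create table accounts (account_id bigint, cell int)
clustered by (account_id) into 64 buckets;

set hive.enforce.bucketing = true;       -- so inserts write 64 bucket files per table
set hive.optimize.bucketmapjoin = true;  -- allow the bucketed map join at query time

select /*+ MAPJOIN(b) */ a.visitor_id
from visits a join accounts b on a.visitor_id = b.account_id;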

On Thu, Jan 19, 2012 at 9:22 AM, Bejoy Ks bejoy...@yahoo.com wrote:

 Hi Avrila
AFAIK the bucketed map join is not the default in hive and it happens
 only when the value is set to true. It could be because the same value is
 already set in the hive configuration xml file. To cross confirm the same
 could you explicitly set this to false
 (set hive.optimize.bucketmapjoin = false;)and get the query execution
 plan from explain command.

 Please some pointers in line

 1. Should I see sth different in the explain extended output if I set and
 unset the hive.optimize.bucketmapjoin option?
 [Bejoy] you should be seeing the same
 Try EXPLAIN your join query after setting this
 set hive.optimize.bucketmapjoin = false;

 2. Should I see something different in the output of hive while running
 the query if again I set and unset the hive.optimize.bucketmapjoin?
 [Bejoy] No,Hive output should be the same. What ever is the execution plan
 for an join, optimally the end result should be same.

 3. Is it possible that even though I set bucketmapjoin to true, Hive will
 still perform a normal map-side join for some reason? How can I check if
 this has actually happened?
 [Bejoy] Hive would perform a plain map side join only if the following
 parameter is enabled. (default it is disabled)
 set hive.auto.convert.join = true; you need to check this value in your
 configurations.
 If it is enabled irrespective of the table size hive would always try a
 map join, it would come to a normal join only after the map join attempt
 fails.
 AFAIK, if the number of buckets is the same or a multiple between the two
 tables involved in a join, and if the join is on the same columns that are
 bucketed, then with bucketmapjoin enabled it shouldn't execute a plain mapside
 join; a bucketed map side join would be triggered instead.

 Hope it helps!..

 Regards
 Bejoy.K.S

   --
 *From:* Avrilia Floratou flora...@cs.wisc.edu
 *To:* user@hive.apache.org
 *Sent:* Thursday, January 19, 2012 9:23 PM
 *Subject:* Question on bucketed map join

 Hi,

 I have two tables with 8 buckets each on the same key and want to join
 them.
 I ran explain extended and get the plan produced by HIVE which shows
 that a map-side join is a possible plan.

 I then set in my script the hive.optimize.bucketmapjoin option to true and
 reran the explain extended query. I get the exact same plans as output.

 I ran the query with and without the bucketmapjoin optimization and saw no
 difference in the running time.

 I have the following questions:

 1. Should I see sth different in the explain extended output if I set and
 unset the hive.optimize.bucketmapjoin option?

 2. Should I see something different in the output of hive while running
 the query if again I set and unset the hive.optimize.bucketmapjoin?

 3. Is it possible that even though I set bucketmapjoin to true, Hive will
 still perform a normal map-side join for some reason? How can I check if
 this has actually happened?

 Thanks,
 Avrilia




Re: Hive Security

2012-01-31 Thread Amit Sharma
Toad uses JDBC only while connecting as a direct Hive Connection from
Eclipse. In most other cases where a Hub is involved it uses Thrift.

On the other hand in the current release of Toad for Cloud there is no way
to specify what hive user you want to connect as and hence it always
connects as the User with which the hive server is running, and connects to
the default database.

What version of Toad for Cloud are you using?

Thanks,
Amit

On Tue, Jan 31, 2012 at 10:59 AM, Sriram Krishnan skrish...@netflix.com wrote:

  I was under the impression that Toad uses JDBC – and AFAIK there is no
 way to authenticate users via JDBC using the HiveServer.

  FYI - https://issues.apache.org/jira/browse/HIVE-2539.

  BTW if anyone has a solution to this, I would be very interested to know
 as well.

  Sriram

   From: Shantian Purkad shantian_pur...@yahoo.com
 Reply-To: user@hive.apache.org, Shantian Purkad 
 shantian_pur...@yahoo.com
 Date: Tue, 31 Jan 2012 10:20:04 -0800
 To: user@hive.apache.org user@hive.apache.org
 Subject: Hive Security

 Hi,

  We are running Hive server and connecting to it through Toad. Everything
 works fine.

  Now we want to enable authentication on the hive server. The hive server
 indicates that we can specify a user id and password while connecting to it.

  Can someone please guide on how can we create and set users on hive
 server?


  Regards,
 Shantian



Re: Hive Custom UDF - hive.aux.jars.path not working

2011-08-26 Thread Amit Sharma
Do you know any way in which this can be done in Hive Server?

Amit

On Tue, Aug 23, 2011 at 11:21 AM, Chinna Rao Lalam 72745 
chinna...@huawei.com wrote:


 Hi Amit,

  Pls check this issue HIVE-1405, it will help u. This issue targets the same
 scenario.

 Thanks
 Chinna Rao Lalam

  Hi Chinna,
That worked, Thanks a lot. So once the jar is picked up, is
  there a way
  to create a temporary function, that is retained even if i quit the
  interactive shell and start it again?  Or do i have to use the create
  command to register the function everytime?
 
  Thanks.
  Amit
 
  On Mon, Aug 22, 2011 at 10:00 PM, Chinna chinna...@huawei.com wrote:
 
Hi,
  
 
  
 U need to mention the jar like this,
  
  
   ~/Documents/workspace/Hive_0_7_1/build/dist/conf$grep aux hive-
  site.xml
  
 
  <property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist/{URJARNAME}.jar</value></property>
  
   
  
  
U r using CLI mode  so after changing the value if u start
  shell that is
   ok...and in another mode also we can start hive that is
  hiveserver this case
   after changing the value u need to restart the hive server
  
  
   Thanks,
  
   Chinna Rao Lalam
--
  
   *From:* Amit Sharma [mailto:amitsharma1...@gmail.com]
   *Sent:* Tuesday, August 23, 2011 3:35 AM
  
   *To:* user@hive.apache.org
   *Subject:* Re: Hive Custom UDF - hive.aux.jars.path not
  working
  
   Hi Vaibhav,
 Excuse my ignorance as im a little new to Hive. What do you
  mean by
   restart the Hive Server? I am using the Hive Interactive shell
  for my work.
   So i start the shell after modifying the config variable. Which
  server do i
   need to restart?
  
   Amit
  
   On Mon, Aug 22, 2011 at 2:49 PM, Aggarwal, Vaibhav
  vagg...@amazon.com wrote:
  
   Did you restart the hive server after modifying the hive-
  site.xml settings?
   
  
   I think you need to restart the server to pick up the latest
  settings in
   the config file.
  

  
   Thanks
  
   Vaibhav
  

  
   *From:* Amit Sharma [mailto:amitsharma1...@gmail.com]
   *Sent:* Monday, August 22, 2011 2:42 PM
   *To:* user@hive.apache.org
   *Subject:* Hive Custom UDF - hive.aux.jars.path not working
  

  
   Hi,
 I build custom UDFS for hive and they seem to work fine when i
  explicitly register the jars using the add jar jarname
  command or put in in the
   environment variable HIVE_AUX_JARS_PATH. But if i add it as a
   configuration variable in the hive-site.xml file and try to
  register the
   function using create temporary function functionname as
  'funciton' , it
   cannot find the jar. Any idea whats going on here?
  
   Here is the snippet from hive-site.xml:
  
   ~/Documents/workspace/Hive_0_7_1/build/dist/conf$grep aux hive-
  site.xml
  
 
  <property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist</value></property>
   Amit
  
  
 



Re: Hive Custom UDF - hive.aux.jars.path not working

2011-08-23 Thread Amit Sharma
Hi Chinna,
   That worked, Thanks a lot. So once the jar is picked up, is there a way
to create a temporary function, that is retained even if i quit the
interactive shell and start it again?  Or do i have to use the create
command to register the function everytime?

Thanks.
Amit
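
One way to avoid re-registering the function in every session (an assumption based on
the Hive CLI's ~/.hiverc startup file, not something confirmed in this thread) is to
put the statements there; the CLI runs that file each time the interactive shell
starts. The jar path and class name below are placeholders:

-- contents of ~/.hiverc
add jar /path/to/my-udfs.jar;
create temporary function my_udf as 'com.example.hive.udf.MyUdf';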

On Mon, Aug 22, 2011 at 10:00 PM, Chinna chinna...@huawei.com wrote:

  Hi,

   

   U need to mention the jar like this,


 ~/Documents/workspace/Hive_0_7_1/build/dist/conf$grep aux hive-site.xml

 <property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist/{URJARNAME}.jar</value></property>

 


  U r using CLI mode, so after changing the value u can just start the shell and that is
 ok. Hive can also be started in another mode, that is hiveserver; in that case,
 after changing the value u need to restart the hive server.


 Thanks,

 Chinna Rao Lalam
  --

 *From:* Amit Sharma [mailto:amitsharma1...@gmail.com]
 *Sent:* Tuesday, August 23, 2011 3:35 AM

 *To:* user@hive.apache.org
 *Subject:* Re: Hive Custom UDF - hive.aux.jars.path not working


 Hi Vaibhav,
   Excuse my ignorance as im a little new to Hive. What do you mean by
 restart the Hive Server? I am using the Hive Interactive shell for my work.
 So i start the shell after modifying the config variable. Which server do i
 need to restart?

 Amit

 On Mon, Aug 22, 2011 at 2:49 PM, Aggarwal, Vaibhav vagg...@amazon.com
 wrote:

 Did you restart the hive server after modifying the hive-site.xml settings?
 

 I think you need to restart the server to pick up the latest settings in
 the config file.

  

 Thanks

 Vaibhav

  

 *From:* Amit Sharma [mailto:amitsharma1...@gmail.com]
 *Sent:* Monday, August 22, 2011 2:42 PM
 *To:* user@hive.apache.org
 *Subject:* Hive Custom UDF - hive.aux.jars.path not working

  

 Hi,
   I build custom UDFS for hive and they seem to work fine when i explicitly
 register the jars using the add jar jarname command or put in in the
 environment variable HIVE_AUX_JARS_PATH. But if i add it as a
 configuration variable in the hive-site.xml file and try to register the
 function using create temporary function functionname as 'funciton' , it
 cannot find the jar. Any idea whats going on here?

 Here is the snippet from hive-site.xml:

 ~/Documents/workspace/Hive_0_7_1/build/dist/conf$grep aux hive-site.xml

 <property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist</value></property>

 Amit




Hive Custom UDF - hive.aux.jars.path not working

2011-08-22 Thread Amit Sharma
Hi,
  I built custom UDFs for hive and they seem to work fine when I explicitly
register the jars using the add jar <jarname> command or put the path in the
environment variable HIVE_AUX_JARS_PATH. But if I add it as a
configuration variable in the hive-site.xml file and try to register the
function using the create temporary function <functionname> as '<class>' command, it
cannot find the jar. Any idea what's going on here?

Here is the snippet from hive-site.xml:

~/Documents/workspace/Hive_0_7_1/build/dist/conf$grep aux hive-site.xml
<property><name>hive.aux.jars.path</name><value>/Users/amsharma/dev/Perforce/development/depot/dataeng/hive/dist</value></property>

Amit


Re: Hive/hbase integration - Rebuild the Storage Handler

2011-03-22 Thread amit jaiswal
Hi,

I am also trying the same but don't know the exact build steps. Could someone 
please share them?

-regards
Amit




From: Jean-Charles Thomas jctho...@autoscout24.com
To: Hive mailing list user@hive.apache.org
Sent: Tue, 22 March, 2011 11:40:18 AM
Subject: Hive/hbase integration - Rebuild the Storage Handler

  
Hi,
 
I am using hbase 0.90 and Hive 0.7 and would like to try the hive/hbase 
integration. From the Wiki Doc I could see that I have to rebuild the the 
handler:
 
“If you are not using hbase-0.89.0, you will need to rebuild the handler with 
the HBase jar matching your version, and change the --auxpath above accordingly”
 
Can someone explain in more details how this can be done? I unfortunately only 
have a basic java knowledge.
 
Thanks in advance,
 
JC 

Dynamic Configuration support in Hive SQL

2011-03-21 Thread amit jaiswal
Hi,

Does hive support dynamic configuration? For example: is it possible to write a 
hive script with some ${PARAM} variables and let hive replace these parameters 
with their values at runtime.

Eg.
Original hive script:
select * from person where age > ${MIN_AGE};

Config file:
MIN_AGE=18

And hive replaces the MIN_AGE parameter automatically.

-amit
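
For what it's worth, a sketch of how this is commonly handled with Hive's variable
substitution (assuming a release that supports -hiveconf substitution; the script
name, table, and variable below are only illustrative):

-- query.hql
select * from person where age > ${hiveconf:MIN_AGE};

-- invoked as:  hive -hiveconf MIN_AGE=18 -f query.hql

Hive replaces ${hiveconf:MIN_AGE} with the value supplied on the command line before
the query is compiled.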