ISSUE with Filter Hbase Table using SingleColumnValueFilter

2013-11-27 Thread samir das mohapatra
Dear developer

I am looking for a solution where I can apply the SingleColumnValueFilter so
that it selects only rows whose column value exactly matches the value I pass
in, and nothing else.


  Example:

SingleColumnValueFilter colValFilter = new SingleColumnValueFilter(
        Bytes.toBytes("cf1"), Bytes.toBytes("code"),
        CompareFilter.CompareOp.EQUAL,
        new SubstringComparator("SAMIR_AL_START"));
colValFilter.setFilterIfMissing(false);
filters.add(colValFilter);

Note: I want only the exact value "SAMIR_AL_START", not values like "XYZ_AL_START";
I mean I want an exact match, not a LIKE/substring match.

   Right now the scan is returning both "SAMIR_AL_START" and "XYZ_AL_START".
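[Editor's note] A minimal sketch of one way to get an exact match with the standard HBase client API: replace the SubstringComparator with a BinaryComparator, which compares the full cell value byte for byte. The filter name exactFilter is just illustrative.

SingleColumnValueFilter exactFilter = new SingleColumnValueFilter(
        Bytes.toBytes("cf1"), Bytes.toBytes("code"),
        CompareFilter.CompareOp.EQUAL,
        new BinaryComparator(Bytes.toBytes("SAMIR_AL_START")));
// classes come from org.apache.hadoop.hbase.filter
// set to true if rows missing cf1:code should also be skipped (optional)
exactFilter.setFilterIfMissing(true);
filters.add(exactFilter);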


Regards,
samir.


Region Server based filter using SingleColumnValueFilter is not working in CDH4.2.1 but working on CDH4.1.2

2013-11-27 Thread samir das mohapatra
Dear Hadoop/Hbase Developer,

  I was trying to scan an HBase table by applying a SingleColumnValueFilter.
It works fine on CDH4.1.2, but the same code does not work on another dev
cluster running CDH4.2.1. Is there an issue with the version difference, or is
it a code-level issue?

  I am sharing the code I wrote at the driver level for the HBase MapReduce
scan.

CODE

List<Filter> filters = new ArrayList<Filter>();

SingleColumnValueFilter colValFilter = new SingleColumnValueFilter(
        Bytes.toBytes("cf1"), Bytes.toBytes("code"),
        CompareFilter.CompareOp.EQUAL,
        new SubstringComparator("SAMIR_AL_START "));


colValFilter.setFilterIfMissing(false);
filters.add(colValFilter);


FilterList fl = new FilterList( FilterList.Operator.MUST_PASS_ALL,
filters);


Scan scan = new Scan();
scan.setFilter(fl);
scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("sequence_id"));
scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("session_id"));
scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("timestamp"));
scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("userguid"));
scan.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("code"));



Note: when I run this code, it does not return only the "SAMIR_AL_START " value
as output; it also produces other values from the column family.

For example, I want to filter only the records from HBase whose value under
family 'cf1' and qualifier 'code' is "SAMIR_AL_START". It works as I want on
CDH4.1.2, but when I run the same code on the other cluster (CDH4.2.1) it does
not give the right output.
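[Editor's note] For reference, a minimal sketch of how such a filtered scan is usually wired into an HBase MapReduce driver; the table name "mytable" and the MyDriver/MyMapper classes are placeholders, not taken from the original mail.

Scan scan = new Scan();
scan.setFilter(fl);
scan.setCaching(500);          // scanner caching for MapReduce scans
scan.setCacheBlocks(false);    // recommended for MapReduce scans

Job job = new Job(conf, "filtered-scan");
job.setJarByClass(MyDriver.class);
TableMapReduceUtil.initTableMapperJob(
        "mytable",             // input table (placeholder)
        scan,
        MyMapper.class,        // a TableMapper subclass (placeholder)
        Text.class,
        Result.class,
        job);
job.setNumReduceTasks(0);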



If anyone has already worked with this type of filter in HBase using Java,
could you please help me with this problem?





Thanks,
samir.


Did anyone work with Hbase mapreduce with multiple table as input ?

2013-11-17 Thread samir das mohapatra
Dear hadoop/hbase developer

Did anyone work with HBase MapReduce using multiple tables as input?
   Any link or example will help me a lot.
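[Editor's note] A minimal sketch using the multi-table scan support added in later HBase 0.94 releases (HBASE-3996); the table names and mapper class are placeholders, and older HBase versions may not ship this initTableMapperJob overload.

List<Scan> scans = new ArrayList<Scan>();
for (String table : new String[] { "table1", "table2" }) {   // placeholder names
    Scan scan = new Scan();
    // tell MultiTableInputFormat which table this scan targets
    scan.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, Bytes.toBytes(table));
    scans.add(scan);
}
TableMapReduceUtil.initTableMapperJob(
        scans,
        MyMultiTableMapper.class,   // extends TableMapper (placeholder)
        Text.class,
        Result.class,
        job);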

  Thanks in advance.

Thanks,
samir.


Facing issue using Sqoop2

2013-10-08 Thread samir das mohapatra
Dear All

   I am getting an error like the one below; has anyone else hit this with Sqoop2?

Error:

sqoop:000> set server --host hostname1  --port 8050 --webapp sqoop
Server is set successfully
sqoop:000> show server -all
Server host: hostname1
Server port: 8050
Server webapp: sqoop
sqoop:000> show version --all
client version:
  Sqoop 1.99.2-cdh4.4.0 revision
  Compiled by jenkins on Tue Sep  3 20:15:11 PDT 2013
Exception has occurred during processing command
Exception: com.sun.jersey.api.client.ClientHandlerException Message:
java.net.ConnectException: Connection refused
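[Editor's note] A hedged observation: the Sqoop2 server's REST endpoint normally listens on the Tomcat HTTP port, which defaults to 12000, so pointing the client at port 8050 will usually produce "Connection refused" unless the server was explicitly reconfigured. Something like the following is worth trying (hostname1 as in the original mail):

sqoop:000> set server --host hostname1 --port 12000 --webapp sqoop
sqoop:000> show version --all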

Regards,
samir.


Re: Getting error from sqoop2 command

2013-10-08 Thread samir das mohapatra
Dear Sqoop user/dev

   I am facing the issue given below. Do you have any idea why I am getting
this error and what the problem could be?

sqoop:000> show connector --all
Exception has occurred during processing command
Exception: com.sun.jersey.api.client.UniformInterfaceException Message:
GET http://localhost:12000/sqoop/v1/connector/all returned a response
status of 404 Not Found

After that I set the server and that error went away, but now I am getting
connection refused:
sqoop:000> set server --host localhost --port 8050 --webapp sqoop
Server is set successfully

sqoop:000> show connector --all
Exception has occurred during processing command
Exception: com.sun.jersey.api.client.ClientHandlerException Message:
java.net.ConnectException: Connection refused



Regards,
samir.


On Tue, Oct 8, 2013 at 12:16 PM, samir das mohapatra <
samir.help...@gmail.com> wrote:

> Dear Sqoop user/dev
>
>    I am facing the issue given below. Do you have any idea why I am
> getting this error and what the problem could be?
>
> sqoop:000> show connector --all
> Exception has occurred during processing command
> Exception: com.sun.jersey.api.client.UniformInterfaceException Message:
> GET http://localhost:12000/sqoop/v1/connector/all returned a response
> status of 404 Not Found
>
> After that I set the server and that error went away, but now I am
> getting connection refused:
>
> sqoop:000> set server --host localhost --port 8050 --webapp sqoop
> Server is set successfully
>
> sqoop:000> show connector --all
> Exception has occurred during processing command
> Exception: com.sun.jersey.api.client.ClientHandlerException Message:
> java.net.ConnectException: Connection refused
>
>
>
> Regards,
> samir.
>


how to use Sqoop command without Hardcoded password while using sqoop command

2013-10-03 Thread samir das mohapatra
Dear Hadoop/Sqoop users


Is there any way to call a Sqoop command without hard-coding the password for
the specific RDBMS? If we hard-code the password it becomes a big security
issue.
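[Editor's note] Two Sqoop 1 options avoid a hard-coded password on the command line; -P is long-standing, while --password-file only exists in newer Sqoop 1.x releases, so treat it as version-dependent. The connection string, table and path below are placeholders.

sqoop import --connect jdbc:mysql://dbhost/db --table mytable -P
sqoop import --connect jdbc:mysql://dbhost/db --table mytable \
    --password-file /user/hadoop/.sqoop.password

With -P, Sqoop prompts for the password on the console instead of reading it from the command line or a script.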


Regards,
samir.


How to ignore empty file comming out from hive map side join

2013-09-13 Thread samir das mohapatra
Dear Hive/Hadoop Developer

   I was running a Hive map-side join, and along with the output data I could
see some empty files produced in the map stage. Why does that happen, and how
can I avoid or ignore these files?
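[Editor's note] A hedged note: each map task writes its own output file even when it emits no rows, which is usually where the empty files come from. Hive's small-file merge settings can clean them up at the end of a map-only job; the property names below are from Hive 0.x-era releases and the values are illustrative.

SET hive.merge.mapfiles=true;               -- merge small/empty files of map-only jobs
SET hive.merge.smallfiles.avgsize=16000000; -- threshold that triggers the merge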

Regards,
samir.


While Inserting data into hive Why I colud not able to query ?

2013-07-16 Thread samir das mohapatra
Dear All,
  Did anyone face this issue:
   while loading a huge dataset into a Hive table, Hive restricts me from
querying the same table.

  I have set hive.support.concurrency=true, but it still shows:

conflicting lock present for TABLENAME mode SHARED


<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
  <description>Whether hive supports concurrency or not. A zookeeper
instance must be up and running for the default hive lock manager to
support read-write locks.</description>
</property>


If that is the case, then how do I solve the issue? Is there any row-level
lock?
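[Editor's note] As a hedged aside, when concurrency is enabled the current lock holders can be inspected from the Hive CLI, which helps identify which session holds the conflicting SHARED or EXCLUSIVE lock (TABLENAME is a placeholder):

SHOW LOCKS TABLENAME;
SHOW LOCKS TABLENAME EXTENDED;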

Regards


Error While Processing SequeceFile with Lzo Compressed in hive External table (CDH4.3)

2013-06-19 Thread samir das mohapatra
Dear All,

 Has anyone faced this type of issue?

  I am getting an error while processing an LZO-compressed sequence file from
a Hive query on the CDH4.3.x distribution.

Error Logs:

SET hive.exec.compress.output=true;

SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;



-rw-r--r--   3 myuser supergroup  25172 2013-06-19 21:25
/user/myDir/00_0  -- Lzo Compressed sequence file



-rw-r--r--   3 myuser   supergroup  71007 2013-06-19 21:42
/user/myDir/00_0   -- Normal sequence file



1.   The problem: if I create an external table on top of the directory to
read the data, it gives me an error: Failed with exception
java.io.IOException: java.io.EOFException: Premature EOF from inputStream

Table Creation:

CREATE EXTERNAL TABLE IF NOT EXISTS MyTable
(
  userip  string,
  usertid string
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001' ESCAPED BY '\020'
  COLLECTION ITEMS TERMINATED BY '\002'
  MAP KEYS TERMINATED BY '\003'
  LINES TERMINATED BY '\012'
STORED AS SEQUENCEFILE
LOCATION '/path/to/file';


After that, while querying the table I get the error:

Failed with exception java.io.IOException: java.io.EOFException: Premature
EOF from inputStream

Why is that?
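[Editor's note] A hedged pointer rather than a confirmed fix: Premature EOF errors on LZO data are commonly caused by the LZO codec classes not being registered on the cluster that reads the files, so it is worth checking that core-site.xml on the CDH4.3 cluster lists them (the class names below are the standard hadoop-lzo ones):

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>

If the codecs are already present, the next thing to rule out is a truncated or partially written file on HDFS.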
Regards,
samir

How to give a .gz file as input to the MAP

2013-06-11 Thread samir das mohapatra
Hi All,
Did anyone work out how to pass a .gz file as input to a MapReduce job?
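[Editor's note] A minimal sketch: Hadoop decompresses gzip input transparently when the file extension is .gz and GzipCodec is configured (it is by default), so the driver just adds the file as a normal input path. Note that a .gz file is not splittable, so it is processed by a single mapper. Paths and classes below are placeholders.

Job job = new Job(conf, "gz-input");
job.setJarByClass(MyDriver.class);
job.setMapperClass(MyMapper.class);
job.setNumReduceTasks(0);
FileInputFormat.addInputPath(job, new Path("/input/data.gz"));   // gzip handled transparently
FileOutputFormat.setOutputPath(job, new Path("/output/gz-job"));
job.waitForCompletion(true);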

Regards,
samir.


Re: copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread samir das mohapatra
Do you have any link or example? Could you please send me one?
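[Editor's note] A minimal sketch of why the copy fails and one way around it, assuming the job uses the default FileOutputCommitter: inside cleanup() the map output still sits in the task attempt's temporary work directory and has not yet been promoted to the final output path, which explains the file-not-found error. The local target directory is a placeholder, and because the record writer may not be fully flushed until after cleanup() returns, overriding the OutputCommitter as suggested below is the more robust route.

@Override
protected void cleanup(Context context) throws IOException, InterruptedException {
    // the attempt's work dir, e.g. <output>/_temporary/_attempt_xxx
    Path workPath = FileOutputFormat.getWorkOutputPath(context);
    FileSystem fs = workPath.getFileSystem(context.getConfiguration());
    for (FileStatus status : fs.listStatus(workPath)) {
        fs.copyToLocalFile(status.getPath(),
                new Path("/tmp/map-output/" + status.getPath().getName()));
    }
}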


On Tue, Jun 4, 2013 at 2:53 AM, Shahab Yunus  wrote:

> Have you taken a look into extending the FileOutputFormat class and
> overriding the OutputCommitter API functionality?
>
> Regards,
> Shahab
>
>
> On Mon, Jun 3, 2013 at 5:11 PM, samir das mohapatra <
> samir.help...@gmail.com> wrote:
>
>> Dear All,
>>
>>  Is there any way to copy the intermediate output file of the mapper
>> into local  folder  after each map task complete.
>>
>>Right now I am using
>>
>>FileSystem.copyToLocalFile(hdfsLocation,localLocation);
>>  indiste the cleanup of mapper task , but it is failing .
>>
>> Exception file not found.
>>
>> But if I am giving same statement after the job complete in driver class
>> ,it is working fine. that i dont want.
>>
>> protected void cleanup(Context context){
>>FileSystem.copyToLocalFile(hdfsLocation,localLocation);//failed
>> }
>>
>> Note: I need to copy the inter mediate output of the mapper to local file
>> system just after  each map task complete. I dont want any reducer.
>>
>>If this is not the right solution then how to solve this type of
>> scenario.
>>
>> Any help.
>>
>> regards,
>> samir.
>>
>
>


copyToLocal Failed inside the cleanup(.........) of Map task

2013-06-03 Thread samir das mohapatra
Dear All,

 Is there any way to copy the intermediate output file of the mapper into a
local folder after each map task completes?

   Right now I am using

   FileSystem.copyToLocalFile(hdfsLocation, localLocation);
 inside the cleanup() of the mapper task, but it is failing with a
file-not-found exception.

But if I issue the same statement after the job completes, in the driver
class, it works fine; that is not what I want.

protected void cleanup(Context context){
   FileSystem.copyToLocalFile(hdfsLocation,localLocation);//failed
}

Note: I need to copy the intermediate output of the mapper to the local file
system just after each map task completes. I don't want any reducer.

   If this is not the right approach, how should this type of scenario be
solved?

Any help.

regards,
samir.


How to get the intermediate mapper output file name

2013-06-03 Thread samir das mohapatra
Hi all,
   How can I get the mapper output file name inside the mapper?

  or

How can I change the mapper output file name?
 By default it looks like part-m-00000, part-m-00001, etc.
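[Editor's note] A minimal sketch of both parts, assuming the new (org.apache.hadoop.mapreduce) API; the MultipleOutputs approach and the base name "mydata" are illustrative choices, not the only way.

// derive the default output name from the task id inside the mapper
int partition = context.getTaskAttemptID().getTaskID().getId();
String defaultName = String.format("part-m-%05d", partition);

// to control the name, declare a named output in the driver ...
MultipleOutputs.addNamedOutput(job, "mydata", TextOutputFormat.class,
        Text.class, Text.class);
// ... create mos = new MultipleOutputs<Text, Text>(context) in setup(),
// then write through it in the mapper, which produces files such as
// mydata-m-00000 instead of part-m-00000
mos.write("mydata", key, value);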

Regards,
samir.


Re: Pulling data from secured hadoop cluster to another hadoop cluster

2013-05-28 Thread samir das mohapatra
It is not a Hadoop security issue; the security is at the host level, I mean
at the network level.

 I cannot ping the source system because it is designed so that you can only
connect to it through ssh.
If this is the case, how do I overcome the problem?

What extra parameter do I need to add at the ssh level so that I can reach the
machine? All the servers are in the same domain.




On Tue, May 28, 2013 at 7:35 PM, Shahab Yunus wrote:

> Also Samir, when you say 'secured', by any chance that cluster is secured
> with Kerberos (rather than ssh)?
>
> -Shahab
>
>
> On Tue, May 28, 2013 at 8:29 AM, Nitin Pawar wrote:
>
>> hadoop daemons do not use ssh to communicate.
>>
>> if  your distcp job could not connect to remote server then either the
>> connection was rejected by the target namenode or the it was not able to
>> establish the network connection.
>>
>> were you able to see the hdfs on server1 from server2?
>>
>>
>> On Tue, May 28, 2013 at 5:17 PM, samir das mohapatra <
>> samir.help...@gmail.com> wrote:
>>
>>> Hi All,
>>>   I could able to connect the hadoop (source ) cluster after ssh is
>>> established.
>>>
>>> But i wanted to know, If I want to pull some data using distcp from
>>> source secured hadoop box to another hadoop cluster , I could not able to
>>> ping  name node machine.  In this approach how to run distcp command from
>>> target cluster in with secured connection.
>>>
>>> Source: hadoop.server1 (ssh secured)
>>> Target:  hadoop.server2 (runing distcp here)
>>>
>>>
>>> running command:
>>>
>>> distcp hftp://hadoop.server1:50070/dataSet
>>> hdfs://hadoop.server2:54310/targetDataSet
>>>
>>>
>>> Regards,
>>> samir.
>>>
>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>


Pulling data from secured hadoop cluster to another hadoop cluster

2013-05-28 Thread samir das mohapatra
Hi All,
  I am able to connect to the (source) Hadoop cluster once ssh is established.

But I wanted to know: if I want to pull some data using distcp from the
ssh-secured source Hadoop box to another Hadoop cluster, I cannot ping the
name node machine. In this setup, how do I run the distcp command from the
target cluster over a secured connection?

Source: hadoop.server1 (ssh secured)
Target:  hadoop.server2 (running distcp here)


running command:

distcp hftp://hadoop.server1:50070/dataSet
hdfs://hadoop.server2:54310/targetDataSet


Regards,
samir.


Issue with data Copy from CDH3 to CDH4

2013-05-24 Thread samir das mohapatra
Hi all,

We tried to pull data from an upstream cluster running CDH3 to a downstream
system running CDH4, using distcp to copy the data; it was throwing an
exception because of the version difference.

 I wanted to know whether there is any solution to pull the data from CDH3 to
CDH4 without doing it manually.

What is another approach to solve this problem? (The data is about 10 PB.)
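[Editor's note] A hedged sketch of the usual approach for copying between incompatible HDFS versions: run distcp on the destination (CDH4) cluster and read the CDH3 source over the version-independent hftp interface. Host names and paths below are placeholders, and the NameNode HTTP/RPC ports may differ on your clusters.

hadoop distcp hftp://cdh3-namenode:50070/source/path hdfs://cdh4-namenode:8020/target/path

Because hftp is read-only, the job has to run on the CDH4 side; for something like 10 PB it is worth splitting the copy into several directory-level distcp jobs.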
 Regards,
samir.


Re: Secondary Name Node Issue CDH4.1.2

2013-04-01 Thread samir das mohapatra
Some more information:
  I am using CDH4.1.2 with MRv1.
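[Editor's note] A hedged sketch of moving the Secondary NameNode to its own machine with CDH4 HDFS; the host name is a placeholder and the property names follow the Hadoop 2.x naming used by CDH4. Set the SNN's HTTP address in hdfs-site.xml on both the NameNode and the SNN host, list the host in the masters file if you start daemons with the start-dfs.sh scripts, and then start the secondarynamenode daemon on that machine.

<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>snn-host:50090</value>
</property>

The SNN also needs dfs.namenode.checkpoint.dir configured locally and network access to the NameNode's HTTP port.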




On Mon, Apr 1, 2013 at 11:40 AM, samir.helpdoc wrote:

> Hi All,
>   Could you please share me , how to start/configure SSN in Different
> Physical Machine. Currently it is start on name node machine I want to
> start SSN in Different machine.
>
> Regards,
> samir.
>


Re: how to copy a table from one hbase cluster to another cluster?

2013-03-20 Thread samir das mohapatra
Yes, I was just thinking the same thing.
Many thanks.


On Wed, Mar 20, 2013 at 8:55 PM, Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:

> Hi Samir,
>
> Have you looked at the link I sent you?
>
> You have a command line for that, you have an example, and if you need
> to do it in Java, you san simply open the
> org.apache.hadoop.hbase.mapreduce.CopyTable, look into it, and do
> almost the same thing for your needs?
>
> JM
>
> 2013/3/20 samir das mohapatra :
> > Thanks, for reply
> >
> > I need to copy the hbase table into another cluster through the java
> code.
> > Any example will help to  me
> >
> >
> >
> > On Wed, Mar 20, 2013 at 8:48 PM, Jean-Marc Spaggiari
> >  wrote:
> >>
> >> Hi Samir,
> >>
> >> Is this what you are looking for?
> >>
> >> http://hbase.apache.org/book/ops_mgt.html#copytable
> >>
> >> What kind of help do you need?
> >>
> >> JM
> >>
> >> 2013/3/20 samir das mohapatra :
> >> > Hi All,
> >> > Can you help me to copy one hbase table  to another cluster hbase
> >> > (Table
> >> > copy) .
> >> >
> >> > Regards,
> >> > samir
> >
> >
>


Re: how to copy a table from one hbase cluster to another cluster?

2013-03-20 Thread samir das mohapatra
Thanks for the reply.

I need to copy the HBase table into another cluster through Java code.
Any example would help me.
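[Editor's note] A minimal client-side sketch, assuming the 0.9x-era HBase Java API; the ZooKeeper quorums and table name are placeholders. For large tables the MapReduce-based CopyTable utility linked below is the better fit, but this shows the basic idea of scanning one cluster and putting into another.

Configuration srcConf = HBaseConfiguration.create();
srcConf.set("hbase.zookeeper.quorum", "src-zk1,src-zk2,src-zk3");
Configuration dstConf = HBaseConfiguration.create();
dstConf.set("hbase.zookeeper.quorum", "dst-zk1,dst-zk2,dst-zk3");

HTable src = new HTable(srcConf, "myTable");
HTable dst = new HTable(dstConf, "myTable");   // table must already exist on the target
try {
    ResultScanner scanner = src.getScanner(new Scan());
    for (Result row : scanner) {
        Put put = new Put(row.getRow());
        for (KeyValue kv : row.raw()) {
            put.add(kv);                        // copy every cell as-is
        }
        dst.put(put);
    }
    scanner.close();
} finally {
    src.close();
    dst.close();
}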



On Wed, Mar 20, 2013 at 8:48 PM, Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:

> Hi Samir,
>
> Is this what you are looking for?
>
> http://hbase.apache.org/book/ops_mgt.html#copytable
>
> What kind of help do you need?
>
> JM
>
> 2013/3/20 samir das mohapatra :
> > Hi All,
> > Can you help me to copy one hbase table  to another cluster hbase
> (Table
> > copy) .
> >
> > Regards,
> > samir
>


how to copy a table from one hbase cluster to another cluster?

2013-03-20 Thread samir das mohapatra
Hi All,
Can you help me copy one HBase table to another cluster's HBase (table
copy)?

Regards,
samir


Re: Is there any way to get information from Hbase once some record get updated?

2013-03-15 Thread samir das mohapatra
Thanks, but I wanted to know about a table-specific trigger. I mean, out of
all the tables in the database, I want the trigger event only for one specific
table. Is that possible?
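[Editor's note] A hedged sketch of a per-table trigger using a RegionObserver coprocessor; the postPut signature below follows the 0.94-era API and changed in later HBase versions, and the jar path and class name are placeholders. Because the coprocessor is attached to one table's descriptor, it fires only for that table.

public class CodeTableObserver extends BaseRegionObserver {
    @Override
    public void postPut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                        Put put, WALEdit edit, boolean writeToWAL) {
        // react to every successful Put on this table, e.g. notify a queue
    }
}

Attach it to just one table from the HBase shell:

hbase> disable 'mytable'
hbase> alter 'mytable', METHOD => 'table_att',
      'coprocessor' => 'hdfs:///user/hbase/observer.jar|com.example.CodeTableObserver|1001|'
hbase> enable 'mytable'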


On Fri, Mar 15, 2013 at 11:08 AM, Ted  wrote:

> Take a look at BaseRegionObserver class where you would see various hooks.
>
> Cheers
>
> On Mar 14, 2013, at 10:24 PM, samir das mohapatra 
> wrote:
>
> > Hi All,
> >
> >   Is there any way to get information  from Hbase once some record get
> updated? , Like the Database Trigger.
> >
> > Regards,
> > samir.
>


Is there any way to get information from Hbase once some record get updated?

2013-03-14 Thread samir das mohapatra
Hi All,

  Is there any way to get information  from Hbase once some record get
updated? , Like the Database Trigger.

Regards,
samir.


Re: How to pull Delta data from one cluster to another cluster ?

2013-03-14 Thread samir das mohapatra
Will Sqoop support inter-cluster data copy after a filter?

Scenario:
   1) cluster-1, cluster-2
   2) taking data from cluster-1 to cluster-2 based on a filter condition
   Will Sqoop support that?



On Thu, Mar 14, 2013 at 4:19 PM, Tariq  wrote:

> You can do that through Pig.
>
> samir das mohapatra  wrote:
>>
>> how to pull delta data that means filter data not whole data as off now i
>> know we can do whole data through the distcp, colud you plese help if i am
>> wrong or any other way to pull efficiently.
>>
>>
>> like : get data based on filter condition.
>>
>>
>>
>>
>> On Thu, Mar 14, 2013 at 3:43 PM, Mohammad Tariq wrote:
>>
>>> Use distcp.
>>>
>>> Warm Regards,
>>> Tariq
>>> https://mtariq.jux.com/
>>> cloudfront.blogspot.com
>>>
>>>
>>> On Thu, Mar 14, 2013 at 3:40 PM, samir das mohapatra <
>>> samir.help...@gmail.com> wrote:
>>>
>>>> Regards,
>>>> samir.
>>>>
>>>
>>>
>>
> --
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
>


Re: How to pull Delta data from one cluster to another cluster ?

2013-03-14 Thread samir das mohapatra
How do I pull delta data, that is, filtered data rather than the whole
dataset? As of now I know we can copy whole data through distcp; could you
please correct me if I am wrong, or suggest any other way to pull it
efficiently?

like: get data based on a filter condition.




On Thu, Mar 14, 2013 at 3:43 PM, Mohammad Tariq  wrote:

> Use distcp.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Thu, Mar 14, 2013 at 3:40 PM, samir das mohapatra <
> samir.help...@gmail.com> wrote:
>
>> Regards,
>> samir.
>>
>
>


Re: How to shuffle (Key,Value) pair from mapper to multiple reducer

2013-03-13 Thread samir das mohapatra
You can use a custom Partitioner for that, together with emitting the pair once per target reducer (see the sketch below).
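[Editor's note] A minimal sketch of that idea, assuming Text keys; the '#'-suffix convention is illustrative. The mapper replicates each record once per reducer with a tagged key, and the custom partitioner routes on the tag so every reducer receives a copy.

// in the mapper: emit one tagged copy per reducer
int numReducers = context.getNumReduceTasks();
for (int r = 0; r < numReducers; r++) {
    context.write(new Text(key.toString() + "#" + r), value);
}

// custom partitioner: route on the tag
public class TaggedKeyPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numPartitions) {
        String k = key.toString();
        String tag = k.substring(k.lastIndexOf('#') + 1);
        return Integer.parseInt(tag) % numPartitions;
    }
}
// in the driver: job.setPartitionerClass(TaggedKeyPartitioner.class);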



Regards,

Samir.


On Wed, Mar 13, 2013 at 2:29 PM, Vikas Jadhav wrote:

>
> Hi
> I am specifying requirement again with example.
>
>
>
> I have use case where i need to shufffle same (key,value) pair to multiple
> reducers
>
>
> For Example  we have pair  (1,"ABC") and two reducers (reducer0 and
> reducer1) are there then
>
> by default this pair will go to reduce1 (cause  (key % numOfReducer) =
> (1%2) )
>
>
> how i should shuffle this pair to both reducer.
>
> Also I willing to change the code of hadoop framework if Necessory.
>
>  Thank you
>
> On Wed, Mar 13, 2013 at 12:51 PM, feng lu  wrote:
>
>> Hi
>>
>> you can use Job#setNumReduceTasks(int tasks) method to set the number of
>> reducer to output.
>>
>>
>> On Wed, Mar 13, 2013 at 2:15 PM, Vikas Jadhav 
>> wrote:
>>
>>> Hello,
>>>
>>> As by default Hadoop framework can shuffle (key,value) pair to only one
>>> reducer
>>>
>>> I have use case where i need to shufffle same (key,value) pair to
>>> multiple reducers
>>>
>>> Also I  willing to change the code of hadoop framework if Necessory.
>>>
>>>
>>> Thank you
>>>
>>> --
>>> Thanx and Regards
>>> Vikas Jadhav
>>>
>>
>>
>>
>> --
>> Don't Grow Old, Grow Up... :-)
>>
>
>
>
> --
> Thanx and Regards
> Vikas Jadhav
>


Re: How can I record some position of context in Reduce()?

2013-03-12 Thread samir das mohapatra
You can get it through the RecordReader and the input split information; see the sketch below.
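[Editor's note] A minimal sketch of the map-side part, assuming TextInputFormat: the key handed to the mapper is already the byte offset of the record within the split's file, and the file itself comes from the FileSplit, so the mapper can tag each value with (file, offset) before it reaches the reducer. The tag format is illustrative.

// inside a Mapper<LongWritable, Text, Text, Text>
FileSplit split = (FileSplit) context.getInputSplit();
String file = split.getPath().getName();
long offset = key.get();                 // byte offset of this record in the file
context.write(new Text(file + ":" + offset), value);

On the reduce side there is no equivalent notion of a record position, so carrying it in the key or value like this is the usual workaround.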


On Tue, Mar 12, 2013 at 4:08 PM, Roth Effy  wrote:

> Hi,everyone,
> I want to join the k-v pairs in Reduce(),but how to get the record
> position?
> Now,what I thought is to save the context status,but class Context doesn't
> implement a clone construct method.
>
> Any help will be appreciated.
> Thank you very much.
>


Why hadoop is spawing two map over file size 1.5 KB ?

2013-03-12 Thread samir das mohapatra
Hi All,
  I have a very basic doubt: I have a file of size 1.5 KB and the block size
is the default, but I can see that two mappers were created during the job.
Could you please help me understand the whole picture of why that is?

Regards,
samir.


Re: Hadoop cluster hangs on big hive job

2013-03-10 Thread samir das mohapatra
The problem I can see in your log file is that there is no free map slot
available for the job.
I think you have to increase the block size to reduce the number of maps,
because you are passing big data as input.
The usual approach is to tune, in order:
  1) the block size,
  2) the map-side sort buffer,
  3) JVM reuse, etc. (property names are sketched below)
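[Editor's note] For reference, a hedged list of the MRv1-era property names behind those three knobs; the values are illustrative, not recommendations.

dfs.block.size=268435456                 # HDFS block size used for newly written files
io.sort.mb=256                           # map-side sort buffer
mapred.job.reuse.jvm.num.tasks=-1        # reuse task JVMs (-1 = unlimited)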

regards,
samir.


On Fri, Mar 8, 2013 at 1:23 AM, Daning Wang  wrote:

> We have hive query processing zipped csv files. the query was scanning for
> 10 days(partitioned by date). data for each day around 130G. The problem is
> not consistent since if you run it again, it might go through. but the
> problem has never happened on the smaller jobs(like processing only one
> days data).
>
> We don't have space issue.
>
> I have attached log file when problem happening. it is stuck like
> following(just search "19706 of 49964")
>
> 2013-03-05 15:13:51,587 INFO org.apache.hadoop.mapred.TaskTracker:
> attempt_201302270947_0010_r_19_0 0.131468% reduce > copy (19706 of
> 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:51,811 INFO org.apache.hadoop.mapred.TaskTracker:
> attempt_201302270947_0010_r_39_0 0.131468% reduce > copy (19706 of
> 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:52,551 INFO org.apache.hadoop.mapred.TaskTracker:
> attempt_201302270947_0010_r_32_0 0.131468% reduce > copy (19706 of
> 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:52,760 INFO org.apache.hadoop.mapred.TaskTracker:
> attempt_201302270947_0010_r_00_0 0.131468% reduce > copy (19706 of
> 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:52,946 INFO org.apache.hadoop.mapred.TaskTracker:
> attempt_201302270947_0010_r_24_0 0.131468% reduce > copy (19706 of
> 49964 at 0.00 MB/s) >
> 2013-03-05 15:13:54,742 INFO org.apache.hadoop.mapred.TaskTracker:
> attempt_201302270947_0010_r_08_0 0.131468% reduce > copy (19706 of
> 49964 at 0.00 MB/s) >
>
> Thanks,
>
> Daning
>
>
> On Thu, Mar 7, 2013 at 12:21 AM, Håvard Wahl Kongsgård <
> haavard.kongsga...@gmail.com> wrote:
>
>> hadoop logs?
>> On 6. mars 2013 21:04, "Daning Wang"  wrote:
>>
>>> We have 5 nodes cluster(Hadoop 1.0.4), It hung a couple of times while
>>> running big jobs. Basically all the nodes are dead, from that
>>> trasktracker's log looks it went into some kinds of loop forever.
>>>
>>> All the log entries like this when problem happened.
>>>
>>> Any idea how to debug the issue?
>>>
>>> Thanks in advance.
>>>
>>>
>>> 2013-03-05 15:13:19,526 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_12_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:19,552 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_28_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:20,858 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_36_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:21,141 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_16_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:21,486 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_19_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:21,692 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_39_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:22,448 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_32_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:22,643 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_00_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:22,840 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_24_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:24,628 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_08_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:24,723 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_39_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:25,336 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_04_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:25,539 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_43_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:25,545 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_12_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:25,569 INFO org.apache.hadoop.mapred.TaskTracker:
>>> attempt_201302270947_0010_r_28_0 0.131468% reduce > copy (19706 of
>>> 49964 at 0.00 MB/s) >
>>> 2013-03-05 15:13:25,855 INFO org.apache.hadoop.m

Re: Need help optimizing reducer

2013-03-04 Thread samir das mohapatra
Austin,
  I think you have to use a partitioner to spawn more than one reducer for a
small data set.
  With a single reducer you are limited to one task; you would have to rework
the logic (for example with a custom partitioner) so the keys can be spread
across more than one reducer.




On Tue, Mar 5, 2013 at 1:27 AM, Austin Chungath  wrote:

> Hi all,
>
> I have 1 reducer and I have around 600 thousand unique keys coming to it.
> The total data is only around 30 mb.
> My logic doesn't allow me to have more than 1 reducer.
> It's taking too long to complete, around 2 hours. (till 66% it's fast then
> it slows down/ I don't really think it has started doing anything till 66%
> but then why does it show like that?).
> Are there any job execution parameters that can help improve reducer
> performace?
> Any suggestions to improve things when we have to live with just one
> reducer?
>
> thanks,
> Austin
>


How to Use flume

2013-03-01 Thread samir das mohapatra
Hi All,
   I am planning to use Flume in a POC project. I am new to Flume; do you have
any supported docs, links, or examples from which I can get the full context
ASAP?

Regards,
samir.


Re: Issue with sqoop and HANA/ANY DB Schema name

2013-03-01 Thread samir das mohapatra
Any help...


On Fri, Mar 1, 2013 at 12:06 PM, samir das mohapatra <
samir.help...@gmail.com> wrote:

> Hi All,
>   I am facing one problem , how to specify the schema name before the
> table while executing the sqoop import statement.
>
> $> sqoop import  --connect  jdbc:sap://host:port/db_name  --driver
> com.sap.db.jdbc.Driver   --table  SchemaName.Test-m  1 --username 
> --password   --target-dir  /input/Test1  --verbose
>
> Note : Without schema name above sqoop import is working file but after
> assigning the schema name it is showing  error
>
> Error Logs:
>
>
>
> Regards,
> samir.
>


How to use sqoop import

2013-02-28 Thread samir das mohapatra
Hi All,
  Can anyone share an example of how to run a Sqoop "import results of SQL
'statement'" job?
 For example:
 sqoop import --connect jdbc:...  --driver xxx

After this, if I specify --query "select statement", Sqoop does not even
recognize it as a valid statement.
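[Editor's note] A hedged example of the free-form query import syntax from the Sqoop 1 user guide; the connection string, split column and paths are placeholders. Note that --query requires the literal $CONDITIONS token (keep it in single quotes) plus either --split-by or -m 1.

sqoop import \
  --connect jdbc:sap://host:port/db_name \
  --driver com.sap.db.jdbc.Driver \
  --query 'SELECT * FROM SchemaName.Test WHERE $CONDITIONS' \
  --split-by ID \
  --target-dir /input/Test1

Using --query with a schema-qualified table is also a common workaround when --table SchemaName.Test is not accepted.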

Regards,
samir.


Issue with sqoop and HANA/ANY DB Schema name

2013-02-28 Thread samir das mohapatra
Hi All,
  I am facing one problem: how do I specify the schema name before the table
while executing the sqoop import statement?

$> sqoop import  --connect  jdbc:sap://host:port/db_name  --driver
com.sap.db.jdbc.Driver   --table  SchemaName.Test-m  1 --username 
--password   --target-dir  /input/Test1  --verbose

Note: without the schema name the above sqoop import works fine, but after
adding the schema name it shows an error.

Error Logs:



Regards,
samir.


Re: Issue in Datanode (using CDH4.1.2)

2013-02-28 Thread samir das mohapatra
A few more details:

the same setup was working on an Ubuntu machine (dev cluster); it is only
failing on CentOS 6.3 (prod cluster).
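[Editor's note] A hedged checklist rather than a confirmed fix: on CentOS the usual suspects for "Retrying connect to server ... Connection refused" between DataNode and NameNode are the host firewall and name resolution, so it is worth verifying from a DataNode that the NameNode RPC port is reachable and that iptables/SELinux are not blocking it (port 54310 is taken from the log below):

telnet hadoophost1 54310          # should connect if the RPC port is open
sudo service iptables status      # check the firewall rules on CentOS
cat /etc/hosts                    # make sure hadoophost1 resolves consistently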



On Thu, Feb 28, 2013 at 9:06 PM, samir das mohapatra <
samir.help...@gmail.com> wrote:

> Hi All,
>   I am facing on strange issue, That is In a cluster having 1k machine  i
> could able to start and stop
> NN,DN,JT,TT,SSN. But the problem is  under Name node  Web-URL
> it is showing only one  datanode . I tried to connect node through ssh
> also it was working file and i have  assigned NNURL: port  in core-site
> http://namenode:50070
>
> Again I have checked with datanode logs,  and I got the message like this:
>
> 2013-02-28 06:59:01,652 WARN
> org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to
> server: hadoophost1/192.168.1.1:54310
> 2013-02-28 06:59:07,660 INFO org.apache.hadoop.ipc.Client: Retrying
> connect to server: hadoophost1/192.168.1.1:54310. Already tried 0
> time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
> sleepTime=1 SECONDS)
>
> Regards,
> samir.
>
>


Re: How to take Whole Database From RDBMS to HDFS Instead of Table/Table

2013-02-27 Thread samir das mohapatra
Is it a good way to take a total of 5 PB of data through a Java/JDBC program?


On Wed, Feb 27, 2013 at 5:56 PM, Michel Segel wrote:

> I wouldn't use sqoop if you are taking everything.
> Simpler to write your own java/jdbc program that writes its output to HDFS.
>
> Just saying...
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Feb 27, 2013, at 5:15 AM, samir das mohapatra 
> wrote:
>
> thanks all.
>
>
>
> On Wed, Feb 27, 2013 at 4:41 PM, Jagat Singh  wrote:
>
>> You might want to read this
>>
>>
>> http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_import_all_tables_literal
>>
>>
>>
>>
>> On Wed, Feb 27, 2013 at 10:09 PM, samir das mohapatra <
>> samir.help...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>>Using sqoop how to take entire database table into HDFS insted of
>>> Table by Table ?.
>>>
>>> How do you guys did it?
>>> Is there some trick?
>>>
>>> Regards,
>>> samir.
>>>
>>
>>
>


Re: How to take Whole Database From RDBMS to HDFS Instead of Table/Table

2013-02-27 Thread samir das mohapatra
thanks all.



On Wed, Feb 27, 2013 at 4:41 PM, Jagat Singh  wrote:

> You might want to read this
>
>
> http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_import_all_tables_literal
>
>
>
>
> On Wed, Feb 27, 2013 at 10:09 PM, samir das mohapatra <
> samir.help...@gmail.com> wrote:
>
>> Hi All,
>>
>>Using sqoop how to take entire database table into HDFS insted of
>> Table by Table ?.
>>
>> How do you guys did it?
>> Is there some trick?
>>
>> Regards,
>> samir.
>>
>
>


How to take Whole Database From RDBMS to HDFS Instead of Table/Table

2013-02-27 Thread samir das mohapatra
Hi All,

   Using Sqoop, how do I take the entire database into HDFS instead of table
by table?

How did you guys do it?
Is there some trick?
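[Editor's note] A hedged example of the import-all-tables tool referenced in the replies to this thread (the connection string and directory are placeholders); it imports every table in the database, and newer Sqoop releases can skip tables with --exclude-tables.

sqoop import-all-tables \
  --connect jdbc:mysql://dbhost/mydb \
  --username hadoop -P \
  --warehouse-dir /user/hadoop/mydb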

Regards,
samir.


Re: ISSUE IN CDH4.1.2 : transfer data between different HDFS clusters.(using distch)

2013-02-25 Thread samir das mohapatra
I am using CDH4.1.2 with MRv1 not YARN.


On Mon, Feb 25, 2013 at 3:47 PM, samir das mohapatra <
samir.help...@gmail.com> wrote:

> yes
>
>
> On Mon, Feb 25, 2013 at 3:30 PM, Nitin Pawar wrote:
>
>> does this match with your issue
>>
>>
>> https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/kIPOvrFaQE8
>>
>>
>> On Mon, Feb 25, 2013 at 3:20 PM, samir das mohapatra <
>> samir.help...@gmail.com> wrote:
>>
>>>
>>>
>>> -- Forwarded message --
>>> From: samir das mohapatra 
>>> Date: Mon, Feb 25, 2013 at 3:05 PM
>>> Subject: ISSUE IN CDH4.1.2 : transfer data between different HDFS
>>> clusters.(using distch)
>>> To: cdh-u...@cloudera.org
>>>
>>>
>>> Hi All,
>>>   I am getting bellow error , can any one help me on the same issue,
>>>
>>> ERROR LOG:
>>> --
>>>
>>> hadoop@hadoophost2:~$ hadoop   distcp hdfs://
>>> 10.192.200.170:50070/tmp/samir.txt hdfs://10.192.244.237:50070/input
>>> 13/02/25 01:34:36 INFO tools.DistCp: srcPaths=[hdfs://
>>> 10.192.200.170:50070/tmp/samir.txt]
>>> 13/02/25 01:34:36 INFO tools.DistCp: destPath=hdfs://
>>> 10.192.244.237:50070/input
>>> With failures, global counters are inaccurate; consider running with -i
>>> Copy failed: java.io.IOException: Failed on local exception:
>>> com.google.protobuf.InvalidProtocolBufferException: Protocol message
>>> end-group tag did not match expected tag.; Host Details : local host is:
>>> "hadoophost2/10.192.244.237"; destination host is: "
>>> bl1slu040.corp.adobe.com":50070;
>>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1164)
>>> at
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>>> at $Proxy9.getFileInfo(Unknown Source)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:616)
>>> at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>>> at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>>> at $Proxy9.getFileInfo(Unknown Source)
>>> at
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:628)
>>> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1507)
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:783)
>>> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1257)
>>> at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:636)
>>> at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>>> at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>>> at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol
>>> message end-group tag did not match expected tag.
>>> at
>>> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:73)
>>> at
>>> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>>> at
>>> com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:213)
>>> at
>>> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
>>> at
>>> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
>>> at
>>> com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
>>> at
>>> com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
>>> at
>>> com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
>>> at
>>> com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
>>> at
>>> org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985)
>>> at
>>> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:882)
>>> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:813)
>>>
>>>
>>>
>>> Regards,
>>> samir
>>>
>>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>


Re: ISSUE IN CDH4.1.2 : transfer data between different HDFS clusters.(using distch)

2013-02-25 Thread samir das mohapatra
yes


On Mon, Feb 25, 2013 at 3:30 PM, Nitin Pawar wrote:

> does this match with your issue
>
> https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/kIPOvrFaQE8
>
>
> On Mon, Feb 25, 2013 at 3:20 PM, samir das mohapatra <
> samir.help...@gmail.com> wrote:
>
>>
>>
>> ------ Forwarded message --
>> From: samir das mohapatra 
>> Date: Mon, Feb 25, 2013 at 3:05 PM
>> Subject: ISSUE IN CDH4.1.2 : transfer data between different HDFS
>> clusters.(using distch)
>> To: cdh-u...@cloudera.org
>>
>>
>> Hi All,
>>   I am getting bellow error , can any one help me on the same issue,
>>
>> ERROR LOG:
>> --
>>
>> hadoop@hadoophost2:~$ hadoop   distcp hdfs://
>> 10.192.200.170:50070/tmp/samir.txt hdfs://10.192.244.237:50070/input
>> 13/02/25 01:34:36 INFO tools.DistCp: srcPaths=[hdfs://
>> 10.192.200.170:50070/tmp/samir.txt]
>> 13/02/25 01:34:36 INFO tools.DistCp: destPath=hdfs://
>> 10.192.244.237:50070/input
>> With failures, global counters are inaccurate; consider running with -i
>> Copy failed: java.io.IOException: Failed on local exception:
>> com.google.protobuf.InvalidProtocolBufferException: Protocol message
>> end-group tag did not match expected tag.; Host Details : local host is:
>> "hadoophost2/10.192.244.237"; destination host is: "
>> bl1slu040.corp.adobe.com":50070;
>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
>> at org.apache.hadoop.ipc.Client.call(Client.java:1164)
>> at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>> at $Proxy9.getFileInfo(Unknown Source)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:616)
>> at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>> at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>> at $Proxy9.getFileInfo(Unknown Source)
>> at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:628)
>> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1507)
>> at
>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:783)
>> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1257)
>> at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:636)
>> at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
>> at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>> at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>> Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol
>> message end-group tag did not match expected tag.
>> at
>> com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:73)
>> at
>> com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
>> at
>> com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:213)
>> at
>> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
>> at
>> com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
>> at
>> com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
>> at
>> com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
>> at
>> com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
>> at
>> com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
>> at
>> org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985)
>> at
>> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:882)
>> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:813)
>>
>>
>>
>> Regards,
>> samir
>>
>>
>
>
> --
> Nitin Pawar
>


Fwd: ISSUE IN CDH4.1.2 : transfer data between different HDFS clusters.(using distch)

2013-02-25 Thread samir das mohapatra
-- Forwarded message --
From: samir das mohapatra 
Date: Mon, Feb 25, 2013 at 3:05 PM
Subject: ISSUE IN CDH4.1.2 : transfer data between different HDFS
clusters.(using distch)
To: cdh-u...@cloudera.org


Hi All,
  I am getting the error below; can anyone help me with this issue?

ERROR LOG:
--

hadoop@hadoophost2:~$ hadoop   distcp hdfs://
10.192.200.170:50070/tmp/samir.txt hdfs://10.192.244.237:50070/input
13/02/25 01:34:36 INFO tools.DistCp: srcPaths=[hdfs://
10.192.200.170:50070/tmp/samir.txt]
13/02/25 01:34:36 INFO tools.DistCp: destPath=hdfs://
10.192.244.237:50070/input
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.IOException: Failed on local exception:
com.google.protobuf.InvalidProtocolBufferException: Protocol message
end-group tag did not match expected tag.; Host Details : local host is:
"hadoophost2/10.192.244.237"; destination host is:
"bl1slu040.corp.adobe.com":50070;

at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759)
at org.apache.hadoop.ipc.Client.call(Client.java:1164)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy9.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy9.getFileInfo(Unknown Source)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:628)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1507)
at
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:783)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1257)
at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:636)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol
message end-group tag did not match expected tag.
at
com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:73)
at
com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124)
at
com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:213)
at
com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
at
com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
at
com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
at
com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
at
com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
at
com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
at
org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985)
at
org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:882)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:813)
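[Editor's note] A hedged note on the command itself: the "Protocol message end-group tag did not match expected tag" error typically appears when the distcp client speaks RPC to the NameNode's HTTP port. Port 50070 is the web/HTTP port, so an hdfs:// URI normally needs the RPC port (8020 by default on CDH4, 54310 in some setups), or the source can be read over hftp on 50070, for example:

hadoop distcp hdfs://10.192.200.170:8020/tmp/samir.txt hdfs://10.192.244.237:8020/input
hadoop distcp hftp://10.192.200.170:50070/tmp/samir.txt hdfs://10.192.244.237:8020/input

The exact RPC port depends on fs.defaultFS / fs.default.name in the cluster configuration.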



Regards,
samir


Re: ISSUE :Hadoop with HANA using sqoop

2013-02-21 Thread samir das mohapatra
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: com.sap.db.jdbc.exceptions.JDBCDriverException: SAP DBTech
JDBC: [257]: sql syntax error: incorrect syntax near ".": line 1 col
46 (at pos 46)
at 
com.sap.db.jdbc.exceptions.SQLExceptionSapDB.createException(SQLExceptionSapDB.java:334)
at 
com.sap.db.jdbc.exceptions.SQLExceptionSapDB.generateDatabaseException(SQLExceptionSapDB.java:174)
at 
com.sap.db.jdbc.packet.ReplyPacket.buildExceptionChain(ReplyPacket.java:103)
at com.sap.db.jdbc.ConnectionSapDB.execute(ConnectionSapDB.java:848)
at 
com.sap.db.jdbc.CallableStatementSapDB.sendCommand(CallableStatementSapDB.java:1874)
at com.sap.db.jdbc.StatementSapDB.sendSQL(StatementSapDB.java:945)
at 
com.sap.db.jdbc.CallableStatementSapDB.doParse(CallableStatementSapDB.java:230)
at 
com.sap.db.jdbc.CallableStatementSapDB.constructor(CallableStatementSapDB.java:190)
at 
com.sap.db.jdbc.CallableStatementSapDB.<init>(CallableStatementSapDB.java:101)
at 
com.sap.db.jdbc.CallableStatementSapDBFinalize.<init>(CallableStatementSapDBFinalize.java:31)
at 
com.sap.db.jdbc.ConnectionSapDB.prepareStatement(ConnectionSapDB.java:1088)
at 
com.sap.db.jdbc.trace.Connection.prepareStatement(Connection.java:347)
at 
org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:101)
at 
org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:236)
... 12 more
2013-02-20 23:10:23,906 INFO org.apache.hadoop.mapred.Task: Runnning
cleanup for the task



On Thu, Feb 21, 2013 at 12:03 PM, Harsh J  wrote:

> The error is truncated, check the actual failed task's logs for complete
> info:
>
> Caused by: com.sap… what?
>
> Seems more like a SAP side fault than a Hadoop side one and you should
> ask on their forums with the stacktrace posted.
>
> On Thu, Feb 21, 2013 at 11:58 AM, samir das mohapatra
>  wrote:
> > Hi All
> > Can you plese tell me why I am getting error while loading data from
> > SAP HANA   to Hadoop HDFS using sqoop (4.1.2).
> >
> > Error Log:
> >
> > java.io.IOException: SQLException in nextKeyValue
> >   at
> >
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:265)
> >   at
> >
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:458)
> >   at
> >
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
> >   at
> >
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
> >   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
> >   at
> >
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
> >   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
> >   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> >   at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> >   at java.security.AccessController.doPrivileged(Native Method)
> >   at javax.security.auth.Subject.doAs(Subject.java:416)
> >   at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
> >   at org.apache.hadoop.mapred.Child.main(Child.java:262)
> > Caused by: com.sap
> >
> > Regards,
> > samir.
> >
> >
> >
> > --
> >
> >
> >
>
>
>
> --
> Harsh J
>


Re: ISSUE :Hadoop with HANA using sqoop

2013-02-20 Thread samir das mohapatra
ml#noconfig for more info.
13/02/20 22:38:32 INFO mapred.JobClient: Task Id :
attempt_201302202127_0014_m_00_2, Status : FAILED
java.io.IOException: SQLException in nextKeyValue
at
org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:265)
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:458)
at
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
at
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
at
org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: com.sap
attempt_201302202127_0014_m_00_2: log4j:WARN No appenders could be
found for logger (org.apache.hadoop.hdfs.DFSClient).
attempt_201302202127_0014_m_00_2: log4j:WARN Please initialize the
log4j system properly.
attempt_201302202127_0014_m_00_2: log4j:WARN See
http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
13/02/20 22:38:46 INFO mapred.JobClient: Job complete: job_201302202127_0014
13/02/20 22:38:46 INFO mapred.JobClient: Counters: 6
13/02/20 22:38:46 INFO mapred.JobClient:   Job Counters
13/02/20 22:38:46 INFO mapred.JobClient: Failed map tasks=1
13/02/20 22:38:46 INFO mapred.JobClient: Launched map tasks=4
13/02/20 22:38:46 INFO mapred.JobClient: Total time spent by all maps
in occupied slots (ms)=56775
13/02/20 22:38:46 INFO mapred.JobClient: Total time spent by all
reduces in occupied slots (ms)=0
13/02/20 22:38:46 INFO mapred.JobClient: Total time spent by all maps
waiting after reserving slots (ms)=0
13/02/20 22:38:46 INFO mapred.JobClient: Total time spent by all
reduces waiting after reserving slots (ms)=0
13/02/20 22:38:46 INFO mapreduce.ImportJobBase: Transferred 0 bytes in
70.5203 seconds (0 bytes/sec)
13/02/20 22:38:46 WARN mapreduce.Counters: Group
org.apache.hadoop.mapred.Task$Counter is deprecated. Use
org.apache.hadoop.mapreduce.TaskCounter instead
13/02/20 22:38:46 INFO mapreduce.ImportJobBase: Retrieved 0 records.
13/02/20 22:38:46 ERROR tool.ImportTool: Error during import: Import job
failed!


On Thu, Feb 21, 2013 at 12:03 PM, Harsh J  wrote:

> The error is truncated, check the actual failed task's logs for complete
> info:
>
> Caused by: com.sap… what?
>
> Seems more like a SAP side fault than a Hadoop side one and you should
> ask on their forums with the stacktrace posted.
>
> On Thu, Feb 21, 2013 at 11:58 AM, samir das mohapatra
>  wrote:
> > Hi All
> > Can you plese tell me why I am getting error while loading data from
> > SAP HANA   to Hadoop HDFS using sqoop (4.1.2).
> >
> > Error Log:
> >
> > java.io.IOException: SQLException in nextKeyValue
> >   at
> >
> org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:265)
> >   at
> >
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:458)
> >   at
> >
> org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
> >   at
> >
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
> >   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
> >   at
> >
> org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182)
> >   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
> >   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> >   at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> >   at java.security.AccessController.doPrivileged(Native Method)
> >   at javax.security.auth.Subject.doAs(Subject.java:416)
> >   at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
> >   at org.apache.hadoop.mapred.Child.main(Child.java:262)
> > Caused by: com.sap
> >
> > Regards,
> > samir.
> >
> >
> >
> > --
> >
> >
> >
>
>
>
> --
> Harsh J
>


Fwd: Delivery Status Notification (Failure)

2013-02-12 Thread samir das mohapatra
Hi All,
   I want to know how to connect Hive (Hadoop CDH4 distribution) with
MicroStrategy.
   Any help would be appreciated.

  Waiting for your response.

Note: it is a little bit urgent; does anyone have experience with this?
Thanks,
samir


Re: "Hive Metastore DB Issue ( Cloudera CDH4.1.2 MRv1 with hive-0.9.0-cdh4.1.2)"

2013-02-07 Thread samir das mohapatra
Hi Suresh,
   Thanks for the advice.

Please don't be so restrictive about where this is posted; the point is to
find a solution to the problem.

Note: I am looking for help from any user, because it is a common usage
scenario.


On Fri, Feb 8, 2013 at 3:31 AM, Suresh Srinivas wrote:

> Please only use CDH mailing list and do not copy this to hdfs-user.
>
>
> On Thu, Feb 7, 2013 at 7:20 AM, samir das mohapatra <
> samir.help...@gmail.com> wrote:
>
>> Any Suggestion...
>>
>>
>> On Thu, Feb 7, 2013 at 4:17 PM, samir das mohapatra <
>> samir.help...@gmail.com> wrote:
>>
>>> Hi All,
>>>   I could not see the hive meta  store DB under Mysql  database Under
>>> mysql user hadoop.
>>>
>>> Example:
>>>
>>> $>  mysql –u root -p
>>>  $> Add hadoop user (CREATE USER ‘hadoop'@'localhost' IDENTIFIED BY ‘
>>> hadoop';)
>>>  $>GRANT ALL ON *.* TO ‘hadoop'@‘% IDENTIFIED BY ‘hadoop’
>>>  $> Example (GRANT ALL PRIVILEGES ON *.* TO 'hadoop'@'localhost'
>>> IDENTIFIED BY 'hadoop' WITH GRANT OPTION;)
>>>
>>> Bellow  configuration i am follwing
>>> <configuration>
>>>
>>> <property>
>>>   <name>javax.jdo.option.ConnectionURL</name>
>>>   <value>jdbc:mysql://localhost:3306/hadoop?createDatabaseIfNotExist=true</value>
>>> </property>
>>> <property>
>>>   <name>javax.jdo.option.ConnectionDriverName</name>
>>>   <value>com.mysql.jdbc.Driver</value>
>>> </property>
>>> <property>
>>>   <name>javax.jdo.option.ConnectionUserName</name>
>>>   <value>hadoop</value>
>>> </property>
>>> <property>
>>>   <name>javax.jdo.option.ConnectionPassword</name>
>>>   <value>hadoop</value>
>>> </property>
>>>
>>> </configuration>
>>>
>>>
>>>  Note: Previously i was using cdh3 it was perfectly creating under mysql
>>> metastore DB but when i changed cdh3 to cdh4.1.2 with hive as above subject
>>> line , It is not creating.
>>>
>>>
>>> Any suggestiong..
>>>
>>> Regrads,
>>> samir.
>>>
>>
>>
>
>
> --
> http://hortonworks.com/download/
>


Re: "Hive Metastore DB Issue ( Cloudera CDH4.1.2 MRv1 with hive-0.9.0-cdh4.1.2)"

2013-02-07 Thread samir das mohapatra
Any Suggestion...


On Thu, Feb 7, 2013 at 4:17 PM, samir das mohapatra  wrote:

> Hi All,
>   I could not see the hive meta  store DB under Mysql  database Under
> mysql user hadoop.
>
> Example:
>
> $>  mysql –u root -p
>  $> Add hadoop user (CREATE USER ‘hadoop'@'localhost' IDENTIFIED BY ‘
> hadoop';)
>  $>GRANT ALL ON *.* TO ‘hadoop'@‘% IDENTIFIED BY ‘hadoop’
>  $> Example (GRANT ALL PRIVILEGES ON *.* TO 'hadoop'@'localhost'
> IDENTIFIED BY 'hadoop' WITH GRANT OPTION;)
>
> Bellow  configuration i am follwing
> <configuration>
>
> <property>
>   <name>javax.jdo.option.ConnectionURL</name>
>   <value>jdbc:mysql://localhost:3306/hadoop?createDatabaseIfNotExist=true</value>
> </property>
> <property>
>   <name>javax.jdo.option.ConnectionDriverName</name>
>   <value>com.mysql.jdbc.Driver</value>
> </property>
> <property>
>   <name>javax.jdo.option.ConnectionUserName</name>
>   <value>hadoop</value>
> </property>
> <property>
>   <name>javax.jdo.option.ConnectionPassword</name>
>   <value>hadoop</value>
> </property>
>
> </configuration>
>
>
> Note: Previously i was using cdh3 it was perfectly creating under mysql
> metastore DB but when i changed cdh3 to cdh4.1.2 with hive as above subject
> line , It is not creating.
>
>
> Any suggestiong..
>
> Regrads,
> samir.
>


All MAP Jobs(Java Custom Map Reduce Program) are assigned to one Node why?

2013-01-31 Thread samir das mohapatra
Hi All,

  I am using CDH4 with MRv1. When I run any Hadoop MapReduce program from
Java, all the map tasks are assigned to one node. They are supposed to be
distributed among the cluster's nodes.

 Note: 1) My JobTracker web UI is showing 500 nodes.
2) When it comes to the reducer, it is spawned on another node (other than
the map node).

Can anyone guide me on why it behaves like this?

Regards,
samir.


Re: What is the best way to load data from one cluster to another cluster (Urgent requirement)

2013-01-30 Thread samir das mohapatra
thanks all.
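[Editor's note] A hedged sketch of the distcp-based approach suggested in this thread, assuming both clusters run CDH4 (option support varies slightly by version); -update only copies files that are missing or differ at the target, which gives a crude incremental load, while true timestamp-based deltas still need application-level bookkeeping.

hadoop distcp -update -skipcrccheck hdfs://src-nn:8020/data hdfs://dst-nn:8020/data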


On Thu, Jan 31, 2013 at 11:19 AM, Satbeer Lamba wrote:

> I might be wrong but have you considered distcp?
> On Jan 31, 2013 11:15 AM, "samir das mohapatra" 
> wrote:
>
>> Hi All,
>>
>>Any one knows,  how to load data from one hadoop cluster(CDH4) to
>> another Cluster (CDH4) . They way our project needs are
>>1) It should  be delta load or incremental load.
>>2) It should be based on the timestamp
>>3) Data volume are 5PB
>>
>> Any Help 
>>
>> Regards,
>> samir.
>>
>


Recommendation required for Right Hadoop Distribution (CDH OR HortonWork)

2013-01-30 Thread samir das mohapatra
Hi All,
   My company wants to implement the right Apache Hadoop distribution for
production as well as dev. Can anyone suggest which one will be good for the
future?

Hints:
We want to know both the pros and cons.


Regards,
samir.


Re: How to Integrate MicroStrategy with Hadoop

2013-01-30 Thread samir das mohapatra
thanks for quick reply.


On Thu, Jan 31, 2013 at 2:19 AM, Nitin Pawar wrote:

> this is specific to cloudera to please post only to cdh users
>
> here is the link
>
> http://www.cloudera.com/content/cloudera/en/solutions/partner/Microstrategy.html
>
> you can follow links from there on
>
>
> On Thu, Jan 31, 2013 at 2:16 AM, samir das mohapatra <
> samir.help...@gmail.com> wrote:
>
>> We are using coludera Hadoop
>>
>>
>> On Thu, Jan 31, 2013 at 2:12 AM, samir das mohapatra <
>> samir.help...@gmail.com> wrote:
>>
>>> Hi All,
>>>I wanted to know how to connect HAdoop with MircoStrategy
>>>Any help is very helpfull.
>>>
>>>   Witing for you response
>>>
>>> Note: Any Url and Example will be really help full for me.
>>>
>>> Thanks,
>>> samir
>>>
>>
>>
>
>
> --
> Nitin Pawar
>


How to Integrate Cloudera Hadoop With Microstrategy and Hadoop With SAP HANA

2013-01-30 Thread samir das mohapatra
Regards,
samir.


Re: How to Integrate MicroStrategy with Hadoop

2013-01-30 Thread samir das mohapatra
We are using coludera Hadoop


On Thu, Jan 31, 2013 at 2:12 AM, samir das mohapatra <
samir.help...@gmail.com> wrote:

> Hi All,
>I wanted to know how to connect HAdoop with MircoStrategy
>Any help is very helpfull.
>
>   Witing for you response
>
> Note: Any Url and Example will be really help full for me.
>
> Thanks,
> samir
>


How to Integrate SAP HANA WITH Hadoop

2013-01-30 Thread samir das mohapatra
Hi all,
We need connectivity between SAP HANA and Hadoop.
 If you have any experience with that, could you please share some documents
and examples with me? It would be really helpful.

thanks,
samir


How to Integrate MicroStrategy with Hadoop

2013-01-30 Thread samir das mohapatra
Hi All,
   I want to know how to connect Hadoop with MicroStrategy.
   Any help would be appreciated.

  Waiting for your response.

Note: any URL or example would be really helpful for me.

Thanks,
samir


Re: Hadoop Nutch Mkdirs failed to create file

2013-01-24 Thread samir das mohapatra
Just try applying:
$> chmod -R 755 /home/wj/apps/apache-nutch-1.6

and then try again.



On Wed, Jan 23, 2013 at 9:23 PM, 吴靖  wrote:

> hi, everyone!
>  I want to use Nutch to crawl web pages, but a problem shows up in the log
> as below. I think it may be a permissions problem, but I am not sure.
> Any help will be appreciated, thank you
>
> 2013-01-23 07:37:21,809 ERROR mapred.FileOutputCommitter - Mkdirs failed
> to create file
> :/home/wj/apps/apache-nutch-1.6/bin/crawl/crawldb/190684692/_temporary
> 2013-01-23 07:37:24,836 WARN  mapre d.LocalJobRunner - job_local_0002
> java.io.IOException: The temporary job-output directory
> file:/home/wj/apps/apache-nutch-1.6/bin/crawl/crawldb/190684692/_temporary
> doesn't exist!
> at
> org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
> at
> org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
> at
> org.apache.hadoop.mapred.MapFileOutputFormat.getRecordWriter(MapFileOutputFormat.java:46)
> at
> org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.<init>(ReduceTask.java:448)
> at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:490)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
>
>
>