Issue with filtering an HBase table using SingleColumnValueFilter
Dear developer, I am looking for a way to apply the SingleColumnValueFilter so that it selects only the exact value I pass in the value parameter and nothing else. Example: SingleColumnValueFilter colValFilter = new SingleColumnValueFilter(Bytes.toBytes(cf1), Bytes.toBytes(code), CompareFilter.CompareOp.EQUAL, new SubstringComparator(SAMIR_AL_START)); colValFilter.setFilterIfMissing(false); filters.add(colValFilter); Note: I want only the SAMIR_AL_START value, not values like XYZ_AL_START as well; I want an exact match, not a substring-style match. Right now it is returning both SAMIR_AL_START and XYZ_AL_START. Regards, samir.
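For an exact match the comparator is the piece to change; a minimal sketch using the stock HBase filter API (cf1, code and filters are the variables from the message above):

SingleColumnValueFilter colValFilter = new SingleColumnValueFilter(
    Bytes.toBytes(cf1), Bytes.toBytes(code),
    CompareFilter.CompareOp.EQUAL,
    // BinaryComparator compares the whole cell value byte-for-byte, so
    // XYZ_AL_START no longer matches the way it does with SubstringComparator
    new BinaryComparator(Bytes.toBytes("SAMIR_AL_START")));
// true = also drop rows that do not contain the column at all
colValFilter.setFilterIfMissing(true);
filters.add(colValFilter);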
Did anyone work with HBase MapReduce using multiple tables as input?
Dear Hadoop/HBase developers, did anyone work with HBase MapReduce using multiple tables as input? Any link or example will help me a lot. Thanks in advance. Thanks, samir.
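One way this is commonly wired up, assuming an HBase release that ships MultiTableInputFormat (0.94.5 and later); this is a sketch for the job-setup code, with table names, the mapper class and the output types as placeholders:

List<Scan> scans = new ArrayList<Scan>();
Scan scan1 = new Scan();
scan1.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, Bytes.toBytes("table_one"));
scans.add(scan1);
Scan scan2 = new Scan();
scan2.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, Bytes.toBytes("table_two"));
scans.add(scan2);
// one mapper class receives rows from every table; inside the mapper,
// ((TableSplit) context.getInputSplit()).getTableName() tells which table a row came from
TableMapReduceUtil.initTableMapperJob(
    scans, MyMultiTableMapper.class, Text.class, Text.class, job);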
Re: Getting error from sqoop2 command
Dear Sqoop users/devs, I am facing an issue, given below. Do you have any idea why I am getting this error and what the problem could be? sqoop:000> show connector --all Exception has occurred during processing command Exception: com.sun.jersey.api.client.UniformInterfaceException Message: GET http://localhost:12000/sqoop/v1/connector/all returned a response status of 404 Not Found After that I set the server and the 404 got resolved, but now I am getting connection refused: sqoop:000> set server --host localhost --port 8050 --webapp sqoop Server is set successfully sqoop:000> show connector --all Exception has occurred during processing command Exception: com.sun.jersey.api.client.ClientHandlerException Message: java.net.ConnectException: Connection refused Regards, samir.
Facing issue using Sqoop2
Dear All, I am getting an error like the one below; has anyone hit this with Sqoop 2? Error: sqoop:000> set server --host hostname1 --port 8050 --webapp sqoop Server is set successfully sqoop:000> show server --all Server host: hostname1 Server port: 8050 Server webapp: sqoop sqoop:000> show version --all client version: Sqoop 1.99.2-cdh4.4.0 revision Compiled by jenkins on Tue Sep 3 20:15:11 PDT 2013 Exception has occurred during processing command Exception: com.sun.jersey.api.client.ClientHandlerException Message: java.net.ConnectException: Connection refused Regards, samir.
How to use a Sqoop command without hard-coding the password
Dear Hadoop/Sqoop users, is there any way to call a Sqoop command without hard-coding the password for the specific RDBMS? If we hard-code the password it becomes a big security issue. Regards, samir.
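Two stock options avoid putting the password in the command line or in a script; the connect string, table and paths below are placeholders, and --password-file only exists on newer Sqoop 1 releases (1.4.4+), so treat that part as an assumption for older installs:

# 1) prompt interactively at run time
sqoop import --connect jdbc:mysql://dbhost/mydb --table mytable --username dbuser -P
# 2) read the password from a restricted file (e.g. chmod 400) instead of the CLI
sqoop import --connect jdbc:mysql://dbhost/mydb --table mytable --username dbuser \
  --password-file /user/hadoop/.db-password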
How to ignore empty files coming out of a Hive map-side join
Dear Hive/Hadoop developers, I was running a Hive map-side join, and along with the output data I could see some empty files produced in the map stage. Why is that, and how can I ignore or avoid these files? Regards, samir.
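One knob often used for this, assuming a Hive release with the merge settings: let Hive run a small merge stage after a map-only job (such as a map-side join) so tiny and empty part files are collapsed. A sketch, with an illustrative threshold value:

-- merge the outputs of map-only jobs before they land in the final directory
SET hive.merge.mapfiles=true;
-- files below this average size (bytes) trigger the merge pass
SET hive.merge.smallfiles.avgsize=16000000;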
Why can't I query a Hive table while data is being inserted into it?
Dear All, did anyone face this issue: while loading a huge dataset into a Hive table, Hive stops me from querying the same table. I have set hive.support.concurrency=true, and it still shows "conflicting lock present for TABLENAME mode SHARED".
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
  <description>Whether hive supports concurrency or not. A zookeeper instance must be up and running for the default hive lock manager to support read-write locks.</description>
</property>
If that is how it behaves, how do I solve the issue? Is there any row-level lock? Regards
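With concurrency enabled and the default ZooKeeper-backed lock manager, the lock that blocks the query can at least be inspected from the Hive CLI; the table name is a placeholder, and the EXTENDED form depends on the Hive version:

SHOW LOCKS tablename;
SHOW LOCKS tablename EXTENDED;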
Error while processing a SequenceFile with LZO compression in a Hive external table (CDH4.3)
Dear All, has anyone faced this type of issue? I am getting an error while processing an LZO-compressed SequenceFile with a Hive query on the CDH4.3.x distribution. Settings and files:
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
-rw-r--r-- 3 myuser supergroup 25172 2013-06-19 21:25 /user/myDir/00_0 -- LZO-compressed sequence file
-rw-r--r-- 3 myuser supergroup 71007 2013-06-19 21:42 /user/myDir/00_0 -- normal sequence file
Now the problem is that if I create an external table on top of the directory to read the data, it gives me an error: Failed with exception java.io.IOException:java.io.EOFException: Premature EOF from inputStream
Table creation:
CREATE EXTERNAL TABLE IF NOT EXISTS MyTable (
  userip string,
  usertid string
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\001' ESCAPED BY '\020'
  COLLECTION ITEMS TERMINATED BY '\002'
  MAP KEYS TERMINATED BY '\003'
  LINES TERMINATED BY '\012'
STORED AS SEQUENCEFILE
LOCATION '/path/to/file';
After that, querying the table gives the same error: Failed with exception java.io.IOException:java.io.EOFException: Premature EOF from inputStream
Why is it like that? Regards, samir
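One thing that may be worth checking, as an assumption rather than a confirmed diagnosis: LzopCodec writes the standalone lzop container format and is normally used for whole .lzo files, while compression inside a SequenceFile usually goes through the plain LzoCodec; mixing the two between writer and reader is a plausible source of a premature-EOF failure. A sketch of the alternative settings:

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
SET mapred.output.compression.type=BLOCK;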
How to get the intermediate mapper output file name
Hi all, how can I get the mapper output file name inside the mapper, or how can I change the mapper output file name? By default it looks like part-m-00000, part-m-00001, etc. Regards, samir.
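A sketch for the first part of the question: the default map-output name is part-m-<task id, zero padded>, so it can be reconstructed inside the mapper from the task attempt id. Changing the "part" prefix via the "mapreduce.output.basename" property only works on newer releases, so treat that as an assumption for your version; MultipleOutputs is the other common route for custom names. Class and type choices below are placeholders.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class NamePeekMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
  private String defaultName;

  @Override
  protected void setup(Context context) {
    // the numeric part of part-m-NNNNN is the map task id
    int partition = context.getTaskAttemptID().getTaskID().getId();
    defaultName = String.format("part-m-%05d", partition);
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    context.write(key, value); // pass-through; defaultName is available for use here
  }
}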
Pulling data from secured hadoop cluster to another hadoop cluster
Hi All, I am able to connect to the source Hadoop cluster once SSH is established. But I want to pull some data with distcp from the secured source Hadoop box to another Hadoop cluster, and I am not able to ping the NameNode machine. In this situation, how do I run the distcp command from the target cluster over a secured connection? Source: hadoop.server1 (ssh secured) Target: hadoop.server2 (running distcp here) Command being run: distcp hftp://hadoop.server1:50070/dataSet hdfs://hadoop.server2:54310/targetDataSet Regards, samir.
Re: Pulling data from secured hadoop cluster to another hadoop cluster
It is not a Hadoop security issue; the security is at the host level, I mean at the network level. I cannot ping because the source system is set up so that you can only connect through SSH. If this is the case, how do I overcome the problem? What extra parameter do I need at the SSH level so that I can reach the machine? All the servers are in the same domain. On Tue, May 28, 2013 at 7:35 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Also Samir, when you say 'secured', by any chance that cluster is secured with Kerberos (rather than ssh)? -Shahab On Tue, May 28, 2013 at 8:29 AM, Nitin Pawar nitinpawar...@gmail.com wrote: hadoop daemons do not use ssh to communicate. if your distcp job could not connect to remote server then either the connection was rejected by the target namenode or it was not able to establish the network connection. were you able to see the hdfs on server1 from server2? On Tue, May 28, 2013 at 5:17 PM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, I am able to connect to the source Hadoop cluster once SSH is established. But I want to pull some data with distcp from the secured source Hadoop box to another Hadoop cluster, and I am not able to ping the NameNode machine. In this situation, how do I run the distcp command from the target cluster over a secured connection? Source: hadoop.server1 (ssh secured) Target: hadoop.server2 (running distcp here) Command being run: distcp hftp://hadoop.server1:50070/dataSet hdfs://hadoop.server2:54310/targetDataSet Regards, samir. -- Nitin Pawar
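As Nitin notes, Hadoop itself never uses SSH for data transfer, and blocked ICMP ping does not by itself stop distcp; what matters is TCP reachability of the NameNode and DataNode ports. A few quick checks from the target side, as a sketch; the hostnames and ports are the ones from the distcp command above, and the tunnel line is only a connectivity probe, not a way to run distcp through SSH:

nc -zv hadoop.server1 50070     # NameNode web port used by hftp://
nc -zv hadoop.server1 50010     # a DataNode transfer port, since distcp reads blocks directly
# if only SSH (port 22) is open, a local forward can at least confirm the NameNode web UI is up
ssh -L 50070:localhost:50070 user@hadoop.server1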
Issue with data Copy from CDH3 to CDH4
Hi all, we tried to pull data from an upstream cluster running CDH3 to a downstream system running CDH4, using distcp to copy the data, and it was throwing an exception because of the version difference. I want to know whether there is any way to pull data from CDH3 to CDH4 without doing it manually. What other approach can solve this problem? (The data is about 10 PB.) Regards, samir.
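The usual way around the RPC incompatibility is to run distcp on the newer (CDH4) cluster and read the CDH3 side over the read-only HFTP protocol, which is HTTP-based and works across major versions; hostnames, ports and paths below are placeholders:

hadoop distcp hftp://cdh3-namenode:50070/source/path hdfs://cdh4-namenode:8020/dest/path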
Re: how to copy a table from one hbase cluster to another cluster?
Thanks for the reply. I need to copy the HBase table into another cluster through Java code. Any example will help me. On Wed, Mar 20, 2013 at 8:48 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Samir, Is this what you are looking for? http://hbase.apache.org/book/ops_mgt.html#copytable What kind of help do you need? JM 2013/3/20 samir das mohapatra samir.help...@gmail.com: Hi All, Can you help me to copy one HBase table to another cluster's HBase (table copy)? Regards, samir
Re: how to copy a table from one hbase cluster to another cluster?
Yes, yes, I just thought the same thing. Many, many thanks. On Wed, Mar 20, 2013 at 8:55 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Samir, Have you looked at the link I sent you? You have a command line for that, you have an example, and if you need to do it in Java, you can simply open org.apache.hadoop.hbase.mapreduce.CopyTable, look into it, and do almost the same thing for your needs? JM 2013/3/20 samir das mohapatra samir.help...@gmail.com: Thanks for the reply. I need to copy the HBase table into another cluster through Java code. Any example will help me. On Wed, Mar 20, 2013 at 8:48 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Samir, Is this what you are looking for? http://hbase.apache.org/book/ops_mgt.html#copytable What kind of help do you need? JM 2013/3/20 samir das mohapatra samir.help...@gmail.com: Hi All, Can you help me to copy one HBase table to another cluster's HBase (table copy)? Regards, samir
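For reference, the command-line form documented on the page JM linked looks roughly like this; the destination ZooKeeper quorum and the table name are placeholders, and the Java entry point is the same CopyTable class, whose main()/run() a program can call directly:

hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
  --peer.adr=dest-zk1,dest-zk2,dest-zk3:2181:/hbase MyTable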
Re: How to pull Delta data from one cluster to another cluster ?
How do I pull delta data, meaning filtered data rather than the whole dataset? As of now I only know how to copy whole data with distcp. Could you please correct me if I am wrong, or suggest another way to pull efficiently, like getting data based on a filter condition? On Thu, Mar 14, 2013 at 3:43 PM, Mohammad Tariq donta...@gmail.com wrote: Use distcp. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com On Thu, Mar 14, 2013 at 3:40 PM, samir das mohapatra samir.help...@gmail.com wrote: Regards, samir.
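distcp itself can do a crude incremental copy, though not a real filter: with -update it skips files that already exist on the destination with matching size and checksum, so only new or changed files move. A sketch; the namenode addresses and paths are placeholders:

hadoop distcp -update hdfs://cluster1-nn:8020/data hdfs://cluster2-nn:8020/data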
Re: How to pull Delta data from one cluster to another cluster ?
Will Sqoop support inter-cluster data copy after a filter? Scenario: 1) cluster-1 and cluster-2; 2) taking data from cluster-1 to cluster-2 based on a filter condition. Will Sqoop support that? On Thu, Mar 14, 2013 at 4:19 PM, Tariq donta...@gmail.com wrote: You can do that through Pig. samir das mohapatra samir.help...@gmail.com wrote: How do I pull delta data, meaning filtered data rather than the whole dataset? As of now I only know how to copy whole data with distcp. Could you please correct me if I am wrong, or suggest another way to pull efficiently, like getting data based on a filter condition. On Thu, Mar 14, 2013 at 3:43 PM, Mohammad Tariq donta...@gmail.com wrote: Use distcp. Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com On Thu, Mar 14, 2013 at 3:40 PM, samir das mohapatra samir.help...@gmail.com wrote: Regards, samir. -- Sent from my Android phone with K-9 Mail. Please excuse my brevity.
Is there any way to get a notification from HBase once a record gets updated?
Hi All, is there any way to get a notification from HBase once a record gets updated, like a database trigger? Regards, samir.
Re: How to shuffle (Key,Value) pair from mapper to multiple reducer
You can use a custom Partitioner for the same. Regards, Samir. On Wed, Mar 13, 2013 at 2:29 PM, Vikas Jadhav vikascjadha...@gmail.com wrote: Hi, I am specifying the requirement again with an example. I have a use case where I need to shuffle the same (key,value) pair to multiple reducers. For example, we have the pair (1,ABC) and two reducers (reducer0 and reducer1); by default this pair will go to reducer1 (because (key % numOfReducer) = (1%2)). How should I shuffle this pair to both reducers? I am also willing to change the code of the Hadoop framework if necessary. Thank you On Wed, Mar 13, 2013 at 12:51 PM, feng lu amuseme...@gmail.com wrote: Hi, you can use the Job#setNumReduceTasks(int tasks) method to set the number of reducers for the output. On Wed, Mar 13, 2013 at 2:15 PM, Vikas Jadhav vikascjadha...@gmail.com wrote: Hello, as by default the Hadoop framework shuffles a (key,value) pair to only one reducer, I have a use case where I need to shuffle the same (key,value) pair to multiple reducers. I am also willing to change the code of the Hadoop framework if necessary. Thank you -- Thanx and Regards, Vikas Jadhav -- Don't Grow Old, Grow Up... :-) -- Thanx and Regards, Vikas Jadhav
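A sketch of the custom-Partitioner idea, under the usual assumption that the mapper emits the same record once per target reducer with the target id appended to the key (for example "1#0" and "1#1"); the key format and class name here are illustrative, not a standard API:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class TaggedKeyPartitioner extends Partitioner<Text, Text> {
  @Override
  public int getPartition(Text key, Text value, int numPartitions) {
    // key format assumed: "<originalKey>#<targetReducer>"
    String k = key.toString();
    int tag = Integer.parseInt(k.substring(k.lastIndexOf('#') + 1));
    return tag % numPartitions;
  }
}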
Why is Hadoop spawning two map tasks for a file of size 1.5 KB?
Hi All, I have a very fundamental doubt: I have a file of size 1.5 KB and the block size is the default, yet I can see two mappers being created during the job. Could you please help me understand why that is? Regards, samir.
Re: How can I record some position of context in Reduce()?
You can get it through the RecordReader and FileStatus. On Tue, Mar 12, 2013 at 4:08 PM, Roth Effy effyr...@gmail.com wrote: Hi everyone, I want to join the k-v pairs in reduce(), but how do I get the record position? For now, what I thought of is to save the context status, but the Context class doesn't implement a clone constructor. Any help will be appreciated. Thank you very much.
Re: Hadoop cluster hangs on big hive job
The problem I can see in your log file is that there is no free map slot available for the job. I think you have to increase the block size to reduce the number of maps, because you are passing big data as input. The usual approach is to increase 1) the block size, 2) the map-side sort buffer, 3) JVM reuse, etc. regards, samir. On Fri, Mar 8, 2013 at 1:23 AM, Daning Wang dan...@netseer.com wrote: We have a hive query processing zipped csv files. the query was scanning for 10 days (partitioned by date), data for each day around 130G. The problem is not consistent since if you run it again, it might go through. but the problem has never happened on the smaller jobs (like processing only one day's data). We don't have a space issue. I have attached the log file from when the problem happens. it is stuck like the following (just search for 19706 of 49964) 2013-03-05 15:13:51,587 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_19_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:51,811 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_39_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:52,551 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_32_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:52,760 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_00_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:52,946 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_24_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:54,742 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_08_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) Thanks, Daning On Thu, Mar 7, 2013 at 12:21 AM, Håvard Wahl Kongsgård haavard.kongsga...@gmail.com wrote: hadoop logs? On 6. mars 2013 21:04, Daning Wang dan...@netseer.com wrote: We have a 5-node cluster (Hadoop 1.0.4). It hung a couple of times while running big jobs. Basically all the nodes are dead; from the tasktracker's log it looks like it went into some kind of loop forever. All the log entries look like this when the problem happened. Any idea how to debug the issue? Thanks in advance. 
2013-03-05 15:13:19,526 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_12_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:19,552 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_28_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:20,858 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_36_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:21,141 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_16_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:21,486 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_19_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:21,692 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_39_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:22,448 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_32_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:22,643 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_00_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:22,840 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_24_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:24,628 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_08_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:24,723 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_39_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:25,336 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_04_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:25,539 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_43_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:25,545 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_12_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:25,569 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_28_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:25,855 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_24_0 0.131468% reduce copy (19706 of 49964 at 0.00 MB/s) 2013-03-05 15:13:26,876 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201302270947_0010_r_36_0
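For reference, the tuning knobs mentioned above map to these Hadoop 1.x property names; the values are illustrative, not recommendations:

<!-- hdfs-site.xml : larger blocks mean fewer map tasks for the same input -->
<property><name>dfs.block.size</name><value>268435456</value></property>
<!-- mapred-site.xml : bigger map-side sort buffer -->
<property><name>io.sort.mb</name><value>256</value></property>
<!-- mapred-site.xml : reuse task JVMs within a job (-1 = unlimited) -->
<property><name>mapred.job.reuse.jvm.num.tasks</name><value>-1</value></property>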
Re: Need help optimizing reducer
Austin, I think you have to use a custom Partitioner to make use of more than one reducer for a small data set. The default partitioning will keep routing the keys the same way; you would have to override it and implement your own logic to spread the keys over more than one reducer. On Tue, Mar 5, 2013 at 1:27 AM, Austin Chungath austi...@gmail.com wrote: Hi all, I have 1 reducer and I have around 600 thousand unique keys coming to it. The total data is only around 30 mb. My logic doesn't allow me to have more than 1 reducer. It's taking too long to complete, around 2 hours. (till 66% it's fast then it slows down / I don't really think it has started doing anything till 66% but then why does it show like that?). Are there any job execution parameters that can help improve reducer performance? Any suggestions to improve things when we have to live with just one reducer? thanks, Austin
Re: Issue with sqoop and HANA/ANY DB Schema name
Any help... On Fri, Mar 1, 2013 at 12:06 PM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, I am facing one problem: how do I specify the schema name before the table name while executing the sqoop import statement? $ sqoop import --connect jdbc:sap://host:port/db_name --driver com.sap.db.jdbc.Driver --table SchemaName.Test -m 1 --username --password --target-dir /input/Test1 --verbose Note: without the schema name the sqoop import above works fine, but after adding the schema name it shows an error. Error Logs: Regards, samir.
Re: Issue in Datanode (using CDH4.1.2)
A few more things: the same setup was working on an Ubuntu machine (dev cluster); it is only failing on CentOS 6.3 (prod cluster). On Thu, Feb 28, 2013 at 9:06 PM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, I am facing a strange issue. In a cluster of 1k machines I am able to start and stop the NN, DN, JT, TT and SNN. But the problem is that the NameNode web URL is showing only one datanode. I tried connecting to the nodes through ssh and that works fine, and I have set the NN URL and port in core-site (http://namenode:50070). I also checked the datanode logs and got messages like this: 2013-02-28 06:59:01,652 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: hadoophost1/192.168.1.1:54310 2013-02-28 06:59:07,660 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoophost1/192.168.1.1:54310. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS) Regards, samir.
Issue with sqoop and HANA/ANY DB Schema name
Hi All, I am facing one problem: how do I specify the schema name before the table name while executing the sqoop import statement? $ sqoop import --connect jdbc:sap://host:port/db_name --driver com.sap.db.jdbc.Driver --table SchemaName.Test -m 1 --username --password --target-dir /input/Test1 --verbose Note: without the schema name the sqoop import above works fine, but after adding the schema name it shows an error. Error Logs: Regards, samir.
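One workaround that often helps with the generic JDBC manager, sketched under the assumption that the driver accepts a schema-qualified name inside a plain SELECT: move the qualified table name into a free-form query instead of --table. Credentials and paths are placeholders, and Sqoop requires the literal $CONDITIONS token even with a single mapper:

sqoop import \
  --connect jdbc:sap://host:port/db_name \
  --driver com.sap.db.jdbc.Driver \
  --username myuser --password '***' \
  --query 'SELECT * FROM SchemaName.Test WHERE $CONDITIONS' \
  -m 1 --target-dir /input/Test1 --verbose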
How to use sqoop import
Hi All, can anyone share an example of how to run a sqoop import from the results of a SQL statement? For example, after sqoop import --connect jdbc:... --driver xxx, if I specify --query with a select statement, it is not even recognized as a valid sqoop option. Regards, samir.
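A sketch of a free-form query import with Sqoop 1; the connect string, query and paths are placeholders. Three extra requirements apply when --query replaces --table: the literal token $CONDITIONS must appear in the WHERE clause, --target-dir must be given explicitly, and either --split-by <column> or -m 1 is needed:

sqoop import \
  --connect jdbc:mysql://dbhost/mydb --username dbuser -P \
  --query 'SELECT o.id, o.total FROM orders o WHERE o.total > 100 AND $CONDITIONS' \
  --split-by o.id \
  --target-dir /user/hadoop/orders_big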
How to take a whole database from an RDBMS to HDFS instead of table by table
Hi All, using sqoop, how do I take an entire database into HDFS instead of table by table? How did you do it? Is there some trick? Regards, samir.
Re: How to take Whole Database From RDBMS to HDFS Instead of Table/Table
thanks all. On Wed, Feb 27, 2013 at 4:41 PM, Jagat Singh jagatsi...@gmail.com wrote: You might want to read this http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_import_all_tables_literal On Wed, Feb 27, 2013 at 10:09 PM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, Using sqoop how to take entire database table into HDFS insted of Table by Table ?. How do you guys did it? Is there some trick? Regards, samir.
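A sketch of the command from the user-guide section Jagat linked; the connect string and warehouse directory are placeholders, and every table needs a primary key unless the mapper count is forced to one:

sqoop import-all-tables \
  --connect jdbc:mysql://dbhost/mydb --username dbuser -P \
  --warehouse-dir /user/hadoop/fulldb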
Re: How to take Whole Database From RDBMS to HDFS Instead of Table/Table
Is it a good way to take a total of 5 PB of data through a Java/JDBC program? On Wed, Feb 27, 2013 at 5:56 PM, Michel Segel michael_se...@hotmail.com wrote: I wouldn't use sqoop if you are taking everything. Simpler to write your own java/jdbc program that writes its output to HDFS. Just saying... Sent from a remote device. Please excuse any typos... Mike Segel On Feb 27, 2013, at 5:15 AM, samir das mohapatra samir.help...@gmail.com wrote: thanks all. On Wed, Feb 27, 2013 at 4:41 PM, Jagat Singh jagatsi...@gmail.com wrote: You might want to read this http://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_literal_sqoop_import_all_tables_literal On Wed, Feb 27, 2013 at 10:09 PM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, Using sqoop, how to take an entire database into HDFS instead of table by table? How did you do it? Is there some trick? Regards, samir.
Fwd: ISSUE IN CDH4.1.2: transferring data between different HDFS clusters (using distcp)
-- Forwarded message -- From: samir das mohapatra samir.help...@gmail.com Date: Mon, Feb 25, 2013 at 3:05 PM Subject: ISSUE IN CDH4.1.2 : transfer data between different HDFS clusters.(using distch) To: cdh-u...@cloudera.org Hi All, I am getting bellow error , can any one help me on the same issue, ERROR LOG: -- hadoop@hadoophost2:~$ hadoop distcp hdfs:// 10.192.200.170:50070/tmp/samir.txt hdfs://10.192.244.237:50070/input 13/02/25 01:34:36 INFO tools.DistCp: srcPaths=[hdfs:// 10.192.200.170:50070/tmp/samir.txt] 13/02/25 01:34:36 INFO tools.DistCp: destPath=hdfs:// 10.192.244.237:50070/input With failures, global counters are inaccurate; consider running with -i Copy failed: java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: hadoophost2/10.192.244.237; destination host is: bl1slu040.corp.adobe.com:50070; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:759) at org.apache.hadoop.ipc.Client.call(Client.java:1164) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202) at $Proxy9.getFileInfo(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) at $Proxy9.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:628) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1507) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:783) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1257) at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:636) at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656) at org.apache.hadoop.tools.DistCp.run(DistCp.java:881) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.tools.DistCp.main(DistCp.java:908) Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag. 
at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:73) at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124) at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:213) at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746) at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238) at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282) at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760) at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288) at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752) at org.apache.hadoop.ipc.protobuf.RpcPayloadHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcPayloadHeaderProtos.java:985) at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:882) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:813) Regards, samir
Re: ISSUE IN CDH4.1.2: transferring data between different HDFS clusters (using distcp)
yes On Mon, Feb 25, 2013 at 3:30 PM, Nitin Pawar nitinpawar...@gmail.com wrote: does this match with your issue https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/kIPOvrFaQE8 -- Nitin Pawar
Re: ISSUE IN CDH4.1.2: transferring data between different HDFS clusters (using distcp)
I am using CDH4.1.2 with MRv1, not YARN. On Mon, Feb 25, 2013 at 3:47 PM, samir das mohapatra samir.help...@gmail.com wrote: yes On Mon, Feb 25, 2013 at 3:30 PM, Nitin Pawar nitinpawar...@gmail.com wrote: does this match with your issue https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/kIPOvrFaQE8 -- Nitin Pawar
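A note on the stack trace above: the "Protocol message end-group tag did not match expected tag" error typically shows up when an hdfs:// URI is pointed at the NameNode HTTP port (50070) instead of the RPC port. Two sketches, with the CDH default ports as an assumption (adjust to whatever fs.default.name actually says on each cluster):

hadoop distcp hdfs://10.192.200.170:8020/tmp/samir.txt hdfs://10.192.244.237:8020/input
# or read the source over HFTP, which really does listen on 50070
hadoop distcp hftp://10.192.200.170:50070/tmp/samir.txt hdfs://10.192.244.237:8020/input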
Re: ISSUE :Hadoop with HANA using sqoop
) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) at org.apache.hadoop.mapred.Child$4.run(Child.java:268) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332) at org.apache.hadoop.mapred.Child.main(Child.java:262) Caused by: com.sap.db.jdbc.exceptions.JDBCDriverException: SAP DBTech JDBC: [257]: sql syntax error: incorrect syntax near .: line 1 col 46 (at pos 46) at com.sap.db.jdbc.exceptions.SQLExceptionSapDB.createException(SQLExceptionSapDB.java:334) at com.sap.db.jdbc.exceptions.SQLExceptionSapDB.generateDatabaseException(SQLExceptionSapDB.java:174) at com.sap.db.jdbc.packet.ReplyPacket.buildExceptionChain(ReplyPacket.java:103) at com.sap.db.jdbc.ConnectionSapDB.execute(ConnectionSapDB.java:848) at com.sap.db.jdbc.CallableStatementSapDB.sendCommand(CallableStatementSapDB.java:1874) at com.sap.db.jdbc.StatementSapDB.sendSQL(StatementSapDB.java:945) at com.sap.db.jdbc.CallableStatementSapDB.doParse(CallableStatementSapDB.java:230) at com.sap.db.jdbc.CallableStatementSapDB.constructor(CallableStatementSapDB.java:190) at com.sap.db.jdbc.CallableStatementSapDB.init(CallableStatementSapDB.java:101) at com.sap.db.jdbc.CallableStatementSapDBFinalize.init(CallableStatementSapDBFinalize.java:31) at com.sap.db.jdbc.ConnectionSapDB.prepareStatement(ConnectionSapDB.java:1088) at com.sap.db.jdbc.trace.Connection.prepareStatement(Connection.java:347) at org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:101) at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:236) ... 12 more 2013-02-20 23:10:23,906 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task On Thu, Feb 21, 2013 at 12:03 PM, Harsh J ha...@cloudera.com wrote: The error is truncated, check the actual failed task's logs for complete info: Caused by: com.sap… what? Seems more like a SAP side fault than a Hadoop side one and you should ask on their forums with the stacktrace posted. On Thu, Feb 21, 2013 at 11:58 AM, samir das mohapatra samir.help...@gmail.com wrote: Hi All Can you plese tell me why I am getting error while loading data from SAP HANA to Hadoop HDFS using sqoop (4.1.2). Error Log: java.io.IOException: SQLException in nextKeyValue at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:265) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:458) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139) at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:182) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325) at org.apache.hadoop.mapred.Child$4.run(Child.java:268) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332) at org.apache.hadoop.mapred.Child.main(Child.java:262) Caused by: com.sap Regards, samir. -- -- Harsh J
Fwd: Delivery Status Notification (Failure)
Hi All, I wanted to know how to connect Hive (Hadoop CDH4 distribution) with MicroStrategy. Any help is very welcome; waiting for your response. Note: it is a little bit urgent; does anyone have experience with this? Thanks, samir
Re: Hive Metastore DB Issue ( Cloudera CDH4.1.2 MRv1 with hive-0.9.0-cdh4.1.2)
Hi Suresh, thanks for the advice, but please don't be so restrictive about where the question goes; the point is finding a solution, not policing the problem. Note: I am looking for input from any user; it doesn't matter which list, because it is a common-use scenario. On Fri, Feb 8, 2013 at 3:31 AM, Suresh Srinivas sur...@hortonworks.com wrote: Please only use CDH mailing list and do not copy this to hdfs-user. On Thu, Feb 7, 2013 at 7:20 AM, samir das mohapatra samir.help...@gmail.com wrote: Any Suggestion... On Thu, Feb 7, 2013 at 4:17 PM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, I could not see the Hive metastore DB in MySQL under the MySQL user hadoop. Example:
$ mysql -u root -p
$ Add the hadoop user (CREATE USER 'hadoop'@'localhost' IDENTIFIED BY 'hadoop';)
$ GRANT ALL ON *.* TO 'hadoop'@'%' IDENTIFIED BY 'hadoop'
$ Example (GRANT ALL PRIVILEGES ON *.* TO 'hadoop'@'localhost' IDENTIFIED BY 'hadoop' WITH GRANT OPTION;)
I am following the configuration below:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hadoop?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hadoop</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hadoop</value>
</property>
Note: previously I was using CDH3 and it was creating the metastore DB in MySQL perfectly, but when I changed from CDH3 to CDH4.1.2 with the Hive version in the subject line, it is not being created. Any suggestions? Regards, samir. -- http://hortonworks.com/download/
Why are all map tasks (from a custom Java MapReduce program) assigned to one node?
Hi All, I am using CDH4 with MRv1. When I run any Hadoop MapReduce program from Java, all the map tasks are assigned to one node. It is supposed to distribute the map tasks among the cluster's nodes. Note: 1) my JobTracker web UI is showing 500 nodes; 2) when it comes to the reducer, it is spawned on another node (other than the map node). Can anyone guide me on why it is like this? Regards, samir.
How to Integrate MicroStrategy with Hadoop
Hi All, I wanted to know how to connect Hadoop with MicroStrategy. Any help is very welcome; waiting for your response. Note: any URL or example will be really helpful for me. Thanks, samir
How to Integrate SAP HANA WITH Hadoop
Hi all, we need connectivity between SAP HANA and Hadoop. If you have any experience with that, can you please share some documents and examples with me? It would be a real help. Thanks, samir
Re: How to Integrate MicroStrategy with Hadoop
We are using Cloudera Hadoop. On Thu, Jan 31, 2013 at 2:12 AM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, I wanted to know how to connect Hadoop with MicroStrategy. Any help is very welcome; waiting for your response. Note: any URL or example will be really helpful for me. Thanks, samir
Recommendation required for the right Hadoop distribution (CDH or Hortonworks)
Hi All, my company wants to pick the right Apache Hadoop distribution for production as well as dev. Can anyone suggest which one will be good for the future? Hint: they want to know both the pros and the cons. Regards, samir.
Re: What is the best way to load data from one cluster to another cluster (Urgent requirement)
thanks all. On Thu, Jan 31, 2013 at 11:19 AM, Satbeer Lamba satbeer.la...@gmail.com wrote: I might be wrong but have you considered distcp? On Jan 31, 2013 11:15 AM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, does anyone know how to load data from one Hadoop cluster (CDH4) to another cluster (CDH4)? The project needs are: 1) it should be a delta or incremental load; 2) it should be based on a timestamp; 3) the data volume is 5 PB. Any help? Regards, samir.
Re: Hadoop Nutch Mkdirs failed to create file
Just try applying chmod -R 755 /home/wj/apps/apache-nutch-1.6 and then try again. On Wed, Jan 23, 2013 at 9:23 PM, 吴靖 qhwj2...@126.com wrote: Hi, everyone! I want to use Nutch to crawl web pages, but a problem comes up, as in the log below. I think it may be a permissions problem, but I am not sure. Any help will be appreciated, thank you. 2013-01-23 07:37:21,809 ERROR mapred.FileOutputCommitter - Mkdirs failed to create file:/home/wj/apps/apache-nutch-1.6/bin/crawl/crawldb/190684692/_temporary 2013-01-23 07:37:24,836 WARN mapred.LocalJobRunner - job_local_0002 java.io.IOException: The temporary job-output directory file:/home/wj/apps/apache-nutch-1.6/bin/crawl/crawldb/190684692/_temporary doesn't exist! at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250) at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244) at org.apache.hadoop.mapred.MapFileOutputFormat.getRecordWriter(MapFileOutputFormat.java:46) at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.init(ReduceTask.java:448) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:490) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
Re: different input/output formats
PFA. On Wed, May 30, 2012 at 2:45 AM, Mark question markq2...@gmail.com wrote: Hi Samir, can you email me your main class.. or if you can check mine, it is as follows:

public class SortByNorm1 extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    if (args.length != 2) {
      System.err.printf("Usage: bin/hadoop jar norm1.jar <inputDir> <outputDir>\n");
      ToolRunner.printGenericCommandUsage(System.err);
      return -1;
    }
    JobConf conf = new JobConf(new Configuration(), SortByNorm1.class);
    conf.setJobName("SortDocByNorm1");
    conf.setMapperClass(Norm1Mapper.class);
    conf.setMapOutputKeyClass(FloatWritable.class);
    conf.setMapOutputValueClass(Text.class);
    conf.setNumReduceTasks(0);
    conf.setReducerClass(Norm1Reducer.class);
    conf.setOutputKeyClass(FloatWritable.class);
    conf.setOutputValueClass(Text.class);
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(SequenceFileOutputFormat.class);
    TextInputFormat.addInputPath(conf, new Path(args[0]));
    SequenceFileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);
    return 0;
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new SortByNorm1(), args);
    System.exit(exitCode);
  }
}

On Tue, May 29, 2012 at 1:55 PM, samir das mohapatra samir.help...@gmail.com wrote: Hi Mark, see the output for that same application; I am not getting any error. On Wed, May 30, 2012 at 1:27 AM, Mark question markq2...@gmail.com wrote: Hi guys, this is a very simple program, trying to use TextInputFormat and SequenceFileOutputFormat. Should be easy but I get the same error. Here is my configuration:

    conf.setMapperClass(myMapper.class);
    conf.setMapOutputKeyClass(FloatWritable.class);
    conf.setMapOutputValueClass(Text.class);
    conf.setNumReduceTasks(0);
    conf.setOutputKeyClass(FloatWritable.class);
    conf.setOutputValueClass(Text.class);
    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(SequenceFileOutputFormat.class);
    TextInputFormat.addInputPath(conf, new Path(args[0]));
    SequenceFileOutputFormat.setOutputPath(conf, new Path(args[1]));

myMapper class is:

public class myMapper extends MapReduceBase implements Mapper<LongWritable, Text, FloatWritable, Text> {
  public void map(LongWritable offset, Text val, OutputCollector<FloatWritable, Text> output, Reporter reporter) throws IOException {
    output.collect(new FloatWritable(1), val);
  }
}

But I get the following error:

12/05/29 12:54:31 INFO mapreduce.Job: Task Id : attempt_201205260045_0032_m_00_0, Status : FAILED
java.io.IOException: wrong key class: org.apache.hadoop.io.LongWritable is not class org.apache.hadoop.io.FloatWritable
at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:998)
at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:75)
at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.collect(MapTask.java:705)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:508)
at filter.stat.cosine.preprocess.SortByNorm1$Norm1Mapper.map(SortByNorm1.java:59)
at filter.stat.cosine.preprocess.SortByNorm1$Norm1Mapper.map(SortByNorm1.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:397)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.Use

Where is the writing of LongWritable coming from?? Thank you, Mark
Re: How to mapreduce in the scenario
Yes . Hadoop Is only for Huge Dataset Computaion . May not good for small dataset. On Wed, May 30, 2012 at 6:53 AM, liuzhg liu...@cernet.com wrote: Hi, Mike, Nitin, Devaraj, Soumya, samir, Robert Thank you all for your suggestions. Actually, I want to know if hadoop has any advantage than routine database in performance for solving this kind of problem ( join data ). Best Regards, Gump On Tue, May 29, 2012 at 6:53 PM, Soumya Banerjee soumya.sbaner...@gmail.com wrote: Hi, You can also try to use the Hadoop Reduce Side Join functionality. Look into the contrib/datajoin/hadoop-datajoin-*.jar for the base MAP and Reduce classes to do the same. Regards, Soumya. On Tue, May 29, 2012 at 4:10 PM, Devaraj k devara...@huawei.com wrote: Hi Gump, Mapreduce fits well for solving these types(joins) of problem. I hope this will help you to solve the described problem.. 1. Mapoutput key and value classes : Write a map out put key class(Text.class), value class(CombinedValue.class). Here value class should be able to hold the values from both the files(a.txt and b.txt) as shown below. class CombinedValue implements WritableComparator { String name; int age; String address; boolean isLeft; // flag to identify from which file } 2. Mapper : Write a map() function which can parse from both the files(a.txt, b.txt) and produces common output key and value class. 3. Partitioner : Write the partitioner in such a way that it will Send all the (key, value) pairs to same reducer which are having same key. 4. Reducer : In the reduce() function, you will receive the records from both the files and you can combine those easily. Thanks Devaraj From: liuzhg [liu...@cernet.com] Sent: Tuesday, May 29, 2012 3:45 PM To: common-user@hadoop.apache.org Subject: How to mapreduce in the scenario Hi, I wonder that if Hadoop can solve effectively the question as following: == input file: a.txt, b.txt result: c.txt a.txt: id1,name1,age1,... id2,name2,age2,... id3,name3,age3,... id4,name4,age4,... b.txt: id1,address1,... id2,address2,... id3,address3,... c.txt id1,name1,age1,address1,... id2,name2,age2,address2,... I know that it can be done well by database. But I want to handle it with hadoop if possible. Can hadoop meet the requirement? Any suggestion can help me. Thank you very much! Best Regards, Gump
Re: Small glitch with setting up two node cluster...only secondary node starts (datanode and namenode don't show up in jps)
In your log details I could not find the NN starting; it is a problem with the NN itself. Harsh also suggested the same. On Sun, May 27, 2012 at 10:51 PM, Rohit Pandey rohitpandey...@gmail.com wrote: Hello Hadoop community, I have been trying to set up a double node Hadoop cluster (following the instructions in - http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/ ) and am very close to running it apart from one small glitch - when I start the dfs (using start-dfs.sh), it says: 10.63.88.53: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-pandro51-datanode-ubuntu.out 10.63.88.109: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-pandro51-datanode-pandro51-OptiPlex-960.out 10.63.88.109: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-pandro51-secondarynamenode-pandro51-OptiPlex-960.out starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-pandro51-jobtracker-pandro51-OptiPlex-960.out 10.63.88.109: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-pandro51-tasktracker-pandro51-OptiPlex-960.out 10.63.88.53: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-pandro51-tasktracker-ubuntu.out which looks like it's been successful in starting all the nodes. However, when I check them out by running 'jps', this is what I see: 27531 SecondaryNameNode 27879 Jps As you can see, there is no datanode and name node. I have been racking my brains at this for quite a while now. Checked all the inputs and every thing. Any one know what the problem might be? -- Thanks in advance, Rohit
Re: Small glitch with setting up two node cluster...only secondary node starts (datanode and namenode don't show up in jps)
Step-wise details (Ubuntu 10.x version). Go through it properly and run one step at a time; it will solve your problem (you can change the path, IP and host name as you like):

1. Start the terminal.
2. Disable ipv6 on all machines: pico /etc/sysctl.conf and add these lines at the EOF:
   net.ipv6.conf.all.disable_ipv6 = 1
   net.ipv6.conf.default.disable_ipv6 = 1
   net.ipv6.conf.lo.disable_ipv6 = 1
3. Reboot the system: sudo reboot
4. Install java: sudo apt-get install openjdk-6-jdk openjdk-6-jre
5. Check if ssh is installed, if not do so: sudo apt-get install openssh-server openssh-client
6. Create a group and user called hadoop:
   sudo addgroup hadoop
   sudo adduser --ingroup hadoop hadoop
7. Assign all the permissions to the hadoop user: sudo visudo and add the following line in the file:
   hadoop ALL=(ALL) ALL
8. Check that the hadoop user has ssh set up:
   su hadoop
   ssh-keygen -t rsa -P ""          (press Enter when asked)
   cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
   ssh localhost
   Copy the server's RSA public key from the server to all nodes into the authorized_keys file as shown in the step above.
9. Make the hadoop installation directory: sudo mkdir /usr/local/hadoop
10. Download hadoop:
    cd /usr/local/hadoop
    sudo wget -c http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u2.tar.gz
11. Unzip the tar: sudo tar -zxvf /usr/local/hadoop/hadoop-0.20.2-cdh3u2.tar.gz
12. Change permissions on the hadoop folder by granting all to hadoop:
    sudo chown -R hadoop:hadoop /usr/local/hadoop
    sudo chmod 750 -R /usr/local/hadoop
13. Create the HDFS directory (inside the /usr/local/hadoop folder):
    sudo mkdir hadoop-datastore
    sudo mkdir hadoop-datastore/hadoop-hadoop
14. Add the binaries path and hadoop home in the environment file:
    sudo pico /etc/environment    (set the bin path as well as the hadoop home path)
    source /etc/environment
15. Configure the hadoop-env.sh file:
    cd /usr/local/hadoop/hadoop-0.20.2-cdh3u3/
    sudo pico conf/hadoop-env.sh
    and add the following lines in there:
    export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
    export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
16. Configure core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-datastore/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://IP of namenode:54310</value>
    <description>Location of the Namenode</description>
  </property>
</configuration>
17. Configure hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Default block replication.</description>
  </property>
</configuration>
18. Configure mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>IP of job tracker:54311</value>
    <description>Host and port of the jobtracker.</description>
  </property>
</configuration>
19. Add all the IP addresses in the conf/slaves file:
    sudo pico /usr/local/hadoop/hadoop-0.20.2-cdh3u2/conf/slaves
    Add the list of IP addresses that will host data nodes in this file.

Hadoop commands (now restart the hadoop cluster):
start-all.sh / stop-all.sh
start-dfs.sh / stop-dfs.sh
start-mapred.sh / stop-mapred.sh
hadoop dfs -ls <virtual dfs path>
hadoop dfs -copyFromLocal <local path> <dfs path>
Re: different input/output formats
Hi, I think an attachment will not get through on common-user@hadoop.apache.org, so please have a look below.

MAP

package test;

import java.io.IOException;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class myMapper extends MapReduceBase implements Mapper<LongWritable, Text, FloatWritable, Text> {
    public void map(LongWritable offset, Text val, OutputCollector<FloatWritable, Text> output, Reporter reporter) throws IOException {
        output.collect(new FloatWritable(1), val);
    }
}

REDUCER -- prepare the reducer for whatever exactly you want it to do (a minimal sketch is shown after this thread).

JOB

package test;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class TestDemo extends Configured implements Tool {
    public static void main(String args[]) throws Exception {
        int res = ToolRunner.run(new Configuration(), new TestDemo(), args);
        System.exit(res);
    }

    @Override
    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(TestDemo.class);
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        conf.setJobName("TestCustomInputOutput");
        conf.setMapperClass(myMapper.class);
        conf.setMapOutputKeyClass(FloatWritable.class);
        conf.setMapOutputValueClass(Text.class);
        conf.setNumReduceTasks(0);
        conf.setOutputKeyClass(FloatWritable.class);
        conf.setOutputValueClass(Text.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(SequenceFileOutputFormat.class);
        TextInputFormat.addInputPath(conf, new Path(args[0]));
        SequenceFileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
        return 0;
    }
}

On Wed, May 30, 2012 at 6:57 PM, samir das mohapatra samir.help...@gmail.com wrote: PFA. On Wed, May 30, 2012 at 2:45 AM, Mark question markq2...@gmail.com wrote: Hi Samir, can you email me your main class..
or if you can check mine, it is as follows:

public class SortByNorm1 extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.printf("Usage: bin/hadoop jar norm1.jar <inputDir> <outputDir>\n");
            ToolRunner.printGenericCommandUsage(System.err);
            return -1;
        }
        JobConf conf = new JobConf(new Configuration(), SortByNorm1.class);
        conf.setJobName("SortDocByNorm1");
        conf.setMapperClass(Norm1Mapper.class);
        conf.setMapOutputKeyClass(FloatWritable.class);
        conf.setMapOutputValueClass(Text.class);
        conf.setNumReduceTasks(0);
        conf.setReducerClass(Norm1Reducer.class);
        conf.setOutputKeyClass(FloatWritable.class);
        conf.setOutputValueClass(Text.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(SequenceFileOutputFormat.class);
        TextInputFormat.addInputPath(conf, new Path(args[0]));
        SequenceFileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new SortByNorm1(), args);
        System.exit(exitCode);
    }

On Tue, May 29, 2012 at 1:55 PM, samir das mohapatra samir.help...@gmail.com wrote: Hi Mark, see the output for that same application. I am not getting any error. On Wed, May 30, 2012 at 1:27 AM, Mark question markq2...@gmail.com wrote: Hi guys, this is a very simple program, trying to use TextInputFormat and SequenceFileOutputFormat. Should be easy but I get the same error. Here are my configurations: conf.setMapperClass(myMapper.class); conf.setMapOutputKeyClass(FloatWritable.class); conf.setMapOutputValueClass(Text.class); conf.setNumReduceTasks(0); conf.setOutputKeyClass
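Since the REDUCER slot in my code above is left open, here is a minimal identity-style reducer matching the map output types. This is only a sketch (the job above disables reduces with setNumReduceTasks(0), so you would only need it if you turn reduces back on):

package test;

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class myReducer extends MapReduceBase implements Reducer<FloatWritable, Text, FloatWritable, Text> {
    public void reduce(FloatWritable key, Iterator<Text> values,
            OutputCollector<FloatWritable, Text> output, Reporter reporter) throws IOException {
        // pass every value through unchanged
        while (values.hasNext()) {
            output.collect(key, values.next());
        }
    }
}

To run the job above, something along the lines of (jar name assumed): hadoop jar test.jar test.TestDemo <input path> <output path>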
Re: How to Integrate LDAP in Hadoop ?
It is the Cloudera version, 0.20. On Tue, May 29, 2012 at 4:14 PM, Michel Segel michael_se...@hotmail.com wrote: Which release? Version? I believe there are variables in the *-site.xml that allow LDAP integration ... Sent from a remote device. Please excuse any typos... Mike Segel On May 26, 2012, at 7:40 AM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, Did anyone work on hadoop with LDAP integration? Please help me with the same. Thanks samir
Re: How to mapreduce in the scenario
Yes, it is possible by using the MultipleInputs format with multiple mappers (basically two different mappers).

Step 1: wire each input to its own mapper class:

MultipleInputs.addInputPath(conf, new Path(args[0]), TextInputFormat.class, *Mapper1.class*);
MultipleInputs.addInputPath(conf, new Path(args[1]), TextInputFormat.class, *Mapper2.class*);

While writing each mapper's output value, prepend an identifier (*output.collect(new Text(key), new Text(identifier + "~" + value));*) related to a.txt and b.txt, so that it is easy to tell the two mappers' output apart within the reducer.

Step 2: (you can also put b.txt in the DistributedCache and compare the reducer values against the b.txt list.) In the reducer, split each value and check the identifier:

String currValue = values.next().toString();
String valueSplitted[] = currValue.split("~");
if (valueSplitted[0].equals("A")) // A: identifier from the A mapper
{
    // process the A file here
}
else if (valueSplitted[0].equals("B")) // B: identifier from the B mapper
{
    // process the B file here
}
output.collect(new Text(key), new Text("formatted value, as you want it displayed"));

Decide the key according to the result you want to produce. After that, use one reducer to produce the output. (A fuller end-to-end sketch appears right after this thread.)

thanks samir

On Tue, May 29, 2012 at 3:45 PM, liuzhg liu...@cernet.com wrote: Hi, I wonder whether Hadoop can effectively solve the following question: == input files: a.txt, b.txt; result: c.txt. a.txt: id1,name1,age1,... id2,name2,age2,... id3,name3,age3,... id4,name4,age4,... b.txt: id1,address1,... id2,address2,... id3,address3,... c.txt: id1,name1,age1,address1,... id2,name2,age2,address2,... I know that it can be done well by a database. But I want to handle it with hadoop if possible. Can hadoop meet the requirement? Any suggestion can help me. Thank you very much! Best Regards, Gump
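A rough end-to-end sketch of the approach above, using the old mapred API (the "~" separator, the field positions, and the class names are assumptions; adjust them to the real layouts of a.txt and b.txt):

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapred.lib.MultipleInputs;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class JoinJob extends Configured implements Tool {

    // Mapper for a.txt: key = id, value = "A~name,age,..."
    public static class AMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable offset, Text line, OutputCollector<Text, Text> out, Reporter r) throws IOException {
            String[] f = line.toString().split(",", 2);      // id, rest of the record
            out.collect(new Text(f[0]), new Text("A~" + f[1]));
        }
    }

    // Mapper for b.txt: key = id, value = "B~address,..."
    public static class BMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable offset, Text line, OutputCollector<Text, Text> out, Reporter r) throws IOException {
            String[] f = line.toString().split(",", 2);
            out.collect(new Text(f[0]), new Text("B~" + f[1]));
        }
    }

    // Reducer: stitch the A part and the B part for the same id
    public static class JoinReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> {
        public void reduce(Text id, Iterator<Text> values, OutputCollector<Text, Text> out, Reporter r) throws IOException {
            String a = "", b = "";
            while (values.hasNext()) {
                String[] v = values.next().toString().split("~", 2);
                if (v[0].equals("A")) { a = v[1]; } else if (v[0].equals("B")) { b = v[1]; }
            }
            out.collect(id, new Text(a + "," + b));          // id  name,age,...,address,...
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        JobConf conf = new JobConf(getConf(), JoinJob.class);
        conf.setJobName("JoinAandB");
        MultipleInputs.addInputPath(conf, new Path(args[0]), TextInputFormat.class, AMapper.class);
        MultipleInputs.addInputPath(conf, new Path(args[1]), TextInputFormat.class, BMapper.class);
        conf.setReducerClass(JoinReducer.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(conf, new Path(args[2]));
        JobClient.runJob(conf);
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new JoinJob(), args));
    }
}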
Re: different input/output formats
Hi Mark,

public void map(LongWritable offset, Text val, OutputCollector<FloatWritable, Text> output, Reporter reporter) throws IOException {
    output.collect(new FloatWritable(*1*), val); *// change 1 to 1.0f, then it will work.*
}

Let me know the status after the change.

On Wed, May 30, 2012 at 1:27 AM, Mark question markq2...@gmail.com wrote: Hi guys, this is a very simple program, trying to use TextInputFormat and SequenceFileOutputFormat. Should be easy but I get the same error. Here are my configurations: conf.setMapperClass(myMapper.class); conf.setMapOutputKeyClass(FloatWritable.class); conf.setMapOutputValueClass(Text.class); conf.setNumReduceTasks(0); conf.setOutputKeyClass(FloatWritable.class); conf.setOutputValueClass(Text.class); conf.setInputFormat(TextInputFormat.class); conf.setOutputFormat(SequenceFileOutputFormat.class); TextInputFormat.addInputPath(conf, new Path(args[0])); SequenceFileOutputFormat.setOutputPath(conf, new Path(args[1])); myMapper class is: public class myMapper extends MapReduceBase implements Mapper<LongWritable,Text,FloatWritable,Text> { public void map(LongWritable offset, Text val, OutputCollector<FloatWritable,Text> output, Reporter reporter) throws IOException { output.collect(new FloatWritable(1), val); } } But I get the following error: 12/05/29 12:54:31 INFO mapreduce.Job: Task Id : attempt_201205260045_0032_m_00_0, Status : FAILED java.io.IOException: wrong key class: org.apache.hadoop.io.LongWritable is not class org.apache.hadoop.io.FloatWritable at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:998) at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:75) at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.collect(MapTask.java:705) at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:508) at filter.stat.cosine.preprocess.SortByNorm1$Norm1Mapper.map(SortByNorm1.java:59) at filter.stat.cosine.preprocess.SortByNorm1$Norm1Mapper.map(SortByNorm1.java:1) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:397) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) at org.apache.hadoop.mapred.Child$4.run(Child.java:217) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.Use Where is the writing of LongWritable coming from ?? Thank you, Mark
Re: different input/output formats
Hi Mark, see the output for that same application. I am not getting any error.

On Wed, May 30, 2012 at 1:27 AM, Mark question markq2...@gmail.com wrote: Hi guys, this is a very simple program, trying to use TextInputFormat and SequenceFileOutputFormat. Should be easy but I get the same error. Here are my configurations: conf.setMapperClass(myMapper.class); conf.setMapOutputKeyClass(FloatWritable.class); conf.setMapOutputValueClass(Text.class); conf.setNumReduceTasks(0); conf.setOutputKeyClass(FloatWritable.class); conf.setOutputValueClass(Text.class); conf.setInputFormat(TextInputFormat.class); conf.setOutputFormat(SequenceFileOutputFormat.class); TextInputFormat.addInputPath(conf, new Path(args[0])); SequenceFileOutputFormat.setOutputPath(conf, new Path(args[1])); myMapper class is: public class myMapper extends MapReduceBase implements Mapper<LongWritable,Text,FloatWritable,Text> { public void map(LongWritable offset, Text val, OutputCollector<FloatWritable,Text> output, Reporter reporter) throws IOException { output.collect(new FloatWritable(1), val); } } But I get the following error: 12/05/29 12:54:31 INFO mapreduce.Job: Task Id : attempt_201205260045_0032_m_00_0, Status : FAILED java.io.IOException: wrong key class: org.apache.hadoop.io.LongWritable is not class org.apache.hadoop.io.FloatWritable at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:998) at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:75) at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.collect(MapTask.java:705) at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:508) at filter.stat.cosine.preprocess.SortByNorm1$Norm1Mapper.map(SortByNorm1.java:59) at filter.stat.cosine.preprocess.SortByNorm1$Norm1Mapper.map(SortByNorm1.java:1) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:397) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) at org.apache.hadoop.mapred.Child$4.run(Child.java:217) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.Use Where is the writing of LongWritable coming from ?? Thank you, Mark
How to configure application for External jar
Hi All, How do I configure an external jar that is used by the application internally? For example: JDBC, Hive driver, etc. Note: I don't have permission to start and stop the hadoop machines, so I need to configure it at the application level (not at the hadoop level). If we put the jar inside the lib folder of hadoop, then I think we need to restart hadoop; without doing this, is there any other way to do so? Thanks samir
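One common way to do this without touching the cluster's lib directory or restarting anything is to ship the jars with the job itself. A sketch, assuming the driver class goes through ToolRunner/GenericOptionsParser (the paths and class names below are placeholders):

export HADOOP_CLASSPATH=/home/samir/lib/hive-jdbc.jar:/home/samir/lib/mysql-connector.jar
hadoop jar myapp.jar com.example.MyDriver -libjars /home/samir/lib/hive-jdbc.jar,/home/samir/lib/mysql-connector.jar <input> <output>

HADOOP_CLASSPATH covers the client JVM that submits the job, while -libjars copies the jars into the job's distributed cache so the map and reduce tasks see them on their classpath; no Hadoop restart is needed.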
Re: Right way to implement MR ?
Thanks Harsh J for your help. On Thu, May 24, 2012 at 1:24 AM, Harsh J ha...@cloudera.com wrote: Samir, You can use MultipleInputs for multiple forms of inputs per mapper (with their own input K/V types, but common output K/V types) with a common reduce-side join/compare. See http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html . On Thu, May 24, 2012 at 1:17 AM, samir das mohapatra samir.help...@gmail.com wrote: Hi All, How do I compare two input files in an M/R job? Say log file A is around 30 GB and log file B is around 60 GB. I wanted to know how I should define the K,V inside the mapper. Thanks samir. -- Harsh J
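For reference, with the new-API MultipleInputs that Harsh links to, the wiring might look roughly like this (Mapper1, Mapper2, and JoinReducer are placeholder class names; both mappers must emit the same key/value types):

MultipleInputs.addInputPath(job, new Path(aLogPath), TextInputFormat.class, Mapper1.class);
MultipleInputs.addInputPath(job, new Path(bLogPath), TextInputFormat.class, Mapper2.class);
job.setReducerClass(JoinReducer.class);

The reducer then sees the values from both files grouped under the same key and can do the comparison there.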
Re: RemoteException writing files
Hi, this could be due to the following reasons: 1) The *NameNode http://wiki.apache.org/hadoop/NameNode* does not have any available DataNodes. 2) The Namenode was not able to start properly. 3) Otherwise, some IP issue. Note: please mention localhost instead of 127.0.0.1 (if it is local). Follow this URL: http://wiki.apache.org/hadoop/FAQ#What_does_.22file_could_only_be_replicated_to_0_nodes.2C_instead_of_1.22_mean.3F Thanks samir On Sat, May 19, 2012 at 8:59 PM, Todd McFarland toddmcf2...@gmail.com wrote: Hi folks, (Resending to this group, sent to common-dev before, pretty sure that's for Hadoop internal development - sorry for that..) I'm pretty stuck here. I've been researching for hours and I haven't made any forward progress on this one. I have a vmWare installation of Cloudera Hadoop 0.20. The following commands to create a directory and copy a file from the shared folder *work fine*, so I'm confident everything is set up correctly: [cloudera@localhost bin]$ hadoop fs -mkdir /user/cloudera/testdir [cloudera@localhost bin]$ hadoop fs -put /mnt/hgfs/shared_folder/file1.txt /user/cloudera/testdir/file1.txt The file shows up fine in HDFS doing it this way on the Linux VM. *However*, when I try doing the equivalent operation in Java everything works great until I try to close() the FSDataOutputStream. I'm left with the new directory and a zero byte size file. One suspicious thing is that the user is admin instead of cloudera, which I haven't figured out why. Here is the error: 12/05/19 09:45:46 INFO hdfs.DFSClient: Exception in createBlockOutputStream 127.0.0.1:50010 java.net.ConnectException: Connection refused: no further information 12/05/19 09:45:46 INFO hdfs.DFSClient: Abandoning block blk_1931357292676354131_1068 12/05/19 09:45:46 INFO hdfs.DFSClient: Excluding datanode 127.0.0.1:50010 12/05/19 09:45:46 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/admin/testdir/file1.txt could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1533) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:667) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) There are certainly lots of search references to *could only be replicated to 0 nodes, instead of 1* but chasing down those suggestions hasn't helped. I have run *jps* and *netstat* and those look good. All services are running, all ports seem to be good. The *health check* looks good, plenty of disk space, no failed nodes...
Here is the Java (it fails when it hits fs.close()):

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TestFileTrans {

    public static void main(String[] args) {
        Configuration config = new Configuration();
        config.addResource(new Path("c:/_bigdata/client_libs/core-site.xml"));
        config.addResource(new Path("c:/_bigdata/client_libs/hdfs-site.xml"));
        System.out.println("hadoop.tmp.dir: " + config.get("hadoop.tmp.dir"));
        try {
            FileSystem dfs = FileSystem.get(config);
            // this will default to admin unless the workingDirectory is explicitly set..
            System.out.println("HDFS Working Directory: " + dfs.getWorkingDirectory().toString());
            String dirName = "testdir";
            Path src = new Path(dfs.getWorkingDirectory() + "/" + dirName);
            dfs.mkdirs(src);
            System.out.println("HDFS Directory created: " + dfs.getWorkingDirectory().toString());
            loadFile(dfs, src);
        } catch (IOException e) {
            System.out.println("Error " + e.getMessage());
        }
    }

    private static void loadFile(FileSystem dfs, Path src) throws IOException {
        FileInputStream fis = new FileInputStream("c:/_bigdata/shared_folder/file1.txt");
        int len = fis.available();
        byte[] btr = new byte[len];
        fis.read(btr);
        FSDataOutputStream fs = dfs.create(new Path(src.toString() + "/file1.txt"));
        fs.write(btr);
        fs.flush();
        fs.close();
    }
}

Any help would be greatly appreciated!
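As a side note, once the datanode connectivity issue is sorted out, the manual stream copy in loadFile() can be replaced by letting the client library do the copy; a minimal sketch using the same Configuration object as above (the paths are the ones from Todd's code, kept as placeholders):

FileSystem dfs = FileSystem.get(config);
// copy the local file straight into the working directory's testdir
dfs.copyFromLocalFile(new Path("c:/_bigdata/shared_folder/file1.txt"),
        new Path(dfs.getWorkingDirectory(), "testdir/file1.txt"));
dfs.close();

This does not by itself fix the "could only be replicated to 0 nodes" error, which usually points at the client not being able to reach any datanode (for example the 127.0.0.1:50010 address the VM is advertising).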
Re: hadoop File loading
Hi, your requirement is that your M/R job should see the full XML file while operating. (If that is right, then please try the approach below.) You can put this XML file in the DistributedCache, which is shared across the M/R tasks, so that you get the whole XML instead of a chunk of the data. Thanks Samir On Tue, May 15, 2012 at 11:30 PM, @dataElGrande markydale...@gmail.com wrote: You should check out Pentaho's howto's dealing with Hadoop and MapReducer. Hope this helps! http://wiki.pentaho.com/display/BAD/How+To%27s hari708 wrote: Hi, I have a big file consisting of XML data. The XML is not represented as a single line in the file. If we stream this file using the ./hadoop dfs -put command to a hadoop directory, how does the distribution happen? Basically, in my mapreduce program I am expecting a complete XML as my input. I have a CustomReader (for XML) in my mapreduce job configuration. My main confusion is: if the namenode distributes data to DataNodes, there is a chance that one part of the xml can go to one data node and the other half can go to another datanode. If that is the case, will my custom XMLReader in the mapreduce be able to combine it (as mapreduce reads data locally only)? Please help me on this? -- View this message in context: http://old.nabble.com/hadoop-File-loading-tp32871902p33849683.html Sent from the Hadoop core-user mailing list archive at Nabble.com.
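If it helps, the DistributedCache wiring for the old mapred API might look roughly like this (the HDFS path and the configure() hook are assumptions; adjust them to your own job):

// in the driver, after the XML file has been uploaded to HDFS:
DistributedCache.addCacheFile(new URI("/user/hadoop/config/big.xml"), conf);

// in the mapper's configure(JobConf) method (wrap in try/catch for IOException):
Path[] cached = DistributedCache.getLocalCacheFiles(conf);
// cached[0] now points at a local copy of big.xml that every task can read in full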
Re: Moving files from JBoss server to HDFS
Hi financeturd financet...@yahoo.com, from my point of view the second setup, like below, is the better approach: {Separate server} -- {JBoss server} and then {Separate server} -- HDFS. thanks samir On Sat, May 12, 2012 at 6:00 AM, financeturd financeturd financet...@yahoo.com wrote: Hello, We have a large number of custom-generated files (not just web logs) that we need to move from our JBoss servers to HDFS. Our first implementation ran a cron job every 5 minutes to move our files from the output directory to HDFS. Is this recommended? We are being told by our IT team that our JBoss servers should not have access to HDFS for security reasons. The files must be sucked to HDFS by other servers that do not accept traffic from the outside. In essence, they are asking for a layer of indirection. Instead of: {JBoss server} -- {HDFS} it's being requested that it look like: {Separate server} -- {JBoss server} and then {Separate server} -- HDFS While I understand in principle what is being said, the security of having processes on JBoss servers writing files to HDFS doesn't seem any worse than having Tomcat servers access a central database, which they do. Can anyone comment on what a recommended approach would be? Should our JBoss servers push their data to HDFS or should the data be pulled by another server and then placed into HDFS? Thank you! FT
Re: java.io.IOException: Task process exit with nonzero status of 1
Hi Mohit, 1) Hadoop is more portable with Linux, Ubuntu, or any non-DOS file system, but you are running hadoop on Windows; that could be the problem, because hadoop generates some partial output files for temporary use. 2) Another thing is that you are running hadoop version 0.19; I think if you upgrade the version it will solve your problem, because the example you are using has some problems with file read and write on Windows OS. 3) Check your input file data, because I can see your mapper is also at 0%. 4) If you are all right with the whole scenario, please could you share your logs under hadoopversion/logs? From there itself we can trace it very clearly. Thanks SAMIR On Fri, May 11, 2012 at 12:26 PM, Mohit Kundra mohit@gmail.com wrote: Hi, I am a new user to hadoop. I have installed hadoop 0.19.1 on a single Windows machine. Its http://localhost:50030/jobtracker.jsp and http://localhost:50070/dfshealth.jsp pages are working fine, but when I am executing bin/hadoop jar hadoop-0.19.1-examples.jar pi 5 100 it is showing the below: $ bin/hadoop jar hadoop-0.19.1-examples.jar pi 5 100 cygpath: cannot create short name of D:\hadoop-0.19.1\logs Number of Maps = 5 Samples per Map = 100 Wrote input for Map #0 Wrote input for Map #1 Wrote input for Map #2 Wrote input for Map #3 Wrote input for Map #4 Starting Job 12/05/11 12:07:26 INFO mapred.JobClient: Running job: job_20120513_0002 12/05/11 12:07:27 INFO mapred.JobClient: map 0% reduce 0% 12/05/11 12:07:35 INFO mapred.JobClient: Task Id : attempt_20120513_0002_m_06_ 0, Status : FAILED java.io.IOException: Task process exit with nonzero status of 1. at org.apache.hadoop.mapred.TaskRunner.run (TaskRunner.java:425) Please tell me what is the root cause. regards, Mohit
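If you do share the logs, on a default 0.19.x layout the per-task output usually sits under the installation's logs directory; the exact paths vary with the install, so treat these as an assumption:

hadoop-0.19.1/logs/userlogs/attempt_.../stdout
hadoop-0.19.1/logs/userlogs/attempt_.../stderr
hadoop-0.19.1/logs/userlogs/attempt_.../syslog
hadoop-0.19.1/logs/hadoop-*-tasktracker-*.log

The tasktracker log and the attempt's stderr are typically where the real reason behind "Task process exit with nonzero status of 1" shows up.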