Sending the entire file content as value to the mapper

2013-07-11 Thread Kasi Subrahmanyam
Hi Team,

I have a file which has semi-structured text data with no definite start
and end points.
How can I send the entire content of the file at once as key or value to
the mapper instead of line by line.

Thanks,
Subbu


RE: Sending the entire file content as value to the mapper

2013-07-11 Thread Charles Baker
Hi Subbu. Sounds like you'll have to implement a custom non-splittable
InputFormat which instantiates a custom RecordReader which in turn consumes
the entire file when its next(K,V) method is called. Once implemented, you
specify the input format on the JobConf object:

 

conf.setInputFormat(MyInputFormat.class);
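
A minimal sketch of such a non-splittable input format (old mapred API; the
class names below are just illustrative, not existing Hadoop classes) could
look roughly like this:

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.*;

public class WholeFileInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

  @Override
  protected boolean isSplitable(FileSystem fs, Path file) {
    return false;  // never split, so one mapper sees the whole file
  }

  @Override
  public RecordReader<NullWritable, BytesWritable> getRecordReader(
      InputSplit split, JobConf job, Reporter reporter) throws IOException {
    return new WholeFileRecordReader((FileSplit) split, job);
  }
}

class WholeFileRecordReader implements RecordReader<NullWritable, BytesWritable> {
  private final FileSplit split;
  private final JobConf conf;
  private boolean processed = false;

  WholeFileRecordReader(FileSplit split, JobConf conf) {
    this.split = split;
    this.conf = conf;
  }

  // Emits exactly one record: the whole file as a BytesWritable value.
  public boolean next(NullWritable key, BytesWritable value) throws IOException {
    if (processed) return false;
    byte[] contents = new byte[(int) split.getLength()];
    Path file = split.getPath();
    FileSystem fs = file.getFileSystem(conf);
    FSDataInputStream in = fs.open(file);
    try {
      IOUtils.readFully(in, contents, 0, contents.length);
      value.set(contents, 0, contents.length);
    } finally {
      IOUtils.closeStream(in);
    }
    processed = true;
    return true;
  }

  public NullWritable createKey() { return NullWritable.get(); }
  public BytesWritable createValue() { return new BytesWritable(); }
  public long getPos() { return processed ? split.getLength() : 0; }
  public void close() {}
  public float getProgress() { return processed ? 1.0f : 0.0f; }
}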

 

 

-Chuck

 

From: Kasi Subrahmanyam [mailto:kasisubbu...@gmail.com] 
Sent: Thursday, July 11, 2013 1:08 AM
To: common-u...@hadoop.apache.org; mapreduce-user@hadoop.apache.org
Subject: Sending the entire file content as value to the mapper

 

Hi Team,

 

I have a file which has semi structured text  data with no definite start and
end points.

How can i send the entire content of the file at once as key or value to the
mapper instead of line by line.

 

Thanks,

Subbu



Sending the entire file content as value to the mapper

2013-07-11 Thread Kasi Subrahmanyam
Hi Team,

I have a file which has semi-structured text data with no definite start
and end points.
How can I send the entire content of the file at once as key or value to
the mapper instead of line by line.

Thanks,
Subbu


RE: Sending the entire file content as value to the mapper

2013-07-11 Thread Devaraj k
Hi,

  You could send the file meta info (e.g. the file path) to the map function as
key/value through the split, and then read the entire file in your map function.
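
A rough sketch of that idea (assuming the job's input records carry HDFS file
paths, e.g. one path per line via NLineInputFormat; the class name is just
illustrative):

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WholeFileByPathMapper extends Mapper<LongWritable, Text, Text, Text> {
  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    Path file = new Path(value.toString().trim());              // input value is a file path
    FileSystem fs = file.getFileSystem(context.getConfiguration());
    byte[] contents = new byte[(int) fs.getFileStatus(file).getLen()];
    FSDataInputStream in = fs.open(file);
    try {
      IOUtils.readFully(in, contents, 0, contents.length);      // read the whole file at once
    } finally {
      IOUtils.closeStream(in);
    }
    context.write(new Text(file.getName()), new Text(new String(contents, "UTF-8")));
  }
}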

Thanks
Devaraj k


-Original Message-
From: Kasi Subrahmanyam [mailto:kasisubbu...@gmail.com] 
Sent: 11 July 2013 13:38
To: common-user@hadoop.apache.org; mapreduce-u...@hadoop.apache.org
Subject: Sending the entire file content as value to the mapper

Hi Team,

I have a file which has semi structured text  data with no definite start and 
end points.
How can i send the entire content of the file at once as key or value to the 
mapper instead of line by line.

Thanks,
Subbu


Re: Task failure in slave node

2013-07-11 Thread devara...@huawei.com
Hi,

   It seems mahout-examples-0.7-job.jar depends on other jars/classes. While
running the job tasks, it is not able to find those classes in the classpath,
so those tasks fail.

You need to provide the dependent jar files while submitting/running the job.


Thanks
Devaraj k




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Task-failure-in-slave-node-tp4077284p4077290.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.


Datanodes using public ip, why?

2013-07-11 Thread Ben Kim
Hello Hadoop Community!

I've set up the datanodes on a private network by adding private hostnames to
the slaves file,
but it looks like, when I look them up in the web UI, the datanodes are
registered with public hostnames.

Are they actually communicating over the public network?

All datanodes have eth0 with a public address and eth1 with a private address.

What am I missing?

Thanks a whole lot

*Benjamin Kim*
*benkimkimben at gmail*


Re: Datanodes using public ip, why?

2013-07-11 Thread Thanh Do
have you tried playing with this config parameter

dfs.datanode.dns.interface

?
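
For example, in hdfs-site.xml on the datanodes (eth1 below is just an
assumption for the private NIC):

<property>
  <name>dfs.datanode.dns.interface</name>
  <value>eth1</value>
</property>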


On Thu, Jul 11, 2013 at 4:20 AM, Ben Kim benkimkim...@gmail.com wrote:

 Hello Hadoop Community!

 I've setup datanodes with private network by adding private hostname's to
 the slaves file.
 but it looks like when i lookup on the webUI datenodes are registered with
 public hostnames.

 are they actually networking with public network?

 all datanodes have eth0 with public address and eth1 with private address.

 what am i missing?

 Thanks a whole lot

 *Benjamin Kim*
 *benkimkimben at gmail*



Re: Datanodes using public ip, why?

2013-07-11 Thread Alex Levin
Make sure that your hostnames resolve (via DNS and/or hosts files) to
private IPs.

If the nodes' hosts files contain records like
public-IP hostname

remove (or comment out) them.
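
For example, in /etc/hosts (the addresses and names below are just placeholders):

10.0.0.11       dn1.internal      dn1
# 203.0.113.11  dn1.example.com   dn1    <-- comment out public entries like this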

Alex
On Jul 11, 2013 2:21 AM, Ben Kim benkimkim...@gmail.com wrote:

 Hello Hadoop Community!

 I've setup datanodes with private network by adding private hostname's to
 the slaves file.
 but it looks like when i lookup on the webUI datenodes are registered with
 public hostnames.

 are they actually networking with public network?

 all datanodes have eth0 with public address and eth1 with private address.

 what am i missing?

 Thanks a whole lot

 *Benjamin Kim*
 *benkimkimben at gmail*



Re: ConnectionException in container, happens only sometimes

2013-07-11 Thread Andrei
Here are logs of RM and 2 NMs:

RM (master-host): http://pastebin.com/q4qJP8Ld
NM where AM ran (slave-1-host): http://pastebin.com/vSsz7mjG
NM where slave container ran (slave-2-host): http://pastebin.com/NMFi6gRp

The only related error I've found in them is the following (from RM logs):

...
2013-07-11 07:46:06,225 ERROR
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService:
AppAttemptId doesnt exist in cache appattempt_1373465780870_0005_01
2013-07-11 07:46:06,227 WARN org.apache.hadoop.ipc.Server: IPC Server
Responder, call org.apache.hadoop.yarn.api.AMRMProtocolPB.allocate from
10.128.40.184:47101: output error
2013-07-11 07:46:06,228 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 0 on 8030 caught an exception
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:265)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:456)
at org.apache.hadoop.ipc.Server.channelWrite(Server.java:2140)
at org.apache.hadoop.ipc.Server.access$2000(Server.java:108)
at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:939)
at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:1005)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1747)
2013-07-11 07:46:11,238 INFO org.apache.hadoop.yarn.util.RackResolver:
Resolved my_user to /default-rack
2013-07-11 07:46:11,283 INFO
org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService:
NodeManager from node my_user(cmPort: 59267 httpPort: 8042) registered with
capability: 8192, assigned nodeId my_user:59267
...

Though from the stack trace it's hard to tell where this error came from.

Let me know if you need any more information.










On Thu, Jul 11, 2013 at 1:00 AM, Andrei faithlessfri...@gmail.com wrote:

 Hi Omkar,

 I'm out of office now, so I'll post it as fast as get back there.

 Thanks


 On Thu, Jul 11, 2013 at 12:39 AM, Omkar Joshi ojo...@hortonworks.comwrote:

 can you post RM/NM logs too.?

 Thanks,
 Omkar Joshi
 *Hortonworks Inc.* http://www.hortonworks.com




Task failure in slave node

2013-07-11 Thread Margusja

Hi

I have two nodes:
n1 (master, slave) and n2 (slave)

After setup I ran the wordcount example and it worked fine:
[hduser@n1 ~]$ hadoop jar /usr/local/hadoop/hadoop-examples-1.0.4.jar 
wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output
13/07/11 15:30:44 INFO input.FileInputFormat: Total input paths to 
process : 7
13/07/11 15:30:44 INFO util.NativeCodeLoader: Loaded the native-hadoop 
library

13/07/11 15:30:44 WARN snappy.LoadSnappy: Snappy native library not loaded
13/07/11 15:30:44 INFO mapred.JobClient: Running job: job_201307111355_0015
13/07/11 15:30:45 INFO mapred.JobClient:  map 0% reduce 0%
13/07/11 15:31:03 INFO mapred.JobClient:  map 42% reduce 0%
13/07/11 15:31:06 INFO mapred.JobClient:  map 57% reduce 0%
13/07/11 15:31:09 INFO mapred.JobClient:  map 71% reduce 0%
13/07/11 15:31:15 INFO mapred.JobClient:  map 100% reduce 0%
13/07/11 15:31:18 INFO mapred.JobClient:  map 100% reduce 23%
13/07/11 15:31:27 INFO mapred.JobClient:  map 100% reduce 100%
13/07/11 15:31:32 INFO mapred.JobClient: Job complete: job_201307111355_0015
13/07/11 15:31:32 INFO mapred.JobClient: Counters: 30
13/07/11 15:31:32 INFO mapred.JobClient:   Job Counters
13/07/11 15:31:32 INFO mapred.JobClient: Launched reduce tasks=1
13/07/11 15:31:32 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=67576
13/07/11 15:31:32 INFO mapred.JobClient: Total time spent by all 
reduces waiting after reserving slots (ms)=0
13/07/11 15:31:32 INFO mapred.JobClient: Total time spent by all 
maps waiting after reserving slots (ms)=0

13/07/11 15:31:32 INFO mapred.JobClient: Rack-local map tasks=3
13/07/11 15:31:32 INFO mapred.JobClient: Launched map tasks=7
13/07/11 15:31:32 INFO mapred.JobClient: Data-local map tasks=4
13/07/11 15:31:32 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=21992
13/07/11 15:31:32 INFO mapred.JobClient:   File Output Format Counters
13/07/11 15:31:32 INFO mapred.JobClient: Bytes Written=1412505
13/07/11 15:31:32 INFO mapred.JobClient:   FileSystemCounters
13/07/11 15:31:32 INFO mapred.JobClient: FILE_BYTES_READ=5414195
13/07/11 15:31:32 INFO mapred.JobClient: HDFS_BYTES_READ=6950820
13/07/11 15:31:32 INFO mapred.JobClient: FILE_BYTES_WRITTEN=8744993
13/07/11 15:31:32 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=1412505
13/07/11 15:31:32 INFO mapred.JobClient:   File Input Format Counters
13/07/11 15:31:32 INFO mapred.JobClient: Bytes Read=6950001
13/07/11 15:31:32 INFO mapred.JobClient:   Map-Reduce Framework
13/07/11 15:31:32 INFO mapred.JobClient: Map output materialized 
bytes=3157469

13/07/11 15:31:32 INFO mapred.JobClient: Map input records=137146
13/07/11 15:31:32 INFO mapred.JobClient: Reduce shuffle bytes=2904836
13/07/11 15:31:32 INFO mapred.JobClient: Spilled Records=594764
13/07/11 15:31:32 INFO mapred.JobClient: Map output bytes=11435849
13/07/11 15:31:32 INFO mapred.JobClient: Total committed heap usage 
(bytes)=1128136704

13/07/11 15:31:32 INFO mapred.JobClient: CPU time spent (ms)=18230
13/07/11 15:31:32 INFO mapred.JobClient: Combine input records=1174991
13/07/11 15:31:32 INFO mapred.JobClient: SPLIT_RAW_BYTES=819
13/07/11 15:31:32 INFO mapred.JobClient: Reduce input records=218990
13/07/11 15:31:32 INFO mapred.JobClient: Reduce input groups=128513
13/07/11 15:31:32 INFO mapred.JobClient: Combine output records=218990
13/07/11 15:31:32 INFO mapred.JobClient: Physical memory (bytes) 
snapshot=1179656192

13/07/11 15:31:32 INFO mapred.JobClient: Reduce output records=128513
13/07/11 15:31:32 INFO mapred.JobClient: Virtual memory (bytes) 
snapshot=22992117760

13/07/11 15:31:32 INFO mapred.JobClient: Map output records=1174991

from web interface (http://n1:50030/) I saw that both (n1 and n2 ) were 
used without any errors.


Problems appear if I try to use following commands in master (n1):

[hduser@n1 ~]$hadoop jar 
mahout-distribution-0.7/mahout-examples-0.7-job.jar 
org.apache.mahout.classifier.df.mapreduce.BuildForest 
-Dmapred.max.split.size=1874231 -p -d testdata/bal_ee_2009.csv -ds 
testdata/bal_ee_2009.csv.info -sl 10 -o bal_ee_2009_out -t 1


SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[file:/usr/local/hadoop-1.0.4/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop-1.0.4/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
explanation.

13/07/11 15:36:50 INFO mapreduce.BuildForest: Partial Mapred implementation
13/07/11 15:36:50 INFO mapreduce.BuildForest: Building the forest...
13/07/11 15:36:50 WARN mapred.JobClient: No job jar file set.  User 
classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/07/11 15:36:50 INFO input.FileInputFormat: Total input paths to 
process : 1
13/07/11 15:36:50 INFO util.NativeCodeLoader: Loaded the native-hadoop 
library

13/07/11 15:36:50 WARN snappy.LoadSnappy: Snappy native 

copy files from ftp to hdfs in parallel, distcp failed

2013-07-11 Thread Hao Ren

Hi,

I am running a hdfs on Amazon EC2

Say, I have a ftp server where stores some data.

I just want to copy this data directly to HDFS in a parallel way (which 
may be more efficient).


I think hadoop distcp is what I need.

But

$ bin/hadoop distcp ftp://username:passwd@hostname/some/path/ 
hdfs://namenode/some/path


doesn't work.

13/07/05 16:13:46 INFO tools.DistCp: 
srcPaths=[ftp://username:passwd@hostname/some/path/]

13/07/05 16:13:46 INFO tools.DistCp: destPath=hdfs://namenode/some/path
Copy failed: org.apache.hadoop.mapred.InvalidInputException: Input 
source ftp://username:passwd@hostname/some/path/ does not exist.

at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:641)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

I checked the path by copying the ftp path into Chrome, and the file 
really exists; I can even download it.


And then, I tried to list the files under the path by:

$ bin/hadoop dfs -ls ftp://username:passwd@hostname/some/path/

It ends with:

ls: Cannot access ftp://username:passwd@hostname/some/path/: No 
such file or directory.


That seems to be the same problem.

Any workaround here?

Thank you in advance.

Hao.

--
Hao Ren
ClaraVista
www.claravista.fr


Cloudera links and Document

2013-07-11 Thread Sathish Kumar
Hi All,

Can anyone point me to a link or document that explains the below?

How Cloudera Manager works and handle the clusters (Agent and Master
Server)?
How the Cloudera Manager Process Flow works?
Where can I locate Cloudera configuration files and explanation in brief?


Regards
Sathish


Re: Task failure in slave node

2013-07-11 Thread Azuryy Yu
sorry for typo,

mahout, not mahou.  sent from mobile
On Jul 11, 2013 9:40 PM, Azuryy Yu azury...@gmail.com wrote:

 hi,

 put all mahou jars under hadoop_home/lib, then restart cluster.
  On Jul 11, 2013 8:45 PM, Margusja mar...@roo.ee wrote:

 Hi

 I have tow nodes:
 n1 (master, salve) and n2 (slave)

 after set up I ran wordcount example and it worked fine:
 [hduser@n1 ~]$ hadoop jar /usr/local/hadoop/hadoop-examples-1.0.4.jar
 wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output
 13/07/11 15:30:44 INFO input.FileInputFormat: Total input paths to
 process : 7
 13/07/11 15:30:44 INFO util.NativeCodeLoader: Loaded the native-hadoop
 library
 13/07/11 15:30:44 WARN snappy.LoadSnappy: Snappy native library not loaded
 13/07/11 15:30:44 INFO mapred.JobClient: Running job:
 job_201307111355_0015
 13/07/11 15:30:45 INFO mapred.JobClient:  map 0% reduce 0%
 13/07/11 15:31:03 INFO mapred.JobClient:  map 42% reduce 0%
 13/07/11 15:31:06 INFO mapred.JobClient:  map 57% reduce 0%
 13/07/11 15:31:09 INFO mapred.JobClient:  map 71% reduce 0%
 13/07/11 15:31:15 INFO mapred.JobClient:  map 100% reduce 0%
 13/07/11 15:31:18 INFO mapred.JobClient:  map 100% reduce 23%
 13/07/11 15:31:27 INFO mapred.JobClient:  map 100% reduce 100%
 13/07/11 15:31:32 INFO mapred.JobClient: Job complete:
 job_201307111355_0015
 13/07/11 15:31:32 INFO mapred.JobClient: Counters: 30
 13/07/11 15:31:32 INFO mapred.JobClient:   Job Counters
 13/07/11 15:31:32 INFO mapred.JobClient: Launched reduce tasks=1
 13/07/11 15:31:32 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=67576
 13/07/11 15:31:32 INFO mapred.JobClient: Total time spent by all
 reduces waiting after reserving slots (ms)=0
 13/07/11 15:31:32 INFO mapred.JobClient: Total time spent by all maps
 waiting after reserving slots (ms)=0
 13/07/11 15:31:32 INFO mapred.JobClient: Rack-local map tasks=3
 13/07/11 15:31:32 INFO mapred.JobClient: Launched map tasks=7
 13/07/11 15:31:32 INFO mapred.JobClient: Data-local map tasks=4
 13/07/11 15:31:32 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=21992
 13/07/11 15:31:32 INFO mapred.JobClient:   File Output Format Counters
 13/07/11 15:31:32 INFO mapred.JobClient: Bytes Written=1412505
 13/07/11 15:31:32 INFO mapred.JobClient:   FileSystemCounters
 13/07/11 15:31:32 INFO mapred.JobClient: FILE_BYTES_READ=5414195
 13/07/11 15:31:32 INFO mapred.JobClient: HDFS_BYTES_READ=6950820
 13/07/11 15:31:32 INFO mapred.JobClient: FILE_BYTES_WRITTEN=8744993
 13/07/11 15:31:32 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=1412505
 13/07/11 15:31:32 INFO mapred.JobClient:   File Input Format Counters
 13/07/11 15:31:32 INFO mapred.JobClient: Bytes Read=6950001
 13/07/11 15:31:32 INFO mapred.JobClient:   Map-Reduce Framework
 13/07/11 15:31:32 INFO mapred.JobClient: Map output materialized
 bytes=3157469
 13/07/11 15:31:32 INFO mapred.JobClient: Map input records=137146
 13/07/11 15:31:32 INFO mapred.JobClient: Reduce shuffle bytes=2904836
 13/07/11 15:31:32 INFO mapred.JobClient: Spilled Records=594764
 13/07/11 15:31:32 INFO mapred.JobClient: Map output bytes=11435849
 13/07/11 15:31:32 INFO mapred.JobClient: Total committed heap usage
 (bytes)=1128136704
 13/07/11 15:31:32 INFO mapred.JobClient: CPU time spent (ms)=18230
 13/07/11 15:31:32 INFO mapred.JobClient: Combine input records=1174991
 13/07/11 15:31:32 INFO mapred.JobClient: SPLIT_RAW_BYTES=819
 13/07/11 15:31:32 INFO mapred.JobClient: Reduce input records=218990
 13/07/11 15:31:32 INFO mapred.JobClient: Reduce input groups=128513
 13/07/11 15:31:32 INFO mapred.JobClient: Combine output records=218990
 13/07/11 15:31:32 INFO mapred.JobClient: Physical memory (bytes)
 snapshot=1179656192
 13/07/11 15:31:32 INFO mapred.JobClient: Reduce output records=128513
 13/07/11 15:31:32 INFO mapred.JobClient: Virtual memory (bytes)
 snapshot=22992117760
 13/07/11 15:31:32 INFO mapred.JobClient: Map output records=1174991

 from web interface (http://n1:50030/) I saw that both (n1 and n2 ) were
 used without any errors.

 Problems appear if I try to use following commands in master (n1):

 [hduser@n1 ~]$hadoop jar mahout-distribution-0.7/mahout-examples-0.7-job.jar
 org.apache.mahout.classifier.df.mapreduce.BuildForest
 -Dmapred.max.split.size=1874231 -p -d testdata/bal_ee_2009.csv -ds
 testdata/bal_ee_2009.csv.info -sl 10 -o bal_ee_2009_out -t 1

 SLF4J: Class path contains multiple SLF4J bindings.
 SLF4J: Found binding in
 [file:/usr/local/hadoop-1.0.4/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: Found binding in
 [jar:file:/usr/local/hadoop-1.0.4/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
 SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
 13/07/11 15:36:50 INFO mapreduce.BuildForest: Partial Mapred
 implementation
 13/07/11 15:36:50 INFO mapreduce.BuildForest: 

Re: Cloudera links and Document

2013-07-11 Thread Ram
Hi,
Go through the links.

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Managing-Clusters/cmmc_CM_architecture.html

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Installation-Guide/cmig_installing_configuring_dbs.html


From,
Ramesh.




On Thu, Jul 11, 2013 at 6:58 PM, Sathish Kumar sa848...@gmail.com wrote:

 Hi All,

 Can anyone help me the link or document that explain the below.

 How Cloudera Manager works and handle the clusters (Agent and Master
 Server)?
 How the Cloudera Manager Process Flow works?
 Where can I locate Cloudera configuration files and explanation in brief?


 Regards
 Sathish




Re: Task failure in slave node

2013-07-11 Thread Margusja

Thank you, it resolved the problem.
Funny, I don't remember copying the mahout libs to n1's hadoop, but there 
they are.


Tervitades, Margus (Margusja) Roo
+372 51 48 780
http://margus.roo.ee
skype: margusja
-BEGIN PUBLIC KEY-
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCvbeg7LwEC2SCpAEewwpC3ajxE
5ZsRMCB77L8bae9G7TslgLkoIzo9yOjPdx2NN6DllKbV65UjTay43uUDyql9g3tl
RhiJIcoAExkSTykWqAIPR88LfilLy1JlQ+0RD8OXiWOVVQfhOHpQ0R/jcAkM2lZa
BjM8j36yJvoBVsfOHQIDAQAB
-END PUBLIC KEY-

On 7/11/13 4:41 PM, Azuryy Yu wrote:


sorry for typo,

mahout, not mahou.  sent from mobile

On Jul 11, 2013 9:40 PM, Azuryy Yu azury...@gmail.com 
mailto:azury...@gmail.com wrote:


hi,

put all mahou jars under hadoop_home/lib, then restart cluster.

On Jul 11, 2013 8:45 PM, Margusja mar...@roo.ee
mailto:mar...@roo.ee wrote:

Hi

I have tow nodes:
n1 (master, salve) and n2 (slave)

after set up I ran wordcount example and it worked fine:
[hduser@n1 ~]$ hadoop jar
/usr/local/hadoop/hadoop-examples-1.0.4.jar wordcount
/user/hduser/gutenberg /user/hduser/gutenberg-output
13/07/11 15:30:44 INFO input.FileInputFormat: Total input
paths to process : 7
13/07/11 15:30:44 INFO util.NativeCodeLoader: Loaded the
native-hadoop library
13/07/11 15:30:44 WARN snappy.LoadSnappy: Snappy native
library not loaded
13/07/11 15:30:44 INFO mapred.JobClient: Running job:
job_201307111355_0015
13/07/11 15:30:45 INFO mapred.JobClient:  map 0% reduce 0%
13/07/11 15:31:03 INFO mapred.JobClient:  map 42% reduce 0%
13/07/11 15:31:06 INFO mapred.JobClient:  map 57% reduce 0%
13/07/11 15:31:09 INFO mapred.JobClient:  map 71% reduce 0%
13/07/11 15:31:15 INFO mapred.JobClient:  map 100% reduce 0%
13/07/11 15:31:18 INFO mapred.JobClient:  map 100% reduce 23%
13/07/11 15:31:27 INFO mapred.JobClient:  map 100% reduce 100%
13/07/11 15:31:32 INFO mapred.JobClient: Job complete:
job_201307111355_0015
13/07/11 15:31:32 INFO mapred.JobClient: Counters: 30
13/07/11 15:31:32 INFO mapred.JobClient:   Job Counters
13/07/11 15:31:32 INFO mapred.JobClient: Launched reduce
tasks=1
13/07/11 15:31:32 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=67576
13/07/11 15:31:32 INFO mapred.JobClient: Total time spent
by all reduces waiting after reserving slots (ms)=0
13/07/11 15:31:32 INFO mapred.JobClient: Total time spent
by all maps waiting after reserving slots (ms)=0
13/07/11 15:31:32 INFO mapred.JobClient: Rack-local map
tasks=3
13/07/11 15:31:32 INFO mapred.JobClient: Launched map tasks=7
13/07/11 15:31:32 INFO mapred.JobClient: Data-local map
tasks=4
13/07/11 15:31:32 INFO mapred.JobClient:
SLOTS_MILLIS_REDUCES=21992
13/07/11 15:31:32 INFO mapred.JobClient:   File Output Format
Counters
13/07/11 15:31:32 INFO mapred.JobClient: Bytes Written=1412505
13/07/11 15:31:32 INFO mapred.JobClient: FileSystemCounters
13/07/11 15:31:32 INFO mapred.JobClient: FILE_BYTES_READ=5414195
13/07/11 15:31:32 INFO mapred.JobClient: HDFS_BYTES_READ=6950820
13/07/11 15:31:32 INFO mapred.JobClient:
FILE_BYTES_WRITTEN=8744993
13/07/11 15:31:32 INFO mapred.JobClient:
HDFS_BYTES_WRITTEN=1412505
13/07/11 15:31:32 INFO mapred.JobClient:   File Input Format
Counters
13/07/11 15:31:32 INFO mapred.JobClient: Bytes Read=6950001
13/07/11 15:31:32 INFO mapred.JobClient:   Map-Reduce Framework
13/07/11 15:31:32 INFO mapred.JobClient: Map output
materialized bytes=3157469
13/07/11 15:31:32 INFO mapred.JobClient: Map input
records=137146
13/07/11 15:31:32 INFO mapred.JobClient: Reduce shuffle
bytes=2904836
13/07/11 15:31:32 INFO mapred.JobClient: Spilled
Records=594764
13/07/11 15:31:32 INFO mapred.JobClient: Map output
bytes=11435849
13/07/11 15:31:32 INFO mapred.JobClient: Total committed
heap usage (bytes)=1128136704
13/07/11 15:31:32 INFO mapred.JobClient: CPU time spent
(ms)=18230
13/07/11 15:31:32 INFO mapred.JobClient: Combine input
records=1174991
13/07/11 15:31:32 INFO mapred.JobClient: SPLIT_RAW_BYTES=819
13/07/11 15:31:32 INFO mapred.JobClient: Reduce input
records=218990
13/07/11 15:31:32 INFO mapred.JobClient: Reduce input
groups=128513
13/07/11 15:31:32 INFO mapred.JobClient: Combine output
records=218990
13/07/11 15:31:32 INFO mapred.JobClient: Physical memory
(bytes) snapshot=1179656192
13/07/11 15:31:32 INFO mapred.JobClient: Reduce output
records=128513
13/07/11 

RE: New Distributed Cache

2013-07-11 Thread Botelho, Andrew
So in my driver code, I try to store the file in the cache with this line of 
code:

job.addCacheFile(new URI(file location));

Then in my Mapper code, I do this to try and access the cached file:

URI[] localPaths = context.getCacheFiles();
File f = new File(localPaths[0]);

However, I get a NullPointerException when I do that in the Mapper code.

Any suggestions?

Andrew

From: Shahab Yunus [mailto:shahab.yu...@gmail.com]
Sent: Wednesday, July 10, 2013 9:43 PM
To: user@hadoop.apache.org
Subject: Re: New Distributed Cache

Also, once you have the array of URIs after calling getCacheFiles  you can 
iterate over them using File class or Path 
(http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/Path.html#Path(java.net.URI))

Regards,
Shahab

On Wed, Jul 10, 2013 at 5:08 PM, Omkar Joshi 
ojo...@hortonworks.commailto:ojo...@hortonworks.com wrote:
did you try JobContext.getCacheFiles() ?


Thanks,
Omkar Joshi
Hortonworks Inc.http://www.hortonworks.com

On Wed, Jul 10, 2013 at 10:15 AM, Botelho, Andrew 
andrew.bote...@emc.commailto:andrew.bote...@emc.com wrote:
Hi,

I am trying to store a file in the Distributed Cache during my Hadoop job.
In the driver class, I tell the job to store the file in the cache with this 
code:

Job job = Job.getInstance();
job.addCacheFile(new URI(file name));

That all compiles fine.  In the Mapper code, I try accessing the cached file 
with this method:

Path[] localPaths = context.getLocalCacheFiles();

However, I am getting warnings that this method is deprecated.
Does anyone know the newest way to access cached files in the Mapper code? (I 
am using Hadoop 2.0.5)

Thanks in advance,

Andrew




Re: Cloudera links and Document

2013-07-11 Thread Suresh Srinivas
Sathish, this mailing list for Apache Hadoop related questions. Please post
questions related to other distributions to appropriate vendor's mailing
list.



On Thu, Jul 11, 2013 at 6:28 AM, Sathish Kumar sa848...@gmail.com wrote:

 Hi All,

 Can anyone help me the link or document that explain the below.

 How Cloudera Manager works and handle the clusters (Agent and Master
 Server)?
 How the Cloudera Manager Process Flow works?
 Where can I locate Cloudera configuration files and explanation in brief?


 Regards
 Sathish




-- 
http://hortonworks.com/download/


How are 'PHYSICAL_MEMORY_BYTES' and 'VIRTUAL_MEMORY_BYTES' calculated?

2013-07-11 Thread hadoop qi
Hello,

I am wondering how the memory counters 'PHYSICAL_MEMORY_BYTES' and
'VIRTUAL_MEMORY_BYTES' are calculated. Are they peaks of memory usage or
cumulative usage?

Thanks for help,


Re: Cloudera links and Document

2013-07-11 Thread Alejandro Abdelnur
Satish,

the right alias for Cloudera Manager questions scm-us...@cloudera.org

Thanks


On Thu, Jul 11, 2013 at 9:20 AM, Suresh Srinivas sur...@hortonworks.comwrote:

 Sathish, this mailing list for Apache Hadoop related questions. Please
 post questions related to other distributions to appropriate vendor's
 mailing list.



 On Thu, Jul 11, 2013 at 6:28 AM, Sathish Kumar sa848...@gmail.com wrote:

 Hi All,

 Can anyone help me the link or document that explain the below.

 How Cloudera Manager works and handle the clusters (Agent and Master
 Server)?
 How the Cloudera Manager Process Flow works?
 Where can I locate Cloudera configuration files and explanation in brief?


 Regards
 Sathish




 --
 http://hortonworks.com/download/




-- 
Alejandro


Re: New Distributed Cache

2013-07-11 Thread Omkar Joshi
Yeah Andrew, there seems to be some problem with the context.getCacheFiles()
API, which is returning null.

  Path[] cachedFilePaths = context.getLocalCacheFiles(); // I am checking why it is deprecated...

  for (Path cachedFilePath : cachedFilePaths) {
    File cachedFile = new File(cachedFilePath.toUri().getRawPath());
    System.out.println("cached file path " + cachedFile.getAbsolutePath());
  }

I hope this helps for the time being. JobContext was supposed to replace the
DistributedCache API (it will be deprecated); however, there is some problem
with that, or I am missing something... Will reply if I find the solution to
it.

context.getCacheFiles() will give you the URI used for localizing files
(the original URI used for adding it to the cache). However, you can use the
DistributedCache.getCacheFiles() API till the context API is fixed.

context.getLocalCacheFiles() will give you the actual file path on the node
manager (after the file is localized).
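
A rough sketch of that workaround inside the Mapper (needs
org.apache.hadoop.filecache.DistributedCache; using index 0 is just for
illustration):

URI[] cacheUris = DistributedCache.getCacheFiles(context.getConfiguration());        // original URIs added in the driver
Path[] localPaths = DistributedCache.getLocalCacheFiles(context.getConfiguration()); // localized on-disk paths
File f = new File(localPaths[0].toUri().getRawPath());                               // open the localized copy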

Thanks,
Omkar Joshi
*Hortonworks Inc.* http://www.hortonworks.com


On Thu, Jul 11, 2013 at 8:19 AM, Botelho, Andrew andrew.bote...@emc.comwrote:

 So in my driver code, I try to store the file in the cache with this line
 of code:

 ** **

 job.addCacheFile(new URI(file location));

 ** **

 Then in my Mapper code, I do this to try and access the cached file:

 ** **

 URI[] localPaths = context.getCacheFiles();

 File f = new File(localPaths[0]);

 ** **

 However, I get a NullPointerException when I do that in the Mapper code.**
 **

 ** **

 Any suggesstions?

 ** **

 Andrew

 ** **

 *From:* Shahab Yunus [mailto:shahab.yu...@gmail.com]
 *Sent:* Wednesday, July 10, 2013 9:43 PM
 *To:* user@hadoop.apache.org
 *Subject:* Re: New Distributed Cache

 ** **

 Also, once you have the array of URIs after calling getCacheFiles  you
 can iterate over them using File class or Path (
 http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/Path.html#Path(java.net.URI)
 )

 ** **

 Regards,

 Shahab

 ** **

 On Wed, Jul 10, 2013 at 5:08 PM, Omkar Joshi ojo...@hortonworks.com
 wrote:

 did you try JobContext.getCacheFiles() ?

 ** **


 

 Thanks,

 Omkar Joshi

 *Hortonworks Inc.* http://www.hortonworks.com

 ** **

 On Wed, Jul 10, 2013 at 10:15 AM, Botelho, Andrew andrew.bote...@emc.com
 wrote:

 Hi,

  

 I am trying to store a file in the Distributed Cache during my Hadoop job.
 

 In the driver class, I tell the job to store the file in the cache with
 this code:

  

 Job job = Job.getInstance();

 job.addCacheFile(new URI(file name));

  

 That all compiles fine.  In the Mapper code, I try accessing the cached
 file with this method:

  

 Path[] localPaths = context.getLocalCacheFiles();

  

 However, I am getting warnings that this method is deprecated.

 Does anyone know the newest way to access cached files in the Mapper code?
 (I am using Hadoop 2.0.5)

  

 Thanks in advance,

  

 Andrew

 ** **

 ** **



Re: copy files from ftp to hdfs in parallel, distcp failed

2013-07-11 Thread பாலாஜி நாராயணன்
On 11 July 2013 06:27, Hao Ren h@claravista.fr wrote:

 Hi,

 I am running a hdfs on Amazon EC2

 Say, I have a ftp server where stores some data.


I just want to copy these data directly to hdfs in a parallel way (which
 maybe more efficient).

 I think hadoop distcp is what I need.


http://hadoop.apache.org/docs/stable/distcp.html

DistCp (distributed copy) is a tool used for large inter/intra-cluster
copying. It uses MapReduce to effect its distribution, error handling and
recovery, and reporting


I doubt this is going to help. Are these a lot of files? If yes, how about
multiple copy jobs into HDFS?
-balaji


Re: CompositeInputFormat

2013-07-11 Thread Jay Vyas
Map Side joins will use the CompositeInputFormat.  They will only really be
worth doing if one data set is small, and the other is large.

This is a good example :
http://www.congiu.com/joins-in-hadoop-using-compositeinputformat/

the trick is to google for CompositeInputFormat.compose()  :)
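
For the archives, a minimal old-API sketch of what that looks like (paths,
input format and driver class are placeholders; both inputs must be sorted and
identically partitioned on the join key):

JobConf conf = new JobConf(MyJoinDriver.class);           // MyJoinDriver is a placeholder
conf.setInputFormat(CompositeInputFormat.class);
conf.set("mapred.join.expr", CompositeInputFormat.compose(
    "inner",                                              // join type: "inner", "outer", "override", ...
    KeyValueTextInputFormat.class,
    new Path("/data/left"), new Path("/data/right")));
// the mapper then receives a TupleWritable holding one value per joined source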


On Thu, Jul 11, 2013 at 5:02 PM, Botelho, Andrew andrew.bote...@emc.comwrote:

 Hi,

 ** **

 I want to perform a JOIN on two sets of data with Hadoop.  I read that the
 class CompositeInputFormat can be used to perform joins on data, but I
 can’t find any examples of how to do it.

 Could someone help me out? It would be much appreciated. :)

 ** **

 Thanks in advance,

 ** **

 Andrew




-- 
Jay Vyas
http://jayunit100.blogspot.com


RE: CompositeInputFormat

2013-07-11 Thread Botelho, Andrew
Sorry I should've specified that I need an example of CompositeInputFormat that 
uses the new API.
The example linked below uses old API objects like JobConf.

Any known examples of CompositeInputFormat using the new API?

Thanks in advance,

Andrew

From: Jay Vyas [mailto:jayunit...@gmail.com]
Sent: Thursday, July 11, 2013 5:10 PM
To: common-u...@hadoop.apache.org
Subject: Re: CompositeInputFormat

Map Side joins will use the CompositeInputFormat.  They will only really be 
worth doing if one data set is small, and the other is large.
This is a good example : 
http://www.congiu.com/joins-in-hadoop-using-compositeinputformat/
the trick is to google for CompositeInputFormat.compose()  :)

On Thu, Jul 11, 2013 at 5:02 PM, Botelho, Andrew 
andrew.bote...@emc.commailto:andrew.bote...@emc.com wrote:
Hi,

I want to perform a JOIN on two sets of data with Hadoop.  I read that the 
class CompositeInputFormat can be used to perform joins on data, but I can't 
find any examples of how to do it.
Could someone help me out? It would be much appreciated. :)

Thanks in advance,

Andrew



--
Jay Vyas
http://jayunit100.blogspot.com


Staging directory ENOTDIR error.

2013-07-11 Thread Jay Vyas
Hi , I'm getting an ungoogleable exception, never seen this before.

This is on a Hadoop 1.1 cluster... It appears that it's permissions
related...
Any thoughts as to how this could crop up?

I assume it's a bug in my filesystem, but not sure.

13/07/11 18:39:43 ERROR security.UserGroupInformation:
PriviledgedActionException as:root cause:ENOTDIR: Not a directory
ENOTDIR: Not a directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:699)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:654)
at
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at
org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at
org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)


-- 
Jay Vyas
http://jayunit100.blogspot.com


Re: Issues Running Hadoop 1.1.2 on multi-node cluster

2013-07-11 Thread siddharth mathur
I figured out the issue!

The problem was with the permissions from running the Hadoop scripts as the root
user. I created a dedicated hadoop user to run the hadoop cluster, but at one
point I accidentally started hadoop as root. Hence, some of the permissions on
the hadoop scripts changed.

The solution is to change the ownership of the hadoop folder back to the
dedicated user using chown. It's working fine now.
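
For reference, assuming the tutorial's hduser:hadoop user/group and a
/usr/local/hadoop install path, that would be something like:

sudo chown -R hduser:hadoop /usr/local/hadoop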


Thanks a lot for the pointers!


Regards,
Siddharth


On Thu, Jul 11, 2013 at 1:43 AM, Ram pramesh...@gmail.com wrote:

 Hi,
    Please check that all directories/files configured in mapred-site.xml exist
 on the local system and that the files/directories have mapred as the user and
 hadoop as the group.

 Hi,



 From,
 P.Ramesh Babu,
 +91-7893442722.



 On Wed, Jul 10, 2013 at 9:36 PM, Leonid Fedotov 
 lfedo...@hortonworks.comwrote:

 Make sure your mapred.local.dir (check it in mapred-site.xml) actually
 exists and is writable by your mapreduce user.

  *Thank you!*
 *
 *
 *Sincerely,*
 *Leonid Fedotov*


 On Jul 9, 2013, at 6:09 PM, Kiran Dangeti wrote:

 Hi Siddharth,

 While running the multi-node setup we need to take care of the localhost
 settings of the slave machines; from the error messages, the tasktracker is
 not able to get the system directory from the master. Please check and rerun it.

 Thanks,
 Kiran


 On Tue, Jul 9, 2013 at 10:26 PM, siddharth mathur sidh1...@gmail.comwrote:

 Hi,

 I have installed Hadoop 1.1.2 on a 5 nodes cluster. I installed it
 watching this tutorial *
 http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
 *

 When I startup the hadoop, I get the folloing error in *all* the
 tasktrackers.

 
 2013-07-09 12:15:22,301 INFO org.apache.hadoop.mapred.UserLogCleaner:
 Adding job_201307051203_0001 for user-log deletion with
 retainTimeStamp:1373472921775
 2013-07-09 12:15:22,301 INFO org.apache.hadoop.mapred.UserLogCleaner:
 Adding job_201307051611_0001 for user-log deletion with
 retainTimeStamp:1373472921775
 2013-07-09 12:15:22,601 INFO org.apache.hadoop.mapred.TaskTracker:*Failed 
 to get system directory
 *...
 2013-07-09 12:15:25,164 INFO org.apache.hadoop.mapred.TaskTracker:
 Failed to get system directory...
 2013-07-09 12:15:27,901 INFO org.apache.hadoop.mapred.TaskTracker:
 Failed to get system directory...
 2013-07-09 12:15:30,144 INFO org.apache.hadoop.mapred.TaskTracker:
 Failed to get system directory...
 

 *But everything looks fine in the webUI. *

 When I run a job, I get the following error but the job completes
 anyways. I have* attached the* *screenshots* of the maptask failed
 error log in the UI.

 **
 13/07/09 12:29:37 INFO input.FileInputFormat: Total input paths to
 process : 2
 13/07/09 12:29:37 INFO util.NativeCodeLoader: Loaded the native-hadoop
 library
 13/07/09 12:29:37 WARN snappy.LoadSnappy: Snappy native library not
 loaded
 13/07/09 12:29:37 INFO mapred.JobClient: Running job:
 job_201307091215_0001
 13/07/09 12:29:38 INFO mapred.JobClient:  map 0% reduce 0%
 13/07/09 12:29:41 INFO mapred.JobClient: Task Id :
 attempt_201307091215_0001_m_01_0, Status : FAILED
 Error initializing attempt_201307091215_0001_m_01_0:
 ENOENT: No such file or directory
 at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
 at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:699)
 at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:654)
 at
 org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
 at
 org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
 at
 org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:240)
 at
 org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:205)
 at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1331)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
 at
 org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1306)
 at
 org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1221)
 at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2581)
 at java.lang.Thread.run(Thread.java:724)

 13/07/09 12:29:41 WARN mapred.JobClient: Error reading task
 outputhttp://dmkd-1:50060/tasklog?plaintext=trueattemptid=attempt_201307091215_0001_m_01_0filter=stdout
 13/07/09 12:29:41 WARN mapred.JobClient: Error reading task
 outputhttp://dmkd-1:50060/tasklog?plaintext=trueattemptid=attempt_201307091215_0001_m_01_0filter=stderr
 13/07/09 12:29:45 INFO mapred.JobClient:  map 50% reduce 0%
 13/07/09 12:29:53 INFO mapred.JobClient:  map 50% reduce 16%
 13/07/09 12:30:38 INFO mapred.JobClient: Task Id :
 attempt_201307091215_0001_m_00_1, Status : FAILED
 Error initializing attempt_201307091215_0001_m_00_1:
 ENOENT: No such file or 

RE: CompositeInputFormat

2013-07-11 Thread Devaraj k
Hi Andrew,

You could make use of the Hadoop data join classes to perform the join, or you 
can refer to these classes for a better idea of how to perform the join.

http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-tools/hadoop-datajoin

Thanks
Devaraj k

From: Botelho, Andrew [mailto:andrew.bote...@emc.com]
Sent: 12 July 2013 03:33
To: user@hadoop.apache.org
Subject: RE: CompositeInputFormat

Sorry I should've specified that I need an example of CompositeInputFormat that 
uses the new API.
The example linked below uses old API objects like JobConf.

Any known examples of CompositeInputFormat using the new API?

Thanks in advance,

Andrew

From: Jay Vyas [mailto:jayunit...@gmail.com]
Sent: Thursday, July 11, 2013 5:10 PM
To: common-u...@hadoop.apache.orgmailto:common-u...@hadoop.apache.org
Subject: Re: CompositeInputFormat

Map Side joins will use the CompositeInputFormat.  They will only really be 
worth doing if one data set is small, and the other is large.
This is a good example : 
http://www.congiu.com/joins-in-hadoop-using-compositeinputformat/
the trick is to google for CompositeInputFormat.compose()  :)

On Thu, Jul 11, 2013 at 5:02 PM, Botelho, Andrew 
andrew.bote...@emc.commailto:andrew.bote...@emc.com wrote:
Hi,

I want to perform a JOIN on two sets of data with Hadoop.  I read that the 
class CompositeInputFormat can be used to perform joins on data, but I can't 
find any examples of how to do it.
Could someone help me out? It would be much appreciated. :)

Thanks in advance,

Andrew



--
Jay Vyas
http://jayunit100.blogspot.com


RE: Staging directory ENOTDIR error.

2013-07-11 Thread Devaraj k
Hi Jay,

   Here the client is trying to create the staging directory in the local file
system, which it actually should create in HDFS.

Could you check whether you have configured fs.defaultFS in the client to point 
to HDFS?
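
For example, in the client's core-site.xml (namenode host/port below are
placeholders; on Hadoop 1.x the equivalent older key is fs.default.name):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode-host:8020</value>
</property>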


Thanks
Devaraj k

From: Jay Vyas [mailto:jayunit...@gmail.com]
Sent: 12 July 2013 04:12
To: common-u...@hadoop.apache.org
Subject: Staging directory ENOTDIR error.

Hi , I'm getting an ungoogleable exception, never seen this before.
This is on a hadoop 1.1. cluster... It appears that its permissions related...
Any thoughts as to how this could crop up?
I assume its a bug in my filesystem, but not sure.

13/07/11 18:39:43 ERROR security.UserGroupInformation: 
PriviledgedActionException as:root cause:ENOTDIR: Not a directory
ENOTDIR: Not a directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:699)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:654)
at 
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at 
org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
at 
org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)


--
Jay Vyas
http://jayunit100.blogspot.com