Re: How to obtain the exception that actually failed the job on the Mapper or Reducer at runtime?

2013-12-10 Thread Silvina Caíno Lores
Hi, you can check the userlogs directory where the job and attempt logs are stored. For each attempt you should have a stderr, stdout and syslog file. The first two hold the program output for each stream (useful for debugging), while the last contains execution details provided by the platform.
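A minimal sketch of where to look (assuming a default log layout; the parent directory is whatever hadoop.log.dir / yarn.nodemanager.log-dirs points to in your setup, and the job/attempt IDs are placeholders):
$ ls <log-dir>/userlogs/<job-or-application-id>/<attempt-or-container-id>/
stderr  stdout  syslog
$ less <log-dir>/userlogs/<job-or-application-id>/<attempt-or-container-id>/stderr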

Re: issue about corrupt block test

2013-12-10 Thread Harsh J
Block files are not stored in a flat directory (to avoid FS limits on the maximum number of files under a dir). Instead of looking for them right under finalized, issue a "find" query with the pattern and you should be able to spot it. On Wed, Dec 11, 2013 at 9:10 AM, ch huang wrote: > hi,maillist: >
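For example (a sketch; /path/to/dfs/data stands for whatever dfs.datanode.data.dir, or dfs.data.dir in older releases, is set to on that node, and the block ID is the one from the original question):
$ find /path/to/dfs/data -name 'blk_2504407693800874616*'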

Re: Writing to remote HDFS using C# on Windows

2013-12-10 Thread Fengyun RAO
Thanks, Ian, it works! The problem is that I could APPEND to a file and create a directory, but not CREATE a file or copy a file. When I try to create one, the log at the NFS gateway says: ERROR nfs3.RpcProgramNfs3: Setting file size is not supported when creating file: test dir fileId:16386 However, if I mount

Re: Hadoop-MapReduce

2013-12-10 Thread Ranjini Rathinam
Hi, I have fixed the error and the code is running fine, but this code only splits part of the tag. I want to convert it into text format so that I can load it into HBase and Hive tables. I have used the DOM parser, but that parser takes a File object, whereas HDFS uses FileSystem. Eg, File fXml

Re: issue about Shuffled Maps in MR job summary

2013-12-10 Thread ch huang
I read the doc and find that if I have 8 reducers, a map task will output 8 partitions and each partition will be sent to a different reducer. So if I increase the reducer count, the partition count increases but the volume of network traffic is the same. Why does increasing the reducer count sometimes not decrease the job

RE: issue about Shuffled Maps in MR job summary

2013-12-10 Thread Vinayakumar B
It looks simple, :) Shuffled Maps = Number of Map Tasks * Number of Reducers. Thanks and Regards, Vinayakumar B From: ch huang [mailto:justlo...@gmail.com] Sent: 11 December 2013 10:56 To: user@hadoop.apache.org Subject: issue about Shuffled Maps in MR job summary hi,maillist: i run te
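As a worked example using the numbers from the original question: 20 map tasks * 8 reducers = 160 shuffled map outputs reported in the summary, and doubling to 16 reducers doubles that figure to 320.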

issue about Shuffled Maps in MR job summary

2013-12-10 Thread ch huang
hi, maillist: I run terasort with 16 reducers and with 8 reducers; when I double the reducer count, the Shuffled Maps count also doubles. My question is: the job only runs 20 map tasks (there are 10 input files in total, each file is 100M, and my block size is 64M, so there are 20 splits), so why do I need to shuffle 160 maps with 8 reducers

issue about corrupt block test

2013-12-10 Thread ch huang
hi, maillist: I tried to corrupt a block of a file in my benchmark environment. With the following command I find blk_2504407693800874616_106252; its replica on 192.168.10.224 is my target, but I searched all the data dirs on 192.168.10.224 and cannot find the data file belonging to this replica. Why?

How to obtain the exception that actually failed the job on the Mapper or Reducer at runtime?

2013-12-10 Thread Kan Tao
Hi guys, does anyone know how to ‘capture’ the exception that actually failed the job running on the Mapper or Reducer at runtime? It seems Hadoop is designed to be fault tolerant, so failed jobs will be automatically rerun a certain number of times and won’t actually expose the real problem

Re: multiusers in hadoop through LDAP

2013-12-10 Thread Jay Vyas
So, not knowing much about LDAP, but being very interested in the multiuser problem on multiuser filesystems, I was excited to see this question. I'm researching the same thing at the moment, and it seems obviated by the fact that: - the FileSystem API itself provides implementations for getting

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread ch huang
Thanks for the reply, but if the block has just 1 corrupt replica, hdfs fsck cannot tell you which block of which file has a corrupted replica; fsck is only useful when all of a block's replicas are bad. On Wed, Dec 11, 2013 at 10:01 AM, Adam Kawa wrote: > When you identify a file with corrupt block(s),

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread Adam Kawa
When you identify a file with corrupt block(s), you can locate the machines that store its blocks by typing $ sudo -u hdfs hdfs fsck -files -blocks -locations 2013/12/11 Adam Kawa > Maybe this can work for you > $ sudo -u hdfs hdfs fsck / -list-corruptfileblocks > ? > > > 2013/12/11 ch hu
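Spelled out in full (the path is a placeholder for whichever file fsck reported as having corrupt blocks), the command looks like:
$ sudo -u hdfs hdfs fsck /path/to/suspect/file -files -blocks -locations
The -locations flag lists the DataNodes holding each block, which is what you need in order to find the node with the bad replica.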

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread Adam Kawa
Maybe this can work for you $ sudo -u hdfs hdfs fsck / -list-corruptfileblocks ? 2013/12/11 ch huang > thanks for reply, what i do not know is how can i locate the block which > has the corrupt replica,(so i can observe how long the corrupt replica will > be removed and a new health replica rep

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread ch huang
Thanks for the reply. What I do not know is how I can locate the block which has the corrupt replica (so I can observe how long it takes for the corrupt replica to be removed and a new healthy replica to replace it), because I have been getting Nagios alerts for three days and I am not sure if it is the same corrupt replica causing the alert

RE: how to handle the corrupt block in HDFS?

2013-12-10 Thread Vinayakumar B
Hi ch huang, It may seem strange, but the fact is that CorruptBlocks through JMX or http://NNIP:50070/jmx means "Number of blocks with corrupt replicas"; not all of the replicas need be corrupt. You can check the description through jconsole. Whereas Corrupt blocks through

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread ch huang
"By default this higher replication level is 10. " is this value can be control via some option or variable? i only hive a 5-worknode cluster,and i think 5 replicas should be better,because every node can get a local replica. another question is ,why hdfs fsck check the cluster is healthy and no c

Re: issue about running example job use custom mapreduce var

2013-12-10 Thread ch huang
Yes, you are right, thanks. On Wed, Dec 11, 2013 at 7:16 AM, Adam Kawa wrote: > Accidentally, I clicked "Send" by mistake. Please try: > > hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar > terasort *-Dmapreduce.job.reduces=34* /alex/terasort/1G-input > /alex/terasort/1G-output

Re: multiusers in hadoop through LDAP

2013-12-10 Thread Adam Kawa
Please have a look at the hadoop.security.group.mapping.ldap.* settings, as Hardik Pandya suggests. In addition, just to share our story related to LDAP + hadoop.security.group.mapping.ldap.*, in case you run into the same limitation as we did: in many cases hadoop.security.group.mapping.ldap.* should
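As a quick sanity check once those properties are in place (a sketch; the username is a placeholder), you can ask the NameNode which groups it resolves for a user, and refresh its cached mappings after changing the configuration:
$ hdfs groups some_ldap_user
$ hdfs dfsadmin -refreshUserToGroupsMappings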

Re: Job stuck in running state on Hadoop 2.2.0

2013-12-10 Thread Adam Kawa
It sounds like the job was successfully submitted to the cluster, but there was some problem when starting/running the AM, so that no progress is made. It happened to me once, when I was playing with YARN on a cluster consisting of very small machines, and I mis-configured YARN to allocate to the AM more memory

Re: Versioninfo and platformName issue.

2013-12-10 Thread Adam Kawa
Hi, do you have the Hadoop libs properly installed? Does the "$ hadoop version" command run successfully? If so, then it sounds like some classpath issue... 2013/12/10 Manish Bhoge > Sent from Rocket Mail via Android > > -- > * From: * Manish Bhoge ; > * To: * user@hadoo

Re: issue about running example job use custom mapreduce var

2013-12-10 Thread Adam Kawa
I accidentally clicked "Send" by mistake. Please try: hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort *-Dmapreduce.job.reduces=34* /alex/terasort/1G-input /alex/terasort/1G-output 2013/12/11 Adam Kawa > Please try > > > > 2013/12/10 ch huang > >> hi,maillist: >>

Re: issue about running example job use custom mapreduce var

2013-12-10 Thread Adam Kawa
Please try 2013/12/10 ch huang > hi,maillist: > I try to assign the reducer count on the command line but it seems to have > no effect; I run terasort like this > > # hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar > terasort /alex/terasort/1G-input /alex/terasort/1G-output > -Dma

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread Patai Sangbutsarakum
The 10 copies of the job.jar and the splits are controlled by the mapred.submit.replication property at job init time. On Mon, Dec 9, 2013 at 5:20 PM, ch huang wrote: > more strange , in my HDFS cluster ,every block has three replicas,but i > find some one has ten replicas ,why? > > # sudo -u hdfs hado
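As a sketch of how to lower it for a single job on a small cluster (reusing the examples jar and paths that appear elsewhere in this digest; note that generic -D options must come before the input/output arguments):
$ hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort -Dmapred.submit.replication=5 /alex/terasort/1G-input /alex/terasort/1G-output
The property can also be set cluster-wide in mapred-site.xml.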

AM web app security in 2.2

2013-12-10 Thread Thomas Weise
What is the recommended way to secure the ApplicationMaster web app (if any) in 2.x? The client can obtain the tracking URL from the app report using appReport.getOriginalTrackingUrl() and bypass the RM proxy. In the case of POST requests, which are not currently supported by the proxy, that is the only option

Re: multiusers in hadoop through LDAP

2013-12-10 Thread Hardik Pandya
Have you looked at hadoop.security.group.mapping.ldap.* in hadoop-common/core-default.xml? Additional resource

Re: set up a hadoop cluster

2013-12-10 Thread Smarty Juice
Hadoop's default replication factor is 3, as defined in hdfs-default.xml (1 master, the NameNode, and 2 slaves, the DataNodes). Useful resource

Re: Unable to run dfsadmin -upgradeProgress status in Apache Hadoop 2.1.0-beta

2013-12-10 Thread Smarty Juice
Can you please try the below instead? Let me know if it works: hadoop dfsadmin -upgradeProgress status On Tue, Dec 10, 2013 at 6:00 AM, Nirmal Kumar wrote: > Hi All, > > > > hadoop dfsadmin -help > > DEPRECATED: Use of this script to execute hdfs command is deprecated. > > Instead use the hdfs c

Re: set up a hadoop cluster

2013-12-10 Thread Smarty Juice
Hadoop's default replication factor is 3 and you can override it in hdfs-site.xml (the default itself lives in hdfs-default.xml). You should have one Master (NameNode) and 2 Slaves (DataNodes); please follow http://www.michael-noll.com/tutorials/runn
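As a small illustration (paths are placeholders), the replication of existing files can also be checked and adjusted from the shell, which is handy on a cluster with only two DataNodes:
$ hadoop fs -ls /some/path      # the second column of the listing is the replication factor
$ hadoop fs -setrep -w 2 /some/path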

Re: Execute hadoop job remotely and programmatically

2013-12-10 Thread Mirko Kämpf
Hi Yexi, please have a look at the -libjars option of the hadoop command. It tells the system which additional libs have to be sent to the cluster before the job can start. Each time you submit the job, this kind of distribution happens again, so it's not a good idea for really large libs; those you sho
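A typical invocation looks something like the following (a sketch; the jar, driver class and dependency paths are placeholders, and -libjars only takes effect when the driver goes through ToolRunner/GenericOptionsParser):
$ hadoop jar my-job.jar com.example.MyDriver -libjars /local/path/dep1.jar,/local/path/dep2.jar /input /output
Note that -libjars, like other generic options, must appear before the job's own arguments.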

Unable to run dfsadmin -upgradeProgress status in Apache Hadoop 2.1.0-beta

2013-12-10 Thread Nirmal Kumar
Hi All, hadoop dfsadmin -help DEPRECATED: Use of this script to execute hdfs command is deprecated. Instead use the hdfs command for it. hadoop dfsadmin is the command to execute DFS administrative commands. The full syntax is: hadoop dfsadmin [-report] [-safemode ] [-saveNamespace]

Re: Compression LZO class not found issue in Hadoop-2.2.0

2013-12-10 Thread shashwat shriparv
Set the classpath to where the hadoop-lzo jar file is, and then try. *Thanks & Regards* ∞ Shashwat Shriparv On Tue, Dec 10, 2013 at 4:00 PM, Vinayakumar B wrote: > Hi Viswa, > > > > Sorry for the late reply, > > > > Have you restarted NodeManagers after copying the lzo jars to lib? > > > >
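For the client side, that amounts to something like the following (a sketch; the jar location is an assumption and varies between installs):
$ export HADOOP_CLASSPATH=/usr/lib/hadoop/lib/hadoop-lzo.jar:$HADOOP_CLASSPATH
$ hadoop jar my-job.jar ...
On the cluster side, as Vinayakumar notes below, the jar also has to be present in the NodeManagers' lib directory and the NodeManagers restarted.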

Re: how to handle the corrupt block in HDFS?

2013-12-10 Thread shashwat shriparv
How many nodes do you have? If fsck is giving you a healthy status, there is no need to worry. With replication 10, what I may conclude is that you have 10 listed datanodes, so 10 replicated jar files for the job to run. *Thanks & Regards* ∞ Shashwat Shriparv On Tue, Dec 10, 2013 at 3:50 PM, Vinayakum

RE: Compression LZO class not found issue in Hadoop-2.2.0

2013-12-10 Thread Vinayakumar B
Hi Viswa, Sorry for the late reply, Have you restarted NodeManagers after copying the lzo jars to lib? Thanks and Regards, Vinayakumar B From: Viswanathan J [mailto:jayamviswanat...@gmail.com] Sent: 06 December 2013 23:32 To: user@hadoop.apache.org Subject: Compression LZO class not found issue

RE: how to handle the corrupt block in HDFS?

2013-12-10 Thread Vinayakumar B
Hi ch huang, It may seem strange, but the fact is that CorruptBlocks through JMX means "Number of blocks with corrupt replicas"; not all of the replicas need be corrupt. You can check the description through jconsole. Whereas Corrupt blocks through fsck means blocks with all replicas corrupt (no

Re: Job stuck in running state on Hadoop 2.2.0

2013-12-10 Thread Silvina Caíno Lores
Thank you! I realized that, although I exported the variables in the scripts, there were a few errors and my desired configuration wasn't being used (which explained other strange behavior). However, I'm still getting the same issue with the examples, for instance: hadoop jar ~/hadoop-2.2.0-maven/

Re: Job stuck in running state on Hadoop 2.2.0

2013-12-10 Thread Taka Shinagawa
I had a similar problem after setting up Hadoop 2.2.0 based on the instructions at http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html Although it's not documented on the page, I needed to edit hadoop-env.sh and yarn-env.sh as well to update JAVA_HOME, HADOOP
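For reference, a minimal sketch of those edits (assuming a typical tarball layout; the JDK path is a placeholder):
# in etc/hadoop/hadoop-env.sh and etc/hadoop/yarn-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64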

RE: how to handle the corrupt block in HDFS?

2013-12-10 Thread Peter Marron
Hi, I am sure that there are others who will answer this better, but anyway. The default replication level for files in HDFS is 3 and so most files that you see will have a replication level of 3. However when you run a Map/Reduce job the system knows in advance that every node will need a copy of

multiusers in hadoop through LDAP

2013-12-10 Thread YouPeng Yang
Hi, in my cluster I want to have multiple users for different purposes. The usual method is to add a user through the OS on the Hadoop NameNode. I notice that Hadoop also supports LDAP; could I add users through LDAP instead of through the OS, so that if a user is authenticated by LDAP they will also ac