Re: No build.xml when to build FUSE

2013-04-13 Thread YouPeng Yang
Hi Harsh, I found the cause and the solution after checking the fuse-dfs source code, so I am replying again to close this question. The error occurs because the hadoop-*-.jars need to be in the CLASSPATH. I added them to the CLASSPATH, and it works. Thank you. Regards.
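A minimal sketch of the fix described above, assuming the Hadoop jars sit somewhere under a single install directory. The helper name `build_classpath`, the fallback path `/usr/lib/hadoop`, and the glob `hadoop-*.jar` are illustrative assumptions, not details from the original message.

```python
import glob
import os


def build_classpath(hadoop_home, existing=""):
    """Return a CLASSPATH string that includes every hadoop-*.jar under hadoop_home.

    hadoop_home is a hypothetical install directory; adjust it for your system.
    """
    # Search the install directory itself and all of its subdirectories.
    pattern = os.path.join(hadoop_home, "**", "hadoop-*.jar")
    jars = sorted(glob.glob(pattern, recursive=True))
    # Keep whatever was already on the CLASSPATH, then append the Hadoop jars.
    parts = [p for p in existing.split(os.pathsep) if p] + jars
    return os.pathsep.join(parts)


if __name__ == "__main__":
    # Print a value that can be exported as CLASSPATH before starting fuse-dfs.
    home = os.environ.get("HADOOP_HOME", "/usr/lib/hadoop")  # assumed location
    print(build_classpath(home, os.environ.get("CLASSPATH", "")))
```

The printed string can then be exported as CLASSPATH in the shell that launches fuse-dfs.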

Re: Maven build YARN ResourceManager only

2013-04-13 Thread Chin-Jung Hsu
Hi Chris, I appreciate your help, and sorry for cross-posting to both 'user' and 'yarn-dev'. I first posted on yarn-dev and didn't see anything, so I thought I might not be able to post on that list. That's why I posted it again on 'user'. Should I post these kinds of questions on 'yarn-dev'

Mapper always hangs at the same spot

2013-04-13 Thread Chris Hokamp
Hello, We have a job where all mappers finish except for one, which always hangs at the same spot (i.e. reaches 49%, then never progresses). This is likely due to a bug in the wiki parser in our Pig UDF. We can afford to lose the data this mapper is working on if it would allow the job to

Re: Mapper always hangs at the same spot

2013-04-13 Thread Harsh J
When you say never progresses, do you see the MR framework kill it automatically after 10 minutes of inactivity or does it never ever exit? You can lower the timeout period on tasks via mapred.task.timeout set in msec. You could also set mapred.max.map.failures.percent to a non-zero value to
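As a rough sketch of the two settings mentioned above: `mapred.task.timeout` is given in milliseconds, so the 10-minute figure corresponds to 600000. The helper names and the chosen values (a 2-minute timeout, 5 percent tolerated map failures) are illustrative assumptions. The `-D key=value` form is Hadoop's generic option syntax, and whether a given job picks it up depends on how the job is launched.

```python
def minutes_to_msec(minutes):
    """Convert minutes to the millisecond value mapred.task.timeout expects."""
    return int(minutes * 60 * 1000)


def generic_options(props):
    """Render properties as Hadoop generic options: -D key=value."""
    args = []
    for key, value in sorted(props.items()):
        args += ["-D", f"{key}={value}"]
    return args


# Illustrative values only: a shorter timeout and a small failure allowance.
props = {
    "mapred.task.timeout": minutes_to_msec(2),   # assumed: 2 minutes
    "mapred.max.map.failures.percent": 5,        # assumed: tolerate 5% failed maps
}

if __name__ == "__main__":
    print(" ".join(generic_options(props)))
```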

Re: Maven build YARN ResourceManager only

2013-04-13 Thread Chris Nauroth
No problem! I think yarn-dev is appropriate, so I'm removing user (bcc'd one last time). The user list is focused on how to use Hadoop, and the *-dev lists are focused on how to develop Hadoop. What specific problem are you seeing when you try to compile hadoop-yarn-server-resourcemanager

Re: Mapper always hangs at the same spot

2013-04-13 Thread Chris Hokamp
When you say never progresses, do you see the MR framework kill it automatically after 10 minutes of inactivity or does it never ever exit? The latter -- it never exits. Killing it manually seems like a good option for now. We already have mapred.max.map.failures.percent set to a non-zero value,

Re: Why do some blocks refuse to replicate...?

2013-04-13 Thread Felix GV
Oops, I just peeked inside my drafts folder and saw this update I never sent out. Here goes: OK, well, rsyncing everything (including the whole subdirXX hierarchies) and restarting the destination DN worked. I'll definitely have to script it with something similar to what you suggested if I hit

Job cleanup

2013-04-13 Thread Robert Dyer
What does the job cleanup task do? My understanding was it just cleaned up any intermediate/temporary files and moved the reducer output to the output directory? Does it do more? One of my jobs runs, all maps and reduces finish, but then the job cleanup task never finishes. Instead it gets

Re: Mapper always hangs at the same spot

2013-04-13 Thread Edward Capriolo
Your application logic is likely stuck in a loop. On Sat, Apr 13, 2013 at 12:47 PM, Chris Hokamp chris.hok...@gmail.com wrote: When you say never progresses, do you see the MR framework kill it automatically after 10 minutes of inactivity or does it never ever exit? The latter -- it never

Few noob MR questions

2013-04-13 Thread Vjeran Marcinko
Hello, I am a complete Hadoop and MR newbie, so please help me with the following. I can see that the primary way to submit a Hadoop MR job is via the following command (wordcount example): hadoop jar wordcount.jar org.mycompany.WordCount 1. Although, looking at all MR examples out there, I

Re: R environment with Hadoop

2013-04-13 Thread Jens Scheidtmann
Dear Rahul, check out http://blog.revolutionanalytics.com/2012/03/r-and-hadoop-step-by-step-tutorials.html Also there is Introduction to Data Science on Coursera, https://www.coursera.org/course/datasci, which among other topics also covers Hadoop and R. Best regards, Jens

Re: Few noob MR questions

2013-04-13 Thread Jens Scheidtmann
Dear Vjeran, your own jobs should implement the Tool interface and be launched through ToolRunner. This gives additional standard options on the command line. Also have a look at class ProgramDriver as used here:

Re: Hadoop update mysql table

2013-04-13 Thread Jens Scheidtmann
Dear Linlin, it seems you have a SQL problem and not a Hadoop one. If you INSERT another row with the same primary key, you will get this error. Are you really using UPDATE for writing back? Also check if MERGE is more appropriate, see http://en.wikipedia.org/wiki/Merge_%28SQL%29 Best

Re: Hadoop update mysql table

2013-04-13 Thread Sékine Coulibaly
Linlin, A primary key means a uniqueness constraint is set on that column. You need to update that row, not insert it, unless you use INSERT ... ON DUPLICATE KEY UPDATE ... to perform your upserts. This sounds more like a MySQL issue. BR 2013/4/14 Jens Scheidtmann jens.scheidtm...@gmail.com
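The duplicate-key behaviour discussed in this thread can be reproduced in a small sketch. It uses Python's built-in SQLite as a stand-in for MySQL, so it does not use `INSERT ... ON DUPLICATE KEY UPDATE` (that syntax is MySQL-specific); instead it shows the plain update-then-insert pattern. The table name `results`, its columns, and the helper `upsert` are hypothetical.

```python
import sqlite3


def upsert(conn, key, value):
    """Update the row for key if it exists; insert it otherwise."""
    cur = conn.execute("UPDATE results SET value = ? WHERE id = ?", (value, key))
    if cur.rowcount == 0:
        # No existing row matched the key, so an INSERT is safe here.
        conn.execute("INSERT INTO results (id, value) VALUES (?, ?)", (key, value))


# In-memory database with a hypothetical table keyed on id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO results (id, value) VALUES (1, 'first')")

# A second INSERT with the same primary key violates the uniqueness constraint.
try:
    conn.execute("INSERT INTO results (id, value) VALUES (1, 'second')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Writing back through upsert() avoids the error for existing and new keys.
upsert(conn, 1, "second")
upsert(conn, 2, "new")
rows = conn.execute("SELECT id, value FROM results ORDER BY id").fetchall()
```

On MySQL itself, the single-statement form named in the message above is the more direct option, since the two-step pattern here is not atomic without a transaction.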

Re: Mapper always hangs at the same spot

2013-04-13 Thread Azuryy Yu
Agreed. Just check your app, or paste the map code here. --Send from my Sony mobile. On Apr 14, 2013 4:08 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Your application logic is likely stuck in a loop. On Sat, Apr 13, 2013 at 12:47 PM, Chris Hokamp chris.hok...@gmail.com wrote: When you say

Re: Mapper always hangs at the same spot

2013-04-13 Thread Chris Hokamp
The UDF and our Pig scripts work fine for most languages' wikidumps, and this hanging mapper issue only pops up with English wikidumps. It is certainly an issue with the wikiparser getting stuck in a recursive loop, and it must be a markup-related bug since this only happens with English. We're

Re: Few noob MR questions

2013-04-13 Thread Bjorn Jonsson
Correct, you can use java -jar to submit a job...with the driver code in a plain static main method. I do it all the time. You can of course run a Job straight from your IDE Java code also. You can check out the .runJar() method in the Hadoop API Javadoc to see what the hadoop command does

Re: Copy Vs DistCP

2013-04-13 Thread Ted Dunning
Lance, Never say never. Linux programs can read from the right kind of Hadoop cluster without using FUSE. On Fri, Apr 12, 2013 at 10:15 AM, Lance Norskog goks...@gmail.com wrote: Shell 'cp' only works if you use 'fuse', which makes the HDFS file system visible as a Unix mounted file

Best Hadoop dev environment [WAS: RE: Few noob MR questions]

2013-04-13 Thread Vjeran Marcinko
Hi again, You actually touched on what I'm trying to do here: set up the best Hadoop development environment. Moreover, don't ask me why, my development machine is on Windows, so I don't have Hadoop on it; I use a Linux virtual machine with Hadoop running in it, so I would like mostly to