Help with Hadoop Eclipse Plugin on Mac OS X Lion
Hello,

I am having problems getting my Hadoop Eclipse plugin to work on Mac OS X Lion.

I have tried the following combinations:
* Hadoop 0.20.203, Eclipse 3.6.2 (32-bit), hadoop-eclipse-plugin-0.20.203.0.jar
* Hadoop 0.20.203, Eclipse 3.6.2 (32-bit), hadoop-eclipse-plugin-0.20.3-SNAPSHOT.jar (from JIRA)
* Hadoop 0.20.203, Eclipse 3.7.1 (32-bit), hadoop-eclipse-plugin-0.20.203.0.jar
* Hadoop 0.20.203, Eclipse 3.7.1 (32-bit), hadoop-eclipse-plugin-0.20.3-SNAPSHOT.jar (from JIRA)
* Hadoop 0.20.205, Eclipse 3.7.1 (32-bit), hadoop-eclipse-plugin-0.20.205.0.jar

Has anyone gotten the Hadoop Eclipse plugin to work on Mac OS X Lion?

Thank you for your time and help; I greatly appreciate it!

Sincerely,
Will
RE: Help with Hadoop Eclipse Plugin on Mac OS X Lion
Oops, guess the formatting went away:

I have tried the following combinations:
* Hadoop 0.20.203, Eclipse 3.6.2 (32-bit), hadoop-eclipse-plugin-0.20.203.0.jar
* Hadoop 0.20.203, Eclipse 3.6.2 (32-bit), hadoop-eclipse-plugin-0.20.3-SNAPSHOT.jar (from JIRA)
* Hadoop 0.20.203, Eclipse 3.7.1 (32-bit), hadoop-eclipse-plugin-0.20.203.0.jar
* Hadoop 0.20.203, Eclipse 3.7.1 (32-bit), hadoop-eclipse-plugin-0.20.3-SNAPSHOT.jar (from JIRA)
* Hadoop 0.20.205, Eclipse 3.7.1 (32-bit), hadoop-eclipse-plugin-0.20.205.0.jar

> From: seventeen_reas...@hotmail.com
> To: common-user@hadoop.apache.org
> Subject: Help with Hadoop Eclipse Plugin on Mac OS X Lion
> Date: Fri, 2 Dec 2011 00:26:28 -0800
>
> Hello,
> I am having problems getting my hadoop eclipse plugin to work on Mac OS X
> Lion.
>
> I have tried the following combinations:
> Hadoop 0.20.203, Eclipse 3.6.2 (32-bit),
> hadoop-eclipse-plugin-0.20.203.0.jarHadoop 0.20.203, Eclipse 3.6.2 (32-bit),
> hadoop-eclipse-plugin-0.20.3-SNAPSHOT.jar (from JIRA)Hadoop 0.20.203, Eclipse
> 3.7.1 (32-bit), hadoop-eclipse-plugin-0.20.203.0.jarHadoop 0.20.203, Eclipse
> 3.7.1 (32-bit), hadoop-eclipse-plugin-0.20.3-SNAPSHOT.jar (from JIRA)Hadoop
> 0.20.205, Eclipse 3.7.1 (32-bit), hadoop-eclipse-plugin-0.20.205.0.jar
>
> Has anyone gotten the hadoop eclipse plugin to work on Mac OS X Lion?
>
> Thank you for your time and help I greatly appreciate it!
>
> Sincerely,
>
> Will
Are Hadoop 0.20.205 and Ganglia 3.1.7 compatible with each other?
Or do I have to apply a Hadoop patch for this?

Thanks,
Praveenesh
Re: Help with Hadoop Eclipse Plugin on Mac OS X Lion
Why do you need a plugin at all?

You can do away with it by having a Maven project, i.e. having a pom.xml and setting Hadoop as one of the dependencies. Then use regular Maven commands to build etc.; e.g. mvn eclipse:eclipse would be an interesting command.

On Fri, Dec 2, 2011 at 1:59 PM, Will L wrote:
> Oops guess the formatting went away:
> I have tried the following combinations:
> * Hadoop 0.20.203, Eclipse 3.6.2 (32-bit),
> hadoop-eclipse-plugin-0.20.203.0.jar
> * Hadoop 0.20.203, Eclipse 3.6.2 (32-bit),
> hadoop-eclipse-plugin-0.20.3-SNAPSHOT.jar (from JIRA)
> * Hadoop 0.20.203 Eclipse 3.7.1 (32-bit),
> hadoop-eclipse-plugin-0.20.203.0.jar
> * Hadoop 0.20.203, Eclipse 3.7.1 (32-bit),
> hadoop-eclipse-plugin-0.20.3-SNAPSHOT.jar (from JIRA)
> * Hadoop 0.20.205, Eclipse 3.7.1 (32-bit),
> hadoop-eclipse-plugin-0.20.205.0.jar
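The Maven route suggested in this message could look roughly like the sketch below, which writes a minimal pom.xml that Maven can then turn into Eclipse project files. The com.example group ID, the hadoop-dev artifact ID, the write_pom helper name, and the choice of hadoop-core 0.20.205.0 are assumptions for illustration, not something stated in the thread; adjust them to your own project and Hadoop release.

```shell
#!/bin/sh
# Sketch only. write_pom is a hypothetical helper; com.example / hadoop-dev are
# placeholder coordinates, and the hadoop-core version is an assumption that
# matches one of the releases named in this thread.
write_pom() {
    cat > "$1" <<'EOF'
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>hadoop-dev</artifactId>
  <version>0.1-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>0.20.205.0</version>
    </dependency>
  </dependencies>
</project>
EOF
}

# Assumed usage, run from an empty project directory with Maven installed:
#   write_pom pom.xml
#   mvn eclipse:eclipse    # generates .project and .classpath for Eclipse
```

The generated project can then be imported into Eclipse as an existing project, with the Hadoop jars resolved by Maven rather than by the plugin.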
RE: Help with Hadoop Eclipse Plugin on Mac OS X Lion
I got the setup working on my laptop running OS X Snow Leopard without any problems, and I would like to use my new laptop running OS X Lion.

The plugin is helpful in that I can see Hadoop output being dumped to the Eclipse console, and it used to integrate well with the Eclipse IDE, making my development life a little easier.

Thank you for your time and help.

Sincerely,
Will Lieu

> Date: Fri, 2 Dec 2011 21:44:36 +0530
> Subject: Re: Help with Hadoop Eclipse Plugin on Mac OS X Lion
> From: prashant.ii...@gmail.com
> To: common-user@hadoop.apache.org
>
> Why do you need a plugin at all?
>
> you can do away with it by having a maven project i.e. having a pom.xml and
> setting hadoop as one of the dependencies. Then use regular maven commands
> to build etc.. e.g. mvn eclipse:eclipse would be an interesting command.
Re: Help with Hadoop Eclipse Plugin on Mac OS X Lion
Nice to know, Will. Well, with the approach I described you have the same luxury as long as you are running in stand-alone mode, which is ideal for development.

On Fri, Dec 2, 2011 at 10:02 PM, Will L wrote:
> I got the setup working under my laptop running OS X Snow Leopard without
> any problems and I would like to use my new laptop running OS X Lion.
>
> The plugin is helpful in that I can see hadoop output being dumped to the
> eclipse console and it used to integrate well with the Eclipse IDE making my
> development life a little easier.
>
> Thank you for your time and help.
>
> Sincerely,
>
> Will Lieu
How do I programmatically get total job execution time?
After my Hadoop job has successfully completed I'd like to log the total amount of time it took. This is the "Finished in" statistic in the web UI. How do I get this number programmatically? Is there some way I can query the Job object? I didn't see anything in the API documentation.
Re: How do I programmatically get total job execution time?
On Fri, Dec 2, 2011 at 9:57 AM, W.P. McNeill wrote:
> After my Hadoop job has successfully completed I'd like to log the total
> amount of time it took. This is the "Finished in" statistic in the web UI.
> How do I get this number programmatically? Is there some way I can query
> the Job object? I didn't see anything in the API documentation.

This probably *doesn't* help you, but if you're using (or planning on using) Oozie, it has a RESTful API that can give you this information.

Thanks,
Tom
Re: Help with Hadoop Eclipse Plugin on Mac OS X Lion
I am running the Eclipse plugin on OS X Lion with Eclipse 3.7.

Take the plugin from the contrib folder and dump it into your Eclipse plugin library. If it doesn't work, remove Eclipse and reinstall a fresh version.

-Jignesh

On Dec 2, 2011, at 11:59 AM, Prashant Sharma wrote:
> nice to know Will, well the way i said you have the same luxury as far as
> you are running in stand-alone mode which is ideal for development.
Re: How do I programmatically get total job execution time?
I remember hitting this once in 0.20; it seems like an API limitation. The resolution we took back then was to get a list of all tasks and use the completion time of the last task to finish as the job's end time (sort and pick). There may be other ways, though; others can perhaps comment on that (metrics? job-history?).

On 02-Dec-2011, at 11:27 PM, W.P. McNeill wrote:
> After my Hadoop job has successfully completed I'd like to log the total
> amount of time it took. This is the "Finished in" statistic in the web UI.
> How do I get this number programmatically? Is there some way I can query
> the Job object? I didn't see anything in the API documentation.
Re: How do I programmatically get total job execution time?
As Harsh said, I don't think there is a simple way to find when the job ended, especially after the job is completed.

But can't you just wait for your job to complete and log the time when the job completed?

Raj

> From: Harsh J
> To: common-user@hadoop.apache.org
> Sent: Friday, December 2, 2011 12:53 PM
> Subject: Re: How do I programmatically get total job execution time?
>
> I remember hitting this once in 0.20 - seems like an API limitation. The
> resolution we took back then was to get a list of all tasks, and get the end
> time with the last ended task's completion time (sort and pick). There may be
> other ways though - others can comment on that perhaps (metrics? job-history?)
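Raj's suggestion of waiting for the job and logging the time yourself can also be applied outside the Java API, in the script that launches the job. The following is a minimal sketch under that assumption; time_command is a hypothetical helper name, and the jar and class names in the usage comment are placeholders. The measured wall-clock time covers the whole client command, including start-up and job submission, so it may not match the web UI's "Finished in" value exactly.

```shell
#!/bin/sh
# time_command is a hypothetical helper: run a command, report the elapsed
# wall-clock seconds on stderr, and preserve the command's exit status.
# Assumes a date(1) that supports +%s (GNU and BSD/macOS both do).
time_command() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed_seconds=$((end - start))" >&2
    return "$status"
}

# Assumed usage (my-job.jar and MyDriver are placeholders):
#   time_command bin/hadoop jar my-job.jar MyDriver input output
```

Writing the timing to stderr keeps it separate from anything the job client prints on stdout.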
RE: Hadoop-streaming using binary executable c program
Hi.

I was trying to run Hadoop streaming, and before that I checked with the following:

bin/hadoop fs -cat /user/yehdego/Hadoop-Data-New/RF00171_A.bpseqL3G1_seg_Optimized_Method.txt | head -2 | ./HADOOP

where HADOOP is a shell script:

#!/bin/sh
rm -f temp.txt;
while read line
do
  echo $line >> temp.txt;
done
exec /data/yehdego/hadoop-0.20.2/PKNOTSRG/src/bin/pknotsRG -k o -F temp.txt;

and it's working, but when I try running it on streaming using the following:

bin/hadoop jar /data/yehdego/hadoop-0.20.2/hadoop-0.20.2-streaming.jar -mapper ./HADOOP -file /data/yehdego/hadoop-0.20.2/HADOOP -file /data/yehdego/hadoop-0.20.2/PKNOTSRG/src/bin/pknotsRG -reducer ./ReduceLatest.py -file /data/yehdego/hadoop-0.20.2/ReduceLatest.py -input /user/yehdego/Hadoop-Data-New/RF00171_A.bpseqL3G1_seg_Optimized_Method.txt -output /user/yehdego/RF171_NEW/RF00171_A.bpseqL3G1_Optimized_Method40.txt -verbose

it failed with the following error:

PipeMapRed.waitOutputThreads(): subprocess failed with code 126
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:311)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:545)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:132)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

Any idea on this problem?

Regards,

Daniel T. Yehdego
Computational Science Program
University of Texas at El Paso, UTEP
dtyehd...@miners.utep.edu

> From: ev...@yahoo-inc.com
> To: common-user@hadoop.apache.org
> Date: Mon, 25 Jul 2011 14:47:34 -0700
> Subject: Re: Hadoop-streaming using binary executable c program
>
> This is likely to be slow and it is not ideal. The ideal would be to modify
> pknotsRG to be able to read from stdin, but that may not be possible.
>
> The shell script would probably look something like the following
>
> #!/bin/sh
> rm -f temp.txt;
> while read line
> do
> echo $line >> temp.txt;
> done
> exec pknotsRG temp.txt;
>
> Place it in a file say hadoopPknotsRG Then you probably want to run
>
> chmod +x hadoopPknotsRG
>
> After that you want to test it with
>
> hadoop fs -cat
> /user/yehdego/RNAData/RF00028_B.bpseqL3G5_seg_Centered_Method.txt | head -2 |
> ./hadoopPknotsRG
>
> If that works then you can try it with Hadoop streaming
>
> HADOOP_HOME$ bin/hadoop jar
> /data/yehdego/hadoop-0.20.2/hadoop-0.20.2-streaming.jar -mapper
> ./hadoopPknotsRG -file /data/yehdego/hadoop-0.20.2/pknotsRG -file
> /data/yehdego/hadoop-0.20.2/hadoopPknotsRG -input
> /user/yehdego/RF00028_B.bpseqL3G5_seg_Centered_Method.txt -output
> /user/yehdego/RF-out -reducer NONE -verbose
>
> --Bobby
>
> On 7/25/11 3:37 PM, "Daniel Yehdego" wrote:
>
> Good afternoon Bobby,
>
> Thanks, you gave me a great help in finding out what the problem was. After I
> put the command line you suggested me, I found out that there was a
> segmentation error.
> The binary executable program pknotsRG only reads a file with a sequence in
> it. This means, there should be a shell script, as you have said, that will
> take the data coming
> from stdin and write it to a temporary file. Any idea on how to do this job
> in shell script. The thing is I am from a biology background and don't have
> much experience in CS.
> looking forward to hear from you. Thanks so much.
>
> Regards,
>
> Daniel T. Yehdego
> Computational Science Program
> University of Texas at El Paso, UTEP
> dtyehd...@miners.utep.edu
>
> > From: ev...@yahoo-inc.com
> > To: common-user@hadoop.apache.org
> > Date: Fri, 22 Jul 2011 12:39:08 -0700
> > Subject: Re: Hadoop-streaming using binary executable c program
> >
> > I would suggest that you do the following to help you debug.
> >
> > hadoop fs -cat
> > /user/yehdego/RNAData/RF00028_B.bpseqL3G5_seg_Centered_Method.txt | head -2
> > | /data/yehdego/hadoop-0.20.2/pknotsRG-1.3/src/pknotsRG -
> >
> > This is simulating what hadoop streaming is doing. Here we are taking the
> > first 2 lines out of the input file and feeding them to the stdin of
> > pknotsRG. The first step is to make sure that you can get your program to
> > run correctly with something like this. You may need to change the command
> > line to pknotsRG to get it to read the data it is processing from stdin,
> > instead of from a file. Alternatively you may need to write a shell script
> > that will take the data coming from stdin. Write it to a file and then
> > call pknotsRG on that temporary file. Once you have this working then you
> > should try it again with streaming.
> >
> > --Bobby Evans
> >
> > On 7/22/11 12:31 PM, "Daniel Yehdego" wrote:
> >
> > Hi Bobby, Thanks f
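The wrapper approach discussed in this thread (buffer stdin into a temporary file, then run a file-based tool on it) can be sketched as below. By shell convention, exit status 126 means a command was found but could not be executed, typically because of permissions or an unusable interpreter line. One possible cause in the setup above, which the thread does not confirm, is that the wrapper calls pknotsRG through an absolute path that may not exist or be executable on the task nodes, whereas files shipped with -file are placed in the task's working directory. The run_on_stdin helper name is hypothetical, and the ./pknotsRG invocation in the usage comment is an assumption based on that reading.

```shell
#!/bin/sh
# run_on_stdin is a hypothetical helper, not part of Hadoop: it copies all of
# stdin into a temp file, runs the given command with that file appended as
# the last argument, cleans up, and returns the command's exit status.
run_on_stdin() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"        # unlike `echo $line`, this keeps whitespace intact
    "$@" "$tmp"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Assumed usage as the body of the streaming mapper script, with pknotsRG
# shipped via -file so that it sits in the task's working directory:
#   run_on_stdin ./pknotsRG -k o -F
```

Using mktemp instead of a fixed temp.txt also avoids two tasks on the same node overwriting each other's input if they happen to share a directory.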
RE: Help with Hadoop Eclipse Plugin on Mac OS X Lion
What version of Hadoop are you running on OS X Lion, and are you running the 32-bit or 64-bit version of Eclipse?

> Subject: Re: Help with Hadoop Eclipse Plugin on Mac OS X Lion
> From: jign...@websoft.com
> Date: Fri, 2 Dec 2011 14:37:28 -0500
> To: common-user@hadoop.apache.org
>
> I am running eclipse plugin in Lion OS X on eclipse 3.7.
>
> Take the plugin from contrib folder in dump to your eclipse plugin library.
> If doesn't work remove eclipse and reinstall a fresh version.
>
> -Jignesh
Re: How do I programmatically get total job execution time?
Hi,

I ran a job using the new MR API in stand-alone mode on 0.21. Both Job#getFinishTime and Job#getStartTime are returning 0. I'm not sure if this is a bug.

Thanks,
Praveen

On Sat, Dec 3, 2011 at 6:14 AM, Raj V wrote:
> As Harsh said, I don't think there is a simple way to way to find when the
> job ended, especially after the job is completed.
>
> But cant you just wait for your job to complete and log the time when the
> job completed?
>
> Raj
RE: Help with Hadoop Eclipse Plugin on Mac OS X Lion
I am using 64-bit Eclipse 3.7.1 Cocoa with Hadoop 0.20.205.0. I get the following error message:

An internal error occurred during: "Connecting to DFS localhost".
org/apache/commons/configuration/Configuration

> From: seventeen_reas...@hotmail.com
> To: common-user@hadoop.apache.org
> Subject: RE: Help with Hadoop Eclipse Plugin on Mac OS X Lion
> Date: Fri, 2 Dec 2011 20:51:02 -0800
>
> What version of Hadoop are you running on OS X Lion and are you running
> 32-bit or 64-bit version of Eclipse?
Re: Availability of Job traces or logs
Arun,

You can very well run synthetic workloads like large-scale sort, wordcount, etc., or more realistic workloads like PigMix (https://cwiki.apache.org/confluence/display/PIG/PigMix). On a decent enough cluster, these workloads work pretty well. Is there a specific reason why you want traces of varied sizes from various organizations?

> How can i make sure that the rumen generates only say 25 jobs,50 jobs or so

Do you want to get 25/50 jobs based on some filtering criterion? I recently faced a similar situation where I wanted to extract jobs from a Rumen trace based on job IDs. I will be happy to share these filtering tools.

Amar

On 12/1/11 8:48 AM, "ArunKumar" wrote:

Hi guys !
Apart from generating the job traces from RUMEN , can i get logs or job traces of varied sizes from some organizations.
How can i make sure that the rumen generates only say 25 jobs,50 jobs or so ?

Thanks,
Arun

--
View this message in context: http://lucene.472066.n3.nabble.com/Availability-of-Job-traces-or-logs-tp3550462p3550462.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
Re: Capturing Map and Reduce I/O time
Arun,

> I see that hadoop doesn't capture the Map task I/O time and Reduce task I/O
> time and captures only map runtime
> and reduce runtime. Am i right ?

For maps, the framework doesn't explicitly capture the read time. For reduces, shuffle time may be a good metric to start with.

> What does that runtime of Map and reduce tasks mean ?

It is the time to finish the entire map task (not the method). It includes data read, data processing, sort and spill.

> Which files do i need to look at and modify in Hadoop if i want to capture
> the map and reduce I/O time's ?

For the old codebase (pre-YARN), see MapTask.java and ReduceTask.java. Roughly, the map task is divided into 2 phases, i.e. map and sort. In the map phase, the read and the processing happen in parallel: while the user code processes the current key-value pair, the framework reads and caches the next key-value pair. Hence it's tough to distinguish between the read and process phases.

The reduce task is divided into 3 phases, i.e. shuffle, sort (final) and reduce. The shuffle phase has data copy (over the network) and sort (or rather, merge) happening in parallel. Once all the data has been copied, a final merge happens. This gets captured under the sort phase. Still, the shuffle phase time (recorded in the job history) is a good indicator of the time it takes to read the data off the network.

Amar

On 11/29/11 7:56 PM, "ArunKumar" wrote:

Hi guys !
I see that hadoop doesn't capture the Map task I/O time and Reduce task I/O time and captures only map runtime and reduce runtime. Am i right ?
By I/O time for map task i meant time taken by the map task to read the input chunk allocated to it for processing and the time for it to write the O/P data to the local disk.
By I/O time for Reduce task i meant time for reduce task to transfer map O/Ps to reduce task(shuffle phase) and writing reduce O/Ps to DFS.
> What does that runtime of Map and reduce tasks mean ?
Does it mean time taken to execute the Map method and reduce method respectively ?
(or)
Does it mean time taken from the start of the Map/Reduce task to the completion of the Map/Reduce task(i.e including time to read,sort ,compute map or reduce ,merge,etc.) ?
> Which files do i need to look at and modify in Hadoop if i want to capture
> the map and reduce I/O time's ?
> If i want to capture these values for few jobs of applications like
> wordcount,sort,etc. what is the best way to do ?
Can anyone guide me in this regard ?

Thanks,
Arun

--
View this message in context: http://lucene.472066.n3.nabble.com/Capturing-Map-and-Reduce-I-O-time-tp3545298p3545298.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.