Re: MRv1 with CDH4

2012-07-30 Thread Alejandro Abdelnur
Antonie, this is the Apache Hadoop alias for MapReduce; your question is specific to the CDH4 distribution of Apache Hadoop. You should use the Cloudera alias for such questions. I'll follow up with you in the Cloudera alias where you cross-posted this message. Thanks. On Mon, Jul 30, 2012 at 6:20 A

Re: Stop chained mapreduce.

2011-09-12 Thread Alejandro Abdelnur
Ilyal, the MR output file names follow the pattern part- and you'll have as many files as your job had reducers. Since you know the output directory, you could do a fs.listStatus() on it and check all the part-* files. Hope this helps. Thanks. Alejandro On Sun, Sep 11, 2011 at 4
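The check described in this reply can be sketched roughly as follows. This is a minimal illustration, not code from the thread: `PartFiles`, `isPartFile` and `select` are hypothetical names, and the sample listing is invented. In a real job the names would come from `fs.listStatus(outputDir)` by calling `getPath().getName()` on each returned `FileStatus`; the sketch works on plain strings so that it runs without Hadoop on the classpath.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Hadoop: picks the part-* files out of a
// listing of a job output directory.
class PartFiles {
    // True for task output files such as part-00000 or part-r-00000.
    static boolean isPartFile(String name) {
        return name.startsWith("part-");
    }

    // Returns the part-* names in sorted order; every other entry is dropped.
    static List<String> select(List<String> names) {
        return names.stream()
                .filter(PartFiles::isPartFile)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed sample listing; a real one would come from fs.listStatus().
        List<String> listing =
                Arrays.asList("_SUCCESS", "part-00001", "_logs", "part-00000");
        List<String> parts = select(listing);
        System.out.println(parts.size() + " part files: " + parts);
    }
}
```

Comparing the number of selected names with the number of reducers the job was configured with is one way to decide whether the next job in the chain has anything to consume.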

Re: Can you access Distributed cache in custom output format ?

2011-07-29 Thread Alejandro Abdelnur
cess it but for output format, I see it is not able to access >> it. >> >> -files copies cache file under: >> >> /user//.staging//files/ >> >> >> >> On Fri, Jul 29, 2011 at 11:14 AM, Alejandro Abdelnur >> wrote: >> >>> Mmm

Re: Can you access Distributed cache in custom output format ?

2011-07-29 Thread Alejandro Abdelnur
while ((line = reader.readLine()) != null) { > System.out.println("Now parsing the line: " + line); > > > } > } catch (Exception e) { > System.out.println("exception" + e.getMessage()); > > } > > On F

Re: Programming Multiple rounds of mapreduce

2011-06-13 Thread Alejandro Abdelnur
Thanks Matt, Arko, if you plan to use Oozie, you can have a simple coordinator job that does this; for example, the following schedules a WF every 5 minutes that consumes the output produced by the previous run (you just need to have the initial data). Thxs. Alejandro 1 ${

Re: Problems adding JARs to distributed classpath in Hadoop 0.20.2

2011-06-01 Thread Alejandro Abdelnur
31 May 2011 12:02:28 -0700, Alejandro Abdelnur > > > wrote: > >> What is exactly that does not work? > > In the hopes that more information can help, I've dug into the local > filesystems on each of my four nodes and retrieved the job.xml and the > locations of t

Re: Problems adding JARs to distributed classpath in Hadoop 0.20.2

2011-05-31 Thread Alejandro Abdelnur
What exactly is it that does not work? Oozie uses DistributedCache as the only mechanism to set classpaths for jobs and it works fine. Thanks. Alejandro On Mon, May 30, 2011 at 10:22 AM, John Armstrong wrote: > On Mon, 30 May 2011 09:43:14 -0700, Alejandro Abdelnur > wrote: > > If yo

Re: Problems adding JARs to distributed classpath in Hadoop 0.20.2

2011-05-30 Thread Alejandro Abdelnur
Armstrong wrote: > On Fri, 27 May 2011 15:47:23 -0700, Alejandro Abdelnur > wrote: > > John, > > > > If you are using Oozie, dropping all the JARs your MR jobs needs in the > > Oozie WF lib/ directory should suffice. Oozie will make sure all those > JARs > >

Re: Problems adding JARs to distributed classpath in Hadoop 0.20.2

2011-05-27 Thread Alejandro Abdelnur
John, If you are using Oozie, dropping all the JARs your MR jobs need in the Oozie WF lib/ directory should suffice. Oozie will make sure all those JARs are in the distributed cache. Alejandro On Thu, May 26, 2011 at 7:45 AM, John Armstrong wrote: > Hi, everybody. > > I'm running into some dif

Re: MultipleOutputFormat

2011-03-29 Thread Alejandro Abdelnur
itions on the fly. > > On Tue, Mar 29, 2011 at 8:56 PM, Alejandro Abdelnur > wrote: > > Dmitriy, > > Have you check the MultipleOutputs instead? It provides similar > > functionality. > > Alejandro > > > > On Wed, Mar 30, 2011 at 11:39 AM, Dmitriy Lyubimov

Re: MultipleOutputFormat

2011-03-29 Thread Alejandro Abdelnur
Dmitriy, Have you checked MultipleOutputs instead? It provides similar functionality. Alejandro On Wed, Mar 30, 2011 at 11:39 AM, Dmitriy Lyubimov wrote: > Hi, > I can't seem to be able to find either jira or implementation of > MultipleOutputFormat in new api in either 0.21 or 0.22 branches.

Re: running a job without user

2011-03-10 Thread Alejandro Abdelnur
Sagar, Set the following property in your JobConf (or using -D): mapred.reduce.tasks=0 That should do the trick. Alejandro On Fri, Mar 11, 2011 at 1:53 PM, Sagar Kohli wrote: > Hi , > > > > I am trying to run a job which does not require reducer, I commented out > the reduce

Re: how to use hadoop apis with cloudera distribution ?

2011-03-08 Thread Alejandro Abdelnur
If you write your code within a Maven project (which you can open from Eclipse), then you should add the following to your pom.xml: * Define Cloudera repository: ... cdh.repo https://repository.cloudera.com/content/groups/cloudera-repos Cloudera Repositories

Re: use DistributedCache to add many files to class path

2011-02-16 Thread Alejandro Abdelnur
Lei Liu, You have a cut-and-paste error: the second addition should use 'tairJarPath' but it is using 'jeJarPath'. Hope this helps. Alejandro On Thu, Feb 17, 2011 at 11:50 AM, lei liu wrote: > I use DistributedCache to add two files to class path, exampe below code > : >String jeJarPa

Re: Problem while installing Oozie

2011-01-20 Thread Alejandro Abdelnur
FYI, answered in the oozie-users@ alias: http://tech.groups.yahoo.com/group/Oozie-users/message/673 On Thu, Jan 20, 2011 at 7:15 PM, Giridhar Addepalli < giridhar.addepa...@komli.com> wrote: > Hi, > > > > I am using hadoop-0.20.2+228 version of hadoop. Want to use Oozie for > managing workflows.

Re: Too large class path for map reduce jobs

2010-10-07 Thread Alejandro Abdelnur
; Thanks, > Henning > > > On Thu, 2010-10-07 at 13:22 +0800, Alejandro Abdelnur wrote: > > [sent too soon] > > > > The first CP shown is how it is today the CP of a task. If we change it > pick up all the job JARs from the current dir, then the classpath will be &

Re: Too large class path for map reduce jobs

2010-10-06 Thread Alejandro Abdelnur
, Oct 7, 2010 at 1:02 PM, Alejandro Abdelnur wrote: > Fragmentation of Hadoop classpaths is another issue: hadoop should > differentiate the CP in 3: > > 1*client CP: what is needed to submit a job (only the nachos) > 2*server CP (JT/NN/TT/DD): what is need to run the cluster (the wh

Re: Too large class path for map reduce jobs

2010-10-06 Thread Alejandro Abdelnur
roposal 2. How would that remove > (e.g.) jetty libs from the job's classpath? > > Thanks, > Henning > > Am Mittwoch, den 06.10.2010, 18:28 +0800 schrieb Alejandro Abdelnur: > > 1. Classloader business can be done right. Actually it could be done as > spec-ed for

Re: Too large class path for map reduce jobs

2010-10-06 Thread Alejandro Abdelnur
1. Classloader business can be done right. Actually it could be done as spec-ed for servlet web-apps. 2. If the issue is strictly 'too large classpath', then a simpler solution would be to soft-link all JARs to the current directory and create the classpath with the JAR names only (no path). Note t
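The soft-link idea in point 2 could look roughly like the sketch below. It is an illustration under assumptions, not how Hadoop builds task classpaths: `ShortClasspath` and `linkAndBuild` are hypothetical names, the JARs in `main` are empty placeholder files, and the resulting classpath only resolves if the JVM is started with the link directory as its current directory.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: soft-links JARs into one directory and builds a
// classpath made of bare file names (no directory part).
class ShortClasspath {
    static String linkAndBuild(Path workDir, List<Path> jars) throws IOException {
        Set<String> names = new LinkedHashSet<>();
        for (Path jar : jars) {
            String name = jar.getFileName().toString();
            if (!names.add(name)) {
                continue; // same file name seen already: keep the first JAR only
            }
            Path link = workDir.resolve(name);
            if (!Files.exists(link, LinkOption.NOFOLLOW_LINKS)) {
                Files.createSymbolicLink(link, jar.toAbsolutePath());
            }
        }
        // Entries are relative, so they resolve against the JVM's working directory.
        return String.join(File.pathSeparator, names);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder directories and empty files standing in for real JARs.
        Path lib = Files.createTempDirectory("lib");
        Path work = Files.createTempDirectory("work");
        Path a = Files.createFile(lib.resolve("a.jar"));
        Path b = Files.createFile(lib.resolve("b.jar"));
        System.out.println(linkAndBuild(work, Arrays.asList(a, b)));
    }
}
```

Each entry shrinks from a full path to a file name, which is where the saving in classpath length comes from. The sketch resolves file-name collisions by keeping the first JAR, which is a design choice of this example only.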