Antonie,
This is the Apache Hadoop alias for MapReduce; your question is specific to
the CDH4 distribution of Apache Hadoop, so you should use the Cloudera alias
for such questions. I'll follow up with you in the Cloudera alias where you
cross-posted this message.
Thanks.
On Mon, Jul 30, 2012 at 6:20 A
Ilyal,
The MR output file names follow the pattern part-*, and you'll have as
many files as your job had reducers.
As you know the output directory, you could do a fs.listStatus() of the
output directory and check all the part-* files.
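On HDFS that check would go through fs.listStatus() as described above; purely for illustration, here is a self-contained sketch that applies the same part-* filter to a local directory (the directory and file names are made up):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class ListPartFiles {
    // Return the names of all part-* files in a job output directory,
    // the same filter you would apply to the FileStatus[] returned by
    // fs.listStatus() on HDFS.
    static String[] partFiles(File outputDir) {
        String[] names = outputDir.list((dir, name) -> name.startsWith("part-"));
        Arrays.sort(names); // listing order is not guaranteed
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Fake a local "output" directory with two reducer outputs.
        Path out = Files.createTempDirectory("job-output");
        Files.createFile(out.resolve("part-00000"));
        Files.createFile(out.resolve("part-00001"));
        Files.createFile(out.resolve("_SUCCESS")); // marker file, not data
        for (String n : partFiles(out.toFile())) {
            System.out.println(n);
        }
    }
}
```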
Hope this helps.
Thanks.
Alejandro
On Sun, Sep 11, 2011 at 4
cess it but for output format, I see it is not able to access
>> it.
>>
>> -files copies cache file under:
>>
>> /user//.staging//files/
>>
>>
>>
>> On Fri, Jul 29, 2011 at 11:14 AM, Alejandro Abdelnur
>> wrote:
>>
>>> Mmm
while ((line = reader.readLine()) != null) {
> System.out.println("Now parsing the line: " + line);
>
>
> }
> } catch (Exception e) {
> System.out.println("exception" + e.getMessage());
>
> }
>
> On F
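The quoted snippet above, completed into a self-contained form (the reader's source is a hypothetical in-memory string here; note the try-with-resources and the exception that is no longer swallowed silently):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class LineParser {
    public static void main(String[] args) {
        String input = "first\nsecond\nthird"; // stand-in for the real source
        // try-with-resources closes the reader even on failure.
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println("Now parsing the line: " + line);
            }
        } catch (IOException e) {
            // Report the failure instead of discarding it.
            System.out.println("exception: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
```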
Thanks Matt,
Arko, if you plan to use Oozie, you can have a simple coordinator job that
does that. For example, the following schedules a WF every 5 mins that
consumes the output produced by the previous run; you just have to have the
initial data.
Thxs.
Alejandro
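The coordinator example that followed did not survive the archive. As a rough sketch only, a coordinator-app that triggers a workflow every 5 minutes looks broadly like this (element names follow the Oozie coordinator schema; the name, dates, and `${workflowAppPath}` property are illustrative):

```xml
<coordinator-app name="every-5-min" frequency="5"
                 start="2012-01-01T00:00Z" end="2012-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
  <action>
    <workflow>
      <!-- HDFS path of the workflow application to run each tick -->
      <app-path>${workflowAppPath}</app-path>
    </workflow>
  </action>
</coordinator-app>
```

In this schema version the frequency is expressed in minutes; wiring the previous run's output in as input would additionally need dataset and data-in/data-out declarations, omitted here.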
31 May 2011 12:02:28 -0700, Alejandro Abdelnur
>
> > wrote:
> >> What is exactly that does not work?
>
> In the hopes that more information can help, I've dug into the local
> filesystems on each of my four nodes and retrieved the job.xml and the
> locations of t
What is exactly that does not work?
Oozie uses DistributedCache as the only mechanism to set classpaths for jobs,
and it works fine.
Thanks.
Alejandro
On Mon, May 30, 2011 at 10:22 AM, John Armstrong wrote:
> On Mon, 30 May 2011 09:43:14 -0700, Alejandro Abdelnur
> wrote:
> > If yo
Armstrong wrote:
> On Fri, 27 May 2011 15:47:23 -0700, Alejandro Abdelnur
> wrote:
> > John,
> >
> > If you are using Oozie, dropping all the JARs your MR jobs needs in the
> > Oozie WF lib/ directory should suffice. Oozie will make sure all those
> JARs
> >
John,
If you are using Oozie, dropping all the JARs your MR jobs needs in the
Oozie WF lib/ directory should suffice. Oozie will make sure all those JARs
are in the distributed cache.
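For concreteness, a hypothetical workflow application directory on HDFS would then look like this (all names invented):

```
hdfs://namenode/user/john/my-wf/
    workflow.xml
    lib/
        my-mr-job.jar
        some-dependency.jar
```

Everything under lib/ is picked up by Oozie and placed on the distributed cache for the actions in workflow.xml.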
Alejandro
On Thu, May 26, 2011 at 7:45 AM, John Armstrong wrote:
> Hi, everybody.
>
> I'm running into some dif
itions on the fly.
>
> On Tue, Mar 29, 2011 at 8:56 PM, Alejandro Abdelnur
> wrote:
> > Dmitriy,
> > Have you check the MultipleOutputs instead? It provides similar
> > functionality.
> > Alejandro
> >
> > On Wed, Mar 30, 2011 at 11:39 AM, Dmitriy Lyubimov
Dmitriy,
Have you checked MultipleOutputs instead? It provides similar
functionality.
Alejandro
On Wed, Mar 30, 2011 at 11:39 AM, Dmitriy Lyubimov wrote:
> Hi,
> I can't seem to be able to find either jira or implementation of
> MultipleOutputFormat in new api in either 0.21 or 0.22 branches.
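For reference, a hedged sketch of the MultipleOutputs suggestion with the new API (the "extra" named output and the key/value types are invented; the driver must register the named output via MultipleOutputs.addNamedOutput):

```java
import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// Sketch only: needs the Hadoop mapreduce API on the classpath.
public class MultiOutReducer extends Reducer<Text, Text, NullWritable, Text> {
    private MultipleOutputs<NullWritable, Text> mos;

    @Override
    protected void setup(Context ctx) {
        mos = new MultipleOutputs<NullWritable, Text>(ctx);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context ctx)
            throws IOException, InterruptedException {
        for (Text v : values) {
            mos.write("extra", NullWritable.get(), v); // write to named output
        }
    }

    @Override
    protected void cleanup(Context ctx)
            throws IOException, InterruptedException {
        mos.close(); // flush and close all named outputs
    }
}
```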
Sagar,
Set the following property in your JobConf (or using -D):

  mapred.reduce.tasks=0

That should do the trick.
Alejandro
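Either form works; as a sketch only (old mapred API, driver class and path setup are placeholders):

```java
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Sketch only: requires Hadoop on the classpath and a configured job.
public class MapOnlyJob {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(MapOnlyJob.class);
        conf.setNumReduceTasks(0); // same effect as -D mapred.reduce.tasks=0
        // ... set mapper class, input/output formats and paths here ...
        JobClient.runJob(conf); // map output goes straight to the output dir
    }
}
```

With zero reduce tasks the framework skips the sort/shuffle phase entirely and each mapper's output is written directly to the job output directory.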
On Fri, Mar 11, 2011 at 1:53 PM, Sagar Kohli wrote:
> Hi ,
>
>
>
> I am trying to run a job which does not require reducer, I commented out
> the reduce
If you write your code within a Maven project (which you can open from
Eclipse) then you should add the following to your pom.xml:
* Define the Cloudera repository:
...
  <repository>
    <id>cdh.repo</id>
    <url>https://repository.cloudera.com/content/groups/cloudera-repos</url>
    <name>Cloudera Repositories</name>
  </repository>
Lei Liu,
You have a cut&paste error: the second addition should use 'tairJarPath', but
it is using 'jeJarPath'.
Hope this helps.
Alejandro
On Thu, Feb 17, 2011 at 11:50 AM, lei liu wrote:
> I use DistributedCache to add two files to class path, exampe below code
> :
>String jeJarPa
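The fix, sketched (variable names from the quoted mail; the jar paths are placeholders):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;

// Sketch only: each addition must use its own path variable.
public class AddTwoJars {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String jeJarPath = "/lib/je.jar";     // placeholder path
        String tairJarPath = "/lib/tair.jar"; // placeholder path
        DistributedCache.addFileToClassPath(new Path(jeJarPath), conf);
        // The bug was passing jeJarPath again here:
        DistributedCache.addFileToClassPath(new Path(tairJarPath), conf);
    }
}
```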
FYI, answered in the oozie-users@ alias:
http://tech.groups.yahoo.com/group/Oozie-users/message/673
On Thu, Jan 20, 2011 at 7:15 PM, Giridhar Addepalli <
giridhar.addepa...@komli.com> wrote:
> Hi,
>
>
>
> I am using hadoop-0.20.2+228 version of hadoop. Want to use Oozie for
> managing workflows.
> Thanks,
> Henning
>
>
> On Thu, 2010-10-07 at 13:22 +0800, Alejandro Abdelnur wrote:
>
> [sent too soon]
>
>
>
> The first CP shown is how the CP of a task is today. If we change it to
> pick up all the job JARs from the current dir, then the classpath will be
, Oct 7, 2010 at 1:02 PM, Alejandro Abdelnur wrote:
> Fragmentation of Hadoop classpaths is another issue: hadoop should
> differentiate the CP in 3:
>
> 1. client CP: what is needed to submit a job (only the nachos)
> 2. server CP (JT/NN/TT/DD): what is needed to run the cluster (the wh
roposal 2. How would that remove
> (e.g.) jetty libs from the job's classpath?
>
> Thanks,
> Henning
>
> On Wednesday, 06.10.2010 at 18:28 +0800, Alejandro Abdelnur wrote:
>
> 1. Classloader business can be done right. Actually it could be done as
> spec-ed for
1. Classloader business can be done right. Actually it could be done as
spec-ed for servlet web-apps.
2. If the issue is strictly 'too large a classpath', then a simpler solution
would be to soft-link all JARs to the current directory and create the
classpath with the JAR names only (no path). Note t