10 at 9:25 PM, Mohamed Riadh Trad wrote:
> Dear All,
>
> I need to emit final key/value pairs from the Mapper's close() method,
> but I can't get at the context.
>
> Any suggestion?
>
> Regards.
--
Eric Sammer
twitter: esammer
data: www.cloudera.com
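In the new (org.apache.hadoop.mapreduce) API, cleanup(Context) is called
once after the last map() call and receives the same Context as map(), so
final key/values can be emitted there. A minimal sketch; the class and
field names are illustrative:

  import java.io.IOException;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  public class ClosingMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    // hypothetical state accumulated across map() calls
    private int runningTotal = 0;

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      runningTotal += value.getLength(); // stand-in for real per-record work
    }

    // cleanup() gets the Context, so final records go here
    @Override
    protected void cleanup(Context context)
        throws IOException, InterruptedException {
      context.write(new Text("total"), new IntWritable(runningTotal));
    }
  }

With the old (org.apache.hadoop.mapred) API, close() takes no arguments,
so the usual workaround is to stash the OutputCollector passed to map()
in a field and use it from close().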
> FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
> }
> String athString = otherArgs[otherArgs.length - 1];
> File out = new File(athString);
> if (out.exists()) {
> FileUtilities.expungeDirectory(out);
> out.delete();
> }
> Path outputDir = new Path(athString);
>
> FileOutputFormat.setOutputPath(job, outputDir);
>
> boolean ans = job.waitForCompletion(true);
> int ret = ans ? 0 : 1;
> System.exit(ret);
> }
> }
> --
> Steven M. Lewis PhD
> Institute for Systems Biology
> Seattle WA
>
--
Eric Sammer
twitter: esammer
data: www.cloudera.com
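One note on the quoted driver: the java.io.File-based cleanup only
touches the local filesystem, so it silently does nothing if the job
writes to HDFS. A minimal sketch of the same cleanup using the Hadoop
FileSystem API instead, assuming "conf" is the job's Configuration (the
helper name is illustrative):

  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class OutputDirs {
    // Deletes the job output directory on whatever filesystem the path
    // lives on (HDFS or local), unlike java.io.File.
    public static Path prepareOutputDir(String pathString, Configuration conf)
        throws IOException {
      Path outputDir = new Path(pathString);
      FileSystem fs = outputDir.getFileSystem(conf);
      if (fs.exists(outputDir)) {
        fs.delete(outputDir, true); // true = recursive
      }
      return outputDir;
    }
  }

The returned path can then be handed to FileOutputFormat.setOutputPath()
exactly as in the quoted code.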
> conf.set("mapred.reduce.tasks.speculative.execution", "false");
>
> What am I missing here?
>
> cheers
> --
> Torsten
>
--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
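For what it's worth, the quoted line only disables speculative execution
for reduce tasks; map attempts can still be speculated. A sketch covering
both phases with the old mapred API, using JobConf's typed setters
alongside the string-keyed properties (the class name is illustrative):

  import org.apache.hadoop.mapred.JobConf;

  public class NoSpeculation {
    public static void disableSpeculation(JobConf conf) {
      // typed setters, equivalent to the string-keyed properties below
      conf.setMapSpeculativeExecution(false);
      conf.setReduceSpeculativeExecution(false);
      // or, in the style of the quoted line:
      conf.set("mapred.map.tasks.speculative.execution", "false");
      conf.set("mapred.reduce.tasks.speculative.execution", "false");
    }
  }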
You can cap the max tasks, but that's cluster-wide, not per-host, so I
don't think that will be helpful. A better option is to pack more work
into each task in the "lighter" of your two jobs so the two jobs have
similar performance characteristics, if possible. Of course, easier
said than done, I know.
--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
> I want to run the MapReduce program from another Java program. I need
> some mechanism for submitting the job not from the command line; some
> other Java program should launch the job.
>
> Nishant Sonar
>
--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
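A minimal sketch of launching a job from plain Java with the new API.
MyMapper and MyReducer are placeholders, and the launching JVM needs the
cluster's *-site.xml files and the job jar on its classpath:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.Reducer;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  public class ProgrammaticLauncher {
    // placeholder classes; substitute your real mapper and reducer
    public static class MyMapper extends Mapper<Object, Text, Text, IntWritable> {}
    public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {}

    public static void main(String[] args) throws Exception {
      // picks up core-site.xml/mapred-site.xml from the classpath
      Configuration conf = new Configuration();
      Job job = new Job(conf, "launched-from-java");
      job.setJarByClass(ProgrammaticLauncher.class); // jar with mapper/reducer
      job.setMapperClass(MyMapper.class);
      job.setReducerClass(MyReducer.class);
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(IntWritable.class);
      FileInputFormat.addInputPath(job, new Path(args[0]));
      FileOutputFormat.setOutputPath(job, new Path(args[1]));
      // waitForCompletion() blocks; job.submit() returns immediately instead
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }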
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
> at org.apache.hadoop.mapred.Child.main(Child.java:194)
>
> I would like to debug this thread in an IDE but I don't know how to do it.
> Should I define properties to do this? Is there a way to do it?
>
> Thanks
>
> --
> PSC
>
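Two common approaches, sketched below with illustrative values: run the
whole job in-process via the LocalJobRunner so ordinary IDE breakpoints
work, or have the child JVMs wait for a remote debugger:

  import org.apache.hadoop.conf.Configuration;

  public class DebugConfig {
    public static Configuration localRunner() {
      Configuration conf = new Configuration();
      // Mapper, reducer, and framework all run in this one JVM,
      // so normal IDE breakpoints just work.
      conf.set("mapred.job.tracker", "local");
      conf.set("fs.default.name", "file:///");
      return conf;
    }

    public static Configuration remoteDebug(Configuration conf) {
      // On a real cluster, make each child JVM wait for a debugger on
      // port 8000; run a single task at a time to avoid port clashes.
      conf.set("mapred.child.java.opts",
          "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000");
      return conf;
    }
  }

The local runner is by far the easier of the two for stepping through
reduce() logic like the stack trace above.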
Even with a fast connection, the
failure semantics are very different. Without making Hadoop aware of
the multi-datacenter case, a failure of a router could easily lose all
replicas of a large number of blocks creating a huge hole in the data.
Again, it's about more than just performance here.
--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
> I don't mean private computers, all of them in different places, but
> rather a collection of datacenters, connected to each other over the
> Internet.
>
> Would that fail? If yes, how and why? What issues would arise?
>
--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
You may have to revert to the "old" APIs and use MTOF or MO as you've mentioned.
I believe CDH3 has (or will have) updated versions of MTOF and MO for the
new APIs, but don't quote me on that.
--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
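For reference, a sketch of MultipleOutputs with the old
(org.apache.hadoop.mapred) API; the named output "text" and the
key/value types here are illustrative:

  import java.io.IOException;
  import java.util.Iterator;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reducer;
  import org.apache.hadoop.mapred.Reporter;
  import org.apache.hadoop.mapred.lib.MultipleOutputs;

  public class MultiOutReducer extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {
    private MultipleOutputs mos;

    public void configure(JobConf conf) {
      // the named output must be registered in the driver first:
      // MultipleOutputs.addNamedOutput(conf, "text",
      //     TextOutputFormat.class, Text.class, IntWritable.class);
      mos = new MultipleOutputs(conf);
    }

    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      while (values.hasNext()) {
        // write to the named output instead of the default collector
        mos.getCollector("text", reporter).collect(key, values.next());
      }
    }

    public void close() throws IOException {
      mos.close(); // flushes the named outputs
    }
  }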
s into the specific
class names, etc.).
Hope this helps. If I've said anything wrong, I'm very happy to have
people correct me.
Regards.
--
Eric Sammer
e...@lifeless.net
http://esammer.blogspot.com
Is there a way to control this via Hadoop's configuration? If not, does anyone else feel
like there should be?
I completely understand the correct answer is to fix the hosts file or
not depend on it at all, deferring to DNS. But it does seem like this
bit of the code is overly complicated and brittle.
Thoughts?
Thanks.
--
Eric Sammer
I should probably correct myself and say that it depends on the application. In
general, the assumption made by the framework is that all reduce values
for a given key may not fit in memory. In specific implementations it
may be fine (or even necessary) for the user to do buffering like this.
Thanks, and sorry.
Buffering like this can impact performance and add the
requirement that all values for a given key fit in memory.
Hope this helps.
--
Eric Sammer
e...@lifeless.net
http://esammer.blogspot.com
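A sketch of the streaming style the framework assumes: one pass over the
iterator with constant per-key state, so the values for a key never need
to fit in memory (word-count-style types, for illustration):

  import java.io.IOException;
  import java.util.Iterator;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reducer;
  import org.apache.hadoop.mapred.Reporter;

  public class StreamingSumReducer extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException {
      // only a running sum is held, never the full value list
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }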
You can roll your own
start up scripts and invoke the underlying hadoop-daemon.sh scripts on
each node over whatever communication channel you'd like. You may have
to do a little environment setup first if you choose to go this route.
Take a look at the source of start-*.sh; they're pretty simple.
[from start-mapred.sh]
> "$bin"/hadoop-daemon.sh --config $HADOOP_CONF_DIR start jobtracker
> "$bin"/hadoop-daemons.sh --config $HADOOP_CONF_DIR start tasktracker
[1] - http://wiki.apache.org/hadoop/
[2] - http://www.cloudera.com/hadoop-training-mapreduce-hdfs
Hope this helps.
--
Eric Sammer
e...@lifeless.net
http://esammer.blogspot.com
This adds a dependency on Spring, which for me isn't a problem.
You can replace Spring with your DI framework of choice, of course, but
this pattern works well for me. Hope this helps!
Best regards.
--
Eric Sammer
e...@lifeless.net
http://esammer.blogspot.com