Give this a read, and let me know if there are still questions:
http://blog.cloudera.com/blog/2009/09/apache-hadoop-log-files-where-to-find-them-in-cdh-and-what-info-they-contain/
On Mon, Jan 28, 2013 at 3:56 AM, Dhanasekaran Anbalagan
bugcy...@gmail.com wrote:
Hi Guys,
How to understand
Informatica's take on the question:
http://www.informatica.com/hadoop/
My take on the question:
Hadoop is definitely disruptive, and there have been times when we've been
able to blow missed data-pipeline SLAs out of the water using Hadoop where
tools like Informatica were not able to.
On Wed, Jan 16, 2013 at 12:02 PM, Jeff Bean jwfb...@cloudera.com wrote:
Validate your scheduler capacity and behavior by using sleep jobs. Submit
sleep jobs to the pools that mirror your production jobs and just check
that the scheduler pool allocation behaves as you expect.
Does the same property hold for a streaming application (the reduces don't
receive data as long as it is produced by the map tasks)?
On 16 January 2013 05:41, Jeff Bean jwfb...@cloudera.com wrote:
the same property. The reduce method is not called until the mappers are done,
and the reducers are not scheduled before the threshold set by
mapred.reduce.slowstart.completed.maps is reached.
Validate your scheduler capacity and behavior by using sleep jobs. Submit
sleep jobs to the pools that mirror your production jobs and just check
that the scheduler pool allocation behaves as you expect. The nice thing
about sleep is that you can mimic your real jobs: numbers of tasks and how
long they run.
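A concrete way to do that (the jar name and pool name below are illustrative; on CDH3-era Hadoop the SleepJob ships in the test jar, and the pool property assumes the fair scheduler):

```shell
# Submit a sleep job into a specific fair-scheduler pool -- no input data needed.
# Jar path and pool name are illustrative; adjust for your install.
hadoop jar $HADOOP_HOME/hadoop-*-test.jar sleep \
  -D mapred.fairscheduler.pool=production_etl \
  -m 20 -r 5 \
  -mt 60000 -rt 60000   # 20 maps and 5 reduces, each sleeping 60 seconds
```

Watching the scheduler UI while a few of these run in parallel shows whether the pools share slots the way you intended.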
This is a Cloudera release level.
Specifically, it's CDH3, update 5. It means we've taken hadoop-0.20.2,
added 923 additional patches during the CDH3 beta, and applied 421 more
patches after the stable CDH3 release.
https://ccp.cloudera.com/display/DOC/CDH+Version+and+Packaging+Information
A dot (.)
Hi Pedro,
Yes, Hadoop Streaming has the same property. The reduce method is not
called until the mappers are done, and the reducers are not scheduled
before the threshold set by mapred.reduce.slowstart.completed.maps is
reached.
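For reference, that threshold is an ordinary job property set in mapred-site.xml or per job; the 0.90 below is just an illustrative choice (the default is 0.05):

```xml
<!-- mapred-site.xml: don't schedule reducers until 90% of maps are done.
     0.90 is an illustrative value; the default is 0.05. -->
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>0.90</value>
</property>
```

Raising it keeps reduce slots free for longer on a busy cluster, at the cost of starting the shuffle later.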
On Tue, Jan 15, 2013 at 3:06 PM, Pedro Sá da Costa
Hi Pedro,
Have you read the documentation on profiling MapReduce?
http://hadoop.apache.org/docs/r0.20.2/mapred_tutorial.html#Profiling
Jeff
On Sat, Dec 15, 2012 at 7:20 AM, Pedro Sá da Costa psdc1...@gmail.com wrote:
Hi
I want to attach JProfiler to Hadoop MapReduce (MR). Do I need to
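To follow up on the profiling link: the tutorial's hooks are plain job properties, so a fragment like the one below would attach an agent to a few map tasks. The JProfiler agentpath here is a placeholder assumption, not a known install location:

```xml
<!-- Illustrative per-job settings; the agentpath value is a placeholder. -->
<property>
  <name>mapred.task.profile</name>
  <value>true</value>
</property>
<property>
  <!-- profile only the first three map tasks -->
  <name>mapred.task.profile.maps</name>
  <value>0-2</value>
</property>
<property>
  <name>mapred.task.profile.params</name>
  <value>-agentpath:/opt/jprofiler/bin/linux-x64/libjprofilerti.so=port=8849</value>
</property>
```

If mapred.task.profile.params is left unset, the framework falls back to its built-in hprof profiling.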
Do setrep -w on the increase so it waits for the new replica before decreasing
again.
Of course, the little script only works if the replication factor is 3 on
all the files. If it varies, you should use the Java API to get the existing
factor, increase it by one, and then decrease it.
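A rough sketch of that Java-API approach (the class name is made up, the Hadoop client must be on the classpath, and error handling is omitted; it cannot run outside a cluster):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: for each file, read its current factor, raise it by one, wait
// until every block reports the extra replica, then restore the original.
public class ReReplicate {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    for (FileStatus st : fs.listStatus(new Path(args[0]))) {
      if (st.isDir()) continue;
      short orig = st.getReplication();
      fs.setReplication(st.getPath(), (short) (orig + 1));
      while (!fullyReplicated(fs, st, orig + 1)) {  // poor man's setrep -w
        Thread.sleep(1000);
      }
      fs.setReplication(st.getPath(), orig);
    }
  }

  static boolean fullyReplicated(FileSystem fs, FileStatus st, int want)
      throws Exception {
    for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
      if (b.getHosts().length < want) return false;
    }
    return true;
  }
}
```

Unlike `setrep -w`, `FileSystem.setReplication()` returns immediately, which is why the sketch polls block locations before lowering the factor again.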
Yes, this is true. The combiner may never run if intermediate values don't need
to spill to disk before the final output is done. Also, the combiner can't be
substituted for a reducer.
Sent from my iPad
On Jul 10, 2011, at 4:42, Florin P florinp...@yahoo.com wrote:
Hello!
I've read on
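The "combiner can't be substituted for a reducer" point can be shown with a toy example in plain Java (no Hadoop needed): an operation like sum is safe to pre-aggregate per map task, but an operation like average is not, which is why the framework is free to run the combiner zero or more times only when the operation tolerates it.

```java
import java.util.Arrays;
import java.util.List;

// Toy demonstration of why a combiner must tolerate being applied to
// partial data and can't blindly stand in for the reducer.
public class CombinerCaveat {
  static int sum(List<Integer> xs) {
    return xs.stream().mapToInt(Integer::intValue).sum();
  }

  static double avg(List<Double> xs) {
    return xs.stream().mapToDouble(Double::doubleValue).average().orElse(0);
  }

  public static void main(String[] args) {
    // Values for one key, split across two hypothetical map tasks.
    List<Integer> part1 = Arrays.asList(1, 2, 3);
    List<Integer> part2 = Arrays.asList(10);

    // Sum as combiner: reducing the partial sums gives the right answer.
    int combined = sum(Arrays.asList(sum(part1), sum(part2)));
    int direct = sum(Arrays.asList(1, 2, 3, 10));
    System.out.println(combined + " == " + direct); // 16 == 16

    // Average as combiner: averaging the partial averages is wrong.
    double combinedAvg = avg(Arrays.asList(2.0, 10.0)); // avg of per-part avgs
    double directAvg = avg(Arrays.asList(1.0, 2.0, 3.0, 10.0));
    System.out.println(combinedAvg + " != " + directAvg); // 6.0 != 4.0
  }
}
```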
It's part of the design that reduce() does not get called until the map
phase is complete. You're seeing reduce report as started when map is at 90%
complete because Hadoop is shuffling data from the mappers that have
completed. As currently designed, you can't prematurely start reduce(),
because a reducer must see every value for its key, and any still-running
map task could emit more.
Hi Kim,
I saw this problem once, turned out the block was getting deleted before it
was read. Check namenode for blk_-7027776556206952935_61338. What's the
story there?
Jeff
On Thu, Nov 18, 2010 at 12:45 PM, Kim Vogt k...@simplegeo.com wrote:
Hi,
I'm using the MapFileOutputFormat to lookup
Is the tab the delimiter between records or between keys and values on the
input?
in other words does the input file look like this:
a\tb
b\tc
c\ta
or does it look like this:
a b\tb c\tc a\t
?
Jeff
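For what it's worth, streaming and KeyValueTextInputFormat both split a record at the first tab: key on the left, everything else (further tabs included) as the value. A tiny plain-Java illustration of the difference between the two layouts (no Hadoop required):

```java
import java.util.Arrays;

// Mimics the first-tab key/value split Hadoop applies to text records.
public class TabKeyValue {
  static String[] keyValue(String record) {
    int i = record.indexOf('\t');
    if (i < 0) return new String[] { record, "" }; // no tab: whole line is the key
    return new String[] { record.substring(0, i), record.substring(i + 1) };
  }

  public static void main(String[] args) {
    // Layout 1 -- one pair per line: "a\tb" gives key "a", value "b".
    System.out.println(Arrays.toString(keyValue("a\tb")));
    // Layout 2 -- everything on one line: the key is "a b" and the rest,
    // tabs included, becomes a single value.
    System.out.println(Arrays.toString(keyValue("a b\tb c\tc a\t")));
  }
}
```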
On Thu, Jul 15, 2010 at 6:18 PM, Nikolay Korovaiko korovai...@gmail.com wrote:
Hi
Hi Michael,
Why did you determine that Hadoop streaming was insufficient for you?
Jeff
On Mon, May 31, 2010 at 9:17 AM, Michael Robinson
hadoopmich...@gmail.com wrote:
Hi Jeff,
I have a C program that processes very large data files which are
compressed, so this program has to have full
Hi Michael,
How come you can't specify the C program as the mapper in streaming and just
have no reducers?
Jeff
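A sketch of that invocation (the jar location and HDFS paths are illustrative; the C binary just reads stdin and writes stdout):

```shell
# Map-only streaming job: zero reduces means the mapper output is the final output.
# Jar path and input/output paths are illustrative.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
  -D mapred.reduce.tasks=0 \
  -input /data/in \
  -output /data/out \
  -mapper ./my_c_program \
  -file my_c_program   # ship the compiled binary to the task nodes
```

With zero reduce tasks there is no shuffle or sort at all, so each map task writes its output straight to HDFS.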
On Sat, May 29, 2010 at 6:14 PM, Michael Robinson
hadoopmich...@gmail.com wrote:
Thanks for your answers.
I have read hadoop streaming and I think it is great, however what I am