Thanks for the response. What I meant by a uniform view is that I would be
able to avoid having to reference each individual part-r- file. It
wasn't immediately clear to me that the directory could be the input path.
That tells me, then, that the problem(s) must be somewhere in my MR code. Thanks!
On Wed, M
Hello,
On Thu, Mar 3, 2011 at 7:51 AM, John Sanda wrote:
> The output path created from the first job is a directory, and it is the file
> in that directory with a name like part-r- that I want to feed as
> input into the second job. I am running in pseudo-distributed mode, so I know
> that t
Hi, I am new to Hadoop, so maybe I am missing something obvious. I have
written a small MapReduce program that runs two jobs. I want the output of
the first job to serve as the input to the second job. Here is what my
driver code looks like:
public int run(String[] args) throws Exception {
Con
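(For context, the pattern being asked about is usually written roughly as below.
This is a minimal sketch against the org.apache.hadoop.mapreduce API, not the
poster's actual driver: FirstMapper, FirstReducer, SecondMapper, SecondReducer,
and the argument layout are placeholder names. On newer Hadoop,
Job.getInstance(conf, name) replaces the Job constructor. Assumes imports of
org.apache.hadoop.conf.Configuration, org.apache.hadoop.fs.Path,
org.apache.hadoop.io.*, org.apache.hadoop.mapreduce.Job, and
org.apache.hadoop.mapreduce.lib.input/output.FileInputFormat/FileOutputFormat.)

public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    Path input = new Path(args[0]);
    Path intermediate = new Path(args[1]);  // written by job 1, read by job 2
    Path output = new Path(args[2]);

    Job first = new Job(conf, "first job");
    first.setJarByClass(getClass());
    first.setMapperClass(FirstMapper.class);
    first.setReducerClass(FirstReducer.class);
    first.setOutputKeyClass(Text.class);
    first.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(first, input);
    FileOutputFormat.setOutputPath(first, intermediate);
    if (!first.waitForCompletion(true)) {
        return 1;  // stop if the first job fails
    }

    Job second = new Job(conf, "second job");
    second.setJarByClass(getClass());
    second.setMapperClass(SecondMapper.class);
    second.setReducerClass(SecondReducer.class);
    second.setOutputKeyClass(Text.class);
    second.setOutputValueClass(IntWritable.class);
    // Pass the whole output directory of job 1; Hadoop picks up every
    // part-r-* file inside it, so no individual file needs to be named.
    FileInputFormat.addInputPath(second, intermediate);
    FileOutputFormat.setOutputPath(second, output);
    return second.waitForCompletion(true) ? 0 : 1;
}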
Just tried to launch the same job from a secondary machine. The speed was
really fast. It must be some kind of environment or configuration
problem on the primary launching machine. Does anyone have an idea of what could
cause it to take that long for every map task that is loading the avro
forma
Hello experts,
I have recently been testing a set of logs that I converted to Avro format in
Hadoop. I am noticing really, really slow performance compared to the raw
logs. The map logs shown below seem to indicate that setting up the JVM took
the longest time. I am wondering if there is anything I can tweak in
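(If per-task JVM startup really is the dominant cost, one commonly cited knob
in the classic MR1 framework is task JVM reuse. A minimal sketch, assuming
MR1 and that the property applies to this cluster:)

// Assumption: Hadoop MR1 (JobTracker/TaskTracker). Reuse one JVM for all
// tasks of a job instead of forking a new JVM per map task; the default
// value 1 forks a fresh JVM for every task, and -1 means no limit.
Configuration conf = getConf();
conf.setInt("mapred.job.reuse.jvm.num.tasks", -1);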
For MapReduce MST, see Section 5 in [1].
[1] H. Karloff, S. Suri, and S. Vassilvitskii, "A Model of Computation for
MapReduce," SODA 2010. Available at http://research.yahoo.com/pub/2945
Nicholas
From: sumit ghosh
To: mapreduce-user@hadoop.apache.org
Sent: Tue, Mar