Thanks Todd.
Unfortunately, I'm using Cascading on top of Hadoop, so I'm not sure
there's an easy mechanism to force the local jobs it fires off to use a
different configuration. I'll talk to the Cascading folks and find out.
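In the meantime, if I understand right, Cascading (1.x) copies whatever
Properties you hand the FlowConnector into the JobConf of the jobs it
launches, so I may try overriding mapred.local.dir that way. An untested
sketch (the /tmp path is just an example):

import java.util.Properties;

import cascading.flow.Flow;
import cascading.flow.FlowConnector;
import cascading.pipe.Pipe;
import cascading.tap.Tap;

public class LocalDirOverride {
  // Untested: these Properties end up in the JobConf of every job the
  // Flow launches, which may redirect the LocalJobRunner's scratch space
  // away from the TaskTracker's mapred.local.dir.
  public static Flow connectWithLocalDir(Tap source, Tap sink, Pipe assembly) {
    Properties properties = new Properties();
    properties.setProperty("mapred.local.dir",
        "/tmp/" + System.getProperty("user.name") + "/mapred-local");
    return new FlowConnector(properties).connect(source, sink, assembly);
  }
}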
J
Quoting Todd Lipcon:
Hi Jeremy,
That's a good point - we don't currently do a good job of segregating the
configurations used for the LJR from the configs used for the TaskTracker.
In particular I think both mapred.local.dir and mapred.system.dir are used
by both.
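As a stopgap, you can usually override those paths on the job's own
Configuration before submitting, so a job forced onto the LJR stays out of
the TaskTracker's directories. A rough sketch (untested; the /tmp paths are
just examples):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class LocalJobConf {
  public static Job makeLocalJob() throws Exception {
    Configuration conf = new Configuration();
    conf.set("mapred.job.tracker", "local"); // force the LocalJobRunner
    // Keep the local job's scratch space out of the TaskTracker's dirs.
    String user = System.getProperty("user.name");
    conf.set("mapred.local.dir", "/tmp/" + user + "/mapred-local");
    conf.set("mapred.system.dir", "/tmp/" + user + "/mapred-system");
    return new Job(conf, "local-mode job");
  }
}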
You run into the same issue when trying to use LJR on ...
Hi,
I'm running Hadoop (Cloudera release 3) in pseudo-distributed mode,
with the Linux task controller so that jobs run as the user who
submitted them.
My program (which uses Cascading) fires off a job using
LocalJobRunner (I think to read data from the local filesystem). So ...
You need to add a call to MultipleOutputs.close() in your reducer's cleanup():

@Override
protected void cleanup(Context context) throws IOException, InterruptedException {
  mos.close();
  ...
}
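For context, here's a minimal sketch of the whole lifecycle (the types and
the key-based file naming are just placeholders): create the MultipleOutputs
in setup(), write through it in reduce(), and close it in cleanup() so the
underlying record writers get flushed:

import java.io.IOException;

import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class SplitReducer extends Reducer<Text, Text, NullWritable, Text> {
  private MultipleOutputs<NullWritable, Text> mos;

  @Override
  protected void setup(Context context) {
    mos = new MultipleOutputs<NullWritable, Text>(context);
  }

  @Override
  protected void reduce(Text key, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {
    for (Text value : values) {
      // Route each record to an output file named after the key.
      mos.write(NullWritable.get(), value, key.toString());
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException, InterruptedException {
    mos.close(); // without this, the output files can come out empty
  }
}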
On Fri, May 6, 2011 at 1:55 PM, Geoffry Roberts wrote:
> All,
>
> I am attempting to take a large file and split it up into a series of
> smaller files ...
Steve,
Yes, the object is known at start-up and is read-only. The mappers don't
touch it. I considered serializing it to a string, but I was wondering if
there wasn't a more excellent way.
Thanks
On 6 May 2011 10:55, Steve Lewis wrote:
> If possible serialize the object as XML then add it as a set of lines to the config ...
On 05/06/2011 01:12 PM, Geoffry Roberts wrote:
All,
I need each one of my reducers to have read access to a certain object,
or a clone thereof. I can instantiate this object at start-up. How can I
give my reducers a copy?
Serialize it to a string, set it as a configuration setting on the job.
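A rough sketch of that (the key name "my.shared.object" is illustrative).
Driver side, before submitting:

Configuration conf = job.getConfiguration();
conf.set("my.shared.object", mySerializedString);

Reducer side, rebuilding the object once per task:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SharedObjectReducer extends Reducer<Text, Text, Text, Text> {
  private String serialized;

  @Override
  protected void setup(Context context) {
    serialized = context.getConfiguration().get("my.shared.object");
    // ... deserialize 'serialized' back into the shared object here ...
  }
}

If the object happens to be a Writable, org.apache.hadoop.io.DefaultStringifier's
store()/load() methods will do the string encoding for you.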
All,
I am attempting to take a large file and split it up into a series of
smaller files. I want the smaller files to be named based on values taken
from the large file. I am using
org.apache.hadoop.mapreduce.lib.output.MultipleOutputs to do this.
The job runs without error and produces a set of ...
If possible, serialize the object as XML and then add it as a set of lines
to the config. Alternatively, serialize it (maybe as XML) to a known spot in
HDFS and read it in from the setup code in the reducer. I assume this is an
object known at the start of the job and not modified by the mappers.
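A rough sketch of the read side, assuming the driver has already written the
serialized object to an agreed HDFS path and passed that path in the job
Configuration ("shared.object.path" is an illustrative key):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class HdfsBackedReducer extends Reducer<Text, Text, Text, Text> {
  @Override
  protected void setup(Context context) throws IOException {
    Configuration conf = context.getConfiguration();
    Path path = new Path(conf.get("shared.object.path"));
    FileSystem fs = path.getFileSystem(conf);
    FSDataInputStream in = fs.open(path);
    try {
      // ... parse the XML from 'in' into the shared object ...
    } finally {
      IOUtils.closeStream(in);
    }
  }
}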
All,
I need each one of my reducers to have read access to a certain object,
or a clone thereof. I can instantiate this object at start-up. How can I
give my reducers a copy?
--
Geoffry Roberts