On 08/09/2010 09:14 PM, Harsh J wrote:
Another solution would be to create a custom named output using
mapred.lib.MultipleOutputs and collecting to that instead of the
job-set output format (which one can set to NullOutputFormat so it
doesn't complain about existing paths, etc.).
So if you'd want 'foo' prefix to your 0-N number
On Aug 9, 2010, at 1:27 PM, Pedro Costa wrote:
>
> 2 - If I'm deducing correctly, the reduce will always fetch 10 bytes
> less than the saved map output?
Why do you care?
On 08/09/2010 05:45 PM, David Rosenstrauch wrote:
On 08/09/2010 04:01 PM, David Rosenstrauch wrote:
On a similar note, it looks like if I want to customize the name/path of
the generated SequenceFile my only option currently is to override
FileOutputFormat.getDefaultWorkFile().
a) Again, have I got this correct, or am I overlooking something?
b
Hi,
1 - I'm trying to compare the size of one map output on the map side and
on the reduce side. So I made some code modifications in MR to see
what happens when the map saves map outputs and the reduce fetches
them, and I've noticed that the map output fetched by the reducer is
10 bytes smaller than t
On 08/07/2010 02:06 AM, Harsh J wrote:
On Sat, Aug 7, 2010 at 11:20 AM, David Rosenstrauch wrote:
I'm using a SequenceFileOutputFormat. But I'd like to be able to set some
SequenceFile.Metadata on the SequenceFile.Writer that's getting created.
Doesn't look like there's any easy way to do th
Tnx much for the info, and the additional tips.
Unfortunately we're doing a lot of transforming of the DB data as we're
bringing it into Hadoop, so I don't think Sqoop's an option.
Thanks again,
DR
On 08/06/2010 12:50 AM, Aaron Kimball wrote:
The InputFormat instantiates a RecordReader (DBR
Not sure if the devs would want to implement a change like this,
but it couldn't hurt to at least file it and make them aware.
Done: https://issues.apache.org/jira/browse/MAPREDUCE-2001
Thanks,
DR
On 08/09/2010 12:16 PM, Harsh J wrote:
You may also propose to extend the existing SFOP to allow this on JIRA
or the dev mailing list :)
On Mon, Aug 9, 2010 at 8:09 PM, David Rosenstrauch wrote:
> On 08/07/2010 02:06 AM, Harsh J wrote:
>>
>> On Sat, Aug 7, 2010 at 11:20 AM, David Rosenstrauch
>> wrote:
>>>
>>> I'm using a SequenceFil
Hi,
1 - I would like to compare programmatically the map output and the
reduce input to see if they're equal in MR. So, I'm trying to compute a
hash of the output generated by the map and of the input on the
reduce side and compare them. The problem is that I'm hashing
the whole file and not t
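
For the comparison described above, one possible approach is to hash only a chosen byte range of a file rather than the whole file, and compare that digest with a digest of the bytes seen on the other side. The following is a minimal sketch in plain Java under stated assumptions: the class and method names (`SegmentHash`, `hashRange`, `hashBytes`) are hypothetical and not part of Hadoop, MD5 is an arbitrary choice of digest, and the offset and length of the range are assumed to be known already. How to locate that range inside an actual map output file is not covered here.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not a Hadoop class.
public class SegmentHash {

    // Digest `length` bytes of the file at `path`, starting at `offset`.
    // Stops early if the file ends before the requested range does.
    static byte[] hashRange(String path, long offset, long length)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (RandomAccessFile in = new RandomAccessFile(path, "r")) {
            in.seek(offset);
            byte[] buf = new byte[8192];
            long remaining = length;
            while (remaining > 0) {
                int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
                if (n < 0) {
                    break; // end of file reached before the range was exhausted
                }
                md.update(buf, 0, n);
                remaining -= n;
            }
        }
        return md.digest();
    }

    // Digest `len` bytes of an in-memory buffer, starting at index `off`.
    static byte[] hashBytes(byte[] data, int off, int len)
            throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update(data, off, len);
        return md.digest();
    }
}
```

The two digests can then be compared with `java.util.Arrays.equals`. If both sides hash exactly the same bytes, the digests match; any extra or missing bytes on either side will make them differ.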