delimiter?
Terrence A. Pietrondi
--- On Sun, 10/5/08, Alex Loddengaard [EMAIL PROTECTED] wrote:
From: Alex Loddengaard [EMAIL PROTECTED]
Subject: Re: architecture diagram
To: core-user@hadoop.apache.org
Date: Sunday, October 5, 2008, 9:26 PM
Let's say you have one very large input file
) is the map key, and the map value is
the field contents. How is this incorrect? I think this follows your earlier
suggestion of:
You may want to play with the following idea: collect key = column_number and
value = column_contents in your map step.
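
To make the quoted suggestion concrete, here is a minimal sketch in plain Python, not the Hadoop API. It assumes comma-delimited rows, and the names `map_row` and `group_by_key` are hypothetical helpers invented for illustration. The grouping function only stands in for the shuffle; in a real job the order in which values arrive for a key is not guaranteed unless you arrange for it.

```python
from collections import defaultdict


def map_row(line, delimiter=","):
    """Emit (column_number, field_contents) for each field in one input row."""
    for column_number, field in enumerate(line.split(delimiter)):
        yield column_number, field


def group_by_key(pairs):
    """Stand-in for the shuffle: collect every value under its key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# Hypothetical three-row input, one string per row.
rows = ["a,b,c", "d,e,f", "g,h,i"]

# Map step: every row contributes one pair per column.
pairs = [pair for row in rows for pair in map_row(row)]

# After grouping, each key holds the full contents of one column.
columns = group_by_key(pairs)
print(columns)
```

Each reduce call would then see one column number together with all of that column's field contents.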
Terrence A. Pietrondi
--- On Mon, 10/6/08, Alex
shuffled back into their originating positions in the column.
Once again, sorry for the typos and confusion.
Terrence A. Pietrondi
--- On Fri, 10/3/08, Alex Loddengaard [EMAIL PROTECTED] wrote:
From: Alex Loddengaard [EMAIL PROTECTED]
Subject: Re: architecture diagram
To: core-user
I am sorry for the confusion. I meant distributed data.
So help me out here. For example, if I am reducing to a single file, then my
main transformation logic would be in my mapping step since I am reducing away
from the data?
Terrence A. Pietrondi
http://del.icio.us/tepietrondi
--- On Wed
computation areas or is the reduce the major computation area?
Thanks.
Terrence A. Pietrondi
So, to be distributed in a sense, you would want to do your computation on the
disconnected parts of the data in the map phase, I would guess?
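
A rough sketch of that idea in plain Python rather than Hadoop code: each split is transformed on its own in the map phase, and a single reduce only merges the results into one output. The uppercase transformation, the integer record positions used as keys, and the function names are all assumptions made up for illustration.

```python
def map_split(split):
    """Transform the records of one split without looking at any other split."""
    return [(position, line.upper()) for position, line in split]


def reduce_to_single_output(mapped_splits):
    """Single reduce: merge the already-transformed records, ordered by key."""
    merged = [pair for part in mapped_splits for pair in part]
    return [value for _, value in sorted(merged)]


# Two hypothetical splits of one input; each record is (position, line).
splits = [[(0, "alpha"), (1, "beta")], [(2, "gamma")]]

# The map calls are independent, so they could run on different machines.
mapped = [map_split(split) for split in splits]

# The reduce does no real computation here, it only gathers the output.
output = reduce_to_single_output(mapped)
print(output)
```

In this arrangement the work that benefits from distribution sits in the map step, and the reduce is little more than a collection point.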
Terrence A. Pietrondi
http://del.icio.us/tepietrondi
--- On Wed, 10/1/08, Arun C Murthy [EMAIL PROTECTED] wrote:
From: Arun C Murthy [EMAIL PROTECTED