I think I can figure this out now and get it to work. I will check back in if I
get it working. The only piece still missing is in my pivot-back mapping step.
Thanks for the help.
Terrence A. Pietrondi
--- On Tue, 10/7/08, Alex Loddengaard <[EMAIL PROTECTED]> wrote:
> From: Alex Lo
the map key, and the map value is
the field contents. How is this incorrect? I think this follows your earlier
suggestion of:
"You may want to play with the following idea: collect key => column_number and
value => column_contents in your map step."
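
A minimal sketch of the quoted suggestion, in plain Python rather than the Hadoop API. The names map_row and group_by_key are hypothetical, and the "|" delimiter is assumed from the example data elsewhere in this thread. It only shows the shape of the key/value pairs: the map step emits the column number as the key and the field contents as the value, and grouping by key then stands in for what the framework hands to a reducer.

```python
def map_row(line, delimiter="|"):
    """Yield (column_number, column_contents) pairs for one input row.

    Hypothetical helper: the column index is the key, the field is the value.
    """
    fields = line.rstrip("\n").split(delimiter)
    for column_number, contents in enumerate(fields):
        yield column_number, contents


def group_by_key(pairs):
    """Collect values per key, standing in for the framework's grouping step."""
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return grouped


if __name__ == "__main__":
    rows = ["A|B|C", "D|E|G"]
    pairs = [pair for row in rows for pair in map_row(row)]
    # Each key is a column number; each value list is that column's contents.
    print(group_by_key(pairs))
```

Grouping on the column number is what turns rows into columns, so the pivot falls out of the choice of key rather than needing a separate step. Note that a real job gives no guarantee about the order of values within a key, so restoring the original row order would need extra bookkeeping.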
Terrence A. Pietrondi
---
lit
on my delimiter?
Terrence A. Pietrondi
--- On Sun, 10/5/08, Alex Loddengaard <[EMAIL PROTECTED]> wrote:
> From: Alex Loddengaard <[EMAIL PROTECTED]>
> Subject: Re: architecture diagram
> To: core-user@hadoop.apache.org
> Date: Sunday, October 5, 2008, 9:26 PM
> Le
okenized.
Does this mean in this example the row tokens may not be the complete row?
Thanks.
Terrence A. Pietrondi
--- On Fri, 10/3/08, Alex Loddengaard <[EMAIL PROTECTED]> wrote:
> From: Alex Loddengaard <[EMAIL PROTECTED]>
> Subject: Re: architecture diagram
> To: c
ng the shuffling, B and E are swapped, and G and C are swapped, while
A and D were shuffled back into their originating positions in the column.
Once again, sorry for the typos and confusion.
Terrence A. Pietrondi
--- On Fri, 10/3/08, Alex Loddengaard <[EMAIL PROTECTED]> wrote:
> From: A
|B|C
D|E|G
pivots to...
A|D
B|E
C|G
Then for each row, shuffle the contents around randomly...
A|D
E|B
G|C
Then pivot the data back...
A|E|G
D|B|C
You can reference my progress so far...
http://svn.sourceforge.net/viewvc/csvdatamix/branches/datamix_mapreduce/
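
The pivot, per-row shuffle, and pivot-back steps described above can be sketched in a few lines of single-process Python. This is not the csvdatamix code and does not use Hadoop; the function names are hypothetical, and the sample input assumes the truncated first row of the example reads A|B|C.

```python
import random


def pivot(rows):
    """Transpose equal-length rows, so each column becomes a row."""
    return [list(column) for column in zip(*rows)]


def shuffle_columns(rows, rng=random):
    """Shuffle values within each column while keeping them in that column."""
    pivoted = pivot(rows)          # one row per original column
    for column_values in pivoted:
        rng.shuffle(column_values) # in-place shuffle of that column's values
    return pivot(pivoted)          # pivot back to the original layout


def parse(text, delimiter="|"):
    """Split delimited text into a list of rows (assumed '|' delimiter)."""
    return [line.split(delimiter) for line in text.splitlines() if line]


def render(rows, delimiter="|"):
    """Join rows back into delimited text."""
    return "\n".join(delimiter.join(row) for row in rows)


if __name__ == "__main__":
    data = parse("A|B|C\nD|E|G")
    print(render(shuffle_columns(data)))
```

Every value stays in its original column and only its row position changes, which is why some values (A and D in the example) can land back where they started.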
Terrence A. Pietrondi
--- On
I am sorry for the confusion. I meant distributed data.
So help me out here. For example, if I am reducing to a single file, then my
main transformation logic would be in my mapping step since I am reducing away
from the data?
Terrence A. Pietrondi
http://del.icio.us/tepietrondi
--- On Wed
So to be "distributed" in a sense, I would guess you want to do your computation
on the disconnected parts of the data in the map phase?
Terrence A. Pietrondi
http://del.icio.us/tepietrondi
--- On Wed, 10/1/08, Arun C Murthy <[EMAIL PROTECTED]> wrote:
> From: Arun C Murth
computation areas or is the reduce the major computation area?
Thanks.
Terrence A. Pietrondi