Re: architecture diagram

2008-10-06 Thread Terrence A. Pietrondi
delimiter? Terrence A. Pietrondi --- On Sun, 10/5/08, Alex Loddengaard [EMAIL PROTECTED] wrote: From: Alex Loddengaard [EMAIL PROTECTED] Subject: Re: architecture diagram To: core-user@hadoop.apache.org Date: Sunday, October 5, 2008, 9:26 PM Let's say you have one very large input file

Re: architecture diagram

2008-10-06 Thread Terrence A. Pietrondi
) is the map key, and the map value is the field contents. How is this incorrect? I think this follows your earlier suggestion: "You may want to play with the following idea: collect key = column_number and value = column_contents in your map step." Terrence A. Pietrondi --- On Mon, 10/6/08, Alex
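
For readers following the thread, the idea quoted above (key = column_number, value = column_contents) can be sketched in plain Python. This is only an illustration, not Hadoop API code and not anyone's actual job: the names map_line and group_by_key, the comma delimiter, and the sample rows are all assumptions made for the example.

```python
from collections import defaultdict


def map_line(line, delimiter=","):
    """Map step: emit one (column_number, field_contents) pair per field.

    The comma delimiter is an assumption; the thread does not fix one.
    """
    for column_number, field in enumerate(line.split(delimiter)):
        yield column_number, field


def group_by_key(pairs):
    """Stand-in for the framework's shuffle: collect all values per key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


# Hypothetical input: two delimited rows of three fields each.
rows = ["a,b,c", "d,e,f"]
pairs = [pair for line in rows for pair in map_line(line)]
columns = group_by_key(pairs)  # what each reduce call would receive, by column
```

In this single-process sketch the values for each column arrive in row order. A real distributed job does not promise that ordering, so a job that must put fields back in their original rows would likely need to carry the row position along with each value.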

Re: architecture diagram

2008-10-03 Thread Terrence A. Pietrondi
shuffled back into their originating positions in the column. Once again, sorry for the typos and confusion. Terrence A. Pietrondi --- On Fri, 10/3/08, Alex Loddengaard [EMAIL PROTECTED] wrote: From: Alex Loddengaard [EMAIL PROTECTED] Subject: Re: architecture diagram To: core-user

Re: architecture diagram

2008-10-02 Thread Terrence A. Pietrondi
I am sorry for the confusion. I meant distributed data. So help me out here. For example, if I am reducing to a single file, then my main transformation logic would be in my mapping step since I am reducing away from the data? Terrence A. Pietrondi http://del.icio.us/tepietrondi --- On Wed

architecture diagram

2008-10-01 Thread Terrence A. Pietrondi
computation areas or is the reduce the major computation area? Thanks. Terrence A. Pietrondi

Re: architecture diagram

2008-10-01 Thread Terrence A. Pietrondi
So, to be distributed in a sense, you would want to do your computation on the disconnected parts of the data in the map phase, I would guess? Terrence A. Pietrondi http://del.icio.us/tepietrondi --- On Wed, 10/1/08, Arun C Murthy [EMAIL PROTECTED] wrote: From: Arun C Murthy [EMAIL PROTECTED
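
The question above, whether the per-part computation belongs in the map phase, can be pictured with a small plain-Python sketch. It is not Hadoop code: map_chunk, combine, and the sample chunks are hypothetical, and counting characters merely stands in for whatever the real computation would be.

```python
from functools import reduce


def map_chunk(chunk):
    """Per-chunk work: needs only its own chunk, so chunks can run in parallel."""
    return sum(len(line) for line in chunk)


def combine(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


# Hypothetical input already split into disconnected parts.
chunks = [["ab", "cde"], ["f"], ["gh", "ij"]]

# Each call below is independent of the others, which is what allows
# a framework to schedule them on different machines.
partials = [map_chunk(chunk) for chunk in chunks]

# Only the small partial results have to be brought together.
total = reduce(combine, partials, 0)
```

The point of the sketch is the shape of the data flow: the heavy work touches each part separately, and the reduce only merges the partial results.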