Mohit,

Please do not cross-post a question to multiple lists unless you're announcing something.

What you describe does not happen: a map task reading a text file finishes the last line of its split even if that line runs into the next block, and it skips a partial first line that belongs to the previous split. The way the splitting is done for text files is explained in good detail here: http://wiki.apache.org/hadoop/HadoopMapReduce

Hope this solves your doubt :)

On Fri, May 27, 2011 at 10:25 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> I am new to hadoop and from what I understand by default hadoop splits
> the input into blocks. Now this might result in splitting a line of
> record into 2 pieces and getting spread across 2 maps. For eg: Line
> "abcd" might get split into "ab" and "cd". How can one prevent this in
> hadoop and pig? I am looking for some examples where I can see how I
> can specify my own split so that it logically splits based on the
> record delimiter and not the block size. For some reason I am not able
> to get right examples online.
>

-- 
Harsh J
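
P.S. In case it helps to see the idea in code, here is a rough Python sketch of how a line-oriented record reader can respect line boundaries even when the byte splits do not. It is a simplified model, not Hadoop's actual implementation; the function names `read_split` and `read_all_splits` are made up for illustration, and it assumes newline-delimited records held in memory.

```python
def read_split(data: bytes, start: int, length: int) -> list[bytes]:
    """Return the lines owned by the byte range [start, start + length).

    A line is owned by the split that contains its first byte.
    (Simplified model; names and in-memory input are assumptions.)
    """
    end = min(start + length, len(data))
    pos = start
    if start != 0:
        # Not the first split: look back one byte and discard everything
        # up to and including the next newline. That partial line is
        # emitted by the reader of the previous split.
        newline = data.find(b"\n", start - 1)
        if newline == -1:
            return []
        pos = newline + 1

    records = []
    # Emit every line that starts before the end of this split, reading
    # past the split boundary if needed to finish the last one.
    while pos < end:
        newline = data.find(b"\n", pos)
        if newline == -1:
            records.append(data[pos:])
            break
        records.append(data[pos:newline])
        pos = newline + 1
    return records


def read_all_splits(data: bytes, split_size: int) -> list[list[bytes]]:
    """Cut data into fixed-size byte ranges and read each one independently."""
    return [
        read_split(data, start, split_size)
        for start in range(0, len(data), split_size)
    ]


# The example from the question: a 2-byte split size cuts "abcd" in the
# middle, yet the line is still delivered whole to a single reader.
for number, records in enumerate(read_all_splits(b"abcd\nefgh\n", 2)):
    print(number, records)
```

Each line ends up with exactly one reader, so no custom split logic is needed for plain newline-delimited text.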