Re: Using own InputSplit

Harsh J Fri, 27 May 2011 10:04:50 -0700

Mohit,

Please do not cross-post a question to multiple lists unless you're
announcing something.


What you describe does not happen. The way splitting is done for
text files is explained in good detail here:
http://wiki.apache.org/hadoop/HadoopMapReduce
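
In short, a split boundary is only a byte range. The record reader for
text input finishes the line it is in the middle of, even if that means
reading past the end of its split, and a reader whose split does not
start at byte 0 discards everything up to the first newline. Here is a
rough Python simulation of that rule. It is only a sketch and not
Hadoop's actual code; read_split and make_splits are hypothetical names
made up for this illustration.

```python
def make_splits(total_len, split_size):
    """Cut a byte range into (start, length) pairs, ignoring line boundaries."""
    return [(start, min(split_size, total_len - start))
            for start in range(0, total_len, split_size)]


def read_split(data, start, length):
    """Return the whole lines that the reader of one split would emit."""
    end = start + length
    pos = start
    if start != 0:
        # Not the first split: throw away everything up to and including
        # the first newline. The previous split's reader emits that line.
        nl = data.find(b"\n", pos)
        pos = len(data) if nl == -1 else nl + 1
    records = []
    # Keep reading while the next line starts at or before the split end,
    # so the last line is read in full even if it crosses the boundary.
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        line_end = len(data) if nl == -1 else nl
        records.append(data[pos:line_end])
        pos = line_end + 1
    return records


if __name__ == "__main__":
    data = b"abcd\nefgh\nijkl\n"
    # A split size of 2 cuts "abcd" into "ab" and "cd" at the byte level.
    for start, length in make_splits(len(data), 2):
        print((start, length), read_split(data, start, length))
```

Every line is emitted exactly once and always whole, whatever the split
size, so no mapper ever sees "ab" or "cd" on its own.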

Hope this solves your doubt :)

On Fri, May 27, 2011 at 10:25 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> I am new to Hadoop and from what I understand, by default Hadoop splits
> the input into blocks. Now this might result in a line of a record
> being split into 2 pieces and spread across 2 maps. For example, the line
> "abcd" might get split into "ab" and "cd". How can one prevent this in
> Hadoop and Pig? I am looking for some examples where I can see how I
> can specify my own split so that it logically splits based on the
> record delimiter and not the block size. For some reason I am not able
> to find the right examples online.
>



-- 
Harsh J
