Re: Splitting files on new line using hadoop fs

bejoy . hadoop Wed, 22 Feb 2012 12:24:23 -0800

Hi Mohit
        AFAIK Hadoop has no built-in mechanism for this. During an HDFS 
copy, a file is split into blocks purely by the configured block size, with 
no regard for line boundaries. When the file is processed with MapReduce, the 
record reader handles line boundaries, even when a line spans multiple blocks.
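
To illustrate the record reader point, here is a rough sketch in plain Python. It is not Hadoop code. It is a simplified model of the usual convention: a reader whose split does not start at byte 0 skips ahead to the first line that begins inside its range, and every reader finishes its last line even if that means reading past the end of its range. The names read_split and read_all are made up for this example.

```python
def read_split(data: bytes, start: int, end: int) -> list[bytes]:
    """Return the lines whose first byte lies in the range [start, end)."""
    pos = start
    if start != 0:
        # Look for a newline starting one byte before the split. If that
        # byte is itself a newline, the line beginning at `start` is ours;
        # otherwise we skip the partial line owned by the previous split.
        nl = data.find(b"\n", start - 1)
        if nl == -1:
            return []  # no line begins inside this split
        pos = nl + 1
    lines = []
    while pos < end and pos < len(data):
        nl = data.find(b"\n", pos)
        stop = len(data) if nl == -1 else nl
        # The slice may extend past `end`: the reader finishes its last line.
        lines.append(data[pos:stop])
        pos = stop + 1
    return lines


def read_all(data: bytes, block_size: int) -> list[bytes]:
    """Read every fixed-size split independently and concatenate the results."""
    lines = []
    for start in range(0, len(data), block_size):
        end = min(start + block_size, len(data))
        lines.extend(read_split(data, start, end))
    return lines


if __name__ == "__main__":
    sample = b"alpha\nbravo charlie\ndelta\n"
    # An 8-byte "block size" cuts the second line across three blocks.
    print(read_all(sample, 8))
```

Each line comes out exactly once and intact, whichever block size is used, which is why the copy itself does not need to align blocks to new lines.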

Could you explain the use case that requires line-aligned splitting during 
the HDFS copy itself?

------Original Message------
From: Mohit Anchlia
To: common-user@hadoop.apache.org
ReplyTo: common-user@hadoop.apache.org
Subject: Splitting files on new line using hadoop fs
Sent: Feb 23, 2012 01:45

How can I copy large text files using "hadoop fs" such that split occurs
based on blocks + new lines instead of blocks alone? Is there a way to do
this?

Regards
Bejoy K S

From handheld, please excuse typos.
