RE: partition file by content based through HDFS

2014-05-11 Thread John Lilley
: Sunday, May 11, 2014 2:54 PM To: user@hadoop.apache.org Subject: Re: partition file by content based through HDFS Hi, HDFS blocks are not "content aware". A separation like the one you requested could be done via Hive or Pig with a few lines of code; then you would have multiple files wh

Re: partition file by content based through HDFS

2014-05-11 Thread Mirko Kämpf
Hi, HDFS blocks are not "content aware". A separation like the one you requested could be done via Hive or Pig with a few lines of code; then you would have multiple files, which can be organized in partitions as well. Such partitions, however, are on a different abstraction level: not on blocks, but within

Re: partition file by content based through HDFS

2014-05-11 Thread Mohammad Tariq
Hi Karim,

In short, no. If you intend to have partitioned data, it is better to store it in different files based on your needs. What exactly is the use case?

Warm Regards,
Tariq
cloudfront.blogspot.com

On Sun, May 11, 2014 at 7:11 PM, Karim Awara wrote:
> Hi,
>
> When a user is uploading a file from
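
The advice in this thread, that HDFS will not split a file by content and that partitioned data is better stored in separate files, could be sketched roughly as follows. This is only an illustration, not something posted by the participants: it assumes the input is a delimited text file, that the partition key is one of its columns, and that the whole file fits in memory. The function names `partition_rows` and `write_partitions` and the `key=<value>/part-00000` layout are hypothetical choices made for this sketch.

```python
import csv
import os
from collections import defaultdict


def partition_rows(rows, key_index):
    """Group rows by the value found in column key_index."""
    groups = defaultdict(list)
    for row in rows:
        if not row:  # skip blank lines, which csv.reader yields as []
            continue
        groups[row[key_index]].append(row)
    return dict(groups)


def write_partitions(input_path, out_dir, key_index=0, delimiter=","):
    """Split one delimited file into one local file per key value.

    Files are written as out_dir/key=<value>/part-00000. The layout is
    an assumption of this sketch, chosen to resemble partition
    directories. Returns the list of files written, sorted by key.
    """
    with open(input_path, newline="") as src:
        reader = csv.reader(src, delimiter=delimiter)
        groups = partition_rows(reader, key_index)

    written = []
    for key, rows in sorted(groups.items()):
        part_dir = os.path.join(out_dir, f"key={key}")
        os.makedirs(part_dir, exist_ok=True)
        path = os.path.join(part_dir, "part-00000")
        with open(path, "w", newline="") as dst:
            writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
            writer.writerows(rows)
        written.append(path)
    return written
```

Each resulting directory could then be uploaded separately, for example with `hdfs dfs -put`. For data too large to group on one machine, the Hive or Pig route mentioned earlier in the thread is probably the better fit.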