Re: splittable vs seekable compressed formats

2013-05-24 Thread Harsh J
SequenceFiles should be seekable provided you know/manage their sync points during writes I think. With LZO this may be non-trivial. On Thu, May 23, 2013 at 11:01 PM, John Lilley john.lil...@redpoint.net wrote: I’ve read about splittable compressed formats in Hadoop. Are any of these formats

Re: splittable vs seekable compressed formats

2013-05-24 Thread Rahul Bhattacharjee
Yeah, I think John meant seeking to record boundaries. Thanks, Rahul On Fri, May 24, 2013 at 12:22 PM, Harsh J ha...@cloudera.com wrote: SequenceFiles should be seekable provided you know/manage their sync points during writes I think. With LZO this may be non-trivial. On Thu, May 23,

RE: splittable vs seekable compressed formats

2013-05-24 Thread John Lilley
question is more about the standard formats (e.g. LZO compression in SequenceFile) supporting this without additional work. John From: Rahul Bhattacharjee [mailto:rahul.rec@gmail.com] Sent: Friday, May 24, 2013 1:00 AM To: user@hadoop.apache.org Subject: Re: splittable vs seekable compressed

Re: splittable vs seekable compressed formats

2013-05-23 Thread Rahul Bhattacharjee
I think seeking is a property of the fs, so any file stored in HDFS is seekable. Inputstream is seekable and outputstream isn't. FileSystem supports seekable. Thanks, Rahul On Thu, May 23, 2013 at 11:01 PM, John Lilley john.lil...@redpoint.net wrote: I’ve read about splittable compressed
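
To make the "sync points" idea from this thread a bit more concrete, here is a rough sketch in Python. It is a toy format, not Hadoop's actual SequenceFile layout or API: the marker value, the `write_records` and `read_from` helpers, and the `sync_every` parameter are all made up for illustration. The point is only that byte-level seeking (which the filesystem gives you) is different from landing on a record boundary, and that a periodically written marker is one way to resynchronize after seeking to an arbitrary offset.

```python
import hashlib
import io
import struct

# Hypothetical 16-byte sync marker for this toy format (an assumption,
# not the value or scheme any real file format uses).
SYNC = hashlib.md5(b"toy-sync-marker").digest()


def write_records(records, sync_every=3):
    """Serialize records as [4-byte big-endian length][payload], writing
    the sync marker before every `sync_every`-th record."""
    buf = io.BytesIO()
    for i, rec in enumerate(records):
        if i % sync_every == 0:
            buf.write(SYNC)
        buf.write(struct.pack(">I", len(rec)))
        buf.write(rec)
    return buf.getvalue()


def read_from(data, offset):
    """Start at an arbitrary byte offset, scan forward to the next sync
    marker, and return every record from that point to the end."""
    pos = data.find(SYNC, offset)
    if pos < 0:
        return []  # no sync marker at or after this offset
    out = []
    while pos < len(data):
        if data.startswith(SYNC, pos):
            pos += len(SYNC)  # skip the marker itself
            continue
        (length,) = struct.unpack_from(">I", data, pos)
        pos += 4
        out.append(data[pos:pos + length])
        pos += length
    return out


records = [b"rec%d" % i for i in range(7)]
data = write_records(records)

# Offset 1 falls in the middle of the first marker, so the reader
# resynchronizes at the next marker and starts with the fourth record.
print(read_from(data, 1))
```

The sketch assumes the marker never occurs inside a payload and leaves compression out entirely. With a block-compressed stream, a reader would additionally need each sync point to coincide with the start of an independently decodable block, which is presumably part of why the LZO case is described above as non-trivial.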