On Mon, Apr 30, 2012 at 4:15 PM, Mohit Anchlia <[email protected]> wrote:

> Thanks! It worked just fine. But now my question is: when compressing a text
> file, is it compressed line by line, or is the entire file compressed as one?
>
> On Sun, Apr 29, 2012 at 7:33 PM, Prashant Kommireddi <[email protected]> wrote:
>
>> By blocks do you mean you would be using Snappy to write a SequenceFile?
>> Yes, you can do that by setting compression at BLOCK level for the
>> sequence file.
>>
>> On Sun, Apr 29, 2012 at 1:41 PM, Mohit Anchlia <[email protected]> wrote:
>>
>>> Thanks! Is this compressing every line or in blocks? Is it possible to
>>> set it to compress per block?
>>>
>>> On Sun, Apr 29, 2012 at 1:12 PM, Prashant Kommireddi <[email protected]> wrote:
>>>
>>>> The ones you mentioned are for map output compression, not job output.
>>>>
>>>> On Apr 29, 2012, at 1:07 PM, Mohit Anchlia <[email protected]> wrote:
>>>>
>>>>> I tried these and they didn't work with STORE. Is this different from
>>>>> the one you mentioned?
>>>>>
>>>>> SET mapred.compress.map.output true;
>>>>> SET mapred.output.compression org.apache.hadoop.io.compress.SnappyCodec;
>>>>>
>>>>> On Sun, Apr 29, 2012 at 11:57 AM, Prashant Kommireddi <[email protected]> wrote:
>>>>>
>>>>>> Have you tried setting output compression to Snappy for STORE?
>>>>>>
>>>>>> grunt> set output.compression.enabled true;
>>>>>> grunt> set output.compression.codec org.apache.hadoop.io.compress.SnappyCodec;
>>>>>>
>>>>>> You should be able to read and write Snappy-compressed files with
>>>>>> PigStorage, which uses the Hadoop TextInputFormat internally.
>>>>>>
>>>>>> Thanks,
>>>>>> Prashant
>>>>>>
>>>>>> On Thu, Apr 26, 2012 at 12:40 PM, Mohit Anchlia <[email protected]> wrote:
>>>>>>
>>>>>>> I think I need to write both store and load functions. It appears that
>>>>>>> only intermediate output stored in a temp location can be compressed,
>>>>>>> using:
>>>>>>>
>>>>>>> SET mapred.compress.map.output true;
>>>>>>> SET mapred.output.compression org.apache.hadoop.io.compress.SnappyCodec;
>>>>>>>
>>>>>>> Any pointers as to how I can store and load using Snappy would be
>>>>>>> helpful.
>>>>>>>
>>>>>>> On Thu, Apr 26, 2012 at 12:32 PM, Mohit Anchlia <[email protected]> wrote:
>>>>>>>
>>>>>>>> I am able to write with Snappy compression. But I don't think Pig
>>>>>>>> provides anything to read such records. Can someone suggest or point
>>>>>>>> me to relevant code that might help me write a LoadFunc for it?
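Putting the settings from the thread together, a minimal Pig sketch of what the working setup might look like (the relation name and paths here are hypothetical, and the `mapred.*` property names assume the old Hadoop 1.x naming):

```pig
-- Compress the final job output written by STORE (what Prashant suggested):
SET output.compression.enabled true;
SET output.compression.codec org.apache.hadoop.io.compress.SnappyCodec;

-- Intermediate (map output) compression is a separate, independent setting:
SET mapred.compress.map.output true;
SET mapred.map.output.compression.codec org.apache.hadoop.io.compress.SnappyCodec;

-- Hypothetical load/store for illustration; PigStorage handles the
-- Snappy-compressed text transparently via Hadoop's TextInputFormat.
logs = LOAD '/data/input' USING PigStorage('\t');
STORE logs INTO '/data/output_snappy' USING PigStorage('\t');
```

For SequenceFile output, the BLOCK-level granularity Prashant mentions would additionally be controlled by setting the output compression type to `BLOCK` (e.g. the Hadoop property `mapred.output.compression.type`), so records are compressed in batches rather than one value at a time.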
