Re: Purposefully keeping around WAL files

2019-05-03 Thread Biju N
Hi Sean, Is there a JIRA ticket for this to follow? On Sat, Mar 16, 2019 at 2:00 PM Andrew Purtell wrote: > Running the file through a standard compressor. Makes handling more > straightforward eg copy to local filesystem and extraction. We could wait > to do it until all references to the WAL f

Re: Purposefully keeping around WAL files

2019-05-03 Thread Sean Busbey
Nope, didn't get far enough in specifying an approach to file a JIRA. If you're up for making a go of it, feel free to start a new one. On Fri, May 3, 2019, 14:44 Biju N wrote: > Hi Sean, Is there a JIRA ticket for this to follow? > > On Sat, Mar 16, 2019 at 2:00 PM Andrew Purtell > wrote: > >

Re: Purposefully keeping around WAL files

2019-03-16 Thread Andrew Purtell
Running the file through a standard compressor. That makes handling more straightforward, e.g. copying to a local filesystem and extracting. We could wait to do it until all references to the WAL file are gone, so as not to complicate things like replication. > On Mar 16, 2019, at 10:17 AM, Sean Busbey wr

Re: Purposefully keeping around WAL files

2019-03-16 Thread Sean Busbey
Yeah, I like the idea of compressing them. Are you thinking of rewriting them with the WAL compression feature enabled, or just something simple like running the whole file through a compressor? Maybe I should poke at what the difference in resultant file size looks like. IIRC things already get moved out
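
For anyone wanting to try the "run the whole file through a compressor" size comparison, here is a rough sketch and not anything from HBase itself. It assumes the WAL has already been copied to the local filesystem, uses gzip purely as a stand-in for "a standard compressor", and the function name `gzip_size_ratio` is made up for illustration.

```python
import gzip
import os
import shutil
import tempfile


def gzip_size_ratio(src_path, level=6):
    """Gzip a local file into a temporary file and report sizes.

    Returns (original_bytes, compressed_bytes, compressed/original).
    gzip is an assumption here; the thread does not name a compressor.
    """
    original_bytes = os.path.getsize(src_path)

    # Write the compressed copy to a temp file so the source is untouched.
    fd, tmp_path = tempfile.mkstemp(suffix=".gz")
    os.close(fd)
    try:
        with open(src_path, "rb") as fin, \
                gzip.open(tmp_path, "wb", compresslevel=level) as fout:
            shutil.copyfileobj(fin, fout)
        compressed_bytes = os.path.getsize(tmp_path)
    finally:
        os.remove(tmp_path)

    # Guard against empty files to avoid dividing by zero.
    ratio = compressed_bytes / original_bytes if original_bytes else float("nan")
    return original_bytes, compressed_bytes, ratio
```

Usage would be something like `gzip_size_ratio("/tmp/some-wal-copy")` on a file fetched with `hdfs dfs -get`; that path is hypothetical. The ratio will depend heavily on the write workload, so treat any single file as a data point and not a general answer.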

Re: Purposefully keeping around WAL files

2019-03-16 Thread Andrew Purtell
How about an option that tells the cleaner to archive them, with compression? There’s a lot of wastage in WAL files due to repeated information, and reasons to not enable WAL compression for live files, but I think little reason not to rewrite an archived WAL file with a typical and standard arc

Purposefully keeping around WAL files

2019-03-16 Thread Sean Busbey
Hi folks! Sometimes while working to diagnose an HBase failure in production settings I need to ensure WALs stick around so that I can examine or possibly replay them. For difficult problems on clusters with plenty of HDFS space relative to the HBase write workload sometimes that might mean for da