Avro's versioning capability might help, if Avro could replace SequenceFile in your workflow.
Just a thought.

-Terry

On 1/29/13 9:17 PM, David Parks wrote:
> I'll consider a patch to the SequenceFile, if we could manually override the
> sequence file input Key and Value that's read from the sequence file headers
> we'd have a clean solution.
>
> I don't like versioning my Model object because it's used by tens of other
> classes and I don't want to risk less maintained classes continuing to use
> an old version.
>
> For the time being I just used 2 jobs. First I renamed the old Model Object
> to the original name, read it in, upgraded it, and wrote the new version
> with a different class name.
>
> Then I renamed the classes again so the new model object used the original
> name and read in the altered name and cloned it into the original name.
>
> All in all an hour's work only, but having a cleaner process would be better.
> I'll add the request to JIRA at a minimum.
>
> Dave
>
>
> -----Original Message-----
> From: Harsh J [mailto:ha...@cloudera.com]
> Sent: Wednesday, January 30, 2013 2:32 AM
> To: <user@hadoop.apache.org>
> Subject: Re: Tricks to upgrading Sequence Files?
>
> This is a pretty interesting question, but unfortunately there isn't an
> inbuilt way in SequenceFiles itself to handle this. However, your key/value
> classes can be made to handle versioning perhaps - detecting if what they've
> read is of an older time and decoding it appropriately (while handling newer
> encoding separately, in the normal fashion).
> This would be much better than going down the classloader hack paths I
> think?
>
> On Tue, Jan 29, 2013 at 1:11 PM, David Parks <davidpark...@yahoo.com> wrote:
>> Anyone have any good tricks for upgrading a sequence file?
>>
>> We maintain a sequence file like a flat file DB and the primary object
>> in there changed in recent development.
>>
>> It's trivial to write a job to read in the sequence file, update the
>> object, and write it back out in the new format.
>>
>> But since sequence files read and write the key/value class I would
>> either need to rename the model object with a version number, or
>> change the header of each sequence file.
>>
>> Just wondering if there are any nice tricks to this.
>
>
> --
> Harsh J
>
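
For reference, Harsh's suggestion of a value class that detects the older encoding while reading might look roughly like the sketch below. It is only an illustration, not code from this thread: the `Model` fields, the `V2_MARKER` value, and the added `category` field are all made up, and it assumes the old layout started with a non-negative int id, so that a negative first int can only be the new-format marker. It uses plain `java.io` so it runs without Hadoop; the two methods have the same signatures as `write` and `readFields` in `org.apache.hadoop.io.Writable`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

/**
 * Hypothetical model class. In a real job it would implement
 * org.apache.hadoop.io.Writable and keep its original class name,
 * so the class name in the sequence file header still matches.
 */
class Model {
    // ASSUMPTION: old records begin with a non-negative id, so a negative
    // first int unambiguously marks a record written in the new layout.
    static final int V2_MARKER = -2;

    int id;
    String name;      // present in both layouts
    String category;  // hypothetical field added in the new layout

    /** Always writes the newest layout, prefixed with the marker. */
    public void write(DataOutput out) throws IOException {
        out.writeInt(V2_MARKER);
        out.writeInt(id);
        out.writeUTF(name);
        out.writeUTF(category);
    }

    /** Reads either layout, deciding from the first int which one it is. */
    public void readFields(DataInput in) throws IOException {
        int first = in.readInt();
        if (first == V2_MARKER) {
            id = in.readInt();
            name = in.readUTF();
            category = in.readUTF();
        } else {
            // Old layout: the int just read was the id itself.
            id = first;
            name = in.readUTF();
            category = "";  // default for records that predate the field
        }
    }

    /** Helper for trying the class out: serialize to a byte array. */
    byte[] toBytes() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        write(new DataOutputStream(buf));
        return buf.toByteArray();
    }

    /** Helper for trying the class out: deserialize from a byte array. */
    static Model fromBytes(byte[] bytes) throws IOException {
        Model m = new Model();
        m.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return m;
    }
}
```

With something like this in place, a single pass that reads each record and writes it straight back would leave the file entirely in the new layout without renaming the class. Whether a safe marker exists depends on what the old layout's first bytes can contain, so that assumption needs checking against the real data.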