Your scenario, moving from one type to another, should be easy enough to migrate; you'd just float both classes in the migration task and run the conversion from one type to the other. But I think you meant to ask the harder question of going between versions of the same type. Taking your example of MapFiles: MapFiles are versioned, and it looks like there's some attempt at letting newer versions read files written by older versions. I'd suggest that any hbase class that makes marks on the filesystem should be made to do likewise. HLog emissions and the catalog tables, -ROOT- and .META., look like obvious candidates for versioning.

St.Ack


Bryan Duxbury wrote:
The scheme you propose would be good so long as we only ever do things like rename files and move them around. If we ever decide to change something significant, like the underlying file structure (say, if we break away from using MapFile), then we'd need the ability to read the old version as well as write the new one. What would you like to be able to do in these instances?

On Dec 14, 2007, at 12:46 PM, stack (JIRA) wrote:


[ https://issues.apache.org/jira/browse/HADOOP-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551938 ]

stack commented on HADOOP-2394:
-------------------------------

I like the rails idea. Migration should support going in both directions I'd say.

hbase state is all kept out in the filesystem, so hopefully filesystem machinations should be all that's required to make migrations.

HStoreFiles are MapFiles + an info file stored in a sympathetic directory. This info file has little in it currently -- just sequence id. Could also have hbase version. For log files, perhaps first record is stamp of the hbase version doing the writing.
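A minimal sketch of what stamping a version into the info file might look like, and how a newer reader could still accept the older layout. Everything here is an assumption for illustration: the class name VersionedInfoSketch, the layout (an int format version, then the sequence id, then the hbase version string) and the version numbers are hypothetical, not the actual HStoreFile info format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch only; this is NOT the real HStoreFile info-file format.
final class VersionedInfoSketch {
    static final int CURRENT_FORMAT = 2; // assumed: format 1 = sequence id only

    final int format;          // format version found in the data
    final long sequenceId;
    final String hbaseVersion; // empty when the data predates this field

    VersionedInfoSketch(int format, long sequenceId, String hbaseVersion) {
        this.format = format;
        this.sequenceId = sequenceId;
        this.hbaseVersion = hbaseVersion;
    }

    // Writes the current layout: format version, sequence id, hbase version.
    static byte[] write(long sequenceId, String hbaseVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(CURRENT_FORMAT);
            out.writeLong(sequenceId);
            out.writeUTF(hbaseVersion);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Writes the assumed older layout, used here only to exercise the reader.
    static byte[] writeFormat1(long sequenceId) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(1);
            out.writeLong(sequenceId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads either layout by dispatching on the leading format version.
    static VersionedInfoSketch read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int format = in.readInt();
            switch (format) {
                case 1: {
                    return new VersionedInfoSketch(1, in.readLong(), "");
                }
                case 2: {
                    long sequenceId = in.readLong();
                    String hbaseVersion = in.readUTF();
                    return new VersionedInfoSketch(2, sequenceId, hbaseVersion);
                }
                default:
                    throw new IllegalStateException("Unsupported info format: " + format);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same leading-stamp idea could apply to log files, with the first record carrying the version of the writer. Failing loudly on an unknown format, rather than guessing, is what would tell a migration tool that it has work to do.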

It occurred to me that migrations could entail significant rewriting of on-filesystem data. To distribute the migration, we could have the master and regionservers run the migrations. Each server on startup would look for any migrations to run and just run them if any are found. Nice thing about this is that we'd get the migration job distributed. But thinking on it, probably better to have the migration done outside of hbase in its own dedicated MR job. Would be easier tracking failures and running reversals.
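To make the rails-style idea concrete, here is a rough sketch of ordered migration steps that can be run in either direction. All names (MigrationSketch, Step, migrate, rename) are hypothetical, and the filesystem is stood in for by a plain list of file names; a real migration would operate on the filesystem, most likely from a separate job as suggested above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch of reversible migrations; none of these names exist in hbase.
final class MigrationSketch {

    // One step: 'up' takes the layout from version n to n + 1, 'down' reverses it.
    static final class Step {
        final UnaryOperator<List<String>> up;
        final UnaryOperator<List<String>> down;

        Step(UnaryOperator<List<String>> up, UnaryOperator<List<String>> down) {
            this.up = up;
            this.down = down;
        }
    }

    // steps.get(i) moves between layout version i and i + 1.
    // Runs 'up' steps in order when to > from, 'down' steps in reverse when to < from.
    static List<String> migrate(List<Step> steps, List<String> files, int from, int to) {
        List<String> current = new ArrayList<>(files);
        for (int v = from; v < to; v++) {
            current = steps.get(v).up.apply(current);
        }
        for (int v = from; v > to; v--) {
            current = steps.get(v - 1).down.apply(current);
        }
        return current;
    }

    // Example building block: rename one file, leaving the others untouched.
    static List<String> rename(List<String> files, String oldName, String newName) {
        List<String> renamed = new ArrayList<>();
        for (String file : files) {
            renamed.add(file.equals(oldName) ? newName : file);
        }
        return renamed;
    }
}
```

Pairing every forward step with its reversal is what makes a failed upgrade recoverable: the runner can walk the same list backwards from wherever it stopped.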

Add support for migrating between hbase versions
------------------------------------------------

                Key: HADOOP-2394
                URL: https://issues.apache.org/jira/browse/HADOOP-2394
            Project: Hadoop
         Issue Type: Improvement
         Components: contrib/hbase
           Reporter: Johan Oskarsson

If Hbase is to be used to serve data to live systems we would need a way to upgrade both the underlying hadoop installation and hbase to newer versions with minimal downtime.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


