Your scenario, moving from one type to another, should be easy enough to
migrate; you'd just load both classes in the migration task and run the
conversion from one type to the other. But I think you were intending
to ask the harder question of going between versions of the same type.
Taking your example of MapFiles: MapFiles are versioned, and it looks
like there's some attempt at letting newer versions read files
written by older versions. I'd suggest that any hbase class that makes
marks on the filesystem should be made to do likewise. HLog emissions and
the catalog tables, -ROOT- and .META., look like obvious candidates for
versioning.
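To make the idea concrete, here is a minimal sketch of a reader that
accepts both the current layout and an older one. Everything in it is
an assumption for illustration: the VersionedRecord class, its fields,
and the byte layout are hypothetical and are not the actual MapFile,
HLog, or catalog table format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical on-disk record; not the real MapFile or HLog layout.
final class VersionedRecord {
    static final int CURRENT_VERSION = 2;

    final String key;
    final long timestamp; // assumed to be added in version 2; absent in version 1

    VersionedRecord(String key, long timestamp) {
        this.key = key;
        this.timestamp = timestamp;
    }

    // Always write the current layout, led by its version number.
    byte[] write() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(CURRENT_VERSION);
        out.writeUTF(key);
        out.writeLong(timestamp);
        out.flush();
        return bytes.toByteArray();
    }

    // Read the current layout or any older layout we still understand.
    static VersionedRecord read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int version = in.readInt();
        if (version < 1 || version > CURRENT_VERSION) {
            throw new IOException("Unsupported version: " + version);
        }
        String key = in.readUTF();
        // Version 1 files carry no timestamp, so fall back to a default.
        long timestamp = version >= 2 ? in.readLong() : 0L;
        return new VersionedRecord(key, timestamp);
    }
}
```

The writer only ever emits the newest layout, while the reader branches
on the stamped version, so old files stay readable until they are
rewritten.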
St.Ack
Bryan Duxbury wrote:
The scheme you propose would be good so long as we only ever do things
like rename files and move them around. If we ever decide to change
something significant, such as the underlying file structure (say, if
we break away from using MapFile), then we'd need the
ability to read the old versions as well as write the new ones. What
would you like to be able to do in these instances?
On Dec 14, 2007, at 12:46 PM, stack (JIRA) wrote:
[
https://issues.apache.org/jira/browse/HADOOP-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551938
]
stack commented on HADOOP-2394:
-------------------------------
I like the rails idea. Migration should support going in both
directions I'd say.
hbase state is all kept out in the filesystem, so hopefully
filesystem machinations should be all that's required to make migrations.
HStoreFiles are MapFiles + an info file stored in a sympathetic
directory. This info file has little in it currently -- just
sequence id. Could also have hbase version. For log files, perhaps
first record is stamp of the hbase version doing the writing.
It occurred to me that migrations could entail significant rewriting
of on-filesystem data. To distribute the migration, we could
have the master and regionservers run the migrations. Each
server on startup would look for any migrations to run and just run
them if any found. Nice thing about this is that we'd get the
migration job distributed. But thinking on it, probably better to
have the migration done outside of hbase in its own dedicated MR
job. Would be easier tracking failures and running reversals.
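As a rough illustration of the rails-style, two-directional migrations
mentioned above, here is a small sketch. The Migration interface, the
Migrator and RenameMigration classes, and the use of an in-memory set
of path names as a stand-in for the filesystem are all assumptions made
for the example; none of them exist in hbase.

```java
import java.util.List;
import java.util.Set;

// Hypothetical step that moves the layout between version n-1 and n.
interface Migration {
    int version();                 // layout version this step upgrades to
    void up(Set<String> paths);    // apply the change
    void down(Set<String> paths);  // reverse the change
}

// Example step: a plain rename, which is trivially reversible.
final class RenameMigration implements Migration {
    private final int version;
    private final String oldName;
    private final String newName;

    RenameMigration(int version, String oldName, String newName) {
        this.version = version;
        this.oldName = oldName;
        this.newName = newName;
    }

    public int version() { return version; }

    public void up(Set<String> paths) {
        if (paths.remove(oldName)) paths.add(newName);
    }

    public void down(Set<String> paths) {
        if (paths.remove(newName)) paths.add(oldName);
    }
}

final class Migrator {
    private final List<Migration> steps; // must be sorted by ascending version

    Migrator(List<Migration> steps) { this.steps = steps; }

    // Move from layout version `from` to `to`, in either direction,
    // and return the version the layout ends up at.
    int migrate(int from, int to, Set<String> paths) {
        if (to > from) {
            for (Migration m : steps) {
                if (m.version() > from && m.version() <= to) m.up(paths);
            }
        } else {
            // Reversal runs the same steps backwards, newest first.
            for (int i = steps.size() - 1; i >= 0; i--) {
                Migration m = steps.get(i);
                if (m.version() <= from && m.version() > to) m.down(paths);
            }
        }
        return to;
    }
}
```

Keeping each step small and individually reversible is what would make
tracking failures and running reversals tractable, whether the steps
run inside the servers or in a separate job.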
Add support for migrating between hbase versions
------------------------------------------------
Key: HADOOP-2394
URL: https://issues.apache.org/jira/browse/HADOOP-2394
Project: Hadoop
Issue Type: Improvement
Components: contrib/hbase
Reporter: Johan Oskarsson
If Hbase is to be used to serve data to live systems we would need a
way to upgrade both the underlying hadoop installation and hbase to
newer versions with minimal downtime.