Nice catch, thanks! I assume this is against the current master branch. The additional fs().remove(path); is unnecessary, though, and it opens another race window: if the range server dies after the remove but before the rename, we don't have a transaction log at all, and the range server won't recover the next time it starts.
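
To illustrate the ordering, here is a minimal sketch, assuming a local POSIX filesystem where rename replaces an existing destination atomically. It uses C++17 std::filesystem rather than Hypertable's fs() interface, and recover_log is a hypothetical stand-in for RangeServerMetaLog::recover, not the actual code:

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for RangeServerMetaLog::recover(): copies the log at
// `path` into a temporary file, then swaps the temporary file into place.
void recover_log(const fs::path &path) {
  fs::path tmp = path;
  tmp += ".tmp";

  {
    std::ifstream in(path, std::ios::binary);
    // trunc overwrites a stale .tmp left behind by an interrupted recovery,
    // which corresponds to create(tmp, true) in the patch.
    std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
    out << in.rdbuf();
  }  // both streams are closed here, before the rename

  // No remove(path) first: on POSIX, rename replaces the destination in one
  // step, so a crash at any point leaves either the old or the new log.
  fs::rename(tmp, path);
}
```

Whether the same holds for a given DFS broker depends on that broker's rename semantics, which this sketch does not model.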
__Luke

On Jan 29, 5:50 pm, jglim <[email protected]> wrote:
> in this case:
>
> 1. /hypertable/servers/IP_38060/log/range_txn/0.log exists
> 2. RangeServer found 0.log file, start recovering
> 3. RangeServer dead while recovering , 0.log and 0.log.tmp both exists
> 4. restarting RangeServer, found 0.log
> 5. trying to create 0.log.tmp, with no overwrite option, fail.
>
> this patch would work for this case: creating a file with overwriting
> option:
>
> diff --git a/src/cc/Hypertable/Lib/RangeServerMetaLog.cc b/src/cc/Hypertable/Lib/RangeServerMetaLog.cc
> index a2c655b..340c651 100644
> --- a/src/cc/Hypertable/Lib/RangeServerMetaLog.cc
> +++ b/src/cc/Hypertable/Lib/RangeServerMetaLog.cc
> @@ -70,7 +70,7 @@ RangeServerMetaLog::recover(const String &path) {
>    String tmp(path);
>    tmp += ".tmp";
>
> -  fd(create(tmp));
> +  fd(create(tmp, true));
>    write_header();
>
>    // copy the metalog and potentially skip the last bad entry
> @@ -82,6 +82,7 @@ RangeServerMetaLog::recover(const String &path) {
>    while ((entry = reader->read()))
>      write(entry.get());
>
> +  fs().remove(path);
>    fs().rename(tmp, path);
>  }
>
> the reason why I inserted fs().remove(path): in some filesystem,
> renaming does not automatically remove the destination file
