[ https://issues.apache.org/jira/browse/HDFS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991314#comment-12991314 ]
Ivan Kelly commented on HDFS-1580:
----------------------------------

{quote}
> Should WALStreamCustodians control rolling themselves?

As I understand it, BookKeeper doesn't allow reading logs from an open ledger, is that correct? If that is the case, it would be better to first roll and then do the checkpoint, so that it captures the latest edits.
{quote}

That's true. Therefore there must be a method for having an external entity call roll. However, for the use case of rolling logs periodically (to keep them under a certain size), is there any requirement that an external entity knows anything about it?

{quote}
> WALStreamCustodianNotifier

It seems to me this will be implemented only by FSEditLog. If that is the case, we could handle errors just by exceptions, i.e. if an operation on the WALStreamCustodian interface fails, an exception is thrown and FSEditLog can decide whether to remove the custodian depending on the kind of exception.
{quote}

Also true. In addition to point one, this means we could get rid of WALStreamCustodianNotifier completely.

{quote}
> namenode crash

If the namenode comes back before the znode disappears from ZooKeeper, the ledger will be open. In that case, will FSEditLog.load be able to load transactions from the open ledger as well?
{quote}

When coming back up, the BookKeeper WAL implementation will see that there is a ledger open but no namenode alive, so it will manually close the ledger. At that point FSEditLog will be able to read all the updates.

{quote}
> interface JournalStream

The document defines this interface but doesn't describe its purpose or use case.
{quote}

This interface already exists. I should take these methods out of this design though, as I think 1073 will be adding something like them anyhow.

{quote}
> List<URI> getLogs(long sinceTransactionId);

The list returned must be ordered w.r.t. the transactions contained. It might be a good idea to encode the ordering attribute in the URI itself, so that the caller of this method can also verify that the order is correct. The URI naming convention could mimic this aspect of the convention proposed in 1073.
{quote}

Agreed.

{quote}
> void startRoll()
> void endRoll()

I can only imagine a single roll method that cuts a log and starts a new one. I believe the naming convention or the ordering attribute for the logs should be controlled by the application and not the storage, therefore the roll method should take a parameter which becomes part of the log metadata and is used to order the logs. Again, this also depends on how 1073 does it for file logs.
{quote}

Making the ordering controlled by the application and not the storage makes it hard to encapsulate periodic rolling inside the storage. Rolling requires the current transaction id (I assume this would be the usual parameter) to open a new log.

However, what we could do in this case is remove the "roll" call completely. When you want to roll, you just call close on the WALStreamCustodian. Then the next call to getOutputStream() would open a new stream. getOutputStream() would require the current transaction id to know how to name the new stream, but this shouldn't be a problem.

> Add interface for generic Write Ahead Logging mechanisms
> --------------------------------------------------------
>
>                 Key: HDFS-1580
>                 URL: https://issues.apache.org/jira/browse/HDFS-1580
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>         Attachments: generic_wal_iface.pdf, generic_wal_iface.pdf, generic_wal_iface.txt
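
For illustration only, the close-then-reopen idea from the last reply above (no "roll" call; close() ends the current log and the next getOutputStream() starts a new one named after the transaction id it is given) might look roughly like the sketch below. Only the names WALStreamCustodian, getOutputStream(), close() and getLogs(long sinceTransactionId) come from this thread. The exact signatures, the InMemoryCustodian class and the mem://wal/edits_<txid> URI scheme are assumptions made up for the example, not the design in the attached document.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the custodian once roll()/startRoll()/endRoll() are removed.
interface WALStreamCustodian {
    // Returns the open stream, or opens a new log named after currentTxId.
    OutputStream getOutputStream(long currentTxId) throws IOException;

    // Ends the current log; this is what replaces "roll".
    void close() throws IOException;

    // Logs that may hold transactions >= sinceTransactionId, in order.
    List<URI> getLogs(long sinceTransactionId) throws IOException;
}

// Hypothetical in-memory implementation, for illustration only.
class InMemoryCustodian implements WALStreamCustodian {
    private final List<Long> startTxIds = new ArrayList<>();
    private OutputStream current;

    @Override
    public OutputStream getOutputStream(long currentTxId) {
        if (current == null) {
            // No log is open: start one and remember its first txid.
            current = new ByteArrayOutputStream();
            startTxIds.add(currentTxId);
        }
        return current;
    }

    @Override
    public void close() {
        // ByteArrayOutputStream.close() is a no-op, so dropping the
        // reference is enough to mark the log as finished here.
        current = null;
    }

    @Override
    public List<URI> getLogs(long sinceTransactionId) {
        List<URI> logs = new ArrayList<>();
        for (int i = 0; i < startTxIds.size(); i++) {
            boolean isLast = (i == startTxIds.size() - 1);
            // A log is relevant if the next log starts after the requested
            // txid, or if it is the newest log.
            if (isLast || startTxIds.get(i + 1) > sinceTransactionId) {
                // The ordering attribute (start txid) is encoded in the URI
                // so the caller can verify the order.
                logs.add(URI.create("mem://wal/edits_" + startTxIds.get(i)));
            }
        }
        return logs;
    }
}
```

In this sketch a caller that wants to roll simply calls close() and then getOutputStream(nextTxId). Periodic size-based rolling could still happen inside the storage, as long as the custodian has seen a recent transaction id.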