[ 
https://issues.apache.org/jira/browse/HDFS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991314#comment-12991314
 ] 

Ivan Kelly commented on HDFS-1580:
----------------------------------

{quote}
> > Should WALStreamCustodians control rolling themselves?
As I understand, book-keeper doesn't allow reading logs from open ledger, is 
that correct? If that is the case it would be better to first roll and then do 
the checkpoint to capture the latest edits.
{quote}
That's true. Therefore there must be a method for having an external entity 
call roll. However, for the use case of rolling logs periodically (to keep them 
under a certain size), is there any requirement that an external entity knows 
anything about it?

{quote}
>WALStreamCustodianNotifier
It seems to me this will be implemented only by fsedit log. If that is the case 
we could handle errors just by exceptions i.e. if an operation on 
WALStreamCustodian interface fails an exception is thrown and fsedit log can 
decide to remove the custodian depending on the kind of exception.
{quote}
Also true. In addition to point one, this means we could get rid of 
WALStreamCustodianNotifier completely. 
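
A rough sketch of the exception-based approach, to make the idea concrete. The 
Custodian interface and EditLogSketch class below are hypothetical stand-ins 
for WALStreamCustodian and the edit log, and the "remove on any IOException" 
policy is an assumption made only for illustration:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for WALStreamCustodian; not the real interface.
interface Custodian {
    void write(byte[] record) throws IOException;
}

// Hypothetical stand-in for the edit log: it owns the custodians and
// drops any custodian whose write throws, so no notifier is needed.
class EditLogSketch {
    private final List<Custodian> custodians = new ArrayList<>();

    void add(Custodian c) {
        custodians.add(c);
    }

    int activeCount() {
        return custodians.size();
    }

    // Write the record to every custodian; remove the ones that fail.
    void logEdit(byte[] record) {
        Iterator<Custodian> it = custodians.iterator();
        while (it.hasNext()) {
            Custodian c = it.next();
            try {
                c.write(record);
            } catch (IOException e) {
                // Assumed policy: any IOException removes the custodian.
                // A real implementation could inspect the exception type.
                it.remove();
            }
        }
    }
}
```

With this shape the custodian never needs a callback into the edit log; the 
failure travels back on the same call that triggered it.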

{quote}
> namenode crash
If the namenode comes back before the znode disappears from zookeeper, the 
ledger will be open. In that case, will the FSEditLog.load be able to load 
transactions from the open ledger as well?
{quote}
When coming back up, the Bookkeeper WAL implementation will see that there is a 
ledger open, but no namenode alive, so it will manually close the ledger. At 
that point FSEditLog will be able to read all the updates.

{quote}
> interface JournalStream
The document defines this interface but doesn't describe its purpose or 
use-case.
{quote}
This interface already exists. I should take these methods out of this design 
though, as I think 1073 will be adding something like them anyhow.

{quote}
> List<URI> getLogs(long sinceTransactionId);
The list returned must be ordered w.r.t the transactions contained. It might be 
a good idea to encode the ordering attribute in the url itself, so that the 
caller of this method can also verify that order is correct. The uri naming 
convention could mimic this aspect from the convention proposed in 1073.
{quote}
Agreed. 
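
For illustration, a caller-side order check could look roughly like the sketch 
below. The "_<firstTxId>" suffix is an assumed naming convention (the one 1073 
settles on may differ), and LogOrderCheck is a hypothetical helper:

```java
import java.net.URI;
import java.util.List;

// Hypothetical helper; assumes each log URI path ends in "_<firstTxId>".
class LogOrderCheck {
    // Extract the first transaction id encoded at the end of the URI path.
    static long firstTxId(URI log) {
        String path = log.getPath();
        return Long.parseLong(path.substring(path.lastIndexOf('_') + 1));
    }

    // True if the logs are in strictly increasing order of first transaction id.
    static boolean isOrdered(List<URI> logs) {
        long prev = Long.MIN_VALUE;
        for (URI log : logs) {
            long id = firstTxId(log);
            if (id <= prev) {
                return false;
            }
            prev = id;
        }
        return true;
    }
}
```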

{quote}
> void startRoll()
> void endRoll()
I can only imagine a single roll method, that cuts a log, and starts a new one. 
I believe the naming convention or the ordering attribute for the logs should 
be controlled by the application and not the storage, therefore the roll method 
should take a parameter which becomes part of the log metadata and is used to 
order the logs. Again this also depends on how 1073 does it for file logs.
{quote}
Making the ordering controlled by the application and not the storage makes it 
hard to encapsulate periodic rolling inside the storage. Rolling requires the 
current transaction id (I assume this would be the usual parameter) to open a 
new log. 

However, what we could do in this case is remove the "roll" call completely. 
When you want to roll, you just call close on the WALStreamCustodian. Then the 
next call to getOutputStream() would open a new stream. getOutputStream() would 
require the current transaction id to know what to name the new stream, but 
this shouldn't be a problem. 
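
Roughly, the close-then-reopen idea could look like the sketch below. 
InMemoryCustodian is a hypothetical in-memory stand-in, and the "edits_" 
naming and the getOutputStream(long) signature are assumptions, not part of 
the design document:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory custodian; illustrates close-then-reopen only.
class InMemoryCustodian {
    private final List<String> logNames = new ArrayList<>();
    private OutputStream current; // null when no log is open

    // Returns the open stream, or opens a new log named after currentTxId.
    OutputStream getOutputStream(long currentTxId) {
        if (current == null) {
            logNames.add("edits_" + currentTxId);
            current = new ByteArrayOutputStream();
        }
        return current;
    }

    // Closing the current log takes the place of an explicit roll().
    void close() {
        if (current != null) {
            try {
                current.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            current = null;
        }
    }

    // Names of all logs opened so far, in the order they were opened.
    List<String> logNames() {
        return logNames;
    }
}
```

The transaction id only matters on the call that actually opens a new log; 
while a log is open, the same stream is handed back.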

> Add interface for generic Write Ahead Logging mechanisms
> --------------------------------------------------------
>
>                 Key: HDFS-1580
>                 URL: https://issues.apache.org/jira/browse/HDFS-1580
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>         Attachments: generic_wal_iface.pdf, generic_wal_iface.pdf, 
> generic_wal_iface.txt
>
>


-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
