Excellent. I really appreciate you checking on this, Rakesh and Michael. ZooKeeper did well in this evaluation and I wanted to make sure you were aware of this finding, in case an improvement was needed.
On Thu, Mar 2, 2017 at 11:25 AM, Michael Han <h...@cloudera.com> wrote:

> The partial crash bug described in the paper looks the same case as what's
> fixed by ZOOKEEPER-2247. The root cause is the same for both cases (quorum
> threads were not shutdown).
>
> On Thu, Mar 2, 2017 at 7:45 AM, Rakesh Radhakrishnan <rake...@apache.org>
> wrote:
>
> > Thanks a lot Andrew Purtell for pointing out this.
> >
> > I could see, https://issues.apache.org/jira/browse/ZOOKEEPER-2247 jira is
> > talking about similar case. Could you please go through this jira and let
> > me know your comments.
> >
> > It seems they have used ZooKeeper (v3.4.8) for preparing the report. This
> > bug is fixed and available only in the latest stable version 3.4.9.
> >
> > Thanks,
> > Rakesh
> >
> > On Thu, Mar 2, 2017 at 11:07 AM, Andrew Purtell <apurt...@salesforce.com>
> > wrote:
> >
> > > Is there a JIRA open for the partial crash bug described in "Redundancy
> > > Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions
> > > to Single Errors and Corruptions" Aishwarya Ganesan, Ramnatthan Alagappan,
> > > Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau, University of
> > > Wisconsin—Madison. 15th USENIX Conference on File and Storage Technologies
> > > (FAST ’17)?
> > >
> > > From
> > > https://www.usenix.org/system/files/conference/fast17/fast17-ganesan.pdf
> > >
> > > "Unfortunately, ZooKeeper does not recover from write errors to the
> > > transaction head and log tail. On write errors during log initialization,
> > > the error handling code tries to gracefully shutdown the node but kills
> > > only the transaction processing threads; the quorum thread remains alive
> > > (partial crash). Consequently, other nodes believe that the leader is
> > > healthy and do not elect a new leader. However, since the leader has
> > > partially crashed, it cannot propose any transactions, leading to an
> > > indefinite write unavailability."
> > >
> > > --
> > > Best regards,
> > > Andrew Purtell
> > > apurt...@salesforce.com
> > > apurt...@apache.org
>
> --
> Cheers
> Michael.

--
Best regards,

- Andy

If you are given a choice, you believe you have acted freely. - Raymond Teller (via Peter Watts)