> Thanks for the feedback.  So you're inclined to think it would be at the dfs
> layer?

That's where the evidence seems to point.

>
> Is it accurate to say the most likely places where the data could have been
> lost were:
> 1. wal writes didn't actually get written to disk (no log entries to suggest
> any issues)

Most likely.

> 2. wal corrupted (no log entries suggest any trouble reading the log)

In that case the logs would scream (and I didn't see that in the logs
I looked at).

> 3. not all split logs were read by regionservers  (?? is there any way to
> ensure this either way... should I look at the filesystem some place?)

If that had happened, some regions would still have recovered.edits
files sitting in the filesystem, but it seems highly unlikely. With
DEBUG logging enabled we could have seen which files the master split,
which per-region files it created, and then which of those the region
servers actually read.
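
To answer the filesystem question: something along these lines, using
the plain Hadoop FileSystem API, would show whether any recovered.edits
were left behind. I'm assuming the default /hbase root and the 0.90
layout of <rootdir>/<table>/<encoded-region>/recovered.edits, so adjust
the glob to match your hbase.rootdir; the class name is just for the
example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FindRecoveredEdits {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Glob for recovered.edits dirs under every region of every table.
    FileStatus[] dirs = fs.globStatus(new Path("/hbase/*/*/recovered.edits"));
    if (dirs == null || dirs.length == 0) {
      System.out.println("no recovered.edits directories found");
      return;
    }
    for (FileStatus dir : dirs) {
      System.out.println(dir.getPath() + ":");
      for (FileStatus edits : fs.listStatus(dir.getPath())) {
        // Each file is named after the highest sequence id it contains.
        System.out.println("  " + edits.getPath().getName()
            + " (" + edits.getLen() + " bytes)");
      }
    }
  }
}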

>
> Do you think the type of network partition I'm talking about is adequately
> covered in existing tests? (Specifically running an external zk cluster?)

The IO fencing was only tested with HDFS; I don't know what happens in
that case with MapR. What I mean is that when the master splits the
logs, it takes ownership of the HDFS writer lease (there is only one
per file) so that it can safely close each log file. After that it
checks whether any new log files were created in the meantime (the
region server could have rolled a log while the master was splitting),
and if so it starts over, until it owns all the files and has split
them.
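
In case it helps to see the idea, here's a rough sketch of that
fencing loop against the plain Hadoop API. It's a simplification for
illustration, not the actual HBase splitting code (the class and the
splitOneLog placeholder are made up for the example), and
recoverLease() is the HDFS-specific call I was referring to, so I
can't say what the MapR client does with it.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class LogSplitFencingSketch {
  // Split every log under logDir, fencing out the old writer first.
  public static void splitServerLogs(Configuration conf, Path logDir)
      throws Exception {
    FileSystem fs = FileSystem.get(conf);
    while (true) {
      FileStatus[] logs = fs.listStatus(logDir);
      if (logs == null || logs.length == 0) {
        return;
      }
      for (FileStatus log : logs) {
        if (fs instanceof DistributedFileSystem) {
          // Steal the single writer lease so the old region server can't
          // keep appending; in practice you'd poll until this succeeds.
          ((DistributedFileSystem) fs).recoverLease(log.getPath());
        }
        splitOneLog(fs, log.getPath());
      }
      // If the region server rolled a new log while we were splitting,
      // the listing will have grown; go around again (this sketch just
      // re-splits everything, which is harmless for illustration).
      FileStatus[] after = fs.listStatus(logDir);
      if (after == null || after.length == logs.length) {
        return;
      }
    }
  }

  // Placeholder: replay the edits in this log into per-region
  // recovered.edits files.
  private static void splitOneLog(FileSystem fs, Path log) {
  }
}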

>
> Have you heard if anyone else has been having problems with the second 90.4
> rc?

Nope, we run it here on our dev cluster and haven't encountered any
issues (either with the code or from node failures).

>
> Thanks again for your help.  I'm following up with the MapR guys as well.

Good idea!

J-D
