[jira] [Updated] (KUDU-1853) Error during tablet copy may orphan a bunch of stuff
[ https://issues.apache.org/jira/browse/KUDU-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated KUDU-1853:
------------------------------
    Fix Version/s:     (was: 1.3.0)

> Error during tablet copy may orphan a bunch of stuff
> ----------------------------------------------------
>
>                 Key: KUDU-1853
>                 URL: https://issues.apache.org/jira/browse/KUDU-1853
>             Project: Kudu
>          Issue Type: Bug
>          Components: tablet, tserver
>    Affects Versions: 1.2.0
>            Reporter: Adar Dembo
>            Assignee: Mike Percy
>            Priority: Critical
>
> Currently, a failure during tablet copy may leave behind a number of
> different things:
> # Downloaded superblock (if the failure falls after TabletCopyClient::Start())
> # Downloaded data blocks (if the failure falls during
> TabletCopyClient::FetchAll())
> # Downloaded WAL segments (if the failure falls during
> TabletCopyClient::FetchAll())
> # Downloaded cmeta file (if the failure falls during
> TabletCopyClient::Finish())
> The next time the tserver starts, it'll see that this tablet's state is still
> TABLET_DATA_COPYING and will tombstone it. That takes care of #1, #3, and #4
> (well, it leaves the cmeta file behind as the tombstone, but that's
> intentional).
> Unfortunately, all data blocks are orphaned, because the on-disk superblock
> has no record of the new blocks, and so they aren't deleted.
> We're already tracking a general purpose GC mechanism for data blocks in
> KUDU-829, but I think this separate JIRA for describing the problem with
> tablet copy is useful, if only as a reference for users.
> Separately, it may be worth addressing these issues for failures that don't
> result in tserver crashes, such as intermittent network outages between
> tservers. A long lived tserver won't GC for some time, and it'd be nice to
> reclaim the disk space used by these orphaned objects in the interim, not to
> mention that implementing this kind of "GC" for data blocks is a lot easier
> than a general purpose GC.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
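
For readers who want a concrete picture of the narrower "GC" mentioned in the last paragraph of the description, here is a minimal sketch. It assumes the copy client keeps an in-memory list of every data block it creates and deletes them if the copy does not finish. None of these names come from Kudu: FakeDisk, CopyCleanupGuard, and SimulateCopy are hypothetical and exist only for illustration, and the real TabletCopyClient is structured differently. The superblock is deliberately left alone here, on the assumption that the existing tombstoning path handles it.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for on-disk state; not a Kudu class.
struct FakeDisk {
  std::set<std::string> objects;
  void Create(const std::string& name) { objects.insert(name); }
  void Delete(const std::string& name) { objects.erase(name); }
};

// Hypothetical guard: remembers every data block created during a copy and
// deletes them all on destruction unless Commit() was called first.
class CopyCleanupGuard {
 public:
  explicit CopyCleanupGuard(FakeDisk* disk) : disk_(disk) {}
  CopyCleanupGuard(const CopyCleanupGuard&) = delete;
  CopyCleanupGuard& operator=(const CopyCleanupGuard&) = delete;

  ~CopyCleanupGuard() {
    if (committed_) return;
    // The copy did not finish: reclaim the blocks that the on-disk
    // superblock has no record of.
    for (const std::string& name : created_) {
      disk_->Delete(name);
    }
  }

  // Creates a block and records it so it can be reclaimed on failure.
  void CreateTracked(const std::string& name) {
    assert(!committed_);
    disk_->Create(name);
    created_.push_back(name);
  }

  // Marks the copy as finished; tracked blocks are kept.
  void Commit() { committed_ = true; }

 private:
  FakeDisk* disk_;
  std::vector<std::string> created_;
  bool committed_ = false;
};

// Simulates a copy of num_blocks data blocks that fails just before writing
// block number fail_at (pass a negative value for a copy that never fails).
// Returns true if the copy completed.
bool SimulateCopy(FakeDisk* disk, int num_blocks, int fail_at) {
  // Assumed to be cleaned up by tombstoning, so it is not tracked here.
  disk->Create("superblock");
  CopyCleanupGuard guard(disk);
  for (int i = 0; i < num_blocks; ++i) {
    if (i == fail_at) return false;  // guard's destructor deletes the blocks
    guard.CreateTracked("block-" + std::to_string(i));
  }
  guard.Commit();
  return true;
}
```

With this shape, a failure that does not crash the process (say, a network outage partway through the fetch) reclaims the partially downloaded blocks immediately. A crash would still orphan them, since the tracking list lives only in memory; that case remains the job of the general purpose GC tracked in KUDU-829.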
[jira] [Updated] (KUDU-1853) Error during tablet copy may orphan a bunch of stuff
[ https://issues.apache.org/jira/browse/KUDU-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Percy updated KUDU-1853:
-----------------------------
    Code Review: http://gerrit.cloudera.org:8080/5799