[ https://issues.apache.org/jira/browse/KUDU-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wong resolved KUDU-2627.
-------------------------------
    Fix Version/s: 1.12.0
       Resolution: Duplicate

This is a duplicate of KUDU-2993, which is resolved as of Kudu 1.12.

> Automatically "fix" inconsistent data directories
> -------------------------------------------------
>
>                 Key: KUDU-2627
>                 URL: https://issues.apache.org/jira/browse/KUDU-2627
>             Project: Kudu
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Andrew Wong
>            Priority: Major
>             Fix For: 1.12.0
>
>
> Currently, Kudu checks the integrity of its FS layout by verifying that all
> data dirs exist where they're expected, and that each of them "knows" about
> the rest of the data dirs in the FS layout. When a data dir is missing on
> disk (e.g. because the underlying disk was yanked and a new one was put in),
> every other data dir still expects the one that is now missing. Following
> KUDU-2359, Kudu will accept this and start up, but label the data dir as
> "failed", alerting users that something on disk is inconsistent with their
> FS config, at which point they can run `kudu fs update_dirs` with the
> expected directories.
>
> This isn't a great user experience for a couple of reasons: 1) it adds more
> legwork and more downtime when recovering from disk failures, performing
> hardware upgrades, etc.; 2) if the user _is_ repairing a disk failure, the
> "new" directories passed to the `kudu fs update_dirs` tool will be identical
> to the old ones (or, more cautiously, the update will be done as a removal
> followed by an addition), which is somewhat confusing. The
> `kudu fs update_dirs` tool is already smart enough to tell users when
> attention is needed (e.g. when removing directories that have tablets
> striped across them); it wouldn't be unreasonable to run it (or mirror its
> behavior) ahead of server startup.
>
> For administrators who prefer tooling, it probably makes sense to keep the
> current, more conservative, less automatic codepaths available, selectable
> via a flag.
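
The check and the flag-gated startup choice described above can be sketched roughly as follows. This is only an illustration in Python, not Kudu's implementation: the names `LayoutReport`, `check_layout`, `plan_startup`, and the `auto_fix` parameter are all hypothetical, and each data dir's on-disk metadata is modelled simply as the set of directories it records.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class LayoutReport:
    # Hypothetical result type; not a Kudu structure.
    missing: List[str] = field(default_factory=list)       # configured, absent on disk
    inconsistent: List[str] = field(default_factory=list)  # present, records another dir set

    @property
    def healthy(self) -> bool:
        return not self.missing and not self.inconsistent


def check_layout(configured: List[str],
                 on_disk: Dict[str, FrozenSet[str]]) -> LayoutReport:
    """Check that every configured dir exists and "knows" about all the others.

    `on_disk` maps each dir found on disk to the set of dirs it records.
    """
    expected = frozenset(configured)
    report = LayoutReport()
    for data_dir in configured:
        if data_dir not in on_disk:
            report.missing.append(data_dir)
        elif on_disk[data_dir] != expected:
            report.inconsistent.append(data_dir)
    return report


def plan_startup(report: LayoutReport, auto_fix: bool) -> str:
    """Pick a startup action; `auto_fix` stands in for the proposed flag."""
    if report.healthy:
        return "start"
    if auto_fix:
        # Proposed behaviour: repair the layout as part of startup.
        return "rewrite-metadata-then-start"
    # Conservative behaviour: start up, mark the dirs as failed, and leave
    # the repair to the operator (e.g. via `kudu fs update_dirs`).
    return "start-with-failed-dirs"
```

With three configured directories of which one has been swapped out, `check_layout` reports that directory as missing, and `plan_startup` returns either the conservative or the automatic action depending on `auto_fix`.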



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
