[ 
https://issues.apache.org/jira/browse/KUDU-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963802#comment-16963802
 ] 

YangSong commented on KUDU-2975:
--------------------------------

Thank you, let me summarize the implementation:
 # We need to add a new gflag, such as {{--fs_wal_dirs}}, to support spreading WALs 
across multiple directories. We should keep {{--fs_wal_dir}} around for backwards 
compatibility. Users can choose one of them.
 # The first time 'fs_manager' is initialized, it needs to generate an instance 
file per WAL directory. If the data directories (fs_data_dirs) are not provided, we 
use the write-ahead log directories (fs_wal_dirs) as data directories. If the 
metadata directory is not provided, we use the first WAL directory or the first 
data directory. If one of the WAL directories doesn't exist, report a fatal 
error. If some of the WAL directories have an 'instance' file but others do 
not, report a fatal error.
 # Add a class WalDirManager, maybe like this:

class WalDirManager {
 public:
  static Status Create(CanonicalizedRootsList wal_fs_roots,
                       std::unique_ptr<WalDirManager>* wal_manager);
  static Status Open(CanonicalizedRootsList wal_fs_roots,
                     std::unique_ptr<WalDirManager>* wal_manager);
  ~WalDirManager();
  void Shutdown();
  Status LoadWalDirFromPB(const std::string& tablet_id, const WalDirPB& pb);
  std::set<std::string> FindTabletsByWALDir(const std::string& wal_dir) const;
  Status FindWalDirByTabletId(const std::string& tablet_id,
                              std::string* wal_dir) const;
  Status MarkWalDirsFailed(const std::string& error_message = "");
  void MarkWalDirFailed(const std::string& dir);
  bool IsWalDirFailed(const std::string& dir) const;
  const std::set<string> GetFailedDataDirs() const;
  std::vector<std::string> GetWalDirs() const;
  string GetWalDirByUuid(string uuid) const;
  Status CreateWalDir(const std::string& tablet_id);

 private:
  WalDirManager(CanonicalizedRootsList canonicalized_wal_roots);

  const CanonicalizedRootsList canonicalized_wal_fs_roots_;

  typedef std::unordered_map<std::string, std::string> DirByUuidMap;
  DirByUuidMap dir_by_uuid_;

  typedef std::multimap<std::string, std::string> TabletsByDirMap;
  TabletsByDirMap tablets_by_dir_;

  typedef std::set<string> FailedWalDirSet;
  FailedWalDirSet failed_data_dirs_;
};

 We need to update the "instance" file under each WAL dir when creating a new 
WalDirManager. Each WAL directory generates its own UUID and records it 
in the instance file. The directory structure may look like this:

 

--wal
----instance
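
A minimal C++ sketch of points 1 and 2 above, to make the fallback and validation rules concrete. The names FsLayout, ResolveLayout, WalDirState, InitAction and CheckWalDirs are hypothetical and are not existing Kudu code; exceptions stand in for Kudu's Status and fatal-error handling. Treating "both flags set" as an error, and picking the first WAL directory (rather than the first data directory) as the metadata default, are assumptions on my part.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the resolved directory layout; not a Kudu type.
struct FsLayout {
  std::vector<std::string> wal_roots;
  std::vector<std::string> data_roots;
  std::string metadata_root;
};

// Splits a comma-separated flag value, skipping empty entries.
std::vector<std::string> SplitCsv(const std::string& csv) {
  std::vector<std::string> out;
  std::stringstream ss(csv);
  std::string item;
  while (std::getline(ss, item, ',')) {
    if (!item.empty()) out.push_back(item);
  }
  return out;
}

// Point 1: accept either --fs_wal_dir or --fs_wal_dirs (assumed: not both).
// Point 2: data dirs default to the WAL dirs; the metadata dir defaults to
// the first WAL dir (assumed choice between the two options mentioned).
FsLayout ResolveLayout(const std::string& fs_wal_dir,
                       const std::string& fs_wal_dirs,
                       const std::string& fs_data_dirs,
                       const std::string& fs_metadata_dir) {
  if (!fs_wal_dir.empty() && !fs_wal_dirs.empty()) {
    throw std::invalid_argument("set only one of fs_wal_dir / fs_wal_dirs");
  }
  FsLayout layout;
  if (!fs_wal_dirs.empty()) {
    layout.wal_roots = SplitCsv(fs_wal_dirs);
  } else if (!fs_wal_dir.empty()) {
    layout.wal_roots.push_back(fs_wal_dir);
  }
  if (layout.wal_roots.empty()) {
    throw std::invalid_argument("no WAL directory configured");
  }
  layout.data_roots = SplitCsv(fs_data_dirs);
  if (layout.data_roots.empty()) layout.data_roots = layout.wal_roots;
  layout.metadata_root =
      fs_metadata_dir.empty() ? layout.wal_roots.front() : fs_metadata_dir;
  return layout;
}

// What was observed on disk for one WAL directory.
struct WalDirState {
  bool exists;
  bool has_instance;
};

enum class InitAction { kCreateInstances, kOpenExisting };

// Throws if any WAL dir is missing, or if only some dirs have an instance
// file. Otherwise reports whether this is a first start or a reopen.
InitAction CheckWalDirs(const std::vector<WalDirState>& dirs) {
  assert(!dirs.empty());  // caller must pass at least one WAL dir
  std::size_t with_instance = 0;
  for (const WalDirState& d : dirs) {
    if (!d.exists) throw std::runtime_error("WAL directory does not exist");
    if (d.has_instance) ++with_instance;
  }
  if (with_instance == 0) return InitAction::kCreateInstances;
  if (with_instance == dirs.size()) return InitAction::kOpenExisting;
  throw std::runtime_error("only some WAL directories have an instance file");
}
```

For example, calling ResolveLayout with only fs_wal_dirs set to "/a,/b" would give two WAL roots, the same two data roots, and "/a" as the metadata root.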

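
To illustrate the bookkeeping the proposed WalDirManager would do, here is a simplified stand-in. SimpleWalDirManager is a hypothetical name, bool return values replace Kudu's Status, and instance files and UUIDs are left out. The summary above does not say how CreateWalDir would pick a directory, so the policy here (the healthy directory with the fewest tablets, ties broken by path order) is only an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the proposed WalDirManager.
class SimpleWalDirManager {
 public:
  explicit SimpleWalDirManager(const std::vector<std::string>& wal_dirs) {
    for (const std::string& d : wal_dirs) tablets_by_dir_[d];  // empty set
    assert(!tablets_by_dir_.empty());  // at least one WAL dir is required
  }

  // Assigns the tablet to the healthy dir with the fewest tablets (assumed
  // policy). Returns false if already assigned or no healthy dir is left.
  bool CreateWalDir(const std::string& tablet_id) {
    if (dir_by_tablet_.count(tablet_id) > 0) return false;
    const std::string* best = nullptr;
    std::size_t best_count = 0;
    for (const auto& entry : tablets_by_dir_) {
      if (failed_dirs_.count(entry.first) > 0) continue;
      if (best == nullptr || entry.second.size() < best_count) {
        best = &entry.first;
        best_count = entry.second.size();
      }
    }
    if (best == nullptr) return false;
    dir_by_tablet_[tablet_id] = *best;
    tablets_by_dir_[*best].insert(tablet_id);
    return true;
  }

  bool FindWalDirByTabletId(const std::string& tablet_id,
                            std::string* wal_dir) const {
    auto it = dir_by_tablet_.find(tablet_id);
    if (it == dir_by_tablet_.end()) return false;
    *wal_dir = it->second;
    return true;
  }

  std::set<std::string> FindTabletsByWalDir(const std::string& wal_dir) const {
    auto it = tablets_by_dir_.find(wal_dir);
    return it == tablets_by_dir_.end() ? std::set<std::string>() : it->second;
  }

  void MarkWalDirFailed(const std::string& dir) { failed_dirs_.insert(dir); }

  bool IsWalDirFailed(const std::string& dir) const {
    return failed_dirs_.count(dir) > 0;
  }

 private:
  std::map<std::string, std::set<std::string>> tablets_by_dir_;
  std::map<std::string, std::string> dir_by_tablet_;
  std::set<std::string> failed_dirs_;
};
```

With two directories, successive tablets alternate between them, and a directory marked as failed stops receiving new tablets while its existing assignments stay visible.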

> Spread WAL across multiple data directories
> -------------------------------------------
>
>                 Key: KUDU-2975
>                 URL: https://issues.apache.org/jira/browse/KUDU-2975
>             Project: Kudu
>          Issue Type: New Feature
>          Components: fs, tablet, tserver
>            Reporter: LiFu He
>            Priority: Major
>         Attachments: network.png, tserver-WARNING.png, util.png
>
>
> Recently, we deployed a new Kudu cluster in which every node has 12 SSDs. Then we 
> created a big table and loaded data into it through Flink. We noticed that the 
> utilization of the one SSD used to store the WAL is 100% while the others are idle. So 
> we suggest spreading the WAL across multiple data directories.



