[ 
https://issues.apache.org/jira/browse/KUDU-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054781#comment-17054781
 ] 

wangningito commented on KUDU-3070:
-----------------------------------

[~helifu]

As a B2B (to-business) company we run into many different scenarios. Migrating 
between two clusters, or even just between two standalone nodes, is always 
painful.

As I understand it, 'insert into ... select ... from' is executed by Impala, and 
it is not as fast as the CLI tool 'kudu table copy'. (I am not very familiar 
with Impala.)

And in our benchmarks, migrating data via 'kudu table copy' was not as fast as 
'kudu local_replica copy_from_remote' or just dumping the tablet data.

So our scenario depends more on rewrite_raft_config.

 

> allow skip open block manager in cli cmeta rewrite_raft_config operation
> ------------------------------------------------------------------------
>
>                 Key: KUDU-3070
>                 URL: https://issues.apache.org/jira/browse/KUDU-3070
>             Project: Kudu
>          Issue Type: New Feature
>          Components: CLI
>            Reporter: wangningito
>            Priority: Minor
>
> I work at a big data company that serves more than 1000 companies. We adopted 
> Kudu as a main or auxiliary storage engine. Some of our customers are small 
> startups: they have a lot of data, but running many nodes is too expensive 
> for them. So some of our deployments have few nodes, a lot of data, and data 
> that may not be well compacted.
> In our scenarios there are some migration cases:
>  # from a standalone tserver to another standalone tserver
>  # from a 3-node tserver cluster to another 3-node tserver cluster
> In the past, we had to do something like this:
> {code:java}
> # First, download the tablet data via kudu local_replica copy_from_remote,
> # then rewrite the raft info for each tablet (one kudu invocation per tablet)
> echo ${tablet_id_list} | xargs -I {} kudu local_replica cmeta \
>   rewrite_raft_config {} PEER_INFO -fs_data_dirs=xxx -fs_wal_dir=yyy{code}
> Downloading data via copy_from_remote is blazing fast.
> However, rewriting the Raft config of all tablets can take a long time: 30s - 
> 60s per tablet in my experience, and sometimes longer if the data is not fully 
> compacted. In some cases it took us 2 hours to download the tablet data but 6 
> hours to rewrite the metadata.
> I noticed this code fragment in the RewriteRaftConfig function:
> {code:java}
> FsManager fs_manager(env, FsManagerOpts()); 
> RETURN_NOT_OK(fs_manager.Open());{code}
> This means the fs_data_dirs and fs_wal_dir are opened 100 times if I want to 
> rewrite the Raft config of 100 tablets.
> To save that per-operation overhead, rewrite_raft_config could simply skip 
> opening the block manager, because the operation only touches metadata 
> files.
>  
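
For illustration, the per-tablet workflow from the issue description can be
sketched as a small shell function. This is a sketch under assumptions, not
part of the issue or of Kudu: the function name rewrite_all and the variables
KUDU_BIN, DATA_DIRS, WAL_DIR and DRY_RUN are hypothetical, PEER_INFO, xxx and
yyy are the same placeholders used in the description, and the argument order
simply mirrors the command quoted there. By default it only prints the
commands instead of running them.

```shell
# Hypothetical helper: one rewrite_raft_config invocation per tablet ID.
# Usage: rewrite_all PEER_INFO tablet_id [tablet_id ...]
rewrite_all() {
  peer_info=$1
  shift
  # DRY_RUN defaults to 1, so the sketch prints the commands; set DRY_RUN=0
  # to actually execute them.
  if [ "${DRY_RUN:-1}" = "1" ]; then run=echo; else run=; fi
  for tablet_id in "$@"; do
    # Each iteration is a separate kudu process, so the directories given by
    # -fs_data_dirs and -fs_wal_dir are opened again for every tablet.
    $run "${KUDU_BIN:-kudu}" local_replica cmeta rewrite_raft_config \
      "$tablet_id" "$peer_info" \
      "-fs_data_dirs=${DATA_DIRS:-xxx}" "-fs_wal_dir=${WAL_DIR:-yyy}"
  done
}
```

Calling `rewrite_all PEER_INFO tablet_a tablet_b` prints two commands, one per
tablet ID. Because every command is its own process, each one pays the cost of
opening the filesystem layout once, which is the overhead the issue proposes
to avoid by skipping the block manager.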



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
