[jira] [Commented] (KUDU-2181) Multi-master config change support
[ https://issues.apache.org/jira/browse/KUDU-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352051#comment-17352051 ]

ASF subversion and git services commented on KUDU-2181:
---

Commit 04262dc8d20ac441890971d8487eda586fdbe43e in kudu's branch refs/heads/branch-1.15.x from Bankim Bhavsar
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=04262dc ]

[master] KUDU-2181 Fix duplicate master address and remove unsafe flag tag

While testing add master on a physical cluster, observed that supplying only hostnames without a port resulted in duplicates being supplied to bring up the new master, which in turn leads to a failure in creating the distributed Raft config on startup.

For example:
kudu master add hp1,hp2 h2

The reason is that hp2 is compared as hp2:7051, whereas the vector of strings "master_addresses" contains hp1,hp2.

This change adds a new function ParseAddresses() in the HostPort class that is a variant of the existing ParseStrings() function and takes a vector of strings instead. This new function is used in the duplicate detection logic.

This change also removes the unsafe tag from --master_addr_add_new_master as the feature is ready, but keeps it hidden as it is only meant to be used by the add master orchestration tool.

This one is tricky to test locally and to write a test for, because we need to specify ports for starting masters/tservers. Verified the fix on a physical cluster. Along with CM integration we can add a systest later.

Change-Id: Icf29730e3a6b225adb24ff161cac2ad777b46b81
Reviewed-on: http://gerrit.cloudera.org:8080/17500
Tested-by: Bankim Bhavsar
Reviewed-by: Alexey Serbin
(cherry picked from commit 261d71ef71fcfdf96c26c9d604d6fc0147c2ee6a)
Reviewed-on: http://gerrit.cloudera.org:8080/17516
Tested-by: Kudu Jenkins

> Multi-master config change support
> --
>
> Key: KUDU-2181
> URL: https://issues.apache.org/jira/browse/KUDU-2181
> Project: Kudu
> Issue Type: Improvement
> Components: consensus, master
> Reporter: Mike Percy
> Assignee: Bankim Bhavsar
> Priority: Major
> Labels: roadmap-candidate
>
> It would be very useful to add support to the Kudu master for dynamic config
> change. The current procedure for replacing a failed master is fairly arduous.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
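The duplicate-detection problem described in the commit message above can be illustrated with a minimal Python sketch. It is not Kudu's ParseAddresses() implementation. The helper names `parse_addresses` and `has_duplicate` are hypothetical, and the default port 7051 is taken from the "hp2:7051" comparison mentioned in the message. The idea is simply to normalize every address to a (host, port) pair before comparing, so that "hp2" and "hp2:7051" are treated as the same master.

```python
from typing import List, Tuple

# Assumption: 7051 is the port applied when none is given, as implied by
# the "hp2 is compared as hp2:7051" remark in the commit message.
DEFAULT_MASTER_PORT = 7051


def parse_addresses(addrs: List[str],
                    default_port: int = DEFAULT_MASTER_PORT) -> List[Tuple[str, int]]:
    """Hypothetical helper: turn 'host' or 'host:port' strings into (host, port) pairs."""
    parsed = []
    for addr in addrs:
        host, sep, port = addr.strip().partition(":")
        if not host:
            raise ValueError(f"missing host in address {addr!r}")
        # Fall back to the default port only when no ':' was present.
        parsed.append((host, int(port) if sep else default_port))
    return parsed


def has_duplicate(existing: List[str], new_master: str) -> bool:
    """Compare normalized pairs rather than raw strings."""
    new_hp = parse_addresses([new_master])[0]
    return new_hp in parse_addresses(existing)
```

With raw string comparison, "hp2:7051" would not match the entry "hp2" in a list such as hp1,hp2. After normalization both become ("hp2", 7051) and the duplicate is detected.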
[jira] [Commented] (KUDU-2181) Multi-master config change support
[ https://issues.apache.org/jira/browse/KUDU-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352036#comment-17352036 ]

ASF subversion and git services commented on KUDU-2181:
---

Commit eda678de47856910c3b2aaef6e0e9e11473eb5ff in kudu's branch refs/heads/master from Bankim Bhavsar
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=eda678d ]

[master] KUDU-2181 Fix duplicate master address and remove unsafe flag tag

While testing add master on a physical cluster, observed that supplying only hostnames without a port resulted in duplicates being supplied to bring up the new master, which in turn leads to a failure in creating the distributed Raft config on startup.

For example:
kudu master add hp1,hp2 h2

The reason is that hp2 is compared as hp2:7051, whereas the vector of strings "master_addresses" contains hp1,hp2.

This change adds a new function ParseAddresses() in the HostPort class that is a variant of the existing ParseStrings() function and takes a vector of strings instead. This new function is used in the duplicate detection logic.

This change also removes the unsafe tag from --master_addr_add_new_master as the feature is ready, but keeps it hidden as it is only meant to be used by the add master orchestration tool.

This one is tricky to test locally and to write a test for, because we need to specify ports for starting masters/tservers. Verified the fix on a physical cluster. Along with CM integration we can add a systest later.

Change-Id: Icf29730e3a6b225adb24ff161cac2ad777b46b81
Reviewed-on: http://gerrit.cloudera.org:8080/17500
Tested-by: Bankim Bhavsar
Reviewed-by: Alexey Serbin

> Multi-master config change support
> --
>
> Key: KUDU-2181
> URL: https://issues.apache.org/jira/browse/KUDU-2181
> Project: Kudu
> Issue Type: Improvement
> Components: consensus, master
> Reporter: Mike Percy
> Assignee: Bankim Bhavsar
> Priority: Major
> Labels: roadmap-candidate
>
> It would be very useful to add support to the Kudu master for dynamic config
> change. The current procedure for replacing a failed master is fairly arduous.
[jira] [Commented] (KUDU-3183) Add tablePrefix option to the Kudu restore job
[ https://issues.apache.org/jira/browse/KUDU-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351921#comment-17351921 ]

ASF subversion and git services commented on KUDU-3183:
---

Commit 14a28e167630824c1eaeb7139e095e14106e5405 in kudu's branch refs/heads/branch-1.15.x from Abhishek Chennaka
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=14a28e1 ]

[backup] KUDU-3183 Add --newDatabaseName and --removeImpalaPrefix options to restore job

While Kudu does not have a notion of a database, the full table name is usually stored as <database>.<table> on the Kudu side (NOTE: the database name is optional). Using the options in this patch, users can change the existing database name of a table or add a new database name to it, i.e. the prefix before the '.' in the full table name, and can also remove the impala prefix from the tables being restored.

1. --newDatabaseName: Use this option to specify the new database name for the restored table. This overwrites any existing database name; if there is none, the new database name is added to the table name. It does not affect the "impala::" prefix.
E.g., adding the database name "newDB" to the tables impala::default.test, impala::test, default.test, test results in the table names impala::newDB.test, impala::newDB.test, newDB.test, newDB.test. This does not affect the existing/source tables.

2. --removeImpalaPrefix: If enabled, this option removes the “impala::” prefix, if present, from the restored table names. This is advisable if tables are backed up in Kudu clusters without HMS integration and are being restored to Kudu clusters with HMS integration.

Change-Id: I65adcc1b3de0a8e1ac5b7f50a2d3a7036aa69421
Reviewed-on: http://gerrit.cloudera.org:8080/17388
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke
(cherry picked from commit bd37d601d36bcf51c595baad52b5d24b5e20d684)
Reviewed-on: http://gerrit.cloudera.org:8080/17513
Tested-by: Grant Henke
Reviewed-by: Bankim Bhavsar

> Add tablePrefix option to the Kudu restore job
> --
>
> Key: KUDU-3183
> URL: https://issues.apache.org/jira/browse/KUDU-3183
> Project: Kudu
> Issue Type: Improvement
> Components: backup
> Affects Versions: 1.10.0
> Reporter: Grant Henke
> Assignee: Abhishek
> Priority: Minor
> Labels: beginner, newbie, trivial
>
> The Kudu restore job has a `tableSuffix` option that is useful for slightly
> changing the name of a table on restore, but it could also benefit from a
> `tablePrefix` option. This would be especially useful as a way to append a
> database to a table name when restoring to an environment with HMS sync
> enabled.
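The renaming rules described for --newDatabaseName and --removeImpalaPrefix can be sketched in a few lines of Python. This is an illustration only, not the restore job's actual code. The function name `restored_table_name` and its parameters are hypothetical. Two further assumptions are made: the database name is whatever precedes the first '.', and when both options are given the prefix is removed after the database name is rewritten.

```python
from typing import Optional

IMPALA_PREFIX = "impala::"


def restored_table_name(name: str,
                        new_database_name: Optional[str] = None,
                        remove_impala_prefix: bool = False) -> str:
    """Hypothetical sketch of the table-name rewriting described above."""
    prefix = ""
    rest = name
    # Split off the "impala::" prefix so that it is left untouched by the
    # database rewrite.
    if rest.startswith(IMPALA_PREFIX):
        prefix = IMPALA_PREFIX
        rest = rest[len(IMPALA_PREFIX):]

    if new_database_name is not None:
        # Assumption: the database name is the text before the first '.'.
        _, sep, table = rest.partition(".")
        table_only = table if sep else rest
        rest = f"{new_database_name}.{table_only}"

    if remove_impala_prefix:
        prefix = ""
    return prefix + rest
```

Applied to the four table names from the commit message with the new database name "newDB", this reproduces the results listed there; the source table names themselves are never modified.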
[jira] [Commented] (KUDU-2064) Overall log cache usage doesn't respect the limit
[ https://issues.apache.org/jira/browse/KUDU-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351716#comment-17351716 ]

YifanZhang commented on KUDU-2064:
--

I also found that actual log cache usage exceeded the log_cache_size_limit/global_log_cache_limit on a tserver's mem-tracker page (Kudu version 1.12.0):
||Id ||Parent ||Limit ||Current Consumption ||Peak Consumption ||
|root|none|none|44.97G|76.44G|
|block_cache-sharded_lru_cache|root|none|40.01G|40.02G|
|server|root|none|2.50G|26.29G|
|log_cache|root|1.00G|2.46G|10.89G|
|log_cache:adbee30f32664a48bc24f80b1e53d425:cbcc9aa7ac9c4167a7ba0b540c95c83a|log_cache|128.00M|854.01M|858.10M|
|log_cache:adbee30f32664a48bc24f80b1e53d425:4b2cbe4fd0d64e7d998a8abddbc1fb47|log_cache|128.00M|793.87M|794.58M|
|log_cache:adbee30f32664a48bc24f80b1e53d425:ea0d65bc2f384757b2259a19829fab9c|log_cache|128.00M|254.86M|429.48M|
|log_cache:adbee30f32664a48bc24f80b1e53d425:65065df878a64d1bae52fcd0bf6a2e45|log_cache|128.00M|215.48M|392.56M|

However, the tablet that consumes the most log cache is TOMBSTONED, so I'm not sure whether the cache is actually occupied or the MemTracker has simply not been updated.

I also saw some kernel_stack_watchdog traces in the log:
{code:java}
W0526 11:35:35.414122 27289 kernel_stack_watchdog.cc:198] Thread 190027 stuck at /home/zhangyifan8/work/kudu-xm/src/kudu/consensus/log.cc:405 for 118ms:
Kernel stack:
[] futex_wait_queue_me+0xc6/0x130
[] futex_wait+0x17b/0x280
[] do_futex+0x106/0x5a0
[] SyS_futex+0x80/0x180
[] system_call_fastpath+0x1c/0x21
[] 0x
User stack:
    @ 0x7fe923e72370 (unknown)
    @ 0x2318d54 kudu::RowOperationsPB::~RowOperationsPB()
    @ 0x20d0300 kudu::tserver::WriteRequestPB::SharedDtor()
    @ 0x20d37a8 kudu::tserver::WriteRequestPB::~WriteRequestPB()
    @ 0x2095703 kudu::consensus::ReplicateMsg::SharedDtor()
    @ 0x209b038 kudu::consensus::ReplicateMsg::~ReplicateMsg()
    @ 0xc3d617 kudu::consensus::LogCache::EvictSomeUnlocked()
    @ 0xc3e052 _ZNSt17_Function_handlerIFvRKN4kudu6StatusEEZNS0_9consensus8LogCache16AppendOperationsERKSt6vectorI13scoped_refptrINS5_19RefCountedReplicateEESaISA_EERKSt8functionIS4_EEUlS3_E_E9_M_invokeERKSt9_Any_dataS3_
    @ 0xc89ea9 kudu::log::Log::AppendThread::HandleBatches()
    @ 0xc8a7ad kudu::log::Log::AppendThread::ProcessQueue()
    @ 0x2295cfe kudu::ThreadPool::DispatchThread()
    @ 0x228ecaf kudu::Thread::SuperviseThread()
    @ 0x7fe923e6adc5 start_thread
    @ 0x7fe92214c73d __clone
{code}
This often happens when there is a large number of write requests, and it results in slow writes.

> Overall log cache usage doesn't respect the limit
> -
>
> Key: KUDU-2064
> URL: https://issues.apache.org/jira/browse/KUDU-2064
> Project: Kudu
> Issue Type: Bug
> Components: log
> Affects Versions: 1.4.0
> Reporter: Jean-Daniel Cryans
> Priority: Major
> Labels: data-scalability
>
> Looking at a fairly loaded machine (10TB of data in LBM, close to 10k
> tablets), I can see in the mem-trackers page that the log cache is using
> 1.83GB, that it peaked at 2.82GB, with a 1GB limit. It's consistent on other
> similarly loaded tservers. It's unexpected.
> Looking at the per-tablet breakdown, they all have between 0 and a handful of
> MBs.
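To make the comparison in the mem-tracker table above concrete, here is a small Python sketch that flags trackers whose current consumption exceeds their limit. It is not part of Kudu. The helper names `to_bytes` and `over_limit` are hypothetical, and the size suffixes are assumed to be binary units (K, M, G as powers of 1024), which does not affect the ratios when both values share a unit.

```python
from typing import List, Optional, Tuple

# Assumption: the mem-tracker page uses binary units.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def to_bytes(text: str) -> Optional[float]:
    """Convert strings such as '128.00M' or '1.00G' to bytes; 'none' means no limit."""
    if text == "none":
        return None
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def over_limit(rows: List[Tuple[str, str, str]]) -> List[Tuple[str, float]]:
    """Return (tracker id, current/limit ratio) for every tracker above its limit."""
    flagged = []
    for tracker_id, limit, current in rows:
        limit_bytes = to_bytes(limit)
        current_bytes = to_bytes(current)
        if limit_bytes is not None and current_bytes > limit_bytes:
            flagged.append((tracker_id, round(current_bytes / limit_bytes, 2)))
    return flagged


# (id, limit, current consumption) taken from three rows of the table above.
rows = [
    ("root", "none", "44.97G"),
    ("log_cache", "1.00G", "2.46G"),
    ("log_cache:adbee30f32664a48bc24f80b1e53d425:cbcc9aa7ac9c4167a7ba0b540c95c83a",
     "128.00M", "854.01M"),
]
flagged = over_limit(rows)
```

For these rows, both the global log_cache tracker and the per-tablet tracker are flagged, while root is skipped because it has no limit.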