[jira] [Commented] (KUDU-2181) Multi-master config change support

2021-05-26 Thread ASF subversion and git services (Jira)


[ https://issues.apache.org/jira/browse/KUDU-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352051#comment-17352051 ]

ASF subversion and git services commented on KUDU-2181:
---

Commit 04262dc8d20ac441890971d8487eda586fdbe43e in kudu's branch 
refs/heads/branch-1.15.x from Bankim Bhavsar
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=04262dc ]

[master] KUDU-2181 Fix duplicate master address and remove unsafe flag tag

While testing add master on a physical cluster, we observed that
supplying hostnames without ports resulted in duplicate addresses
being passed when bringing up the new master, which in turn led to
a failure in creating the distributed Raft config on startup.

For example: kudu master add hp1,hp2 hp2

The reason is that hp2 is compared as hp2:7051, whereas the vector
of strings "master_addresses" contains hp1,hp2.

This change adds a new function, ParseAddresses(), to the HostPort
class. It is a variant of the existing ParseStrings() function that
takes a vector of strings instead. The new function is used in the
duplicate detection logic.
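
For illustration, here is a minimal, self-contained C++ sketch of the
kind of normalization the duplicate check needs: a bare hostname is
parsed with the default master port before comparison, so "hp2" and
"hp2:7051" match. The names ParseWithDefaultPort and IsDuplicateMaster
are hypothetical stand-ins, not the actual HostPort::ParseAddresses()
implementation.

{code:cpp}
// Hypothetical sketch (not the actual Kudu code): detect a duplicate master
// address by normalizing every entry to a (host, port) pair, filling in the
// default master port when the user supplied only a hostname.
#include <cstdint>
#include <iostream>
#include <set>
#include <string>
#include <tuple>
#include <vector>

struct HostPort {
  std::string host;
  uint16_t port;

  // Parse "host" or "host:port"; fall back to default_port when no port is given.
  static HostPort ParseWithDefaultPort(const std::string& addr, uint16_t default_port) {
    auto pos = addr.rfind(':');
    if (pos == std::string::npos) {
      return {addr, default_port};
    }
    return {addr.substr(0, pos),
            static_cast<uint16_t>(std::stoi(addr.substr(pos + 1)))};
  }

  bool operator<(const HostPort& other) const {
    return std::tie(host, port) < std::tie(other.host, other.port);
  }
};

// Returns true if new_master is already present in existing_masters, comparing
// normalized (host, port) pairs instead of raw strings.
bool IsDuplicateMaster(const std::vector<std::string>& existing_masters,
                       const std::string& new_master,
                       uint16_t default_port = 7051) {
  std::set<HostPort> existing;
  for (const auto& addr : existing_masters) {
    existing.insert(HostPort::ParseWithDefaultPort(addr, default_port));
  }
  return existing.count(HostPort::ParseWithDefaultPort(new_master, default_port)) > 0;
}

int main() {
  // A raw string comparison of "hp2" vs "hp2:7051" would miss this duplicate.
  std::cout << std::boolalpha
            << IsDuplicateMaster({"hp1", "hp2"}, "hp2") << std::endl;  // true
}
{code}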

This change also removes the unsafe tag from
--master_addr_add_new_master now that the feature is ready, but keeps
the flag hidden since it's only meant to be used by the add master
orchestration tool.

This is tricky to test locally and to write a test for, because we
need to specify ports when starting masters/tservers. The fix was
verified on a physical cluster. A system test can be added later along
with the CM integration.

Change-Id: Icf29730e3a6b225adb24ff161cac2ad777b46b81
Reviewed-on: http://gerrit.cloudera.org:8080/17500
Tested-by: Bankim Bhavsar 
Reviewed-by: Alexey Serbin 
(cherry picked from commit 261d71ef71fcfdf96c26c9d604d6fc0147c2ee6a)
Reviewed-on: http://gerrit.cloudera.org:8080/17516
Tested-by: Kudu Jenkins


> Multi-master config change support
> --
>
> Key: KUDU-2181
> URL: https://issues.apache.org/jira/browse/KUDU-2181
> Project: Kudu
>  Issue Type: Improvement
>  Components: consensus, master
>Reporter: Mike Percy
>Assignee: Bankim Bhavsar
>Priority: Major
>  Labels: roadmap-candidate
>
> It would be very useful to add support to the Kudu master for dynamic config 
> change. The current procedure for replacing a failed master is fairly arduous.





[jira] [Commented] (KUDU-2181) Multi-master config change support

2021-05-26 Thread ASF subversion and git services (Jira)


[ https://issues.apache.org/jira/browse/KUDU-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17352036#comment-17352036 ]

ASF subversion and git services commented on KUDU-2181:
---

Commit eda678de47856910c3b2aaef6e0e9e11473eb5ff in kudu's branch 
refs/heads/master from Bankim Bhavsar
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=eda678d ]

[master] KUDU-2181 Fix duplicate master address and remove unsafe flag tag

While testing add master on a physical cluster, we observed that
supplying hostnames without ports resulted in duplicate addresses
being passed when bringing up the new master, which in turn led to
a failure in creating the distributed Raft config on startup.

For example: kudu master add hp1,hp2 hp2

The reason is that hp2 is compared as hp2:7051, whereas the vector
of strings "master_addresses" contains hp1,hp2.

This change adds a new function, ParseAddresses(), to the HostPort
class. It is a variant of the existing ParseStrings() function that
takes a vector of strings instead. The new function is used in the
duplicate detection logic.

This change also removes the unsafe tag from
--master_addr_add_new_master now that the feature is ready, but keeps
the flag hidden since it's only meant to be used by the add master
orchestration tool.

This is tricky to test locally and to write a test for, because we
need to specify ports when starting masters/tservers. The fix was
verified on a physical cluster. A system test can be added later along
with the CM integration.

Change-Id: Icf29730e3a6b225adb24ff161cac2ad777b46b81
Reviewed-on: http://gerrit.cloudera.org:8080/17500
Tested-by: Bankim Bhavsar 
Reviewed-by: Alexey Serbin 


> Multi-master config change support
> --
>
> Key: KUDU-2181
> URL: https://issues.apache.org/jira/browse/KUDU-2181
> Project: Kudu
>  Issue Type: Improvement
>  Components: consensus, master
>Reporter: Mike Percy
>Assignee: Bankim Bhavsar
>Priority: Major
>  Labels: roadmap-candidate
>
> It would be very useful to add support to the Kudu master for dynamic config 
> change. The current procedure for replacing a failed master is fairly arduous.





[jira] [Commented] (KUDU-3183) Add tablePrefix option to the Kudu restore job

2021-05-26 Thread ASF subversion and git services (Jira)


[ https://issues.apache.org/jira/browse/KUDU-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351921#comment-17351921 ]

ASF subversion and git services commented on KUDU-3183:
---

Commit 14a28e167630824c1eaeb7139e095e14106e5405 in kudu's branch 
refs/heads/branch-1.15.x from Abhishek Chennaka
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=14a28e1 ]

[backup] KUDU-3183 Add --newDatabaseName and --removeImpalaPrefix options to restore job

While Kudu does not have a notion of a database, the full table name
is usually stored as <database name>.<table name> on the Kudu side
(note: the database name is optional). Using the options in this patch,
users can change the existing database name of a table, or add a new
database name to the table (i.e. the prefix before the '.' in the full
table name), as well as remove the "impala::" prefix from the tables
being restored.

1. --newDatabaseName: Use this option to specify the new database
name for the restored table. It overwrites any existing database
name; if there is no existing database name, a new one is added to
the table name. It does not affect the "impala::" prefix.
E.g. adding the database name "newDB" to the tables impala::default.test,
impala::test, default.test and test results in the table names
impala::newDB.test, impala::newDB.test, newDB.test and newDB.test.
This does not affect the existing/source tables.

2. --removeImpalaPrefix: If enabled, this option removes the
"impala::" prefix, if present, from the restored table names. This is
advisable if tables were backed up on a Kudu cluster without HMS
integration and are being restored to a Kudu cluster with HMS
integration.
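
As an illustration of the renaming rules above, the following C++
sketch applies the same transformation to a table name. The real
restore job lives in the Scala backup tooling; RewriteTableName is a
hypothetical helper written only to mirror the documented behavior,
not its API.

{code:cpp}
// Hypothetical sketch: mirror the documented --removeImpalaPrefix and
// --newDatabaseName behavior as a plain string transformation on table names
// such as "impala::default.test" or "test".
#include <iostream>
#include <string>

std::string RewriteTableName(std::string name,
                             const std::string& new_database,  // empty = keep existing
                             bool remove_impala_prefix) {
  const std::string kImpalaPrefix = "impala::";
  const bool had_prefix = name.rfind(kImpalaPrefix, 0) == 0;
  if (had_prefix) {
    name = name.substr(kImpalaPrefix.size());
  }
  if (!new_database.empty()) {
    // Replace any existing "<database>." prefix, or add one if missing.
    const auto dot = name.find('.');
    const std::string table = (dot == std::string::npos) ? name : name.substr(dot + 1);
    name = new_database + "." + table;
  }
  if (had_prefix && !remove_impala_prefix) {
    name = kImpalaPrefix + name;  // --newDatabaseName alone leaves "impala::" intact
  }
  return name;
}

int main() {
  // Matches the example above: adding the database name "newDB".
  for (std::string t : {"impala::default.test", "impala::test", "default.test", "test"}) {
    std::cout << t << " -> "
              << RewriteTableName(t, "newDB", /*remove_impala_prefix=*/false) << std::endl;
  }
}
{code}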

Change-Id: I65adcc1b3de0a8e1ac5b7f50a2d3a7036aa69421
Reviewed-on: http://gerrit.cloudera.org:8080/17388
Tested-by: Kudu Jenkins
Reviewed-by: Grant Henke 
(cherry picked from commit bd37d601d36bcf51c595baad52b5d24b5e20d684)
Reviewed-on: http://gerrit.cloudera.org:8080/17513
Tested-by: Grant Henke 
Reviewed-by: Bankim Bhavsar 


> Add tablePrefix option to the Kudu restore job
> --
>
> Key: KUDU-3183
> URL: https://issues.apache.org/jira/browse/KUDU-3183
> Project: Kudu
>  Issue Type: Improvement
>  Components: backup
>Affects Versions: 1.10.0
>Reporter: Grant Henke
>Assignee: Abhishek
>Priority: Minor
>  Labels: beginner, newbie, trivial
>
> The Kudu restore job has a `tableSuffix` option that is useful for slightly 
> changing the name of a table on restore, but it could also benefit from a 
> `tablePrefix` option. This would be especially useful as a way to append a 
> database to a table name when restoring to an environment with HMS sync 
> enabled. 





[jira] [Commented] (KUDU-2064) Overall log cache usage doesn't respect the limit

2021-05-26 Thread YifanZhang (Jira)


[ https://issues.apache.org/jira/browse/KUDU-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351716#comment-17351716 ]

YifanZhang commented on KUDU-2064:
--

I also found that the actual log cache usage exceeded the
log_cache_size_limit/global_log_cache_limit on a tserver's mem-tracker
page (Kudu version 1.12.0):
||Id||Parent||Limit||Current Consumption||Peak Consumption||
|root|none|none|44.97G|76.44G|
|block_cache-sharded_lru_cache|root|none|40.01G|40.02G|
|server|root|none|2.50G|26.29G|
|log_cache|root|1.00G|2.46G|10.89G|
|log_cache:adbee30f32664a48bc24f80b1e53d425:cbcc9aa7ac9c4167a7ba0b540c95c83a|log_cache|128.00M|854.01M|858.10M|
|log_cache:adbee30f32664a48bc24f80b1e53d425:4b2cbe4fd0d64e7d998a8abddbc1fb47|log_cache|128.00M|793.87M|794.58M|
|log_cache:adbee30f32664a48bc24f80b1e53d425:ea0d65bc2f384757b2259a19829fab9c|log_cache|128.00M|254.86M|429.48M|
|log_cache:adbee30f32664a48bc24f80b1e53d425:65065df878a64d1bae52fcd0bf6a2e45|log_cache|128.00M|215.48M|392.56M|

However, the tablet that consumes the most log cache is TOMBSTONED; I'm not
sure whether the cache is actually still occupied or the MemTracker simply
isn't updated.

I also saw some kernel_stack_watchdog traces in the log:
{code:java}
W0526 11:35:35.414122 27289 kernel_stack_watchdog.cc:198] Thread 190027 stuck 
at /home/zhangyifan8/work/kudu-xm/src/kudu/consensus/log.cc:405 for 118ms:
Kernel stack:
[] futex_wait_queue_me+0xc6/0x130
[] futex_wait+0x17b/0x280
[] do_futex+0x106/0x5a0
[] SyS_futex+0x80/0x180
[] system_call_fastpath+0x1c/0x21
[] 0x

User stack:
@ 0x7fe923e72370  (unknown)
@  0x2318d54  kudu::RowOperationsPB::~RowOperationsPB()
@  0x20d0300  kudu::tserver::WriteRequestPB::SharedDtor()
@  0x20d37a8  kudu::tserver::WriteRequestPB::~WriteRequestPB()
@  0x2095703  kudu::consensus::ReplicateMsg::SharedDtor()
@  0x209b038  kudu::consensus::ReplicateMsg::~ReplicateMsg()
@   0xc3d617  kudu::consensus::LogCache::EvictSomeUnlocked()
@   0xc3e052  
_ZNSt17_Function_handlerIFvRKN4kudu6StatusEEZNS0_9consensus8LogCache16AppendOperationsERKSt6vectorI13scoped_refptrINS5_19RefCountedReplicateEESaISA_EERKSt8functionIS4_EEUlS3_E_E9_M_invokeERKSt9_Any_dataS3_
@   0xc89ea9  kudu::log::Log::AppendThread::HandleBatches()
@   0xc8a7ad  kudu::log::Log::AppendThread::ProcessQueue()
@  0x2295cfe  kudu::ThreadPool::DispatchThread()
@  0x228ecaf  kudu::Thread::SuperviseThread()
@ 0x7fe923e6adc5  start_thread
@ 0x7fe92214c73d  __clone
{code}
This often happens when there is a large number of write requests, and it
results in slow writes.
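
As a rough illustration (not the actual LogCache/MemTracker code), the
sketch below shows how a soft limit with best-effort eviction can
report consumption above the configured limit: if entries that are
still needed cannot be evicted, a burst of appends pushes the tracked
usage past the cap, similar to the overshoot in the table above.

{code:cpp}
// Hypothetical sketch, not Kudu's implementation: a cache with a soft limit
// where eviction skips entries that are still pinned (e.g. not yet flushed to
// the local log or still needed by a lagging peer). A burst of pinned appends
// pushes current consumption past the limit.
#include <cstddef>
#include <deque>
#include <iostream>

struct Entry {
  size_t bytes;
  bool pinned;  // still needed; cannot be evicted yet
};

class SoftLimitedCache {
 public:
  explicit SoftLimitedCache(size_t limit) : limit_(limit) {}

  void Append(const Entry& e) {
    consumption_ += e.bytes;  // consumption is recorded unconditionally
    entries_.push_back(e);
    EvictSomeBestEffort();    // only unpinned entries can be freed
  }

  size_t consumption() const { return consumption_; }

 private:
  void EvictSomeBestEffort() {
    while (consumption_ > limit_ && !entries_.empty() && !entries_.front().pinned) {
      consumption_ -= entries_.front().bytes;
      entries_.pop_front();
    }
  }

  const size_t limit_;
  size_t consumption_ = 0;
  std::deque<Entry> entries_;
};

int main() {
  SoftLimitedCache cache(/*limit=*/1024);
  // Eight 256-byte entries that are all still pinned: the limit is exceeded.
  for (int i = 0; i < 8; ++i) {
    cache.Append({256, /*pinned=*/true});
  }
  std::cout << "consumption=" << cache.consumption() << " (limit=1024)" << std::endl;
}
{code}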

 

> Overall log cache usage doesn't respect the limit
> -
>
> Key: KUDU-2064
> URL: https://issues.apache.org/jira/browse/KUDU-2064
> Project: Kudu
>  Issue Type: Bug
>  Components: log
>Affects Versions: 1.4.0
>Reporter: Jean-Daniel Cryans
>Priority: Major
>  Labels: data-scalability
>
> Looking at a fairly loaded machine (10TB of data in LBM, close to 10k 
> tablets), I can see in the mem-trackers page that the log cache is using 
> 1.83GB, that it peaked at 2.82GB, with a 1GB limit. It's consistent on other 
> similarly loaded tservers. It's unexpected.
> Looking at the per-tablet breakdown, they all have between 0 and a handful of 
> MBs.


