[jira] [Assigned] (KUDU-2345) Add developer docs for the python client

2020-08-10 Thread Attila Bukor (Jira)


 [ 
https://issues.apache.org/jira/browse/KUDU-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Bukor reassigned KUDU-2345:
--

Assignee: Attila Bukor  (was: Jordan Birdsell)

> Add developer docs for the python client
> 
>
> Key: KUDU-2345
> URL: https://issues.apache.org/jira/browse/KUDU-2345
> Project: Kudu
> Issue Type: Improvement
> Components: documentation, python
> Reporter: Grant Henke
> Assignee: Attila Bukor
> Priority: Minor
>
> I am far from a Python expert, especially with Cython in the mix, so it took 
> me a while just to get started working on the Kudu Python client. 
> We should document the basic steps for developing and testing the Kudu Python 
> client, including environment setup, building, and testing (including how to 
> run a single test).
> For now I essentially boiled my work down to this:
>  
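> {code:bash}
> # Build and install Kudu from the debug build directory
> cd /path/to/kudu
> cd build/debug
> make -j4
> make install
> # Start from a clean Python tree (removes all untracked and ignored files)
> cd /path/to/kudu/python
> git clean -fdx
> # Point the Python build at the Kudu source tree
> export KUDU_HOME=/path/to/kudu
> pip install -r requirements.txt
> # Compile the Cython extension, then run the test suite
> python setup.py build_ext
> python setup.py test
> {code}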
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (KUDU-3180) kudu don't always prefer to flush MRS/DMS that anchor more memory

2020-08-10 Thread YifanZhang (Jira)


[ 
https://issues.apache.org/jira/browse/KUDU-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173104#comment-17173104
 ] 

YifanZhang edited comment on KUDU-3180 at 8/10/20, 10:57 AM:
-

If we lower {{-memory_pressure_percentage}}, we should also lower 
{{-block_cache_capacity_mb}} accordingly, and then we may not make full use of 
the memory resources. 

In fact, for most of the day the memory usage of our Kudu server is not very 
high (about 50%), but for an hour or two there are many inserts/updates and 
memory usage grows significantly. During that time Kudu did prioritize flushing 
big MRSs/DMSs, but OOM still occurred sometimes, even though we had tuned 
{{-maintenance_manager_num_threads}} to 20. After we lowered 
{{-flush_threshold_secs}} to 1800 (from 3600), we could avoid the OOMs, but I 
found that {{average_diskrowset_height}} of most tablets became larger, which 
means these tablets need to be compacted more.

In general we want to prioritize flushes so we can free more memory, but we 
also don't want to produce more small DRSs. So prioritizing bigger MRS/DMS 
flushes might help.

Maybe we could use {{max(memory_size, time_since_last_flush)}} to define the 
perf improvement of a mem-store flush, so that both big mem-stores and 
long-lived mem-stores are flushed with priority.
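
A minimal sketch of this idea, not Kudu's actual implementation. The {{MemStore}} class and {{flush_score}} function are hypothetical names used only for illustration. Because memory size and elapsed time have different units, the sketch assumes each term is divided by its threshold ({{-flush_threshold_mb=32}} and {{-flush_threshold_secs=1800}}, the values mentioned in the issue description) before taking the maximum; that normalization is an assumption and is not part of the proposal as stated.

```python
from dataclasses import dataclass


@dataclass
class MemStore:
    """Hypothetical stand-in for an MRS/DMS; not a Kudu type."""
    name: str
    memory_mb: float
    secs_since_last_flush: float


def flush_score(store, flush_threshold_mb=32.0, flush_threshold_secs=1800.0):
    """Score a flush as max(size term, age term).

    Assumption: each term is divided by its threshold so that megabytes
    and seconds become comparable, unitless ratios.
    """
    size_term = store.memory_mb / flush_threshold_mb
    age_term = store.secs_since_last_flush / flush_threshold_secs
    return max(size_term, age_term)


stores = [
    MemStore("small_old", memory_mb=2.0, secs_since_last_flush=3600.0),
    MemStore("big_recent", memory_mb=256.0, secs_since_last_flush=60.0),
    MemStore("small_recent", memory_mb=2.0, secs_since_last_flush=60.0),
]

# The mem-store with the highest score would be flushed first.
ranked = sorted(stores, key=flush_score, reverse=True)
print([s.name for s in ranked])
# ['big_recent', 'small_old', 'small_recent']
```

With these made-up numbers, the large but recently flushed mem-store ranks first, the small but long-lived one ranks second, and a small, recently flushed one ranks last.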

 


> kudu don't always prefer to flush MRS/DMS that anchor more memory
> -
>
> Key: KUDU-3180
> URL: https://issues.apache.org/jira/browse/KUDU-3180
> Project: Kudu
> Issue Type: Improvement
> Reporter: YifanZhang
> Priority: Major
> Attachments: image-2020-08-04-20-26-53-749.png, 
> image-2020-08-04-20-28-00-665.png
>
>
> The current time-based flush policy always gives a flush op a high score if we 
> haven't flushed the tablet in a long time, which may lead to starvation of 
> ops that could free more memory.
> We set -flush_threshold_mb=32 and -flush_threshold_secs=1800 in a cluster and 
> found that some small MRS/DMS flushes have a higher perf score than big MRS/DMS 
> flushes and compactions, which does not seem reasonable.
> !image-2020-08-04-20-26-53-749.png|width=1424,height=317!!image-2020-08-04-20-28-00-665.png|width=1414,height=327!


