Hi Nadav and Waldek,
        Thanks a lot for the very detailed answers from both of you. I have some 
updates on this.
For the first question, I ended up implementing my own ad hoc stat class, with 
which I can measure the total time (or total count) of a function and calculate 
the average. I am still struggling to make perf work. I got this error from 
perf when using perf kvm as shown here 
https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#profiling 
        Couldn't record guest kernel [0]'s reference relocation symbol.
Have you ever encountered this problem when you were developing?
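To make the first point concrete, here is a minimal sketch of the kind of ad hoc stat class I mean. It is not the code from my tree: the names stat_accumulator and scoped_timer are made up for illustration, and it assumes relaxed atomics are good enough because I only want an approximate average (sum and count are not read as one consistent snapshot).

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical ad hoc statistics accumulator (illustrative names, not OSv API).
// It keeps a running sum and a sample count so an average can be derived.
class stat_accumulator {
public:
    // Record one sample, e.g. an elapsed time in nanoseconds.
    void add(uint64_t value) {
        _sum.fetch_add(value, std::memory_order_relaxed);
        _count.fetch_add(1, std::memory_order_relaxed);
    }
    uint64_t sum() const { return _sum.load(std::memory_order_relaxed); }
    uint64_t count() const { return _count.load(std::memory_order_relaxed); }
    // Average of all samples so far; 0.0 when nothing was recorded.
    double average() const {
        uint64_t n = count();
        return n ? static_cast<double>(sum()) / static_cast<double>(n) : 0.0;
    }
private:
    std::atomic<uint64_t> _sum{0};
    std::atomic<uint64_t> _count{0};
};

// RAII helper: adds the lifetime of the object, in nanoseconds, to a
// stat_accumulator when the enclosing scope ends.
class scoped_timer {
public:
    explicit scoped_timer(stat_accumulator& stat)
        : _stat(stat), _start(std::chrono::steady_clock::now()) {}
    ~scoped_timer() {
        auto elapsed = std::chrono::steady_clock::now() - _start;
        auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
        _stat.add(static_cast<uint64_t>(ns.count()));
    }
    scoped_timer(const scoped_timer&) = delete;
    scoped_timer& operator=(const scoped_timer&) = delete;
private:
    stat_accumulator& _stat;
    std::chrono::steady_clock::time_point _start;
};
```

To time a function, one would declare a scoped_timer at the top of it, bound to a long-lived stat_accumulator, and read average() at the end of the benchmark.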

For the second question, I ended up removing the global tlb_flush_mutex and 
introducing a Linux-like design with a per-CPU call_function_data, which 
contains a per-CPU array of call_single_data. Each CPU has its own 
call_single_queue where the call_single_data is enqueued and dequeued. If you 
don’t mind, I can tidy up the code a bit and send a patch for you to review. 
I am not sure how the development process works for OSv, and I would 
appreciate it very much if you could give me some guidance.
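Roughly, the data layout looks like the sketch below. This is a simplified user-space model, not the actual patch: the struct names follow the Linux ones, but smp_call_sim, max_cpus, the pending flag and the mutex-protected deque are assumptions made for illustration (Linux uses a lock-free list and an IPI to kick the target CPU, and the sender waits for a busy slot instead of giving up).

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>

constexpr std::size_t max_cpus = 4;  // hypothetical CPU count for the model

// One request from a sender CPU to a target CPU.
struct call_single_data {
    std::function<void()> func;
    std::atomic<bool> pending{false};  // set by sender, cleared by target
};

// Per-CPU queue of requests waiting to run on that CPU.
struct call_single_queue {
    std::mutex lock;  // assumption: stands in for a lock-free list
    std::deque<call_single_data*> items;

    void enqueue(call_single_data* csd) {
        std::lock_guard<std::mutex> guard(lock);
        items.push_back(csd);
    }

    // Run everything queued so far; returns the number of requests handled.
    std::size_t drain() {
        std::deque<call_single_data*> batch;
        {
            std::lock_guard<std::mutex> guard(lock);
            batch.swap(items);
        }
        for (call_single_data* csd : batch) {
            csd->func();
            csd->pending.store(false, std::memory_order_release);
        }
        return batch.size();
    }
};

// Per-CPU sender state: one call_single_data slot per possible target CPU,
// so senders on different CPUs never share a slot or a global mutex.
struct call_function_data {
    std::array<call_single_data, max_cpus> csd;
};

struct smp_call_sim {
    std::array<call_function_data, max_cpus> cfd;
    std::array<call_single_queue, max_cpus> queues;

    // Queue func on every CPU except the sender. Returns false if one of the
    // sender's slots is still in flight (a real implementation would wait).
    bool call_function_many(std::size_t sender, const std::function<void()>& func) {
        auto& slots = cfd[sender].csd;
        for (std::size_t cpu = 0; cpu < max_cpus; ++cpu) {
            if (cpu != sender && slots[cpu].pending.load(std::memory_order_acquire)) {
                return false;
            }
        }
        for (std::size_t cpu = 0; cpu < max_cpus; ++cpu) {
            if (cpu == sender) {
                continue;
            }
            slots[cpu].func = func;
            slots[cpu].pending.store(true, std::memory_order_release);
            queues[cpu].enqueue(&slots[cpu]);
        }
        return true;
    }
};
```

In this model the only shared state between two senders is the target's queue, which is held just long enough to push or swap, so a TLB flush from one CPU no longer serialises against flushes from the others.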

For the scheduling part, I am reading the paper and the doc now. Thanks for the 
resources. I need some time to digest them, because I found that preempt_lock 
matters a lot for the performance of my code.
    
    Best Wishes
    Pan


> On 28 Nov 2023, at 08:29, Nadav Har'El <[email protected]> wrote:
> 
> On Tue, Nov 28, 2023 at 8:20 AM Waldek Kozaczuk <[email protected] 
> <mailto:[email protected]>> wrote:
>> Hi,
>> 
>> It is great to hear from you. Please see my answers below. 
>> 
>> I hope you also do not mind I reply to the group so others may add something 
>> extra or refine/correct my answers as I am not an original 
>> developer/designer of OSv.
>> 
>> On Fri, Nov 24, 2023 at 8:50 AM Yueyang Pan <[email protected] 
>> <mailto:[email protected]>> wrote:
>>> Dear Waldemar Kozaczuk,
>>>     I am Yueyang Pan from EPFL. Currently I am working on a project about 
>>> remote memory and trying to develop a prototype based on OSv. I am the guy 
>>> who raised the questions on the google group several days ago as well. For 
>>> that question, I made a workaround by adding my own stats class which 
>>> records the sum and count, because all I need is the average. Now I have 
>>> some further questions. Probably they are a bit dumb for you but I will be 
>>> very grateful if you could spend a little bit of time to give me some 
>>> suggestions.
>> 
>> The tracepoints use ring buffers of fixed size so eventually, all old 
>> tracepoints would be overwritten by new ones. I think you can either 
>> increase the size or use the approach used by the script freq.py
> 
> Exactly. OSv's tracepoints have two modes. One is indeed to save them in a 
> ring buffer - so you'll see the last N traced events when you read that 
> buffer - but the other is a mode that just counts the events. What freq.py does 
> is to retrieve the count at one second, then retrieve the count the next 
> second - and the difference is the average number of this event per second.
> 
> If, instead of counting the events, you want a sum of, say, integers 
> that come from the event (e.g., sum of packet lengths), we don't have support 
> for this at the moment - we only increment the count by 1. It could be added 
> as a feature, I guess. But you can always do something ad hoc, like maintaining 
> a global variable which you add to.
>  
>> (you need to add the module httpserver-monitoring-api). There is also newly 
>> added (experimental though) strace-like functionality (see 
>> https://github.com/cloudius-systems/osv/commit/7d7b6d0f1261b87b678c572068e39d482e2103e4).
>>  Finally, you may find the comments on this issue relevant - 
>> https://github.com/cloudius-systems/osv/issues/1261#issuecomment-1722549524. 
>> I am also sure you have come across this wiki page - 
>> https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py.
>> 
>>>     Now after my profiling, I found the global tlb_flush_mutex to 
>>> be hot in my benchmark so I am trying to remove it but it turns out to be a 
>>> bit hard without understanding the thread model of OSv. So I would like to ask 
>>> whether there is any high-level doc that describes what the scheduling 
>>> policy of OSv is, how the priority of the threads is decided, whether we 
>>> can disable preemption or not (the functionality of preempt_lock) and the 
>>> design of synchronisation primitives (for example why it is not allowed to 
>>> have preemption disabled inside lockfree::mutex). I am trying to understand 
>>> by reading the code directly but it would be really helpful if there is some 
>>> material which describes the design.
> 
> There are a lot of questions here, and I'm not even sure answering them will 
> explain specifically why tlb_flush_mutex is highly contended in your workload.
> 
> Waldek suggested that you read the OSv paper from Usenix, which is a good 
> start for understanding the overall OSv architecture.
> The scheduling policy and priority (how to decide which thread should run 
> next) is described in more detail in this document: 
> https://docs.google.com/document/d/1W7KCxOxP-1Fy5EyF2lbJGE2WuKmu5v0suYqoHas1jRM/edit
> 
> If you have specific questions, post them here and I'll try to answer. But 
> only a few at a time :-) You had a lot of questions above and I can't answer 
> them all in one mail :-)

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/FFE772FD-E0EE-49DD-9F15-6BC5B4912F59%40epfl.ch.