Hi Nadav and Waldek,
Thanks a lot for the very detailed answers from both of you. I have some
updates on this.
For the first question, I ended up implementing my own ad hoc stat class that
accumulates the total time (or total count) of a function and calculates the
average. I am still struggling to get perf to work. When using perf kvm as
shown here
https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#profiling
I got this error from perf:
Couldn't record guest kernel [0]'s reference relocation symbol.
Have you ever encountered this problem when you were developing?
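To make the stat class mentioned above concrete, here is a minimal sketch of
the idea, not the exact code from my tree. The names adhoc_stat and
scoped_timer are made up for illustration and are not part of OSv. It only
assumes that a running sum and a count are enough to get an average.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical ad hoc accumulator: keeps a running sum and a sample count.
// Relaxed atomics are enough because only the totals matter, not ordering.
class adhoc_stat {
public:
    void add(std::uint64_t value) {
        _sum.fetch_add(value, std::memory_order_relaxed);
        _count.fetch_add(1, std::memory_order_relaxed);
    }
    std::uint64_t sum() const { return _sum.load(std::memory_order_relaxed); }
    std::uint64_t count() const { return _count.load(std::memory_order_relaxed); }
    // Average of all samples so far; 0.0 when nothing has been recorded.
    double average() const {
        std::uint64_t c = count();
        return c ? static_cast<double>(sum()) / static_cast<double>(c) : 0.0;
    }
private:
    std::atomic<std::uint64_t> _sum{0};
    std::atomic<std::uint64_t> _count{0};
};

// Hypothetical RAII helper: adds the elapsed nanoseconds of the enclosing
// scope to the given stat when it goes out of scope.
class scoped_timer {
public:
    explicit scoped_timer(adhoc_stat& s)
        : _stat(s), _start(std::chrono::steady_clock::now()) {}
    ~scoped_timer() {
        auto elapsed = std::chrono::steady_clock::now() - _start;
        auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
        _stat.add(static_cast<std::uint64_t>(ns.count()));
    }
    scoped_timer(const scoped_timer&) = delete;
    scoped_timer& operator=(const scoped_timer&) = delete;
private:
    adhoc_stat& _stat;
    std::chrono::steady_clock::time_point _start;
};
```

Placing a scoped_timer at the top of the function being measured and reading
average() at the end of the benchmark gives the mean time per call. The sum
and the count are read separately, so a reading taken while other threads are
still adding samples is only approximate.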
For the second question, I ended up removing the global tlb_flush_mutex and
introducing a Linux-like design in which each CPU has a per-CPU
call_function_data containing a per-CPU array of call_single_data. Each CPU
has its own call_single_queue on which call_single_data entries are enqueued
and dequeued. If you don't mind, I can tidy up the code a bit and send the
patch for you to review. I am not sure how the development process works for
OSv, and I would appreciate it very much if you could give me some guidance.
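To illustrate the design described above, here is a simplified user-space
sketch rather than the actual patch. The type names follow Linux's
kernel/smp.c, but the members, the call_function_others helper and the
"return true means send an IPI" convention are my own simplifications for
illustration. It also leaves out the per-entry lock flag that Linux uses to
stop a call_single_data from being reused while it is still queued.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <vector>

// One pending request: the function a remote CPU should run.
struct call_single_data {
    std::function<void()> func;
    call_single_data* next = nullptr;
};

// Per-CPU queue: any CPU may push, only the owning CPU drains.
class call_single_queue {
public:
    // Lock-free push. Returns true if the queue was empty before the push,
    // which is when a real kernel would need to send an IPI to the owner.
    bool enqueue(call_single_data* csd) {
        call_single_data* old = _head.load(std::memory_order_relaxed);
        do {
            csd->next = old;
        } while (!_head.compare_exchange_weak(old, csd,
                                              std::memory_order_release,
                                              std::memory_order_relaxed));
        return old == nullptr;
    }
    // Owner side: detach the whole list at once, then run it in FIFO order.
    // Returns the number of requests that were run.
    std::size_t drain() {
        call_single_data* list = _head.exchange(nullptr, std::memory_order_acquire);
        call_single_data* fifo = nullptr;
        while (list) {                      // reverse the LIFO push order
            call_single_data* n = list->next;
            list->next = fifo;
            fifo = list;
            list = n;
        }
        std::size_t ran = 0;
        while (fifo) {
            call_single_data* n = fifo->next;  // read before running func
            fifo->func();
            fifo = n;
            ++ran;
        }
        return ran;
    }
private:
    std::atomic<call_single_data*> _head{nullptr};
};

// Per-CPU sender state: one call_single_data slot per target CPU, so a
// sender never needs a global lock to find an entry to enqueue.
struct call_function_data {
    std::vector<call_single_data> csd;
    explicit call_function_data(std::size_t ncpus) : csd(ncpus) {}
};

// CPU `self` asks every other CPU to run `func`. Returns how many target
// queues were empty, i.e. how many IPIs would have to be sent.
inline std::size_t call_function_others(std::size_t self,
                                        call_function_data& my_cfd,
                                        std::vector<call_single_queue>& queues,
                                        const std::function<void()>& func) {
    std::size_t ipis = 0;
    for (std::size_t cpu = 0; cpu < queues.size(); ++cpu) {
        if (cpu == self) continue;
        call_single_data& csd = my_cfd.csd[cpu];
        csd.func = func;
        if (queues[cpu].enqueue(&csd)) ++ipis;
    }
    return ipis;
}
```

The point of the layout is that senders only contend on the queue head of the
CPU they are targeting, instead of all serialising on one global mutex. In
the real code the sender would still have to wait until every target has run
the function before returning from a TLB flush.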
For the scheduling part, I am now reading the paper and the doc. Thanks for
the resources. I need some time to digest them because I found that
preempt_lock matters a lot for the performance of my code.
Best Wishes
Pan
> On 28 Nov 2023, at 08:29, Nadav Har'El <[email protected]> wrote:
>
> On Tue, Nov 28, 2023 at 8:20 AM Waldek Kozaczuk <[email protected]> wrote:
>> Hi,
>>
>> It is great to hear from you. Please see my answers below.
>>
>> I hope you also do not mind I reply to the group so others may add something
>> extra or refine/correct my answers as I am not an original
>> developer/designer of OSv.
>>
>> On Fri, Nov 24, 2023 at 8:50 AM Yueyang Pan <[email protected]> wrote:
>>> Dear Waldemar Kozaczuk,
>>> I am Yueyang Pan from EPFL. Currently I am working on a project about
>>> remote memory and trying to develop a prototype based on OSv. I am the guy
>>> who raised the questions on the google group several days ago as well. For
>>> that question, I made a workaround by adding my own stats class which
>>> records the sum and count, because all I need is the average. Now I have
>>> some further questions. Probably they are a bit dumb for you but I will be
>>> very grateful if you could spend a little bit of time to give me some
>>> suggestions.
>>
>> The tracepoints use ring buffers of fixed size so eventually, all old
>> tracepoints would be overwritten by new ones. I think you can either
>> increase the size or use the approach used by the script freq.py
>
> Exactly. OSv's tracepoints have two modes. One is indeed to save them in a
> ring buffer - so you'll see the last N traced events when you read that
> buffer - but the other is a mode that just counts the events. What freq.py
> does is to retrieve the count at one second, then retrieve the count the next
> second - and the difference is the average number of this event per second.
>
> If, instead of counting the event, you want a sum of, say, integers that
> come from the event (e.g., sum of packet lengths), we don't have support
> for this at the moment - we only increment the count by 1. It could be added
> as a feature, I guess. But you can always do something ad hoc, like
> maintaining a global variable which you add to.
>
>> (you need to add the module httpserver-monitoring-api). There is also newly
>> added (experimental though) strace-like functionality (see
>> https://github.com/cloudius-systems/osv/commit/7d7b6d0f1261b87b678c572068e39d482e2103e4).
>> Finally, you may find the comments on this issue relevant -
>> https://github.com/cloudius-systems/osv/issues/1261#issuecomment-1722549524.
>> I am also sure you have come across this wiki page -
>> https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py.
>>
>>> Now after my profiling, I found the global tlb_flush_mutex to
>>> be hot in my benchmark, so I am trying to remove it, but it turns out to be
>>> a bit hard without understanding the thread model of OSv. So I would like
>>> to ask whether there is any high-level doc that describes what the
>>> scheduling policy of OSv is, how the priority of the threads is decided,
>>> whether we can disable preemption or not (the functionality of
>>> preempt_lock) and the design of synchronisation primitives (for example
>>> why it is not allowed to have preemption disabled inside lockfree::mutex).
>>> I am trying to understand by reading the code directly, but it would be
>>> really helpful if there were some material which describes the design.
>
> There are a lot of questions here, and I'm not even sure answering them will
> explain specifically why tlb_flush_mutex is highly contended in your workload.
>
> Waldek suggested that you read the OSv paper from Usenix, which is a good
> start for understanding the overall OSv architecture.
> The scheduling policy and priority (how to decide which thread should run
> next) is described in more detail in this document:
> https://docs.google.com/document/d/1W7KCxOxP-1Fy5EyF2lbJGE2WuKmu5v0suYqoHas1jRM/edit
>
> If you have specific questions, post them here and I'll try to answer. But
> only a few at a time :-) You had a lot of questions above and I can't answer
> them all in one mail :-)