On Tue, Nov 28, 2023 at 3:04 PM Yueyang Pan <[email protected]> wrote:

> Hi Nadav and Waldek,
> Thanks a lot for very detailed answers from both of you. I have some
> updates on this.
> For the first question, I ended up implementing my own ad hoc stat class
> where I can measure the total time (or total count) of a function and
> calculate the average. I am still struggling to get perf to work. I got
> this error from perf when using perf kvm as shown here
> https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#profiling
> *Couldn't record guest kernel [0]'s reference relocation symbol*.
> Have you ever encountered this problem when you were developing?
>

I have never seen it but I will try to dig a bit deeper once I have time.
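
For readers of the thread, an ad hoc stat class of the kind described above
might look like the following. This is only a minimal sketch: the names
sum_count_stat and scoped_timer are hypothetical and appear neither in OSv
nor in the code discussed here, and it assumes that relaxed atomics are
accurate enough for coarse measurements.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical ad hoc statistic (not part of OSv): keeps a running sum and a
// sample count so that the average can be computed on demand.
class sum_count_stat {
public:
    // Record one sample, e.g. elapsed nanoseconds or a packet length.
    void add(std::uint64_t value) {
        _sum.fetch_add(value, std::memory_order_relaxed);
        _count.fetch_add(1, std::memory_order_relaxed);
    }
    std::uint64_t sum() const { return _sum.load(std::memory_order_relaxed); }
    std::uint64_t count() const { return _count.load(std::memory_order_relaxed); }
    // Average of all samples so far; 0 when nothing has been recorded.
    double average() const {
        auto n = count();
        return n ? static_cast<double>(sum()) / static_cast<double>(n) : 0.0;
    }
private:
    std::atomic<std::uint64_t> _sum{0};
    std::atomic<std::uint64_t> _count{0};
};

// RAII helper: measures its own lifetime and adds the elapsed nanoseconds
// to the given stat when it goes out of scope.
class scoped_timer {
public:
    explicit scoped_timer(sum_count_stat& stat)
        : _stat(stat), _start(std::chrono::steady_clock::now()) {}
    ~scoped_timer() {
        auto elapsed = std::chrono::steady_clock::now() - _start;
        auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
        _stat.add(static_cast<std::uint64_t>(ns.count()));
    }
    scoped_timer(const scoped_timer&) = delete;
    scoped_timer& operator=(const scoped_timer&) = delete;
private:
    sum_count_stat& _stat;
    std::chrono::steady_clock::time_point _start;
};
```

Placing a scoped_timer at the top of the function being measured adds one
sample per call. Because the sum and the count are read separately, a reader
racing with add() may see a slightly inconsistent pair, which is usually
acceptable for profiling.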

>
> For the second question, I ended up removing the global tlb_flush_mutex
> and introduced a Linux-like design with a percpu call_function_data that
> contains a percpu array of call_single_data. Each CPU has its own
> call_single_queue where the call_single_data is enqueued or dequeued. If
> you don’t mind, I can tidy up the code a bit and send the patch so that
> you can review it. I am not sure how the development process works for OSv
> and I would appreciate it very much if you could give me some guidance.
>
Feel free to create a PR on GitHub.

Do you see a significant improvement with your change to use percpu
call_function_data? OSv has its own concept of percpu structures
(see include/osv/percpu.hh), so I wonder if you can leverage it.

I wonder how this Linux-like solution helps, given that the point of
mmu::flush_tlb_all() (where tlb_flush_mutex is used) is to coordinate the
flushing of the TLB and make sure all CPUs do it, so that the
virtual/physical mapping is in sync across all CPUs. How do you achieve that
in your solution? Is the potential speed improvement gained by avoiding
IPIs, which are known to be slow?
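
To make the question concrete, here is a rough sketch of how a per-CPU queue
design can still guarantee that every CPU has run the flush before the
initiator returns, without a global mutex. It is not OSv code and not the
patch under discussion: CPUs are simulated with std::thread, a condition
variable stands in for the IPI, and all names (call_request, cpu_queue,
simulated_cpus, call_on_all) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// One pending cross-CPU request, loosely modelled on the call_single_data idea.
struct call_request {
    std::function<void()> func;   // work to run on the target CPU
    std::atomic<int>* pending;    // initiator's count of CPUs still to finish
};

// Per-CPU queue: only this queue's lock is ever taken, never a global one.
struct cpu_queue {
    std::mutex lock;
    std::condition_variable wake; // stands in for the IPI
    std::deque<call_request> requests;
    bool stop = false;
};

class simulated_cpus {
public:
    explicit simulated_cpus(std::size_t n) : _queues(n) {
        for (auto& q : _queues) q = std::make_unique<cpu_queue>();
        for (std::size_t i = 0; i < n; i++)
            _threads.emplace_back([this, i] { run(*_queues[i]); });
    }
    ~simulated_cpus() {
        for (auto& q : _queues) {
            { std::lock_guard<std::mutex> g(q->lock); q->stop = true; }
            q->wake.notify_one();
        }
        for (auto& t : _threads) t.join();
    }
    // Run func on every simulated CPU; return only after all have finished.
    void call_on_all(std::function<void()> func) {
        std::atomic<int> pending{static_cast<int>(_queues.size())};
        for (auto& q : _queues) {
            {
                std::lock_guard<std::mutex> g(q->lock);
                q->requests.push_back({func, &pending});
            }
            q->wake.notify_one();
        }
        // Spin until every CPU has acknowledged the request.
        while (pending.load(std::memory_order_acquire) != 0)
            std::this_thread::yield();
    }
private:
    static void run(cpu_queue& q) {
        std::unique_lock<std::mutex> g(q.lock);
        for (;;) {
            q.wake.wait(g, [&] { return q.stop || !q.requests.empty(); });
            if (q.requests.empty()) return; // stop requested, queue drained
            call_request r = std::move(q.requests.front());
            q.requests.pop_front();
            g.unlock();
            r.func();
            // Last access to the initiator's counter by this CPU.
            r.pending->fetch_sub(1, std::memory_order_release);
            g.lock();
        }
    }
    std::vector<std::unique_ptr<cpu_queue>> _queues;
    std::vector<std::thread> _threads;
};
```

In this sketch several initiators can call call_on_all() concurrently,
because each one waits on its own counter and contends only briefly on the
individual per-CPU locks. It still notifies every CPU, so it does not avoid
the IPI cost itself.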


> For the scheduling part, I am reading the paper and the doc now. Thanks
> for the resources. I need some time to digest them, because I found that
> preempt_lock matters a lot for the performance of my code.
>
>     Best Wishes
>     Pan
>
>
> On 28 Nov 2023, at 08:29, Nadav Har'El <[email protected]> wrote:
>
> On Tue, Nov 28, 2023 at 8:20 AM Waldek Kozaczuk <[email protected]>
> wrote:
>
>> Hi,
>>
>> It is great to hear from you. Please see my answers below.
>>
>> I hope you do not mind that I am also replying to the group, so others
>> may add something extra or refine/correct my answers, as I am not an
>> original developer/designer of OSv.
>>
>> On Fri, Nov 24, 2023 at 8:50 AM Yueyang Pan <[email protected]> wrote:
>>
>>> Dear Waldemar Kozaczuk,
>>>     I am Yueyang Pan from EPFL. Currently I am working on a project
>>> about remote memory and trying to develop a prototype based on OSv. I am
>>> also the person who raised the questions on the Google group several days
>>> ago. For that question, I made a workaround by adding my own stats class
>>> which records the sum and count, because all I need is the average. Now I
>>> have some further questions. They are probably a bit dumb to you, but I
>>> would be very grateful if you could spend a little bit of time to give me
>>> some suggestions.
>>>
>>
>> The tracepoints use ring buffers of fixed size so eventually, all old
>> tracepoints would be overwritten by new ones. I think you can either
>> increase the size or use the approach used by the script *freq.py*
>>
>
> Exactly. OSv's tracepoints have two modes. One is indeed to save them in a
> ring buffer - so you'll see the last N traced events when you read that
> buffer - but the other is a mode that just counts the events. What freq.py
> does is to retrieve the count at one second, then retrieve the count the
> next second - and the difference between the two is the average number of
> occurrences of this event per second.
>
> If, instead of counting the event, you want a sum of, say, integers
> that come from the event (e.g., sum of packet lengths), we don't have
> support for this at the moment - we only increment the count by 1. It could
> be added as a feature, I guess. But you can always do something ad hoc,
> like maintaining a global variable to which you add.
>
>
>> (you need to add the module *httpserver-monitoring-api*). There is also
>> newly added (experimental though) strace-like functionality (see
>> https://github.com/cloudius-systems/osv/commit/7d7b6d0f1261b87b678c572068e39d482e2103e4).
>> Finally, you may find the comments on this issue relevant -
>> https://github.com/cloudius-systems/osv/issues/1261#issuecomment-1722549524.
>> I am also sure you have come across this wiki page -
>> https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py
>> .
>>
>>>     Now, after my profiling, I found the global tlb_flush_mutex to be
>>> hot in my benchmark, so I am trying to remove it, but it turns out to be
>>> a bit hard without understanding the thread model of OSv. So I would like
>>> to ask whether there is any high-level doc that describes what the
>>> scheduling policy of OSv is, how thread priorities are decided, whether
>>> we can disable preemption or not (the functionality of preempt_lock) and
>>> the design of synchronisation primitives (for example, why it is not
>>> allowed to have preemption disabled inside lockfree::mutex). I am trying
>>> to understand by reading the code directly, but it would be really
>>> helpful if there were some material which describes the design.
>>
>>
> There are a lot of questions here, and I'm not even sure answering them
> will explain specifically why tlb_flush_mutex is highly contended in your
> workload.
>
> Waldek suggested that you read the OSv paper from Usenix, which is a good
> start for understanding the overall OSv architecture.
> The scheduling policy and priority (how to decide which thread should run
> next) is described in more detail in this document:
> https://docs.google.com/document/d/1W7KCxOxP-1Fy5EyF2lbJGE2WuKmu5v0suYqoHas1jRM/edit
>
> If you have specific questions, post them here and I'll try to answer. But
> only a few at a time :-) You had a lot of questions above and I can't
> answer them all in one mail :-)
>
>
>
