On Tue, Nov 28, 2023 at 3:04 PM Yueyang Pan <[email protected]> wrote:
> Hi Nadav and Waldek,
> Thanks a lot for the very detailed answers from both of you. I have some
> updates on this.
> For the first question, I ended up implementing my own ad-hoc stat class
> where I can measure the total time (or total count) of a function and
> calculate the average. I am still struggling to make perf work. I got
> this error from perf when using perf kvm as shown here:
> https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#profiling
> *Couldn't record guest kernel [0]'s reference relocation symbol*.
> Have you ever encountered this problem when you were developing?
>

I have never seen it but I will try to dig a bit deeper once I have time.

>
> For the second question, I ended up removing the global tlb_flush_mutex
> and introduced a Linux-like design where you have percpu call_function_data
> which contains a percpu array of call_single_data. Each CPU has its own
> call_single_queue where the call_single_data is enqueued or dequeued. If
> you don’t mind, I can arrange the code a bit and send the patch. Then you
> can review it. I am not sure how the development process works for OSv and
> I would appreciate it very much if you could give me some guidance.
>

Feel free to create a PR on GitHub. Do you see a significant improvement
with your change to use percpu call_function_data? OSv has its own percpu
structures concept (see include/osv/percpu.hh), so I wonder if you can
leverage it.

I wonder how this Linux-like solution helps, given that the point of
mmu::flush_tlb_all() (where tlb_flush_mutex is used) is to coordinate the
TLB flush and make sure all CPUs perform it, so that the virtual/physical
mapping is in sync across all CPUs. How do you achieve this in your
solution? Is the potential speed improvement gained from avoiding IPIs,
which are known to be slow?

> For the scheduling part, I am reading the paper now and the doc. Thanks
> for the resources. I need some time to digest them because I found that
> preempt_lock matters a lot for the performance of my code.
>
> Best Wishes
> Pan
>
>
> On 28 Nov 2023, at 08:29, Nadav Har'El <[email protected]> wrote:
>
> On Tue, Nov 28, 2023 at 8:20 AM Waldek Kozaczuk <[email protected]>
> wrote:
>
>> Hi,
>>
>> It is great to hear from you. Please see my answers below.
>>
>> I hope you also do not mind that I reply to the group so others may add
>> something extra or refine/correct my answers, as I am not an original
>> developer/designer of OSv.
>>
>> On Fri, Nov 24, 2023 at 8:50 AM Yueyang Pan <[email protected]> wrote:
>>
>>> Dear Waldemar Kozaczuk,
>>> I am Yueyang Pan from EPFL. Currently I am working on a project
>>> about remote memory and trying to develop a prototype based on OSv. I am
>>> the guy who raised the questions on the google group several days ago as
>>> well. For that question, I made a workaround by adding my own stats class
>>> which records the sum and count, because all I need is the average. Now I
>>> have some further questions. Probably they are a bit dumb for you but I
>>> will be very grateful if you could spend a little bit of time to give me
>>> some suggestions.
>>>
>>
>> The tracepoints use ring buffers of fixed size so eventually, all old
>> tracepoints would be overwritten by new ones. I think you can either
>> increase the size or use the approach used by the script *freq.py*
>>
>
> Exactly. OSv's tracepoints have two modes. One is indeed to save them in a
> ring buffer - so you'll see the last N traced events when you read that
> buffer - but the other is a mode that just counts the events. What freq.py
> does is to retrieve the count at one second, then retrieve the count the
> next second - and the difference is the average number of this event per
> second.
>
> If you want, instead of counting the event, to have a sum of, say, integers
> that come from the event (e.g., sum of packet lengths), we don't have
> support for this at the moment - we only increment the count by 1. It could
> be added as a feature, I guess. But you can always do something ad-hoc like
> maintaining a global variable which you add to.
>
>
>> (you need to add the module *httpserver-monitoring-api*). There is also
>> newly added (experimental though) strace-like functionality (see
>> https://github.com/cloudius-systems/osv/commit/7d7b6d0f1261b87b678c572068e39d482e2103e4).
>> Finally, you may find the comments on this issue relevant -
>> https://github.com/cloudius-systems/osv/issues/1261#issuecomment-1722549524.
>> I am also sure you have come across this wiki page -
>> https://github.com/cloudius-systems/osv/wiki/Trace-analysis-using-trace.py.
>>
>>> Now after my profiling, I found the global tlb_flush_mutex to be hot
>>> in my benchmark, so I am trying to remove it, but it turns out to be a
>>> bit hard without understanding the thread model of OSv. So I would like to
>>> ask whether there is any high-level doc that describes what the scheduling
>>> policy of OSv is, how the priorities of the threads are decided, whether
>>> we can disable preemption or not (the functionality of preempt_lock) and
>>> the design of the synchronisation primitives (for example, why it is not
>>> allowed to have preemption disabled inside lockfree::mutex). I am trying
>>> to understand by reading the code directly but it would be really helpful
>>> if there is some material which describes the design.
>>
>
> There are a lot of questions here, and I'm not even sure answering them
> will explain specifically why tlb_flush_mutex is highly contested in your
> workload.
>
> Waldek suggested that you read the OSv paper from Usenix, which is a good
> start for understanding the overall OSv architecture.
> The scheduling policy and priority (how to decide which thread should run
> next) is described in more detail in this document:
> https://docs.google.com/document/d/1W7KCxOxP-1Fy5EyF2lbJGE2WuKmu5v0suYqoHas1jRM/edit
>
> If you have specific questions, post them here and I'll try to answer.
> But only a few at a time :-) You had a lot of questions above and I can't
> answer them all in one mail :-)
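
For illustration, the two measurement approaches discussed in this thread can be sketched in a few lines of Python: keeping a running sum and count to obtain an average (the ad-hoc stats workaround), and sampling an event counter twice and subtracting to obtain a rate (the idea attributed to freq.py above). This is only a sketch under stated assumptions: `Stat`, `timed`, and `events_per_second` are hypothetical names that do not come from OSv or from freq.py, and the real workaround would live in C++ inside the kernel rather than in Python.

```python
import time
from dataclasses import dataclass


@dataclass
class Stat:
    """Hypothetical ad-hoc accumulator: a running sum plus a sample count."""
    total: float = 0.0
    count: int = 0

    def add(self, value: float) -> None:
        # Accumulate an arbitrary value (a duration, a packet length, ...).
        self.total += value
        self.count += 1

    def average(self) -> float:
        # Return 0.0 rather than dividing by zero when nothing was recorded.
        return self.total / self.count if self.count else 0.0


def timed(stat: Stat, func, *args, **kwargs):
    """Call func and add its wall-clock duration in seconds to stat."""
    start = time.perf_counter()
    try:
        return func(*args, **kwargs)
    finally:
        stat.add(time.perf_counter() - start)


def events_per_second(count_before: int, count_after: int,
                      interval_s: float) -> float:
    """Rate derived from two samples of a monotonically increasing counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return (count_after - count_before) / interval_s


# Sum-and-count: average of some example packet lengths.
packet_lengths = Stat()
for length in (64, 128, 1500):
    packet_lengths.add(length)
print(packet_lengths.average())             # 564.0

# Counter sampled one second apart: 250 events occurred in that second.
print(events_per_second(1000, 1250, 1.0))   # 250.0
```

The accumulator covers the "sum of integers that come from the event" case that the counting tracepoints do not support, at the cost of being maintained by hand. In a multi-CPU kernel, a shared accumulator like this would also need atomic or per-CPU updates, which this sketch does not attempt to model.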
