On Sat, Dec 7, 2019 at 6:07 AM Waldek Kozaczuk <jwkozac...@gmail.com> wrote:
> On second thought, let me tweak my suggestions and add some more. > > On Friday, December 6, 2019 at 5:05:03 PM UTC-5, Waldek Kozaczuk wrote: >> >> >> >> On Friday, December 6, 2019 at 4:06:07 PM UTC-5, Matthew Pabst wrote: >>> >>> Ah, I think I understand things better now. >>> >>> Here is my current idea for a solution: >>> >>> Everytime we disable interrupts and preemption (in irq_disable() and >>> preempt_disable()), we call a method like ensure_stack_mapped(), which will >>> do the following: >>> >> >>> - If the current stack pointer is in a region mapped as >>> mmu::mmap_stack (i.e. an application stack), then we read the next few >>> stack pages. >>> - Otherwise the stack pointer is in a kernel stack, which is already >>> mapped so reading ahead may cause a segmentation fault. >>> >>> I do not think you need to differentiate between "kernel" vs "app" >> stack. In this context (forget fault handling kernel code that uses >> separate stack) both kernel and application code would run on the same >> application stack. Unless you meant simply checking if the reading byte on >> the stack 1 or 2 pages ahead still fits within the stack mapped area. Which >> is indeed what we have to do (application stack is almost full it is full >> and we cannot do anything about). But I would start without making that >> check. >> > I think eventually we want to differentiate between "kernel" and > "application" threads and only ensure stack for application threads. > Why are you saying this? The only reason I can think to do that is code which already has a tiny stack (like 16KB) and we don't want to waste the last page of the stack. Is this the reason you were thinking of? >> I also do not think you need ensure_stack_mapped(), I think we could >> directly change preempt_disable() and irq_disable() and prepend it with >> simple inline assembly and C like that: >> >> cmpb $0, -4096(%rsp) >> >> Given that the logic will be more than single assembly instruction, I > take mu suggestion back and agree with your ensure_stack_mapped() function > (or ensure_stack_allocated()). However, I still think it would be easier to > call it from preempt_disable() and irq_disable() instead of calling in all > places where these two are called. > I agree about the latter, but we need to be careful not to make this a complex function because preempt_disable() is supposed to be every efficient. I really hoped this function won't even have an if(). The hope was to have something very simple in the common case (where enough stack is available) but a more complex logic in the case of page fault. > Also instead of "cmp" instruction which is not side-effect-free, we could > actually write to instead of reading (I do not think it makes much > difference): > Why not "mov" for reading? Why writing? > mov $0,-4096(%rsp) > mov $0,-8192(%rsp) //For second page, etc > > > This effectively would read single-byte 4096 bytes deep in the stack and >> automatically trigger page fault right at this moment IF -4096(%rsp)has >> not been allocated yet. For starters, it might be as simple as that. >> >> I am not sure if cmp is the best instruction to use here or maybe there >> is a better assembly instruction for that. >> >> Finally to see the effect of this logic you will need to revert this > commit - > https://github.com/cloudius-systems/osv/commit/41efdc1cb8b19216d98fc0baf38e5312fd3c27d9 > . > > Testing these changes will be fun :-) I would start with unit tests and > especially use tst-pipe to run it 100 or more times to verify things do not > break. Then it would nice to measure the improvements of memory > utilizations - please see this email thread - > https://groups.google.com/d/msg/osv-dev/LqUAWjzVpSc/sF-G4dYYBgAJ. > Hopefully, with your changes, my Java example with 400 threads will need > much less memory than 600MB. > > Good luck! > >> >> Let me know what you think, >>> Matthew >>> >>> On Thursday, December 5, 2019 at 8:05:15 AM UTC-6, Nadav Har'El wrote: >>>> >>>> On Tue, Dec 3, 2019 at 5:56 PM Matthew Pabst <pabstm...@gmail.com> >>>> wrote: >>>> >>>>> Thanks for the response. >>>>> >>>>> I've been trying to think through the logic for this change more, and >>>>> I'm don't completely understand the proposed solution. >>>>> >>>>> For example, say the program faults on a stack page. Then before >>>>> disabling interrupts we access the next stack page(s). If one of those >>>>> pages is not yet allocated, it will cause a second page fault. >>>>> >>>> >>>> No, that is not a problem. The page-fault handling code already knows >>>> it can be used without a working stack - so we have separate interrupt and >>>> exception stacks which we switch to while handling those, and they work >>>> fine even if the user's stack is empty. >>>> >>>> Rather, the problem is what are in Linux known as "system calls", and >>>> in OSv are ordinary function calls. >>>> Imagine you do call read() or sleep() or whatever in your application. >>>> Some of the code implementing these >>>> functions, which considers itself "kernel" code, disables interrupts or >>>> preemption. But it also uses the stack, >>>> and if it suddenly needs a new page of the stack, the resulting page >>>> fault will refuse to run. >>>> >>>> >>>>> This second page fault will invoke the page fault handler, which will >>>>> attempt to run some code, which may access the first stack page and run >>>>> into deadlock. >>>>> >>>>> Is this a possible problem, or am I misunderstanding how page faults >>>>> are handled in OSv? >>>>> >>>>> If this is a problem, could we switch to a pre-populated per-CPU >>>>> exception stack before calling the page fault handler? >>>>> >>>> >>>> We do exactly that already - see exception_stack and interrupt_stack in >>>> arch/x64/arch-cpu.hh >>>> >>>> >>>>> -- >>>>> You received this message because you are subscribed to the Google >>>>> Groups "OSv Development" group. >>>>> To unsubscribe from this group and stop receiving emails from it, send >>>>> an email to osv...@googlegroups.com. >>>>> To view this discussion on the web visit >>>>> https://groups.google.com/d/msgid/osv-dev/92857c9c-3329-4743-bb88-a74053db613e%40googlegroups.com >>>>> <https://groups.google.com/d/msgid/osv-dev/92857c9c-3329-4743-bb88-a74053db613e%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>> . >>>>> >>>> -- > You received this message because you are subscribed to the Google Groups > "OSv Development" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to osv-dev+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/osv-dev/e061cf71-013f-4a28-8f0c-0dafc1ad89cc%40googlegroups.com > <https://groups.google.com/d/msgid/osv-dev/e061cf71-013f-4a28-8f0c-0dafc1ad89cc%40googlegroups.com?utm_medium=email&utm_source=footer> > . > -- You received this message because you are subscribed to the Google Groups "OSv Development" group. To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/CANEVyjvkvoZVbTuMY7HgFXx8zSRK%3DQd950UFu1Z7WHo%3D1p%3DnJA%40mail.gmail.com.