Re: 4byte aligned com(4) and PCI_MAPREG_TYPE_MEM
Hi.

(2014/02/15 1:58), Izumi Tsutsui wrote:
> I'd suggest to clarify what problems you are trying to solve and consider how it should be solved, before updating your patch.
>
> The problems you mentioned are:
> (1) merge initialization of sparse register mappings (with 4 byte stride)

Right.

> (2) defer consinit() for puc com to use uvm_km_alloc() in it

Right, though the real goal is not to defer consinit() but to support memory mapped I/O in com(4). And:

(3) Almost all drivers don't clear struct com_regs regs. It's not a real bug, but it should be cleared to allow for further changes.

> Your patch is trying to solve them by:
> (1) change COM_INIT_REGS() (and comprobe1()) APIs to pass stride byte
> (2) add an MI extent_initted global variable and check it in MD consinit()
>
> My vote is:
> (1) leave existing APIs and handle the quirk in MD attachment
> (2) add an x86 specific MD variable to defer consinit() till cpu_startup()

I think a static int initted variable is used to do that, but there is no way to know whether extent_init() (or uvm_init()) has been called or not.

> Because:
> (1)
> - it's really pain to change the MI APIs (so many attachments and most of them will rarely be tested unfortunately)
> - only three or four attachments can share the new API while such embedded devices often might have more other quirks

Before writing that patch, I wrote another patch which didn't change the arguments of COM_INIT_REGS(). We have two choices:

a) Use COM_INIT_REGS() for all drivers.
b) Copy COM_INIT_REGS() and modify the copy to support 4 byte stride.

I couldn't decide which one was better. Now I think that leaving the arguments unchanged and making a new macro is better than adding a new argument, because 1 byte stride devices don't need to consider the byte order.

> - even if stride handling is really necessary in MI part, it's much better to prepare new wrap functions, like wdcprobe1() and wdcprobe() in wdc.c (i.e. prepare a new COM_INIT_REGS_STRIDE() macro with a new arg and make existing COM_INIT_REGS() macro use it)
>
> (2)
> - it's unclear what functions actually require the extent_init() (I guess uvm_init() is enough to call uvm_km_alloc())

Me neither... I had thought that uvm_init() was enough, but someone advised me that extent_init() should be called.

> - in general MI code assumes that console devices are properly mapped by MD bootstrap or its firmware

Yes. But it's a little hard to make such a mapping as a PCI bus_space_tag and handle it in the early boot stage.

> - some ports already have MD flags to switch malloc(9) or static memory in early device mappings and initialization

It can be used, but IMHO, checking whether bus_space_map() can be called is more generic than that.

> Just my two cents,
> ---
> Izumi Tsutsui

Please give me a little time to write another patch.

--
---
SAITOH Masanobu (msai...@execsw.org msai...@netbsd.org)
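
The wrapper idea discussed above (a new COM_INIT_REGS_STRIDE() macro with the existing COM_INIT_REGS() built on top of it) can be sketched roughly as follows. This is only a simplified, self-contained model and not the com(4) code: the structure layout, the SK_ names and the register count are assumptions made for illustration, and the real macro also deals with bus_space tags and handles. The sketch also clears the structure first, which is point (3) in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_NREGS 8	/* hypothetical: register count of a 16550-style UART */

/* Hypothetical stand-in for struct com_regs; the real layout differs. */
struct sk_com_regs {
	uintptr_t base;			/* stand-in for the bus address */
	uint32_t  map[SK_NREGS];	/* byte offset of each register */
};

/* New macro: clear the structure, then fill the map using the stride. */
#define SK_COM_INIT_REGS_STRIDE(regs, addr, stride) do {		\
	memset(&(regs), 0, sizeof(regs));				\
	(regs).base = (addr);						\
	for (unsigned sk_i = 0; sk_i < SK_NREGS; sk_i++)		\
		(regs).map[sk_i] = sk_i * (stride);			\
} while (0)

/* Existing macro keeps its argument list and becomes a thin wrapper. */
#define SK_COM_INIT_REGS(regs, addr)					\
	SK_COM_INIT_REGS_STRIDE(regs, addr, 1)

/* Small self-check: register 5 of a 4 byte stride device is at offset 20. */
static void
sk_demo(void)
{
	struct sk_com_regs regs;

	SK_COM_INIT_REGS_STRIDE(regs, 0x1000, 4);
	assert(regs.map[5] == 20);
}
```

With this shape, attachments with 1 byte stride keep calling the old macro unchanged, and only the few 4 byte stride attachments need to use the new one.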
pcb offset into uarea
I'm adding code to i386 and amd64 to save the ymm registers on process switch - allowing userspace to use the AVX instructions. I also don't want to have to do it all again when the next set of extensions appears.

This means that the size of the FPU save area (currently embedded in the pcb) can't be determined until runtime.

Plan A is to move the FPU save area to the end of the pcb, and then locate the pcb at the correct offset in the uarea so that the written region ends at the end of the page.

The problem with this is that the offset of the pcb in the uarea is set by MI code based on some #defines - and there seem to be several related values.

Now on x86 (like most systems) the cpu stack advances into low memory. The pcb is placed at the end of the uarea with the initial stack pointer just below it. I suspect that a long time ago (when the uarea had a fixed KVA) an additional memory page was placed below the uarea to give interrupts more stack space. I don't think this happens any more.

As an aside: The uarea used to be pageable, whereas (what is now) the lwp structure isn't. Paging of uareas was disabled a few years back - so there is no real difference between the lifetimes of an lwp and a uarea. (zombies probably lose the uarea before the lwp).

An alternative would be to place the FP save area at the start of the uarea. This would mean that, on stack overflow, the FP save area would be trashed before some random piece of memory.

It might even be worth putting the pcb at the start of the uarea - so that stack overflow crashes out the failing process, and probably earlier than the random corruption would.

This gives me three options:
A) Put the save area at the end of the pcb and dynamically adjust the pcb offset.
B) Put the save area at the start of the uarea, with the pcb at a fixed offset at the end of the uarea.
C) Put the save area at the end of the pcb, and put the pcb at the start of the uarea.

Votes? What have I missed?

David

--
David Laight: da...@l8s.co.uk
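
For option A, the offset arithmetic could look roughly like the sketch below. It is not taken from the NetBSD sources: SK_USPACE, SK_FPU_ALIGN and sk_pcb_offset() are hypothetical names, the uarea size is an assumed value, and the pcb is modelled as a fixed-size part followed directly by the save area. The 64-byte alignment is what the XSAVE instruction requires of its save area.

```c
#include <assert.h>
#include <stddef.h>

#define SK_USPACE	(2 * 4096)	/* hypothetical uarea size in bytes */
#define SK_FPU_ALIGN	64		/* XSAVE needs a 64-byte aligned area */

/*
 * Option A: the fixed part of the pcb is followed by the save area, and
 * the save area is pushed up against the end of the uarea.  Returns the
 * offset of the pcb within the uarea; the initial stack pointer would
 * sit just below that offset.
 */
static size_t
sk_pcb_offset(size_t pcb_fixed_size, size_t fpu_save_size)
{
	size_t save_off;

	assert(pcb_fixed_size + fpu_save_size + SK_FPU_ALIGN <= SK_USPACE);

	/* Start of the save area, rounded down to the required alignment. */
	save_off = (SK_USPACE - fpu_save_size) & ~(size_t)(SK_FPU_ALIGN - 1);

	return save_off - pcb_fixed_size;
}
```

For example, with an assumed 128-byte fixed part and an 832-byte save area (512 bytes of legacy state, a 64-byte header and 256 bytes for the upper halves of the ymm registers), the save area starts at offset 7360 and the pcb at offset 7232. The value would have to be computed once at boot, which is exactly what conflicts with an MI offset based on #defines.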
Re: pcb offset into uarea
On Feb 16, 2014, at 1:41 PM, David Laight da...@l8s.co.uk wrote:

> I'm adding code to i386 and amd64 to save the ymm registers on process switch - allowing userspace to use the AVX instructions. I also don't want to have to do it all again when the next set of extensions appears.
>
> This means that the size of the FPU save area (currently embedded in the pcb) can't be determined until runtime.
>
> Plan A is to move the FPU save area to the end of the pcb, and then locate the pcb at the correct offset in the uarea so that the written region ends at the end of the page.
>
> The problem with this is that the offset of the pcb in the uarea is set by MI code based on some #defines - and there seem to be several related values.
>
> Now on x86 (like most systems) the cpu stack advances into low memory. The pcb is placed at the end of the uarea with the initial stack pointer just below it. I suspect that a long time ago (when the uarea had a fixed KVA) an additional memory page was placed below the uarea to give interrupts more stack space. I don't think this happens any more.
>
> As an aside: The uarea used to be pageable, whereas (what is now) the lwp structure isn't. Paging of uareas was disabled a few years back - so there is no real difference between the lifetimes of an lwp and a uarea. (zombies probably lose the uarea before the lwp).
>
> An alternative would be to place the FP save area at the start of the uarea. This would mean that, on stack overflow, the FP save area would be trashed before some random piece of memory.
>
> It might even be worth putting the pcb at the start of the uarea - so that stack overflow crashes out the failing process, and probably earlier than the random corruption would.

For most ports, the pcb is at the start of the uarea.

> This gives me three options:
> A) Put the save area at the end of the pcb and dynamically adjust the pcb offset.
> B) Put the save area at the start of the uarea, with the pcb at a fixed offset at the end of the uarea.
> C) Put the save area at the end of the pcb, and put the pcb at the start of the uarea.
>
> Votes? What have I missed?

Keep a default mmx/sse save area in the pcb along with a pointer to it. If a variant is used that needs a larger save area, dynamically allocate it and save it in the pcb pointer. Since it's unlikely that most processes will use AVX, why waste the space?
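
The pointer-plus-default-area suggestion could be modelled as in the sketch below. This is a userland illustration under stated assumptions, not kernel code: struct sk_pcb and the sk_ functions are hypothetical, aligned_alloc() stands in for whatever kernel allocator would really be used, and the point at which the larger area gets allocated (for instance on first use of AVX) is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SK_FXSAVE_SIZE	512	/* size of the legacy FXSAVE image */

/* Hypothetical pcb: embedded default area plus a pointer to the live one. */
struct sk_pcb {
	unsigned char *fpu_save;	/* fpu_default or an allocated area */
	size_t fpu_size;		/* usable size of *fpu_save */
	_Alignas(16) unsigned char fpu_default[SK_FXSAVE_SIZE];
};

static void
sk_pcb_init(struct sk_pcb *pcb)
{
	memset(pcb, 0, sizeof(*pcb));
	pcb->fpu_save = pcb->fpu_default;
	pcb->fpu_size = sizeof(pcb->fpu_default);
}

/* Switch to a larger, 64-byte aligned area; returns 0 on success. */
static int
sk_pcb_grow_fpu(struct sk_pcb *pcb, size_t new_size)
{
	unsigned char *p;
	size_t alloc_size;

	assert(pcb->fpu_save != NULL);
	if (new_size <= pcb->fpu_size)
		return 0;

	/* aligned_alloc() wants a size that is a multiple of the alignment. */
	alloc_size = (new_size + 63) & ~(size_t)63;
	p = aligned_alloc(64, alloc_size);
	if (p == NULL)
		return -1;

	/* Keep the state saved so far; the rest starts out zeroed. */
	memset(p, 0, alloc_size);
	memcpy(p, pcb->fpu_save, pcb->fpu_size);
	if (pcb->fpu_save != pcb->fpu_default)
		free(pcb->fpu_save);
	pcb->fpu_save = p;
	pcb->fpu_size = new_size;
	return 0;
}

/* Drop any allocated area and fall back to the embedded default. */
static void
sk_pcb_release(struct sk_pcb *pcb)
{
	if (pcb->fpu_save != pcb->fpu_default)
		free(pcb->fpu_save);
	pcb->fpu_save = pcb->fpu_default;
	pcb->fpu_size = sizeof(pcb->fpu_default);
}
```

The trade-off is an extra pointer dereference on every save and restore, and an allocation that may fail at the moment a process first needs the larger area, in exchange for a pcb whose size and offset stay fixed.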