:I have tried to understand the following code in vm_map_lookup() without
:much success:
:
:    if (fault_type & VM_PROT_OVERRIDE_WRITE)
:        prot = entry->max_protection;
:    else
:        prot = entry->protection;
:    ........
:
:    if (entry->wired_count && (fault_type & VM_PROT_WRITE) &&
:        (entry->eflags & MAP_ENTRY_COW) &&
:        (fault_typea & VM_PROT_OVERRIDE_WRITE) == 0) {
:        RETURN(KERN_PROTECTION_FAILURE);
:    }
:
:At first, it seems to me that if you want to write a COW page, you must
:have OVERRIDE_WRITE set.

    The VM_PROT_OVERRIDE_WRITE flag is only used for user-wired pages, so
    it does not affect 'normal' page handling.

    Look carefully at the vm_fault() code (vm/vm_fault.c line 212): the
    lookup only occurs with VM_PROT_OVERRIDE_WRITE set if the normal
    lookup fails and the user has wired the page.

    So if a normal lookup fails and this is a user-wired page, we try
    the lookup again with VM_PROT_OVERRIDE_WRITE, presumably to handle a
    faked copy-on-write fault for the debugger. This results in the
    following:

    First, we temporarily increase the protections to make the page
    *appear* writeable. Note: only 'appear' writeable, not actually be
    writeable.

        if (fault_type & VM_PROT_OVERRIDE_WRITE)
            prot = entry->max_protection;
        else
            prot = entry->protection;

    Next we strip off everything except the fault bits that we care
    about. Note that we have already adjusted 'prot' based on the
    VM_PROT_OVERRIDE_WRITE flag, so 'prot' is probably writeable. We
    will thus fall through this conditional:

        fault_type &= (VM_PROT_READ|VM_PROT_WRITE|VM_PROT_EXECUTE);
        if ((fault_type & prot) != fault_type) {
            RETURN(KERN_PROTECTION_FAILURE);
        }

    If this is part of a user wire and we have a write fault and the
    page is copy-on-write, *AND* VM_PROT_OVERRIDE_WRITE was not set, we
    return a failure. This is, in fact, the failure that is returned
    when the vm_fault code initially attempts to do the lookup, before
    vm_fault falls through and makes a second attempt with
    VM_PROT_OVERRIDE_WRITE.

        if (entry->wired_count && (fault_type & VM_PROT_WRITE) &&
            (entry->eflags & MAP_ENTRY_COW) &&
            (fault_typea & VM_PROT_OVERRIDE_WRITE) == 0) {
            RETURN(KERN_PROTECTION_FAILURE);
        }

    Now that we've gotten past this code, we revert the protection bits
    if the page is user-wired, since it was the user wire (indirectly,
    anyway) that caused the protections to be increased in the first
    place. We lose the entry->max_protection and revert back to
    entry->protection. Essentially, we make the page (probably)
    read-only again.

        *wired = (entry->wired_count != 0);
        if (*wired)
            prot = fault_type = entry->protection;

    ... but we've already gotten past the conditionals that can cause a
    failure to be returned, so the code that follows will *still* do the
    copy-on-write for the debugger.

:But later I find that when wired_count is non zero, we are actually
:simulating a page fault, not a real one.
:Anyway, I do not know how the above code (1) prevents a debugger from
:writing a binary code, (2) forces a COW when a debugger writes other
:data.
:
:I also have some questions on wiring a page:
:
:(1) According to the man pages of mlock(2), a wired page can still
:cause protection-violation faults.
:But in the same vm_map_lookup(), we have the following code:
:
:    if (*wired)
:        prot = fault_type = entry->protection;
:
:and the comment says "get it for all possible accesses". As I understand
:it, we wire a page by simulating a page fault (no matter whether it is
:kernel or user who is wiring a page).

    I'm pretty sure this piece is simply reverting the mess that the
    copy-on-write stuff makes for the debugger. entry->protection is
    what we normally want to use.

    The debugger copy-on-write junk is there so the debugger can modify
    a program's TEXT area while the program itself *cannot* modify its
    own TEXT area. It's a big mess and I don't fully understand how the
    structures are faked up to handle the case.

:(2) Can the kernel wire a page of a user process without that user's
:request (by calling mlock)?
:
:Any help is appreciated.

    Yes. The kernel can wire a page. It usually busies the page for the
    duration, however, so vm_fault will block on the page and then retry
    without actually noticing that the page has been wired.

    I'm probably not entirely correct here; John may be able to say more
    about it.

                                        -Matt
                                        Matthew Dillon
                                        <dil...@backplane.com>
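
For reference, the sequence of checks walked through above can be condensed into a small userland model. This is only a sketch, not the kernel code: struct model_entry, model_lookup(), model_fault() and the user_wired argument are hypothetical names invented for illustration, and the numeric values of the constants are placeholders rather than the real header definitions. Only the order of the checks follows the snippets quoted in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only; the real definitions live in the kernel headers. */
#define VM_PROT_READ            0x01
#define VM_PROT_WRITE           0x02
#define VM_PROT_EXECUTE         0x04
#define VM_PROT_OVERRIDE_WRITE  0x08
#define MAP_ENTRY_COW           0x04
#define KERN_SUCCESS            0
#define KERN_PROTECTION_FAILURE 2

/* Hypothetical stand-in for the few map entry fields used here. */
struct model_entry {
    int protection;
    int max_protection;
    int eflags;
    int wired_count;
};

/* Models only the protection logic quoted from vm_map_lookup(). */
int
model_lookup(const struct model_entry *entry, int fault_typea,
    int *out_prot, int *wired)
{
    int fault_type = fault_typea;
    int prot;

    assert(entry != NULL && out_prot != NULL && wired != NULL);

    /* Step 1: with the override flag, check against max_protection. */
    if (fault_type & VM_PROT_OVERRIDE_WRITE)
        prot = entry->max_protection;
    else
        prot = entry->protection;

    /* Step 2: keep only the access bits and check them against prot. */
    fault_type &= (VM_PROT_READ | VM_PROT_WRITE | VM_PROT_EXECUTE);
    if ((fault_type & prot) != fault_type)
        return (KERN_PROTECTION_FAILURE);

    /* Step 3: a write to a wired COW entry fails without the override. */
    if (entry->wired_count && (fault_type & VM_PROT_WRITE) &&
        (entry->eflags & MAP_ENTRY_COW) &&
        (fault_typea & VM_PROT_OVERRIDE_WRITE) == 0)
        return (KERN_PROTECTION_FAILURE);

    /* Step 4: for wired entries, revert to the entry's own protection. */
    *wired = (entry->wired_count != 0);
    if (*wired)
        prot = fault_type = entry->protection;

    *out_prot = prot;
    return (KERN_SUCCESS);
}

/*
 * Hypothetical model of the retry described for vm_fault(): a second
 * lookup with the override flag, attempted only when the first lookup
 * fails and the page is user-wired (passed in here as a plain flag).
 */
int
model_fault(const struct model_entry *entry, int fault_type, int user_wired,
    int *out_prot, int *wired)
{
    int result = model_lookup(entry, fault_type, out_prot, wired);

    if (result != KERN_SUCCESS && user_wired)
        result = model_lookup(entry, fault_type | VM_PROT_OVERRIDE_WRITE,
            out_prot, wired);
    return (result);
}
```

In this model, a wired copy-on-write entry with read/execute protection and full max_protection rejects a plain write lookup, accepts the same lookup once VM_PROT_OVERRIDE_WRITE is added, and still reports the entry's own read/execute protection afterwards. That matches the "appear writeable, then revert" behaviour described in the message.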