On Nov 15, 2011, at 23:40, Paul Mackerras wrote: > On Tue, Nov 15, 2011 at 04:45:18PM -0600, Moffett, Kyle D wrote: >> >> I guess that's doable, although I have to admit that idea almost gives >> me more of a headache than trying to fix up the 32-bit ASM. >> >> One thing that bothers me in particular is that both 32/64 versions of >> __copy_tofrom_user() are dramatically overcomplicated for what they >> ought to be doing. >> >> It would seem that if we get a page fault during an unaligned copy, we >> ought to just give up and fall back to a simple byte-by-byte copy loop >> from wherever we left off. That would eliminate 90% of the ugly >> special cases without actually hurting performance, right? > > That's basically what we do, IIRC, and most of the complexity comes > from working out where we were up to. We could probably use a simpler > approximation that means we might copy some bytes twice. In fact the > greatest simplification would probably be to implement range entries > in the exception table so we can just have one entry for all the loads > and stores instead of an entry for each individual load and store.
Well, I spent some time tinkering with the GCC inline-assembly option, which was probably a waste, but I figured I would post my code here for other people to chuckle at. :-D Here's a basic, relatively easily extended "copy u8" macro that sets up the exception table using "asm goto": #define try_copy_u8(DST, SRC, LOAD_FAULT, STORE_FAULT) do { \ unsigned long try_copy_tmp__ = (try_copy_tmp__); \ asm goto ( \ "1: lbz %[tmp], %[src]\n" \ "2: stb %[tmp], %[dst]\n" \ " .pushsection __ex_table, \"a\"\n" \ " .align 2\n" \ " .long 1b, %l["#LOAD_FAULT"]\n" \ " .long 2b, %l["#STORE_FAULT"]\n" \ " .popsection\n" \ : /* No outputs allowed for "asm goto" */ \ : [dst] "m"(*(__user u8 *)(DST)), \ [src] "m"(*(const __user u8 *)(SRC)), \ [tmp] "r"(try_copy_tmp__) \ : "memory" \ : LOAD_FAULT, STORE_FAULT \ ); \ } while(0) If I put that into a function and compile it, the assembly and the exception table look perfectly OK, even under register pressure. With a few macros like that it looks like it should be possible to write the copy function directly in C and get optimal results. The only other variants you need would be "try_copy_ulong" and "try_copy_4ulong"/"try_copy_8ulong" for 32/64-bit. Unfortunately, as I mentioned before, GCC 4.4 and older don't have "asm goto" support :-(. Perhaps I could put __copy_tofrom_user() into its own file and make the assembled 32/64 output files be ".shipped"? On the other hand, perhaps this is overly complicated :-D. I'll poke at it more tomorrow. >> For a page-fault during a cacheline-aligned copy, we should be able to >> handle the exception and retry from the last cacheline without much >> logic, again with good performance. >> >> With that said, I'm curious about the origin of the PPC32 ASM. In >> particular, it looks like it was generated by GCC at some point in the >> distant past, and I'm wondering if there's a good way to rewrite that >> file in C and trick GCC into generating the relevant exception tables >> for it? > > Why do you think it was generated by gcc? I wrote the original > version, but I think it got extended and macro-ized by others. Ah, sorry, when I first looked at it the large collection of numeric labels and the very sparing comments made it look autogenerated. Although, given how much of a pain in the neck it is maybe you would rather people not think you wrote it at all. ;-) Cheers, Kyle Moffett -- Curious about my work on the Debian powerpcspe port? I'm keeping a blog here: http://pureperl.blogspot.com/ _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev