On Thursday 05 April 2007 02:25 Nathan Beyer wrote:
> On 4/4/07, Gregory Shimansky <[EMAIL PROTECTED]> wrote:
> > On Wednesday 04 April 2007 23:33 Rana Dasgupta wrote:
> > > On 4/4/07, Mikhail Fursov <[EMAIL PROTECTED]> wrote:
> > > > On 4/4/07, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
> > > > > 2007/4/4, Gregory Shimansky <[EMAIL PROTECTED]>:
> > > > > > > > I would like to see these modifications. I wonder what you've
> > > > > > > > done in
> > > > > >
> > > > > > port/src/thread/linux/apr_thread_ext.c and
> > > > > > vmcore/include/atomics.h. They contain mfence and sfence
> > > > > > instructions in inline assembly which have to be changed to
> > > > > > something else on P3.
> > >
> > > MemoryWriteBarrier() etc. should be no-ops on PIII. x86 is already
> > > strongly ordered for writes?
> >
> > What about MemoryReadWriteBarrier()? If you know, what kind of code
> > should be used for this in P3?
> >
> > > > > Can we produce separate binary build for P3 if it is not easy to
> > > > > replace mfence/sfence?
> > > >
> > > > Jitrino can use runtime detection of CPU features supported and emit
> > > > appropriate code.
> > > > Can we do the same with VM (check flag) to avoid multiple
> > > > distributions?
> > >
> > > Jitrino generates code late, the VM doesn't. So I am not sure how this
> > > would work unless we link all versions of the asm's and then decide
> > > which ones to call at runtime, which has a cost. My suggestion would
> > > be that if we want the x86-32 bits to be PIII compatible, we should
> > > only use PIII instructions (up to SSE) in all the static 32 bit
> > > binaries. The jit can choose to generate more advanced instruction
> > > sequences at runtime based on cpuid if the platform supports it.
> >
> > I am not sure what you mean by using only P3 compatible code in all
> > statically linked binaries. In this case what should be used for
> > MemoryWriteBarrier function so that it would work on P4 too?
>
> Yes, I believe what we want to say is code to the lowest common
> instruction set for static code in the VM, at least for each distinct
> instruction set (x86 32-bit, IPF, etc). For the x86 32-bit, the
> available instructions must be available in at least a P3.
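
For reference, the "link all versions and decide which one to call at
runtime" option mentioned above could look roughly like the sketch below.
It assumes GCC or Clang. All names in it (barrier_fn, barrier_mfence,
barrier_locked, cpu_has_sse2, select_barriers, memory_rw_barrier) are
hypothetical and are not the existing VM or APR functions. The P3-safe
variant relies on the fact that a locked read-modify-write instruction
acts as a full barrier on x86.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define SKETCH_X86 1
#include <cpuid.h>              /* GCC/Clang helper for the cpuid instruction */
#else
#define SKETCH_X86 0
#endif

/* Hypothetical type: one barrier implementation. */
typedef void (*barrier_fn)(void);

/* Full barrier using mfence; needs SSE2, i.e. P4 or later. */
static void barrier_mfence(void) {
#if SKETCH_X86
    __asm__ __volatile__("mfence" ::: "memory");
#else
    __sync_synchronize();       /* non-x86 fallback so the sketch still builds */
#endif
}

/* Full barrier that also runs on P3: a locked add of 0 to the stack top. */
static void barrier_locked(void) {
#if SKETCH_X86 && defined(__x86_64__)
    __asm__ __volatile__("lock; addl $0,0(%%rsp)" ::: "memory", "cc");
#elif SKETCH_X86
    __asm__ __volatile__("lock; addl $0,0(%%esp)" ::: "memory", "cc");
#else
    __sync_synchronize();
#endif
}

/* Returns 1 if cpuid leaf 1 reports SSE2 (EDX bit 26), else 0. */
static int cpu_has_sse2(void) {
#if SKETCH_X86
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return (int)((edx >> 26) & 1u);
#else
    return 0;
#endif
}

/* Barrier used by callers; defaults to the variant that works everywhere. */
barrier_fn memory_rw_barrier = barrier_locked;

/* Call once during initialization, before other threads are started. */
void select_barriers(void) {
    memory_rw_barrier = cpu_has_sse2() ? barrier_mfence : barrier_locked;
    assert(memory_rw_barrier != NULL);
}
```

Callers would run select_barriers() once at startup and then call
memory_rw_barrier() wherever a barrier is needed. The cost mentioned
above is visible here: every barrier becomes an indirect call that
cannot be inlined.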

I don't quite understand why code patching wouldn't help here. The
classlib hythr code already patches away lock prefixes from some
instructions on uniprocessors (see the files in
modules/portlib/src/main/native/thread/windows/windows.x86 and
modules/portlib/src/main/native/common/windows/lock386.c). The same
could be done to patch away mfence and sfence.

> > What about code patching upon initialization? The [m|s]fence instructions
> > could be replaced with NOPs or other code for P3 by the initialization
> > code. Of course this would prevent inlining of these functions so that
> > all places where code has to be patched are known, but it should be
> > faster than choosing the appropriate barrier function implementation at
> > runtime. Also care should be taken for [lib]hythr library which uses
> > apr_memory_rw_barrier function, it should be patched as well after
> > [lib]hythr is loaded, possibly by the library initialization code since
> > it is currently loaded before harmonyvm by the launcher as a hyport
> > dependency.
> >
> > --
> > Gregory

-- 
Gregory
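
A minimal sketch of the patching idea, under stated assumptions: it
works on a plain byte buffer rather than on live code, and the function
names (patch_fence, patch_sites) and the list of patch offsets are
hypothetical. Only the instruction encodings are fixed facts: mfence is
0F AE F0, sfence is 0F AE F8, and the one-byte NOP is 90. It patches
only known sites instead of scanning, since the same byte pattern could
also occur inside data or immediates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* x86 encodings of the two fences discussed in this thread. */
static const unsigned char MFENCE[3] = {0x0F, 0xAE, 0xF0};
static const unsigned char SFENCE[3] = {0x0F, 0xAE, 0xF8};

/*
 * Overwrite the fence at code[offset] with three one-byte NOPs.
 * Returns 1 if a fence was found and patched, 0 otherwise.
 * Patching real code would first require making the page writable
 * (for example with mprotect or VirtualProtect); omitted here.
 */
int patch_fence(unsigned char *code, size_t len, size_t offset) {
    if (offset > len || len - offset < 3)
        return 0;                               /* not enough room for a fence */
    if (memcmp(code + offset, MFENCE, 3) != 0 &&
        memcmp(code + offset, SFENCE, 3) != 0)
        return 0;                               /* site does not hold a fence */
    memset(code + offset, 0x90, 3);             /* 0x90 = NOP */
    return 1;
}

/* Patch every recorded site; returns how many were actually patched. */
size_t patch_sites(unsigned char *code, size_t len,
                   const size_t *sites, size_t nsites) {
    size_t patched = 0;
    size_t i;
    assert(code != NULL && sites != NULL);
    for (i = 0; i < nsites; i++)
        patched += (size_t)patch_fence(code, len, sites[i]);
    return patched;
}
```

Each barrier function would have to record its fence offset in such a
table, which is why the barriers could not be inlined. The sketch only
shows NOP replacement; whether a NOP is sufficient for a read-write
barrier on P3 is the open question in this thread.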
