2015-04-17 13:35 GMT-07:00 Eliot Miranda <eliot.mira...@gmail.com>:

>
> On Fri, Apr 17, 2015 at 3:41 AM, stepharo <steph...@free.fr> wrote:
>
>>
>> ... at http://www.mirandabanda.org/files/Cog/VM/VM.r3311/.
>>
>> These should fix the regression introduced by the map changes in 3308.
>> They certainly fix the two crashes I've looked at, one an update of a
>> squeak trunk image and the other the startup of recent Newspeak images.
>> Apologies for the inconvenience.
>>
>>
>> Well, this is embarrassing as usual but I'm still seeing crashes in the
>> image update. So I'll have to look deeper. At least the Newspeak fix was
>> real, but it didn't fix everything.
>>
>>
>> :)
>> Hi eliot
>> How can we help?
>> Would it make sense to have some more ci jobs for testing the VM?
>>
>
> Yes, that always helps. But we have to accept that that means
> a) these CI jobs will detect failures caused by regressions etc
> b) we need to mark builds as "bleeding edge" "last known good" etc, and
> set these pointers
>
> b) I have yet to set up.
>
> But yesterday and today a wonderful thing has happened. Clément has
> understood the core optimization and code generation structures in the JIT
> as well as I do and is now both improving the code quality and implementing
> more aggressive optimizations. This is /so/ satisfying. You know how good
> Clément is. His input is so strong. I am /very/ happy. For example,
> Clément is currently modifying #== in the Spur JIT so that the code checks
> for forwarders only if #== is false, and that if #== is false and either
> object is a forwarder, it/them is/are followed and the #== retried. This
> should speed up #== by 50%.
>

I think you made a typo. * This should speed up #== in the case where the
answer was true, which theoretically is in 50% of the cases, but in
practice we've not profiled it.

> It won't make much difference at the macro level but this is a non-trivial
> optimization to write and so now we have two people who really understand
> the core optimizing JIT and I can now happily run in front of a bus.
>

It will make a difference on some benchmarks where we saw the overhead of
#== due to forwarding pointers checks. And Ryan has a really good
understanding too. I really enjoy the abstraction he made over the send
logic. And Tim has the ARM backend almost working too (it's a week of work
to have it working as far as I understood).

>
>
>> Stef
>>
>>
>>
>> CogVM binaries as per VMMaker.oscog-eem.1204/r3311
>>
>> Cogits:
>>
>> Fix regression in map machinery due to adding AnnotationExtension scheme.
>> findMapLocationForMcpc:inMethod: must not be confused by IsDisplacementX2N
>> bytes. This is likely the cause of the recent crashes with r3308 and
>> earlier.
>>
>> Introduce marryFrameCopiesTemps and use it to
>> not copy temps in Spur context creation trampolines.
>>
>> Change initial usage counts to keep more recently jitted methods around
>> for longer, and do *not* throw away PICs in freeOlderMethodsForCompaction,
>> so that there's a better chance of Sista finding send and branch data for
>> the tripping method.
>>
>> extendedPushBytecode /does/ need a frame.
>>
>> Don't save the header in a scratch register unless
>> it is useful to do so in the Spur at:[put:] primitives.
>>
>> Fix slip in genGetNumBytesOf:into:. And notice that
>> genGetFormatOf:into:baseHeaderIntoScratch: et al can use byte access
>> to get at format, as intended in the Spur header design.
>>
>> Fix unlinking dynamic super sends.
>>
>> Reduce false positives in access control violation reporting by marking
>> the super send we actually use as privileged.
>> Remove unused Newspeak bytecodes.
>>
>> Internal:
>>
>> Fix code generation bug surfaced by inline primitives. On x86 movb
>> N(%reg),%rl can only store into al, bl, cl & dl, whereas movzbl can store
>> into any reg. On ARM move byte also zero-extends. So change definition of
>> MoveMbrR to always zero-extend, use movzbl on x86 and remove all the
>> MoveCq: 0 R: used to zero the bits of the target of a MoveMb:r:R:. And now
>> that we have genGetNumSlotsOf:into:, use it.
>>
>> Fix a slip in genTrinaryInlinePrimitive:, meet constraint that the target
>> must be in ReceiverResultReg, and do a better job of register allocation
>> there-in.
>>
>> Do dead code elimination for the branch following an inlined comparison
>> (this is done in genBinaryInlineComparison:opFalse:destReg: copying the
>> scheme in genSpecialSelectorEqualsEquals).
>>
>> Do register allocation in the right place in genUnaryInlinePrimitive:.
>>
>> Fix overflow slot access in genGetNumSlotsOf:into: et al.
>>
>> Fix several slips in inline primitive generation: Object>>at:put: needs to
>> include a store check. Some register allocation code was wrong. Some
>> results needed converting to SmallIntegers and recording results as pushed
>> on the sim stack.
>>
>> Change callPrimitiveBytecode to genCallPrimitiveBytecode in the Cogit.
>> remove the misnomer genConvertIntegerToSmallIntegerInScratchReg:
>>
>> Type of AbstractInstruction opcode must be unsigned now that we have
>> more than 128 opcodes (XCHGRR pushed things over the top).
>>
>> Lay the groundwork for 32-bit intra-zone jumps and calls on ARM by
>> introducing CallFull and JumpFull (and rewrites thereof) that are expected
>> to span the full address space, leaving Call/JumpLong to span merely the
>> 16mb code zone. On x86 CallFull and JumpFull simply default to
>> Call/JumpLong.
>>
>> Replace bytecode trapIfNotInstanceOf by jumpIfNotInstanceOfOrPop.
>>
>> Rewrote the JIT logic for traps to be able to write trap trampolines calls
>> at the end of the cogMethod.
>>
>> Refactor the slot store and store check machinery to take an inFrame:
>> argument and hence deal with the store check in genInnerPrimitiveAtPut:
>> on ARM.
>>
>> Fix limitation with MoveRXbrR; can only do movb from
>> %al through %dl, so swap with %eax around movb.
>>
>> Fix mistake with genGetNumBytesOf:into: by refactoring
>> genGetFormatOf:into:baseHeaderIntoScratch: into
>> genGetBits:ofFormatByteOf:into:baseHeaderIntoScratch:
>> and hence fetching and subtracting only odd bits of format.
>>
>> Correct the in-line primitive SmallInteger comparisons; CmpXR is
>> confusing ;-)
>>
>> Fix var op var unsafe byte at:. Result must be converted to SmallInteger.
>>
>> Correct the generated Slang for the new register allocation code by adding
>> a read-before-written pass to C generation that initializes variables
>> read-before-written with 0 (the C equivalent of nil).
>>
>> fix a bug where sometimes register allocation was marking
>> ReceiverResultReg as dead whereas it was still alive.
>>
>> Added some abstraction over register allocation. This is now used in
>> inline primitives.
>>
>>
>
>
> --
> best,
> Eliot
>
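
For anyone following the #== discussion above, here is a minimal Python sketch of the control flow Eliot describes: compare first, and only look for forwarders when the comparison fails. This is not the Cogit's code. `Obj`, `forward_to`, `follow` and `identical` are hypothetical names chosen for illustration, and the real JIT emits machine code inline rather than calling functions.

```python
class Obj:
    """Hypothetical heap object; forward_to is set if it has been forwarded."""

    def __init__(self, name, forward_to=None):
        self.name = name
        self.forward_to = forward_to


def follow(obj):
    """Follow a chain of forwarders until reaching a non-forwarded object."""
    while obj.forward_to is not None:
        obj = obj.forward_to
    return obj


def identical(a, b):
    """Sketch of #== with the forwarder check deferred to the failure path."""
    if a is b:
        # Fast path: the references already match, no forwarder check at all.
        return True
    if a.forward_to is None and b.forward_to is None:
        # Neither side is a forwarder, so the objects really are different.
        return False
    # At least one side is a forwarder: follow both and retry the comparison.
    return identical(follow(a), follow(b))


target = Obj("target")
stale = Obj("stale", forward_to=target)
print(identical(target, target), identical(stale, target))
```

The fast path pays for no forwarder check when the two references are already identical, which is the case the change is said to speed up; the false path does the same checks as before, plus the retry when a forwarder is found.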