Re: [Pharo-dev] New Cog VMs available

Clément Bera Fri, 17 Apr 2015 17:15:59 -0700

2015-04-17 13:35 GMT-07:00 Eliot Miranda <eliot.mira...@gmail.com>:

>
>
> On Fri, Apr 17, 2015 at 3:41 AM, stepharo <steph...@free.fr> wrote:
>
>>
>>   ... at http://www.mirandabanda.org/files/Cog/VM/VM.r3311/.
>>
>>  These should fix the regression introduced by the map changes in 3308.
>> They certainly fix the two crashes I've looked at, one an update of a
>> squeak trunk image and the other the startup of recent Newspeak images.
>> Apologies for the inconvenience.
>>
>>
>>  Well, this is embarrassing as usual but I'm still seeing crashes in the
>> image update.  So I'll have to look deeper.  At least the Newspeak fix was
>> real, but it didn't fix everything.
>>
>>
>> :)
>> Hi eliot
>> How can we help?
>> Would it make sense to have some more ci jobs for testing the VM?
>>
>
> Yes, that always helps.  But we have to accept that that means
>  a) these CI jobs will detect failures caused by regressions etc
>  b) we need to mark builds as "bleeding edge" "last known good" etc, and
> set these pointers
>
> b) I have yet to set up.
>
> But yesterday and today a wonderful thing has happened.  Clément has
> understood the core optimization and code generation structures in the JIT
> as well as I do and is now both improving the code quality and implementing
> more aggressive optimizations.  This is /so/ satisfying.  You know how good
> Clément is.  His input is so strong.  I am /very/ happy.  For example,
> Clément is currently modifying #== in the Spur JIT so that the code checks
> for forwarders only if #== is false, and that if #== is false and either
> object is a forwarder, it/them is/are followed and the #== retried.  This
> should speed up #== by 50%.
>


I think you made a typo.

* This should speed up #== in the case where the answer is true, which
theoretically is 50% of the cases, though we have not profiled it in practice.
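
To make the idea concrete, here is a minimal sketch of the lazy forwarder
check described above. It is a Python model, not the JIT code: the Obj class,
its forward field and the function names are all hypothetical and exist only
for illustration.

```python
class Obj:
    """A heap object; `forward` is a hypothetical forwarding pointer."""
    def __init__(self, name):
        self.name = name
        self.forward = None  # set when the object has been forwarded


def follow(obj):
    """Chase forwarding pointers until a non-forwarded object is reached."""
    while obj.forward is not None:
        obj = obj.forward
    return obj


def identical_eager(a, b):
    """Check both operands for forwarders before every comparison."""
    return follow(a) is follow(b)


def identical_lazy(a, b):
    """Compare first; look for forwarders only if the comparison fails."""
    if a is b:
        return True       # fast path: no forwarder check at all
    fa, fb = follow(a), follow(b)
    if fa is a and fb is b:
        return False      # neither operand was a forwarder, answer stands
    return fa is fb       # retry the comparison on the followed objects


old, new, other = Obj("old"), Obj("new"), Obj("other")
old.forward = new         # `old` now forwards to `new`
```

With these definitions identical_lazy(old, new) answers True and
identical_lazy(old, other) answers False, the same as the eager version, but
the true case never touches the forwarding check.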


> It won't make much difference at the macro level but this is a non-trivial
> optimization to write and so now we have two people who really understand
> the core optimizing JIT and I can now happily run in front of a bus.
>

It will make a difference on some benchmarks where we saw the overhead of
#== due to forwarding-pointer checks.

And Ryan has a really good understanding too. I really enjoy the
abstraction he made over the send logic.
And Tim has the ARM backend almost working too (as far as I understood, it
needs a week of work to get it working).

>
>
>> Stef
>>
>>
>>
>>  CogVM binaries as per VMMaker.oscog-eem.1204/r3311
>>
>>  Cogits:
>>
>>  Fix regression in map machinery due to adding AnnotationExtension scheme.
>> findMapLocationForMcpc:inMethod: must not be confused by IsDisplacementX2N
>> bytes.  This is likely the cause of the recent crashes with r3308 and
>> earlier.
>>
>>  Introduce marryFrameCopiesTemps and use it to
>> not copy temps in Spur context creation trampolines.
>>
>>  Change initial usage counts to keep more recently jitted methods around
>> for longer, and do *not* throw away PICs in freeOlderMethodsForCompaction,
>> so that there's a better chance of Sista finding send and branch data for
>> the tripping method.
>>
>>  extendedPushBytecode /does/ need a frame.
>>
>>  Don't save the header in a scratch register unless
>> it is useful to do so in the Spur at:[put:] primitives.
>>
>>  Fix slip in genGetNumBytesOf:into:.  And notice that
>> genGetFormatOf:into:baseHeaderIntoScratch: et al can use byte access
>> to get at format, as intended in the Spur header design.
>>
>>  Fix unlinking dynamic super sends.
>>
>>  Reduce false positives in access control violation reporting by marking
>> the super send we actually use as privileged.  Remove unused Newspeak
>> bytecodes.
>>
>>  Internal:
>>
>>  Fix code generation bug surfaced by inline primitives.  On x86 movb
>> N(%reg),%rl can only store into al, bl, cl & dl, whereas movzbl can store
>> into any reg.  On ARM move byte also zero-extends.  So change definition
>> of MoveMbrR to always zero-extend, use movzbl on x86 and remove all the
>> MoveCq: 0 R: used to zero the bits of the target of a MoveMb:r:R:.  And
>> now that we have genGetNumSlotsOf:into:, use it.
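
A small model may help show why the explicit zeroing was needed before this
change. The sketch below treats a 32-bit register as a Python integer; the
function names mirror the x86 mnemonics, but this is only an illustration of
the semantics, not the Cogit's code generator.

```python
def movb(reg_value, byte):
    """Model of x86 movb into a byte register: only the low 8 bits change."""
    return (reg_value & 0xFFFFFF00) | (byte & 0xFF)


def movzbl(reg_value, byte):
    """Model of x86 movzbl: the whole 32-bit register is overwritten."""
    return byte & 0xFF


stale = 0xDEADBEEF                    # whatever the register held before

# Byte move alone leaves the upper 24 bits of the old value in place.
dirty = movb(stale, 0x7F)

# Old scheme: zero the target first (the MoveCq: 0 R: step), then movb.
cleared = movb(0, 0x7F)

# New scheme: a single zero-extending load.
extended = movzbl(stale, 0x7F)
```

Here dirty is 0xDEADBE7F while cleared and extended are both 0x7F, which is
why a zero-extending MoveMbrR makes the separate zeroing instruction
redundant.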
>>
>>  Fix a slip in genTrinaryInlinePrimitive:, meet constraint that the target
>> must be in ReceiverResultReg, and do a better job of register allocation
>> therein.
>>
>>  Do dead code elimination for the branch following an inlined comparison
>> (this is done in genBinaryInlineComparison:opFalse:destReg: copying the
>> scheme in genSpecialSelectorEqualsEquals).
>>
>>  Do register allocation in the right place in genUnaryInlinePrimitive:.
>>
>>  Fix overflow slot access in genGetNumSlotsOf:into: et al.
>>
>>  Fix several slips in inline primitive generation: Object>>at:put: needs
>> to include a store check.  Some register allocation code was wrong.  Some
>> results needed converting to SmallIntegers and recording results as pushed
>> on the sim stack.
>>
>>  Change callPrimitiveBytecode to genCallPrimitiveBytecode in the Cogit.
>> Remove the misnomer genConvertIntegerToSmallIntegerInScratchReg:.
>>
>>  Type of AbstractInstruction opcode must be unsigned now that we have
>> more than 128 opcodes (XCHGRR pushed things over the top).
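
The signedness issue can be illustrated in a few lines. This sketch assumes
the opcode is stored in a single byte (the changelog implies this but does
not say so), and uses Python's ctypes only to mimic C's signed and unsigned
8-bit types; the helper names are made up for the example.

```python
import ctypes


def as_signed_byte(opcode):
    """What an opcode reads back as when stored in a signed 8-bit field."""
    return ctypes.c_int8(opcode).value


def as_unsigned_byte(opcode):
    """What an opcode reads back as when stored in an unsigned 8-bit field."""
    return ctypes.c_uint8(opcode).value


# Opcodes 0..127 survive either way; the 129th opcode (value 128) does not.
last_safe = as_signed_byte(127)
first_broken = as_signed_byte(128)
fixed = as_unsigned_byte(128)
```

Here last_safe is 127, first_broken is -128 and fixed is 128, so any table
lookup or comparison on the opcode goes wrong once the signed field wraps.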
>>
>>  Lay the groundwork for 32-bit intra-zone jumps and calls on ARM by
>> introducing CallFull and JumpFull (and rewrites thereof) that are expected
>> to span the full address space, leaving Call/JumpLong to span merely the
>> 16MB code zone.  On x86 CallFull and JumpFull simply default to
>> Call/JumpLong.
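
One way to picture the split between the two jump flavours is a range check
on the code zone. The sketch below is only a guess at the shape of such a
decision: choose_jump, in_zone and zone_start are hypothetical names, and the
actual selection logic in the Cogit is not shown in this thread.

```python
CODE_ZONE_SIZE = 16 * 1024 * 1024  # the 16MB code zone mentioned above


def in_zone(address, zone_start):
    """True if the address lies inside the (hypothetical) code zone."""
    return zone_start <= address < zone_start + CODE_ZONE_SIZE


def choose_jump(source, target, zone_start):
    """Pick the short form inside the zone, the full form otherwise."""
    if in_zone(source, zone_start) and in_zone(target, zone_start):
        return "JumpLong"   # both ends inside the code zone
    return "JumpFull"       # target may be anywhere in the address space


zone = 0x10000000
inside = choose_jump(zone + 0x100, zone + 0xFFFF00, zone)
outside = choose_jump(zone + 0x100, 0x80000000, zone)
```

In this model inside is "JumpLong" and outside is "JumpFull"; on x86 both
would map to the same instruction, as the changelog entry says.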
>>
>>  Replace bytecode trapIfNotInstanceOf by jumpIfNotInstanceOfOrPop.
>>
>>  Rewrote the JIT logic for traps to be able to write trap trampoline
>> calls at the end of the cogMethod.
>>
>>  Refactor the slot store and store check machinery to take an inFrame:
>> argument and hence deal with the store check in genInnerPrimitiveAtPut:
>> on ARM.
>>
>>  Fix limitation with MoveRXbrR; can only do movb from
>> %al through %dl, so swap with %eax around movb.
>>
>>  Fix mistake with genGetNumBytesOf:into: by refactoring
>> genGetFormatOf:into:baseHeaderIntoScratch: into
>> genGetBits:ofFormatByteOf:into:baseHeaderIntoScratch:
>> and hence fetching and subtracting only odd bits of format.
>>
>>  Correct the in-line primitive SmallInteger comparisons; CmpXR is
>> confusing ;-)
>>
>>  Fix var op var unsafe byte at:.  Result must be converted to
>> SmallInteger.
>>
>>  Correct the generated Slang for the new register allocation code by
>> adding a read-before-written pass to C generation that initializes
>> variables read-before-written with 0 (the C equivalent of nil).
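
As a rough illustration of what a read-before-written pass does, here is a
sketch over straight-line code only. The statement representation (a target
name plus the names it reads) and both function names are assumptions made
for the example; the real Slang pass works on parse trees and has to deal
with control flow, which this model ignores.

```python
def read_before_written(local_names, statements):
    """Return locals that are read before any assignment reaches them.

    `statements` is a list of (target, reads) pairs in program order, where
    `target` is the assigned variable (or None) and `reads` the names used.
    """
    written = set()
    needs_init = []
    for target, reads in statements:
        for name in reads:
            if (name in local_names and name not in written
                    and name not in needs_init):
                needs_init.append(name)
        if target is not None:
            written.add(target)
    return needs_init


def emit_declarations(local_names, statements):
    """Emit C declarations, initializing read-before-written locals with 0."""
    rbw = set(read_before_written(local_names, statements))
    return [f"sqInt {n} = 0;" if n in rbw else f"sqInt {n};"
            for n in local_names]


# tmp := reg. reg := tmp.   ('reg' is read before anything writes it)
program = [("tmp", ["reg"]), ("reg", ["tmp"])]
decls = emit_declarations(["reg", "tmp"], program)
```

For this input decls comes out as "sqInt reg = 0;" followed by "sqInt tmp;",
matching the Smalltalk behaviour where an unassigned temp reads as nil.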
>>
>>  Fix a bug where sometimes register allocation was marking
>> ReceiverResultReg as dead whereas it was still alive.
>>
>>  Added some abstraction over register allocation.  This is now used in
>> inline primitives.
>>
>>
>>
>>
>
>
> --
> best,
> Eliot
>
