Indeed STG to Cmm lowering drops the correct size information for ccall arguments, there's even a TODO comment that has been around for quite a few years: https://gitlab.haskell.org/ghc/ghc/-/blob/master/compiler/GHC/StgToCmm/Foreign.hs#L83
This has been an annoyance for Asterius as well. When we try to translate a CmmUnsafeForeignCall node to a wasm function call, a CInt argument (which should be i32 in wasm) can be mistyped as i64 which causes a validation error. We have to insert wrap/extend opcodes based on the callee function signature, but if we preserve correct argument size in Cmm (or at least enrich the hints to include it), we won't need such a hack. On Tue, Oct 20, 2020 at 4:05 PM Moritz Angermann <moritz.angerm...@gmail.com> wrote: > > Yes, that's right. I'm not sure it's in core though, as the width information > still seems to be available in Stg. However the lowering from > stg into cmm widens it. > > On Tue, Oct 20, 2020 at 9:57 PM Carter Schonwald <carter.schonw...@gmail.com> > wrote: >> >> ... are you talking about Haskell Int and word? Those are always the same >> size in bits and should match native point size. That is definitely an >> assumption of ghc >> >> It sounds like some information that is dropped after core is needed to >> correctly do something in stg/cmm in the context of the ARM64 ncg that was >> recently added to handle cint being 32bit in this context ? >> >> >> On Tue, Oct 20, 2020 at 5:49 AM Moritz Angermann >> <moritz.angerm...@gmail.com> wrote: >>> >>> Alright, let me expand a bit. I've been looking at aarch64 NCG for ghc. >>> The Linux side of things is looking really good, >>> so I've moved onto the macOS side (I'm afraid I don't have any Windows >>> aarch64 hardware, nor much windows knowledge >>> to even attempt a Windows version yet). >>> >>> When calling C functions, the usual approach is to pass the first few >>> arguments in registers, and then arguments that exceed >>> the argument passing slots on the stack. The Arm AArch64 Procedure Call >>> Standard (aapcs) for C does this by assigning 8byte >>> slots to each overflow argument on the stack. A company I won't name, has >>> decided to implement a slightly different variation of >>> the Procedure Call Standard, often referred to as darwinpcs. This deviates >>> from the aapcs for vargs, as well as for handling of >>> spilled arguments on the stack. >>> >>> The aapcs allows us to generate calls to C functions without knowing the >>> actual prototype of the function, as all arguments are >>> simply spilled into 8byte slots on the stack. The darwinpcs however >>> requires us to know the size of the arguments, so we can >>> properly pack them onto the stack. Ints have 4 bytes, so we need to pack >>> them into 4byte slots. >>> >>> In the process library we have this rather fun foreign import: >>> foreign import ccall unsafe "runInteractiveProcess" >>> c_runInteractiveProcess >>> :: Ptr CString >>> -> CString >>> -> Ptr CString >>> -> FD >>> -> FD >>> -> FD >>> -> Ptr FD >>> -> Ptr FD >>> -> Ptr FD >>> -> Ptr CGid >>> -> Ptr CUid >>> -> CInt -- reset child's SIGINT & SIGQUIT >>> handlers >>> -> CInt -- flags >>> -> Ptr CString >>> -> IO PHANDLE >>> >>> with the corresponding C declaration: >>> >>> extern ProcHandle runInteractiveProcess( char *const args[], >>> char *workingDirectory, >>> char **environment, >>> int fdStdIn, >>> int fdStdOut, >>> int fdStdErr, >>> int *pfdStdInput, >>> int *pfdStdOutput, >>> int *pfdStdError, >>> gid_t *childGroup, >>> uid_t *childUser, >>> int reset_int_quit_handlers, >>> int flags, >>> char **failed_doing); >>> This function thus takes 14 arguments. We pass only the first 8 arguments >>> in registers, and the others on the stack. >>> Argument 12 and 13 are of type int. On linux using the aapcs, we can pass >>> those in 8byte slots on the stack. That is >>> both of them are effectively 64bits wide when passed. However for >>> darwinpcs, it is expected that these adhere to their >>> size and are packed as such. Therefore Argument 12 and 13 need to be passed >>> as 4byte slots each on the stack. >>> >>> This yields a moderate 8byte saving on the stack for the same function call >>> on darwinpcs compared to aapcs. >>> >>> Now onto GHC. When we generate function calls for foreign C functions, we >>> deal with something like: >>> >>> genCCall >>> :: ForeignTarget -- function to call >>> -> [CmmFormal] -- where to put the result >>> -> [CmmActual] -- arguments (of mixed type) >>> -> BlockId -- The block we are in >>> -> NatM (InstrBlock, Maybe BlockId) >>> >>> based on Cmm Nodes of the form CmmUnsafeForeignCall target result_regs args >>> >>> The CmmActual in the care of runInteractiveProcess hold the arguments for >>> the function, however contrary to the function >>> declaration, it contains I64 slots for Argument 12 and 13. Thus computing >>> the space needed for them based on their Cmm >>> Representations yields 8bytes, when they should really be 32bit and consume >>> only 4 byte. >>> >>> To illustrate this a bit better: here is what we see in the pretty printed >>> cmm: >>> >>> (_s6w3::I64) = call "ccall" arg hints: [PtrHint, PtrHint, PtrHint, signed, >>> signed, signed, PtrHint, PtrHint, PtrHint, PtrHint, PtrHint, signed, >>> signed, PtrHint] result hints: [signed] _runInteractiveProcess(I64[Sp + >>> 96], I64[Sp + 88], I64[Sp + 104], I64[Sp + 112], I64[Sp + 120], I64[Sp + >>> 56], I64[Sp + 64], I64[Sp + 72], I64[Sp + 24], 0, 0, I64[Sp + 8], I64[Sp + >>> 40] | I64[Sp + 48] | I64[Sp + 80] | 3, I64[R1 + 7]); >>> >>> I've added size information to the ForeignHints (NoHint, AddrHint, >>> SignedHint) we have, and computed both, which yields: >>> >>> [(CmmReg (CmmLocal (LocalReg s6Gi (CmmType BitsCat W64))),AddrHint) >>> ,(CmmReg (CmmLocal (LocalReg s6Gk (CmmType BitsCat W64))),AddrHint) >>> ,(CmmReg (CmmLocal (LocalReg s6Gm (CmmType BitsCat W64))),AddrHint) >>> ,(CmmReg (CmmLocal (LocalReg s6Go (CmmType BitsCat W64))),SignedHint W32) >>> ,(CmmReg (CmmLocal (LocalReg s6Gq (CmmType BitsCat W64))),SignedHint W32) >>> ,(CmmReg (CmmLocal (LocalReg s6Gs (CmmType BitsCat W64))),SignedHint W32) >>> ,(CmmReg (CmmLocal (LocalReg s6Gu (CmmType BitsCat W64))),AddrHint) >>> ,(CmmReg (CmmLocal (LocalReg s6Gw (CmmType BitsCat W64))),AddrHint) >>> ,(CmmReg (CmmLocal (LocalReg s6Gy (CmmType BitsCat W64))),AddrHint) >>> ,(CmmReg (CmmLocal (LocalReg s6Cp (CmmType BitsCat W64))),AddrHint) >>> ,(CmmReg (CmmLocal (LocalReg s6FU (CmmType BitsCat W64))),AddrHint) >>> ,(CmmReg (CmmLocal (LocalReg s6GA (CmmType BitsCat W64))),SignedHint W32) >>> ,(CmmReg (CmmLocal (LocalReg s6GR (CmmType BitsCat W64))),SignedHint W32) >>> ,(CmmReg (CmmLocal (LocalReg s6GM (CmmType BitsCat W64))),AddrHint)] >>> >>> Thus, while we *do* know the right size from STG (which is what the Hints >>> are computed from), we loose this information when lowering >>> into Cmm, where we represent them with W64. This is what I was alluding to >>> in the previous email. In primRepCmmType, and mkIntCLit, we set their type >>> to 64bit for Ints; which on this platform does not hold. >>> >>> Now I've gone ahead and effectively assume Cmm is lying to me when >>> generating Foreign Function Calls, and rely on the (new) sized >>> hints to produce the appropriate argument packing on the stack. However I >>> believe the correct way would be for GHC not to conflate Ints >>> and Words in Cmm; implicitly assuming they are the same width. Sadly it's >>> not as simple as having primRepCmmType and mkIntCLit produce 32bit types. I >>> fear GHC internally assumes "Int" means 64bit Integer, and then just >>> happens to make the Int ~ CInt assumption. >>> >>> Cheers, >>> Moritz >>> >>> On Tue, Oct 20, 2020 at 3:33 PM Simon Peyton Jones <simo...@microsoft.com> >>> wrote: >>>> >>>> Moritz >>>> >>>> >>>> >>>> I’m afraid I don’t understand any of this. Not your fault, but I just >>>> don’t have enough context to know what you mean. >>>> >>>> >>>> >>>> Is there a current bug? If so, can you demonstrate it? If not, what is >>>> the problem you want to solve? Examples are always helpful. >>>> >>>> >>>> >>>> Maybe it’s worth opening a ticket too? >>>> >>>> >>>> >>>> Thanks! >>>> >>>> >>>> >>>> Simon >>>> >>>> >>>> >>>> From: ghc-devs <ghc-devs-boun...@haskell.org> On Behalf Of Moritz Angermann >>>> Sent: 20 October 2020 02:51 >>>> To: ghc-devs <ghc-devs@haskell.org> >>>> Subject: GHC's internal confusion about Ints and Words >>>> >>>> >>>> >>>> Hi there! >>>> >>>> >>>> >>>> So there is a procedure calling convention that for reasons I did not >>>> fully understand, but seem to be historically grown, uses packed arguments >>>> for those that are spilled onto the stack. On top of that, CInt is 32bit, >>>> Word is 64bits. This provides the following spectacle: >>>> >>>> >>>> >>>> While we know in STG that the CInt is 32bits wide, when lowered into Cmm, >>>> it's represented as I64 in the arguments to the C function. Thus packing >>>> based on the format of the Cmm type would yield 8 bytes. And now, all >>>> further packed arguments have the wrong offset (by four). >>>> >>>> >>>> >>>> Specifically in GHC.Cmm.Utils we find: >>>> >>>> primRepCmmType :: Platform -> PrimRep -> CmmType >>>> >>>> primRepCmmType platform IntRep = bWord platform >>>> >>>> >>>> >>>> mkIntCLit :: Platform -> Int -> CmmLit >>>> mkIntCLit platform i = CmmInt (toInteger i) (wordWidth platform) >>>> >>>> >>>> >>>> The naive idea to just fix this and make them return cIntWidth instead, >>>> seemingly produces the correct Cmm expressions at a local level, but >>>> produces a broken compiler. >>>> >>>> >>>> >>>> A second approach could be to extend the Hints into providing sizes, and >>>> using those during the foreign call generation to pack spilled arguments. >>>> This however appears to be more of a patching up of some fundamental >>>> underlying issue, instead of rectifying it properly. >>>> >>>> >>>> >>>> Maybe I'll have to go down the Hint path, it does however break current Eq >>>> assumptions, as they are sized now, and what was equal before, is only >>>> equal now if they represent the same size. >>>> >>>> >>>> >>>> From a cursory glance at the issues with naively fixing the width for Int, >>>> it seems that GHC internally assumes sizeof(Int) = sizeof(Word). Maybe >>>> there is a whole level of HsInt vs CInt discrimination missing? >>>> >>>> >>>> >>>> Cheers, >>>> >>>> Moritz >>> >>> _______________________________________________ >>> ghc-devs mailing list >>> ghc-devs@haskell.org >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs > > _______________________________________________ > ghc-devs mailing list > ghc-devs@haskell.org > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs