Re: Build GHC in GHC2021

2023-12-08 Thread Andreas Klebinger via ghc-devs

On a whim I tried enabling it by default, but it failed first because
GeneralizedNewtypeDeriving is incompatible with Safe Haskell,
and once I disabled GeneralizedNewtypeDeriving it failed with some error
about mismatched kinds.

I would welcome a patch doing this, but it's not a priority, especially
since it doesn't seem to be as simple as changing
the base language and removing some pragmas.

Am 07/12/2023 um 14:29 schrieb Arnaud Spiwack:

Indeed. I didn't realise the ambiguity in my wording.

I'd like for GHC to be built, with Hadrian, using GHC2021 as the base
language.


On Thu, 7 Dec 2023 at 14:01, Tom Ellis
 wrote:

On Thu, Dec 07, 2023 at 12:53:02PM +, Richard Eisenberg wrote:
> I think this is an excellent idea! So excellent, that we've
already done it. :)
>
> When I try to compile with GHC 9.6.2 (what I have lying around),
GHC2021 is in effect.
>
> Is there something different you were thinking of?

I think Arnaud meant that compilations of GHC's codebase itself should
use the GHC2021 setting.
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



--
Arnaud Spiwack
Director, Research at https://moduscreate.com and https://tweag.io.

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How do you keep tabs on commits that fix issues?

2023-09-28 Thread Andreas Klebinger

Personally I try to include "fixes #1234" in the commit message, so I can
just check which tags contain a commit mentioning the issue.

If the issue isn't mentioned in the commit I usually go: look at the issue
-> look for related MRs -> look for the commit with the fix -> grep for
that commit's message, or look for the Marge MR mentioned on the MR.

Am 28/09/2023 um 08:56 schrieb Bryan Richter via ghc-devs:

I am not sure of the best ways for checking if a certain issue has
been fixed on a certain release. My past ways of using git run into
certain problems:

The commit (or commits!) that fix an issue get rewritten once by Marge
as they are rebased onto master, and then potentially a second time as
they are cherry-picked onto release branches. So just following the
original commits doesn't work.

If a commit mentions the issue it fixes, you might get some clues as
to where it has ended up from GitLab. But those clues are often
drowning in irrelevant mentions: each failed Marge batch, for
instance, of which there can be many.

The only other thing I can think to do is look at the original merge
request, pluck out the commit messages, and use git to search for
commits by commit message and check each one for which branches
contain it. But then I also need to know the context of the fix to
know whether I should also be looking for other, logically related
commits, and repeat the dance. (Sometimes fixes are only partially
applied to certain releases, exacerbating the need for knowing the
context.) This seems like a mechanism that can't rely on trusting the
author of the original set of patches (which may be your past self)
and instead requires a deep understanding to be brought to bear every
time you would want to double check the situation. So it's not very
scalable and I wouldn't expect many people to be able to do it.

Are there better mechanisms already available? As I've said before, I
am used to a different git workflow and I'm still learning how to use
the one used by GHC. I'd like to know how others handle it.

Thanks!

-Bryan


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Can't build nofib

2023-07-12 Thread Andreas Klebinger

The nofib master branch has updated bounds. It's just GHC's submodule
that's lagging behind.

Am 12/07/2023 um 13:49 schrieb Rodrigo Mesquita:

I would recommend --allow-newer rather than rebuilding with 9.4. In
retrospect, 9.4 implies base == 4.17, but nofib seems to only allow <
4.17, which would leave 9.4 out.

Rodrigo


On 12 Jul 2023, at 12:48, Simon Peyton Jones
 wrote:

Thanks.  That is very unfortunate: ./configure does not issue any
complaint.

I upgraded from 9.2 because GHC won't compile with 9.2 any more.  But
now you are saying that nofib won't build with 9.6? So that leaves
9.4 only.

Well I can install 9.4 and rebuild everything.  But really, it would
be good if configure complained if you are using a boot compiler that
won't work.  That's what configure is for!

Simon

On Wed, 12 Jul 2023 at 12:41, Rodrigo Mesquita
 wrote:

From the error message it looks like you’re using ghc-9.6 (and
base 4.18) while nofib requires base < 4.17.
I’d say as a temporary workaround you can likely run your
invocation additionally with --allow-newer, and hope that doesn’t
break. Otherwise you could downgrade to 9.4 or bump the version
manually in the cabal file of nofib?

Rodrigo


On 12 Jul 2023, at 12:38, Simon Peyton Jones
 wrote:

Friends

With a clean HEAD I can't build nofib.  See below.  What should
I do?

Thanks

Simon

(cd nofib; cabal v2-run -- nofib-run
--compiler=`pwd`/../_build/stage1/bin/ghc --output=`date -I`)
Resolving dependencies...
Error: cabal: Could not resolve dependencies:
[__0] trying: nofib-0.1.0.0 (user goal)
[__1] next goal: base (dependency of nofib)
[__1] rejecting: base-4.18.0.0/installed-4.18.0.0 (conflict:
nofib =>
base>=4.5 && <4.17)
[__1] skipping: base-4.18.0.0, base-4.17.1.0, base-4.17.0.0 (has
the same
characteristics that caused the previous version to fail:
excluded by
constraint '>=4.5 && <4.17' from 'nofib')
[__1] rejecting: base-4.16.4.0, base-4.16.3.0, base-4.16.2.0,
base-4.16.1.0,
base-4.16.0.0, base-4.15.1.0, base-4.15.0.0, base-4.14.3.0,
base-4.14.2.0,
base-4.14.1.0, base-4.14.0.0, base-4.13.0.0, base-4.12.0.0,
base-4.11.1.0,
base-4.11.0.0, base-4.10.1.0, base-4.10.0.0, base-4.9.1.0,
base-4.9.0.0,
base-4.8.2.0, base-4.8.1.0, base-4.8.0.0, base-4.7.0.2,
base-4.7.0.1,
base-4.7.0.0, base-4.6.0.1, base-4.6.0.0, base-4.5.1.0,
base-4.5.0.0,
base-4.4.1.0, base-4.4.0.0, base-4.3.1.0, base-4.3.0.0,
base-4.2.0.2,
base-4.2.0.1, base-4.2.0.0, base-4.1.0.0, base-4.0.0.0,
base-3.0.3.2,
base-3.0.3.1 (constraint from non-upgradeable package requires
installed
instance)
[__1] fail (backjumping, conflict set: base, nofib)
After searching the rest of the dependency tree exhaustively,
these were the
goals I've had most trouble fulfilling: base, nofib

make: *** [/home/simonpj/code/Makefile-spj:39: nofib] Error 1

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs





___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Can't build nofib

2023-07-12 Thread Andreas Klebinger

Try adding --allow-newer; we probably just haven't updated the bounds yet.

(cd nofib; cabal v2-run --allow-newer -- nofib-run
--compiler=`pwd`/../_build/stage1/bin/ghc --output=`date -I`)

Am 12/07/2023 um 13:38 schrieb Simon Peyton Jones:

Friends

With a clean HEAD I can't build nofib.  See below.  What should I do?

Thanks

Simon

(cd nofib; cabal v2-run -- nofib-run
--compiler=`pwd`/../_build/stage1/bin/ghc --output=`date -I`)
Resolving dependencies...
Error: cabal: Could not resolve dependencies:
[__0] trying: nofib-0.1.0.0 (user goal)
[__1] next goal: base (dependency of nofib)
[__1] rejecting: base-4.18.0.0/installed-4.18.0.0 (conflict: nofib =>
base>=4.5 && <4.17)
[__1] skipping: base-4.18.0.0, base-4.17.1.0, base-4.17.0.0 (has the same
characteristics that caused the previous version to fail: excluded by
constraint '>=4.5 && <4.17' from 'nofib')
[__1] rejecting: base-4.16.4.0, base-4.16.3.0, base-4.16.2.0,
base-4.16.1.0,
base-4.16.0.0, base-4.15.1.0, base-4.15.0.0, base-4.14.3.0, base-4.14.2.0,
base-4.14.1.0, base-4.14.0.0, base-4.13.0.0, base-4.12.0.0, base-4.11.1.0,
base-4.11.0.0, base-4.10.1.0, base-4.10.0.0, base-4.9.1.0, base-4.9.0.0,
base-4.8.2.0, base-4.8.1.0, base-4.8.0.0, base-4.7.0.2, base-4.7.0.1,
base-4.7.0.0, base-4.6.0.1, base-4.6.0.0, base-4.5.1.0, base-4.5.0.0,
base-4.4.1.0, base-4.4.0.0, base-4.3.1.0, base-4.3.0.0, base-4.2.0.2,
base-4.2.0.1, base-4.2.0.0, base-4.1.0.0, base-4.0.0.0, base-3.0.3.2,
base-3.0.3.1 (constraint from non-upgradeable package requires installed
instance)
[__1] fail (backjumping, conflict set: base, nofib)
After searching the rest of the dependency tree exhaustively, these
were the
goals I've had most trouble fulfilling: base, nofib

make: *** [/home/simonpj/code/Makefile-spj:39: nofib] Error 1


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Using GHC API with multiple targets

2023-02-06 Thread Andreas Klebinger

I think this is an ok forum for this kind of question. You could also
try the Haskell mailing list, but I'm not sure if you will get more
help there.

I recently played around with the ghc api and I found the `hint` package
to be quite helpful as an example on how to do various
things when using the ghc api to implement your own interpreter.

Have you tried raising the verbosity? Perhaps the include dir is relative to
the working directory. In that case setting:

                  , workingDirectory = Just targetDir
                  , importPaths      = [targetDir] ++ importPaths dynflags

would mean ghc will search in targetDir/targetDir for Lib/Lib2. Should
be easy to say for sure by enabling verbosity and looking at the output.
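
Something like the below is what I mean: drop workingDirectory (or the
extra importPaths entry) so the directory is only applied once, and raise
the verbosity so GHC prints which directories it searches. This is just a
sketch based on your snippet, untested; the only field I'm adding is
`verbosity`, which is the DynFlags field behind -v:

    setSessionDynFlags $ dynflags { ghcLink          = LinkInMemory
                                  , ghcMode          = CompManager
                                  , backend          = Interpreter
                                  , mainModuleNameIs = moduleName
                                  -- keep the import path in one place only and
                                  -- raise verbosity to see which dirs GHC searches
                                  , importPaths      = targetDir : importPaths dynflags
                                  , verbosity        = 3
                                  }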

Am 06/02/2023 um 13:42 schrieb Eternal Recursion via ghc-devs:

If this is the wrong forum for this question (which as I think about
it, I suppose it is) then redirection to a more appropriate mailing
list or forum (or any advice, really) would be appreciated. I just
figured this would be the forum with the best understanding of how the
GHC API works (and has changed over time), and my longer term goal is
indeed to contribute to it after I get past my learning curve.

Sincerely,

Bob

Sent with Proton Mail  secure email.

--- Original Message ---
On Saturday, February 4th, 2023 at 4:04 PM, Eternal Recursion via
ghc-devs  wrote:


Hi Everyone!

I'm new here, trying to learn the GHC API. using 944 with cabal 3.8.1.0.

How do I correctly set a GHC Session's DynFlags (and/or other
properties) to ensure local libraries imported by the main target are
resolved properly at compile time?

What flags need to be set so that GHC is able to load/analyze/compile
all relevant Libraries in a package?

This is my current code:

withPath :: FilePath -> IO ()
withPath target = do
  let targetDir  = takeDirectory target
  let targetFile = takeFileName target
  listing <- listDirectory targetDir
  let imports = filter (\f -> takeExtension f == ".hs") listing
  print imports
  let moduleName = mkModuleName targetFile
  g <- defaultErrorHandler defaultFatalMessager defaultFlushOut
    $ runGhc (Just libdir) $ do
        initGhcMonad (Just libdir)
        dynflags <- getSessionDynFlags
        setSessionDynFlags $ dynflags { ghcLink          = LinkInMemory
                                      , ghcMode          = CompManager
                                      , backend          = Interpreter
                                      , mainModuleNameIs = moduleName
                                      , workingDirectory = Just targetDir
                                      , importPaths      = [targetDir] ++ importPaths dynflags
                                      }
        targets <- mapM (\t -> guessTarget t Nothing Nothing) imports
        setTargets targets
        setContext [ IIDecl $ simpleImportDecl (mkModuleName "Prelude") ]
        load LoadAllTargets
        liftIO . print . ppr =<< getTargets
        getModuleGraph
  putStrLn "Here we go!"
  print $ ppr $ mgModSummaries g
  putStrLn "☝️ "

However, when I run it (passing to example/app/Main.hs, in which
directory are Lib.hs and Lib2.hs, the latter being imported into
Main), I get:

$cabal run cli -- example/app/Main.hs
Up to date
["Main.hs","Lib.hs","Lib2.hs"]
[main:Main.hs, main:Lib.hs, main:Lib2.hs]
Here we go!
[ModSummary {
   ms_hs_hash = 23f9c4415bad851a1e36db9d813f34be
   ms_mod = Lib,
   unit = main
   ms_textual_imps = [(, Prelude)]
   ms_srcimps = []
},
ModSummary {
   ms_hs_hash = e1eccc23af49f3498a5a9566e63abefd
   ms_mod = Lib2,
   unit = main
   ms_textual_imps = [(, Prelude)]
   ms_srcimps = []
},
ModSummary {
   ms_hs_hash = 5f6751d7f0d5547a1bdf39af84f8c07f
   ms_mod = Main,
   unit = main
   ms_textual_imps = [(, Prelude), (, Lib2)]
   ms_srcimps = []
}]
☝

example/app/Main.hs:4:1: error:
   Could not find module ‘Lib2’
   Use -v (or `:set -v` in ghci) to see a list of the files searched for.
 |
4 |import qualified Lib2 as L2
 |^^^
cli: example/app/Main.hs:4:1: error:
   Could not find module `Lib2'
   Use -v (or `:set -v` in ghci) to see a list of the files searched
for.

​What do I need to do differently to make this work?

I have a local Cabal file I could use, but to know what I need out of
it, I need to understand the minimum required info to get this to
work. TIA!

Sincerely,

Bob

Sent with Proton Mail  secure email.



___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Mixed boxed/unboxed arrays?

2022-08-02 Thread Andreas Klebinger

Indeed I misunderstood. As you already suspected this wouldn't work for
Int# (or other unboxed types) sadly as the GC would assume these to be
pointers which no doubt would lead to segfaults or worse.

Rereading your initial mail I can say the runtime currently doesn't
support such a heap object.
If I understand you correctly what you would like is basically
something like:

    Con n  P I#  P I#  P I#  ...
           \--/  \--/  \--/
          Pair1 Pair2 Pair3 ...

Where n gives the number of pairs.

I can see how it might be feasible to add a heap object like this to GHC
but I'm unsure if it would be worth the complexity, as its layout
diverges quite a bit from what GHC usually expects.

The other option would be to expose to users a way to have an object
that consists of a given number of words and a bitmap which indicates to
the GC which fields are pointers. This is more or less
the representation that's already used for stack frames IIRC, so
that might not be as far-fetched as it seems at first.
It might even be possible to implement some sort of prototype for this
using hand written Cmm.

But there are not any plans to implement anything like this as far as I
know.

Am 02/08/2022 um 20:51 schrieb Jaro Reinders:


It seems you have misunderstood me. I want to store *unboxed* Int#s
inside the array, not just some unlifted types. Surely in the case of
unboxed integers the unsafeCoerce# function can make the garbage
collector crash as they might be interpreted as arbitrary pointers.

Cheers,

Jaro

On 02/08/2022 20:24, Andreas Klebinger wrote:


I think it's possible to do this *today* using unsafeCoerce#.

I was able to come up with this basic example below. In practice one
would at the very least want to abstract away the gnarly stuff inside a
library. But since it sounds like you want to be the one to write a
library that shouldn't be a problem.

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
{-# LANGUAGE UnliftedDatatypes #-}

module Main where

import GHC.Exts
import GHC.IO
import Unsafe.Coerce
import Data.Kind

data SA = SA (SmallMutableArray# RealWorld Any)

mkArray :: Int -> a -> IO SA
mkArray (I# n) initial = IO $ \s ->
    case unsafeCoerce# (newSmallArray# n initial s) of
        (# s', arr #) -> (# s', SA arr #)

readLifted :: SA -> Int -> IO a
readLifted (SA arr) (I# i) = IO (\s ->
    unsafeCoerce# (readSmallArray# arr i s)
    )

data UWrap (a :: UnliftedType) = UWrap a

-- UWrap is just here because we can't return unlifted types in IO
-- If you don't need your result in IO you can eliminate this indirection.
readUnlifted :: forall a. SA -> Int -> IO (UWrap a)
readUnlifted (SA arr) (I# i) = IO (\s ->
    case unsafeCoerce# (readSmallArray# arr i s) of
        (# s', a :: a #) -> (# s', UWrap a #)
    )

writeLifted :: a -> Int -> SA -> IO ()
writeLifted x (I# i) (SA arr) = IO $ \s ->
    case writeSmallArray# (unsafeCoerce# arr) i x s of
        s -> (# s, () #)

writeUnlifted :: (a :: UnliftedType) -> Int -> SA -> IO ()
writeUnlifted x (I# i) (SA arr) = IO $ \s ->
    case writeSmallArray# arr i (unsafeCoerce# x) s of
        s -> (# s, () #)

type UB :: UnliftedType
data UB = UT | UF

showU :: UWrap UB -> String
showU (UWrap UT) = "UT"
showU (UWrap UF) = "UF"

main :: IO ()
main = do
    arr <- mkArray 4 ()
    writeLifted True 0 arr
    writeLifted False 1 arr
    writeUnlifted UT 2 arr
    writeUnlifted UT 3 arr
    (readLifted arr 0 :: IO Bool) >>= print
    (readLifted arr 1 :: IO Bool) >>= print
    (readUnlifted arr 2 :: IO (UWrap UB)) >>= (putStrLn . showU)
    (readUnlifted arr 3 :: IO (UWrap UB)) >>= (putStrLn . showU)
    return ()

Cheers

Andreas

Am 02/08/2022 um 17:32 schrieb J. Reinders:

Could you use `StablePtr` for the keys?

That might be an option, but I have no idea how performant stable pointers are 
and manual management is obviously not ideal.


How does the cost of computing object hashes and comparing colliding
objects compare with the potential cache miss cost of using boxed
integers or a separate array?  Would such an "optimisation" be worth
the effort?

Literature on hash tables suggests that cache misses were a very important
factor in running time (in 2001):
https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.25.4189

I don’t know whether it has become less or more important now, but I have been 
told there haven’t been that many advances in memory latency.
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Mixed boxed/unboxed arrays?

2022-08-02 Thread Andreas Klebinger

I think it's possible to do this *today* using unsafeCoerce#.

I was able to come up with this basic example below. In practice one
would at the very least want to abstract away the gnarly stuff inside a
library. But since it sounds like you want to be the one to write a
library that shouldn't be a problem.

{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}
{-# LANGUAGE UnliftedDatatypes #-}

module Main where

import GHC.Exts
import GHC.IO
import Unsafe.Coerce
import Data.Kind

data SA = SA (SmallMutableArray# RealWorld Any)

mkArray :: Int -> a -> IO SA
mkArray (I# n) initial = IO $ \s ->
    case unsafeCoerce# (newSmallArray# n initial s) of
        (# s', arr #) -> (# s', SA arr #)

readLifted :: SA -> Int -> IO a
readLifted (SA arr) (I# i) = IO (\s ->
    unsafeCoerce# (readSmallArray# arr i s)
    )

data UWrap (a :: UnliftedType) = UWrap a

-- UWrap is just here because we can't return unlifted types in IO
-- If you don't need your result in IO you can eliminate this indirection.
readUnlifted :: forall a. SA -> Int -> IO (UWrap a)
readUnlifted (SA arr) (I# i) = IO (\s ->
    case unsafeCoerce# (readSmallArray# arr i s) of
        (# s', a :: a #) -> (# s', UWrap a #)
    )

writeLifted :: a -> Int -> SA -> IO ()
writeLifted x (I# i) (SA arr) = IO $ \s ->
    case writeSmallArray# (unsafeCoerce# arr) i x s of
        s -> (# s, () #)

writeUnlifted :: (a :: UnliftedType) -> Int -> SA -> IO ()
writeUnlifted x (I# i) (SA arr) = IO $ \s ->
    case writeSmallArray# arr i (unsafeCoerce# x) s of
        s -> (# s, () #)

type UB :: UnliftedType
data UB = UT | UF

showU :: UWrap UB -> String
showU (UWrap UT) = "UT"
showU (UWrap UF) = "UF"

main :: IO ()
main = do
    arr <- mkArray 4 ()
    writeLifted True 0 arr
    writeLifted False 1 arr
    writeUnlifted UT 2 arr
    writeUnlifted UT 3 arr
    (readLifted arr 0 :: IO Bool) >>= print
    (readLifted arr 1 :: IO Bool) >>= print
    (readUnlifted arr 2 :: IO (UWrap UB)) >>= (putStrLn . showU)
    (readUnlifted arr 3 :: IO (UWrap UB)) >>= (putStrLn . showU)
    return ()

Cheers

Andreas

Am 02/08/2022 um 17:32 schrieb J. Reinders:

Could you use `StablePtr` for the keys?

That might be an option, but I have no idea how performant stable pointers are 
and manual management is obviously not ideal.


How does the cost of computing object hashes and comparing colliding
objects compare with the potential cache miss cost of using boxed
integers or a separate array?  Would such an "optimisation" be worth
the effort?

Literature on hash tables suggests that cache misses were a very important
factor in running time (in 2001):
https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.25.4189

I don’t know whether it has become less or more important now, but I have been 
told there haven’t been that many advances in memory latency.
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: [External] Re: Specialising NOINLINE functions

2022-05-09 Thread Andreas Klebinger


Am 09/05/2022 um 10:12 schrieb Simon Peyton Jones:

I think the only downside is compilation time and code bloat.


It can cause some rules to stop firing in a similar fashion as
https://gitlab.haskell.org/ghc/ghc/-/issues/20364 discusses for W/W.

But that can be fixed by adding the appropriate INLINE[ABLE] pragmas.
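
Purely as a hypothetical illustration of that fix (not taken from the
ticket): the pragma just has to sit on the binding whose unfolding the
later specialisation or rule needs, e.g.

    import qualified Data.Map.Strict as M

    -- INLINABLE keeps the unfolding in the interface file, so call sites
    -- in other modules can still specialise this binding even though it is
    -- not inlined eagerly.
    {-# INLINABLE lookupOrZero #-}
    lookupOrZero :: Ord k => k -> M.Map k Int -> Int
    lookupOrZero k m = M.findWithDefault 0 k m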

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Absence info at run-time

2022-04-13 Thread Andreas Klebinger

W/W should transform such a function into one that takes one less
argument, removing any runtime overhead at least for fully applied functions.

I suppose your suggestion is then: if we have an expression `f x`, where f
takes multiple arguments but doesn't use the current argument, then GHC
should:

* Inspect f, check if the first argument to f is used.
* If we can determine it isn't used, then instead of creating a PAP capturing
`f` and `x`, only capture `f` and record this in the PAP closure somehow.
* Once the PAP is fully applied pass a dummy argument instead of `x` to f.

If f is a known call that seems doable, although adding a bitmap to paps
might require us to increase the size of all PAP closures, making this
optimization less useful.

If `f` is an unknown function there is currently no way to get
absent/used info for it's arguments at runtime. And changing that would
be a major change which seems unlikely to pay off.

So I think this would be theoretically possible, but it would rarely pay
off.

Also do you have an example where `(const a) b` leads to stupid thunks?
It seems to me const should always be inlined in such a case, avoiding a
PAP allocation.
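
(For reference, a hypothetical example of the expression I mean; with -O
the inliner rewrites `const a b` to `a`, so no thunk or PAP retaining `b`
should survive:)

    noStupidThunk :: Int -> Int -> Maybe Int
    noStupidThunk a b = Just (const a b)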

Am 12/04/2022 um 23:02 schrieb David Feuer:

Suppose `f` doesn't use its first argument. When forming the thunk (or
partial application) `f a`, we don't need to record `a`. What if
instead of arity, we store a bitmap used/absent arguments, terminated
by a 1 bit? Could we then get rid of "stupid thunks" like `(const a) b`?


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Semantics of Cmm `switch` statement?

2022-01-11 Thread Andreas Klebinger

Hi Norman,

I vaguely remember that we "finish" such unterminated code blocks by
jumping to the block again.

That is for code like this:

    myswitch2 (bits32 n) {
  foreign "C" D();
    }

We produce code like this:

    {
  cg: call "ccall" arg hints:  []  result hints:  [] D();
  goto cg;
    }

Instead of blowing up the compiler at compile time or the program at
runtime.

For switch statements I think blocks are just syntactic sugar. E.g. if
you write

case n: { <stmts> }

it's treated as if you wrote

case n: jmp codeBlock;
...

codeBlock:
    <stmts>

And since your blocks don't terminate we get the behaviour you are seeing.
But I haven't looked at any of the code related to this so it's possible
I got it wrong.

Cheers
Andreas


Am 12/01/2022 um 01:02 schrieb Norman Ramsey:

For testing purposes, I created the following Cmm program:

 myswitch (bits32 n) {
   switch [0 .. 4] n {
 case 0, 1: { foreign "C" A(); }
 case 2: { foreign "C" B(); }
 case 4: { foreign "C" C(); }
 default: { foreign "C" D(); }
   }
   return (666);
 }

In the original C-- specification, it's pretty clear that when, say,
the call to foreign function `A` terminates, the switch statement is
supposed to finish and function `myswitch` is supposed to return 666.
What actually happens in GHC is that this source code is parsed into a
control-flow graph in which execution loops forever, repeating the
call.  The relevant fragment of the prettyprinted CFG looks like this:

{offset
  ca: // global
  _c1::I32 = %MO_XX_Conv_W64_W32(R1);
  //tick src
  switch [0 .. 4] _c1::I32 {
  case 0, 1 : goto c5;
  case 2 : goto c7;
  case 4 : goto c9;
  default: {goto c3;}
  }
  ...
  c5: // global
  //tick src
  _c4::I64 = A;
  call "ccall" arg hints:  []  result hints:  [] (_c4::I64)();
  goto c5;
  ...
}

Surprising, at least to me.

Is this behavior a bug or a feature?  And if it is a feature, can
anyone explain it to me?


Norman

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Alternatives for representing a reverse postorder numbering

2021-12-09 Thread Andreas Klebinger

Hello Norman,

There is no invariant that Cmm control flow is reducible. So we can't
always rely on this being the case.
Depending on what you want to use this for this might or might not matter.

In terms of implementation I think the question is if doing lookups in a
LabelMap is more expensive
than making the CmmGraph representation both more polymorphic, and
putting more info into the Graph.

Which I guess mostly depends on how much mileage we get out of the
numbering, which is impossible for
me to say in advance. If you only need/use this info for a small part of
the pipeline then keeping it as LabelMap
seems more reasonable. If you have plans to improve all sorts of passes
with this information at various stages
in the pipeline integrating it into CmmGraph seems better. It's
impossible to say without knowing the details.

All that being said, I have rarely lost sleep over the overhead of
looking things up in IntMaps.
The constants are pretty good there and it seems reasonably easy to
change it later if needed.
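
For what it's worth, alternative 3 in miniature looks something like the
below to me. The signatures are assumed from your description of
GHC.Cmm.Dataflow.Graph and the Hoopl-style collection classes, so treat it
as an untested sketch:

    -- Number every reachable block by its position in a reverse-postorder
    -- listing, keyed by label in an auxiliary LabelMap.
    rpoNumbers :: NonLocal block => LabelMap (block C C) -> Label -> LabelMap Int
    rpoNumbers blocks entry =
        mapFromList [ (entryLabel b, n)
                    | (b, n) <- zip (revPostorderFrom blocks entry) [1 ..] ]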


Am 06/12/2021 um 22:50 schrieb Norman Ramsey:

Reverse postorder numbering is a superpower for control-flow analysis
and other back-end things.  For example,

   - In a reducible flow graph, a node Q is a loop header if and only if
 it is targeted by an edge P -> Q where Q's reverse postorder
 number is not greater than P's.

   - If a loop has multiple exits, the reverse postorder numbering of
 the exit nodes tells exactly the order in which the nodes must
 appear so they can be reached by multilevel `exit`/`break`
 statements, as are found in WebAssembly.

   - Reverse postorder numbers enable efficient computations of set
 intersection for dominator analysis.

One could go on.

In a perfect world, our representation of control-flow graphs would
provide a place to store a reverse postorder number for each
(reachable) basic block.  Alas, we live in an imperfect world, and I
am struggling to figure out how best to store reverse postorder
numbers for the blocks in a `GenCmmGraph`.

  1. One could apply ideas from "trees that grow" to the `Block` type
 from `GHC.Cmm.Dataflow.Block`.  But this type is already one of
 the more complicated types I have ever encountered in the wild,
 and the thought of further complexity fills me with horror.

  2. One could generalize quite a few types in `Cmm`.  In particular,
 one could create an analog of the `GenCmmGraph` type.  The analog,
 instead of taking a node type `n` as its parameter, would take a
 block type as its parameter.  It would use `Graph'` as defined in
 `GHC.Cmm.Dataflow.Graph`.  This change would ripple into
 `GHC.Cmm.Dataflow` without doing a whole lot of violence to the
 code that it there.  It would then become possible to do dataflow
 analysis (and perhaps other operations) over graphs with annotated
 blocks.

 It's worth noting that the `Graph'` representation already exists,
 but it doesn't seem to be used anywhere.  I'm not sure how it
 survived earlier rounds of culling.

  3. One could simple build an auxiliary `LabelMap` that includes the
 reverse postorder number of every node.  This idea bugs me a bit.
 I don't love spending O(log N) steps twice every time I look to
 see the direction of a control-flow edge.  But what I *really*
 don't love is what happens to the interfaces.  I can compute the
 reverse postorder map twice, or I can pollute my interfaces by
 saying "here is the dominator relation, and by the way, here also
 are the reverse postorder numbers, which I had to compute."

I'm currently using alternative 3: when I need reverse postorder
numbers, I call `revPostorderFrom` (defined in `GHC.Cmm.Dataflow.Graph`),
then zip with `[1..]`.  But I'm really tempted by alternative 2,
which would allow me to, for example, define a graph annotated with
reverse postorder numbers, then do both the dominator analysis and my
translation on that graph.

How deep into the weeds should I go?  Make dataflow analysis even more 
polymorphic?
or learn to love `LabelMap`?


Norman






___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: [Take 2] Unexpected duplicate join points in "Core" output?

2021-11-20 Thread Andreas Klebinger

At this point I think it would be good if you could put your problem
into a GHC ticket.

I can't look into this in greater detail at the moment because of time
constraints,
and without a ticket it's likely to fall by the wayside eventually.
But it does seem like something where we maybe could do better.

And having good examples for the problematic behaviour is always
immensely helpful
to solve these kinds of problems.

Cheers
Andreas

Am 20/11/2021 um 19:54 schrieb Viktor Dukhovni:

On Sat, Nov 20, 2021 at 12:49:08PM +0100, Andreas Klebinger wrote:


For the assembly I opened a ticket:
https://gitlab.haskell.org/ghc/ghc/-/issues/20714

Thanks, much appreciated.  Understood re redundant join points, though
in the non-toy context the redundant join point code is noticeably larger.

 join {
   exit4
 :: Addr# -> Word# -> State# RealWorld -> Maybe (Int64, 
ByteString)
   exit4 (ww4 :: Addr#) (ww5 :: Word#) (ipv :: State# RealWorld)
 = case touch# dt1 ipv of { __DEFAULT ->
   let {
 dt3 :: Int#
 dt3 = minusAddr# ww4 dt } in
   case ==# dt3 dt2 of {
 __DEFAULT -> jump exit1 ww2 wild dt dt1 dt2 cs dt3 ww5;
 1# -> jump $wconsume cs (orI# ww2 dt3) ww5
   }
   } } in
 join {
   exit5
 :: Addr# -> Word# -> State# RealWorld -> Maybe (Int64, 
ByteString)
   exit5 (ww4 :: Addr#) (ww5 :: Word#) (w1 :: State# RealWorld)
 = case touch# dt1 w1 of { __DEFAULT ->
   let {
 dt3 :: Int#
 dt3 = minusAddr# ww4 dt } in
   case ==# dt3 dt2 of {
 __DEFAULT -> jump exit1 ww2 wild dt dt1 dt2 cs dt3 ww5;
 1# -> jump $wconsume cs (orI# ww2 dt3) ww5
   }
   } } in

FWIW, these don't appear to be deduplicated, both result from the same
conditional: `acc < q || acc == q && d < 5`.  I need some way to make
this compute a single boolean value without forking the continuation.

There's a another source of code bloat that I'd like to run by you...
In the WIP code for Lazy ByteString 'readInt', I started with:

   readInt !q !r =
 \ !s -> consume s False 0
   where
 -- All done
 consume s@Empty !valid !acc
 = if valid then convert acc s else Nothing
 -- skip empty chunk
 consume (Chunk (BI.BS _ 0) cs) !valid !acc
-- Recurse
 = consume cs valid acc
 -- process non-empty chunk
 consume s@(Chunk c@(BI.BS _ !len) cs) !valid !acc
 = case _digits q r c acc of
 Result used acc'
 | used <= 0 -- No more digits present
   -> if valid then convert acc' s else Nothing
 | used < len -- valid input not entirely digits
   -> let !c' = BU.unsafeDrop used c
   in convert acc' $ Chunk c' cs
 | otherwise -- try to read more digits
-- Recurse
   -> consume cs True acc'
 Overflow -> Nothing

Now _digits is the I/O loop I shared before, and the calling code gets
inlined into that recursive loop with various join points.  But the loop
gets forked into multiple copies which are compiled separately, because
there are two different recursive calls into "consume" that got compiled
into separate "joinrec { ... }".

So I tried instead:

   readInt !q !r =
 \ !s -> consume s False 0
   where
 -- All done
 consume s@Empty !valid !acc
 = if valid then convert acc s else Nothing
 consume s@(Chunk c@(BI.BS _ !len) cs) !valid !acc
 = case _digits q r c acc of
 Result used acc'
 | used == len -- try to read more digits
-- Recurse
   -> consume cs (valid || used > 0) acc'
 | used > 0 -- valid input not entirely digits
   -> let !c' = BU.unsafeDrop used c
   in convert acc' $ Chunk c' cs
 | otherwise -- No more digits present
   -> if valid then convert acc' s else Nothing
 Overflow -> Nothing

But was slightly surprised to find even more duplication (3 copies
instead of two) of the I/O loop, because in the call:

 consume cs (valid || used > 0) acc'

the boolean argument got floated out, giving:

 case valid of
 True -> consume cs True acc'
 _ -> case used > 0 of
 True -> consume cs True acc'
   

Re: [Take 2] Unexpected duplicate join points in "Core" output?

2021-11-20 Thread Andreas Klebinger

Hello Viktor,

generally GHC does try to common up join points and duplicate
expressions like that.
But since that's relatively expensive, most of the de-duplication happens
during the core-cse pass, which only runs once.

We don't create them because they are harmless; they are simply a side
product of optimizations happening after
the main CSE pass has run. There is no feasible way to fix this I think.
As you say, with some luck they get caught at the Cmm stage and
deduplicated there. Sadly that doesn't always happen. In most cases the
impact of this is thankfully rather
small.

For the assembly I opened a ticket:
https://gitlab.haskell.org/ghc/ghc/-/issues/20714

Am 20/11/2021 um 02:02 schrieb Viktor Dukhovni:

[ Sorry wrong version of attachment in previous message. ]

The below "Core" output from "ghc -O2" (9.2/8.10) for the attached
program shows seemingly rendundant join points:

   join {
 exit :: State# RealWorld -> (# State# RealWorld, () #)
 exit (ipv :: State# RealWorld) = jump $s$j ipv } in

   join {
 exit1 :: State# RealWorld -> (# State# RealWorld, () #)
 exit1 (ipv :: State# RealWorld) = jump $s$j ipv } in

that are identical in all but name.  These correspond to fallthrough
to the "otherwise" case in:

...
| acc < q || (acc == q && d <= 5)
  -> loop (ptr `plusPtr` 1) (acc * 10 + d)
| otherwise -> return Nothing

but it seems that the generated X86_64 code (also below) ultimately
consolidates these into a single target... Is that why it is harmless to
leave these duplicated in the generated "Core"?

[ Separately, in the generated machine code, it'd also be nice to avoid
   comparing the same "q" with the accumulator twice.  A single load and
   compare should I think be enough, as I'd expect the status flags to
   persist across the jump the second test.

   This happens to not be performance critical in my case, because most
   calls should satisfy the first test, but generally I think that 3-way
   "a < b", "a == b", "a > b" branches ideally avoid comparing twice... ]

 Associated Core output

 -- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
 main2 :: Addr#
 main2 = "12345678901234567890 junk"#

 -- RHS size: {terms: 129, types: 114, coercions: 0, joins: 6/8}
 main1 :: State# RealWorld -> (# State# RealWorld, () #)
 main1
   = \ (eta :: State# RealWorld) ->
   let {
 end :: Addr#
 end = plusAddr# main2 25# } in
   join {
 $s$j :: State# RealWorld -> (# State# RealWorld, () #)
 $s$j _ = hPutStr2 stdout $fShowMaybe4 True eta } in
   join {
 exit :: State# RealWorld -> (# State# RealWorld, () #)
 exit (ipv :: State# RealWorld) = jump $s$j ipv } in
   join {
 exit1 :: State# RealWorld -> (# State# RealWorld, () #)
 exit1 (ipv :: State# RealWorld) = jump $s$j ipv } in
   join {
 exit2
   :: Addr# -> Word# -> State# RealWorld -> (# State# RealWorld, () 
#)
 exit2 (ww :: Addr#) (ww1 :: Word#) (ipv :: State# RealWorld)
   = case eqAddr# ww main2 of {
   __DEFAULT ->
 hPutStr2
   stdout
   (++
  $fShowMaybe1
  (case $w$cshowsPrec3 11# (integerFromWord# ww1) [] of
   { (# ww3, ww4 #) ->
   : ww3 ww4
   }))
   True
   eta;
   1# -> jump $s$j ipv
 } } in
   joinrec {
 $wloop
   :: Addr# -> Word# -> State# RealWorld -> (# State# RealWorld, () 
#)
 $wloop (ww :: Addr#) (ww1 :: Word#) (w :: State# RealWorld)
   = join {
   getDigit :: State# RealWorld -> (# State# RealWorld, () #)
   getDigit (eta1 :: State# RealWorld)
 = case eqAddr# ww end of {
 __DEFAULT ->
   case readWord8OffAddr# ww 0# eta1 of { (# ipv, ipv1 #) 
->
   let {
 ipv2 :: Word#
 ipv2 = minusWord# (word8ToWord# ipv1) 48## } in
   case gtWord# ipv2 9## of {
 __DEFAULT ->
   case ltWord# ww1 1844674407370955161## of {
 __DEFAULT ->
   case ww1 of {
 __DEFAULT -> jump exit ipv;
 1844674407370955161## ->
   case leWord# ipv2 5## of {
 __DEFAULT -> jump exit1 ipv;
 1# ->
 

Re: DWARF support

2021-11-17 Thread Andreas Klebinger


Am 17/11/2021 um 17:08 schrieb Richard Eisenberg:

For windows we have Portable Executable (PE) as the container format.

This implies that the DWARF work is (unsurprisingly) completely inapplicable 
for Windows.

It's not quite as simple. DWARF info can be embedded into Windows
executables/libraries. Tools that understand DWARF can read that
information and use it.
However, many tools on Windows don't understand DWARF debugging
information since Windows has its own debugging format.

So it's not quite as helpful as on Linux, where everyone agreed to use
DWARF. But it can still be used for some things.

E.g. there are debuggers on Windows (lldb, gdb) that can read and use
this information.

Further, we could in theory start emitting the same information in the
Windows-native format. In that case we could re-use much of the
work that went into GHC to allow us to collect debugging information.
Putting it into the right format comes fairly late in the pipeline and
all steps up to that could be shared.

Cheers

Andreas

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: 8-bit and 16-bit arithmetic

2021-10-28 Thread Andreas Klebinger

I think Carter still has it right that it happens in the backends.

If a new backend doesn't support these we could move this up into Cmm
though without much issue I think.
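
For illustration, the rewrite in question would look roughly like the
below at the Cmm level (a sketch only; module and constructor names from
memory, untested):

    import GHC.Cmm.Expr   (CmmExpr (..))
    import GHC.Cmm.MachOp (MachOp (..))
    import GHC.Cmm.Type   (Width (..))

    -- On a target with no native 8-bit add: zero-extend both operands,
    -- add at 32 bits, then truncate the result back to 8 bits.
    widenAdd8 :: CmmExpr -> CmmExpr -> CmmExpr
    widenAdd8 x y = narrow (CmmMachOp (MO_Add W32) [widen x, widen y])
      where
        widen e  = CmmMachOp (MO_UU_Conv W8 W32) [e]
        narrow e = CmmMachOp (MO_UU_Conv W32 W8) [e]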

Am 28/10/2021 um 23:12 schrieb Carter Schonwald:

I think thats done on a per backend basis (though theres been a lot of
changes since i last looked at some of the relevent pieces). (i'm
actually based in Cambridge MA for the next 1-2 years if you wanna
brain storm IRL sometime)

On Thu, Oct 28, 2021 at 4:59 PM Norman Ramsey  wrote:

On x86, GHC can translate 8-bit and 16-bit operations directly
into the 8-bit and 16-bit machine instructions that the hardware
supports.  But there are other platforms on which the smallest
unit of arithmetic may be 32 or even 64 bits.  Is there a central
module in GHC that can take care of rewriting 8-bit and 16-bit
operations
into 32-bit or 64-bit operations?  Or is each back end on its own
for this?

(One of my students did some nice work on implementing this
transformation
with a minimal set of sign-extension and zero-extension operations:
https://www.cs.tufts.edu/~nr/pubs/widen.pdf.)


Norman
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: help validating a modified compiler?

2021-10-12 Thread Andreas Klebinger

I tried it myself and validate fails locally as well.

I've created a ticket here:
https://gitlab.haskell.org/ghc/ghc/-/issues/20506

Am 11/10/2021 um 22:27 schrieb Norman Ramsey:

  > Speaking for myself: I have not validated locally for quite a while. I just
  > rely on CI.

I've confirmed that a fresh checkout doesn't validate.  Is anyone else
willing to try?  If it's a problem that only I have, I'm reluctant to
open an issue.


  > You can mark an MR as a "Draft" to avoid triggering a review.

How is it so marked?  Put the word "Draft" in the title?

  > Instead of validating locally, I tend to just run the testsuite on the
  > built GHC.

I'll give that a try, thanks.


N

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Why can't arguments be levity polymorphic for inline functions?

2021-10-08 Thread Andreas Klebinger

Hey Clinton,

I think the state of things is best summarised as: it's in principle
possible to implement, but it's unclear how best to do so,
or even whether it's worth having this feature at all.

The biggest issue being code bloat.
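
For concreteness, this is the kind of function we're talking about (a
hypothetical example, not from the user guide; the exact error wording
varies by version). GHC rejects it today because the representation of the
argument is not fixed:

    {-# LANGUAGE ExplicitForAll, KindSignatures, PolyKinds #-}
    import GHC.Exts (RuntimeRep, TYPE)

    -- Rejected: 'x' is a binder whose representation is unknown, so there is
    -- no single calling convention to compile 'applyRep' with.
    applyRep :: forall (r :: RuntimeRep) (a :: TYPE r) b. (a -> b) -> a -> b
    applyRep f x = f x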

As you say, a caller could create its own version of the function with
the right kind of argument type.
But that means duplicating the function for every use site (although
some might be able to be commoned up), potentially causing a lot of code
bloat and compile-time overhead.

In a similar fashion we could create each potential version we need from
the get go to avoid duplicating the same function.
But that runs the risk of generating far more code than what is actually
used.

Last but not least, GHC currently doesn't always load unfoldings. In
particular, if you compile a module with optimizations disabled the RHS
is currently *not*
available to the compiler when looking at the use site. There already is
a mechanism in GHC to bypass this, where a function *must* be inlined
(compulsory unfoldings),
but it's currently reserved for built-in functions. Sure, we could just make
INLINE bindings compulsory if they have levity-polymorphic arguments. But
again it's not clear
this is really desirable.

I don't think this has to mean we couldn't change how things work to
accommodate levity-polymorphic arguments. It just seems unclear what
a good design
would look like and if it's worth having.

Cheers
Andreas

Am 08/10/2021 um 01:36 schrieb Clinton Mead:

Hi All

Not sure if this belongs in ghc-users or ghc-devs, but it seemed devy
enough to put it here.

Section 6.4.12.1

of the GHC user manual points out, if we allowed levity polymorphic
arguments, then we would have no way to compile these functions,
because the code required for different levites is different.

However, if such a function is {-# INLINE #-} or{-# INLINABLE #-}
there's no need to compile it as it's full definition is in the
interface file. Callers can just compile it themselves with the levity
they require. Indeed callers of inline functions already compile their
own versions even without levity polymorphism (for example, presumably
inlining function calls that are known at compile time).

The only sticking point to this that I could find was that GHC will
only inline the function if it is fully applied
,
which suggests that the possibility of partial application means we
can't inline and hence need a compiled version of the code. But this
seems like a silly restriction, as we have the full RHS of the
definition in the interface file. The caller can easily create and
compile it's own partially applied version. It should be able to do
this regardless of levity.

It seems to me we're okay as long as the following three things aren't
true simultaneously:

1. Blah has levity polymorphic arguments
2. Blah is exported
3. Blah is not inline

If a function "Blah" is not exported, we shouldn't care about levity
polymorphic arguments, because we have it's RHS on hand in the current
module and compile it as appropriate. And if it's inline, we're
exposing it's full RHS to other callers so we're still fine also. Only
when these three conditions combine should we give an error, say like:

"Blah has levity polymorphic arguments, is exported, and is not
inline. Please either remove levity polymorphic arguments, not export
it or add an {-# INLINE #-} or {-# INLINABLE #-} pragma.

I presume however there are some added complications that I don't
understand, and I'm very interested in what they are as I presume
they'll be quite interesting.

Thanks,
Clinton


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Perf backtrace support.

2021-10-07 Thread Andreas Klebinger

Hello Devs,

as some of you know I've recently been working on #8272.
The goal being to use the machine stack register for the STG stack as a
means to
get perf backtraces. I've succeeded in making a branch that works for
the first
part but have so far been unable to get perf to generate proper back traces.

For various reasons I will stop looking at that particular issue. So if
anyone feels
interested in figuring out where the interaction between our dwarf info
and perf
unwinding the stack goes wrong please take a look! There is more
information on
the ticket.

Cheers

Andreas


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: On CI

2021-03-24 Thread Andreas Klebinger

> What about the case where the rebase *lessens* the improvement? That
is, you're expecting these 10 cases to improve, but after a rebase, only
1 improves. That's news! But a blanket "accept improvements" won't tell you.

I don't think that scenario currently triggers a CI failure. So this
wouldn't really change.

As I understand it the current logic is:

* Run tests
* Check if any cross the metric thresholds set in the test.
* If so check if that test is allowed to cross the threshold.

I believe we don't check that all benchmarks listed with an expected
in/decrease actually do so.
It would also be hard to do so reasonably without making it even harder
to push MRs through CI.

Andreas

Am 24/03/2021 um 13:08 schrieb Richard Eisenberg:

What about the case where the rebase *lessens* the improvement? That is, you're expecting 
these 10 cases to improve, but after a rebase, only 1 improves. That's news! But a 
blanket "accept improvements" won't tell you.

I'm not hard against this proposal, because I know precise tracking has its own 
costs. Just wanted to bring up another scenario that might be factored in.

Richard


On Mar 24, 2021, at 7:44 AM, Andreas Klebinger  wrote:

After the idea of letting marge accept unexpected perf improvements and
looking at https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4759
which failed because of a single test, for a single build flavour
crossing the
improvement threshold where CI fails after rebasing I wondered.

When would accepting an unexpected perf improvement ever backfire?

In practice I either have a patch that I expect to improve performance
for some things
so I want to accept whatever gains I get. Or I don't expect improvements
so it's *maybe*
worth failing CI for in case I optimized away some code I shouldn't or
something of that
sort.

How could this be actionable? Perhaps having a set of indicators for CI like
"Accept allocation decreases"
"Accept residency decreases"

Would be saner. I have personally *never* gotten value out of the
requirement
to list the individual tests that improve. Usually a whole lot of them do.
Some cross
the threshold so I add them. If I'm unlucky I have to rebase and a new
one might
make it across the threshold.

Being able to accept improvements (but not regressions) wholesale might be a
reasonable alternative.

Opinions?


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: On CI

2021-03-24 Thread Andreas Klebinger

After the idea of letting marge accept unexpected perf improvements and
looking at https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4759
which failed because of a single test, for a single build flavour
crossing the
improvement threshold where CI fails after rebasing I wondered.

When would accepting an unexpected perf improvement ever backfire?

In practice I either have a patch that I expect to improve performance
for some things
so I want to accept whatever gains I get. Or I don't expect improvements
so it's *maybe*
worth failing CI for in case I optimized away some code I shouldn't or
something of that
sort.

How could this be actionable? Perhaps having a set of indicators for CI like
"Accept allocation decreases"
"Accept residency decreases"

Would be saner. I have personally *never* gotten value out of the
requirement
to list the individual tests that improve. Usually a whole lot of them do.
Some cross
the threshold so I add them. If I'm unlucky I have to rebase and a new
one might
make it across the threshold.

Being able to accept improvements (but not regressions) wholesale might be a
reasonable alternative.

Opinions?

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: GHC 8.10 backports?

2021-03-24 Thread Andreas Klebinger

Yes, only changing the rule, without including the string changes, did
indeed cause regressions. I don't think it's worth
having one without the other.

But it seems you already backported this?
See https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5263

Cheers
Andreas

Am 22/03/2021 um 07:02 schrieb Moritz Angermann:

The commit message from
https://gitlab.haskell.org/ghc/ghc/-/commit/f10d11fa49fa9a7a506c4fdbdf86521c2a8d3495
,

makes the changes to string seem required. Applying the commit on its
own doesn't apply cleanly and pulls in quite a
bit of extra dependent commits. Just applying the elem rules appears
rather risky. Thus will I agree that having that
would be a nice fix to have, the amount of necessary code changes
makes me rather uncomfortable for a minor release :-/

On Mon, Mar 22, 2021 at 1:58 PM Gergő Érdi <ge...@erdi.hu> wrote:

Thanks, that makes it less appealing. In the original thread, I
got no further replies after my email announcing my "discovery" of
that commit, so I thought that was the whole story.

On Mon, Mar 22, 2021, 13:53 Viktor Dukhovni <ietf-d...@dukhovni.org> wrote:

On Mon, Mar 22, 2021 at 12:39:28PM +0800, Gergő Érdi wrote:

> I'd love to have this in a GHC 8.10 release:
>
https://mail.haskell.org/pipermail/ghc-devs/2021-March/019629.html


This is already in 9.0, 9.2 and master, but it is a rather
non-trivial
change, given all the new work that went into the String
case.  So I am
not sure it is small/simple enough to make for a compelling
backport.

There's a lot of recent activity in this space.  See also
>,
which is not
yet merged into master, and might still be eta-reduced one
more step).

I don't know whether such optimisation tweaks (not a bugfix)
are in
scope for backporting, we certainly need to be confident
they'll not
cause any new problems.  FWIW, 5259 is dramatically simpler...

Of course we also have
> in
much the
same territory, but there we're still blocked on someone
figuring out
what's going on with the 20% compile-time hit with T13056, and
whether
that's acceptable or not...

--
    Viktor.
___
ghc-devs mailing list
ghc-devs@haskell.org 
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


___
ghc-devs mailing list
ghc-devs@haskell.org 
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: On CI

2021-03-17 Thread Andreas Klebinger

> I'd be quite happy to accept a 25% regression on T9872c if it yielded
a 1% improvement on compiling Cabal. T9872 is very very very strange!
(Maybe if *all* the T9872 tests regressed, I'd be more worried.)

While I fully agree with this, we should *always* want to know if a
small synthetic benchmark regresses by a lot.
Or in other words: we don't want CI to ever accept such a regression for us,
but the developer of a patch should need to explicitly OK it.

Otherwise we just slow down a lot of seldom-used code paths by a lot.

Now that isn't really an issue anyway I think. The question is rather:
is 2% a large enough regression to worry about? 5%? 10%?

Cheers,
Andreas

Am 17/03/2021 um 14:39 schrieb Richard Eisenberg:




On Mar 17, 2021, at 6:18 AM, Moritz Angermann <moritz.angerm...@gmail.com> wrote:

But what do we expect of patch authors? Right now if five people
write patches to GHC, and each of them eventually manage to get their
MRs green, after a long review, they finally see it assigned to
marge, and then it starts failing? Their patch on its own was fine,
but their aggregate with other people's code leads to regressions? So
we now expect all patch authors together to try to figure out what
happened? Figuring out why something regressed is hard enough, and we
only have a very few people who are actually capable of debugging
this. Thus I believe it would end up with Ben, Andreas, Matthiew,
Simon, ... or someone else from GHC HQ anyway to figure out why it
regressed, be it in the Review Stage, or dissecting a marge
aggregate, or on master.


I have previously posted against the idea of allowing Marge to accept
regressions... but the paragraph above is sadly convincing. Maybe
Simon is right about opening up the windows to, say, be 100% (which
would catch a 10x regression) instead of infinite, but I'm now
convinced that Marge should be very generous in allowing regressions
-- provided we also have some way of monitoring drift over time.

Separately, I've been concerned for some time about the peculiarity of
our perf tests. For example, I'd be quite happy to accept a 25%
regression on T9872c if it yielded a 1% improvement on compiling
Cabal. T9872 is very very very strange! (Maybe if *all* the T9872
tests regressed, I'd be more worried.) I would be very happy to learn
that some more general, representative tests are included in our
examinations.

Richard

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Changes to performance testing?

2021-02-22 Thread Andreas Klebinger

This seems quite reasonable to me.
Not sure about the cost of implementing it (and the feasibility of it
if/when merge-trains arrive).

Andreas

Am 21/02/2021 um 21:31 schrieb Richard Eisenberg:




On Feb 21, 2021, at 11:24 AM, Ben Gamari <b...@well-typed.com> wrote:

To mitigate this I would suggest that we allow performance test failures
in marge-bot pipelines. A slightly weaker variant of this idea would
instead only allow performance *improvements*. I suspect the latter
would get most of the benefit, while eliminating the possibility that a
large regression goes unnoticed.


The value in making performance improvements a test failure is so that
patch authors can be informed of what they have done, to make sure it
matches expectations. This need can reasonably be satisfied without
stopping merging. That is, if Marge can accept performance
improvements, while (say) posting to each MR involved that it may have
contributed to a performance improvement, then I think we've done our
job here.

On the other hand, a performance degradation is a bug, just like, say,
an error message regression. Even if it's a combination of commits
that cause the problem (an actual possibility even for error message
regressions), it's still a bug that we need to either fix or accept
(balanced out by other improvements). The pain of debugging this
scenario might be mitigated if there were a collation of the
performance wibbles for each individual commit. This information is,
in general, available: each commit passed CI on its own, and so it
should be possible to create a little report with its rows being perf
tests and its columns being commits or MR #s; each cell in the table
would be a percentage regression. If we're lucky, the regression Marge
sees will be the sum(*) of the entries in one of the rows -- this
means that we have a simple agglomeration of performance degradation.
If we're less lucky, the whole will not equal the sum of the parts,
and some of the patches interfere. In either case, the table would
suggest a likely place to look next.

(*) I suppose if we're recording percentages, it wouldn't necessarily
be the actual sum, because percentages are a bit funny. But you get my
meaning.
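(Concretely: two commits that each regress a test by 10% combine to
1.10 * 1.10 - 1 = 21%, not 20%, so the row entries only approximately sum
to the total.)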

Pulling this all together:
* I'm against the initial proposal of allowing all performance
failures by Marge. This will allow bugs to accumulate (in my opinion).
* I'm in favor of allowing performance improvements to be accepted by
Marge.
* To mitigate against the information loss of Marge accepting
performance improvements, it would be great if Marge could alert MR
authors that a cumulative performance improvement took place.
* To mitigate against the annoyance of finding a performance
regression in a merge commit that does not appear in any component
commit, it would be great if there were a tool to collect performance
numbers from a set of commits and present them in a table for further
analysis.

These "mitigations" might take work. If labor is impossible to produce
to complete this work, I'm in favor of simply allowing the performance
improvements, maybe also filing a ticket about these potential
improvements to the process.

Richard

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Benchmarking experiences: Cabal test vs compiling nofib/spectral/simple/Main.hs

2021-01-20 Thread Andreas Klebinger

Hello Devs,

When I started to work on GHC a few years back the Wiki recommended
using nofib/spectral/simple/Main.hs as
a test case for compiler performance changes. I've been using this ever
since.
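Concretely that boils down to something like
`perf stat ghc -O -fforce-recomp nofib/spectral/simple/Main.hs +RTS -s`,
i.e. a single-module compile whose instruction counts and allocations are
easy to compare before and after a change.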

"Recently" the cabal-test (compiling cabal-the-library) has become sort
of a default benchmark for GHC performance.
I've used the Cabal test as well and it's probably a better test case
than nofib/spectral/simple/Main.hs.
I've started using both: usually spectral/simple to benchmark
intermediate changes, and then looking
at the cabal test for the final patch at the end. So far I have rarely
seen a large
difference between using cabal or spectral/simple. Sometimes the
magnitude of the effect was different
between the two, but I've never seen one regress/improve while the other
didn't.

Since the topic came up recently in a discussion I wonder if others use
similar means to quickly bench ghc changes
and what your experiences were in terms of simpler benchmarks being
representative compared to the cabal test.

Cheers,
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Fwd: Restricted sums in BoxedRep

2020-10-15 Thread Andreas Klebinger

From a implementors perspective my main questions would be:

* How big is the benefit in practice? How many use cases are there?
* How bad are the costs? (Runtime overhead, rts complexity, ...)

The details of how this would be exposed to a user are important.
But if the costs are too high relative to the benefits then it becomes a moot
point.

David Feuer wrote on 14.10.2020 at 22:21:

Forwarded from Andrew Martin below. I think we want more than just
Maybe (more than one null), but the nesting I described is certainly
more convenience than necessity.

-- Forwarded message -
From: *Andrew Martin* mailto:andrew.thadd...@gmail.com>>
Date: Wed, Oct 14, 2020, 4:14 PM
Subject: Re: Restricted sums in BoxedRep
To: David Feuer mailto:david.fe...@gmail.com>>


You'll have to forward this to the ghc-devs list to share it with
others since I'm not currently subscribed to it, but I've had this
same thought before. It is discussed at
https://github.com/andrewthad/impure-containers/issues/12. Here's the
relevant excerpt:

Relatedly, I was thinking the other day that after finishing
implementing

https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0203-pointer-rep.rst,
I should really look at seeing if it's possible to add this
maybe-of-a-lifted value trick straight to GHC. I think that with:

    data RuntimeRep = BoxedRep Levity | MaybeBoxedRep Levity | IntRep | ...

    data BuiltinMaybe :: forall (v :: Levity). TYPE v -> TYPE ('MaybeBoxedRep v)

This doesn't have the nesting issues because the kind system
prevents nesting. But anyway, back to the original question. I
would recommend not using `Maybe.Unsafe` and using
`unpacked-maybe` instead. The latter is definitely safe, and it
only costs an extra machine word of space in each data constructor
it gets used in, and it doesn't introduce more indirections.


On Tue, Oct 13, 2020 at 5:47 PM David Feuer mailto:david.fe...@gmail.com>> wrote:

Null pointers are widely known to be a lousy language feature in
general, but there are certain situations where they're *really*
useful for compact representation. For example, we define

    newtype TMVar a = TMVar (TVar (Maybe a))

We don't, however, actually use the fact that (Maybe a) is lifted.
So we could represent this much more efficiently using something like

    newtype TMVar a = TMVar (TVar a)

where Nothing is represented by a distinguished "null" pointer.

While it's possible to implement this sort of thing in user code
(with lots of fuss and care), it's not very nice at all. What I'd
really like to be able to do is represent certain kinds of sums
like this natively.
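For concreteness, a rough self-contained sketch of the kind of user-code
workaround alluded to above (illustrative only: all names are made up, it
leans on unsafeCoerce and pointer equality, and nesting two of these
defeats it -- which is part of the fuss and care):

    {-# LANGUAGE MagicHash #-}
    module NullableSketch (Nullable, none, some, matchNullable) where

    import GHC.Exts (Any, reallyUnsafePtrEquality#, isTrue#)
    import Unsafe.Coerce (unsafeCoerce)

    -- one private sentinel object standing in for "Nothing"
    data Sentinel = Sentinel

    nullSentinel :: Any
    nullSentinel = unsafeCoerce Sentinel
    {-# NOINLINE nullSentinel #-}  -- must remain a single shared closure

    newtype Nullable a = Nullable Any

    none :: Nullable a
    none = Nullable nullSentinel

    some :: a -> Nullable a
    some x = Nullable (unsafeCoerce x)

    matchNullable :: b -> (a -> b) -> Nullable a -> b
    matchNullable def k (Nullable x)
      | isTrue# (reallyUnsafePtrEquality# x nullSentinel) = def
      | otherwise                                         = k (unsafeCoerce x)

Native support would avoid exactly this kind of trickery.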

Now that we're getting BoxedRep, I think we can probably make it
happen. The trick is to add a special Levity constructor
representing sums of particular shapes. Specifically, we can
represent a type like this if it is a possibly-nested sum which,
when flattened into a single sum, consists of some number of
nullary tuples and at most one Lifted or Unlifted type.  Then we
can have (inline) primops to convert between the BoxedRep and the
sum-of-sums representations.

Anyone have thoughts on details for what the Levity constructor
arguments might look like?



--
-Andrew Thaddeus Martin


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs




Re: Fixing type synonyms to Uniq(D)FM newtypes

2020-06-23 Thread Andreas Klebinger

My main motivation for going with a phantom type over newtypes was that
it makes it easier to use in
an adhoc fashion without giving up type safety.

As a second benefit it seemed a lot easier to implement.
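For concreteness, a minimal self-contained sketch of the phantom-key idea
(Uniquable/getKeyInt here are stand-ins for GHC's real Uniquable/getUnique
machinery, and all the names are made up, not the actual patch):

    module UniqFMSketch where

    import qualified Data.IntMap.Strict as IM

    class Uniquable k where
      getKeyInt :: k -> Int          -- stand-in for getKey . getUnique

    -- 'key' is a phantom: only its Int key is stored, but the type
    -- records (and enforces) what the keys were
    newtype UniqFM key elt = UniqFM (IM.IntMap elt)

    emptyUFM :: UniqFM key elt
    emptyUFM = UniqFM IM.empty

    addToUFM :: Uniquable key => UniqFM key elt -> key -> elt -> UniqFM key elt
    addToUFM (UniqFM m) k v = UniqFM (IM.insert (getKeyInt k) v m)

    lookupUFM :: Uniquable key => UniqFM key elt -> key -> Maybe elt
    lookupUFM (UniqFM m) k = IM.lookup (getKeyInt k) m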

Cheers
Andreas



George Colpitts wrote on 24.06.2020 at 00:40:

I read the email thread you refer to but it doesn't seem to explain
why you went with solution 2. If you think it worthwhile can you
explain here why you chose solution 2?

On Tue, Jun 23, 2020 at 6:55 PM Andreas Klebinger
mailto:klebinger.andr...@gmx.at>> wrote:

There was a discussion about making UniqFM typed for the keys here a
while ago.
(https://mail.haskell.org/pipermail/ghc-devs/2020-January/018451.html
and following)

I wrote up an MR for one possible approach here:
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/3577

It implements solution 2 from that discussion.

Just while getting the patch to typecheck I've already seen a
number of
cases where this increased
readability of the code quite a bit so I think it's a good
improvement.

If there are strong objections to this solution let me know. In that
case I'm happy to abandon the patch.
If not I will clean it up and get it ready for merging.


___
ghc-devs mailing list
ghc-devs@haskell.org <mailto:ghc-devs@haskell.org>
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs





Fixing type synonyms to Uniq(D)FM newtypes

2020-06-23 Thread Andreas Klebinger

There was a discussion about making UniqFM typed for the keys here a
while ago.
(https://mail.haskell.org/pipermail/ghc-devs/2020-January/018451.html
and following)

I wrote up an MR for one possible approach here:
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/3577

It implements solution 2 from that discussion.

Just while getting the patch to typecheck I've already seen a number of
cases where this increased
readability of the code quite a bit so I think it's a good improvement.

If there are strong objections to this solution let me know. In that
case I'm happy to abandon the patch.
If not I will clean it up and get it ready for merging.


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: profiling a constructor?

2020-05-23 Thread Andreas Klebinger

Surely, the RTS must be able to count the number of times a constructor is used.

It can't. For one there are different kinds of "uses" for constructors.
* Allocations - They might be dead by the time we GC the nursery, so the
RTS never gets to see them.
* Accessing the constructor? That's even harder to track.
* The constructor being present during GC? One can do this using heap
profiling (as Ben described).

There are also top level constructors which currently don't generate
code at all (just static data).

So currently there is no such feature.

For allocations in particular we could implement one on top of the ticky
profiler.
It operates on the STG => Cmm boundary, so it doesn't affect Core optimizations.

There we could for every runtime constructor allocation emit code which
will bump a counter for that specific constructor.

I imagine this wouldn't be that hard either. But it does require
modification of the compiler.

Cheers,
Andreas



Richard Eisenberg wrote on 23.05.2020 at 14:58:

Hi devs,

Is there a way to count the number of times a particular constructor is 
allocated? I might want to know, say, the total number of cons cells allocated 
over the course of a program, or (in my actual case) the total number of FunTy 
cells allocated over the course of a program.

I can try to do this with an SCC. But it's very clunky:
* The constructor may be used in a number of places; each such place would need 
the SCC.
* SCCs can interfere with optimizations. In my use case, this would negate the 
usefulness of the exercise entirely, as I think some of the structures I wish 
to observe should never come into being, due to optimizations (e.g. 
case-of-known-constructor after inlining).
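For concreteness, the per-use-site SCC approach looks something like this
(a made-up wrapper, just to show the shape):

    -- every (:) allocation we want counted has to be routed through
    -- an annotated wrapper like this, at every use site
    countedCons :: a -> [a] -> [a]
    countedCons x xs = {-# SCC "cons_alloc" #-} (x : xs)

and the wrapper itself is exactly the sort of thing that can get in the way
of the optimizations I want to observe.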

Surely, the RTS must be able to count the number of times a constructor is 
used. But is there any way to access such a feature? If others agree that this 
would sometimes be helpful, perhaps we can build the feature. Now is not the 
first time I've wanted this.

Thanks!
Richard
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs




Re: Removal of UniqMap module

2020-05-10 Thread Andreas Klebinger

My two cents:


I needed it today and now I am unsure what the
suitable replacement is.


If you need it (inside GHC) just re-add it.

If you think it should be kept as part of GHC's API, re-add it and add a
comment stating such.

IIRC it was removed since no one used it, not because anything was wrong
with it design-wise.


As a general point, please can we stop with these annoying
refactorings which delete unused code


In general I think this kind of refactoring is needed and fine.
But it does feel like people have become a bit overzealous in that regard
recently.

Cheers
Andreas

Richard Eisenberg wrote on 10.05.2020 at 14:43:



On May 10, 2020, at 11:22 AM, Matthew Pickering  
wrote:

Hi,

I noticed that the UniqMap module was removed from the tree

See 1c7c6f1afc8e7f7ba5d256780bc9d5bb5f3e7601

Why was it removed?

 From the commit message: "This module isn't used anywhere in GHC." That seems 
like a good reason to remove, to me. While I can understand the frustration at having 
this disappear when you need it, we can't quite just keep whole unused modules around in 
the hope that someone someday will use them.


I needed it today and now I am unsure what the
suitable replacement is.

A UniqFM whose range includes the domain element would work fine, I think.
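For illustration, something along these lines (names made up; UniqFM is the
unique-keyed map GHC already has, which only stores the key's Unique):

    newtype UniqMapish k v = UniqMapish (UniqFM (k, v))

    lookupUniqMapish :: Uniquable k => UniqMapish k v -> k -> Maybe v
    lookupUniqMapish (UniqMapish m) k = snd <$> lookupUFM m k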


As a general point, please can we stop with these annoying
refactorings which delete unused code

I disagree here. A codebase as large and sprawling as GHC's needs constant 
pruning. The alternative is not to control the sprawl, and that seems 
considerably worse than refactorings and churn.

Richard
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs




Re: [RFC] Compiler pipeline timings visualization

2020-05-01 Thread Andreas Klebinger

Hi Sergej,

I think this is a good idea in general, and it seems you did some great
work there already.
Something like this can also help with pinpointing performance issues
inside the compiler,
so it would not just be useful to end users.

Intuitively I would assume that instead of adding another way
to produce compiler events we should:
* Ship GHC with eventlog enabled by default
* Provide an official converter from eventLog to the chrome trace format.

The eventlog format is quite flexible, and if it's unsuitable to what
you want I would
prefer to extend it rather than adding support for additional formats
inside GHC itself.

This way we:
* Continue to have one unified way to dump events from haskell apps (the
eventlog)
* Users need not go to third party apps, as the converter could
reasonably come with GHC (like hp2ps)
* We are free to include information that can't (easily) be encoded in
the chrome format.

The downside is that users will have to invoke ghc, followed by some
other tool to get the
chrome trace, but to me that seems like a worthwhile tradeoff for
keeping the compiler and
various visualization formats somewhat separated.

The obvious question there is how much enabling the eventlog by default
would impact non-logging ghc
invocations. I have not measured this and it might rule out this
approach if it has a big impact and isn't
easily corrected.

As a last point I want to encourage you to open a ticket about this.
Mailing list threads tend to be harder to follow and find down the line
than tickets, in my experience.

Cheers,
Andreas

Sergej Jaskiewicz via ghc-devs wrote on 01.05.2020 at 11:09:

tl;dr: I propose adding a new GHC flag for generating a log that allows
visualizing how much time each stage in the compiler pipleline took, similar
to Clang's -ftime-trace.

Hello, everyone.

I'd like to tell you about a feature I've been working on recently, namely,
the ability to generate and visualize how long each stage of the compilation
pipeline takes to complete. This is basically about the same as
the -ftime-trace flag that has landed in Clang several months ago [1].

The initial motivation for this feature was the desire to understand why
compilation times are so long when using LLVM as backend. But later I realized
that this functionality is useful on its own for end users, not just GHC devs,
so it would make sense to add a new flag -ddump-time-trace.

Since not only does Haskell have a complex type system, but there is also
a variety of language extensions, we, the Haskell users, often experience
very long compilation times. For instance, the usage of the TypeFamilies
extension can considerably slow down the compilation process [2].
It is useful to understand how you can fix your code so that it compiles faster.

There are two options for that right now:
- Building GHC from source for profiling and using the just-built GHC for
   compiling your problem code.
- Building the compiler from source with event log support [3].

The former option just doesn't do it, since the output would be
"which GHC function calls took how much time", but there'd be no information
about which part of the user code was being compiled.

The latter option is much closer to what we need. If we link the GHC executable
with the -eventlog flag, then various significant events will be written to
a special log file. For example, "Parser began parsing the Main.hs file after
5 ms since GHC has started, and ended parsing it 3 ms after that".
The resulting log file can be parsed with the ghc-events library [4], and also
visualized using the ThreadScope app [5].

But this approach has its disadvantages.

Firstly, if the user wants visualization, they'd have to install ThreadScope.
Some companies' internal policies prohibit installing third-party apps from
the internet. It would be good if we could generate a log that could be
visualized on any computer with a browser. For that we could use
the Chrome Trace Event format [6]. This is an ordinary JSON file with a specific
structure that can be viewed in the Chrome browser by going to
the chrome://tracing page, or on https://speedscope.app. A file in exactly this
format would be generated by Clang if you passed it the -ftime-trace flag.

Secondly, the event log contains many events that are not relevant to the end
user, for example, thread creation, memory allocation, etc.

As an initial proof of concept, I've developed a command line tool for
transforming event log files to Chrome Trace Event files [7].

Thirdly, in order for the event log to be generated, you'd still have to build
GHC from source. The majority of the GHC users won't go this way. Not only
would it require some basic understanding of the GHC building process, but also
building itself takes quite some time. It would be great if the needed
functionality came with GHC out of the box.

This is why I've added support for generating Trace Event files into my fork
of GHC [8], and I would like to propose including this 

Targeting old Windows versions

2020-04-08 Thread Andreas Klebinger

Hello devs,

GHC is planning to use the Large Address Space mode of allocation for
future releases on Windows.
See https://gitlab.haskell.org/ghc/ghc/issues/12576

This is a significant optimization for the GC and well tested as we use
it on Linux already.

However it will regress memory usage on versions of Windows *older*
than Windows 8.1/Server 2012.

Please let us know if you are targeting older versions than that either
by responding to the mailing list,
commenting on the ticket or contacting me directly if you have privacy
concerns.

Depending on how many people are affected by this change we might
consider measures to reduce the impact.

Cheers,
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Measuring compiler performance

2020-04-06 Thread Andreas Klebinger

Hi Simon,

things I do to measure performance:

* compile nofib/spectral/simple/Main.hs, look at instructions (perf) and
allocations/time (+RTS -s)
* compile nofib as a whole (Use NoFibRuns=0 to avoid running the
benchmarks). Look at compile time/allocations.
* compile Cabal the library (cd cabal-head/Cabal && ghc Setup.hs
-fforce-recomp). Look at allocations time via +RTS -s or instructions
using perf.
* compile a particular files triggering the case I want to optimize

In general:
Adjust depending on flags you want to look at. If you optimize the
simplifier -O0 will be useless.
If you optimize type-checking -O2 will be pointless. And so on.

In general I only compile as linking adds overhead which isn't really
part of GHC.


Another question regarding performing compiler perf measurements
locally is which build flavour to use: So far I have used the "perf"
flavour. A problem here is that a full build seems to take close to an
hour. A rebuild with --freeze1 takes ~15 minutes on my machine. Is
this the right flavour to use?

Personally I use the quick flavour, freeze stage 1 and configure hadrian
to pass -O to stage2
unless I know the thing I'm working on will benefit significantly from -O2.

That is if I optimize an algorithm -O2 won't really make a difference so
I use -O.
If I optimize a particular hotspot in the implementation of an algorithm
by using
bangs it's worthwhile to look at -O2 as well.

You can also set particular flags for only specific files using
OPTIONS_GHC pragmas.
This way you can avoid compiling the whole of GHC with -O/-O2.
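For example (the module name is just a placeholder):

    {-# OPTIONS_GHC -O2 #-}
    -- only this module gets -O2; the rest of the build keeps whatever
    -- the flavour is configured with
    module Some.Hot.Module where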


Ideally I wouldn't have to perform these measurements on my local
machine at all! Do you usually use a separate machine for this? _Very_
convenient would be some kind of bot whom I could tell e.g.

I use another machine. Others only look at metrics which are less
affected by system load like allocations.


Ideally I wouldn't have to perform these measurements on my local
machine at all! Do you usually use a separate machine for this? _Very_
convenient would be some kind of bot whom I could tell e.g.

Various people have come up with scripts to automate the measurements on
nofib which gets you
closer to this. I discussed with Ben and others a few times in the past
having a wider framework for
collecting compiler performance indicators. But it's a lot of work to
get right and once the immediate
need is gone those ideas usually get shelved again.

BTW what's the purpose of the profiled GHC modules built with this
flavour which just seem to additionally prolong compile time? I don't
see a ghc-prof binary or similar in _build/stage1/bin.

As far as I know if you compile (non-ghc) code using -prof then you will
need the ghc library
available in the prof way. But it would be good to have the option to
disable this.


Also, what's the status of gipeda? The most recent commit at
https://perf.haskell.org/ghc/ is from "about a year ago"?

I think the author stopped maintaining it after he switched jobs. So
it's currently not useful
for investigating performance. But I'm sure he wouldn't object if anyone
were to pick it up.

Cheers Andreas

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Module Renaming: GHC.Core.Op

2020-04-04 Thread Andreas Klebinger

Thanks for the response Sylvain.

> put all the Core types in GHC.Core.Types and move everything
operation from GHC.Core.Op to GHC.Core?

That would work as well. But I still favour the renaming approach.

Almost all of these passes are optimizations, and the few that are not are
just there to support
the optimizations, so their placement still makes sense. To me anyway.

If people reject the renaming your suggestion would still be an
improvement over .Op though.

Cheers,
Andreas

Sylvain Henry wrote on 03.04.2020 at 23:29:

Hi Andreas,

"Op" stands for "Operation" but it's not very obvious (ironically when
I started this renaming work one of the motivations was to avoid
ambiguous acronyms... failed).

The idea was to separate Core types from Core
transformations/analyses/passes. I couldn't find something better than
"Operation" to sum up the latter category but I concede it's not very
good.

But perhaps we should do the opposite as we're doing in GHC.Tc: put
all the Core types in GHC.Core.Types and move everything operation
from GHC.Core.Op to GHC.Core?

Cheers,
Sylvain


On 03/04/2020 22:26, Andreas Klebinger wrote:

Hello devs,

While I looked at the renaming a bit when proposed I only just realized
we seem to be using Op as a short name for optimize.

I find this very unintuitive. Can we spare another letter to make this
GHC.Core.Opt instead?

We use opt pretty much everywhere else in GHC already.

Cheers
Andreas


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs



Module Renaming: GHC.Core.Op

2020-04-03 Thread Andreas Klebinger

Hello devs,

While I looked at the renaming a bit when proposed I only just realized
we seem to be using Op as a short name for optimize.

I find this very unintuitive. Can we spare another letter to make this
GHC.Core.Opt instead?

We use opt pretty much everywhere else in GHC already.

Cheers
Andreas


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Blocking MVar# primops not performing stack checks?

2020-03-04 Thread Andreas Klebinger

I just took a look at the implementation and it looks like you are right
Cheng.

I opened a ticket here: https://gitlab.haskell.org/ghc/ghc/issues/17893


Carter Schonwald wrote on 02.03.2020 at 06:27:

The simplest way to answer this is if you can help us construct a
program, whether as Haskell or Cmm, which tickles the failure you
suspect is there?

The RTS definitely gets less love overall.  And there are fewer folks
involved in those layers overall.



On Wed, Feb 26, 2020 at 10:03 AM Shao, Cheng mailto:cheng.s...@tweag.io>> wrote:

Hi all,

When an MVar# primop blocks, it jumps to a function in
HeapStackCheck.cmm which pushes a RET_SMALL stack frame before
returning to the scheduler (e.g. the takeMVar# primop jumps to
stg_block_takemvar for stack adjustment). But these functions directly
bump Sp without checking for possible stack overflow, I wonder if it
is a bug?

Cheers,
Cheng
___
ghc-devs mailing list
ghc-devs@haskell.org 
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs





Windows build broken

2020-02-25 Thread Andreas Klebinger

Hi devs,

it seems the Windows build is broken. (Can't build stage2 locally.)

Quickest way to reproduce is a validate of master.

It also happens on CI: https://gitlab.haskell.org/ghc/ghc/-/jobs/270248

I remember Ben mentioning something along the lines of builds with "old"
(8.6) boot compilers being broken.
Is that the culprit? How do I get back to a functioning environment?

Cheers,
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Request for two new MR-labels: "review-needed" and "changes-required"

2020-02-07 Thread Andreas Klebinger

Hello devs,

Recently in an MR discussion the topic came up of contributors being
frustrated about the lack of reviews.

As a contributor and as a reviewer, the lack of tools to find MRs in
need of attention has also often frustrated me.

I suggest two new MR labels:

Label: "review-needed":  If a contributor has an MR which he considers
to be blocked on (lack of) review he can assign the label to clearly
signal this.
Label: "changes-required": This is the inverse - if a reviewer looked at
a patch and it's clear that the patch needs more work from the
contributor this label can signal this.

What are the benefits?

* As a reviewer I can look for MRs tagged with review-needed and can be
sure that time spent reviewing is not wasted.
* As a contributor I can mark a patch as review-needed if I think my
patch is ready to land, or if I need input on the design.
* Core maintainers can more easily identify patches which have been
stalled for a long time, making their review a priority.

The obvious downside is more labels.

Cheers
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: How to turn LHExpr GhcPs into CoreExpr

2020-01-22 Thread Andreas Klebinger

I tried this for fun a while ago and ran into the issue of needing to
provide a type environment containing Prelude and so on.
I gave up on that when some of the calls failed because I must have
missed setting up some implicit state properly.
I didn't have an actual use case (only curiosity) so I didn't look
further into it. If you do find a way please let me know.

I would also support adding any missing functions to GHC-the-library to
make this possible if any turn out to be required.

As an alternative you could also use the GHCi approach of using a fake
Module. This would allow you to copy whatever GHCi is doing.
But I expect that to be slower if you expect to process many such strings.

Richard Eisenberg wrote on 22.01.2020 at 10:36:

You'll need to run the expression through the whole pipeline.

1. Parsing
2. Renaming
3. Type-checking
3a. Constraint generation
  3b. Constraint solving
  3c. Zonking
4. Desugaring


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Problem with compiler perf tests

2019-11-17 Thread Andreas Klebinger


Ömer Sinan Ağacan wrote on 17.11.2019 at 09:22:

I think what we should do instead is that once it's clear that the
patch did not
introduce *accidental* increases in numbers (e.g. in !2100 I checked and
explained the increase in average residency, and showed that the increase makes
sense and is not a leak) and it's the right thing to do, we should merge it

But that's what we do already, isn't it? We don't expect all changes to
have zero performance impact
if the impact can be argued for.

However it's easy for "insignificant" changes to compound into a
significant slowdown, so I don't think we are too careful currently. I've
never seen anyone care about "a few bytes".
Assuming we get six MRs per year that each regress a metric by 1%, that adds
up quickly. Three years and we will be about 20% worse! So I think we
are right to be cautious with those things.
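(For what it's worth the compounding checks out: eighteen 1% regressions
come to 1.01^18 ~ 1.20, i.e. roughly 20% overall.)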

It's just that people sometimes (as in !2100 initially) disagree on what
the right thing to do is.
But I don't see a way around that no matter where we set the thresholds.
That can only be resolved by discourse.

What I don't agree with is pushing that discussion into separate tickets
in general.
That would just mean we get a bunch of performance regressions, and a
bunch of tickets documenting them.

Which is better than not documenting them! And sometimes that will be
the best course of action.
But if there is a chance to resolve performance issues while a patch is
still being worked on that
will in general always be a better solution.

At least that's my opinion on the general case.

Cheers,
Andreas


Hi,

Currently we have a bunch of tests in testsuite/tests/perf/compiler for keeping
compile time allocations, max residency etc. in the expected ranges and avoid
introducing accidental compile time performance regressions.

This has a problem: we expect every MR to keep the compile time stats in the
specified ranges, but sometimes a patch fixes an issue, or does something right
(removes hacks/refactors bad code etc.) but also increases the numbers because
sometimes doing it right means doing more work or keeping more things in memory
(e.g. !1747, !2100 which is required by !1304).

We then spend hours/days trying to shave a few bytes off in those patches,
because the previous hacky/buggy code set the standards. It doesn't make sense
to compare bad/buggy code with good code and expect them to do the same thing.

Second problem is that it forces the developer to focus on a tiny part of the
compiler to reduce the numbers to the where they were. If they looked at the big
picture instead it might be possible to see rooms of improvements in other
places that could be possibly lead to much more efficient use of the developer
time.

I think what we should do instead is that once it's clear that the patch did not
introduce *accidental* increases in numbers (e.g. in !2100 I checked and
explained the increase in average residency, and showed that the increase makes
sense and is not a leak) and it's the right thing to do, we should merge it, and
track the performance issues in another issue. The CI should still run perf
tests, but those should be allowed to fail.

Any opinions on this?

Ömer
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs




Re: Compiling binaries of bytecode: ever been considered?

2019-11-04 Thread Andreas Klebinger

I've heard the idea come up once or twice. But I'm not aware of any
efforts going further than that.



Christopher Done wrote on 04.11.2019 at 14:59:

Hi all,

I was just wondering: has a compiler output mode ever been considered
that would dump bytecode to a file, dynamic link to the ghc runtime,
and then on start-up that program would just interpret the bytecode
like ghci does?

The purpose would be simply faster compile-and-restart times.

Cheers,

Chris
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs




Re: How to navigate around the source tree?

2019-10-25 Thread Andreas Klebinger



"Workarounds" are for problems, but I don't understand why duplicate file names
are a problem. Can you elaborate? Perhaps this is problem with your IDE/editor
setup? Many of us (as can be seen in responses) use tools/editors/IDEs that can
handle this just fine.

I worked on projects with duplicate file names in the past too; having tools
that can deal with this helps, and I don't think this is too hard to achieve.
You can't ask devs of every project you join to rename their files because your
editor can't handle them.

But I can ask if we really want to create more identical ones :)
GHC already has a few duplicate file names. It's not a tragedy.

Using file names as the primary identifier crops up all the time,
be it `find`ing files, jumping to them in the editor or other things.

It's not horrible by any means. VS Code has fuzzy search which usually
works for me in these cases.
Although it might not work so well if we rename all 15 Utils modules in
ghc to Utils.hs.

Overall it just seems easier to work with unique names when we have the
choice to do so.
And clearly in this case we have the choice.

Personally I never ran into a situation where prefixing the file name
was an issue at all.
But cases where non-unique names cause annoyance do happen from time to
time.

Hence why I prefer one over the other.


(I don't know VS Code enough to help ..)

Ömer

Andreas Klebinger wrote on Thu, 24 Oct 2019 at 14:48:

Hello devs,

I also often jump to files. In my case usually using VS Code, using Ctrl+P as
well, which searches for files by name.
While I can check which folder a file is in in the case of duplicates, it is an
overhead which this refactor forces onto me.

While there are workarounds, both for my case and for Matt's, it's worth asking
if requiring these workarounds is better
than just accepting redundant prefixes on module names.

Personally I would prefer unique file names even at the cost of redundancy.
I rarely add import statements/full module names, but I *very* often jump to 
files.

Cheers
Andreas

Bryan Richter wrote on 23.10.2019 at 18:00:

Duplicate record fields is going to make this a bigger problem. Vim does 
support duplicate tags (:tselect and :tjump and related bindings), but 
hopefully haskell-ide-engine will one day provide us with semantic tags and 
solve this problem once and for all!

On Wed, 23 Oct 2019, 17.49 Matthew Pickering,  
wrote:

Thanks Omer, Sylvain and Sebastian

.

I just configured my editor to use fzf and now I can use the `:GFiles`
command to perform fuzzy search on files which is probably better than
tags. If anyone else is using NixOS, all I had to do was add the
`fzf-vim` plugin to the vim configuration.

Cheers,

Matt

On Wed, Oct 23, 2019 at 2:54 PM Ömer Sinan Ağacan  wrote:

I use a file finder (fzf) for jumping to files. Because module names follow file
paths to jump to e.g. StgToCmmUtils.Utils I usually type `stgcmmutils` and
fzf finds the correct file `compiler/GHC/StgToCmm/Utils.hs`.

When generating tags I omit module names for this reason, it's easy with a good
file finder to jump to modules already, no need to generate tags for the
modules.

fast-tags commands I use:

- When working on the compiler:

   $ fast-tags --no-module-tags driver ghc compiler

- When working on the RTS:

   $ fast-tags --no-module-tags driver ghc compiler
   $ ctags --append -R rts/**/*.c rts/**/*.h includes/**/*.h

- When working on the libraries:

   $ fast-tags --no-module-tags driver ghc compiler libraries

Ömer

Sebastian Graf wrote on Wed, 23 Oct 2019 at 16:49:

FWIW, I'm using VSCode's fuzzy file search with Ctrl+P (and vim's equivalent) 
rather successfully. Just tried it for Hs/Utils.hs by typing 'hsutils.hs'. It 
didn't turn up as the first result in VSCode, but it in vim.

On Wed, 23 Oct 2019 at 14:27, Matthew Pickering wrote:

I use `fast-tags` which doesn't look at the hierarchy at all and I'm
not sure what the improvement would be as the names of the modules
would still clash.

If there is some other recommended way to jump to a module then that
would also work for me.

Matt


On Wed, Oct 23, 2019 at 12:08 PM Sylvain Henry  wrote:

Hi,

How do you generate your tags file? It seems to be a shortcoming of the
generator to not take into account the location of the definition file.

  > Perhaps `HsUtils` and `StgUtils` would be appropriate to
disambiguate`Hs/Utils` and `StgToCmm/Utils`.

We are promoting the module prefixes (`Hs`, `Stg`, `Tc`, etc.) into
proper module layers (e.g. `HsUtils` becomes `GHC.Hs.Utils`) so it would
be redundant to add the prefixes back. :/

Cheers,
Sylvain

On 23/10/2019 12:52, Matthew Pickering wrote:

Hi,

The module rework has broken my workflow.

Now my tags file is useless for jumping for modules as there are
multiple "Utils" and "Types" modules. Invariable I am jumping to the
wrong one. What do other people do to avoid this?

Can we either revert these changes or give these m

Re: How to navigate around the source tree?

2019-10-24 Thread Andreas Klebinger

Hello devs,

I also often jump to files. In my case usually using VS Code, using Ctrl+P
as well, which searches for files by name.
While I can check which folder a file is in in the case of duplicates, it
is an overhead which this refactor forces onto me.

While there are workarounds, both for my case and for Matt's, it's worth
asking if requiring these workarounds is better
than just accepting redundant prefixes on module names.

Personally I would prefer unique file names even at the cost of redundancy.
I rarely add import statements/full module names, but I *very* often
jump to files.

Cheers
Andreas

Bryan Richter wrote on 23.10.2019 at 18:00:

Duplicate record fields is going to make this a bigger problem. Vim
does support duplicate tags (:tselect and :tjump and related
bindings), but hopefully haskell-ide-engine will one day provide us
with semantic tags and solve this problem once and for all!

On Wed, 23 Oct 2019, 17.49 Matthew Pickering,
mailto:matthewtpicker...@gmail.com>> wrote:

Thanks Omer, Sylvain and Sebastian

.

I just configured my editor to use fzf and now I can use the `:GFiles`
command to perform fuzzy search on files which is probably better than
tags. If anyone else is using NixOS, all I had to do was add the
`fzf-vim` plugin to the vim configuration.

Cheers,

Matt

On Wed, Oct 23, 2019 at 2:54 PM Ömer Sinan Ağacan
mailto:omeraga...@gmail.com>> wrote:
>
> I use a file finder (fzf) for jumping to files. Because module
names follow file
> paths to jump to e.g. StgToCmmUtils.Utils I usually type
`stgcmmutils` and
> fzf finds the correct file `compiler/GHC/StgToCmm/Utils.hs`.
>
> When generating tags I omit module names for this reason, it's
easy with a good
> file finder to jump to modules already, no need to generate tags
for the
> modules.
>
> fast-tags commands I use:
>
> - When working on the compiler:
>
>   $ fast-tags --no-module-tags driver ghc compiler
>
> - When working on the RTS:
>
>   $ fast-tags --no-module-tags driver ghc compiler
>   $ ctags --append -R rts/**/*.c rts/**/*.h includes/**/*.h
>
> - When working on the libraries:
>
>   $ fast-tags --no-module-tags driver ghc compiler libraries
>
> Ömer
>
> Sebastian Graf mailto:sgraf1...@gmail.com>> wrote on Wed, 23 Oct 2019 at 16:49:
> >
> > FWIW, I'm using VSCode's fuzzy file search with Ctrl+P (and
vim's equivalent) rather successfully. Just tried it for
Hs/Utils.hs by typing 'hsutils.hs'. It didn't turn up as the first
result in VSCode, but it in vim.
> >
> > On Wed, 23 Oct 2019 at 14:27, Matthew Pickering
mailto:matthewtpicker...@gmail.com>> wrote:
> >>
> >> I use `fast-tags` which doesn't look at the hierarchy at all
and I'm
> >> not sure what the improvement would be as the names of the
modules
> >> would still clash.
> >>
> >> If there is some other recommended way to jump to a module
then that
> >> would also work for me.
> >>
> >> Matt
> >>
> >>
> >> On Wed, Oct 23, 2019 at 12:08 PM Sylvain Henry
mailto:sylv...@haskus.fr>> wrote:
> >> >
> >> > Hi,
> >> >
> >> > How do you generate your tags file? It seems to be a
shortcoming of the
> >> > generator to not take into account the location of the
definition file.
> >> >
> >> >  > Perhaps `HsUtils` and `StgUtils` would be appropriate to
> >> > disambiguate`Hs/Utils` and `StgToCmm/Utils`.
> >> >
> >> > We are promoting the module prefixes (`Hs`, `Stg`, `Tc`,
etc.) into
> >> > proper module layers (e.g. `HsUtils` becomes
`GHC.Hs.Utils`) so it would
> >> > be redundant to add the prefixes back. :/
> >> >
> >> > Cheers,
> >> > Sylvain
> >> >
> >> > On 23/10/2019 12:52, Matthew Pickering wrote:
> >> > > Hi,
> >> > >
> >> > > The module rework has broken my workflow.
> >> > >
> >> > > Now my tags file is useless for jumping for modules as
there are
> >> > > multiple "Utils" and "Types" modules. Invariable I am
jumping to the
> >> > > wrong one. What do other people do to avoid this?
> >> > >
> >> > > Can we either revert these changes or give these modules
unique names
> >> > > to facilitate that only reliable way of navigating the
code base.
> >> > > Perhaps `HsUtils` and `StgUtils` would be appropriate to
disambiguate
> >> > > `Hs/Utils` and `StgToCmm/Utils`.
> >> > >
> >> > > Cheers,
> >> > >
> >> > > Matt
> >> > > ___
> >> > > ghc-devs mailing list
> >> > > ghc-devs@haskell.org 
> >> > > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >> > ___
> >> > ghc-devs mailing list
> >> 

Unsquashed merge requests

2019-10-10 Thread Andreas Klebinger

Hello devs,

I've recently seen a few MR's make it into master with commits
which rather obviously were meant to be squashed.

Please look out for this when assigning MRs to marge.

Cheers
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


GHC: Policy on -O flags?

2019-08-27 Thread Andreas Klebinger

Hello ghc-devs and haskell users.

I'm looking for opinions on when an optimization should be enabled by
default.

-O is currently the base line for an optimized build.
-O2 adds around 10-20% compile time for a few % (around 2% if I remember
correctly) in performance for most things.

The question is now if I implement a new optimization, making code R%
faster but slowing
down the compiler down by C% at which point should an optimization be:

* Enabled by default (-O)
* Enabled only at -O2
* Disabled by default

Cheap always beneficial things make sense for -O
Expensive optimizations which add little make sense for -O2

But where exactly is the line here?
How much compile time is runtime worth?

If something slows down the compiler by 1%/2%/5%
and speeds up code by 0.5%/1%/2% which combinations make sense
for -O, -O2?

Can there even be a good policy with the -O/-O2 split?
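(One crude way to frame the arithmetic: if a module is compiled N times for
every M times the resulting code is run, an optimization that costs C%
compile time and buys R% runtime pays off roughly when M * R * t_run exceeds
N * C * t_compile -- though N, M and the absolute times differ wildly between
an edit-compile loop and a release build.)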

Personally I generally want code to either:
* Typecheck/Run at all (-O0, -fno-code, repl)
* Not blow through all my RAM when adding a few Ints while developing: -O ?
* Make a reasonable tradeoff between runtime/compiletime: -O ?
* Give me all you got: -O2 (-O9)

The use case for -O0 is rather clear, so is -O2.
But what do people consider the use case for -O

What trade offs seem acceptable to you as a user of GHC?

Is it ok for -O to become slower for faster runtimes? How much slower?
Should all new improvements which might slow down compilation
be pushed to -O2?

Or does an ideal solution add new flags?
Tell me what do you think.

Cheers,
Andreas Klebinger

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Is HEAD broken?

2019-07-14 Thread Andreas Klebinger

Upon restarting the build it seems to proceed further. Strange but good
enough for me at the moment.

Andreas Klebinger wrote on 15.07.2019 at 00:13:

Is HEAD broken?

I get this error with hadrian:

I suspect it's only broken on windows and has to do with MSYS #ifdefs

/-\

| Successfully built library 'ghci' (Stage0, way
v).  |
| Library:
_build/stage0/libraries/ghci/build/libHSghci-8.9.0.20190714.a  |
| Library synopsis: The library supporting GHC's interactive
interpreter. |
\-/

| Copy package 'ghci'
# cabal-copy (for
_build/stage0/lib/package.conf.d/ghci-8.9.0.20190714.conf)
| Register package 'ghci'
# cabal-register (for
_build/stage0/lib/package.conf.d/ghci-8.9.0.20190714.conf)
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Encoding/Fusion/Common.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Encoding/Fusion/Common.o

| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Show.hs =>
_build/stage0/libraries/text/build/Data/Text/Show.o
# cabal-configure (for _build/stage0/compiler/setup-config)
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Encoding/Fusion.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Encoding/Fusion.o
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Lazy/Encoding/Fusion.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Lazy/Encoding/Fusion.o

# cabal-autogen (for _build/stage0/compiler/build/autogen/cabal_macros.h)
| Run GhcPkg Dependencies Stage0: process
WARNING: cache is out of date:
C:\ghc\msys64\opt\ghc\lib\package.conf.d\package.cache
ghc will see an old view of this package db. Use 'ghc-pkg recache' to
fix.
| Run GhcPkg Unregister Stage0: process => none
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Encoding.hs =>
_build/stage0/libraries/text/build/Data/Text/Encoding.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text.hs =>
_build/stage0/libraries/text/build/Data/Text.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Foreign.hs =>
_build/stage0/libraries/text/build/Data/Text/Foreign.o
ghc-pkg.exe: cannot find package process
| Run GhcPkg Copy Stage0: process =>
_build/stage0/lib/package.conf.d/process-1.6.5.0.conf
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Read.hs =>
_build/stage0/libraries/text/build/Data/Text/Read.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Internal/Lazy.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Lazy.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Internal/IO.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/IO.o
WARNING: cache is out of date:
C:\ghc\msys64\opt\ghc\lib\package.conf.d\package.cache
ghc will see an old view of this package db. Use 'ghc-pkg recache' to
fix.
| Run Cc FindCDependencies Stage0: compiler/parser/cutils.c =>
_build/stage0/compiler/build/c/parser/cutils.o.d
| Run Cc FindCDependencies Stage0: compiler/ghci/keepCAFsForGHCi.c =>
_build/stage0/compiler/build/c/ghci/keepCAFsForGHCi.o.d
| Run Cc FindCDependencies Stage0: compiler/cbits/genSym.c =>
_build/stage0/compiler/build/c/cbits/genSym.o.d
| Run DeriveConstants: none => _build/generated/DerivedConstants.h (and
1 more)
| Run DeriveConstants: none =>
_build/generated/GHCConstantsHaskellExports.hs (and 1 more)
| Run Happy: compiler/parser/Parser.y =>
_build/stage0/compiler/build/Parser.hs
| Run DeriveConstants: none =>
_build/generated/GHCConstantsHaskellWrappers.hs (and 1 more)
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Lazy/Search.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Lazy/Search.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Lazy/Internal.hs =>
_build/stage0/libraries/text/build/Data/Text/Lazy/Internal.o
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Lazy/Fusion.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Lazy/Fusion.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/IO.hs =>
_build/stage0/libraries/text/build/Data/Text/IO.o
| Successfully generated _build/stage0/compiler/build/Config.hs.
| Run Alex: compiler/cmm/CmmLex.x =>
_build/stage0/compiler/build/CmmLex.hs
| Run HsCpp: compiler/prelude/primops.txt.pp =>
_build/stage0/compiler/build/primops.txt
In file included from includes/MachDeps.h:45:0,
 from compiler/prelude/primops.txt.pp:122:
includes/ghcautoconf.h:1:0: error: unterminated #if
 #if !defined(__GHCAUTOCONF_H__)

Error when running Shake build system:
  at action, called at src\Rules.hs:68:19 in main:Rules
  at need, called at src\Rules.hs:90:5 in main:Rules
* Depends on: _build/stage0/lib/package.conf.d/ghc-8.9.0.20190714.conf
  at need, called at src\Rules\Register.hs:115:5 in main:Rules.Register
* Depends on: _build/stage0/compiler/build/libHSghc-8.9.0.20190714.a
  at need, called at s

Is HEAD broken?

2019-07-14 Thread Andreas Klebinger

Is HEAD broken?

I get this error with hadrian:

I suspect it's only broken on windows and has to do with MSYS #ifdefs

/-\
| Successfully built library 'ghci' (Stage0, way v).  |
| Library: _build/stage0/libraries/ghci/build/libHSghci-8.9.0.20190714.a  |
| Library synopsis: The library supporting GHC's interactive interpreter. |
\-/
| Copy package 'ghci'
# cabal-copy (for _build/stage0/lib/package.conf.d/ghci-8.9.0.20190714.conf)
| Register package 'ghci'
# cabal-register (for
_build/stage0/lib/package.conf.d/ghci-8.9.0.20190714.conf)
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Encoding/Fusion/Common.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Encoding/Fusion/Common.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Show.hs =>
_build/stage0/libraries/text/build/Data/Text/Show.o
# cabal-configure (for _build/stage0/compiler/setup-config)
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Encoding/Fusion.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Encoding/Fusion.o
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Lazy/Encoding/Fusion.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Lazy/Encoding/Fusion.o
# cabal-autogen (for _build/stage0/compiler/build/autogen/cabal_macros.h)
| Run GhcPkg Dependencies Stage0: process
WARNING: cache is out of date:
C:\ghc\msys64\opt\ghc\lib\package.conf.d\package.cache
ghc will see an old view of this package db. Use 'ghc-pkg recache' to fix.
| Run GhcPkg Unregister Stage0: process => none
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Encoding.hs =>
_build/stage0/libraries/text/build/Data/Text/Encoding.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text.hs =>
_build/stage0/libraries/text/build/Data/Text.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Foreign.hs =>
_build/stage0/libraries/text/build/Data/Text/Foreign.o
ghc-pkg.exe: cannot find package process
| Run GhcPkg Copy Stage0: process =>
_build/stage0/lib/package.conf.d/process-1.6.5.0.conf
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Read.hs =>
_build/stage0/libraries/text/build/Data/Text/Read.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Internal/Lazy.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Lazy.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Internal/IO.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/IO.o
WARNING: cache is out of date:
C:\ghc\msys64\opt\ghc\lib\package.conf.d\package.cache
ghc will see an old view of this package db. Use 'ghc-pkg recache' to fix.
| Run Cc FindCDependencies Stage0: compiler/parser/cutils.c =>
_build/stage0/compiler/build/c/parser/cutils.o.d
| Run Cc FindCDependencies Stage0: compiler/ghci/keepCAFsForGHCi.c =>
_build/stage0/compiler/build/c/ghci/keepCAFsForGHCi.o.d
| Run Cc FindCDependencies Stage0: compiler/cbits/genSym.c =>
_build/stage0/compiler/build/c/cbits/genSym.o.d
| Run DeriveConstants: none => _build/generated/DerivedConstants.h (and
1 more)
| Run DeriveConstants: none =>
_build/generated/GHCConstantsHaskellExports.hs (and 1 more)
| Run Happy: compiler/parser/Parser.y =>
_build/stage0/compiler/build/Parser.hs
| Run DeriveConstants: none =>
_build/generated/GHCConstantsHaskellWrappers.hs (and 1 more)
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Lazy/Search.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Lazy/Search.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/Lazy/Internal.hs =>
_build/stage0/libraries/text/build/Data/Text/Lazy/Internal.o
| Run Ghc CompileHs Stage0:
libraries/text/Data/Text/Internal/Lazy/Fusion.hs =>
_build/stage0/libraries/text/build/Data/Text/Internal/Lazy/Fusion.o
| Run Ghc CompileHs Stage0: libraries/text/Data/Text/IO.hs =>
_build/stage0/libraries/text/build/Data/Text/IO.o
| Successfully generated _build/stage0/compiler/build/Config.hs.
| Run Alex: compiler/cmm/CmmLex.x => _build/stage0/compiler/build/CmmLex.hs
| Run HsCpp: compiler/prelude/primops.txt.pp =>
_build/stage0/compiler/build/primops.txt
In file included from includes/MachDeps.h:45:0,
 from compiler/prelude/primops.txt.pp:122:
includes/ghcautoconf.h:1:0: error: unterminated #if
 #if !defined(__GHCAUTOCONF_H__)

Error when running Shake build system:
  at action, called at src\Rules.hs:68:19 in main:Rules
  at need, called at src\Rules.hs:90:5 in main:Rules
* Depends on: _build/stage0/lib/package.conf.d/ghc-8.9.0.20190714.conf
  at need, called at src\Rules\Register.hs:115:5 in main:Rules.Register
* Depends on: _build/stage0/compiler/build/libHSghc-8.9.0.20190714.a
  at need, called at src\Rules\Library.hs:144:5 in main:Rules.Library
* Depends on: _build/stage0/compiler/build/TcTypeable.o
  at &%>, called at src\Rules\Compile.hs:47:9 in main:Rules.Compile
* Depends on: _build/stage0/compiler/build/TcTypeable.o

Re: Vector registers assumed to be caller or callee-saved?

2019-06-30 Thread Andreas Klebinger

It is my understanding that we only communicate the calling convention
to be used via LLVM IR,
and LLVM handles generation of the save/restore instructions required
for the call.

So indeed neither the macro nor this function would be used there. But I
only gathered that by skimming
the LLVM code at times, so maybe I got something wrong there.


Stefan Schulze Frielinghaus wrote on 30.06.2019 at 20:36:

But this only includes the NCG. What about the LLVM backend? For LLVM I
only found in compiler/llvmGen/LlvmCodeGen/CodeGen.hs function
definition getTrashRegs which makes use of function callerSaves which is
defined in includes/CodeGen.Platform.hs:

callerSaves :: GlobalReg -> Bool
#if defined(CALLER_SAVES_Base)
callerSaves BaseReg   = True
#endif
...
callerSaves _ = False

There only for general-purpose and floating-point registers function
callerSaves may be defined to True. Thus, for XMMi, YMMi, and ZMMi
arguments the function evaluates to False.

Do I miss something for the LLVM backend? Maybe we just need to extend
the definition of callerSaves in order to respect vector registers, too?

Cheers,
Stefan


On Sun, Jun 30, 2019 at 07:16:15PM +0200, Andreas Klebinger wrote:

What you want is not the macro but this function:
https://hackage.haskell.org/package/ghc-8.6.5/docs/src/X86.Regs.html#callClobberedRegs


whose results depend on the System ABI.

Cheers,
Andreas





Hi all,

I'm wondering what GHC assumes about vector registers XMMi, YMMi, and ZMMi used
by the STG machine: are those assumed to be caller or callee-saved? Only for
the x86-64 architecture there exist macro definitions like CALLER_SAVES_XMM1 in
includes/stg/MachRegs.h.  However, I cannot find any other place where those
macros are used.  AFAIK most C ABIs assume that vector registers are call
clobbered. Is this also the case for GHC?

Many thanks in advance,
Stefan



___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Vector registers assumed to be caller or callee-saved?

2019-06-30 Thread Andreas Klebinger

What you want is not the macro but this function:
https://hackage.haskell.org/package/ghc-8.6.5/docs/src/X86.Regs.html#callClobberedRegs


whose results depend on the System ABI.

Cheers,
Andreas





Hi all,

I'm wondering what GHC assumes about vector registers XMMi, YMMi, and ZMMi used
by the STG machine: are those assumed to be caller or callee-saved? Only for
the x86-64 architecture there exist macro definitions like CALLER_SAVES_XMM1 in
includes/stg/MachRegs.h.  However, I cannot find any other place where those
macros are used.  AFAIK most C ABIs assume that vector registers are call
clobbered. Is this also the case for GHC?

Many thanks in advance,
Stefan

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Workflow question

2019-06-29 Thread Andreas Klebinger

In your situation I would just create a base commit with the (empty)
files you intend to add. Then you should not need to reconfigure in order
to build, at least.

However it sounds like there is nothing stopping you from just building
stage1 with a working base commit, then freezing and iterating on stage2
and testing on non-ghc test cases.
It's what I usually do.




___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Container type classes

2019-05-29 Thread Andreas Klebinger

ghc-devs-requ...@haskell.org wrote:

Hello,

I think refactoring to use consistent naming is a good idea, but I am
not sure about the class idea.

To see if it is viable, we should list the types in question and the
operations we'd like to overload.

I find that with containers there tend to be two cases: either the
operations are similar but not exactly the same and you have to do
type hackery to make things fit, or you realize that you can just use
the same type in multiple places.

Iavor

The function prototypes are already part of the merge request. See here:
https://gitlab.haskell.org/ghc/ghc/blob/a0781d746c223636a90a0837fe678aab5b70e4b6/compiler/structures/Collections.hs

As for the data structures in question these are:
* EnumSet
* Data.IntSet
* Data.Set
* UniqSet
* UniqDSet

* Data.IntMap
* Data.Map
* LabelMap
* UniqFM
* UniqDFM
* UniqMap

* Maybe the TrieMap Variants

Maybe I missed some, but these are all I can think of currently. They
are already plenty.

Imo using type classes IS a kind of type hackery required "to make
things fit".
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Overloaded names for Map/Set?

2019-05-24 Thread Andreas Klebinger

Hello devs,

I would appreciate feedback on the idea in
https://gitlab.haskell.org/ghc/ghc/merge_requests/934

Maps/Sets in GHC for the most part offer the same basic functionality
but their interfaces differ.
In order to make this easier to work with I propose using overloading
via IsSet/IsMap classes.

The goal is to make working with these data structures simpler by having
a uniform interface
when it comes to names and argument orders.
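
To make that concrete, here is a rough sketch of the shape of such a class
(the method names here are illustrative; the actual interface is in the merge
request linked above):

{-# LANGUAGE TypeFamilies #-}
import qualified Data.IntMap.Strict as IM

class IsMap map where
  type KeyOf map
  mapEmpty  :: map a
  mapInsert :: KeyOf map -> a -> map a -> map a
  mapLookup :: KeyOf map -> map a -> Maybe a

instance IsMap IM.IntMap where
  type KeyOf IM.IntMap = Int
  mapEmpty  = IM.empty
  mapInsert = IM.insert
  mapLookup = IM.lookup

Code written against IsMap (and similarly IsSet) would then work uniformly
over the map-like structures listed in the merge request, which is exactly
the uniform interface being argued for here.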

There are downsides, but to me they seem minor. Error messages can be
more confusing when one gets the types wrong, and we have to import the
class to use it, and the like.
However, overall I think making code easier by not having to remember the
naming scheme and argument order for the different possible instances
would make this worthwhile.

But GHC isn't my project alone but the community's, so please voice your
opinion on the matter on the merge request!

Cheers
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Windows release quality

2019-03-19 Thread Andreas Klebinger

Just to make this clear it's not my intention to blame anyone or point
fingers.
Given the resources at hand I think Phyx and you have done an amazing job
so far to keep things working!

The core of the issue is that if someone sits down and installs a
"stable" GHC today he
will either get a version that hangs on any dependency using TH (8.6.3)
or run into weird errors if he tries to profile an executable.
Both of which I would call rather mundane development activities.

But what I take issue with is not that there is some brokenness.
It's how we deal with that fact from a user perspective.

Pretty much all distribution channels currently provide 8.6.3 or 8.6.4
as stable.

Haskell Platform:
"The latest version of the Haskell Platform for Windows is 8.6.3."
Stackage:
* LTS 13.13 for GHC 8.6.4 ,
published 3 days ago
* LTS 13.11 for GHC 8.6.3 ,
published a week ago
haskell.org:
Current Stable Release: 8.6.4

And none of the download pages give any indication of issues. Neither
does the user guide.
How much effort can we really expect from a user to find out something
basic like profiling or TH is simply broken
in a release marked as stable?

I've actually been hit by both issues despite at least following GHC
development somewhat.
We don't have to be Debian. But as a Windows user, a release being marked
stable has lost all meaning to me.

And I imagine it's worse for people not looking behind the curtain.

Ben Gamari wrote:

Phyx  writes:


Hi Andreas,

GHC 8.6.4 not supporting profiling libs was the first thing mentioned in
the release email

  - A regression resulting in segmentation faults on Windows introduced
by the fix for #16071 backported in 8.6.3. This fix has been reverted,
meaning that 8.6.4 is once again susceptible to #16071. #16071 will
be fixed in GHC 8.8.1.

It was also stated that it would be back in 8.8.1. At this point there
was no way to get profiling libs on 8.6.x without a major backport of
linker changes from master. The choice was made to revert the change
and release 8.6.4 without profiling libraries because of a stack
allocation bug that was dormant for years but completely killed the 32
bit distribution. That said the changelog linked to the wrong issue,
the second two should have been #15934 but that's not hard to figure
out by looking at the ticket.


I will reiterate that having functional profiling in 8.6.4 was never
in the cards (unless a contributor was willing to step up to backport
Phyx's linker patch).

However, I will also say that the fact that the omission of the
profiling libraries and haddock from the release tarball (#16408) was
not my intention. Rather this was an accidental side-effect of an
oversight in the release CI job. This is something I only realized
rather recently (leading to !516) and thought I would fix after when I
re-spun the Windows tarballs to include an i386 build.

In hindsight I should have advertised this more widely and perhaps even
pulled the bindist. However, in my defense I did not expect it to take
more than a few days to get the fixes through CI and have a new set of
bindists ready for release. On the whole I agree that it is not fair to
users to expect them to discover this sort of thing by browsing the
issue tracker. This is something that I will improve on in the future.

In general I'm not sure how to handle signalling of release stability.
Tamar has done an absolutely amazing job keeping the Windows boat afloat
(and even improving it, c.f. his new IO manager), However, I cannot deny
that there are indeed issues, as evidenced by the fact that my patch
making Windows a mandatory-green CI platform needs to disable quite a
number of flaky or failing tests. Should we be signalling that this is
stable? It's hard to say; many of these cases are rather niche. Needless
to say if there's consensus that this doesn't constitute a production
ready compiler then I will advocate adjusting the priorities of our
efforts at Well-Typed to put more weight on fixing the Windows issues.

Cheers,

- Ben



___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Windows release quality

2019-03-19 Thread Andreas Klebinger

Hello Devs,

After running into #16408 today I realized there is as of yet no
released bindist
of the 8.6 series which I would consider stable for windows.

GHC 8.6.1 and 8.6.2 had a series of critical bugs which applied to
multiple platforms: https://gitlab.haskell.org/ghc/ghc/issues/16408
GHC 8.6.3 loops forever when compiling certain code using TH on Windows.
This affects some very popular Hackage packages (#16057).

GHC 8.6.4 (marked stable) currently ships without profiling libraries,
making profiling impossible.

Being stuck with 8.4 is one thing, and if properly communicated not too bad.
But it requires work to even find out about these (major) issues and to
discover that 8.6 is NOT production ready for windows.

We offered the broken 8.6.3 as stable for weeks without any indication
that it was broken.
We still serve GHC 8.6.4 as stable without any hint about the missing
profiling libraries.

I can't offer solutions in this case, but I feel like something about the
release management has to change.
Having to check the GHC bugtracker to find out if the current stable
release is actually stable is just not sustainable.




___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Thoughts on the Contributing page

2019-01-29 Thread Andreas Klebinger

On hadrian:

Documentation will eventually catch up as more people use hadrian, but imo
the things that need to be supported are:

- Some workflows:
  * make fast
  * ./validate
  * make EXTRA_HC_OPTS="..."
- On windows build.bat defaults to stack which I think has never worked 
on my box.
- It's still too easy to run into hadrian bugs imo, which will probably 
work itself out in a few months.


There are also a few quality-of-life issues, like Ctrl+C not canceling the
build on Windows, which I hope will be resolved at some point, but I'm not
sure whether these should be showstoppers.


ghc-devs-requ...@haskell.org wrote:

One more thought I'd like to throw out in the open here:

The current Newcomers' Guide uses the current Makefile workflow, but
this is on a fast track to deprecation - but then, I doubt Hadrian has
seen enough exposure yet to use for a good beginner-friendly "Just
Works" guide. I'm leaning towards sticking with make for now, also
because existing material is already written this way; and then once
Hadrian is truly ready for prime time, we can rewrite the relevant
parts.

Thoughts?


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Getting a profiled GHC with hadrian.

2019-01-25 Thread Andreas Klebinger

Hello Devs,

What's the proper way to get a profiled GHC in hadrian?

I tried

  hadrian/build.cabal.sh -j8 --flavour=Prof

but that didn't work out:

  $ _build/stage1/bin/ghc.exe +RTS -p
  ghc.exe: the flag -p requires the program to be built with -prof
  ghc.exe:
  ...

What am I doing wrong here?

Cheers
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: nofib output difference

2018-12-08 Thread Andreas Klebinger

I fear that one is my fault.

This should fix it: https://phabricator.haskell.org/D5426

I did not want to add a large binary file to the repo, so instead the out
file was generated by the compress executable during make boot.

This used to work well on my box.

However make boot also creates the dependency files that guarantee the 
proper build order. Turns out sometimes
we end up trying to build the compress executable before building the 
dependency file, so the build fails.


I've now just given in to storing the output file in the repo.


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Considerations for Control flow hint implementation.

2018-12-01 Thread Andreas Klebinger

Thank you for the feedback Simon!

This indeed seems like the way to go. I tried that and
adding weights to core this way only added 0.1% allocations on average 
for -O2.


Cheers,
Andreas




Simon Peyton Jones wrote:


Alternative 1: Putting the weights directly into the case alternative 
tuple


Rather than putting it in the tuple, you could put it in the AltCon

type Alt b = (AltCon, [b], Expr b)

data AltCon = DataAlt Freq DataCon | LitAlt Freq Literal | DEFAULT Freq

My guess is that a lot more current code pattern matches on those 
triples than pattern matches on the AltCon itself.  And AltCon isn't 
used for anything else


Simon

*From:*ghc-devs  *On Behalf Of *Andreas 
Klebinger

*Sent:* 30 November 2018 11:37
*To:* ghc-devs@haskell.org
*Subject:* Considerations for Control flow hint implementation.

Hello Devs,

I've started thinking about the implementation of 
https://github.com/ghc-proposals/ghc-proposals/pull/182 recently. (Add 
control flow hint pragmas.)


For this purpose I've rebased *D4327 *"WIP: Add likelyhood to 
alternatives from stg onwards" which already does a lot of the work at 
the Cmm/Stg level.


The issue I ask you for feedback on now is how to best attach branch 
weights to case alternatives in core.


My preferred approach would be to expand the core data types to include 
them unconditionally.
While this is quite far-reaching in the amount of code it touches, it 
would be rather straightforward to implement:


Alternative 1: Putting the weights directly into the case alternative 
tuple:
+ It's trivial to check which places manipulate case alternatives 
as they will initially fail to compile.
+ It's very mechanical, almost all use sites won't actually change the 
weight.
+ It's easy to keep this working going forward as any new 
optimizations can't "forget" they have to consider them.

- It will introduce a cost in compiler performance.
- New optimizations that don't care about branch weights still 
have to at least pipe them through.
- While syntactically heavy, in terms of real complexity it's a simple 
approach.


Alternative 2:  Putting the weights into the case constructor.
+ Might give better compiler performance as I expect us to rebuild 
cases less often than alternatives.

- Seems kind of clunky.
- Weaker coupling between case alternatives and their weights.

Or we could use ticks:
+ There is some machinery already there
+ Can be turned off for -O0
+ Can be ignored when convenient.
- Can be ignored when convenient.
- Very weak coupling between case alternatives and their weights.
- The existing machinery doesn't exactly match the needs of this.
- We would have to extend tick semantics to a degree where complexity 
might grow too large

  for me to successfully implement this.
- If new optimizations end up just removing these ticks because they 
are allowed to  then

  the whole exercise becomes rather pointless.
- Makes it harder to ensure all relevant code paths in GHC are 
actually updated.


In particular there is currently no tick category which can stick to 
case alternatives but just gets removed in case it gets in the way of 
optimizations.
The closest match is SoftScope which allows ticks to be floated up, 
something that could impact performance quite badly
in this case. As then we might float something intended to mark a 
branch as unlikely into another branch that is actually

along the hot path.

I think the core variant(s) mostly stand and fall with the actual 
compile time impact. For -O0 the impact
would be negligible as the compile time is already dominated by 
codegen and typechecking. For the rest

it's hard to say.

So I'm looking for feedback on this. Maybe you have other suggestions 
I haven't considered?
How much compile time cost increase would be acceptable for what kind 
of performance boost?


Cheers,
Andreas Klebinger



___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Considerations for Control flow hint implementation.

2018-11-30 Thread Andreas Klebinger

Hello Devs,

I've started thinking about the implementation of 
https://github.com/ghc-proposals/ghc-proposals/pull/182 recently. (Add 
control flow hint pragmas.)


For this purpose I've rebased D4327 "WIP: Add likelyhood to alternatives 
from stg onwards" which already does a lot of the work at the Cmm/Stg level.


The issue I ask you for feedback on now is how to best attach branch 
weights to case alternatives in core.


My preferred approach would be to expand the core data types to include them 
unconditionally.
While this is quite far-reaching in the amount of code it touches, it 
would be rather straightforward to implement:


Alternative 1: Putting the weights directly into the case alternative tuple:
+ It's trivial to check which places manipulate case alternatives as 
they will initially fail to compile.
+ It's very mechanical, almost all use sites won't actually change the 
weight.
+ It's easy to keep this working going forward as any new optimizations 
can't "forget" they have to consider them.

- It will introduce a cost in compiler performance.
- New optimizations that don't care about branch weights still have 
to at least pipe them through.
- While syntactically heavy, in terms of real complexity it's a simple 
approach.


Alternative 2:  Putting the weights into the case constructor.
+ Might give better compiler performance as I expect us to rebuild cases 
less often than alternatives.

- Seems kind of clunky.
- Weaker coupling between case alternatives and their weights.
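
To make the comparison concrete, here is a self-contained toy sketch of the two
shapes (illustrative only; these are not GHC's real Core definitions, and Freq
is just a made-up stand-in for whatever weight type we end up with):

type Freq = Double

data AltCon = DataAlt String | LitAlt Integer | DEFAULT
  deriving (Eq, Show)

-- Alternative 1: the weight lives inside each alternative tuple.
type Alt1 b  = (AltCon, Freq, [b], Expr1 b)
data Expr1 b = Var1 b | Case1 (Expr1 b) b [Alt1 b]

-- Alternative 2: the weights live in the Case constructor, parallel to the alts.
type Alt2 b  = (AltCon, [b], Expr2 b)
data Expr2 b = Var2 b | Case2 (Expr2 b) b [Freq] [Alt2 b]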

Or we could use ticks:
+ There is some machinery already there
+ Can be turned off for -O0
+ Can be ignored when convenient.
- Can be ignored when convenient.
- Very weak coupling between case alternatives and their weights.
- The existing machinery doesn't exactly match the needs of this.
- We would have to extend tick semantics to a degree where complexity 
might grow too large

  for me to successfully implement this.
- If new optimizations end up just removing these ticks because they are 
allowed to  then

  the whole exercise becomes rather pointless.
- Makes it harder to ensure all relevant code paths in GHC are actually 
updated.


In particular there is currently no tick category which can stick to 
case alternatives but just gets removed in case it gets in the way of 
optimizations.
The closest match is SoftScope which allows ticks to be floated up, 
something that could impact performance quite badly
in this case. As then we might float something intended to mark a branch 
as unlikely into another branch that is actually

along the hot path.

I think the core variant(s) mostly stand and fall with the actual 
compile time impact. For -O0 the impact
would be negligible as the compile time is already dominated by codegen 
and typechecking. For the rest

it's hard to say.

So I'm looking for feedback on this. Maybe you have other suggestions I 
haven't considered?
How much compile time cost increase would be acceptable for what kind of 
performance boost?


Cheers,
Andreas Klebinger
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: split_marker crash

2018-11-19 Thread Andreas Klebinger

I might already have found the specific cause.

It seems with split sections we generate a dummy CmmProc, which has as 
entry label

> error (split_sections_entry)

My patch likely introduces strictness on that field where it was lazy 
before.
If this is the cause I expect to have a patch up in an hour or two and 
will merge it after it validates.

Cheers,
Andreas

Ben Gamari wrote:

Andreas Klebinger  writes:


Hello,

I'm fine with reverting for now. But could you give me a way to
reproduce this error?

I've not seen crashes on either Linux or Windows in various configs.


I suspect that Simon is building with SplitObjects enabled. To be
honest, I would really like to remove this feature; SplitSections is
better in nearly every regard. However, we have been stalled since
SplitSections doesn't quite work yet on Windows (or, IIRC, the toolchain
is prohibitively slow when it's used). I believe Tamar was working on
fixing this.

Cheers,

- Ben


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: split_marker crash

2018-11-19 Thread Andreas Klebinger

Hello,

I'm fine with reverting for now. But could you give me a way to 
reproduce this error?


I've not seen crashes on either Linux or Windows in various configs.

Cheers,
Andreas

Ben Gamari wrote:

Simon Peyton Jones via ghc-devs  writes:


OK I have verified that

   *   This split_marker crash happens on a clean HEAD build
   *   Reverting "NCG: New code layout algorithm.", 
575515b4909f3d77b3d31f2f6c22d14a92d8b8e0, solves the problem.
Andreas: I propose to revert in HEAD unless you have a rapid fix, or believe 
that is somehow my fault.
(Or maybe someone else can revert.)


Simon, are you using split-objs by any chance?

Regardless, yes, let's revert until we work out what is going on here.

Cheers,

- Ben


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Validate on master broken.

2018-11-15 Thread Andreas Klebinger

Hello Devs,

it seems Simon's patch "Smarter HsType pretty-print for promoted 
datacons" broke ./validate.





I discovered that there were two copies of the PromotionFlag
type (a boolean, with helpfully named data cons), one in
IfaceType and one in HsType.  So I combined into one,
PromotionFlag, and moved it to BasicTypes. 


In particular haddock seems to have depended on the changed constructors.
If anyone with access and knowledge of haddock could fix this I would 
be grateful.



Cheers
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


CI failures on OSX

2018-09-02 Thread Andreas Klebinger

Hello devs,

OSX CI seems to have testsuite failures since at least 44ba66527ae2
(but likely caused by something else) and started sometime after
966aa7818222, going by the Phab history.

So seems to have been caused by one of the commits on Aug 22.

Could anyone with a Mac take a look there?

Cheers
Andreas




___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Compile times: Average vs worst case runtime

2018-09-01 Thread Andreas Klebinger

I've looked further into this.

The bad news is the ratio is likely not a useful indicator for compile time.

The good news is I spent the last day and a half profiling and 
optimizing the code and found the major reason why performance degraded 
so fast.
There was a certain pattern which had complexity of O(blocks * branches) 
which now should be O(blocks * log branches) with negligibly increased 
memory usage.


The worst case I know of (a function with 1500 guards compiled with -O0) 
is now only about 3% slower than with the old algorithm, which seems reasonable.


I will check how this affects compile time for regular code and rerun 
benchmarks to make sure I've not introduced regressions.
But if it holds up we should at least enable it at -O1/-O2. Depending on how 
compile times are for regular code, maybe for -O0 as well.


Cheers
Andreas




Andreas Klebinger wrote:


Simon Peyton Jones wrote:

Good work.

| However it will also be possible to fall back to the old behavior to
| avoid the large compile times for modules known to trigger this
| edge case.

Not for specific modules, I hope; rather fall back when the number of 
cases becomes large, or something like that?  That'd be fine.
As it stands it's a regular flag and hence per Module (just like 
worker wrapper or similar) and user controlled.
In the code example I have in my head it was a large case that gets 
desugared to a larger chain of if/elseif/.../else.

So it's not strictly large cases but usually it is.

I actually didn't consider doing a "dynamic" fall back so far. That's 
a great idea!
If we are lucky we can use the ratio of edges to blocks, and fall back 
to the old one if we are above a certain function size and ratio.

With the threshold in turn depending on the optimization level.

But not sure if that's good design as we then end up with cases where 
performance suddenly gets 5% worse if

people add a constructor or code gets slightly larger for some reason.


The only other potential worry is how tricky/understandable is your 
patch.  Hopefully it's not hard.
I hope so, the algorithm itself isn't that complicated compared to some of the 
stuff in GHC and I tried to document it well.

But it's also far from being trivial code.


| 1. Always since on average both compile and runtime is faster.
| 2. -O1/O2 since users often use this when performance matters.
| 3. -O2 since users expect compilation times to blow up with it?
| 4. Never as users experience compile time slowdowns when they hit 
these

| edge cases.

Well, if you can robustly fall back in edge cases, you'll never hit 
the blow-ups.  So then (1) would be fine would it not?
Guess I will have to see how well the edge/block ratio correlates with 
compile time blowups. If we can use that to rule out the really bad 
cases then (1) should be fine.

If not I will have to come back to the question.

Cheers
Andreas


Simon




| -Original Message-
| From: ghc-devs  On Behalf Of Andreas
| Klebinger
| Sent: 30 August 2018 18:08
| To: ghc-devs@haskell.org
| Subject: Compile times: Average vs worst case runtime
|
| Hello Devs,
|
| I developed a patch improving on GHC's code layout during GSoC:
| https://ghc.haskell.org/trac/ghc/ticket/15124
| The gains are pretty nice with most library benchmarks showing
| improvments in the 1-2% range or more.
|
| Even compile times went down comparing head vs stage2 built with, and
| using my patch!
| Making compilation of nofib overall faster by 0.5-1% depending on the
| exact setup.
|
| Now the bad news:
| In some cases the algorithm has bad big O performance, in practice 
this

| seems to be code with
| things like cases with 200+ alternatives. Where these cases also
| represent most of the code compiled.
|
| The worst case I saw was doubled compile time in a Module which only
| consisted of a 500 deep if/else chain only selecting
| a value.
|
| While there are some small optimizations possible to improve on this I
| don't expect to make these edge cases much faster overall.
| However it will also be possible to fall back to the old behavior to
| avoid the large compile times for modules known to trigger this
| edge case.
|
| Which brings me to my main question: When should we use the new code
| layout.
| 1. Always since on average both compile and runtime is faster.
| 2. -O1/O2 since users often use this when performance matters.
| 3. -O2 since users expect compilation times to blow up with it?
| 4. Never as users experience compile time slowdowns when they hit 
these

| edge cases.
|
| I would prefer 2. 3. 1. 4. in that order but I wonder what the wider
| community is thinking.
|
| Cheers
| Andreas
| ___
| ghc-devs mailing list
| ghc-devs@haskell.org
| http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs




___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Compile times: Average vs worst case runtime

2018-08-30 Thread Andreas Klebinger


Simon Peyton Jones wrote:

Good work.

| However it will also be possible to fall back to the old behavior to
| avoid the large compile times for modules known to trigger this
| edge case.

Not for specific modules, I hope; rather fall back when the number of cases 
becomes large, or something like that?  That'd be fine.
As it stands it's a regular flag and hence per Module (just like worker 
wrapper or similar) and user controlled.
In the code example I have in my head it was a large case that gets 
desugared to a larger chain of if/elseif/.../else.

So it's not strictly large cases but usually it is.

I actually didn't consider doing a "dynamic" fall back so far. That's a 
great idea!
If we are lucky we can use the ratio of edges to blocks, and fall back 
to the old one if we are above a certain function size and ratio.

With the threshold in turn depending on the optimization level.
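
Something like the following, purely to illustrate the idea (the function name
and all thresholds are invented here, and as a later message in this thread
notes, the edge/block ratio is likely not a useful indicator in the end):

useNewLayout :: Int -> Int -> Int -> Bool
useNewLayout optLevel numBlocks numEdges
  | numBlocks <= sizeLimit = True               -- small functions: always use the new layout
  | otherwise              = edgeRatio <= ratioLimit
  where
    edgeRatio :: Double
    edgeRatio = fromIntegral numEdges / fromIntegral (max 1 numBlocks)
    -- higher optimisation levels tolerate more compile time
    (sizeLimit, ratioLimit) = case optLevel of
      0 -> (1000, 1.5 :: Double)
      1 -> (4000, 2.0)
      _ -> (8000, 3.0)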

But not sure if that's good design as we then end up with cases where 
performance suddenly gets 5% worse if

people add a constructor or code gets slightly larger for some reason.


The only other potential worry is how tricky/understandable is your patch.  
Hopefully it's not hard.
I hope so, the algorithm itself isn't that complicated compared to some of the 
stuff in GHC and I tried to document it well.

But it's also far from being trivial code.


| 1. Always since on average both compile and runtime is faster.
| 2. -O1/O2 since users often use this when performance matters.
| 3. -O2 since users expect compilation times to blow up with it?
| 4. Never as users experience compile time slowdowns when they hit these
| edge cases.

Well, if you can robustly fall back in edge cases, you'll never hit the 
blow-ups.  So then (1) would be fine would it not?
Guess I will have to see how well the edge/block ratio correlates with 
compile time blowups. If we can use that to rule out the really bad 
cases then (1) should be fine.

If not I will have to come back to the question.

Cheers
Andreas


Simon




| -Original Message-
| From: ghc-devs  On Behalf Of Andreas
| Klebinger
| Sent: 30 August 2018 18:08
| To: ghc-devs@haskell.org
| Subject: Compile times: Average vs worst case runtime
|
| Hello Devs,
|
| I developed a patch improving on GHC's code layout during GSoC:
| https://ghc.haskell.org/trac/ghc/ticket/15124
| The gains are pretty nice with most library benchmarks showing
| improvments in the 1-2% range or more.
|
| Even compile times went down comparing head vs stage2 built with, and
| using my patch!
| Making compilation of nofib overall faster by 0.5-1% depending on the
| exact setup.
|
| Now the bad news:
| In some cases the algorithm has bad big O performance, in practice this
| seems to be code with
| things like cases with 200+ alternatives. Where these cases also
| represent most of the code compiled.
|
| The worst case I saw was doubled compile time in a Module which only
| consisted of a 500 deep if/else chain only selecting
| a value.
|
| While there are some small optimizations possible to improve on this I
| don't expect to make these edge cases much faster overall.
| However it will also be possible to fall back to the old behavior to
| avoid the large compile times for modules known to trigger this
| edge case.
|
| Which brings me to my main question: When should we use the new code
| layout.
| 1. Always since on average both compile and runtime is faster.
| 2. -O1/O2 since users often use this when performance matters.
| 3. -O2 since users expect compilation times to blow up with it?
| 4. Never as users experience compile time slowdowns when they hit these
| edge cases.
|
| I would prefer 2. 3. 1. 4. in that order but I wonder what the wider
| community is thinking.
|
| Cheers
| Andreas
| ___
| ghc-devs mailing list
| ghc-devs@haskell.org
| http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Compile times: Average vs worst case runtime

2018-08-30 Thread Andreas Klebinger

Hello Devs,

I developed a patch improving on GHC's code layout during GSoC: 
https://ghc.haskell.org/trac/ghc/ticket/15124
The gains are pretty nice with most library benchmarks showing 
improvments in the 1-2% range or more.


Even compile times went down comparing head vs stage2 built with, and 
using my patch!
Making compilation of nofib overall faster by 0.5-1% depending on the 
exact setup.


Now the bad news:
In some cases the algorithm has bad big O performance, in practice this 
seems to be code with
things like cases with 200+ alternatives. Where these cases also 
represent most of the code compiled.


The worst case I saw was doubled compile time in a Module which only 
consisted of a 500 deep if/else chain only selecting

a value.

While there are some small optimizations possible to improve on this I 
don't expect to make these edge cases much faster overall.
However it will also be possible to fall back to the old behavior to 
avoid the large compile times for modules known to trigger this

edge case.

Which brings me to my main question: When should we use the new code layout.
1. Always since on average both compile and runtime is faster.
2. -O1/O2 since users often use this when performance matters.
3. -O2 since users expect compilation times to blow up with it?
4. Never as users experience compile time slowdowns when they hit these 
edge cases.


I would prefer 2. 3. 1. 4. in that order but I wonder what the wider 
community is thinking.


Cheers
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


PSA: You likely want to use -O2 for the stage1 compiler.

2018-08-04 Thread Andreas Klebinger

I've wondered for a good while if using O2 on stage1 might be worth it.

So I did some measurements and it should be worth it for most cases.

For a single "quick" flavour build they are more or less on equal footing.
If you rebuild stage2 multiple times reusing stage1 it will be faster.
If you build stage2 with optimizations/profiling it will be faster.

Below are the timings using "time make -j9" for a quick build.
I forgot to write down the seconds as I didn't expect them to be so close.
But it is what it is.

Timings stage1 options O1 vs O2, quick build after make clean:

stage1 opt | time (wall) | time (user)
-O1        | 13m         | 53m
-O2        | 13m         | 51m

I've also run the numbers for a optimized stage2 compiler a while ago,
where stage1 with O2 was faster.
But I no longer have these numbers around.

So it seems safe to say one should use O2 if either:
* stage2 is built with optimizations
* you freeze stage1 and reuse it while working on stage2


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Combining Bag/OrdList?

2018-06-02 Thread Andreas Klebinger
> we are free to improve the implementation of Bag in the future so 
that it doesn’t preserve order


Imo we lost that ability by exposing consBag & snocBag, which imply that 
there is a front and a back, and which at first glance also seem to be 
already used in GHC with that behavior in mind.


I agree with the thought that not guaranteeing an ordering might have 
benefits.
But in practice they are almost the same data structure with slightly 
different interfaces.

Kavon Farvardin <ka...@farvard.in>
Saturday, 2 June 2018 18:00
If we have an algorithm that only needs a Bag, then we are free to 
improve the implementation of Bag in the future so that it doesn’t 
preserve order under the hood (e.g, use a hash table). So, I 
personally think it’s useful to have around.


Sent from my phone.


Andreas Klebinger <klebinger.andr...@gmx.at>
Saturday, 2 June 2018 12:13
We have OrdList which does:

Provide trees (of instructions), so that lists of instructions
can be appended in linear time.

And Bag which claims to be:

an unordered collection with duplicates

However the actual implementation of Bag is also a tree of things.
Given that we have snocBag and consBag, that implies to me it's
also an ordered collection.

I wondered if, besides someone having to do the work, there is a reason 
why these couldn't be combined into a single data structure? Their 
implementations seem similar enough as far as I can tell.


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Combining Bag/OrdList?

2018-06-02 Thread Andreas Klebinger

We have OrdList which does:

Provide trees (of instructions), so that lists of instructions
can be appended in linear time.

And Bag which claims to be:

an unordered collection with duplicates

However the actual implementation of Bag is also a tree of things.
Given that we have snocBag and consBag, that implies to me it's
also an ordered collection.

I wondered if, besides someone having to do the work, there is a reason 
why these couldn't be combined into a single data structure? Their 
implementations seem similar enough as far as I can tell.
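
To illustrate what I mean, here is a minimal sketch of the shared idea behind
both (illustrative only, not the actual GHC definitions): a tree of nodes so
that append, cons and snoc are O(1), flattened to a list only on demand.

data Coll a
  = None
  | One a
  | Two (Coll a) (Coll a)     -- append node

appendColl :: Coll a -> Coll a -> Coll a
appendColl None ys = ys
appendColl xs None = xs
appendColl xs ys   = Two xs ys

consColl :: a -> Coll a -> Coll a
consColl x xs = appendColl (One x) xs

snocColl :: Coll a -> a -> Coll a
snocColl xs x = appendColl xs (One x)

toListColl :: Coll a -> [a]
toListColl c = go c []
  where
    go None      acc = acc
    go (One x)   acc = x : acc
    go (Two l r) acc = go l (go r acc)

An OrdList guarantees left-to-right order when flattened; a Bag merely promises
you get all the elements back. The representation (and most of the code) can be
the same.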
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Basic Block Layout in the NCG

2018-05-06 Thread Andreas Klebinger

On branch probability:

I've actually created a patch in the recent past to add more probabilities, 
which included probabilities on CmmSwitches, although it made little 
difference when only tagging error branches.

Partially that also ran into issues with code layout, which is why I put 
it on ice for now.


The full patch is here:
https://phabricator.haskell.org/D4327

I think this really has to be done back to front, as GHC currently throws
away all likelihood information before we get to block layout, which makes
it very hard to take advantage of this information.

Code Layout:

That seems exactly like the kind of pointers I was looking for!
I do wonder how well some of them have aged. For example [5] and [6] use at 
most a two-way associative cache.

But as you said it should be at least a good starting point.

I will put your links into the ticket so they are easily found once I 
(or someone else!) has time to look

deeper into this.

Cheers
Andreas



Kavon Farvardin <ka...@farvard.in>
Sunday, 6 May 2018 20:17

Does anyone have good hints for literature on basic block layout
algorithms?


Here are some thoughts:

* Branch Probability *

Any good code layout algorithm should take branch probabilities into
account.  From what I've seen, we already have a few likely-branch
heuristics baked into the generation/transformation of Cmm, though
perhaps it's worth doing more to add probabilities as in [1,3].  The
richer type information in STG could come in handy.

I think the first step in leveraging branch probability information is
to use a Float to represent the magnitude of likeliness instead of a
Bool.
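
A minimal sketch of that first step (all names here invented for illustration):

newtype Prob = Prob Float deriving (Eq, Ord, Show)   -- 0.0 .. 1.0

-- replace the Bool "likely" flag with a magnitude; Nothing = no information
type LikelyHint = Maybe Prob

likelyTaken :: LikelyHint -> Bool
likelyTaken (Just (Prob p)) = p > 0.5
likelyTaken Nothing         = False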

Target probabilities on CmmSwitches could also help create smarter
SwitchPlans. Slides 20-21 in [2] demonstrate a lower-cost decision tree
based on these probabilities.


* Code Layout *

The best all-in-one source for static code positioning I've seen is in
[5], and might be a good starting point for exploring that space.  More
importantly, [5] talks about function positioning, which is something I
think we're missing.  A more sophisticated extension to [5]'s function
positioning can be found in [6].

Keeping in mind that LLVM is tuned to optimize loops within functions,
at a high level LLVM does the following [4]:

 The algorithm works from the inner-most loop within a
 function outward, and at each stage walks through the
 basic blocks, trying to coalesce them into sequential
 chains where allowed by the CFG (or demanded by heavy
 probabilities). Finally, it walks the blocks in
 topological order, and the first time it reaches a
 chain of basic blocks, it schedules them in the
 function in-order.
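
As a toy illustration of the "chain" idea (an invented representation, not
LLVM's or GHC's actual code, and much simplified compared to [5]): start with
one chain per block and repeatedly merge chains along the heaviest edges that
connect the tail of one chain to the head of another.

import Data.List (find, sortBy)
import Data.Ord (comparing, Down(..))

type Block  = Int
type Weight = Double

layoutChains :: [Block] -> [((Block, Block), Weight)] -> [[Block]]
layoutChains blocks edges = foldl merge [[b] | b <- blocks] heaviestFirst
  where
    heaviestFirst = map fst (sortBy (comparing (Down . snd)) edges)
    -- merge the chain ending in 'from' with the chain starting in 'to',
    -- if they are distinct chains
    merge chains (from, to) =
      case (find ((== from) . last) chains, find ((== to) . head) chains) of
        (Just c1, Just c2)
          | c1 /= c2 -> (c1 ++ c2) : filter (\c -> c /= c1 && c /= c2) chains
        _            -> chains

-- e.g. layoutChains [1,2,3,4] [((1,3),10), ((3,2),8), ((2,4),1)] == [[1,3,2,4]]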

There are also plenty of heuristics such as "tail duplication" to deal
with diamonds and other odd cases in the CFG that are harder to layout.
   Unfortunately, there don't seem to be any sources cited.  We may want
to develop our own heuristics to modify the CFG for better layout as
well.



[1] Thomas Ball, James R. Larus. Branch Prediction for Free (https://doi.org/10.1145/173262.155119)

[2] Hans Wennborg. The recent switch lowering improvements. (http://llvm.org/devmtg/2015-10/slides/Wennborg-SwitchLowering.pdf) See also: https://www.youtube.com/watch?v=gMqSinyL8uk

[3] James E. Smith. A study of branch prediction strategies (https://dl.acm.org/citation.cfm?id=801871)

[4] http://llvm.org/doxygen/MachineBlockPlacement_8cpp_source.html

[5] Karl Pettis, Robert C. Hansen. Profile guided code positioning. (https://doi.org/10.1145/93542.93550)

[6] Hashemi et al. Efficient procedure mapping using cache line coloring (https://doi.org/10.1145/258915.258931)


~kavon


On Sat, 2018-05-05 at 21:23 +0200, Andreas Klebinger wrote:

Does anyone have good hints for literature on basic block layout
algorithms?
I've run into a few examples where the current algorithm falls apart
while working on Cmm.

There is a trac ticket https://ghc.haskell.org/trac/ghc/ticket/15124#ticket
where I tracked some of the issues I ran into.

As it stands some cmm optimizations are far outweighed by
accidental changes they cause in the layout of basic blocks.

The main problem seems to be that the current codegen only considers
the
last jump
in a basic block as relevant for code layout.

This works well for linear chains of control flow but behaves badly
and
somewhat
unpredictable when dealing with branch heavy code where blocks have
more
than
one successor or calls.

In particular if we have a loop

A jmp B call C call D

which we enter into at block B from Block E
we would like something like:

E,B,C,D,A

Which means with some luck C/D might be still in cache if we return
from
the call.

However we can currently get:

E,B,A,X,D,X,C

where X are other unrelated blocks. This happens since call edges
are
invisible to the layout algorithm.
It even happens when we have (conditional) jumps from B  to C and C
to D
since these are invisible as well!

I came across cases where inverting conditions led to big performance
losses since suddenly block layout got all messed up. (~4% slowdown for
the worst offenders).

Re: Re: potential for GHC benchmarks w.r.t. optimisations being incorrect

2018-05-06 Thread Andreas Klebinger

Joachim Breitner wrote:

This runs on a dedicated physical machine, and still the run-time
numbers were varying too widely and gave us many false warnings (and
probably reported many false improvements which we of course were happy
to believe). I have since switched to measuring only dynamic
instruction counts with valgrind. This means that we cannot detect
improvement or regressions due to certain low-level stuff, but we gain
the ability to reliably measure *something* that we expect to change
when we improve (or accidentally worsen) the high-level
transformations.
While this matches my experience with the default settings, I had good 
results by tuning the number of measurements nofib does.
With a high number of NoFibRuns (30+), disabling frequency scaling, 
stopping background tasks and walking away from the computer
till it was done I got noise down to differences of about +/-0.2% for 
subsequent runs.


This doesn't eliminate alignment bias and the like but at least it gives 
fairly reproducible results.


Sven Panne wrote:
4% is far from being "big", look e.g. at 
https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues 
where changing just the alignment of the code led to a 10% 
difference. :-/ The code itself or its layout wasn't changed at all. 
The "Producing Wrong Data Without Doing Anything Obviously Wrong!" 
paper gives more funny examples.


I'm not saying that code layout has no impact, quite the opposite. The 
main point is: Do we really have a benchmarking machinery in place 
which can tell you if you've improved the real run time or made it 
worse? I doubt that, at least at the scale of a few percent. To reach 
just that simple yes/no conclusion, you would need quite a heavy 
machinery involving randomized linking order, varying environments (in 
the sense of "number and contents of environment variables"), various 
CPU models etc. If you do not do that, modern HW will leave you with a 
lot of "WTF?!" moments and wrong conclusions.
You raise good points. While the example in the blog seems a bit 
contrived, with the whole loop fitting in a cache line, the principle is 
a real concern.
I've hit alignment issues and WTF moments plenty of times in the past 
when looking at micro benchmarks.


However on the scale of nofib so far I haven't really seen this happen. 
It's good to be aware of the chance for a whole suite to give

wrong results though.
I wonder if this effect is limited by GHC's tendency to use 8 byte 
alignment for all code (at least with tables next to code)?
If we only consider 16byte (DSB Buffer) and 32 Byte (Cache Lines) 
relevant this reduces the possibilities by a lot after all.


In the particular example I've hit however it's pretty obvious that 
alignment is not the issue. (And I still verified that).
In the end how big the impact of a better layout would be in general is 
hard to quantify. Hence the question if anyone has

pointers to good literature which looks into this.

Cheers
Andreas


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Basic Block Layout in the NCG

2018-05-05 Thread Andreas Klebinger

Does anyone have good hints for literature on basic block layout algorithms?
I've run into a few examples where the current algorithm falls apart 
while working on Cmm.


There is a trac ticket https://ghc.haskell.org/trac/ghc/ticket/15124#ticket
where I tracked some of the issues I ran into.

As it stands some cmm optimizations are far outweighed by
accidental changes they cause in the layout of basic blocks.

The main problem seems to be that the current codegen only considers the 
last jump

in a basic block as relevant for code layout.

This works well for linear chains of control flow but behaves badly and 
somewhat
unpredictable when dealing with branch heavy code where blocks have more 
than

one successor or calls.

In particular if we have a loop

A jmp B call C call D

which we enter into at block B from Block E
we would like something like:

E,B,C,D,A

Which means with some luck C/D might be still in cache if we return from 
the call.


However we can currently get:

E,B,A,X,D,X,C

where X are other unrelated blocks. This happens since call edges are 
invisible to the layout algorithm.
It even happens when we have (conditional) jumps from B  to C and C to D 
since these are invisible as well!


I came across cases where inverting conditions led to big performance 
losses since suddenly block layout

got all messed up. (~4% slowdown for the worst offenders).

So I'm looking for solutions there.

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Looking for GSoC Mentor

2018-02-14 Thread Andreas Klebinger

Hello Everyone,
I'm looking for a mentor for a GSoC Project/Proposal.

The whole idea revolves around:
* Getting hot path information from static analysis and user annotations.
* Making sure it's not destroyed by optimization passes.
* Using it to generate better code where applicable.

See also:
#14672: Make likelihood of branches/conditions available throughout the 
compiler.



Towards these goals I would like for GSoC:
* Land https://phabricator.haskell.org/D4327 (Branchweights in STG and Cmm)
  + It might be ready before then. But it's a prerequisite so worth 
listing here.

* Design and implement a way for users to:
  + Mark hot paths in Haskell code.
  + Push the information through the passes (up to STG) on a best 
effort basis. STG and beyond is part of the diff above.
  + The current backend already uses this information in some places, 
so this alone should lead to small but consistent gains.
  + I don't plan to add additional optimizations unless there is time 
left at the end.



Where I think I will need help/experience from a mentor is:
* Frontend questions:
  I haven't looked at GHC's parser/renamer yet so while
  this is surmountable I expect some questions to arise there.

* Design questions/feedback:
  Things like:
  + How should the user be able to give this information.
  + What should be flag controlled? What always on?
  + Things like compile time/executable speed tradeoffs.
  ...

* Potentially the simplifier:
  I've looked into it briefly as originally D4327 was aimed at the
  core stage instead of stg.
  There doesn't seem to be a lot that could
  mess with hot path information/branchweights. But I haven't worked on
  the simplifier before and it seems like a good place for unexpected
  issues to arise.

* If this involves the proposal process then also guidance there.

For myself:
* I'm studying at TU Vienna (Software & Information Engineering)
  and am close to finishing my undergrad. Located in Austria (UTC+1)
* I have contributed a few small changes to GHC throughout last year.
  See AndreasK on Phab/Trac.
* If you want to know more contact me!


If you are interested in mentoring me or looking for more details feel free
to contact me at klebinger.andr...@gmx.at 
or look for AndreasK in #ghc


Best Regards
Andreas
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Asserting vars with different Uniques represent the same Object

2017-11-11 Thread Andreas Klebinger

My core questions are:

* Should variables representing the same thing always have the same unique?
* If not how can one assert they actually represent the same thing?

Working on the pattern matching code I came across this assertion:

> ASSERT(tvs1 `equalLength` ex_tvs )
http://git.haskell.org/ghc.git/blob/HEAD:/compiler/deSugar/MatchCon.hs#l125

tvs1 and ex_tvs are both the existentially quantified type variables of 
a pattern.
One gained by taking apart the pattern itself and one by taking apart 
the ConLike in the pattern.


While as far as I can tell they always represent the same Types they 
don't always compare as equal.
Is there any other stable way to compare them? (By name? Something 
else?) Or is their not being equal a bug to begin with?

Follow up question:
The whole assert is essentially a unit test as ex_tvs isn't used outside 
of the assert.
Is there a solution to check these invariants in tests instead of the 
source code?
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Feedback on applying implicit exceptions wanted

2017-09-13 Thread Andreas Klebinger
I'm currently working out ideas for potential optimizations in pattern 
matching compilation for GHC.


While I think the theory works out fine in regards to exceptions I would 
like feedback on the possible implications.


Trac ticket with details is here:
https://ghc.haskell.org/trac/ghc/ticket/14201#comment:4

As for the theory I worked based on the assumptions that error = bottom 
and pattern match failure = error. Otherwise we get different results 
from both cases even in theory.


While I mostly worry about the practical implications I would also 
appreciate any other feedback.

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs