RE: 8.2.1-rc2 upgrade report

2017-06-09 Thread Simon Peyton Jones via ghc-devs
Great.   Could you put all this into a Trac ticket?

Thanks!

Simon

From: Alberto Valverde [mailto:albe...@toscat.net]
Sent: 08 June 2017 13:57
To: Simon Peyton Jones 
Cc: GHC users 
Subject: Re: 8.2.1-rc2 upgrade report

Hi Simon,

Thanks for the pointer. I re-did both builds with -dshow-passes and made a 
small script to plot the results of the lines which summarize the elapsed time 
and allocated memory per phase and module. I've uploaded the raw logs, a plot 
of the results and the script I wrote to generate it to 
https://gist.githubusercontent.com/albertov/145ac5c01bfbadc5c9ff55e9c5c2e50e.

The plotted results live here 
https://gist.githubusercontent.com/albertov/145ac5c01bfbadc5c9ff55e9c5c2e50e/raw/8996644707fc5c18c1d42ad43ee31b1817509384/bench.png

The biggest slowdown with respect to 8.0.2 seems to occur in the 
SpecConstr and Simplifier passes in the Propag (where the "main" function is) 
and the Sigym4.Propag.Engine (where the main algorithm lives) modules.
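
As a rough illustration of the kind of post-processing described above (not the script from the gist), the per-phase summary lines of two -dshow-passes logs can be totalled and compared. The line format in the comment is an assumption about what GHC prints and should be checked against the raw logs; the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed shape of a -dshow-passes summary line (verify against real logs):
# !!! Simplifier [Some.Module]: finished in 12.34 milliseconds, allocated 5.678 megabytes
SUMMARY = re.compile(
    r"^!!! (?P<phase>.+?) \[(?P<module>[^\]]+)\]: finished in "
    r"(?P<ms>[\d.]+) milliseconds, allocated (?P<mb>[\d.]+) megabytes"
)


def phase_totals(log_text):
    """Sum elapsed milliseconds and allocated megabytes per (phase, module)."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for line in log_text.splitlines():
        m = SUMMARY.match(line.strip())
        if m:
            entry = totals[(m["phase"], m["module"])]
            entry[0] += float(m["ms"])
            entry[1] += float(m["mb"])
    return {key: tuple(value) for key, value in totals.items()}


def biggest_slowdowns(old_log, new_log, top=5):
    """Rank (phase, module) pairs by extra elapsed time in new_log."""
    old, new = phase_totals(old_log), phase_totals(new_log)
    deltas = [
        (new[key][0] - old.get(key, (0.0, 0.0))[0], key)
        for key in new
    ]
    return sorted(deltas, reverse=True)[:top]
```

Calling biggest_slowdowns with the text of the 8.0.2 log and the 8.2.1-rc2 log returns the phase/module pairs whose elapsed time grew the most, which is how a SpecConstr or Simplifier regression in one module would surface.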

Any other tests that would be helpful for me to run? I'm not sure where to 
start to create a reproducible case but I'll see if I can come up with 
something soon...

Alberto

On Tue, Jun 6, 2017 at 1:58 PM, Simon Peyton Jones wrote:
Thanks for the report.

Going from 67G to 56G allocation is a very worthwhile improvement in runtime!  
Hurrah.

However, trebling compile time is very bad.  It is (I think) far from typical: 
generally 8.2 is faster at compiling than 8.0 so you must be hitting something 
weird.  Anything you can do to make a reproducible case would be helpful.  
-dshow-passes shows the size of each intermediate form, which at least 
sometimes shows where the big changes are.
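
To make that suggestion concrete, here is a minimal sketch that pulls the reported sizes out of a -dshow-passes log and finds the step where the program grew the most. The "Result size of ..." layout in the comment is an assumption about GHC's output, the function names are hypothetical, and the log is assumed to cover a single module.

```python
import re

# Assumed layout (may span two lines in real logs):
# Result size of Simplifier iteration=1
#   = {terms: 1,234, types: 567, coercions: 8, joins: 0/0}
SIZE = re.compile(
    r"Result size of (?P<name>[^\n]+?)\s*=\s*\{terms: (?P<terms>[\d,]+), "
    r"types: (?P<types>[\d,]+), coercions: (?P<coercions>[\d,]+)"
)


def result_sizes(log_text):
    """Return [(pass_name, terms, types, coercions)] in log order."""
    def number(text):
        # Tolerate thousands separators and a trailing comma.
        return int(text.replace(",", ""))

    return [
        (m["name"], number(m["terms"]), number(m["types"]), number(m["coercions"]))
        for m in SIZE.finditer(log_text)
    ]


def largest_growth(log_text):
    """Find the consecutive pair of passes with the biggest jump in terms."""
    sizes = result_sizes(log_text)
    steps = [
        (after[1] - before[1], before[0], after[0])
        for before, after in zip(sizes, sizes[1:])
    ]
    return max(steps) if steps else None
```

Running largest_growth on the logs from both compiler versions and comparing the answers gives a first hint about which pass blows up the intermediate program.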

Simon

From: Glasgow-haskell-users [mailto:glasgow-haskell-users-boun...@haskell.org] On Behalf Of Alberto Valverde
Sent: 06 June 2017 12:39
To: GHC users
Subject: 8.2.1-rc2 upgrade report

Hi,

I've finally managed to upgrade all the dependencies of the proprietary app I 
mentioned some days ago on this list, and there are good and bad differences 
between 8.0.2 and 8.2.1-rc2 that I've noticed and would like to share.

The bad
-------

* An optimized cold build (-O2) is about 3 times slower (~53s with 8.0.2 vs. 
~2m55s with 8.2.1-rc2) and consumes more memory (~2GB vs. ~7GB) at its peak.

The good
--------

* An un-optimized cold build (-O0) takes about the same time (~21s, phew! :) 
It's maybe even slightly faster with 8.2 (too few and badly taken measurements 
to really know, though)
* The optimized executable is slightly faster and allocates less memory. For 
this app it makes up for the performance regression of the optimized build 
(which is almost always done by CI), IMHO.

I did only a couple of runs and only wrote down [1] the last run results (which 
were similar to the previous results) so take these observations with a grain 
of salt (except maybe the optimized build slowdown, which doesn't have much 
margin for variance to be skewing the results). I also measured the peak memory 
usage by observing "top".

In case it gives a clue: the app is a multi-threaded 2D spread simulator which 
deals with many mmapped Storable mutable vectors and has been heavily optimized 
over countless hours (by this I mean that it has (too) many INLINE pragmas, 
mostly on polymorphic functions to aid in their specialization). I think some 
of this information can be deduced from the results I'm linking at the footer. 
I believe the INLINEs play a big part in the slowdown, since the slowest 
modules to compile are the "Main" ones which put everything together, along 
with the typical lens-th-heavy "Types" ones.

I'd like to help by producing a reproducible and isolated benchmark or a better 
analysis or ... so someone more knowledgeable than me on GHC internals can 
someday hopefully attack the regression. Any pointers on what would help and 
where I can learn to do it?

Thanks!


[1] 

Re: Removing Hoopl dependency?

2017-06-09 Thread Michal Terepeta
> On Fri, Jun 9, 2017 at 9:50 AM Simon Peyton Jones wrote:
> > Maybe this is the core of our disagreement - why is it a good idea to
> > have Hoopl as a separate package in the first place?
>
> One reason only: because it makes Hoopl usable by compilers other than
> GHC.  And, dually, efforts by others to improve Hoopl will benefit GHC.
>
> > If I proposed extracting parts of Core optimizer to a separate package,
> > wouldn't you expect some really good reasons for doing this?
>
> A re-usable library should be
> a)  a significant chunk of code,
> b)  that can plausibly be re-purposed by others
> c)  and that has an explicable API
>
> I think the Core optimiser is so big, and so GHC specific, that (b) and
> (c) are unlikely to hold.  But we carefully designed Hoopl from the ground
> up so that it was agnostic about the node types, and so can be re-used for
> control flow graphs of many kinds.  It’s designed to be re-usable.  Whether
> it is actually re-used is another matter, of course.  But if it’s part of
> GHC, it can’t be.

I agree with your characterization of a re-usable library and that
Core optimizer would not be a good fit. But I do think that Hoopl also
has some problems with b) and c) (although smaller):
- Using an optimizer-as-a-library is not really common (I'm not aware
  of any compilers doing this; LLVM is to some degree close, but it
  exposes the whole language as the interface so it's closer to the
  idea of extracting the whole Cmm backend). So I don't think the API
  for such a project is well understood.
- The API is pretty wide and does put serious constraints on the IR
  (after all it defines blocks and graphs), making reusability
  potentially more tricky.

So I think I understand your argument and we just disagree on whether
this is worth the effort of having a separate package.

>
> [...]
>
> > I've pointed out multiple reasons why I think it has a significant cost.
>
> Can you just summarise them again briefly for me?  If we are free to
> choose nomenclature and API for hoopl2, I’m not yet seeing why making it a
> separate package is harder than not doing so. E.g. template-haskell is a
> separate package.

Having even Hoopl2 as a separate package would still entail
additional work:
- Hoopl2 would still need to duplicate some concepts (eg, `Unique`,
  etc. since it needs to be standalone)
- Understanding code (esp. by newcomers) would be harder: the Cmm
  backend would be split between GHC and Hoopl2, with the latter
  necessarily being far more general/polymorphic than needed by GHC.
- Getting the right performance in the presence of all this additional
  generality/polymorphism will likely require a fair amount of
  additional work.
- If Hoopl2 is used by other compilers, then we need to be more
  careful changing anything in incompatible ways, this will require
  more discussions & release coordination.

Considering that Hoopl was never actually picked up by other
compilers, I'm not convinced that this cost is justified. But I
understand that other people might have a different opinion.
So how about a compromise:
- decouple GHC from the current Hoopl (ie, go ahead with my diff),
- keep everything Hoopl related only in `compiler/cmm/Hoopl` with the
  long-term intention of creating a separate package,
- experiment with and improve the code,
- once (if?) we're happy with the results, discuss what/how to
  extract to a separate package.
That gives us the freedom to try things out and see what works well
(I simply don't have ready solutions for anything, being able to
experiment is IMHO quite important). And once we reach the right
performance/representation/abstraction/API we can work on extracting
that.

What do you think?

Cheers,
Michal
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Removing Hoopl dependency?

2017-06-09 Thread Alan & Kim Zimmerman
But equally, stackage is a major part of the haskell ecosystem.

As such, implications and paths forward need to be considered.

Alan

On 9 June 2017 at 11:16, Herbert Valerio Riedel wrote:

> Hi Simon,
>
> On 2017-06-09 at 09:50:51 +0200, Simon Peyton Jones via ghc-devs wrote:
>
> [...]
>
> >> Stackage only allows one version of each package
> >
> > I didn’t know that, but I can see it makes sense.  That makes a strong
> > case for re-doing it as a new package hoopl2
>
> The limitations of Stackage's design shouldn't drive nor limit
> library design. Cabal has been moving to finally allow us to have
> multiple versions and even multiple configurations/instances of the same
> version of a package registered in the package db at the same time, and
> subjecting ourselves to Stackage's limitations after all the work done
> (and more in that direction is being considered to push the boundaries
> even further) to that effect *now* seems quite backward to me.
>
> If we push the idea to its conclusion, that we should publish a
> new package rather than release a new major version of a package to
> work around Stackage, you'd see a proliferation of number-suffixed
> packages on Hackage.  Moreover, packages which can easily support
> multiple major versions of a package would have to use conditional logic
> boilerplate in their .cabal files (which again would be incompatible
> with Stackage's inherent limitations, as it allows only *one
> configuration* of a given package version).
>
> We should build upon the facilities we already have in place; and major
> versions are here to encode the epoch/generation of an API; moreover, as
> a big advantage over classic SemVer, we also have this 2-component major
> version which gives us more flexibility for versioning while developing
> two or more epochs of an API in parallel. So hoopl-1.* and hoopl-2.*
> could keep evolving independently, each branch being able to perform
> major version increments in their respective version namespace.
>
> Cheers,
>   HVR


Re: Removing Hoopl dependency?

2017-06-09 Thread Herbert Valerio Riedel
Hi Simon,

On 2017-06-09 at 09:50:51 +0200, Simon Peyton Jones via ghc-devs wrote:

[...]

>> Stackage only allows one version of each package
>
> I didn’t know that, but I can see it makes sense.  That makes a strong
> case for re-doing it as a new package hoopl2

The limitations of Stackage's design shouldn't drive nor limit
library design. Cabal has been moving to finally allow us to have
multiple versions and even multiple configurations/instances of the same
version of a package registered in the package db at the same time, and
subjecting ourselves to Stackage's limitations after all the work done
(and more in that direction is being considered to push the boundaries
even further) to that effect *now* seems quite backward to me.

If we push the idea to its conclusion, that we should publish a
new package rather than release a new major version of a package to
work around Stackage, you'd see a proliferation of number-suffixed
packages on Hackage.  Moreover, packages which can easily support
multiple major versions of a package would have to use conditional logic
boilerplate in their .cabal files (which again would be incompatible
with Stackage's inherent limitations, as it allows only *one
configuration* of a given package version).

We should build upon the facilities we already have in place; and major
versions are here to encode the epoch/generation of an API; moreover, as
a big advantage over classic SemVer, we also have this 2-component major
version which gives us more flexibility for versioning while developing
two or more epochs of an API in parallel. So hoopl-1.* and hoopl-2.*
could keep evolving independently, each branch being able to perform
major version increments in their respective version namespace.
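
The versioning scheme referred to here can be sketched in a few lines, assuming the usual reading of the Haskell Package Versioning Policy in which the first two components (A.B) together form the major version. The helper names and the version numbers in the usage note are hypothetical and are not real hoopl releases.

```python
def parse_version(text):
    """Split a dotted version string such as '1.4.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def pvp_major(version):
    """The first two components (A.B) form the major version.

    Assumes the version has at least two components.
    """
    return version[:2]


def same_epoch(a, b):
    """Two releases belong to the same epoch if their first component matches."""
    return a[0] == b[0]


def is_breaking_bump(old, new):
    """A change in A.B signals an API-breaking release."""
    return pvp_major(old) != pvp_major(new)
```

With this reading, a move from 1.4.2 to 1.5.0 is a breaking release that stays inside the 1.* epoch, while a 2.* line can make its own breaking releases (2.0 to 2.1) without touching the 1.* numbering.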

Cheers,
  HVR


Re: Removing Hoopl dependency?

2017-06-09 Thread Merijn Verstraaten
Lemme toss in my 2 cents as an outsider who likes to dabble in programming 
languages and compilers: I would *love* to be able to just drop (parts of) GHC's 
optimisation into my toy compilers. Optimisation is complicated, lots of work, 
and not really the part I care about when toying with languages. I wasn't 
really aware of Hoopl before this thread, so now that I am, I'm kinda sad about 
the idea of this reusable infrastructure being tossed out. I don't really have 
any vested interest/opinion on how to deal with the current Hoopl situation, so 
if it's decided to write a Hoopl2.0 instead, without backwards compatibility, I 
would still consider that a win.

Cheers,
Merijn

> On 9 Jun 2017, at 9:50, Simon Peyton Jones via ghc-devs wrote:
> 
> Maybe this is the core of our disagreement - why is it a good idea to have 
> Hoopl as a separate package in the first place?
> 
>  
> One reason only: because it makes Hoopl usable by compilers other than GHC.  
> And, dually, efforts by others to improve Hoopl will benefit GHC.
>  
> If I proposed extracting parts of Core optimizer to a separate package, 
> wouldn't you expect some really good reasons for doing this?
> 
>  
> A re-usable library should be
> a)  a significant chunk of code,
> b)  that can plausibly be re-purposed by others
> c)  and that has an explicable API
>  
> I think the Core optimiser is so big, and so GHC specific, that (b) and (c) 
> are unlikely to hold.  But we carefully designed Hoopl from the ground up so 
> that it was agnostic about the node types, and so can be re-used for control 
> flow graphs of many kinds.  It’s designed to be re-usable.  Whether it is 
> actually re-used is another matter, of course.  But if it’s part of GHC, it 
> can’t be.
>  
> Stackage only allows one version of each package
>  
> I didn’t know that, but I can see it makes sense.  That makes a strong case 
> for re-doing it as a new package hoopl2, if the API needs to change 
> substantially (something we have yet to discuss).
>  
> I've pointed out multiple reasons why I think it has a significant cost.
> 
> Can you just summarise them again briefly for me?  If we are free to choose 
> nomenclature and API for hoopl2, I’m not yet seeing why making it a separate 
> package is harder than not doing so. E.g. template-haskell is a separate 
> package.
>  
> Thanks!
>  
> Simon
>  
>  
>  
> From: Michal Terepeta [mailto:michal.terep...@gmail.com] 
> Sent: 08 June 2017 19:59
> To: Simon Peyton Jones; ghc-devs
> Cc: Kavon Farvardin
> Subject: Re: Removing Hoopl dependency?
>  
> > On Wed, Jun 7, 2017 at 7:05 PM Simon Peyton Jones wrote:
> > Michael
> >
> > Sorry to be slow.
> >
> > > Note that what I’m actually advocating is to *finish* forking Hoopl. The
> > > fork really started in ~2012 when the “new Cmm backend” was being
> > > finished.
> >
> > Yes, I know.  But what I’m suggesting is to revisit the reasons for that
> > fork, and re-join if possible.  Eg if Hoopl is too slow, can’t we make it
> > faster?  Why is GHC’s version faster?
> >
> > > apart from the performance
> > > (as noted above), there’s the issue of Hoopl’s interface. IMHO the
> > > node-oriented approach taken by Hoopl is both not flexible enough and it
> > > makes it harder to optimize it. That’s why I’ve already changed GHC’s
> > > `Hoopl.Dataflow` module to operate “block-at-a-time”
> >
> > Well that sounds like an argument to re-engineer Hoopl’s API, rather than an
> > argument to fork it.  If it’s a better API, can’t we make it better for
> > everyone?  I don’t yet understand what the “block-oriented” API is, or how
> > it differs, but let’s have the conversation.
>
> Sure, but re-engineering the API of a publicly used package has significant
> cost for everyone involved:
> - GHC: we might need to wait longer for any improvements and spend
>   more time discussing various options (and compromises - what makes
>   sense for GHC might not make sense for other people)
> - Hoopl users: will need to migrate to the new APIs potentially
>   multiple times
> - Hoopl maintainers: might need to maintain more than one branch of
>   Hoopl for a while
>
> And note that just bumping a version number might not be enough.  IIRC
> Stackage only allows one version of each package and since Hoopl is a
> boot package for GHC, the new version will move to Stackage along with
> GHC. So any users of Hoopl that want to use the old package will not
> be able to use that version of Stackage.
>
> > > When you say
> > > that we should “just fix Hoopl”, it sounds to me that we’d really need
> > > to rewrite it from scratch. And it’s much easier to do that if we can
> > > just experiment within GHC without worrying about breaking other
> > > existing 

RE: Removing Hoopl dependency?

2017-06-09 Thread Simon Peyton Jones via ghc-devs
> Maybe this is the core of our disagreement - why is it a good idea to have
> Hoopl as a separate package in the first place?

One reason only: because it makes Hoopl usable by compilers other than GHC.  
And, dually, efforts by others to improve Hoopl will benefit GHC.

> If I proposed extracting parts of Core optimizer to a separate package,
> wouldn't you expect some really good reasons for doing this?

A re-usable library should be

a)  a significant chunk of code,

b)  that can plausibly be re-purposed by others

c)  and that has an explicable API

I think the Core optimiser is so big, and so GHC specific, that (b) and (c) are 
unlikely to hold.  But we carefully designed Hoopl from the ground up so that 
it was agnostic about the node types, and so can be re-used for control flow 
graphs of many kinds.  It’s designed to be re-usable.  Whether it is actually 
re-used is another matter, of course.  But if it’s part of GHC, it can’t be.
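
To illustrate what being "agnostic about the node types" can mean in practice, here is a small forward dataflow solver in which the nodes are opaque values and the client supplies the transfer and join functions. This is a generic sketch with hypothetical names, not Hoopl's API, and it assumes the facts form a finite lattice with a monotone transfer function so that the loop terminates.

```python
def forward_analysis(blocks, successors, entry, entry_fact, transfer, join):
    """Iterate to a fixed point; return the fact at the entry of each block.

    blocks:     dict label -> list of nodes (nodes are opaque to this function)
    successors: dict label -> list of successor labels
    transfer:   (node, fact) -> fact, supplied by the client
    join:       (fact, fact) -> fact, supplied by the client
    """
    facts_in = {entry: entry_fact}
    worklist = [entry]
    while worklist:
        label = worklist.pop()
        fact = facts_in[label]
        for node in blocks[label]:
            fact = transfer(node, fact)
        for succ in successors.get(label, []):
            merged = fact if succ not in facts_in else join(facts_in[succ], fact)
            if succ not in facts_in or merged != facts_in[succ]:
                facts_in[succ] = merged
                worklist.append(succ)
    return facts_in


# One possible client: nodes are ("set", variable) tuples and the fact is the
# set of variables that may have been assigned so far.
blocks = {
    "entry": [("set", "x")],
    "left": [("set", "y")],
    "right": [("set", "z")],
    "exit": [],
}
successors = {"entry": ["left", "right"], "left": ["exit"], "right": ["exit"]}


def assigned(node, fact):
    return fact | {node[1]}


result = forward_analysis(
    blocks, successors, "entry", frozenset(), assigned, frozenset.union
)
```

The solver never inspects a node itself, so a different compiler could reuse it with its own node representation and its own facts.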

> Stackage only allows one version of each package

I didn’t know that, but I can see it makes sense.  That makes a strong case for 
re-doing it as a new package hoopl2, if the API needs to change substantially 
(something we have yet to discuss).

> I've pointed out multiple reasons why I think it has a significant cost.
Can you just summarise them again briefly for me?  If we are free to choose 
nomenclature and API for hoopl2, I’m not yet seeing why making it a separate 
package is harder than not doing so. E.g. template-haskell is a separate 
package.

Thanks!

Simon



From: Michal Terepeta [mailto:michal.terep...@gmail.com]
Sent: 08 June 2017 19:59
To: Simon Peyton Jones; ghc-devs
Cc: Kavon Farvardin 
Subject: Re: Removing Hoopl dependency?

> On Wed, Jun 7, 2017 at 7:05 PM Simon Peyton Jones wrote:
> Michael
>
> Sorry to be slow.
>
> > Note that what I’m actually advocating is to *finish* forking Hoopl. The
> > fork really started in ~2012 when the “new Cmm backend” was being
> > finished.
>
> Yes, I know.  But what I’m suggesting is to revisit the reasons for that 
> fork, and re-join if possible.  Eg if Hoopl is too slow, can’t we make it 
> faster?  Why is GHC’s version faster?
>
> > apart from the performance
> > (as noted above), there’s the issue of Hoopl’s interface. IMHO the
> > node-oriented approach taken by Hoopl is both not flexible enough and it
> > makes it harder to optimize it. That’s why I’ve already changed GHC’s
> > `Hoopl.Dataflow` module to operate “block-at-a-time”
>
> Well that sounds like an argument to re-engineer Hoopl’s API, rather than an 
> argument to fork it.  If it’s a better API, can’t we make it better for 
> everyone?  I don’t yet understand what the “block-oriented” API is, or how it 
> differs, but let’s have the conversation.
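
One possible reading of the node-oriented versus "block-at-a-time" distinction, sketched with a toy liveness analysis: in the first style the framework walks a block node by node on every visit, while in the second the client can collapse a block into a single summary and apply it in one step. The names and the (defs, uses) node encoding are hypothetical; this is not the interface of Hoopl or of GHC's `Hoopl.Dataflow`.

```python
def node_transfer(node, live_out):
    """Liveness for one node, where node = (defined_vars, used_vars)."""
    defs, uses = node
    return (live_out - defs) | uses


def block_by_nodes(block, live_out):
    """Node-oriented style: walk the nodes (backwards for liveness)."""
    fact = live_out
    for node in reversed(block):
        fact = node_transfer(node, fact)
    return fact


def summarize(block):
    """Block-at-a-time style: collapse the block into one (defs, uses) pair."""
    defs, uses = frozenset(), frozenset()
    for node in reversed(block):
        node_defs, node_uses = node
        uses = (uses - node_defs) | node_uses
        defs = defs | node_defs
    return defs, uses


def block_by_summary(summary, live_out):
    """Apply a precomputed block summary in a single step."""
    defs, uses = summary
    return (live_out - defs) | uses
```

Both routes compute the same live-in set for a block; the summary only has to be built once, which matters when a fixed-point iteration visits the same block many times.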

Sure, but re-engineering the API of a publicly used package has significant
cost for everyone involved:
- GHC: we might need to wait longer for any improvements and spend
  more time discussing various options (and compromises - what makes
  sense for GHC might not make sense for other people)
- Hoopl users: will need to migrate to the new APIs potentially
  multiple times
- Hoopl maintainers: might need to maintain more than one branch of
  Hoopl for a while

And note that just bumping a version number might not be enough.  IIRC
Stackage only allows one version of each package and since Hoopl is a
boot package for GHC, the new version will move to Stackage along with
GHC. So any users of Hoopl that want to use the old package will not
be able to use that version of Stackage.

> > When you say
> > that we should “just fix Hoopl”, it sounds to me that we’d really need
> > to rewrite it from scratch. And it’s much easier to do that if we can
> > just experiment within GHC without worrying about breaking other
> > existing Hoopl users
>
> Fine.  But then let’s call it hoopl2, make it a separate package (perhaps 
> with GHC as its only client for now), and declare that it’s intended to 
> supersede hoopl.

Maybe this is the core of our disagreement - why is it a good idea to
have Hoopl as a separate package in the first place?

I've pointed out multiple reasons why I think it has a significant cost.
But I don't really see any major benefits. Looking at the commit
history of Hoopl there hasn't been much development on it since 2012
when Simon M was trying to get the new GHC backend working (since
then, it's mostly maintenance patches to keep up with changes in
`base`, etc).
Extracting a core part of any project to a shared library has some
real costs, so there should be equally real benefits that outweigh
that cost. (If I proposed extracting parts of Core optimizer to a
separate package, wouldn't you expect some really good reasons for
doing this?)
I also do think this is quite different than a dependency on, say,
`binary`, `containers` or `pretty`, where the API of the library is
smaller