Re: TTG hsSyn for Batch and Interactive Parsing

2018-05-08 Thread Alan & Kim Zimmerman
I have started a wiki page at
https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow/IdeSupport

On 8 May 2018 at 10:54, Simon Peyton Jones  wrote:

> At first blush, “running the parser in two modes” and “changing the Pass”
> type don’t match up in my mind.  One seems quite local (how to run the
> parser).  The other seems more pervasive.
>
>
>
> Can you say more about your proposed design, perhaps even on a wiki page?
>
>
>
> Simon
>
>
>
> *From:* ghc-devs  *On Behalf Of *Alan & Kim
> Zimmerman
> *Sent:* 07 May 2018 16:17
> *To:* ghc-devs 
> *Subject:* TTG hsSyn for Batch and Interactive Parsing
>
>
>
> I want to be able to run the GHC parser in one of two modes, batch which
> functions as now, and interactive which will (eventually) be incremental.
>
> In addition, the hsSyn AST for each will have different TTG[1]
> annotations, so that it can better support IDE usage.
>
> I think this can be done by changing the types in HsExtension to introduce
> a 'Process'  type as follows
>
> data Pass = Parsed Process | Renamed | Typechecked
>  deriving (Data)
>
> data Process = Batch | Interactive
>   deriving (Show, Data)
>
> We then rename the pass synonyms so that batch is the default
>
> type GhcPs   = GhcPass ('Parsed 'Batch)
> type GhcPsI  = GhcPass ('Parsed 'Interactive)
>
> I have attached a simple proof of concept file, which emulates parsing and
> renaming.
>
> Is this an appropriate approach to take?
>
> Alan
>
>
>
> [1] Trees That Grow https://ghc.haskell.org/trac/ghc/wiki/
> ImplementingTreesThatGrow
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


RE: TTG hsSyn for Batch and Interactive Parsing

2018-05-08 Thread Simon Peyton Jones via ghc-devs
At first blush, “running the parser in two modes” and “changing the Pass” type 
don’t match up in my mind.  One seems quite local (how to run the parser).  The 
other seems more pervasive.

Can you say more about your proposed design, perhaps even on a wiki page?

Simon

From: ghc-devs  On Behalf Of Alan & Kim Zimmerman
Sent: 07 May 2018 16:17
To: ghc-devs 
Subject: TTG hsSyn for Batch and Interactive Parsing

I want to be able to run the GHC parser in one of two modes, batch which 
functions as now, and interactive which will (eventually) be incremental.
In addition, the hsSyn AST for each will have different TTG[1] annotations, so 
that it can better support IDE usage.
I think this can be done by changing the types in HsExtension to introduce a 
'Process'  type as follows

data Pass = Parsed Process | Renamed | Typechecked
 deriving (Data)

data Process = Batch | Interactive
  deriving (Show, Data)
We then rename the pass synonyms so that batch is the default

type GhcPs   = GhcPass ('Parsed 'Batch)
type GhcPsI  = GhcPass ('Parsed 'Interactive)
I have attached a simple proof of concept file, which emulates parsing and 
renaming.
Is this an appropriate approach to take?
Alan

[1] Trees That Grow 
https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


RE: Basic Block Layout in the NCG

2018-05-08 Thread Simon Peyton Jones via ghc-devs
There is good info in this thread … do add it to the ticket #15124

Simon

From: ghc-devs  On Behalf Of Andreas Klebinger
Sent: 06 May 2018 20:59
To: Kavon Farvardin 
Cc: ghc-devs@haskell.org
Subject: Re: Basic Block Layout in the NCG


On branch probability:
I've actually created a patch to add more probabilities in the recent past 
which included
probabilities on CmmSwitches. Although it made little difference when only 
tagging error
branches.

Partially that also run into issues with code layout which is why I put that on 
ice for now.

The full patch is here:
https://phabricator.haskell.org/D4327

I think this really has to be done back to front as GHC currently throws
away all likelyhood information before we get to block layout.
Which makes it very hard to take advantage of this information.

Code Layout:

That seems exactly like the kind of pointers I was looking for!
I do wonder how well some of them aged. For example [5] and [6] use at most two 
way associative cache.
But as you said it should be at least a good starting point.

I will put your links into the ticket so they are easily found once I (or 
someone else!) has time to look
deeper into this.

Cheers
Andreas



Kavon Farvardin
Sonntag, 6. Mai 2018 20:17

Does anyone have good hints for literature on basic block layout

algorithms?



Here are some thoughts:



* Branch Probability *



Any good code layout algorithm should take branch probabilities into

account.  From what I've seen, we already have a few likely-branch

heuristics baked into the generation/transformation of Cmm, though

perhaps it's worth doing more to add probabilities as in [1,3].  The

richer type information in STG could come in handy.



I think the first step in leveraging branch proability information is

to use a Float to represent the magnitude of likeliness instead of a

Bool.



Target probabilities on CmmSwitches could also help create smarter

SwitchPlans. Slides 20-21 in [2] demonstrate a lower-cost decision tree

based on these probabilities.





* Code Layout *



The best all-in-one source for static code positioning I've seen is in

[5], and might be a good starting point for exploring that space.  More

importantly, [5] talks about function positioning, which is something I

think we're missing.  A more sophisticated extension to [5]'s function

positioning can be found in [6].



Keeping in mind that LLVM is tuned to optimize loops within functions,

at at high-level LLVM does the following [4]:



The algorithm works from the inner-most loop within a

function outward, and at each stage walks through the

basic blocks, trying to coalesce them into sequential

chains where allowed by the CFG (or demanded by heavy

probabilities). Finally, it walks the blocks in

topological order, and the first time it reaches a

chain of basic blocks, it schedules them in the

function in-order.



There are also plenty of heuristics such as "tail duplication" to deal

with diamonds and other odd cases in the CFG that are harder to layout.

  Unfortunately, there don't seem to be any sources cited.  We may want

to develop our own heuristics to modify the CFG for better layout as

well.







[1] Thomas Ball, James R. Larus. Branch Prediction for Free (https://do

i.org/10.1145/173262.155119)



[2] Hans Wennborg. The recent switch lowering improvements. (http://llv

m.org/devmtg/2015-10/slides/Wennborg-SwitchLowering.pdf) See also: http

s://www.youtube.com/watch?v=gMqSinyL8uk



[3] James E. Smith. A study of branch prediction strategies (https://dl

.acm.org/citation.cfm?id=801871)



[4] http://llvm.org/doxygen/MachineBlockPlacement_8cpp_source.html



[5] Karl Pettis, Robert C. Hansen. Profile guided code positioning. (ht

tps://doi.org/10.1145/93542.93550)



[6] Hashemi et al. Efficient procedure mapping using cache line

coloring (https://doi.org/10.1145/258915.258931)





~kavon





On Sat, 2018-05-05 at 21:23 +0200, Andreas Klebinger wrote:

Does anyone have good hints for literature on basic block layout

algorithms?

I've run into a few examples where the current algorithm falls apart

while working on Cmm.



There is a trac ticket 
https://ghc.haskell.org/trac/ghc/ticket/15124#

ticket

where I tracked some of the issues I ran into.



As it stands some cmm optimizations are far out weighted by

accidental changes they cause in the layout of basic blocks.



The main problem seems to be that the current codegen only considers

the

last jump

in a basic block as relevant for code layout.



This works well for linear chains of control flow but behaves badly

and

somewhat

unpredictable when dealing with branch heavy code where blocks have

more

than

one successor or calls.



In particular if we have a loop



A jmp B call C call D



which we enter into at block B from Block E

we would like