Re: [GHC] #10052: Panic (something to do with floatExpr?)

2015-02-04 Thread Peter Wortmann



We are clearly trying to float past a breakpoint here, which is simply 
impossible. Pretty sure this would have been a panic before my changes 
too (it would have tried to mkNoCount the breakpoint). Guess I was 
wrong reading a "breakpoints don't appear here" invariant out of that...


The quick fix would be to drop all floats in-place:

  -- scoped, counting and unsplittable, can't be floated through
  | otherwise
  = floatBody tOP_LEVEL expr

This fixes the panic, but is a bit awkward. Probably better to change 
SetLevels? Not a piece of code I'm very familiar with...
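
For intuition, here's a toy model of what "drop all floats in-place"
means - emphatically not GHC code, just a sketch over a made-up
mini-language (all names here are invented for illustration):

  -- Toy model, not GHC code: floats are let-bindings moving outwards.
  data Tick = Tick { scoped :: Bool, counts :: Bool }

  data Expr = Var String
            | Let (String, Expr) Expr
            | TickE Tick Expr

  -- Collect the bindings that float outwards, alongside the residual
  -- expression.
  floatExpr :: Expr -> ([(String, Expr)], Expr)
  floatExpr (Let b body) = let (fs, body') = floatExpr body
                           in  (b : fs, body')
  floatExpr (TickE t e)
      -- Scoped *and* counting (e.g. a breakpoint): moving a binding
      -- past the tick would change entry counts, so re-install all
      -- floats right here instead of passing them through.
    | scoped t && counts t = let (fs, e') = floatExpr e
                             in  ([], TickE t (foldr Let e' fs))
      -- Otherwise the tick is transparent to floating.
    | otherwise            = let (fs, e') = floatExpr e
                             in  (fs, TickE t e')
  floatExpr e = ([], e)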


Greetings,
  Peter

On 04/02/2015 13:31, Simon Peyton Jones wrote:

Peter:

Here's a bad crash, due to you.   (Doing this by email because I'm offline.)

The (Tick t e) case of FloatOut.floatExpr is incomplete.  It simply panics in 
some cases.

Could you fix this please?  Either that case shouldn't happen, in which case 
Core Lint should check for it, and whoever is generating it should be fixed.  
Or it should happen, in which case floatExpr should do the right thing.

Could you leave a Note to explain what is happening in the floatExpr (Tick ...) 
cases?

Thanks

Simon

| -----Original Message-----
| From: ghc-tickets [mailto:ghc-tickets-boun...@haskell.org] On Behalf Of
| GHC
| Sent: 31 January 2015 17:38
| Cc: ghc-tick...@haskell.org
| Subject: [GHC] #10052: Panic (something to do with floatExpr?)
|
| #10052: Panic (something to do with floatExpr?)
| ------------------------+---------------------------------------------
|                Reporter:  edsko            |   Owner:
|                    Type:  bug              |  Status:  new
|                Priority:  normal           |  Milestone:
|               Component:  Compiler         |  Version:  7.10.1-rc2
|                Keywords:                   |  Operating System:  Unknown/Multiple
|            Architecture:  Unknown/Multiple |  Type of failure:  None/Unknown
|               Test Case:                   |  Related Tickets:
|                Blocking:                   |  Blocked By:
|  Differential Revisions:                   |
| ------------------------+---------------------------------------------
|  Loading
|
|  {{{
|  main = let (x :: String) = "hello" in putStrLn x
|  }}}
|
|  using a very simple driver for the GHC API (see T145.hs) causes a ghc
|  panic:
|
|  {{{
|  [1 of 1] Compiling Main ( T145-input.hs, interpreted )
|  T145: T145: panic! (the 'impossible' happened)
|(GHC version 7.10.0.20150128 for x86_64-apple-darwin):
|  floatExpr tick
|  details unavailable
|
|  Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
|  }}}
|
|  This panic is arising in our test case for #8333, so it may be related to
|  that bug.
|
| --
| Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10052
| GHC http://www.haskell.org/ghc/
| The Glasgow Haskell Compiler
| ___
| ghc-tickets mailing list
| ghc-tick...@haskell.org
| http://www.haskell.org/mailman/listinfo/ghc-tickets



___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: Shipping core libraries with debug symbols

2015-01-09 Thread Peter Wortmann



Yes - strip will catch everything.

Greetings,
  Peter


On 09/01/2015 17:11, Simon Marlow wrote:

I've been building the RTS with debug symbols for our internal GHC build
at FB, because it makes investigating problems a lot easier.  I should
probably upstream this patch.

Shipping libraries with debug symbols should be fine, as long as they
can be stripped - Peter, does stripping remove everything that -g creates?

Cheers,
Simon

On 02/01/2015 23:18, Johan Tibell wrote:

Hi!

We are now able to generate DWARF debug info, by passing -g to GHC. This
will allow for better debugging (e.g. using GDB) and profiling (e.g.
using Linux perf events). To make this feature more user accessible we
need to ship debug info for the core libraries (and perhaps the RTS).
The reason we need to ship debug info is that it's difficult, or
impossible in the case of base, for the user to rebuild these
libraries. The question is, how do we do this well? I don't think our
"way" solution works very well. It causes us to recompile too much and
GHC doesn't know which ways have been built or not.

I believe other compilers, e.g. GCC, ship debug symbols in separate
files (https://packages.debian.org/sid/libc-dbg) that e.g. GDB can then
look up.

-- Johan






___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: Shipping core libraries with debug symbols

2015-01-08 Thread Peter Wortmann


(sorry for the late answer)

Yes, that's pretty much what this would boil down to. The patch is trivial:

https://github.com/scpmw/ghc/commit/29acc#diff-1

I think this is a good idea anyway. We can always re-introduce the data
for higher -g<n> levels.


Greetings,
  Peter


On 05/01/2015 00:59, Johan Tibell wrote:

What about keeping exactly what -g1 keeps for gcc (i.e. functions,
external variables, and line number tables)?

On Sun, Jan 4, 2015 at 5:48 PM, Peter Wortmann sc...@leeds.ac.uk wrote:



Okay, I ran a little experiment - here's the size of the debug
sections that Fission would keep (for base library):

   .debug_abbrev:  8932 - 0.06%
   .debug_line:  374134 - 2.6%
   .debug_frame: 671200 - 4.5%

Not that much. On the other hand, .debug_info is a significant
contributor:

   .debug_info(full):   4527391 - 30%

Here's what this contains: All procs get a corresponding DWARF
entry, and we declare all Cmm blocks as lexical blocks. The latter
isn't actually required right now - to my knowledge, GDB simply
ignores it, while LLDB shows it as inlined routines. In either
case, it just shows yet more GHC-generated names, so it's really
only useful for profiling tools that know Cmm block names.

So here's what we get if we strip out block information:

   .debug_info(!block): 1688410 - 11%

This eliminates a good chunk of information, and might therefore be
a good idea for -g1 at minimum. If we want this as default for
7.10, this would make the total overhead about 18%. Acceptable? I
can supply a patch if needed.

Just for comparison - for Fission we'd strip proc records as well,
which would cause even more extreme savings:

   .debug_info(!proc):    36081 - 0.2%

At this point the overhead would be just about 7% - but without
doing Fission properly this would most certainly affect debuggers.

Greetings,
   Peter

On 03/01/2015 21:22, Johan Tibell wrote:
 How much debug info (as a percentage) do we currently generate? Could we 
just keep it in there in the release?





___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs





Re: Shipping core libraries with debug symbols

2015-01-04 Thread Peter Wortmann



Okay, I ran a little experiment - here's the size of the debug sections 
that Fission would keep (for base library):


  .debug_abbrev:  8932 - 0.06%
  .debug_line:  374134 - 2.6%
  .debug_frame: 671200 - 4.5%

Not that much. On the other hand, .debug_info is a significant contributor:

  .debug_info(full):   4527391 - 30%

Here's what this contains: All procs get a corresponding DWARF entry, 
and we declare all Cmm blocks as lexical blocks. The latter isn't 
actually required right now - to my knowledge, GDB simply ignores it, 
while LLDB shows it as inlined routines. In either case, it just shows 
yet more GHC-generated names, so it's really only useful for profiling 
tools that know Cmm block names.


So here's what we get if we strip out block information:

  .debug_info(!block): 1688410 - 11%

This eliminates a good chunk of information, and might therefore be a 
good idea for -g1 at minimum. If we want this as default for 7.10, 
this would make the total overhead about 18%. Acceptable? I can supply a 
patch if needed.


Just for comparison - for Fission we'd strip proc records as well, which 
would cause even more extreme savings:


  .debug_info(!proc):    36081 - 0.2%

At this point the overhead would be just about 7% - but without doing 
Fission properly this would most certainly affect debuggers.


Greetings,
  Peter

On 03/01/2015 21:22, Johan Tibell wrote:
 How much debug info (as a percentage) do we currently generate? Could 
we just keep it in there in the release?


___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: Shipping core libraries with debug symbols

2015-01-03 Thread Peter Wortmann



The debian package seems to simply put un-stripped libraries into a 
special path (/usr/lib/debug/...). This should be relatively 
straight-forward to implement. Note though that from a look at the RPM 
infrastructure, they have a tool in there (dwarfread) which actually 
parses through DWARF information and updates paths, so there is possibly 
more going on here.


On the other hand, supporting -gsplit-dwarf seems to be a different 
mechanism, called Fission[1]. I haven't looked too much at the 
implementation yet, but to me it looks like it means generating copies 
of debug sections (such as .debug_line.dwo) which will then be extracted
using objcopy --extract-dwo. This might take a bit more work to 
implement, both on DWARF generation code as well as infrastructure.


Interestingly enough, doing this kind of splitting will actually buy us 
next to nothing - with Fission both .debug_line and .debug_frame would 
remain in the binary unchanged, so all we'd export would be some fairly 
inconsequential data from .debug_info. In contrast to other programming 
languages, we just don't have that much debug information in the first 
place. Well, at least not yet.


Greetings,
  Peter

[1] https://gcc.gnu.org/wiki/DebugFission


On 03/01/2015 00:18, Johan Tibell wrote:

Hi!

We are now able to generate DWARF debug info, by passing -g to GHC. This
will allow for better debugging (e.g. using GDB) and profiling (e.g.
using Linux perf events). To make this feature more user accessible we
need to ship debug info for the core libraries (and perhaps the RTS).
The reason we need to ship debug info is that it's difficult, or
impossible in the case of base, for the user to rebuild these
libraries. The question is, how do we do this well? I don't think our
"way" solution works very well. It causes us to recompile too much and
GHC doesn't know which ways have been built or not.

I believe other compilers, e.g. GCC, ship debug symbols in separate
files (https://packages.debian.org/sid/libc-dbg) that e.g. GDB can then
look up.

-- Johan






___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: How's the integration of DWARF support coming along?

2014-08-13 Thread Peter Wortmann


At this point I have a bit more time on my hands again (modulo post-thesis 
vacations), but we are basically still in “review hell”.

I think “just” for perf_events support we’d need the following patches[1]:
1. Source notes (Core support)
2. Source notes (CorePrep & Stg support)
3. Source notes (Cmm support)
4. Tick scopes
5. Debug data extraction (NCG support)
6. Generate .loc/.file directives

We have a basic “okay” from the Simons up to number 2 (conditional on better 
documentation). Number 4 sticks out because Simon Marlow wanted to have a 
closer look at it - this is basically about how to maintain source ticks in a 
robust fashion on the Cmm level (see also section 5.5 of my thesis[2]).

Meanwhile I have ported NCG DWARF generation over to Mac OS, and am working on
reviving LLVM support. My plan was to check that I didn’t accidentally break 
Linux support, then push for review again in a week or so (Phab?).

Greetings,
  Peter

[1] https://github.com/scpmw/ghc/commits/profiling-import
[2] http://www.personal.leeds.ac.uk/~scpmw/static/thesis.pdf

On 13 Aug 2014, at 20:01, Johan Tibell 
johan.tib...@gmail.com wrote:

What's the minimal amount of work we need to do to just get the dwarf data in 
the codegen by 7.10 (RC late december) so we can start using e.g. linux perf 
events to profile Haskell programs?


On Wed, Aug 13, 2014 at 7:31 PM, Arash Rouhani 
rar...@student.chalmers.se wrote:
Hi Johan!

I haven't done much (just been lazy) lately. I've tried to benchmark my results
but I don't get any sensible results at all yet.

Last time Peter said he's working on a more portable way to read DWARF
information that doesn't require Linux. But I'm sure he'll give a more accurate
update than me soon in this mail thread.

As for stack traces, I don't think there are any big tasks left, but I'll summarize
what I have in mind:

 *   The Haskell interface is done and I've iterated on it a bit, so it's in a
decent shape at least. Some parts still need testing.
 *   I wish I could implement the `forceCaseContinuation` that I've described 
in my thesis. If someone is good with code generation (I just suck at it, it's 
probably simple) and is willing to assist me a bit, please say so. :)
 *   I tried benchmarking, I gave up after not getting any useful results.
 *   I'm unfortunately totally incapable to help out with dwarf debug data 
generation, only Peter knows that part, particularly I never grasped his 
theoretical framework of causality in Haskell.
 *   Peter and I have finally agreed on a simple and sensible way to implement 
`catchWithStack` that has most of the good properties you would like. I just need
to implement it and test it. I can definitely man up and implement this. :)

Here's my master thesis btw [1], it should answer Ömer's question of how we 
retrieve a stack from a language you think won't have a stack. :)

Cheers,
Arash

[1]: http://arashrouhani.com/papers/master-thesis.pdf





On 2014-08-13 17:02, Johan Tibell wrote:
Hi,

How's the integration of DWARF support coming along? It's probably one of the 
most important improvements to the runtime in quite some time, since it unlocks
*two* important features, namely

 * trustworthy profiling (using e.g. Linux perf events and other low-overhead, 
code preserving, sampling profilers), and
 * stack traces.

The former is really important to move our core libraries' performance up a
notch. Right now -prof is too invasive for it to be useful when evaluating the 
hotspots in these libraries (which are already often heavily tuned).

The latter one is really important for real-life Haskell on the server, where
you can sometimes get a crash that only happens once a day under very
specific conditions. Knowing where the crash happens is then *very* useful.

-- Johan










___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: How's the integration of DWARF support coming along?

2014-08-13 Thread Peter Wortmann

Johan Tibell wrote:
Do you mind expanding on what tick scopes are. It sounds scarily like something 
that happens at runtime. :)

It’s a pretty basic problem - for Core we can always walk the tree upwards to 
find some source ticks that might be useful. Cmm on the other hand is flat: 
Given one block without any annotations on its own, there is no robust way we 
could look around for debugging information.

This is especially tricky because Cmm stages want to be able to liberally add 
or remove blocks. So let’s say we have an extra GC block added: Which source 
location should we see as associated with it? And if two blocks are combined 
using common block elimination: What is now the best source location? And how 
do we express all this in a way that won’t make code generation more 
complicated? The latter is an important consideration, because code generation 
is very irregular in how it treats code - often alternating between 
accumulating it in a monad and passing it around by hand.

I have found it quite tricky to find a good solution in this design space - the 
current idea is that we associate every piece of generated Cmm with a “tick 
scope”, which decides how far a tick will “apply”. So for example a GC block 
would be generated using the same tick scope as the function’s entry block, and 
therefore will get all ticks associated with the function’s top level, which is 
probably the best choice. On the other hand, for merging blocks we can 
“combine” the scopes in a way that guarantees that we find (at least) the same 
ticks as before, therefore losing no information.

And yes, this design could be simplified somewhat for pure DWARF generation. 
After all, for that particular purpose every tick scope will just boil down to 
a single source location anyway. So we could simply replace scopes with the 
source link right away. But I think it would come down to about the same code 
complexity, plus having a robust structure around makes it easier to carry 
along extra information such as unwind information, extra source ticks or the 
generating Core.
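
To make that concrete, here's a minimal sketch of the scope idea - an
illustration under simplified assumptions (ticks as plain strings),
not the actual GHC data structures:

  import Data.Set (Set)
  import qualified Data.Set as Set

  type Tick = String            -- stand-in for a real source tick

  -- A block's scope decides which ticks apply to it.
  newtype Scope = Scope (Set Tick)

  -- A generated block (say, a GC check) is given the scope of its
  -- function's entry block, so it inherits the top-level ticks.
  gcBlockScope :: Scope -> Scope
  gcBlockScope entryScope = entryScope

  -- Merging two blocks (common block elimination): the combined scope
  -- must see at least every tick either block saw before, so no
  -- information is lost.
  combineScopes :: Scope -> Scope -> Scope
  combineScopes (Scope a) (Scope b) = Scope (a `Set.union` b)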

Greetings,
  Peter

On Wed, Aug 13, 2014 at 8:49 PM, Peter Wortmann 
sc...@leeds.ac.uk wrote:


At this point I have a bit more time on my hands again (modulo post-thesis 
vacations), but we are basically still in “review hell”.

I think “just” for perf_events support we’d need the following patches[1]:
1. Source notes (Core support)
2. Source notes (CorePrep & Stg support)
3. Source notes (Cmm support)
4. Tick scopes
5. Debug data extraction (NCG support)
6. Generate .loc/.file directives

We have a basic “okay” from the Simons up to number 2 (conditional on better 
documentation). Number 4 sticks out because Simon Marlow wanted to have a 
closer look at it - this is basically about how to maintain source ticks in a 
robust fashion on the Cmm level (see also section 5.5 of my thesis[2]).

Meanwhile I have ported NCG DWARF generation over to Mac OS, and am working on
reviving LLVM support. My plan was to check that I didn’t accidentally break 
Linux support, then push for review again in a week or so (Phab?).

Greetings,
  Peter

[1] https://github.com/scpmw/ghc/commits/profiling-import
[2] http://www.personal.leeds.ac.uk/~scpmw/static/thesis.pdf

On 13 Aug 2014, at 20:01, Johan Tibell 
johan.tib...@gmail.com wrote:

What's the minimal amount of work we need to do to just get the dwarf data in 
the codegen by 7.10 (RC late december) so we can start using e.g. linux perf 
events to profile Haskell programs?


On Wed, Aug 13, 2014 at 7:31 PM, Arash Rouhani 
rar...@student.chalmers.se wrote:
Hi Johan!

I haven't done much (just been lazy) lately. I've tried to benchmark my results
but I don't get any sensible results at all yet.

Last time Peter said he's working on a more portable way to read DWARF
information that doesn't require Linux. But I'm sure he'll give a more accurate
update than me soon in this mail thread.

As for stack traces, I don't think there are any big tasks left, but I'll summarize
what I have in mind:

 *   The Haskell interface is done and I've iterated on it a bit, so it's in a
decent shape at least. Some parts still need testing.
 *   I wish I could implement the `forceCaseContinuation` that I've described 
in my thesis. If someone is good with code generation (I just suck at it, it's 
probably simple) and is willing to assist me a bit, please say so. :)
 *   I tried benchmarking, I gave up after not getting any useful results.
 *   I'm unfortunately totally incapable to help out with dwarf debug data 
generation, only Peter knows that part, particularly I never grasped his 
theoretical framework of causality in Haskell.
 *   Peter and I have finally agreed on a simple and sensible way to implement 
`catchWithStack` that have

Re: Questions about time and space profiling for non-strict langs paper and cost centre impl. of GHC

2014-05-18 Thread Peter Wortmann

Ömer Sinan Ağacan wrote: 
 (off-topic: I'm wondering why an empty tuple is passed to `getCurrentCCS#`?)

See the comment on getCurrentCCS# in compiler/prelude/primops.txt.pp -
it's a token to prevent GHC optimisations from floating the primop up
(which obviously might change the call stack).
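
On the user-facing side, GHC.Stack's currentCallStack wraps this
primop; a minimal example, which only produces useful output when the
program is built with profiling (e.g. -prof -fprof-auto):

  import GHC.Stack (currentCallStack)

  -- Prints the cost-centre stack at this point; empty without -prof.
  main :: IO ()
  main = currentCallStack >>= mapM_ putStrLn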

 Now my first question is: Assuming stack traces are implemented as
 explained by Simon Marlow in his talk and slides, can we say that
 costs are always assigned to top cost-centre in the stack?

Cost is attributed to the current cost-centre *stack*. So "a/f" !=
"b/f".

 As far as I can understand, current implementation is different from
 what's explained in Sansom and Jones, for example

The papers actually never introduce cost-centre stacks, as far as I
know. It's generally better to check Sansom's PhD thesis, the GHC source
code or the other sources I mentioned. There's been quite a bit of work
on this...

 * I can't see SUB cost-centre in list of built-in cost-centres in
 `rts/Profiling.c`.

As I understand it, the SUB cost centre refers to situations where the
cost-centre stack does *not* get updated on function entry. So it never
exists physically.

 is annotated with `CCS_DONT_CARE`. Why is that?

That is an annotation on lambdas, and refers to what cost-centre stack
their allocation cost should be counted on. As top-level functions and
static constructors are allocated statically, they don't count for the
heap profile, therefore don't care.

See the definition of GenStgRhs in compiler/stgSyn/StgSyn.lhs.

 Also, I can see `_push_` and `_tick_` (what's this?) instructions
 placed in the generated STG but no `_call_` instructions.

This is what actually manages the cost-centre stack - _push_ pushes the
given cost-centre on top of the cost-centre stack, whereas _tick_ just
increases the entry count. These two aspects have slightly different
properties as far as transformations are concerned, and therefore often
end up getting separated during optimisations.
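
For reference, the source-level counterpart is an SCC annotation; a
small example - with -prof, GHC brackets the annotated expression with
such a push (and a tick of its entry count):

  expensive :: Int -> Int
  expensive n = {-# SCC "expensive" #-} sum [1 .. n]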

Not sure what _call_ is supposed to be. What's the context?

 There is also something like `CCCS` in the generated STG but I have no
 idea about this.

That's simply the current cost-centre stack. I like to think that the hint
of silliness was intentional.

 So in short I'm trying to understand how cost-centre related
 annotations are generated and how are they used.

Sure. Better place for quick queries might be #ghc on FreeNode though -
my nick is petermw.

Greetings,
  Peter Wortmann


___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: Questions about time and space profiling for non-strict langs paper and cost centre impl. of GHC

2014-05-15 Thread Peter Wortmann

Ömer Sinan Ağacan wrote:
 To me it looks like there should be two costs attributed in the
 application rules. The first cost is evaluation of the function
 part (which should be attributed in the `app1` rule) and the second is
 the substitution part, which should be attributed in the `app2` rule; but
 according to the paper we only have the former cost, and the latter is
 free (e.g. no costs are attributed in the second rule).

Well, sort of. app1 and app2 are just the two stages of applying a
function: First the function expression gets evaluated, then the actual
function gets called. In this case A stands for both costs, even if
they don't necessarily happen right after each other. As we are only
interested in the cost sum, this is the easiest way of doing it.

 It would be appreciated if someone could help me clarifying this. I'm
 also looking for more recent papers about GHC's profiler
 implementation. I'm especially interested profiling in multi-threaded
 RTS.

There have been no new papers that I know of, but we had a talk by Simon
Marlow[1] about improvements to cost-centre stacks, as well as a more
precise description of the modern semantics by Edward Z. Yang[2].

[1] https://www.youtube.com/watch?v=J0c4L-AURDQ
[2] http://blog.ezyang.com/2013/09/cost-semantics-for-stg-in-modern-ghc

 Lastly, it'd be appreciated if someone could explain me two Bool
 fields in `StgSCC` constructor of STG syntax.

Cost centres have two somewhat separate profiling mechanisms:
1. Cost gets attributed
2. Entry counts are counted

Sometimes it can be beneficial for the optimiser to separate the two.
For example, if we have something like

  case e of
    D -> scc... let f = e1 in e2

and we want to float the f binding outwards, we would do:

  let f = scc... e1 in
  case e of
    D -> scc... e2

However, if we just implemented it like this, we would see the
entry-count to the cost-centre increase quite a bit, because now we are
also counting every entry to f. This is why the compiler can mark the
duplicated tick as non-counting.

 UPDATE: I was reading the paper one more time before sending my
 question and I found one more confusing part. In 4.1 it says "Lexical
 scoping will subsume all the costs of applying `and1` to its call
 sites..." but in 2.4 it says "Lexical scoping: Attribute the cost to
 fun, the cost centre which lexically encloses the abstraction." So
 these two definitions of the same thing are actually different. To me it
 looks like the definition in 4.1 is wrong; it should be the definition
 site, not the call site. What am I missing here?

This is a special case introduced in 2.3.2: Top-level functions get the
cost-centre SUB, which always makes them take the cost-centre of the
caller. You are right in that this doesn't quite match the literal
interpretation of lexical scoping.

Greetings,
  Peter Wortmann



___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: GHC status report

2014-04-30 Thread Peter Wortmann


Added a few sentences about DWARF support - we should really aim to get
this done for 7.10.

Greetings,
  Peter Wortmann



___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: What does the DWARF information generated by your GHC branch look like?

2014-03-03 Thread Peter Wortmann

Nathan Howell wrote:
 I did get a language ID assigned a couple years ago, it should be in DWARF
 5.
 
 Accepted:  DW_LANG_Haskell assigned value 0x18. -- April 18, 2012

Nice work. We'll start using that one then :)

Greetings,
  Peter Wortmann



___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: What does the DWARF information generated by your GHC branch look like?

2014-02-28 Thread Peter Wortmann

Roman Cheplyaka wrote:
 Or he's not subscribed to this list and his messages do not come through

Ah thanks, that's probably it. I accumulated lots of error mails from
the mailing list, which however didn't mention subscribing. Sorry about
the confusion...

Greetings,
  Peter Wortmann



___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: What does the DWARF information generated by your GHC branch look like?

2014-02-28 Thread Peter Wortmann

Johan Tibell wrote:
 Do we follow the best practices here: 
 http://wiki.dwarfstd.org/index.php?title=Best_Practices

Not quite sure what exactly you are referring to, here's the current
state:

 For DW_TAG_compilation_unit and DW_TAG_partial_unit DIEs, the name
 attribute should contain the path name of the primary source file from
 which the compilation unit was derived (see Section 3.1.1).

Yes, we do that.

 If the compiler was invoked with a full path name, it is recommended
 to use the path name as given to the compiler, although it is
 considered acceptable to convert the path name to an equivalent path
 where none of the components is a symbolic link.

I am simply using ModLocation for this. The results make sense, even
though I haven't tried crazy symbolic link combinations yet. If we find
something to improve we should probably do it for GHC as a whole.

 combining the compilation directory (see DW_AT_comp_dir) with the
 relative path name.

We set this attribute, albeit simply using getCurrentDirectory. This
might be an oversight, but I couldn't see a location where GHC stores
the compilation directory path.

 For modules, subroutines, variables, parameters, constants, types, and
 labels, the DW_AT_name attribute should contain the name of the
 corresponding program object as it appears in the source code

We make a best effort to provide a suitable name for every single
procedure. Note that a single function in Haskell might become multiple
subroutines in DWARF - or not appear at all due to inlining.

 In general, the value of DW_AT_name should be such that a
 fully-qualified name constructed from the DW_AT_name attributes of the
 object and its containing objects will uniquely represent that object
 in a form natural to the source language.

This would probably require us to have a DIE for modules. Not quite sure
how we would approach that.

 The producer may also generate a DW_AT_linkage_name attribute for
 program objects

We do that.

 In many cases, however, it is expensive for a consumer to parse the
 hierarchy, and the presence of the mangled name may be beneficial to
 performance.

This might be the underlying reason why it shows mangled names for
languages with unknown IDs (such as Haskell). We'll see whether Johan's
query to the GDB team brings some light into that.

Greetings,
  Peter Wortmann


___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs


Re: What does the DWARF information generated by your GHC branch look like?

2014-02-28 Thread Peter Wortmann

[copy of the dropped reply, for anybody interested]

Johan Tibell wrote:
 I enjoyed reading your paper [1] and I have some questions.

Thanks! The DWARF patches are currently under review for Trac #3693. Any
feedback would be very appreciated:
https://github.com/scpmw/ghc/commits/profiling-import

  * What does the generated DWARF information look like?

So far we generate:
- .debug_info: Information about all generated procedures and blocks.
- .debug_line: Source-code links for all generated code
- .debug_frame: Unwind information for the GHC stack
- .debug_ghc: Everything we can't properly represent as DWARF

  will you fill in the .debug_line section so that standard tools like
 perf report and gprof can be used on Haskell code?

Yes, even though from a few quick tests the results of perf report
aren't too useful, as source code links are pretty coarse and jump
around a lot - especially for optimised Haskell code. There's the option
to instead annotate with source code links to a generated .dump-simpl
file, which might turn out to be more useful.

 Code pointers would be appreciated.

Is this about how .debug_line information is generated? We take the same
approach as LLVM (and GCC, I think) and simply annotate the assembly
with suitable .file & .loc directives. That way we can leave all the
heavy lifting to the assembler.
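
Illustratively, the emitted assembly carries directives of roughly this
shape (file and line numbers made up), which the assembler turns into
.debug_line entries:

  .file 1 "Main.hs"
  .loc  1 42 0    # instructions below map to Main.hs, line 42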

Current patch is here:
https://github.com/scpmw/ghc/commit/c5294576

  * Does your GHC allow DWARF information to be generated without
 actually using any of the RTS (e.g. eventlog) machinery?

The RTS just serves as a DWARF interpreter for its own executable (+
libraries) in this scheme, so yes, it's fully independent. On the other hand,
having special code allows us to avoid a few subtleties about Haskell
code that are hard to communicate to standard debugging tools
(especially concerning stack tracing).

 Another way to ask the same question, do you have a ghc -g flag that
 has no implication for the runtime settings?

Right now -g does not affect the RTS at all. We might want to change
that at some point though so we can get rid of the libdwarf dependency.

  * Do you generate DW_TAG_subprogram sections in the .debug_info
 section so that other tools can figure out the name of Haskell
 functions?

Yes, we are setting the name attribute to a suitable Haskell name.
Sadly, at least GDB seems to ignore it and falls back to the symbol
name. I investigated this some time ago, and I think the reason was that
it doesn't recognize the Haskell language ID (which isn't standardized,
obviously). Simply pretending to be C(++) might fix this, but I would be
a bit scared of other side-effects.

Greetings,
  Peter Wortmann




___
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs