Re: Does GHC still support x87 floating point math?

2012-12-06 Thread Simon Marlow

On 06/12/12 11:01, Herbert Valerio Riedel wrote:

Ben Lippmeier b...@ouroborus.net writes:


On 06/12/2012, at 12:12 , Johan Tibell wrote:


I'm currently trying to implement word2Double#. Other such primops
support both x87 and sse floating point math. Do we still support x87
fp math? Which compiler flag enables it?


It's on by default unless you use the -sse2 flag. The x87 support is
horribly slow though. I don't think anyone would notice if you deleted
the x87 code and made SSE the default, especially now that we have the
LLVM code generator. SSE has been the way to go for over 10 years now.


btw, iirc GHC uses SSE2 for x86-64 code generation by default, and that
the -msse2 option has only an effect when generating x86(-32) code


Yes, because all x86_64 CPUs support SSE2.  Chips older than P4 don't 
support it.  I imagine there aren't too many of those around that people 
want to run GHC on, and as Ben says, there's always -fllvm.


Cheers,
Simon



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Emitting constants to the .data section from the NatM monad

2012-12-06 Thread Simon Marlow

On 06/12/12 00:29, Johan Tibell wrote:

Hi!

I'm trying to implement word2Double# and I've looked at how e.g. LLVM
does it. LLVM outputs quite clever branchless code that uses two
predefined constants in the .data section. Is it possible to add
contents to the current .data section from a function in the NatM
monad e.g.

 coerceWord2FP :: Width -> Width -> CmmExpr -> NatM Register

?
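The branchless scheme described here can be sketched in ordinary Haskell (a hedged reconstruction of the standard two-constant trick, not LLVM's actual output; the names and constants are illustrative):

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Word (Word64)
import GHC.Float (castWord64ToDouble)

-- Branchless Word64 -> Double: OR each 32-bit half of the word into the
-- mantissa of a "magic" double (2^52 for the low half, 2^84 for the high
-- half), then combine.  The two magic doubles are the constants that
-- would live in the .data section.
word2Double :: Word64 -> Double
word2Double w = (hi - magic) + lo
  where
    lo    = castWord64ToDouble (0x4330000000000000 .|. (w .&. 0xFFFFFFFF))
            -- == 2^52 + low 32 bits, exactly
    hi    = castWord64ToDouble (0x4530000000000000 .|. (w `shiftR` 32))
            -- == 2^84 + high 32 bits * 2^32, exactly
    magic = castWord64ToDouble 0x4530000000000000
          + castWord64ToDouble 0x4330000000000000  -- 2^84 + 2^52, exact

main :: IO ()
main = print (and [ word2Double w == fromIntegral w
                  | w <- [0, 1, 12345, 4294967296, 9223372036854775808, maxBound] ])
```

Only the final addition rounds, so the result agrees with fromIntegral on all inputs.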


Yes, you can emit data.  Look at the LDATA instruction in the X86 
backend, for example, and see how we generate things like table jumps.


So are you going to add the two missing MachOps, MO_UF_Conv & MO_FU_Conv?

Cheers,
Simon




Re: Dynamic libraries by default and GHC 7.8

2012-12-06 Thread Simon Marlow

On 06/12/12 21:35, Brandon Allbery wrote:

On Thu, Dec 6, 2012 at 4:04 PM, Simon Marlow marlo...@gmail.com wrote:

On 05/12/12 15:17, Brandon Allbery wrote:

Probably none; on most platforms you're actually generating
different
code (dynamic libraries require generation of position-independent

Sure there's a lot of differences in the generated code, but inside
GHC these differences only appear at the very last stage of the
pipeline, native code generation (or LLVM).  All the stages up to
that can be shared, which accounts for roughly 80% of compilation
time (IIRC).


I was assuming it would be difficult to separate those stages of the
internal compilation pipeline out, given previous discussions of how
said pipeline works.  (In particular I was under the impression
saving/restoring state in the pipeline to rerun the final phase with
multiple code generators was not really possible, and multithreading
them concurrently even less so.)


I don't think there's any problem (unless I've forgotten something).  In 
fact, the current architecture should let us compile one function at a 
time both ways, so we don't get a space leak by retaining all the Cmm code.


Cheers,
Simon





Re: Patch to enable GHC runtime system with thr_debug_p options...

2012-12-04 Thread Simon Marlow

On 03/12/12 20:11, Joachim Breitner wrote:

Dear Michał,

Am Sonntag, den 02.12.2012, 22:44 +0100 schrieb Michał J. Gajda:

On 12/02/2012 09:20 PM, Joachim Breitner wrote:

I noticed that Ubuntu, as well as Debian and the original packages, come
without some variants of the threaded debugging binaries.
A recent change added printing of a stack trace with the -xc option, which
requires using both -ticky and profiling compile options,
which in turn forces the program to be compiled against a -debug RTS way.
Since a stack trace is an indispensable debugging tool, and
convenient parallelization is a strength of Haskell,
I wonder: is there any remaining reason to leave beginners with a cryptic
error message
when they try to debug a parallel or threaded application and want to
take advantage of stack traces?

The resulting ghc-prof package would be increased by less than 1%.

Here is a patch for Ubuntu/Debian GHC 7.4.2 package, as well as upstream


--- ghc-7.4.2-orig/mk/config.mk.in  2012-06-06 19:10:25.0 +0200
+++ ghc-7.4.2/mk/config.mk.in   2012-12-01 00:22:29.055003842 +0100
@@ -256,7 +256,7 @@
  #   l   : event logging
  #   thr_l   : threaded and event logging
  #
-GhcRTSWays=l
+GhcRTSWays=l thr_debug_p thr_debug

  # Usually want the debug version
  ifeq "$(BootingFromHc)" "NO"


I notice that your patch modifies the defaults of GHC as shipped by
upstream, and I wonder if there is a reason why these ways are not
enabled by default.

Dear GHC HQ: Would you advise for or against providing an RTS in the
thr_debug_p and thr_debug ways in the Debian package?


thr_debug is already enabled by default.  thr_debug_p is not currently 
enabled, but only because we very rarely need it, so there wouldn't be 
any problem with enabling it by default.


Cheers,
Simon





Re: Is the GHC release process documented?

2012-11-30 Thread Simon Marlow

On 30/11/12 03:54, Johan Tibell wrote:

While writing a new nofib benchmark today I found myself wondering
whether all the nofib benchmarks are run just before each release,
which then drove me to go look for a document describing the release
process. A quick search didn't turn up anything, so I thought I'd ask
instead. Is there a documented GHC release process? Does it include
running nofib? If not, may I propose that we do so before each release
and compare the result to the previous release*.

* This likely means that nofib has to be run for the upcoming release
and the prior release each time a release is made, as numbers don't
translate well between machines so storing the results somewhere is
likely not that useful.


I used to do this on an ad-hoc basis: the nightly builds at MSR spit out 
nofib results that I compared against previous releases.


In practice you want to do this much earlier than just before a release, 
because it can take time to investigate and squash any discrepancies.


On the subject of the release process, I believe Ian has a checklist 
that he keeps promising to put on the wiki (nudge :)).


Cheers,
Simon




Re: Dynamic libraries by default and GHC 7.8

2012-11-30 Thread Simon Marlow

On 27/11/12 14:52, Ian Lynagh wrote:

GHC HEAD now has support for using dynamic libraries by default (and in
particular, using dynamic libraries and the system linker in GHCi) for a
number of platforms.

This has some advantages and some disadvantages, so we need to make a
decision about what we want to do in GHC 7.8. There are also some policy
questions we need to answer about how Cabal will work with a GHC that
uses dynamic libraries by default. We would like to make these as soon
as possible, so that GHC 7.6.2 can ship with a Cabal that works
correctly.

The various issues are described in a wiki page here:
 http://hackage.haskell.org/trac/ghc/wiki/DynamicByDefault

If you have a few minutes to read it then we'd be glad to hear your
feedback, to help us in making our decisions


It's hard to know what the best course of action is, because all the 
options have downsides.


Current situation:
 * fast code and compiler
 * but there are bugs in GHCi that are hard to fix, and an ongoing
   maintenance problem (the RTS linker).
 * binaries are not broken by library updates

Switching to dynamic:
 * slower code and compiler (by varying amounts depending
   on the platform)
 * but several bugs in GHCi are fixed, no RTS linker needed
 * binaries can be broken by library updates
 * can't do it on Windows (as far as we know)

Perhaps we should look again at the option that we discarded: making 
-static the default, and require a special option to build objects for 
use in GHCi.  If we also build packages both static+dynamic at the same 
time in Cabal, this might be a good compromise.


Static by default, GHCi is dynamic:
 * fast code and compiler
 * GHCi bugs are fixed, no maintenance problems
 * binaries not broken by library updates
 * we have to build packages twice in Cabal (but can improve GHC to
   emit both objects from a single compilation)
 * BUT, objects built with 'ghc -c' cannot be loaded into GHCi unless
   also built with -dynamic.
 * still can't do this on Windows

Cheers,
Simon




Re: Dynamic libraries by default and GHC 7.8

2012-11-29 Thread Simon Marlow

On 28/11/12 23:15, Johan Tibell wrote:

What does gcc do? Does it link statically or dynamically by default?
Does it depend on if it can find a dynamic version of libraries or
not?


If it finds a dynamic library first, it links against that.

Unlike GHC, with gcc you do not have to choose at compile-time whether 
you are later going to link statically or dynamically, although you do 
choose at compile-time to make an object for a shared library (-fPIC is 
needed).


When gcc links dynamically, it assumes the binary will be able to find 
its libraries at runtime, because they're usually in /lib or /usr/lib. 
Apps that ship with their own shared libraries and don't install into 
the standard locations typically have a wrapper script that sets 
LD_LIBRARY_PATH, or they use RPATH with $ORIGIN (a better solution).


Cheers,
Simon




Re: Dynamic libraries by default and GHC 7.8

2012-11-28 Thread Simon Marlow

On 27/11/12 23:28, Joachim Breitner wrote:

Hi,

Am Dienstag, den 27.11.2012, 14:52 + schrieb Ian Lynagh:

The various issues are described in a wiki page here:
 http://hackage.haskell.org/trac/ghc/wiki/DynamicByDefault

If you have a few minutes to read it then we'd be glad to hear your
feedback, to help us in making our decisions


here comes the obligatory butting in by the Debian Haskell Group:

Given the current sensitivity of the ABI hashes we really do not want to
have Programs written in Haskell have a runtime dependency on all the
included Haskell libraries. So I believe we should still link Haskell
programs statically in Debian.

Hence, Debian will continue to provide its libraries built the static
way.

Building them also in the dynamic way for the sake of GHCi users seems
possible.


So let me try to articulate the options, because I think there are some 
dependencies that aren't obvious here.  It's not a straightforward 
choice between -dynamic/-static being the default, because of the GHCi 
interaction.


Here are the 3 options:

(1) (the current situation) GHCi is statically linked, and -static is
the default.  Uses the RTS linker.

(2) (the proposal, at least for some platforms) GHCi is dynamically
linked, and -dynamic is the default.  Does not use the RTS linker.

(3) GHCi is dynamically linked, but -static is the default.  Does not
use the RTS linker.  Packages must be installed with -dynamic,
otherwise they cannot be loaded into GHCi, and only objects
compiled with -dynamic can be loaded into GHCi.

You seem to be saying that Debian would do (3), but we hadn't considered 
that as a viable option because of the extra hoops that GHCi users would 
have to jump through.  We consider it a prerequisite that GHCi continues 
to work without requiring any extra flags.


Cheers,
Simon





Open question: What should GHC on Debian do when building binaries,
given that all libraries are likely available in both ways – shared or
static. Shared means that all locally built binaries (e.g. xmonad!) will
suddenly break when the user upgrades their Haskell packages, as the
package management is ignorant of unpackaged, locally built programs.
I’d feel more comfortable if that could not happen.

Other open question: Should we put the dynamic libraries in the normal
libghc-*-dev package? Con: Package size doubles (and xmonad users are
already shocked by the size of stuff they need to install). Pro: It
cannot happen that I can build Foo.hs statically, but not load it in
GHCi, or vice-versa.

I still find it unfortunate that one cannot use the .so for static
linking as well, but that is a problem beyond the scope of GHC.

Greetings,
Joachim









Re: Dynamic libraries by default and GHC 7.8

2012-11-28 Thread Simon Marlow

On 27/11/12 14:52, Ian Lynagh wrote:


Hi all,

GHC HEAD now has support for using dynamic libraries by default (and in
particular, using dynamic libraries and the system linker in GHCi) for a
number of platforms.

This has some advantages and some disadvantages, so we need to make a
decision about what we want to do in GHC 7.8. There are also some policy
questions we need to answer about how Cabal will work with a GHC that
uses dynamic libraries by default. We would like to make these as soon
as possible, so that GHC 7.6.2 can ship with a Cabal that works
correctly.

The various issues are described in a wiki page here:
 http://hackage.haskell.org/trac/ghc/wiki/DynamicByDefault


Thanks for doing all the experiments and putting this page together, it
certainly helps us to make a more informed decision.


If you have a few minutes to read it then we'd be glad to hear your
feedback, to help us in making our decisions


My personal opinion is that we should switch to dynamic-by-default on 
all x86_64 platforms, and OS X x86. The performance penalty for 
x86/Linux is too high (30%), and there are fewer bugs affecting the 
linker on that platform than OS X.


I am slightly concerned about the GC overhead on x86_64/Linux (8%), but 
I think the benefits outweigh the penalty there, and I can probably 
investigate to find out where the overhead is coming from.


Cheers,
Simon






Re: Dynamic libraries by default and GHC 7.8

2012-11-28 Thread Simon Marlow

On 28/11/12 12:48, Ian Lynagh wrote:

On Wed, Nov 28, 2012 at 09:20:57AM +, Simon Marlow wrote:


My personal opinion is that we should switch to dynamic-by-default
on all x86_64 platforms, and OS X x86. The performance penalty for
x86/Linux is too high (30%),


FWIW, if they're able to move from x86 static to x86_64 dynamic then
there's only a ~15% difference overall:

Run Time
-1 s.d. -   -18.7%
+1 s.d. -   +60.5%
Average -   +14.2%

Mutator Time
-1 s.d. -   -29.0%
+1 s.d. -   +33.7%
Average -   -2.6%

GC Time
-1 s.d. -   +22.0%
+1 s.d. -   +116.1%
Average -   +62.4%


The figures on the wiki are different: x86 static -> x86_64 dynamic has 
+2.3% runtime. What's going on here?


I'm not sure I buy the argument that it's ok to penalise x86/Linux users 
by 30% because they can use x86_64 instead, which is only 15% slower. 
Unlike OS X, Linux users using the 32-bit binaries probably have a 
32-bit Linux installation, which can't run 64-bit binaries (32-bit is 
still the recommended Ubuntu installation for desktops, FWIW).


Cheers,
Simon




Leaving Microsoft

2012-11-22 Thread Simon Marlow

Today I'm announcing that I'm leaving Microsoft Research.

My plan is to take a break to finish the book on Parallel and
Concurrent Haskell for O'Reilly, before taking up a position at
Facebook in the UK in March 2013.

This is undoubtedly a big change, both for me and for the Haskell
community.  I'll be stepping back from full-time GHC development and
research and heading into industry, hopefully to use Haskell.  It's an
incredibly exciting opportunity for me, and one that I hope will
ultimately be a good thing for Haskell too.

What does this mean for GHC? Obviously I'll have much less time to
work on GHC, but I do hope to find time to fix a few bugs and keep
things working smoothly. Simon Peyton Jones will still be leading the
project, and we'll still have support from Ian Lynagh, and of course
the community of regular contributors. Things are in a reasonably
stable state - there haven't been any major architectural changes in
the RTS lately, and while we have just completed the switchover to the
new code generator, I've been working over the past few weeks to
squeeze out all the bugs I can find, and I'll continue to do that over
the coming months up to the 7.8.1 release.

In due course I hope that GHC can attract more of you talented hackers
to climb the learning curve and start working on the internals, in
particular the runtime and code generators, and I'll do my best to
help that happen.

Cheers,
Simon




[Haskell] Leaving Microsoft

2012-11-22 Thread Simon Marlow

Today I'm announcing that I'm leaving Microsoft Research.

My plan is to take a break to finish the book on Parallel and
Concurrent Haskell for O'Reilly, before taking up a position at
Facebook in the UK in March 2013.

This is undoubtedly a big change, both for me and for the Haskell
community.  I'll be stepping back from full-time GHC development and
research and heading into industry, hopefully to use Haskell.  It's an
incredibly exciting opportunity for me, and one that I hope will
ultimately be a good thing for Haskell too.

What does this mean for GHC? Obviously I'll have much less time to
work on GHC, but I do hope to find time to fix a few bugs and keep
things working smoothly. Simon Peyton Jones will still be leading the
project, and we'll still have support from Ian Lynagh, and of course
the community of regular contributors. Things are in a reasonably
stable state - there haven't been any major architectural changes in
the RTS lately, and while we have just completed the switchover to the
new code generator, I've been working over the past few weeks to
squeeze out all the bugs I can find, and I'll continue to do that over
the coming months up to the 7.8.1 release.

In due course I hope that GHC can attract more of you talented hackers
to climb the learning curve and start working on the internals, in
particular the runtime and code generators, and I'll do my best to
help that happen.

Cheers,
Simon


___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: Undocumented(?) magic this package-id in PackageImports

2012-11-13 Thread Simon Marlow

Please submit a bug (ideally with a patch!).  It should be documented.

However, note that we don't really like people to use PackageImports.
It's not a carefully designed feature, we only hacked it in so we could 
build the base-3 wrapper package a while ago. It could well change in 
the future.


Cheers,
Simon

On 13/11/2012 12:30, Herbert Valerio Riedel wrote:

Hello Simon,

I just found out that in combination with the PackageImports extension
there's a special package name "this", which according to [1] always
refers to the current package. But I couldn't find this rather useful
feature mentioned in the GHC 7.6.1 Manual's PackageImports section [2]. Has
this been omitted on purpose from the documentation?

Cheers,
   hvr

  [1]: 
https://github.com/ghc/ghc/commit/436a5fdbe0c9a466569abf1d501a6018aaa3e49e
  [2]: 
http://www.haskell.org/ghc/docs/latest/html/users_guide/syntax-extns.html#package-imports





Re: Using DeepSeq for exception ordering

2012-11-13 Thread Simon Marlow

On 12/11/2012 16:56, Simon Hengel wrote:

Did you try -fpedantic-bottoms?


I just tried.  The exception (or seq?) is still optimized away.

Here is what I tried:

 -- file Foo.hs
 import Control.Exception
 import Control.DeepSeq
 main = evaluate (('a' : undefined) `deepseq` return () :: IO ())

 $ ghc -fforce-recomp -fpedantic-bottoms -O Foo.hs && ./Foo && echo bar
 [1 of 1] Compiling Main ( Foo.hs, Foo.o )
 Linking Foo ...
 bar


Sounds like a bug, -fpedantic-bottoms should work here.  Please open a 
ticket.


Cheers,
Simon




Re: Using DeepSeq for exception ordering

2012-11-12 Thread Simon Marlow

Did you try -fpedantic-bottoms?

Cheers,
Simon

On 08/11/2012 19:16, Edward Z. Yang wrote:

It looks like the optimizer is getting confused when the value being
evaluated is an IO action (nota bene: 'evaluate m' where m :: IO a
is pretty odd, as far as things go). File a bug?

Cheers,
Edward

Excerpts from Albert Y. C. Lai's message of Thu Nov 08 10:04:15 -0800 2012:

On 12-11-08 01:01 PM, Nicolas Frisby wrote:

And the important observation is: all of them throw A if interpreted in
ghci or compiled without -O, right?


Yes.









Re: Building GHC for BB10 (QNX)

2012-11-12 Thread Simon Marlow

On 10/11/2012 19:53, Stephen Paul Weber wrote:

Hey all,

I'm interested in trying to get an initial port for BlackBerry 10 (QNX)
going.  It's a POSIXish environment with primary interest in two
architectures: x86 (for the simulator) and ARMv7 (for devices).

I'm wondering if
http://hackage.haskell.org/trac/ghc/wiki/Building/Porting is fairly
up-to-date or not?  Is there a better place I should be looking?

One big difference (which may turn out to be a problem) is that the
readily-available QNX compilers (gcc ports) are cross-compilers.  I
realise that GHC has no good support to act as a cross-compiler yet, and
anticipate complications arising from trying to build GHC using a
cross-compiler for bootstrapping (since that implies GHC acting as a
cross-compiler at some point in the bootstrapping).

Any suggestions would be very welcome.


Cross-compilation is the way to port GHC at the moment, although 
unfortunately our support for cross-compilation is currently under 
development and is not particularly robust.  Some people have managed to 
port GHC using this route in recent years (e.g. the iPhone port).  For 
the time being, you will need to be able to diagnose and fix problems 
yourself in the GHC build system to get GHC ported.


Ian Lynagh is currently looking into cross-compilation and should be 
able to tell you more.


Cheers,
Simon




Re: Status of stack trace work

2012-11-08 Thread Simon Marlow

On 08/11/12 05:43, Johan Tibell wrote:


I can't wait until we have some form of stack traces in GHC. What's the
current status? Did the semantics you presented at HIW'12 work out? Even
though the full bells and whistles of complete stack traces are something I'd
really like to see, even their more impoverished cousin, the lexical
stack trace, would be helpful in tracking down just which call to head
gave rise to a "head: empty list" error.


The profiler currently uses the stack tracing scheme I described in that 
talk, and you can use it to chase head [] right now (with +RTS -xc).


You can also use the GHC.Stack.errorWithStackTrace function that I 
demonstrated in the talk; I added it to GHC.Stack after 7.6.1, but the 
code should work with 7.6.1 if you import GHC.Stack:


-- | Like the function 'error', but appends a stack trace to the error
-- message if one is available.
errorWithStackTrace :: String -> a
errorWithStackTrace x = unsafeDupablePerformIO $ do
   stack <- ccsToStrings =<< getCurrentCCS x
   if null stack
      then throwIO (ErrorCall x)
      else throwIO (ErrorCall (x ++ '\n' : renderStack stack))
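Made self-contained, the function above runs as-is even without a profiling build (getCurrentCCS then returns an empty stack, so it degrades to a plain ErrorCall; the main here is just a hypothetical demo harness):

```haskell
import Control.Exception (ErrorCall (..), evaluate, throwIO, try)
import GHC.Stack.CCS (ccsToStrings, getCurrentCCS, renderStack)
import System.IO.Unsafe (unsafeDupablePerformIO)

-- The function from the message, with its imports made explicit.
errorWithStackTrace :: String -> a
errorWithStackTrace x = unsafeDupablePerformIO $ do
  stack <- ccsToStrings =<< getCurrentCCS x
  if null stack
    then throwIO (ErrorCall x)
    else throwIO (ErrorCall (x ++ '\n' : renderStack stack))

main :: IO ()
main = do
  r <- try (evaluate (errorWithStackTrace "boom" :: Int))
  case r of
    -- print only the first line: any stack trace follows after a newline
    Left (ErrorCall msg) -> putStrLn ("caught: " ++ takeWhile (/= '\n') msg)
    Right n              -> print n
```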


I realise that compiling with profiling is not always practical, and not 
as immediate as we'd like, and also it doesn't work in GHCi.  What I 
think we should do is


 (a) add stack trace support to GHCi, so we would get stack traces
 for interpreted code

 (b) incorporate the work of Peter Wortmann and Nathan Howell to get
 DWARF information into GHC binaries, and use this to get
 execution stacks without needing to compile for profiling

I'd love to see people working on (b) especially, and I'll be happy to 
provide direction or pointers to anyone who's interested (I'm no DWARF 
expert though).  The stacks you get this way won't be as nice as the 
ones we get from the profiler, and there will be absolutely no 
guarantees about the quality of the information, or even that you'll get 
the same stack with -O as you get without.  But some information is 
better than no information for debugging purposes.



Once we do have some sort of stack traces, could we have throw
automatically attach it to the exception, so we can get a printed stack
trace upon crash? Is that how e.g. Java deals with that? Will that make
other uses of exceptions (such as throwing async exceptions to kill
threads) get much more expensive if we try to attach stack traces? A
frequent user of async exceptions are web servers that start a timeout
call per request.


One way to do this is to provide a variant of catch that grabs the stack 
trace, e.g.


  catchStack :: Exception e => IO a -> (e -> Stack -> IO a) -> IO a

and this could be implemented cheaply, because the stack only needs to 
be constructed when it is being caught by catchStack.  However, this 
doesn't work well when an exception is rethrown, such as when it passes 
through a nest of brackets.  To make that work, you really have to 
attach the stack trace to the exception.  We could change our 
SomeException type to include a stack trace, and that would be fairly 
cheap to use with profiling (because capturing the current stack is 
free), but it would be more expensive with DWARF stacks so you'd want a 
runtime option to enable it.
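The proposed API shape can be sketched as follows (hypothetical: this version reads the cost-centre stack from inside the handler, whereas the cheap implementation described above would capture the stack at the raise site, which needs RTS support):

```haskell
import Control.Exception (ErrorCall (..), Exception, catch, throwIO)
import GHC.Stack.CCS (currentCallStack)

type Stack = [String]

-- Hypothetical catchStack with the signature proposed in the message.
catchStack :: Exception e => IO a -> (e -> Stack -> IO a) -> IO a
catchStack act handler =
  act `catch` \e -> do
    stack <- currentCallStack   -- empty unless built for profiling
    handler e stack

main :: IO ()
main = do
  msg <- catchStack (throwIO (ErrorCall "boom"))
                    (\e _stack -> return ("caught: " ++ show (e :: ErrorCall)))
  putStrLn msg
```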


Cheers,
Simon




Re: Why are arrays retainers?

2012-10-30 Thread Simon Marlow

On 24/10/2012 08:25, Akio Takano wrote:


Recently I was surprised to see that GHC's retainer profiler treated
boxed arrays as retainer objects. Why are they considered retainers?


I think the intention is to treat all mutable objects as retainers, 
where thunks are a kind of mutable object.  From the code it looks like 
both MutableArray# and Array# are treated as retainers though, which is 
probably a mistake.  I'll fix it so that Array# is not a retainer, and 
update the docs to mention that mutable objects are retainers.


Cheers,
Simon





Re: GHC compiling shared library linking problem

2012-10-22 Thread Simon Marlow

On 17/10/2012 08:43, Richard Zetterberg wrote:

I have two parts of my application; one which builds a cli application
which uses my module and one which builds a shared library which uses
the same module. I have no problems compiling my cli application. And if
I try to compile the shared library without the foreign export and the
functions in Shared.hs I get another linker error:

 /usr/bin/ld: ../src/Assembly.o: relocation R_X86_64_32S against
`stg_CAF_BLACKHOLE_info` can not be used when making a shared object;
recompile with -fPIC
 ../src/Assembly.o: could not read symbols: Bad value
collect2: error: ld returned 1 exit status

(I forgot to add that I'm using the haskell-platform package in Debian
sid (unstable). Here is the package and its dependencies:
http://packages.debian.org/sid/haskell-platform.)


I think you're trying to make a shared library from Haskell code, right? 
 You don't say what platform, but it looks like x86_64/Linux or maybe 
OS X.  On these platforms, code that goes into a shared library must be 
compiled with -fPIC.  You will also need to compile with -dynamic, in 
order to link against the shared versions of the RTS and the other 
Haskell libraries.


Cheers,
Simon







Best regards
Richard








Re: [Haskell-cafe] Safe Haskell and instance coherence

2012-10-11 Thread Simon Marlow

On 08/10/2012 20:11, Mikhail Glushenkov wrote:

Hello,

It's a relatively well-known fact that GHC allows for multiple type
class instances for the same type to coexist in a single program. This
can be used, for example, to construct values of the type Data.Set.Set
that violate the data structure invariant. I was mildly surprised to
find out that this works even when Safe Haskell is turned on:

https://gist.github.com/3854294

Note that the warnings tell us that both instances are [safe] which
gives a false sense of security.

I couldn't find anything on the interplay between orphan instances and
Safe Haskell both in the Haskell'12 paper and online. Is this
something that the authors of Safe Haskell are aware of/are intending
to fix?


A fine point.  Arguably this violates the module abstraction guarantee, 
because you are able to discover something about the implementation of 
Set by violating its assumption that the Ord instance for a given type 
is always the same.


I don't know what we should do about this.  Disallowing orphan instances 
seems a bit heavy-handed. David, Simon, any thoughts?


(can someone forward this to David Mazieres? all the email addresses I 
have for him seem to have expired :-)


Cheers,
Simon




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Installing binary tarball fails on Linux

2012-10-09 Thread Simon Marlow

On 08/10/2012 12:57, Joachim Breitner wrote:

Hi,

Am Montag, den 08.10.2012, 12:08 +0100 schrieb Simon Marlow:

On 01/10/2012 13:00, Ganesh Sittampalam wrote:

On 01/10/2012 12:05, Simon Marlow wrote:


This probably means that you have packages installed in your ~/.cabal
from a 32-bit GHC and you're using a 64-bit one, or vice-versa.  To
avoid this problem you can configure cabal to put built packages into a
directory containing the platform name.


How does one do this? I ran into this problem a while ago and couldn't
figure it out:
http://stackoverflow.com/questions/12393750/how-can-i-configure-cabal-to-use-different-folders-for-32-bit-and-64-bit-package


I do this at work where I share the same home dir between several
different machines, and my .cabal/config contains

install-dirs user
prefix: /home/simonmar/.cabal
bindir: $prefix/bin/$arch-$os
-- libdir: $prefix/lib
libsubdir: $pkgid/$compiler/$arch-$os
-- libexecdir: $prefix/libexec
-- datadir: $prefix/share
-- datasubdir: $pkgid
-- docdir: $datadir/doc/$pkgid
-- htmldir: $docdir/html
-- haddockdir: $htmldir


any chance of making that the default, at least for libsubdir? I also
stumble over it when I use a i386 schroot to test stuff.


Head over to https://github.com/haskell/cabal/ and create an issue (or a 
pull request!).


Cheers,
Simon





Re: copyArray# bug

2012-10-09 Thread Simon Marlow

On 09/10/2012 15:58, Johan Tibell wrote:

On Tue, Oct 9, 2012 at 1:26 AM, Roman Leshchinskiy r...@cse.unsw.edu.au wrote:

Johan Tibell wrote:

Hi,

I did quite a bit of work to make sure copyArray# and friends get
unrolled if the number of elements to copy is a constant. Does this
still work with the extra branch?


I would expect it to but I don't know. Does the testsuite check for this?


Simon, the assembly testing support I added would be very useful now.
It would tell us if this change preserved unrolling or not.


Yes yes, I need to get to that :-)

Cheers,
Simon





Re: Installing binary tarball fails on Linux

2012-10-08 Thread Simon Marlow

On 01/10/2012 13:00, Ganesh Sittampalam wrote:

On 01/10/2012 12:05, Simon Marlow wrote:


This probably means that you have packages installed in your ~/.cabal
from a 32-bit GHC and you're using a 64-bit one, or vice-versa.  To
avoid this problem you can configure cabal to put built packages into a
directory containing the platform name.


How does one do this? I ran into this problem a while ago and couldn't
figure it out:
http://stackoverflow.com/questions/12393750/how-can-i-configure-cabal-to-use-different-folders-for-32-bit-and-64-bit-package


I do this at work where I share the same home dir between several 
different machines, and my .cabal/config contains


install-dirs user
  prefix: /home/simonmar/.cabal
  bindir: $prefix/bin/$arch-$os
  -- libdir: $prefix/lib
  libsubdir: $pkgid/$compiler/$arch-$os
  -- libexecdir: $prefix/libexec
  -- datadir: $prefix/share
  -- datasubdir: $pkgid
  -- docdir: $datadir/doc/$pkgid
  -- htmldir: $docdir/html
  -- haddockdir: $htmldir


Hope this helps.

Cheers,
Simon



Re: copyArray# bug

2012-10-08 Thread Simon Marlow

On 06/10/2012 22:41, Roman Leshchinskiy wrote:

I've been chasing a segfault in the dev version of vector and I think I
finally traced it to a bug in the implementation of copyArray# and
copyMutableArray#. More specifically, I think emitSetCards in
StgCmmPrim.hs (and CgPrimOp.hs) will sometimes fail to mark the last
card as dirty because in the current implementation, the number of cards
to mark is computed solely from the number of copied elements while it
really depends on which cards the first and the last elements belong to.
That is, the number of elements to copy might be less than the number of
elements per card but the copied range might still span two cards.

The attached patch fixes this (and the segfault in vector) and also
makes copyArray# return immediately if the number of elements to copy is
0. Could someone who is familiar with the code please review it and tell
me if it looks sensible. If it does, I'll make the same modification to
CgPrimOp.hs (which has exactly the same code) and commit. Unfortunately,
I have no idea how to write a testcase for this since the bug is only
triggered in very specific circumstances.

It seems that all released versions of GHC that implement
copyArray#/copyMutableArray# have this problem. At least, vector's
testsuite now segfaults with all of them in roughly the same place after
recent modifications I've made (which involve calling copyArray# a lot).
If I'm right then I would suggest not to use copyArray# and
copyMutableArray# for GHC < 7.8.


Nice catch!

Just to make sure I'm understanding: the conditional you added is not 
just an optimisation, it is required because otherwise the memset() call 
will attempt to mark a single card. (this was the bug I fixed last 
time I touched this code, but I think I might have inadvertently 
introduced the bug you just fixed)
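
The card arithmetic at issue can be sketched in plain Haskell (the 2^7 = 128 elements-per-card figure here is an illustrative assumption; the RTS derives it from its configured log2 card size):

```haskell
import Data.Bits (shiftL, shiftR)

cardBits :: Int
cardBits = 7  -- assumed card size: 2^7 = 128 elements per card

-- Which card a given element index falls in.
card :: Int -> Int
card i = i `shiftR` cardBits

-- Buggy: number of cards computed from the element count alone
-- (round the count up to whole cards, ignoring the offset).
cardsFromCount :: Int -> Int -> Int
cardsFromCount _off n = (n + (1 `shiftL` cardBits) - 1) `shiftR` cardBits

-- Fixed: derived from the cards of the first and last copied elements.
cardsFromRange :: Int -> Int -> Int
cardsFromRange off n = card (off + n - 1) - card off + 1

main :: IO ()
main = do
  -- Copying 2 elements at offset 127 straddles cards 0 and 1,
  -- but the count-only formula marks just one card.
  print (cardsFromCount 127 2)  -- 1
  print (cardsFromRange 127 2)  -- 2
```

This is exactly the case described above: fewer elements than a card's worth, yet the copied range spans two cards.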


Please go ahead and commit.  Note that CgPrimOp is scheduled for 
demolition very shortly, but the bug will need to be fixed there in the 
7.6 branch.


Cheers,
Simon





Roman


patch


diff --git a/compiler/codeGen/StgCmmPrim.hs b/compiler/codeGen/StgCmmPrim.hs
index cbb2aa7..6c291f1 100644
--- a/compiler/codeGen/StgCmmPrim.hs
+++ b/compiler/codeGen/StgCmmPrim.hs
@@ -1069,27 +1069,30 @@ emitCopyArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> FCode ()
 emitCopyArray copy src0 src_off0 dst0 dst_off0 n0 = do
     dflags <- getDynFlags
-    -- Passed as arguments (be careful)
-    src <- assignTempE src0
-    src_off <- assignTempE src_off0
-    dst <- assignTempE dst0
-    dst_off <- assignTempE dst_off0
     n <- assignTempE n0
+    nonzero <- getCode $ do
+        -- Passed as arguments (be careful)
+        src <- assignTempE src0
+        src_off <- assignTempE src_off0
+        dst <- assignTempE dst0
+        dst_off <- assignTempE dst_off0
 
-    -- Set the dirty bit in the header.
-    emit (setInfo dst (CmmLit (CmmLabel mkMAP_DIRTY_infoLabel)))
+        -- Set the dirty bit in the header.
+        emit (setInfo dst (CmmLit (CmmLabel mkMAP_DIRTY_infoLabel)))
 
-    dst_elems_p <- assignTempE $ cmmOffsetB dflags dst (arrPtrsHdrSize dflags)
-    dst_p <- assignTempE $ cmmOffsetExprW dflags dst_elems_p dst_off
-    src_p <- assignTempE $ cmmOffsetExprW dflags (cmmOffsetB dflags src (arrPtrsHdrSize dflags)) src_off
-    bytes <- assignTempE $ cmmMulWord dflags n (mkIntExpr dflags (wORD_SIZE dflags))
+        dst_elems_p <- assignTempE $ cmmOffsetB dflags dst (arrPtrsHdrSize dflags)
+        dst_p <- assignTempE $ cmmOffsetExprW dflags dst_elems_p dst_off
+        src_p <- assignTempE $ cmmOffsetExprW dflags (cmmOffsetB dflags src (arrPtrsHdrSize dflags)) src_off
+        bytes <- assignTempE $ cmmMulWord dflags n (mkIntExpr dflags (wORD_SIZE dflags))
 
-    copy src dst dst_p src_p bytes
+        copy src dst dst_p src_p bytes
 
-    -- The base address of the destination card table
-    dst_cards_p <- assignTempE $ cmmOffsetExprW dflags dst_elems_p (loadArrPtrsSize dflags dst)
+        -- The base address of the destination card table
+        dst_cards_p <- assignTempE $ cmmOffsetExprW dflags dst_elems_p (loadArrPtrsSize dflags dst)
 
-    emitSetCards dst_off dst_cards_p n
+        emitSetCards dst_off dst_cards_p n
+
+    emit =<< mkCmmIfThen (cmmNeWord dflags n (mkIntExpr dflags 0)) nonzero
 
 -- | Takes an info table label, a register to return the newly
 -- allocated array in, a source array, an offset in the source array,
@@ -1142,10 +1145,11 @@ emitSetCards :: CmmExpr -> CmmExpr -> CmmExpr -> FCode ()
 emitSetCards dst_start dst_cards_start n = do
     dflags <- getDynFlags
     start_card <- assignTempE $ card dflags dst_start
+    end_card <- assignTempE $ card dflags (cmmSubWord dflags (cmmAddWord dflags dst_start n) (mkIntExpr dflags 1))
     emitMemsetCall (cmmAddWord dflags dst_cards_start start_card)
-        (mkIntExpr dflags 1)
-        (cardRoundUp dflags n)
-        (mkIntExpr dflags 1) -- no alignment (1 byte)
+   

Re: Comments on current TypeHoles implementation

2012-10-05 Thread Simon Marlow


On 04/10/2012 10:40, Simon Peyton-Jones wrote:


I have a proposal.  Someone has already suggested on
hackage.haskell.org/trac/ghc/ticket/5910 that an unbound variable
behaves like a hole.  Thus, if you say

   f x = y

GHC says “Error: y is not in scope”.  But (idea) with -XTypeHoles

f x = y

might generate

1.(renamer) *Warning*: y is not in scope

2.(type) *Error*: Hole “y” has type

So that’s like a named hole, in effect.

If you say

f x = 4

GHC warns about the unused binding for x.  But if you say

f _x = 4

the unused-binding warning is suppressed.  So (idea) if you say

   f x = _y

maybe we can suppress warning (1).  And, voila, named holes.

Moreover if you add -fdefer-type-errors you can keep going and run the
program.

Any comments?  This is pretty easy to do.


It's a great idea.  I suggest that we have a separate flag that controls 
whether an unbound variable results in a warning or an error, rather 
than piggybacking on -fdefer-type-errors.  Perhaps 
-fdefer-unbound-errors or something.


What I'm aiming at is that eventually we can have -fdefer-errors that 
expands to -fdefer-type-errors, -fdefer-unbound-errors, 
-fdefer-parse-errors, etc.


Cheers,
Simon



(I'm unhappy that -XTypeHoles is a language pragma while
-fdefer-type-errors is a compiler flag.  Maybe we should have
-XDeferTypeErrors?)

Simon

*From:*sean.leat...@gmail.com [mailto:sean.leat...@gmail.com] *On Behalf
Of *Sean Leather
*Sent:* 03 October 2012 16:45
*To:* Simon Peyton-Jones
*Cc:* GHC Users List; Thijs Alkemade
*Subject:* Comments on current TypeHoles implementation

Hi Simon,

Thanks for all your work in getting TypeHoles into HEAD. We really
appreciate it.

I was playing around with HEAD today and wanted to share a few observations.

(1) One of the ideas we had was that a hole `_' would be like
`undefined' but with information about the type and bindings. But in the
current version, there doesn't appear to be that connection. This mainly
applies to ambiguous type variables.

Consider:

  f = show _

The hole has type a0.

But with

  f = show undefined

there is a type error because a0 is ambiguous.

We were thinking that it would be better to report the ambiguous type
variable first, rather than the hole. In that case, you can use
-fdefer-type-errors to defer the error. Currently, you don't have that
option. I can see the argument either way, however, and I'm not sure
which is better.

(2) There is a strange case where an error is not reported for a missing
type class instance, even though there is no (apparent) relation between
the missing instance and the hole. (This also relates to the connection
to `undefined', but less directly.)

We have the following declaration:

  data T = T Int {- no Show instance -}

With a hole in the field

  g = show (T _)

we get a message that the hole has type Int.

With

  g = show (T undefined)

we get an error for the missing instance of `Show T'.

(3) In GHCi, I see that the type of the hole now defaults. This is not
necessarily bad, though it's maybe not as useful as it could be.

ghci> :t show _

reports that the hole has type ().

(4) In GHCi, sometimes a hole throws an exception, and sometimes it does
not.

ghci> show _

throws an exception with the hole warning message

ghci> show (T _)

and

ghci> _ + 42

cause GHCi to panic.

(5) There are some places where unnecessary parentheses are used when
pretty-printing the code:

ghci> :t _ _

<interactive>:1:1: Warning:

 Found hole `_' with type t0 -> t

 Where: `t0' is a free type variable

`t' is a rigid type variable bound by

the inferred type of it :: t at Top level

 In the expression: _

 In the expression: _ (_)

<interactive>:1:3: Warning:

 Found hole `_' with type t0

 Where: `t0' is a free type variable

 In the first argument of `_', namely `_'

 In the expression: _ (_)

_ _ :: t

The argument `_' does not need to be printed as `(_)'.

There is also the small matter, in this example, of distinguishing which
`_' is which. The description works, but you have to think about it. I
don't have an immediate and simple solution to this. Perhaps the
addition of unique labels (e.g. _$1 _$2). But this is not a major
problem. It can even wait until some future development/expansion on
TypeHoles.

Regards,

Sean





Re: memory fragmentation with ghc-7.6.1

2012-10-01 Thread Simon Marlow

Hi Ben,

My guess would be that you're running into some kind of memory 
bottleneck.  Three common ones are:


  (1) total memory bandwidth
  (2) cache ping-ponging
  (3) NUMA overheads

You would run into (1) if you were using an allocation area size (-A or 
-H) larger than the L2 cache.  Your stats seem to indicate that you're 
running with a large heap - could that be the case?


(2) happens if you share data a lot between cores.  It can also happen 
if the RTS shares data between cores, but I've tried to squash as much 
of that as I can.


(3) is sadly something that happens on these large AMD machines (and to 
some extent large multicore Intel boxes too).  Improving our NUMA 
support is something we really need to do.  NUMA overheads tend to 
manifest as very unpredictable runtimes.


I suggest using perf to gather some low-level stats about cache misses 
and suchlike.


  http://hackage.haskell.org/trac/ghc/wiki/Debugging/LowLevelProfiling/Perf

Cheers,
Simon


On 29/09/2012 07:47, Ben Gamari wrote:

Simon Marlow marlo...@gmail.com writes:


On 28/09/12 17:36, Ben Gamari wrote:

Unfortunately, after poking around I found a few obvious problems with
both the code and my testing configuration which explained the
performance drop. Things seem to be back to normal now. Sorry for the
noise! Great job on the new codegen.


That's good to hear, thanks for letting me know!


Of course!

That being said, I have run in to a bit of a performance issue which
could be related to the runtime system. In particular, as I scale up in
thread count (from 6 up to 48, the core count of the machine) in my
program[1] (test data available), I'm seeing the total runtime increase,
as well as a corresponding increase in CPU-seconds used. This despite
the RTS claiming consistently high (~94%) productivity. Meanwhile
Threadscope shows that nearly all of my threads are working busily with
very few STM retries and no idle time. This in an application which
should scale reasonably well (or so I believe). Attached below you will
find a crude listing of various runtime statistics over a variety of
thread counts (ranging into what should probably be regarded as the
absurdly large).

The application is a parallel Gibbs sampler for learning probabilistic
graphical models. It involves a set of worker threads (updateWorkers)
pulling work units off of a common TQueue. After grabbing a work unit,
the thread will read a reference to the current global state from an
IORef. It will then begin a long-running calculation, resulting in a
small value (forced to normal form with deepseq) which it then
communicates back to a global update thread (diffWorker) in the form of
a lambda through another TQueue. The global update thread then maps the
global state (the same as was read from the IORef earlier) through this
lambda with atomicModifyIORef'. This is all implemented in [2].

I do understand that I'm asking a lot of the language and I have been
quite impressed by how well Haskell and GHC have stood up to the
challenge thus far. That being said, the behavior I'm seeing seems a bit
strange. If synchronization overhead were the culprit, I'd expect to
observe STM retries or thread blocking, which I do not see (by eye it
seems that STM retries occur on the order of 5/second and worker threads
otherwise appear to run uninterrupted except for GC; GHC event log
from a 16 thread run available here[3]). If GC were the problem, I would
expect this to be manifested in the productivity, which it is clearly
not. Do you have any idea what else might be causing such extreme
performance degradation with higher thread counts? I would appreciate
any input you would have to offer.

Thanks for all of your work!

Cheers,

- Ben


[1] https://github.com/bgamari/bayes-stack/v2
[2] https://github.com/bgamari/bayes-stack/blob/v2/BayesStack/Core/Gibbs.hs
[3] http://goldnerlab.physics.umass.edu/~bgamari/RunCI.eventlog



Performance of Citation Influence model on lda-handcraft data set
1115 arcs, 702 nodes, 50 items per node average
100 sweeps in blocks of 10, 200 topics
Running with +RTS -A1G
ghc-7.7 9c15249e082642f9c4c0113133afd78f07f1ade2

Cores  User time (s)  Walltime (s)  CPU %  Productivity
=====  =============  ============  =====  ============
2      488.66         269.41        188%   93.7%
3      533.43         195.28        281%   94.1%
4      603.92         166.94        374%   94.3%
5      622.40         138.16        466%   93.8%
6      663.73         123.00        558%   94.2%
7      713.96         114.17        647%   94.0%
8      724.66         101.98        736%   93.7%
9      802.75         100.59        826%
10     865.05         97.69         917%
11     966.97         99.09         1010%
12     1238.42        114.28        1117%
13     1242.43        106.53        1206%
14     1428.59        112.48

Re: memory fragmentation with ghc-7.6.1

2012-09-26 Thread Simon Marlow

On 26/09/2012 05:42, Ben Gamari wrote:

Simon Marlow marlo...@gmail.com writes:


On 21/09/2012 04:07, John Lato wrote:

Yes, that's my current understanding.  I see this with ByteString and
Data.Vector.Storable, but not
Data.Vector/Data.Vector.Unboxed/Data.Text.  As ByteStrings are pretty
widely used for IO, I expected that somebody else would have
experienced this too.

I would expect some memory fragmentation with pinned memory, but the
change from ghc-7.4 to ghc-7.6 is rather extreme (no fragmentation to
several GB).


This was a side-effect of the improvements we made to the allocation of
pinned objects, which ironically was made to avoid fragmentation of a
different kind.  What is happening is that the memory for the pinned
objects is now taken from the nursery, and so the nursery has to be
replenished after GC.  When we allocate memory for the nursery we like
to allocate it in big contiguous chunks, because that works better with
automatic prefetching, but the memory is horribly fragmented due to all
the pinned objects, so the large allocation has to be satisfied from the OS.


It seems that I was bitten badly by this bug, with productivity being
reduced to 30% with 8 threads. While the fix on HEAD has brought
productivity back up to the mid-90% mark, runtime for my program has
regressed by nearly 40%
compared to 7.4.1. It's been suggested that this is the result of the
new code generator. How should I proceed from here? It would be nice to
test with the old code generator to verify that the new codegen is in
fact the culprit, yet it doesn't seem there is a flag to accomplish
this. Ideas?


I removed the flag yesterday, so as long as you have a GHC before 
yesterday you can use -fno-new-codegen to get the old codegen.  You 
might need to compile libraries with the flag too, depending on where 
the problem is.


I'd be very interested to find out whether the regression really is due 
to the new code generator, because in all the benchmarking I've done the 
worst case I found is a program that goes 4% slower, and on average 
performance is the same as the old codegen.  It is likely that by 7.8.1 
with some tweaking we should be beating the old codegen consistently.


Cheers,
Simon




Re: How do I build GHC 7.6 from source?

2012-09-21 Thread Simon Marlow

On 20/09/2012 16:25, Iavor Diatchki wrote:

perhaps we should have a well-defined place in the repo where we keep
the finger-prints associated with tags and branches in the main repo?
This would make it a lot easier to get to a fully defined
previous/different state.


We do have tags for releases, so you can say

 ./sync-all checkout ghc-7.6.1-release

and get the exact 7.6.1 sources.

I wouldn't object to also having fingerprints in the repo too though.

Cheers,
Simon




On this note, could someone send the link to the 7.6 fingerprint?  Ian
said that it is somewhere in the nightly build logs but I don't where to
look.

-Iavor



On Thu, Sep 20, 2012 at 7:20 AM, Simon Marlow marlo...@gmail.com wrote:

On 19/09/2012 02:15, Iavor Diatchki wrote:

exactly what git's submodule machinery does, so it seems pointless to
implement the functionality which is already there with a standard
interface.  Thoughts?


http://hackage.haskell.org/trac/ghc/wiki/DarcsConversion#Theperspectiveonsubmodules


I have seen this.  Our custom fingerprint solution has the exact same
drawbacks (because it does the exact same thing as sub-modules), and in
addition it has the drawback of
1. being a custom non-standard solution,
2. it is not obvious where to find the fingerprint associated with
a particular branch (which is what led to my question in the first place).



Well, it doesn't quite have the same drawbacks as submodules,
because our solution places a burden only on someone who wants to
recover a particular repository state, rather than on everyone doing
development.

I think it's worth keeping an eye on submodules in case they fix the
gotchas in the UI, but at the moment it looks like we'd have a lot
of confused developers, lost work and accidental breakages due to
people not understanding how submodules work or forgetting to jump
through the correct hoops.

I'm not saying fingerprints are a good solution, obviously they only
solve a part of the problem, but the current tooling for submodules
leaves a lot to be desired.

Cheers,
 Simon







Re: memory fragmentation with ghc-7.6.1

2012-09-21 Thread Simon Marlow

On 21/09/2012 04:07, John Lato wrote:

Yes, that's my current understanding.  I see this with ByteString and
Data.Vector.Storable, but not
Data.Vector/Data.Vector.Unboxed/Data.Text.  As ByteStrings are pretty
widely used for IO, I expected that somebody else would have
experienced this too.

I would expect some memory fragmentation with pinned memory, but the
change from ghc-7.4 to ghc-7.6 is rather extreme (no fragmentation to
several GB).


This was a side-effect of the improvements we made to the allocation of 
pinned objects, which ironically was made to avoid fragmentation of a 
different kind.  What is happening is that the memory for the pinned 
objects is now taken from the nursery, and so the nursery has to be 
replenished after GC.  When we allocate memory for the nursery we like 
to allocate it in big contiguous chunks, because that works better with 
automatic prefetching, but the memory is horribly fragmented due to all 
the pinned objects, so the large allocation has to be satisfied from the OS.


The fix is not to allocate large chunks for the nursery unless there are 
no small chunks to use up, so I've implemented that.


Happily I also found two other bugs while looking for this one, one of 
which was a performance bug which caused this benchmark to run 10x 
slower than it should have been!  The other bug was a recent regression 
causing it to misreport the amount of allocated memory.


Thanks for the report.

Cheers,
Simon




John L.

On Fri, Sep 21, 2012 at 10:53 AM, Carter Schonwald
carter.schonw...@gmail.com wrote:

So the problem is only with the data structures on the heap that are pinned
in place to play nice with C?

I'd be curious to understand the change too, though per se pinned memory (a
la storable or or bytestring) will by definition cause memory fragmentation
in a gc'd lang as a rule,  (or at least one like Haskell).
-Carter

On Thu, Sep 20, 2012 at 8:59 PM, John Lato jwl...@gmail.com wrote:


Hello,

We've noticed that some applications exhibit significantly worse
memory usage when compiled with ghc-7.6.1 compared to ghc-7.4, leading
to out of memory errors in some cases.  Running one app with +RTS -s,
I see this:

ghc-7.4
  525,451,699,736 bytes allocated in the heap
   53,404,833,048 bytes copied during GC
   39,097,600 bytes maximum residency (2439 sample(s))
1,547,040 bytes maximum slop
  628 MB total memory in use (0 MB lost due to fragmentation)

ghc-7.6
512,535,907,752 bytes allocated in the heap
   53,327,184,712 bytes copied during GC
   40,038,584 bytes maximum residency (2391 sample(s))
1,456,472 bytes maximum slop
 3414 MB total memory in use (2744 MB lost due to
fragmentation)

The total memory in use (consistent with 'top's output) is much higher
when built with ghc-7.6, due entirely to fragmentation.

I've filed a bug report
(http://hackage.haskell.org/trac/ghc/ticket/7257,
http://hpaste.org/74987), but I was wondering if anyone else has
noticed this?  I'm not entirely sure what's triggering this behavior
(some applications work fine), although I suspect it has to do with
allocation of pinned memory.

John L.



Re: How do I build GHC 7.6 from source?

2012-09-20 Thread Simon Marlow

On 19/09/2012 02:15, Iavor Diatchki wrote:


   exactly what git's submodule machinery does, so it seems pointless to

  implement the functionality which is already there with a standard
  interface.  Thoughts?


http://hackage.haskell.org/trac/ghc/wiki/DarcsConversion#Theperspectiveonsubmodules


I have seen this.  Our custom fingerprint solution has the exact same
drawbacks (because it does the exact same thing as sub-modules), and in
addition it has the drawback of
   1. being a custom non-standard solution,
   2. it is not obvious where to find the fingerprint associated with
a particular branch (which is what led to my question in the first place).



Well, it doesn't quite have the same drawbacks as submodules, because 
our solution places a burden only on someone who wants to recover a 
particular repository state, rather than on everyone doing development.


I think it's worth keeping an eye on submodules in case they fix the 
gotchas in the UI, but at the moment it looks like we'd have a lot of 
confused developers, lost work and accidental breakages due to people 
not understanding how submodules work or forgetting to jump through the 
correct hoops.


I'm not saying fingerprints are a good solution, obviously they only 
solve a part of the problem, but the current tooling for submodules 
leaves a lot to be desired.


Cheers,
Simon


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell] ANNOUNCE: GHC version 7.6.1

2012-09-07 Thread Simon Marlow

On 06/09/2012 21:10, Christian Hoener zu Siederdissen wrote:

Hi Ian,

thanks for the info about 7.8. Just to be clear, the new codegen
apparently saved my runtimes for the presentation on tuesday. \My\ new
code was slower than my old code. The new code generator fixed that,
giving me equal running times with much cooler features. I currently
assume (without having checked at all) that this is due to dead-variable elimination.

So if it is getting better, I'd be really really happy.


Just to be clear - you're using -fnew-codegen, with GHC 7.6.1?

There were a handful of bugfixes to the new codegen path that didn't 
make it into 7.6.1, so I wouldn't rely on it.


Cheers,
Simon



Gruss,
Christian

* Ian Lynagh i...@well-typed.com [06.09.2012 22:00]:

On Thu, Sep 06, 2012 at 06:32:38PM +0200, Christian Hoener zu Siederdissen 
wrote:

Awesome,

I have been playing with GHC 7.6.0 until today and been very happy. Btw.
isn't this the version that officially includes -fnew-codegen / HOOPL?

Because the new codegen is optimizing the my ADPfusion library nicely.
I lost 50% speed with new features, gained 100% with new codegen,
meaning new features come for free ;-)


I suspect that you'll find that the new codegen doesn't work 100%
perfectly in 7.6, although I don't know the details - perhaps it just
isn't as fast as it could be. It'll be the default in 7.8, though.


Thanks
Ian




Re: RFC: Adding support for an API-since-version-attribute to Haddock?

2012-09-05 Thread Simon Marlow

On 05/09/2012 09:10, Herbert Valerio Riedel wrote:

Evan Laforge qdun...@gmail.com writes:


Would such an enhancement to Haddock be worthwhile or is it a bad idea?
Has such a proposal come up in the past already? Are there alternative
approaches to consider?


It would be even cooler to automatically figure them out from the
hackage history.


I don't think this can ever be reliable if it is to detect more than
mere additions of new functions at the source-level.

Just modifying a function (w/o changing the type-signature) doesn't mean
its semantics have to change -- could be just refactoring or optimizing
that lead to the implementation (but not the semantics) changing.

Also, /not/ modifying a function doesn't necessarily mean that its
semantics are unchanged, as it could be just some semantic change to a
function outside the function (but depended upon by the function) that
could cause the function to modify its semantics.

Thus, IMHO it will always require some manual intervention by the
developer (although it may be surely aided by tools helping with
analysing the source-code versioning history and pointing out possible
candidates)


If the semantics of a function changes, then you'll want to note that in 
the documentation for the function, rather than just slapping on a 
"since" attribute.  I think automatically deriving "since" and "changed" 
information from the module interface makes perfect sense; indeed it is 
something we've had on the Haddock todo list for a long time.
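
For reference, Haddock did later grow a per-declaration since-annotation; a sketch of how it looks in a doc comment (the function and version number are made up for illustration):

```haskell
-- | Frobnicate a value by bumping it.
--
-- @since 1.2.0.0
frob :: Int -> Int
frob = (+ 1)

main :: IO ()
main = print (frob 41)
```

The `@since` line renders as a small "Since: 1.2.0.0" note in the generated documentation; to the compiler it is just part of the comment, so the module runs unchanged.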


You could do it based on the information in the Haddock interface files, 
but of course you need to have the interfaces for all versions 
available.  Perhaps it should be standard practice to check these into 
the repo when doing a release.  In any case it wouldn't be hard to 
generate the Haddock interfaces for old releases of a package from the 
Hackage archives.


Cheers,
Simon




Re: Small Int and Char closures in GHCi

2012-08-31 Thread Simon Marlow

On 30/08/2012 12:29, Joachim Breitner wrote:

Hi,

I am preparing a talk about the details of how data and programs look in
memory in Haskell (well, GHC). When explaining the memory consumption of
a large String, I wanted to show the effect of short-int-replacement
that happens in
http://hackage.haskell.org/trac/ghc/browser/rts/sm/Evac.c#L550

I use my ghc-heap-view-package to observe the heap. This programs shows
the effect:

 import GHC.HeapView
 import System.Mem

 main = do
     let hallo = "hallo"
     mapM_ (\x -> putStrLn $ show x ++ ": " ++ show (asBox x)) hallo
     performGC
     mapM_ (\x -> putStrLn $ show x ++ ": " ++ show (asBox x)) hallo

gives, as expected:

$ ./SmallChar
'h': 0x7f2811e042a8/1
'a': 0x7f2811e08128/1
'l': 0x7f2811e09ef0/1
'l': 0x7f2811e0bcd8/1
'o': 0x7f2811e0db10/1
'h': 0x006d9bd0/1
'a': 0x006d9b60/1
'l': 0x006d9c10/1
'l': 0x006d9c10/1
'o': 0x006d9c40/1

but in GHCi, it does not work:

$ runhaskell SmallChar.hs
'h': 0x7f5334623d58/1
'a': 0x7f5334626208/1
'l': 0x7f5334627fc0/1
'l': 0x7f5334629dc0/1
'o': 0x7f533462bba8/1
'h': 0x7f533381a1c8/1
'a': 0x7f5333672e30/1
'l': 0x7f533381a408/1
'l': 0x7f533381a6b8/1
'o': 0x7f533389c5d0/1

Note that the GC does evacuate the closures, as the pointers change. Why
are these not replaced by the static ones here?


Probably because GHCi has a dynamically loaded copy of the base package, 
so the pointer comparisons that the GC is doing do not match the 
dynamically-loaded I# and C# constructors.


Cheers,
Simon



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Comparing StableNames of different type

2012-08-28 Thread Simon Marlow

On 24/08/2012 10:39, Emil Axelsson wrote:

2012-08-24 11:18, Emil Axelsson skrev:

2012-08-24 11:08, Simon Marlow skrev:

On 24/08/2012 07:39, Emil Axelsson wrote:

Hi!

Are there any dangers in comparing two StableNames of different type?

   stEq :: StableName a -> StableName b -> Bool
   stEq a b = a == (unsafeCoerce b)

I could guard the coercion by first comparing the type representations,
but that would give me a `Typeable` constraint that would spread
throughout the code.


I think that's probably OK.


OK, good! How about putting this function in the library so that people
don't have to hack it up themselves?


Oops, I did not intend to sound suggestive :) I was more wondering if
people think it would be a good idea. If so, I can make a proposal.


Ok, I've added it.  It will be in GHC 7.8.1.

Cheers,
Simon


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Comparing StableNames of different type

2012-08-24 Thread Simon Marlow

On 24/08/2012 07:39, Emil Axelsson wrote:

Hi!

Are there any dangers in comparing two StableNames of different type?

   stEq :: StableName a -> StableName b -> Bool
   stEq a b = a == (unsafeCoerce b)

I could guard the coercion by first comparing the type representations,
but that would give me a `Typeable` constraint that would spread
throughout the code.


I think that's probably OK.  It should be safe even if the types are 
different, but I presume you expect the types to be the same, since 
otherwise the comparison would be guaranteed to return False, right?


Cheers,
Simon


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: minor errors in Haskell 2010 report

2012-08-24 Thread Simon Marlow

On 23/08/2012 17:09, Ramana Kumar wrote:


M is not the current module, in which case the only way that an
entity could be in scope in the current module is if it was exported
by M and subsequently imported by the current module, so adding
"exported by module M" is superfluous.


In this case, what you said is not quite correct: an entity could be in
scope in the current module if it was defined in the current module, or
if it was imported from some other module (not M). These are the two
kinds of entity I thought of when I first read the sentence, and was
expecting clarification that only ones imported from M are to be considered.


That wouldn't be a clarification, it would be a change in the 
definition.  Remember that entities that are in scope as M.x might not 
come from module M.  Consider:


import X as M

now saying "module M" in the export list will export everything from X.
Furthermore, we can export many modules at the same time:


import X as M
import Y as M
import M

and then saying "module M" in the export list will export all of the
entities from modules X, Y and M.
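
The three-import case above corresponds to something like the following
sketch, with real base modules standing in for X, Y and M:

```haskell
-- The single export item "module M" exports every entity that is in
-- scope qualified as M, regardless of which module it came from.
module Facade (module M) where

import Data.Char  as M  -- everything from Data.Char ...
import Data.Bits  as M  -- ... and everything from Data.Bits ...
import Data.Maybe as M  -- ... and from Data.Maybe, all via one item
```

Exporting the same entity via more than one of the aliased modules is fine,
but exporting two *different* entities with the same name this way is
rejected.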


There was lots of discussion about this in the past, for some tricky 
issues see e.g.


http://www.haskell.org/pipermail/haskell/2001-August/007767.html
http://www.haskell.org/pipermail/cvs-ghc/2002-November/015880.html
http://www.haskell.org/pipermail/haskell/2002-November/010662.html

Cheers,
Simon


___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Long compilation times when profiling a large CAF

2012-08-22 Thread Simon Marlow

On 21/08/2012 19:14, Conal Elliott wrote:

I'm looking for help with crazy-long compile times when using GHC with
profiling. A source file at work has a single 10k line top-level
definition, which is a CAF. With -prof -auto-all or an explicit SCC,
compilation runs for 8 hours on a fast machine with the heap growing to
13GB before being killed. Without profiling, it compiles in a few minutes.

The big CAFs are auto-generated and not of my making, so I'm hoping for
a solution other than stop making big CAFs.


We could take a look.  Can you make a self-contained example that 
demonstrates the problem?


Cheers,
Simon



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: +RTS -S heap reporting oddity

2012-08-20 Thread Simon Marlow

On 17/08/2012 17:08, Wolfram Kahl wrote:

During one of my long Agda runs (with GHC-7.4.2), I observed the following
output, with run-time options

+RTS -S -H11G -M11G -K256M

:

7694558208  30623864 3833166176  0.11  0.11  234.75  234.7900  (Gen:  0)
7678904688  29295168 3847737784  0.11  0.11  242.04  242.0900  (Gen:  0)
7662481840  29195736 3861451856  0.11  0.11  249.31  249.3500  (Gen:  0)
7647989280  26482704 3872463688  0.12  0.12  256.64  256.6800  (Gen:  0)
4609865360  25764016 3886000448  0.09  0.09  261.04  261.0900  (Gen:  0)
4581294920  19435032 3891512272  0.07  0.07  265.37  265.4200  (Gen:  0)
4568757088  21095864 3902286000  0.08  0.08  269.70  269.7400  (Gen:  0)
4546421608  21618856 3913923976  0.09  0.09  274.04  274.0900  (Gen:  0)
452151 2894668056 3484748224  7.63  7.63  285.94  285.9800  (Gen:  1)
8085358392  23776128 3499185336  0.11  0.11  293.49  293.5300  (Gen:  0)
8064630856  32055112 3515876576  0.13  0.13  300.91  300.9500  (Gen:  0)
8040500112  31477608 3528105088  0.12  0.12  308.37  308.4100  (Gen:  0)
8031456296  29641328 3540632456  0.11  0.11  315.83  315.8700  (Gen:  0)
8018447264  30187208 3554339600  0.12  0.12  323.26  323.3100  (Gen:  0)

To my untrained eye, this seems to be saying the following:
In the first 4 lines, the heap runs (almost) full before (minor) collections.
In lines 5 to 9 it apparently leaves 3G empty before collection,
but ``those 3G'' then appear on line 9 in the ``amount of data copied during 
(major) collection''
column, and after that it runs up to fill all 11G again before the next few 
minor collections.

What is really going on here?
(Previously I had never seen such big numbers in the second column on major 
collections.)


It looks like on line 5, the GC thought it was going to do a major 
collection the next time, so it left 3G free to copy the contents of the 
old generation.  But then it didn't do a major GC until line 9.  I've 
just checked the code, and I think this might be due to a slight 
inaccuracy in the way that we estimate whether the next GC will be a 
major one, and at these huge sizes the discrepancy becomes significant. 
 Thanks for pointing it out, I'll fix it to use the same calculation in 
both places.


Cheers,
Simon





Wolfram


P.S.: Same effect again, but more dramatic, later during the same Agda run:

448829488   4864536 5710435424  0.02  0.02 1422.80 1422.9000  (Gen:  0)
445544064   3251712 5710248752  0.01  0.01 1423.23 1423.3200  (Gen:  0)
450236784   4148864 5712696848  0.02  0.02 1423.68 1423.7700  (Gen:  0)
445240152   3828120 5713606328  0.02  0.02 1424.10 1424.1900  (Gen:  0)
443285616   5906448 5717731864  0.02  0.02 1424.52 1424.6100  (Gen:  0)
430698248 4773500032 5363214440  9.30  9.30 1434.21 1434.3000  (Gen:  1)
6148455592  13490304 5374609848  0.07  0.07 1439.83 1439.9200  (Gen:  0)
6185350848  27419744 5389326896  0.11  0.11 1445.50 1445.5900  (Gen:  0)
6168805736  23069072 5398725784  0.11  0.11 1451.22 1451.3200  (Gen:  0)
6157744328  23451872 5408370152  0.09  0.09 1456.93 1457.0300  (Gen:  0)
6151715272  25739584 5421044592  0.11  0.11 1462.62 1462.7200  (Gen:  0)
6132589488  24541688 5428809632  0.10  0.10 1468.26 1468.3700  (Gen:  0)

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] Platform Versioning Policy: upper bounds are not our friends

2012-08-20 Thread Simon Marlow

On 15/08/2012 21:44, Johan Tibell wrote:

On Wed, Aug 15, 2012 at 1:02 PM, Brandon Allbery allber...@gmail.com wrote:

So we are certain that the rounds of failures that led to their being
*added* will never happen again?


It would be useful to have some examples of these. I'm not sure we had
any when we wrote the policy (but Duncan would know more), but rather
reasoned our way to the current policy by saying that things can
theoretically break if we don't have upper bounds, therefore we need
them.


I haven't read the whole thread (yet), but the main motivating example 
for upper bounds was when we split the base package (GHC 6.8) - 
virtually every package on Hackage broke.  Now at the time having upper 
bounds wouldn't have helped, because you would have got a depsolver 
failure instead of a type error.  But following the uproar about this we 
did two things: the next release of GHC (6.10) came with two versions of 
base, *and* we recommended that people add upper bounds.  As a result, 
packages with upper bounds survived the changes.


Now, you could argue that we're unlikely to do this again.  But the main 
reason we aren't likely to do this again is because it was so painful, 
even with upper bounds and compatibility libraries.  With better 
infrastructure and tools, *and* good dependency information, it should 
be possible to do significant reorganisations of the core packages.


As I said in my comments on Reddit[1], I'm not sure that removing upper 
bounds will help overall.  It removes one kind of failure, but 
introduces a new kind - and the new kind is scary, because existing 
working packages can suddenly become broken as a result of a change to a 
different package.  Will it be worse or better overall?  I have no idea. 
 What I'd rather see instead though is some work put into 
infrastructure on Hackage to make it easy to change the dependencies of 
existing packages.


Cheers,
Simon

[1] 
http://www.reddit.com/r/haskell/comments/ydkcq/pvp_upper_bounds_are_not_our_friends/c5uqohi


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Non-updateable thunks

2012-08-08 Thread Simon Marlow

On 03/08/2012 10:29, Joachim Breitner wrote:

Hi Simon,

Am Freitag, den 03.08.2012, 09:28 +0100 schrieb Simon Marlow:

My question is: Has anybody worked in that direction? And are there any
fundamental problems with the current RTS implementation and such
closures?


Long ago GHC used to have an update analyser which would detect some
thunks that would never be re-entered and omit the update frame on them.
   I wrote a paper about this many years ago, and there were other people
working on similar ideas, some using types (e.g. linear types) - google
for update avoidance.  As I understand it you want to omit doing some
updates in order to avoid space leaks, which is slightly different.


Thanks for the pointers, I will have a look. Why was the update analyser
removed from GHC?


It was expensive and had very little benefit - most thunks that were 
deemed non-updatable by the analysis were also strict, so update 
analysis did very little in the presence of strictness analysis.


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: API for looking-up/retrieving Haddock comments?

2012-08-08 Thread Simon Marlow

On 04/08/2012 08:33, Herbert Valerio Riedel wrote:

Simon Hengel s...@typeful.net writes:

[...]


I have the following in my .ghci:

 -- hoogle integration
 :def hoogle \q -> return $ ":! hoogle --color=true --count=15 \"" ++ q ++ "\""
 :def doc    \q -> return $ ":! hoogle --color=true --info \"" ++ q ++ "\""


[...]

thanks, this already looks very promising; there are just a few minor
issues with the GHCi integration that I'm a bit dissatisfied with:

  1. it doesn't take into account the currently visible module namespaces
 that GHCi has currently loaded (as opposed to `:info` and `:type`):

,
| Prelude import Data.IntMap
|
| Prelude Data.IntMap :info fromList
| fromList :: [(Key, a)] -> IntMap a -- Defined in `Data.IntMap'
|
| Prelude Data.IntMap :type  fromList
| fromList :: [(Key, a)] -> IntMap a
|
| Prelude Data.IntMap :doc fromList
| Searching for: fromList
| Data.HashTable fromList :: Eq key => (key -> Int32) -> [(key, val)] -> IO (HashTable key val)
|
| Convert a list of key/value pairs into a hash table. Equality on keys
| is taken from the Eq instance for the key type.
|
| From package base
| fromList :: Eq key => (key -> Int32) -> [(key, val)] -> IO (HashTable key val)
|
| Prelude Data.IntMap
`

  2. tab-completion (as it works for `:type` and `:info`) doesn't extend
 to `:doc`


I guess both items could be improved upon by extending GHCi to provide
an additional `:def` facility tailored to Haskell symbols that allows
passing more meta-information (such as package and module information)
into the resulting command string... would something like that have any
chance of being accepted upstream?


I think it would make more sense to just add :doc to the GHCi front-end, 
relying on the user having already installed hoogle.  We could give a 
sensible error message if you don't have Hoogle installed.


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Non-updateable thunks

2012-08-03 Thread Simon Marlow

On 01/08/2012 11:38, Joachim Breitner wrote:

Hello,

I’m still working on issues of performance vs. sharing; I must assume
some of the people here on the list must have seen my dup-paper¹ as
referees.

I’m now wondering about an approach where the compiler (either
automatically or by user annotation; I’ll leave that question for later)
would mark some thunks as reentrant, i.e. simply skip the blackholing
and update frame pushing. A quick test showed that this should work
quite well, take the usual example:

 import System.Environment
 main = do
     a <- getArgs
     let n = length a
     print n
     let l = [n..3000]
     print $ last l + last l

This obviously leaks memory:

 $ ./Test +RTS -t
 0
 6000
 ghc: 2400054760 bytes, 4596 GCs, 169560494/935354240 avg/max
 bytes residency (11 samples), 2121M in use, 0.00 INIT (0.00
 elapsed), 0.63 MUT (0.63 elapsed), 4.28 GC (4.29 elapsed) :ghc


I then modified the the assembly (a crude but effective way of testing
this ;-)) to not push a stack frame:

$ diff -u Test.s Test-modified.s
--- Test.s  2012-08-01 11:30:00.0 +0200
+++ Test-modified.s 2012-08-01 11:29:40.0 +0200
@@ -56,20 +56,20 @@
leaq -40(%rbp),%rax
cmpq %r15,%rax
jb .LcpZ
-   addq $16,%r12
-   cmpq 144(%r13),%r12
-   ja .Lcq1
-   movq $stg_upd_frame_info,-16(%rbp)
-   movq %rbx,-8(%rbp)
+   //addq $16,%r12
+   //cmpq 144(%r13),%r12
+   //ja .Lcq1
+   //movq $stg_upd_frame_info,-16(%rbp)
+   //movq %rbx,-8(%rbp)
movq $ghczmprim_GHCziTypes_Izh_con_info,-8(%r12)
movq $3000,0(%r12)
leaq -7(%r12),%rax
-   movq %rax,-24(%rbp)
+   movq %rax,-8(%rbp)
movq 16(%rbx),%rax
-   movq %rax,-32(%rbp)
-   movq $stg_ap_pp_info,-40(%rbp)
+   movq %rax,-16(%rbp)
+   movq $stg_ap_pp_info,-24(%rbp)
movl $base_GHCziEnum_zdfEnumInt_closure,%r14d
-   addq $-40,%rbp
+   addq $-24,%rbp
jmp base_GHCziEnum_enumFromTo_info
  .Lcq1:
movq $16,192(%r13)

Now it runs fast and slim (and did not crash on the first try, which I
find surprising after hand-modifying the assembly code):

 $ ./Test +RTS -t
 0
 6000
 ghc: 4800054840 bytes, 9192 GCs, 28632/28632 avg/max bytes
 residency (1 samples), 1M in use, 0.00 INIT (0.00 elapsed), 0.73
 MUT (0.73 elapsed), 0.04 GC (0.04 elapsed) :ghc


My question is: Has anybody worked in that direction? And are there any
fundamental problems with the current RTS implementation and such
closures?


Long ago GHC used to have an update analyser which would detect some 
thunks that would never be re-entered and omit the update frame on them. 
 I wrote a paper about this many years ago, and there were other people 
working on similar ideas, some using types (e.g. linear types) - google 
for update avoidance.  As I understand it you want to omit doing some 
updates in order to avoid space leaks, which is slightly different.


The StgSyn abstract syntax has an UpdateFlag on each StgRhs which lets 
you turn off the update, and I believe the code generator will respect 
it although it isn't actually ever turned off at the moment.
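
The space-leak motivation can also be attacked at source level, by hiding
the list behind a function argument so that no shared, updated thunk is ever
created.  This is only a rough sketch (names illustrative; GHC's
full-laziness pass can undo the trick under -O, so -fno-full-laziness may be
needed):

```haskell
-- Compile with: ghc -O -fno-full-laziness Recompute.hs
-- Instead of    let l = [n..3000] in last l + last l
-- (which shares, and therefore retains, the whole list), rebuild the
-- list at each use:
import System.Environment (getArgs)

bigList :: Int -> () -> [Int]
bigList n () = [n .. 3000]

main :: IO ()
main = do
  a <- getArgs
  let n = length a
  print n
  let l = bigList n
  -- Two independent traversals; neither retains the list for the other.
  print (last (l ()) + last (l ()))
```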


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Access to the Parse Tree for Expressions

2012-07-30 Thread Simon Marlow

On 27/07/2012 15:18, Simon Fowler wrote:

Dear all,

I'm currently working on a project which would benefit from access to
the parse tree - ideally, we would like to deconstruct an expression
into its constituent types. Currently we are using the exprType function
to return the type of an expression, but this is limited as it only
works for one 'level', per se.

An example would be as follows. Currently, if we were to type in "map",
we would be given a result of type Type, which we could further
deconstruct into the various subtypes, as so:

map :: (a -> b) -> [a] -> [b]
split into:
 (a -> b)
 [a]
 [b]


This is sufficient for the basics of the project at the moment, but
ideally, we would like to use the parse tree to analyse the structure of
expressions and thereby the types of the corresponding sub-expressions.
Take "foldr drop" for example; we can determine the different types for
the two functions with exprType:

foldr :: (a -> b -> b) -> b -> [a] -> b
drop :: Int -> [c] -> [c]

Now we can call exprType on the application foldr drop:

foldr drop :: (Int -> [c] -> [c]) -> [c] -> [Int] -> [c]
which would be split, according to our current code, into:
 arg1: (Int -> [c] -> [c])
 arg2: [c]
 arg3: [Int]
 result: [c]

The problem here is that we are unable to separate the "drop" from the
"foldr". The project we are working on involves composing and
decomposing expressions, and it is important that we can decompose the
type of "foldr drop" into the types of the sub-expressions "foldr" and
"drop" recursively.

Ideally, we would like to construct a data structure which is much more
akin to a parse tree with type annotations, in this case:
 PTreeApp
 (PTreeExpr "foldr" [| (a -> b -> b) -> b -> [a] -> b |])
 (PTreeExpr "drop"  [| Int -> [c] -> [c] |])
 [| (Int -> [c] -> [c]) -> [c] -> [Int] -> [c] |]
where the types in semantic brackets are a structural representation
(e.g. TypeRep.Type) of the given types.


Looking at the code of exprType, a call is firstly made to hscTcExpr,
which in turn makes a call to hscParseStmt to return the parse tree.
This would seem to provide the functionality that we would require, in
that it would give access to a type-checkable parsed statement, but it
doesn't seem to be exported by HscMain. Is there another function, which
is accessible through the API, that would support this or something similar?

I am far from an expert using the GHC API, so apologies if I am doing
something grossly wrong or have missed something blatantly obvious.


What you want is unfortunately not provided by the GHC API at the 
moment.  exprType only returns the Type; I think what you want is the 
typechecked abstract syntax for the expression, so that you can traverse 
and analyse it.


It wouldn't be hard to do this.  TcRnDriver.tcRnExpr needs to return the 
LHsExpr Id (maybe it needs to be zonked first), and then we need to 
expose an API to provide access to this via the GHC module.  Feel free 
to have a go yourself, or make a ticket and we'll hopefully get around 
to it.
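
Until such an API is exposed, a rough stand-in for decomposing
*monomorphic* types recursively is plain Data.Typeable.  This sketch is not
the typechecked-AST access discussed above, just an illustration of the
recursive decomposition:

```haskell
import Data.Typeable (TypeRep, splitTyConApp, typeOf)

-- Flatten a TypeRep into itself plus all of its sub-TypeReps,
-- recursively, via the constructor/argument decomposition.
subTypes :: TypeRep -> [TypeRep]
subTypes t = t : concatMap subTypes args
  where (_tc, args) = splitTyConApp t

main :: IO ()
main = mapM_ print (subTypes (typeOf (undefined :: Int -> [Char] -> Bool)))
```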


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Simplifying Core using GHC API

2012-07-30 Thread Simon Marlow
You can use compileToCoreSimplified to get the optimised Core for the 
module, although that includes the other steps.  We ought to have a 
separate API to go from ModGuts to CoreModule, but currently that 
doesn't exist (it's built into compileToCoreSimplified).


Cheers,
Simon

On 28/07/2012 06:06, Ranjit Jhala wrote:

ps: I should add I already know how to get from source to CoreExpr e.g. by:

mod_guts <- coreModule `fmap` (desugarModuleWithLoc =<<
typecheckModule =<< parseModule modSummary)

Its the simplification, in particular, inlining steps that I'm after.
Thanks! Ranjit.

On Fri, Jul 27, 2012 at 10:04 PM, Ranjit Jhala jh...@cs.ucsd.edu
mailto:jh...@cs.ucsd.edu wrote:

Hi all,

can anyone point me to the GHC API functions that I can use to
trigger the
various inlining simplifications? (i.e. to get the inlined CoreExprs
that one
gets with the -ddump-simpl flag?)

Many thanks in advance!,

Ranjit Jhala.




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users





___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Building GHC on NetBSD/amd64

2012-07-30 Thread Simon Marlow

On 29/07/2012 07:41, iquiw wrote:

I am trying to build GHC on NetBSD/amd64.

First, I built GHC-6.12.3 by porting from OpenBSD/amd64.
After that, trying to build several versions (6.12.3, 7.0.4, 7.4.2) of
GHC by the stage2 compiler.

Build itself succeeded and compiling by the ghc seems no problem so far.
However, ghci (all versions) crashes always by segmentation fault.

-
$ ghci
GHCi, version 7.4.2: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
zsh: segmentation fault (core dumped)  /usr/local/ghc-7.4.2/bin/ghci
-

backtrace by gdb shows only s2xW_info ().

ktrace (kernel trace) output is as follows.
-
   4494  1 ghc  CALL  __sigprocmask14(3,0x7f7f49e0,0)
   4494  1 ghc  RET   __sigprocmask14 0
   4494  1 ghc  CALL  _lwp_self
   4494  1 ghc  RET   _lwp_self 1
   4494  1 ghc  CALL  __sigprocmask14(1,0x7f7f4970,0x7f7f49e0)
   4494  1 ghc  RET   __sigprocmask14 0
   4494  1 ghc  CALL  __sigprocmask14(3,0x7f7f49e0,0)
   4494  1 ghc  RET   __sigprocmask14 0
   4494  1 ghc  CALL  issetugid
   4494  1 ghc  RET   issetugid 0
   4494  1 ghc  PSIG  SIGSEGV SIG_DFL: code=SEGV_MAPERR,
addr=0xf610bfa5, trap=6)
   4494  2 ghc  RET   __kevent50 -1 errno 4 Interrupted system call
   4494  3 ghc  RET   ___lwp_park50 -1 errno 4 Interrupted system call
   4494  1 ghc  NAMI  ghc.core
-


What should I check next?


I expect the dynamic linker needs something specific to NetBSD/amd64, but 
it's hard to tell exactly what.  Someone needs to debug the failure and 
analyse the problem.


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: thread blocked indefinitely in an MVar operation in unsafePerformIO

2012-07-30 Thread Simon Marlow

On 30/07/2012 15:30, Marco Túlio Gontijo e Silva wrote:

Hi.

I'm having a problem calling logM from hsLogger inside
unsafePerformIO.  I have described the problem in Haskell-cafe, so
I'll avoid repeating it here:

http://www.haskell.org/pipermail/haskell-cafe/2012-July/102545.html

I've had this problem both with GHC 7.4.1 and 7.4.2.  Do you have any
suggestion?


Is it possible that the String you are passing to uLog contains 
unevaluated calls to uLog itself, which would thereby create a deadlock 
as uLog tries to take the MVar that is already being held by the same 
thread?


We once had this problem with hPutStr where if the argument string 
contained a call to trace, which is unsafePerformIO $ hPutStr, the 
result would be deadlock.  Now hPutStr has to go to some trouble to 
evaluate the input string while not holding the lock.
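
A logger can apply the same defence: force the message completely before
taking the lock, so any logging call hiding in the unevaluated string via
unsafePerformIO runs outside the critical section.  A sketch (not hslogger's
actual implementation):

```haskell
import Control.Concurrent.MVar (MVar, newMVar, withMVar)
import Control.Exception (evaluate)
import System.IO (hPutStrLn, stderr)
import System.IO.Unsafe (unsafePerformIO)

-- A global lock, built with the usual unsafePerformIO/NOINLINE idiom.
logLock :: MVar ()
logLock = unsafePerformIO (newMVar ())
{-# NOINLINE logLock #-}

logMsg :: String -> IO ()
logMsg s = do
  -- Force the whole message *before* taking the lock: a nested logging
  -- call runs here, instead of deadlocking on logLock below.
  _ <- evaluate (length s)
  withMVar logLock $ \() -> hPutStrLn stderr s

main :: IO ()
main = logMsg "hello from the logger"
```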


Cheers,
Simon




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: How to describe this bug?

2012-07-11 Thread Simon Marlow

On 11/07/2012 08:36, Christian Maeder wrote:

Hi,

I think this bug is serious and should be turned into a ticket on
http://hackage.haskell.org/trac/ghc/
Would you do so Sönke?

The abstraction of floats (Float or Double) is broken if equality
considers (random and invisible) excess bits that are not part of the
ordinary sign, exponent and fraction representation.

It should also hold: show f1 == show f2  ==>  f1 == f2
and: read (show f) == f
(apart from NaN)

Why do you doubt that we'll ever fix this, Simon?


Several reasons:

 - the fix hurts performance badly, because you have to store floats
   into memory after every operation. (c.f. gcc's -ffloat-store option)
 - the fix is complicated
 - good workarounds exist (-msse2)
 - it is rarely a problem


What is the problem with disabling -fexcess-precision or enabling -msse2
(on most machines) by default?


-fexcess-precision cannot be disabled on x86 (that is the bug).

-msse2 is not supported on all processors, so we can't enable it by default.

Cheers,
Simon




Cheers Christian

Am 10.07.2012 14:33, schrieb Simon Marlow:

On 10/07/2012 12:21, Aleksey Khudyakov wrote:

On Tue, Jul 10, 2012 at 3:06 PM, Sönke Hahn sh...@cs.tu-berlin.de
wrote:

I've attached the code. The code does not make direct use of
unsafePerformIO. It uses QuickCheck, but I don't think, this is a
QuickCheck bug. The used Eq-instance is the one for Float.

I've only managed to reproduce this bug on 32-bit-linux with ghc-7.4.2
when compiling with -O2.


It's expected behaviour with floats. FPU calculations are done at the
maximum precision available.  If one result is kept in registers while
another has been moved to memory (and thereby rounded) and loaded back
into registers, the numbers will indeed not be the same.

In short: never compare floating point numbers for equality unless you
really know what you are doing.


I consider it a bug, because as the original poster pointed out it is a
violation of referential transparency.  What's more, it is *not* an
inherent property of floating point arithmetic, because if the compiler
is careful to do all the operations at the correct precision then you
can get deterministic results. This is why GHC has the
-fexcess-precision flag: you have to explicitly ask to break referential
transparency.

The bug is that the x86 native code generator behaves as if
-fexcess-precision is always on.  I seriously doubt that we'll ever fix
this bug: you can get correct behaviour by enabling -msse2, or using a
64-bit machine.  I don't off-hand know what the LLVM backend does here,
but I would guess that it has the same bug.
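
Whatever the backend does, test code (such as the QuickCheck property that
triggered this thread) is more robust with a tolerance-based comparison than
with (==).  An illustrative sketch; the epsilon is an arbitrary choice:

```haskell
-- Relative-tolerance comparison of Doubles.
approxEq :: Double -> Double -> Bool
approxEq x y = abs (x - y) <= eps * maximum [1, abs x, abs y]
  where eps = 1e-9

main :: IO ()
main = do
  print (0.1 + 0.2 == (0.3 :: Double))  -- False: representation error
  print (approxEq (0.1 + 0.2) 0.3)      -- True
```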

Cheers,
 Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users






___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Fwd: Memory corruption issues when using newAlignedPinnedByteArray, GC kicking in?]

2012-07-11 Thread Simon Marlow

On 10/07/2012 23:03, Nicolas Trangez wrote:

All,

I sent this mail to Haskell Cafe earlier today, and was pointed [1] at
this list. As such...

Any help/advice would be greatly appreciated!


It looks like you're making a ForeignPtr from the Addr# or Ptr that 
points to the contents of the ByteArray#.  Since this ForeignPtr doesn't 
keep the original ByteArray# alive, the GC will collect it.  You need to 
keep a reference to the ByteArray# too.


Basically you need a version of mallocForeignPtrBytes that supports 
alignment.  Unfortunately it's not possible to write one because the 
internals of ForeignPtrContents are not exported - we had a recent 
ticket about that (http://hackage.haskell.org/trac/ghc/ticket/7012) and 
in 7.6.1 we will export the necessary internals.  If you want we could 
also add mallocForeignPtrAlignedBytes - please send a patch.
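
A workaround that avoids touching ByteArray# internals at all is to
over-allocate and align within a single ForeignPtr, so the aligned pointer
shares the block's lifetime.  Note that plusForeignPtr only appeared in
later base releases (after this thread); with older base the same trick
needs withForeignPtr plus alignPtr at each use site.  A sketch:

```haskell
import Foreign.ForeignPtr
  (ForeignPtr, mallocForeignPtrBytes, plusForeignPtr, withForeignPtr)
import Foreign.Ptr (alignPtr, minusPtr)

-- Over-allocate by (align - 1) bytes, then step forward to the first
-- aligned address.  plusForeignPtr shares the underlying block, so the
-- aligned ForeignPtr keeps the whole allocation alive - there is no
-- separate ByteArray# for the GC to collect out from under us.
mallocForeignPtrAligned :: Int -> Int -> IO (ForeignPtr a)
mallocForeignPtrAligned size align = do
  fp <- mallocForeignPtrBytes (size + align - 1)
  withForeignPtr fp $ \p -> do
    let off = alignPtr p align `minusPtr` p
    return (fp `plusForeignPtr` off)
```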


Cheers,
Simon





Thanks,

Nicolas

[1] http://www.haskell.org/pipermail/haskell-cafe/2012-July/102242.html

 Forwarded Message 

From: Nicolas Trangez nico...@incubaid.com
To: haskell-c...@haskell.org
Cc: Roman Leshchinskiy r...@cse.unsw.edu.au
Subject: Memory corruption issues when using
newAlignedPinnedByteArray, GC kicking in?
Date: Tue, 10 Jul 2012 19:20:01 +0200

All,

While working on my vector-simd library, I noticed somehow memory I'm
using gets corrupted/overwritten. I reworked this into a test case, and
would love to get some help on how to fix this.

Previously I used some custom FFI calls to C to allocate aligned memory,
which yields correct results, but this has a significant (+- 10x)
performance impact on my benchmarks. Later on I discovered the
newAlignedPinnedByteArray# function, and wrote some code using this.

Here's what I did in the test case: I created an MVector instance, with
the exact same implementation as vector's
Data.Vector.Storable.Mutable.MVector instance, except for basicUnsafeNew
where I pass one more argument to mallocVector [1].

I also use 3 different versions of mallocVector (depending on
compile-time flags):

mallocVectorOrig [2]: This is the upstream version, discarding the
integer argument I added.

Then here's my first attempt, very similar to the implementation of
mallocPlainForeignPtrBytes [3] at [4] using GHC.* libraries.

Finally there's something similar at [5] which uses the 'primitive'
library.

The test case creates vectors of increasing size, then checks whether
they contain the expected values. For the default implementation this
works correctly. For both others it fails at some random size, and the
values stored in the vector are not exactly what they should be.

I don't understand what's going on here. I suspect I lack a reference
(or something along those lines) so GC kicks in, or maybe the buffer
gets relocated, whilst it shouldn't.

Basically I'd need something like

GHC.ForeignPtr.mallocPlainAlignedForeignPtrBytes :: Int -> Int -> IO
(ForeignPtr a)

Thanks,

Nicolas

[1] https://gist.github.com/3084806#LC37
[2] https://gist.github.com/3084806#LC119
[3]
http://hackage.haskell.org/packages/archive/base/latest/doc/html/src/GHC-ForeignPtr.html
[4] https://gist.github.com/3084806#LC100
[5] https://gist.github.com/3084806#LC81






___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users





___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: How to describe this bug?

2012-07-11 Thread Simon Marlow

On 11/07/2012 09:51, Christian Maeder wrote:

Am 11.07.2012 10:25, schrieb Simon Marlow:

On 11/07/2012 08:36, Christian Maeder wrote:

Hi,

I think this bug is serious and should be turned into a ticket on
http://hackage.haskell.org/trac/ghc/
Would you do so Sönke?

The abstraction of floats (Float or Double) is broken if equality
considers (random and invisible) excess bits that are not part of the
ordinary sign, exponent and fraction representation.

It should also hold: show f1 == show f2 => f1 == f2
and: read (show f) == f
(apart from NaN)
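The round-trip law stated above can be checked directly; a minimal sketch in plain Haskell (the sample values are arbitrary):

```haskell
-- read (show f) == f should hold for every Double except NaN.
roundTrips :: Double -> Bool
roundTrips f = read (show f) == f

main :: IO ()
main = print (all roundTrips [0.1, 1 / 3, sqrt 2, -1e308])  -- prints True
```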

Why do you doubt that we'll ever fix this, Simon?


Several reasons:

  - the fix hurts performance badly, because you have to store floats
into memory after every operation. (c.f. gcc's -ffloat-store option)


If we sacrifice correctness for performance then we should clearly
document this!


I will document it in the User's Guide along with the other known bugs.


What is the problem to disable -fexcess-precision or enable -msse2 (on
most machines) by default?


-fexcess-precision cannot be disabled on x86 (that is the bug).

-msse2 is not supported on all processors, so we can't enable it by
default.


Can't configure find this out?


Configure will detect whether the machine you're building on supports 
-msse2, but not whether the machine that you will eventually *run* the 
code on does.  For instance, when building GHC for distribution we have 
to assume that the target machine does not support SSE2, so all the 
libraries must be built without -msse2.


Cheers,
Simon




C.


Cheers,
 Simon




Cheers Christian

Am 10.07.2012 14:33, schrieb Simon Marlow:

On 10/07/2012 12:21, Aleksey Khudyakov wrote:

On Tue, Jul 10, 2012 at 3:06 PM, Sönke Hahn sh...@cs.tu-berlin.de
wrote:

I've attached the code. The code does not make direct use of
unsafePerformIO. It uses QuickCheck, but I don't think, this is a
QuickCheck bug. The used Eq-instance is the one for Float.

I've only managed to reproduce this bug on 32-bit-linux with
ghc-7.4.2
when compiling with -O2.


It's expected behaviour with floats. Calculations in the FPU are done in
the maximum precision available.  If one evaluation result is kept in
registers
and another has been moved to memory, rounded, and moved back to
registers,
the numbers will indeed not be the same.

In short: never compare floating point numbers for equality unless you
really know
what you are doing.


I consider it a bug, because as the original poster pointed out it is a
violation of referential transparency.  What's more, it is *not* an
inherent property of floating point arithmetic, because if the compiler
is careful to do all the operations at the correct precision then you
can get deterministic results.  This is why GHC has the
-fexcess-precision flag: you have to explicitly ask to break
referential
transparency.

The bug is that the x86 native code generator behaves as if
-fexcess-precision is always on.  I seriously doubt that we'll ever fix
this bug: you can get correct behaviour by enabling -msse2, or
using a
64-bit machine.  I don't off-hand know what the LLVM backend does here,
but I would guess that it has the same bug.

Cheers,
 Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users












___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Call to arms: lambda-case is stuck and needs your help

2012-07-10 Thread Simon Marlow

On 09/07/2012 17:32, Mikhail Vorozhtsov wrote:

On 07/09/2012 09:49 PM, Simon Marlow wrote:

On 09/07/2012 15:04, Mikhail Vorozhtsov wrote:

and respectively

\case
   P1, P2 -> ...
   P3, P4 -> ...

as sugar for

\x y -> case x, y of
   P1, P2 -> ...
   P3, P4 -> ...


That looks a bit strange to me, because I would expect

  \case
     P1, P2 -> ...
     P3, P4 -> ...

to be a function of type (# a, b #) -> ...

Hm, maybe I put it slightly wrong. Desugaring is really only a means of
implementation here.


I think the desugaring is helpful - after all, most of the syntactic 
sugar in Haskell is already specified by its desugaring.  And in this 
case, the desugaring helps to explain why the multi-argument version is 
strange.


 Would you still expect tuples for \case if you

didn't see the way `case x, y of ...` was implemented (or thought that
it is a primitive construct)?


Yes, I still think it's strange.  We don't separate arguments by commas 
anywhere else in the syntax; arguments are always separated by whitespace.
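For readers coming to this thread later: the single-argument form under discussion is what eventually shipped as the LambdaCase extension in GHC 7.6. A minimal sketch (the function name and patterns are made up for illustration):

```haskell
{-# LANGUAGE LambdaCase #-}

describe :: Maybe Int -> String
describe = \case
  Nothing -> "nothing"
  Just n  -> "got " ++ show n

main :: IO ()
main = mapM_ (putStrLn . describe) [Nothing, Just 42]
-- prints:
--   nothing
--   got 42
```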


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Imported foreign functions should be strict

2012-07-10 Thread Simon Marlow

On 07/07/2012 05:06, Favonia wrote:

Hi all,

Recently I am tuning one of our incomplete libraries that uses FFI.
After dumping the interface file I realized strictness/demand analysis
failed for imported foreign functions---that is, they are not inferred
to be strict in their arguments. In my naive understanding all
imported foreign functions are strict! Here's a minimum example (with
GHC 7.4.2):

{-# LANGUAGE ForeignFunctionInterface #-}
module Main where
import Foreign.C
foreign import ccall unsafe "sin" sin' :: CDouble -> CDouble

where in the interface file the function sin' will have strictness
U(L) (meaning Unpackable(Lazy)).


This is fine - it means the CDouble is unpacked into its unboxed Double# 
component.  An unboxed value is always represented by L in strictness 
signatures.


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Call to arms: lambda-case is stuck and needs your help

2012-07-10 Thread Simon Marlow

On 10/07/2012 07:33, Mikhail Vorozhtsov wrote:

On 07/10/2012 01:09 AM, Bardur Arantsson wrote:

On 07/09/2012 06:01 PM, Mikhail Vorozhtsov wrote:

On 07/09/2012 09:52 PM, Twan van Laarhoven wrote:

On 09/07/12 14:44, Simon Marlow wrote:

I now think '\' is too quiet to introduce a new layout context.  The
pressing
need is really for a combination of '\' and 'case', that is
single-argument so
that we don't have to write parentheses.  I think '\case' does the job
perfectly.  If you want a multi-clause multi-argument function, then
give it a
name.


There is an advantage here for \of in favor of \case, namely that
"of" already introduces layout, while "case" does not.

Do you think that adding \ + case as a layout herald would
complicate the language spec and/or confuse users? Because it certainly
does not complicate the implementation (there is a patch for \case
already).


Just being anal here, but: The existence of a patch to implement X does
not mean that X doesn't complicate the implementation.

In general, yes. But that particular patch[1] uses ~20 lines of pretty
straightforward (if I'm allowed to say that about the code I wrote
myself) code to handle layout. Which in my book is not complex at all.

[1]
http://hackage.haskell.org/trac/ghc/attachment/ticket/4359/one-arg-lambda-case.patch


The need to keep track of the previous token in the lexer *is* ugly though.

Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: How to describe this bug?

2012-07-10 Thread Simon Marlow

On 10/07/2012 12:21, Aleksey Khudyakov wrote:

On Tue, Jul 10, 2012 at 3:06 PM, Sönke Hahn sh...@cs.tu-berlin.de wrote:

I've attached the code. The code does not make direct use of
unsafePerformIO. It uses QuickCheck, but I don't think, this is a
QuickCheck bug. The used Eq-instance is the one for Float.

I've only managed to reproduce this bug on 32-bit-linux with ghc-7.4.2
when compiling with -O2.


It's expected behaviour with floats. Calculations in the FPU are done in the
maximum precision available.  If one evaluation result is kept in registers
and another has been moved to memory, rounded, and moved back to registers,
the numbers will indeed not be the same.

In short: never compare floating point numbers for equality unless you
really know
what you are doing.


I consider it a bug, because as the original poster pointed out it is a 
violation of referential transparency.  What's more, it is *not* an 
inherent property of floating point arithmetic, because if the compiler 
is careful to do all the operations at the correct precision then you 
can get deterministic results.  This is why GHC has the 
-fexcess-precision flag: you have to explicitly ask to break referential 
transparency.


The bug is that the x86 native code generator behaves as if 
-fexcess-precision is always on.  I seriously doubt that we'll ever fix 
this bug: you can get correct behaviour by enabling -msse2, or using a 
64-bit machine.  I don't off-hand know what the LLVM backend does here, 
but I would guess that it has the same bug.
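To make the practical advice concrete: a hedged sketch of the usual workaround, comparing within a tolerance rather than exactly (the 1e-9 tolerance is an arbitrary choice for illustration):

```haskell
-- Relative-error comparison; exact (==) is fragile when excess
-- precision is in play.
approxEq :: Double -> Double -> Bool
approxEq x y = abs (x - y) <= 1e-9 * max 1 (max (abs x) (abs y))

main :: IO ()
main = do
  print (0.1 + 0.2 == (0.3 :: Double))  -- False: binary rounding error
  print (approxEq (0.1 + 0.2) 0.3)      -- True
```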


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Call to arms: lambda-case is stuck and needs your help

2012-07-09 Thread Simon Marlow

On 07/07/2012 05:08, Tyson Whitehead wrote:


PS:  To be fully precise, the modified layout decoder in 9.3 would be

   L (<n>:ts) i (m:ms) = ; : (L ts n (m:ms))   if m = n
                       = } : (L (<n>:ts) n ms) if n < m
   L (<n>:ts) i ms = L ts n ms
   L ({n}:<n>:ts) i ms = { : (L ts n (n:ms))   if n > i (new rule)
   L ({n}:ts) i (m:ms) = { : (L ts i (n:m:ms)) if n > m  (Note 1)
   L ({n}:ts) i [] = { : (L ts i [n])  if n > 0  (Note 1)
   L ({n}:ts) i ms = { : } : (L (<n>:ts) i ms)   (Note 2)
   L (}:ts)   i (0:ms) = } : (L ts i ms) (Note 3)
   L (}:ts)   i ms = parse-error (Note 3)
   L ({:ts)   i ms = { : (L ts i (0:ms)) (Note 4)
   L (t:ts)   i (m:ms) = } : (L (t:ts) i ms)   if m /= 0 and parse-error(t)
(Note 5)
   L (t:ts)   i ms = t : (L ts i ms)
   L []   i [] = []
   L []   i (m:ms) = } : L [] i ms if m /= 0 (Note 6)

   http://www.haskell.org/onlinereport/syntax-iso.html

As before, the function 'L' maps a layout-sensitive augmented token stream to
a non-layout-sensitive token stream, where the augmented token stream includes
'<n>' and '{n}' to, respectively, give the indentation level of the first token
on a new line and that following a grouping token not followed by '{'.

This time though, we allow the '{n}' '<n>' sequence (before it was suppressed
to just '{n}').  We also add a new state variable 'i' to track the indentation
of the current line.  The new rule now opens a grouping over a newline so
long as the indentation is greater than the current line.

Upon a less indented line, it will then close all currently open groups with
an indentation less than the new line.


It's a little hard to evaluate this without trying it for real to see 
whether it breaks any existing code.  However, unless there are very 
strong reasons to do so, I would argue against making the layout rule 
*more* complicated.  I find the current rule behaves quite intuitively, 
even though its description is hard to understand and it is virtually 
impossible to implement.


I now think '\' is too quiet to introduce a new layout context.  The 
pressing need is really for a combination of '\' and 'case', that is 
single-argument so that we don't have to write parentheses.  I think 
'\case' does the job perfectly.  If you want a multi-clause 
multi-argument function, then give it a name.


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Call to arms: lambda-case is stuck and needs your help

2012-07-09 Thread Simon Marlow

On 07/07/2012 16:07, Strake wrote:

On 07/07/2012, Jonas Almström Duregård jonas.dureg...@chalmers.se wrote:

Couldn't we use \\ for multi-case lambdas with layout?

If not, these are my preferences in order (all are single argument
versions):
1: Omission: case of. There seems to be some support for this but it
was not included in the summary.
2: Omission with clarification: \case of
3: \of  - but I think this is a little weird. It's nice to have
short keywords but not at the expense of intuition. The goal here is
to drop the variable name not the case keyword, right?

Regards,
Jonas


Well, since this is now suddenly a ranked-choice election, I shall
re-cast my vote:


I think some misunderstanding has crept in - we're not planning to count 
votes or anything here.  If you have new suggestions or know of reasons 
for/against existing proposals then please post, otherwise there's no 
need to post just to express your personal preference.


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Call to arms: lambda-case is stuck and needs your help

2012-07-09 Thread Simon Marlow

On 09/07/2012 15:04, Mikhail Vorozhtsov wrote:

Hi Simon.

On 07/09/2012 08:23 PM, Simon Marlow wrote:

On 07/07/2012 16:07, Strake wrote:

On 07/07/2012, Jonas Almström Duregård jonas.dureg...@chalmers.se
wrote:

Couldn't we use \\ for multi-case lambdas with layout?

If not, these are my preferences in order (all are single argument
versions):
1: Omission: case of. There seems to be some support for this but it
was not included in the summary.
2: Omission with clarification: \case of
3: \of  - but I think this is a little weird. It's nice to have
short keywords but not at the expense of intuition. The goal here is
to drop the variable name not the case keyword, right?

Regards,
Jonas


Well, since this is now suddenly a ranked-choice election, I shall
re-cast my vote:


I think some misunderstanding has crept in - we're not planning to count
votes or anything here.  If you have new suggestions or know of reasons
for/against existing proposals then please post, otherwise there's no
need to post just to express your personal preference.

Could you express your opinion on the case comma sugar, i.e.

case x, y of
   P1, P2 -> ...
   P3, P4 -> ...

as sugar for

case (# x, y #) of
   (# P1, P2 #) -> ...
   (# P3, P4 #) -> ...


I like this.


and respectively

\case
   P1, P2 -> ...
   P3, P4 -> ...

as sugar for

\x y -> case x, y of
   P1, P2 -> ...
   P3, P4 -> ...


That looks a bit strange to me, because I would expect

 \case
P1, P2 -> ...
P3, P4 -> ...

to be a function of type (# a, b #) -> ...
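For comparison, the closest thing expressible in standard Haskell uses boxed tuples (the proposal's unboxed-tuple desugaring would avoid the tuple allocation); the example function here is made up:

```haskell
-- The proposed `case x, y of` sugar, written with today's boxed tuples:
classify :: Int -> Int -> String
classify x y = case (x, y) of
  (0, 0) -> "origin"
  (_, 0) -> "on x-axis"
  (0, _) -> "on y-axis"
  _      -> "elsewhere"

main :: IO ()
main = mapM_ (putStrLn . uncurry classify) [(0, 0), (3, 0), (1, 2)]
-- prints:
--   origin
--   on x-axis
--   elsewhere
```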

Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Call to arms: lambda-case is stuck and needs your help

2012-07-06 Thread Simon Marlow

On 05/07/2012 20:31, Tyson Whitehead wrote:

On July 5, 2012 10:42:53 Mikhail Vorozhtsov wrote:

After 21 months of occasional arguing the lambda-case proposal(s) is in
danger of being buried under its own trac ticket comments. We need fresh
blood to finally reach an agreement on the syntax. Read the wiki
page[1], take a look at the ticket[2], vote and comment on the proposals!


If I understand correctly, we currently we have

   \ apat1 ... apatn -> exp

The possibility using '\' as a layout herald (like let, do, etc.)

   \ { apat1 ... apatn -> exp; ... }

is suggested on the wiki, but rejected because it breaks code like

   mask $ \restore -> do
 stmt1
 ...

by translating it into (Section 9.3 of the 98 Report)

   mask $ \ { restore -> do { }
 } stmt1

   http://www.haskell.org/onlinereport/syntax-iso.html

The reason for this is

1 - the layout level for '\' is the column of the 'restore' token

2 - the layout level for 'do' would be the column of the first token of 'stmt1'

3 - the '\' level is greater than the potential 'do' level so the fall through
'{}' insertion rule fires instead of the desired '{' insertion rule

4 - the '\' level is greater than the indentation level for the first token of
'stms1' (now established to not be part of the 'do') so the '}' rule fires

Why not just let enclosed scopes be less indented than their outer ones?


I think this is undesirable.  You get strange effects like

  f x y = x + y
where  -- I just left this where here by accident

  g x = ...

parses as

  f x y = x + y
where { -- I just left this empty where here by accident

  g x = ...
  }

and

  instance Exception Foo where
  instance Exception Bar

parses as

  instance Exception Foo where {
instance Exception Bar
  }


That is, layout contexts that should really be empty end up surprisingly 
swallowing the rest of the file.


Cheers,
Simon




It would then correctly translate the above.  This of course implies that any
code that requires the original translation (i.e., where the last of multiple
enclosing blocks should be an empty block) would no longer work.

Is any code actually relying on this though?  It seems like a pretty esoteric
corner case.  If not, my vote would be to relax this rule and go with '\'
being a layout herald with full case-style matching (i.e., guards too).

Cheers!  -Tyson

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users





___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Compiling ghcjs

2012-07-02 Thread Simon Marlow

On 27/06/2012 13:24, Nathan Hüsken wrote:


I hope this is the correct list to ask this question.

I am trying to compile the ghcjs compiler. I am on ubuntu 12.04 and have
ghc-7.4.1 installed (via apt-get).

I am following the instruction I found here: https://github.com/ghcjs/ghcjs

The first trouble comes with git pull ghcjs. I get:

remote: Counting objects: 42, done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 26 (delta 22), reused 21 (delta 17)
Unpacking objects: 100% (26/26), done.
 From github.com:ghcjs/packages-Cabal
  * [new branch]  encoding -> ghcjs/encoding
  * [new branch]  ghc-7.2  -> ghcjs/ghc-7.2
  * [new branch]  ghc-7.4  -> ghcjs/ghc-7.4
  * [new branch]  master   -> ghcjs/master
You asked to pull from the remote 'ghcjs', but did not specify
a branch. Because this is not the default configured remote
for your current branch, you must specify a branch on the command line.

So I am doing:

git pull ghcjs ghc-7.4

Then git branch ghc-7.4 ghcjs/ghc-7.4 gives me:

fatal: A branch named 'ghc-7.4' already exists.

And finaly perl boot fails with:

Error: libraries/extensible-exceptions/LICENSE doesn't exist.
Maybe you haven't done './sync-all get'? at boot line 74, <PACKAGES>
line 57.

What can I do?


I suggest contacting the author of ghcjs, I don't know which branch(es) 
of ghcjs are supposed to work and/or whether there are any 
ghcjs-specific build requirements.


Cheers,
Simon


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Strange behavior when using stable names inside ghci?

2012-06-29 Thread Simon Marlow

On 27/06/12 22:41, Facundo Domínguez wrote:

Hi,
   The program below, when loaded in ghci, always prints False, and when
compiled with ghc it prints True. I'm using ghc-7.4.1 and I cannot
quite explain this behavior. Any hints?

Thanks in advance,
Facundo

{-# LANGUAGE GADTs #-}
import System.Mem.StableName
import Unsafe.Coerce
import GHC.Conc

data D where
    D :: a -> b -> D

main = do
   putStr "type enter"
   s <- getLine
   let i = fromEnum$ head$ s++"0"
       d = D i i
   case d of
     D a b -> do
         let a' = a
         sn0 <- pseq a'$ makeStableName a'
         sn1 <- pseq b$ makeStableName b
         print (sn0==unsafeCoerce sn1)


GHCi adds some extra annotations around certain subexpressions to 
support the debugger.  This will make some things that would have equal 
StableNames when compiled have unequal StableNames in GHCi.  You would 
see the same problem if you compile with -fhpc, which adds annotations 
around every subexpression.


For your intended use of StableNames I imagine you can probably just 
live with this limitation - others are doing the same (e.g. Accelerate 
and Kansas Lava).
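A minimal sketch of the behaviour being discussed (compiled code normally prints True; under GHCi or -fhpc the extra annotations can make it False, per the explanation above):

```haskell
import Control.Exception (evaluate)
import System.Mem.StableName

main :: IO ()
main = do
  let xs = [1 .. 10] :: [Int]
  _   <- evaluate xs   -- force to WHNF so both calls see the same heap object
  sn0 <- makeStableName xs
  sn1 <- makeStableName xs
  print (sn0 == sn1)
```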


Cheers,
Simon


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: parallel garbage collection performance

2012-06-27 Thread Simon Marlow

On 26/06/2012 00:42, Ryan Newton wrote:

However, the parallel GC will be a problem if one or more of your
cores is being used by other process(es) on the machine.  In that
case, the GC synchronisation will stall and performance will go down
the drain.  You can often see this on a ThreadScope profile as a big
delay during GC while the other cores wait for the delayed core.
  Make sure your machine is quiet and/or use one fewer cores than
the total available.  It's not usually a good idea to use
hyperthreaded cores either.


Does it ever help to set the number of GC threads greater than
numCapabilities to over-partition the GC work?  The idea would be to
enable some load balancing in the face of perturbation from external
load on the machine...

It looks like GHC 6.10 had a -g flag for this that later went away?


The GC threads map one-to-one onto mutator threads now (since 6.12). 
This change was crucial for performance, before that we hardly ever got 
any speedup from parallel GC because there was no guarantee of locality.


I don't think it would help to have more threads.  The load-balancing is 
already done with work-stealing, it isn't statically partitioned.


Cheers,
Simon


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Cannot build user's guide

2012-06-25 Thread Simon Marlow

On 25/06/12 14:34, José Pedro Magalhães wrote:

Hi,

I cannot build the user's guide. Doing make html stage=0 FAST=YES under
docs/users_guide, I get:

===--- building final phase
make -r --no-print-directory -f ghc.mk phase=final
html_docs/users_guide
inplace/bin/mkUserGuidePart docs/users_guide/users_guide.xml
inplace/bin/mkUserGuidePart docs/users_guide/what_glasgow_exts_does.gen.xml
rm -rf docs/users_guide/users_guide/
 --stringparam base.dir docs/users_guide/users_guide/ --stringparam
use.id.as.filename 1 --stringparam html.stylesheet fptools.css --nonet
--stringparam toc.section.depth 3 --stringparam section.autolabel 1
--stringparam section.label.includes.component.label 1
http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl
docs/users_guide/users_guide.xml
/bin/sh: 1: : Permission denied
make[2]: *** [docs/users_guide/users_guide/index.html] Error 127
make[1]: *** [html_docs/users_guide] Error 2

It seems like make is just calling an empty program name. What could I
be missing?

(Btw, validate goes through. Does validate not build the user's guide?)


What did ./configure say about DocBook tools?

Cheers,
Simon



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: parallel garbage collection performance

2012-06-25 Thread Simon Marlow

On 19/06/12 02:32, John Lato wrote:

Thanks for the suggestions.  I'll try them and report back.  Although
I've since found that out of 3 not-identical systems, this problem
only occurs on one.  So I may try different kernel/system libs and see
where that gets me.

-qg is funny.  My interpretation from the results so far is that, when
the parallel collector doesn't get stalled, it results in a big win.
But when parGC does stall, it's slower than disabling parallel gc
entirely.


Parallel GC is usually a win for idiomatic Haskell code, it may or may 
not be a good idea for things like Repa - I haven't done much analysis 
of those types of programs yet.  Experiment with the -A flag, e.g. -A1m 
is often better than the default if your processor has a large cache.


However, the parallel GC will be a problem if one or more of your cores 
is being used by other process(es) on the machine.  In that case, the GC 
synchronisation will stall and performance will go down the drain.  You 
can often see this on a ThreadScope profile as a big delay during GC 
while the other cores wait for the delayed core.  Make sure your machine 
is quiet and/or use one fewer cores than the total available.  It's not 
usually a good idea to use hyperthreaded cores either.
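A sketch of the kind of invocation being discussed, using the flags from this thread; the program and workload are made up, and the import assumes the parallel package is installed:

```haskell
-- Build: ghc -O2 -threaded Par.hs
-- Run:   ./Par +RTS -N7 -qa -A1m -s
--   -N7  : one fewer capability than an 8-core machine, per the advice above
--   -qa  : thread affinity
--   -A1m : larger allocation area, often better with a large cache
--   -s   : print GC statistics
import Control.Parallel.Strategies (parMap, rseq)

expensive :: Int -> Int
expensive n = length (filter even [1 .. n * 100000])

main :: IO ()
main = print (sum (parMap rseq expensive [1 .. 8]))  -- prints 1800000
```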


I'm also seeing unpredictable performance on a 32-core AMD machine with 
NUMA.  I'd avoid NUMA for Haskell for the time being if you can.  Indeed 
you get unpredictable performance on this machine even for 
single-threaded code, because it makes a difference on which node the 
pages of your executable are cached (I heard a rumour that Linux has 
some kind of a fix for this in the pipeline, but I don't know the details).



I had thought the last core parallel slowdown problem was fixed a
while ago, but apparently not?


We improved matters by inserting some yields into the spinlock loops. 
 This helped a lot, but the problem still exists.


Cheers,
Simon




Thanks,
John

On Tue, Jun 19, 2012 at 8:49 AM, Ben Lippmeierb...@ouroborus.net  wrote:


On 19/06/2012, at 24:48 , Tyson Whitehead wrote:


On June 18, 2012 04:20:51 John Lato wrote:

Given this, can anyone suggest any likely causes of this issue, or
anything I might want to look for?  Also, should I be concerned about
the much larger gc_alloc_block_sync level for the slow run?  Does that
indicate the allocator waiting to alloc a new block, or is it
something else?  Am I on completely the wrong track?


A total shot in the dark here, but wasn't there something about really bad
performance when you used all the CPUs on your machine under Linux?

Presumably very tight coupling that is causing all the threads to stall
everytime the OS needs to do something or something?


This can be a problem for data parallel computations (like in Repa). In Repa 
all threads in the gang are supposed to run for the same time, but if one gets 
swapped out by the OS then the whole gang is stalled.

I tend to get best results using -N7 for an 8 core machine.

It is also important to enable thread affinity (with the -qa flag).

For a Repa program on an 8 core machine I use +RTS -N7 -qa -qg

Ben.




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Is stderr flushed automatically after exceptions are printed

2012-06-13 Thread Simon Marlow

On 13/06/2012 03:35, Johan Tibell wrote:

Hi,

If a program throws an exception that will cause it to be terminated
(i.e. the exception isn't caught), will the code that prints out the
error message to stderr make sure to flush stderr before terminating
the process?


Yes: 
https://github.com/ghc/packages-base/blob/master/GHC/TopHandler.lhs#L152


Cheers,
Simon


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: problem with FFI and libsane

2012-06-07 Thread Simon Marlow

On 06/06/2012 06:59, Ganesh Sittampalam wrote:


I'm having some trouble making Haskell bindings to libsane (a scanner
access library: http://www.sane-project.org/)

When I build the cut down sample of my code (below) with GHC 7.4.1 with
the non-threaded runtime, it hangs at runtime in the call to
c'sane_get_devices (after printing "go"). Pressing ^C causes it to
continue and print "done" before exiting.

If I compile with GHC 7.2.2 non-threaded, it doesn't hang, printing
first "go" then "done" after a few seconds. That pause is expected, as
it's also seen in the equivalent C version (also below).

If I switch to the threaded runtime, then things go wrong differently.
Most of the time there's a hang after go and after pressing ^C they
just exit immediately, without printing done. This doesn't change
between 7.2.2 and 7.4.1. Occasionally, things do work and I get go
then done.

All these symptoms seem to point to some kind of threading problem, and
I believe that libsane is using pthreads, although ldd doesn't report it
and strace only shows it loading the library.

The platform is Linux x86, and I've reproduced the behaviour on two
different machines (Debian and Ubuntu). I've also tried with GHC
7.4.1.20120508, the most recent i386 snapshot I could find.

Is there anything obvious I'm doing wrong, or something I could try next?


I don't completely understand what is going wrong here, but it looks 
like an interaction between the RTS's use of a timer signal and the 
libsane library.  You can make it work by turning off GHC's timer with 
+RTS -V0.


There were some changes in this area in 7.4.1, in particular that the 
non-threaded RTS now uses a realtime interval timer, whereas in previous 
versions it used a CPU time interval timer.  The threaded RTS has always 
used a realtime timer.


The signal has always been SIGVTALRM, as far as I can tell.  Which is 
confusing - if the signal had changed, I could understand that being the 
cause of the difference in behaviour.  Perhaps it is just that system 
calls are being interrupted by the signal more often than they were 
before, and libsane does not properly handle EINTR.  I looked at the 
strace output and can't see any use of a signal by libsane.


Cheers,
Simon




Cheers,

Ganesh

Haskell code:

{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.Marshal.Alloc
import Foreign.Ptr
import System.IO
import Foreign.C.Types

foreign import ccall "sane_init" c'sane_init
   :: Ptr CInt -> Callback -> IO CUInt

type Callback = FunPtr (Ptr CChar -> Ptr CChar -> Ptr CChar -> IO ())

foreign import ccall "sane_exit" c'sane_exit
   :: IO ()

-- the () in the ptr type is incorrect, but in
-- this cut-down example we never try to dereference it
foreign import ccall "sane_get_devices" c'sane_get_devices
   :: Ptr (Ptr (Ptr ())) -> CInt -> IO CUInt


main :: IO ()
main = do
    hSetBuffering stdout NoBuffering
    _ <- c'sane_init nullPtr nullFunPtr
    putStrLn "go"
    ptr <- malloc
    _ <- c'sane_get_devices ptr 0
    putStrLn "done"
    c'sane_exit


C code:

#include <sane/sane.h>
#include <stdio.h>
#include <stdlib.h>

int main()
{
    sane_init(NULL, NULL);
    puts("go");
    const SANE_Device **ptr;
    sane_get_devices(&ptr, 0);
    puts("done");
    sane_exit();
}

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Source Location of DataCon objects in GHC 7.4.1 API

2012-06-04 Thread Simon Marlow

On 01/06/2012 10:24, JP Moresmau wrote:

Hello

I have a failing test in BuildWrapper when moving from GHC 7.0.4 to
7.4.1. As far I can tell, in the TypecheckedSource I get DataCon
objects that have no location info, and hence I can't retrieve them by
location... Which is useful in a IDE (tell me what's under my mouse
cursor, tell me where else it's used).

Given the simple data declaration:
data DataT=MkData {name :: String}

In 7.0.4 I obtain a hierarchy that ends in FunBind (on a Var called
name)/MatchGroup/Match/ConPatOut and the contained DataCon named
MkData has a SrcSpan associated with it, and so do the Var,
MatchGroup and Match.
In 7.4.1 I have the same hierarchy but the DataCon tells me no
location info. The Var name has a location, but the MatchGroup and
Match don't either.

Is it a normal change? Do I need to change something in the way I load
the module? Is it a regression?


It sounds like a regression.  Please create a ticket and we'll look into it.

Cheers,
Simon



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Potential GSoC proposal: Reduce the speed gap between 'ghc -c' and 'ghc --make'

2012-05-24 Thread Simon Marlow

On 23/05/12 21:11, Ryan Newton wrote:

the.dead.shall.r...@gmail.com wrote:

Thanks. I'll look into how to optimise .hi loading by more

traditional

means, then.


Lennart is working on speeding up the binary package (which I
believe is used to decode the .hi files.) His work might benefit this
effort.


Last time I tested it, mmap still offered better performance than
fread on linux.  In addition to improving the deserialization code it
would seem like a good idea to mmap the whole file at the outset as
well.

It seems like readBinMem is the relevant function (readIFace -
readBinIFace - readBinMem), which occurs here:

https://github.com/ghc/ghc/blob/08894f96407635781a233145435a78f144accab0/compiler/utils/Binary.hs#L222

 Currently it does one big hGetBuf to read the file.  Since the
interface files aren't changing dynamically, I think it's safe to
just replace this code with an mmap.


I honestly don't think it will make much difference, because reading the
files is not the bottleneck, but we'll happily accept a patch.  Adding a
new package dependency just for this doesn't seem worthwhile though.

Cheers,
Simon





[Haskell] ANNOUNCE: forthcoming O'Reilly book on Parallel and Concurrent Haskell

2012-05-17 Thread Simon Marlow
I'm delighted to announce that O'Reilly have agreed to publish a book on 
Parallel and Concurrent Haskell authored by me.  The plan is to make a 
significantly revised and extended version of the Parallel and 
Concurrent Haskell tutorial from CEFP'11:


http://community.haskell.org/~simonmar/bib/par-tutorial-cefp-2012_abstract.html

The book will be published in both hardcopy and electronic formats, and 
will also be available online under a Creative Commons license 
(Attribution-NonCommercial-NoDerivs 3.0).  There will be some mechanism 
for people to see and comment on early drafts, but I don't know the 
details yet.


When will it be done?  I can't say for sure, but the tentative date for 
completion is March 2013.


I'm really keen for this to be a book that will be useful to people both 
learning about parallelism and concurrency in Haskell, and coding stuff 
for real-world use.  If there are topics or application areas that you'd 
like to see covered, or any other suggestions, please let me know.  All 
contributions will be acknowledged, of course!


Cheers,
Simon

___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: Explicit calls to the garbage collector.

2012-05-08 Thread Simon Marlow

On 07/05/2012 14:33, Jurriaan Hage wrote:

LS.

I have a very memory intensive application. It seems that the timing of my
application depends very much on the precise setting of -H<size> in the
runtime system (-H2000M seems to work best; computation time becomes a third
of what I get when I pass no -H option).  I conjecture that this good
behaviour is the result of GC happening at the right time.
So I wondered: if one can know when the right time is, is it possible to
trigger GC explicitly from within the Haskell code?


It is more likely that you are trading extra memory for better 
performance, rather than triggering the GC at a good time.  GC is 
basically a space/time tradeoff, see:


http://stackoverflow.com/questions/3171922/ghcs-rts-options-for-garbage-collection/3172704#3172704

If you think the program has points where residency is very low and it 
would be good to trigger a GC, I would first confirm the hypothesis by 
doing a heap profile.  GC can be triggered with System.Mem.performGC.
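As a minimal sketch of that suggestion (the two `phase` actions are hypothetical stand-ins for real program phases), an explicit collection can be placed at a point where the heap profile shows low residency:

```haskell
import System.Mem (performGC)

-- Two hypothetical phases; between them residency is low, so an
-- explicit major collection is cheap and frees the heap eagerly.
phase1, phase2 :: IO ()
phase1 = print (sum     [1 .. 1000000 :: Int])
phase2 = print (product [1 .. 20 :: Integer])

main :: IO ()
main = do
  phase1
  performGC   -- confirm with a heap profile that residency is low here
  phase2
```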


Cheers,
Simon



Re: [Haskell-cafe] Can Haskell outperform C++?

2012-05-08 Thread Simon Marlow

On 06/05/2012 07:40, Janek S. wrote:

a couple of times I've encountered a statement that Haskell programs
can have performance comparable to programs in C/C++. I've even read
that thanks to functional nature of Haskell, compiler can reason and
make guarantess about the code and use that knowledge to
automatically parallelize the program without any explicit
parallelizing commands in the code. I haven't seen any sort of
evidence that would support such claims. Can anyone provide a code in
Haskell that performs better in terms of execution speed than a
well-written C/C++ program? Both Haskell and C programs should
implement the same algorithm (merge sort in Haskell outperforming
bubble sort in C doesn't count), though I guess that using
Haskell-specific idioms and optimizations is of course allowed.


On the subject of parallelism in particular, there is no fully implicit
parallelism in Haskell at the moment.  When people ask about this I
typically point to this paper:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.145.8183

which shows that it is possible to get some gains doing this, but it is
hard and the gains were not great.


However, what Haskell does give you is a guarantee that you can safely
call any function in parallel (with itself or with something else), as
long as the function does not have an IO type.  This is about as close
to automatic parallelisation as you can practically get.  Take any pure
function from a library that someone else wrote, and you can use it with
parMap or call it from multiple threads, and reasonably expect to get a
speedup.  Furthermore with parMap you're guaranteed that the result will
be deterministic, and there are no race conditions or deadlocks to worry
about.
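As a small illustration of that guarantee (this sketch assumes the `parallel` package for `Control.Parallel.Strategies`, and `fib` is just an arbitrary pure function):

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Any pure function can be mapped in parallel; the result is
-- deterministic regardless of how many capabilities are used.
fib :: Int -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)

main :: IO ()
main = print (parMap rdeepseq fib [20, 21, 22, 23])
-- build with: ghc -threaded Par.hs   and run with: ./Par +RTS -N4
```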

Cheers,
Simon

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Weird behavior of the NonTermination exception

2012-05-04 Thread Simon Marlow

On 03/05/2012 17:14, Bas van Dijk wrote:

On 3 May 2012 17:31, Edward Z. Yang ezy...@mit.edu wrote:

Excerpts from Bas van Dijk's message of Thu May 03 11:10:38 -0400 2012:

As can be seen, the putMVar is executed successfully. So why do I get
the message "thread blocked indefinitely in an MVar operation"?


GHC will send BlockedIndefinitelyOnMVar to all threads involved
in the deadlock, so it's not unusual that this can interact with
error handlers to cause the system to become undeadlocked.


But why is the BlockedIndefinitelyOnMVar thrown in the first place?
According to the its documentation and your very enlightening article
it is thrown when:

The thread is blocked on an MVar, but there are no other references
to the MVar so it can't ever continue.

The first condition holds for the main thread since it's executing
takeMVar. But the second condition doesn't hold since the forked
thread still has a reference to the MVar.


The forked thread is deadlocked, so the MVar is considered unreachable 
and the main thread is also unreachable.  Hence both threads get sent 
the exception.


The RTS does this analysis using the GC, tracing the reachable objects 
starting from the roots.  It then send an exception to any threads which 
were not reachable, which in this case is both the main thread and the 
child, since neither is reachable.


We (the user) knows that waking up the child thread will unblock the 
main thread, but the RTS doesn't know this, and it's not clear how it 
could find out easily (i.e. without multiple scans of the heap).
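A minimal sketch of the detection at work (using a qualified import of Control.Exception to avoid the old Prelude.catch clash): the main thread blocks on an MVar that nothing else will ever fill, and the RTS's GC-based analysis delivers the exception instead of hanging.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar)
import qualified Control.Exception as E

-- The MVar is reachable only from the blocked main thread, so the
-- reachability analysis described above deems the thread deadlocked
-- and sends it BlockedIndefinitelyOnMVar.
main :: IO ()
main = do
  mv <- newEmptyMVar :: IO (MVar ())
  takeMVar mv `E.catch` \e ->
    putStrLn ("caught: " ++ show (e :: E.BlockedIndefinitelyOnMVar))
```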


Cheers,
Simon





I just tried delaying the thread before the putMVar:

-
main :: IO ()
main = do
  mv <- newEmptyMVar
  _  <- forkIO $ do
    catch action
          (\e -> putStrLn $ "I solved the Halting Problem: " ++
                            show (e :: SomeException))
    putStrLn "Delaying for 2 seconds..."
    threadDelay 2000000
    putStrLn "putting MVar..."
    putMVar mv ()
    putStrLn "putted MVar"
  takeMVar mv
-

Now I get the following output:

loop: thread blocked indefinitely in an MVar operation
I solved the Halting Problem: <<loop>>
Delaying for 2 seconds...

Now it seems the thread is killed while delaying. But why is it
killed? It could be a BlockedIndefinitelyOnMVar that is thrown.
However I get the same output when I catch and print all exceptions in
the forked thread:

main :: IO ()
main = do
  mv <- newEmptyMVar
  _  <- forkIO $
    handle (\e -> putStrLn $ "Oh nooo: " ++
                  show (e :: SomeException)) $ do
      catch action
            (\e -> putStrLn $ "I solved the Halting Problem: " ++
                              show (e :: SomeException))
      putStrLn "Delaying for 2 seconds..."
      threadDelay 2000000
      putStrLn "putting MVar..."
      putMVar mv ()
      putStrLn "putted MVar"
  takeMVar mv

Bas






Re: Potential GSoC proposal: Reduce the speed gap between 'ghc -c' and 'ghc --make'

2012-04-27 Thread Simon Marlow

On 26/04/2012 23:32, Johan Tibell wrote:

On Thu, Apr 26, 2012 at 2:34 PM, Mikhail Glushenkov
the.dead.shall.r...@gmail.com  wrote:

Thanks. I'll look into how to optimise .hi loading by more traditional
means, then.


Lennart is working on speeding up the binary package (which I believe
is used to decode the .hi files.) His work might benefit this effort.


We're still using our own Binary library in GHC.  There's no good reason 
for that, unless using the binary package would be a performance 
regression. (we don't know whether that's the case or not, with the 
current binary).


Cheers,
Simon





Re: [Haskell-cafe] Printing call site for partial functions

2012-04-26 Thread Simon Marlow

On 25/04/2012 17:28, Ozgur Akgun wrote:
 Hi,

 On 25 April 2012 16:36, Michael Snoyman mich...@snoyman.com wrote:

 Prelude.head: empty list


 Recent versions of GHC actually generate a very helpful stack trace, if
 the program is compiled with profiling turned on and run with -xc.

Right.  Also don't forget to add -fprof-auto.

There's an API to get access to the stack trace too:

http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.5.0.0/GHC-Stack.html

 See: http://community.haskell.org/~simonmar/slides/HIW11.pdf
 (Ironically titled "Prelude.head: empty list")

A more recent talk about this with more details is here:

http://community.haskell.org/~simonmar/Stack-traces.pdf
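As a small sketch of the API linked above: `GHC.Stack.currentCallStack` returns the current cost-centre stack as a list of strings. Note the stack is only populated in a profiled build; in a plain build the list is empty.

```haskell
import GHC.Stack (currentCallStack)

-- currentCallStack returns the current cost-centre stack.  It is only
-- populated when the program is built with -prof -fprof-auto; in a
-- plain build it comes back empty.
main :: IO ()
main = do
  stack <- currentCallStack
  if null stack
    then putStrLn "no stack (compile with -prof -fprof-auto)"
    else mapM_ putStrLn stack
```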


Cheers,
Simon




Re: Potential GSoC proposal: Reduce the speed gap between 'ghc -c' and 'ghc --make'

2012-04-25 Thread Simon Marlow

On 25/04/2012 03:17, Mikhail Glushenkov wrote:

Hello Simon,

Sorry for the delay.

On Tue, Apr 10, 2012 at 1:03 PM, Simon Marlow marlo...@gmail.com wrote:



Questions:

Would implementing this optimisation be a worthwhile/realistic GSoC
project?
What are other potential ways to bring 'ghc -c' performance up to par
with 'ghc --make'?



My guess is that this won't have a significant impact on ghc -c compile
times.

The advantage of squashing the .hi files for a package together is that they
could share a string table, which would save a bit of space and time, but I
think the time saved is small compared to the cost of deserialising and
typechecking the declarations from the interface, which still has to be
done.  In fact it might make things worse, if the string table for the whole
base package is larger than the individual tables that would be read from
.hi files.  I don't think mmap() will buy very much over the current scheme
of just reading the file into a ByteArray.


Thank you for the answer.
I'll be working on another project during the summer, but I'm still
interested in making interface files load faster.

The idea that I currently like the most is to make it possible to save
and load objects in the GHC heap format. That way, deserialisation
could be done with a simple fread() and a fast pointer fixup pass,
which would hopefully make running many 'ghc -c' processes as fast as
a single 'ghc --make'. This trick is commonly employed in the games
industry to speed-up load times [1]. Given that Haskell is a
garbage-collected language, the implementation will be trickier than
in C++ and will have to be done on the RTS level.

Is this a good idea? How hard it would be to implement this optimisation?


I believe OCaml does something like this.

I think the main difficulty is that the data structures in the heap are 
not the same every time, because we allocate unique identifiers 
sequentially as each Name is created.  So to make this work you would 
have to make Names globally unique.  Maybe using a 64-bit hash instead 
of the sequentially-allocated uniques would work, but that would entail 
quite a performance hit on 32-bit platforms (GHC uses IntMap everywhere 
with Unique as the key).


On top of this there will be a *lot* of other complications (e.g. 
handling sharing well, mapping info pointers somehow).  Personally I 
think it's at best very ambitious, and at worst not at all practical.


Cheers,
Simon




Another idea (that I like less) is to implement a build server mode
for GHC. That way, instead of a single 'ghc --make' we could run
several ghc build servers in parallel. However, Evan Laforge's efforts
in this direction didn't bring the expected speedup. Perhaps it's
possible to improve on his work.

[1] 
http://www.gamasutra.com/view/feature/132376/delicious_data_baking.php?print=1





Re: Potential GSoC proposal: Reduce the speed gap between 'ghc -c' and 'ghc --make'

2012-04-25 Thread Simon Marlow

On 25/04/2012 08:57, Simon Marlow wrote:

On 25/04/2012 03:17, Mikhail Glushenkov wrote:

Hello Simon,

Sorry for the delay.

On Tue, Apr 10, 2012 at 1:03 PM, Simon Marlow marlo...@gmail.com wrote:



Questions:

Would implementing this optimisation be a worthwhile/realistic GSoC
project?
What are other potential ways to bring 'ghc -c' performance up to par
with 'ghc --make'?



My guess is that this won't have a significant impact on ghc -c compile
times.

The advantage of squashing the .hi files for a package together is
that they
could share a string table, which would save a bit of space and time,
but I
think the time saved is small compared to the cost of deserialising and
typechecking the declarations from the interface, which still has to be
done. In fact it might make things worse, if the string table for the
whole
base package is larger than the individual tables that would be read
from
.hi files. I don't think mmap() will buy very much over the current
scheme
of just reading the file into a ByteArray.


Thank you for the answer.
I'll be working on another project during the summer, but I'm still
interested in making interface files load faster.

The idea that I currently like the most is to make it possible to save
and load objects in the GHC heap format. That way, deserialisation
could be done with a simple fread() and a fast pointer fixup pass,
which would hopefully make running many 'ghc -c' processes as fast as
a single 'ghc --make'. This trick is commonly employed in the games
industry to speed-up load times [1]. Given that Haskell is a
garbage-collected language, the implementation will be trickier than
in C++ and will have to be done on the RTS level.

Is this a good idea? How hard it would be to implement this optimisation?


I believe OCaml does something like this.

I think the main difficulty is that the data structures in the heap are
not the same every time, because we allocate unique identifiers
sequentially as each Name is created. So to make this work you would
have to make Names globally unique. Maybe using a 64-bit hash instead of
the sequentially-allocated uniques would work, but that would entail
quite a performance hit on 32-bit platforms (GHC uses IntMap everywhere
with Unique as the key).

On top of this there will be a *lot* of other complications (e.g.
handling sharing well, mapping info pointers somehow). Personally I
think it's at best very ambitious, and at worst not at all practical.


Oh, I also meant to add: the best thing we could do initially is to 
profile GHC and see if there are improvements that could be made in the 
.hi file deserialisation/typechecking.


Cheers,
Simon





Cheers,
Simon




Another idea (that I like less) is to implement a build server mode
for GHC. That way, instead of a single 'ghc --make' we could run
several ghc build servers in parallel. However, Evan Laforge's efforts
in this direction didn't bring the expected speedup. Perhaps it's
possible to improve on his work.

[1]
http://www.gamasutra.com/view/feature/132376/delicious_data_baking.php?print=1








Re: default instance for IsString

2012-04-24 Thread Simon Marlow

On 24/04/2012 11:08, Erik Hesselink wrote:

On Tue, Apr 24, 2012 at 10:55, Michael Snoyman mich...@snoyman.com wrote:

On Tue, Apr 24, 2012 at 11:36 AM, Erik Hesselink hessel...@gmail.com wrote:

On Tue, Apr 24, 2012 at 08:32, Michael Snoyman mich...@snoyman.com wrote:

Here's a theoretically simple solution to the problem. How about
adding a new method to the IsString typeclass:

isValidString :: String -> Bool


If you're going with this approach, why not evaluate the conversion
from String immediately? For either case you have to know the
monomorphic type, and converting at compile time is more efficient as
well. But we're getting pretty close to Template Haskell here.


I could be mistaken, but I think that would be much harder to
implement at the GHC level. GHC would then be responsible for taking a
compile-time value and having it available at runtime (i.e., lifting
in TH parlance). Of course, I'm no expert on GHC at all, so if someone
who actually knows what they're talking about says that this concern
is baseless, I agree that your approach is better.


But GHC already has all the infrastructure for this, right? You can do
exactly this with TH.


No, Michael is right.  The library writer would need to provide

  fromString :: String -> Q Exp

since there's no way to take an aribtrary value and convert it into 
something we can compile.
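A minimal sketch of what such a library-provided function could look like (`validURL` and its http-prefix check are purely hypothetical): the validation runs when the splice is compiled, so a bad literal becomes a compile error rather than a run-time exception.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module ValidateLit (validURL) where

import Data.List (isPrefixOf)
import Language.Haskell.TH (Q, Exp, stringE)

-- Hypothetical compile-time validator: a bad literal is rejected when
-- the splice is compiled, not when the value is forced at run time.
validURL :: String -> Q Exp
validURL s
  | "http://" `isPrefixOf` s = stringE s
  | otherwise = fail ("not a valid URL literal: " ++ s)

-- usage in another module:
--   myUrl :: String
--   myUrl = $(validURL "http://example.com")
```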


Cheers,
Simon



Re: default instance for IsString

2012-04-24 Thread Simon Marlow

On 24/04/2012 14:14, Daniel Peebles wrote:

Why are potentially partial literals scarier than the fact that every
 value in the language could lead to an exception when forced?


My thoughts exactly.  In this thread people are using the term "safe" to
mean "total".  We already overload "safe" too much, might it be a better
idea to use "total" instead?

(and FWIW I'm not sure I see what all the fuss is about either)

Cheers,
Simon




On Tue, Apr 24, 2012 at 5:35 AM, Yitzchak Gale g...@sefer.org wrote:

Markus Läll wrote:

You do know, that you already *can* have safe Text and ByteString from
an overloaded string literal.


Yes, the IsString instances for Text and ByteString are safe (I
hope).

But in order to use them, I have to turn on OverloadedStrings. That
could cause other string literals in the same module to throw
exceptions at run time.

-Yitz










Re: default instance for IsString

2012-04-24 Thread Simon Marlow

On 24/04/2012 15:19, Yitzchak Gale wrote:

Simon Marlow wrote:

In this thread people are using the term "safe" to
mean "total".  We already overload "safe" too much, might it be a better
idea to use "total" instead?


I'm not sure what you're talking about. I don't see how
this thread has anything to do with total vs. partial
functions.


My apologies if I've misunderstood, but the problem that people seem to 
be worried about is fromString failing at runtime (i.e. it is a partial 
function), and this has been referred to as unsafe.



I'm saying that the static syntax of string literals should
be checked at compile time, not at run time. Isn't that
simple enough, and self-evident?


Well, the syntax of overloaded integers isn't checked at compile time, 
so why should strings be special?
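To make the comparison concrete, here is an illustrative (not real-library) `Num` instance whose `fromInteger` is partial, exactly analogous to a partial `fromString`: the literal always compiles, and an out-of-range one only fails when the value is forced.

```haskell
-- Illustrative only: a Num instance whose fromInteger is partial.
-- Any integer literal at type Nat compiles; a negative one throws
-- when forced, just like a partial fromString would.
newtype Nat = Nat Integer deriving Show

instance Num Nat where
  fromInteger n
    | n >= 0    = Nat n
    | otherwise = error "Nat literal must be non-negative"
  Nat a + Nat b  = Nat (a + b)
  Nat a * Nat b  = Nat (a * b)
  abs            = id
  signum (Nat n) = Nat (signum n)
  negate         = error "Nat has no negation"

main :: IO ()
main = print (3 :: Nat)   -- evaluating (-1) :: Nat would throw instead
```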


I'm not arguing in favour of using OverloadedStrings for URLs or 
anything like that, but I'm not sure I see why it's bad for Text and 
ByteString.


Cheers,
Simon



Re: containing memory-consuming computations

2012-04-20 Thread Simon Marlow

On 19/04/2012 11:45, Herbert Valerio Riedel wrote:

 For the time-dimension, I'm already using functions such as
 System.Timeout.timeout which I can use to make sure that even a (forced)
 pure computation doesn't require (significantly) more wall-clock time
 than I expect it to.

Note that timeout uses wall-clock time, but you're really interested in 
CPU time (presumably).  If there are other threads running, then using 
timeout will not do what you want.
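For reference, a minimal sketch of the technique being discussed, with the stated caveat that the bound is wall-clock time: `evaluate` forces the pure computation inside IO so that `timeout` can interrupt it.

```haskell
import Control.Exception (evaluate)
import System.Timeout (timeout)

-- Bound a forced pure computation by *wall-clock* time (1 second).
-- Per the caveat above, other busy threads can make this fire early
-- even though the computation used little CPU time itself.
main :: IO ()
main = do
  r <- timeout 1000000 (evaluate (sum [1 .. 100000 :: Int]))
  case r of
    Just v  -> putStrLn ("finished: " ++ show v)
    Nothing -> putStrLn "timed out"
```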


You could track allocation and CPU usage per thread, but note that 
laziness could present a problem: if a thread evaluates a lazy 
computation created by another thread, it will be charged to the thread 
that evaluated it, not the thread that created it.  To get around this 
you would need to use the profiling system, which tracks costs 
independently of lazy evaluation.


On 19/04/2012 17:04, Herbert Valerio Riedel wrote:


At least this seems easier than needing a per-computation or
per-IO-thread caps.


How hard would per-IO-thread caps be?


For tracking memory use, which I think is what you're asking for, it 
would be quite hard.  One problem is sharing: when a data structure is 
shared between multiple threads, which one should it be charged to?  Both?


To calculate the amount of memory use per thread you would need to run 
the GC multiple times, once per thread, and observe how much data is 
reachable.  I can't think of any fundamental difficulties with doing 
that, but it could be quite expensive.  There might be some tricky 
interactions with the reachability property of threads themselves: a 
blocked thread is only reachable if the object it is blocked on is also 
reachable.


Cheers,
Simon





Re: Invariants for GHC.Event ensureIOManagerIsRunning

2012-04-18 Thread Simon Marlow

On 14/04/2012 04:53, Edward Z. Yang wrote:

Hello all,

I recently ran into a rather reproduceable bug where I would
get this error from the event manager:

 /dev/null: hClose: user error (Pattern match failure in do expression at 
libraries/base/System/Event/Thread.hs:83:3-10)

The program was doing some rather strange things:

 - It was running the Haskell RTS inside another system (Urweb)
   which was making use of pthreads, sockets, etc.

 - The Haskell portion was linked against the threaded RTS, and doing
   communication with a process.

and is rather complicated (two compilers are involved).  But
the gist of the matter is that if I added a quick call to
ensureIOManagerIsRunning after hs_init, the error went away.

So, if the IO manager is not eagerly loaded at the call to hs_init,
how do we decide when it should be loaded?  It seems probable that
we missed a case.

Edward

P.S. I tried reproducing on a simple test case but couldn't manage it.


Looking at the code I can't see how that can happen, so if you do manage 
to reproduce it on a small example, please file a bug.


Cheers,
Simon



Re: Another profiling question.

2012-04-18 Thread Simon Marlow

On 17/04/2012 16:22, Herbert Valerio Riedel wrote:

Jurriaan Hage j.h...@uu.nl writes:


from the RTS option -s I get :

   INIT  time    0.00s  (  0.00s elapsed)
   MUT   time  329.99s  (940.55s elapsed)
   GC    time  745.91s  (751.51s elapsed)
   RP    time  765.76s  (767.76s elapsed)
   PROF  time  359.95s  (362.12s elapsed)
   EXIT  time    0.00s  (  0.00s elapsed)

I can guess what most components mean, but do not know what RP stands
for.


afaik RP stands for retainer profiling, see

  
http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/prof-heap.html#retainer-prof


Yes, RP is the amount of time the RTS spent doing retainer profiling. 
Retainer profiling is a separate pass over the heap in addition to the 
usual heap census, which is recorded as PROF in the stats.


Cheers,
Simon






Re: Code review for new primop's CMM code.

2012-04-10 Thread Simon Marlow

On 29/03/2012 05:56, Ryan Newton wrote:

Hi all,

In preparation for students working on concurrent data structures
GSOC(s), I wanted to make sure they could count on CAS for array
elements as well as IORefs.  The following patch represents my first
attempt:

https://github.com/rrnewton/ghc/commit/18ed460be111b47a759486677960093d71eef386

It passes a simple test [Appendix 2 below], but I am very unsure as to
whether the GC write barrier is correct.  Could someone do a code-review
on the following few lines of CMM:

if (GET_INFO(arr) == stg_MUT_ARR_PTRS_CLEAN_info) {
   SET_HDR(arr, stg_MUT_ARR_PTRS_DIRTY_info, CCCS);
   len = StgMutArrPtrs_ptrs(arr);
   // The write barrier.  We must write a byte into the mark table:
   I8[arr + SIZEOF_StgMutArrPtrs + WDS(len) + (ind >> MUT_ARR_PTRS_CARD_BITS)] = 1;
}


Remove the conditional.  You want to always set the header to 
stg_MUT_ARR_PTRS_DIRTY_info, and always update the mark table.


Cheers,
Simon




Thanks,
   -Ryan

-- Appendix 1: First draft code CMM definition for casArray#
---
stg_casArrayzh
/* MutableArray# s a -> Int# -> a -> a -> State# s -> (# State# s, Int#, a #) */
{
    W_ arr, p, ind, old, new, h, len;
    arr = R1; // anything else?
    ind = R2;
    old = R3;
    new = R4;

    p = arr + SIZEOF_StgMutArrPtrs + WDS(ind);
    (h) = foreign "C" cas(p, old, new) [];

    if (h != old) {
        // Failure, return what was there instead of 'old':
        RET_NP(1,h);
    } else {
        // Compare and Swap Succeeded:
        if (GET_INFO(arr) == stg_MUT_ARR_PTRS_CLEAN_info) {
            SET_HDR(arr, stg_MUT_ARR_PTRS_DIRTY_info, CCCS);
            len = StgMutArrPtrs_ptrs(arr);
            // The write barrier.  We must write a byte into the mark table:
            I8[arr + SIZEOF_StgMutArrPtrs + WDS(len) + (ind >> MUT_ARR_PTRS_CARD_BITS)] = 1;
        }
        RET_NP(0,h);
    }
}

-- Appendix 2:  Simple test file; when run it should print:
---
-- Perform a CAS within a MutableArray#
--   1st try should succeed: (True,33)
-- 2nd should fail: (False,44)
-- Printing array:
--   33  33  33  44  33
-- Done.
---
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.IO
import GHC.IORef
import GHC.ST
import GHC.STRef
import GHC.Prim
import GHC.Base
import Data.Primitive.Array
import Control.Monad



-- | Compare-and-swap the value at the given index of the array:
casArrayST :: MutableArray s a -> Int -> a -> a -> ST s (Bool, a)
casArrayST (MutableArray arr#) (I# i#) old new = ST $ \s1# ->
  case casArray# arr# i# old new s1# of
    (# s2#, x#, res #) -> (# s2#, (x# ==# 0#, res) #)


{-# NOINLINE mynum #-}
mynum :: Int
mynum = 33

main = do
  putStrLn "Perform a CAS within a MutableArray#"
  arr <- newArray 5 mynum

  res  <- stToIO $ casArrayST arr 3 mynum 44
  res2 <- stToIO $ casArrayST arr 3 mynum 44
  putStrLn $ "  1st try should succeed: " ++ show res
  putStrLn $ "2nd should fail: " ++ show res2

  putStrLn "Printing array:"
  forM_ [0..4] $ \i -> do
    x <- readArray arr i
    putStr (" " ++ show x)
  putStrLn ""
  putStrLn "Done."








Re: Potential GSoC proposal: Reduce the speed gap between 'ghc -c' and 'ghc --make'

2012-04-10 Thread Simon Marlow

On 02/04/2012 07:37, Mikhail Glushenkov wrote:

Hi all,

[Hoping it's not too late.]

During my work on parallelising 'ghc --make' [1] I encountered a
stumbling block: running 'ghc --make' can be often much faster than
using separate compile ('ghc -c') and link stages, which means that
any parallel build tool built on top of 'ghc -c' will be significantly
handicapped [2]. As far as I understand, this is mainly due to the
effects of interface file caching - 'ghc --make' only needs to parse
and load them once. One potential improvement (suggested by Duncan
Coutts [3]) is to produce whole-package interface files and load them
in using mmap().

Questions:

Would implementing this optimisation be a worthwhile/realistic GSoC project?
What are other potential ways to bring 'ghc -c' performance up to par
with 'ghc --make'?


My guess is that this won't have a significant impact on ghc -c compile 
times.


The advantage of squashing the .hi files for a package together is that 
they could share a string table, which would save a bit of space and 
time, but I think the time saved is small compared to the cost of 
deserialising and typechecking the declarations from the interface, 
which still has to be done.  In fact it might make things worse, if the 
string table for the whole base package is larger than the individual 
tables that would be read from .hi files.  I don't think mmap() will buy 
very much over the current scheme of just reading the file into a ByteArray.


Of course this is all just (educated) guesswork without actual 
measurements, and I could be wrong...


Perhaps there are ways to optimise the reading of interface files.  A 
good first step would be to do some profiling and see where the hotspots 
are.


Cheers,
Simon



Re: What do the following numbers mean?

2012-04-10 Thread Simon Marlow

On 03/04/2012 00:46, Ben Lippmeier wrote:


On 02/04/2012, at 10:10 PM, Jurriaan Hage wrote:

Can anyone tell me what the exact difference is between
"1,842,979,344 bytes maximum residency (219 sample(s))" and "4451 MB
total memory in use (0 MB lost due to fragmentation)"?

I could not find this information in the docs anywhere, but I may
have missed it.


The maximum residency is the peak amount of live data in the heap.
The total memory in use is the peak amount that the GHC runtime
requested from the operating system. Because the runtime system
ensures that the heap is always bigger than the size of the live
data, the second number will be larger.

The maximum residency is determined by performing a garbage
collection, which traces out the graph of live objects. This means
that the number reported may not be the exact peak memory use of the
program, because objects could be allocated and then become
unreachable before the next sample. If you want a more accurate
number then increase the frequency of the heap sampling with the
-i<sec> RTS flag.


To put it another way, the difference between maximum residency and 
total memory in use is the overhead imposed by the runtime's memory 
manager.


Typically for the default settings the total memory in use will be about 
three times the maximum residency, because the runtime is using copying 
GC.  If your maximum residency is L (for Live data), and we let the heap 
grow to size 2L before doing a GC (the 2 can be tuned with the -F flag), 
and we need another L to copy the live data into, then we need in total 3L.


This assumes that the live data remains constant, which it doesn't in 
practice, hence the overhead is not always exactly 3L.  Generational GC 
also adds some memory overhead, but with the default settings it is 
limited to at most 1MB (512KB for the nursery, and another 512KB for aging).
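
To make the arithmetic above concrete, here is a tiny sketch (an illustration of the model Simon describes, not the RTS's actual accounting) applied to the numbers from the original question:

```haskell
-- Model from the explanation above: with live data l and heap growth
-- factor f (the -F flag, default 2), peak memory is roughly
-- f*l (heap grown before GC) + l (copy space) = (f + 1) * l.
expectedPeakMB :: Double -> Double -> Double
expectedPeakMB f l = (f + 1) * l

main :: IO ()
main =
  -- 1,842,979,344 bytes of maximum residency is about 1758 MB, so the
  -- model predicts roughly 3 * 1758 = 5273 MB; the observed 4451 MB is
  -- lower because the live data does not sit at its peak throughout.
  print (expectedPeakMB 2 (1842979344 / (1024 * 1024)))
```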


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [saj...@gmail.com: Google Summer of Code: a NUMA wishlist!]

2012-03-29 Thread Simon Marlow

On 28/03/2012 16:57, Tyson Whitehead wrote:

On March 28, 2012 04:41:16 Simon Marlow wrote:

Sure.  Do you have a NUMA machine to test on?


My understanding is that non-NUMA machines went away when AMD and Intel moved
away from frontside buses (FSB) and integrated the memory controllers on die.

Intel is more recent to this game.  I believe AMD's last non-NUMA machines
were the Athlon XP series and Intel's the Core 2 series.

An easy way to see what you've got is to see what 'numactl --hardware' says.
If the node distance matrix is not uniform, you have NUMA hardware.

As an example, on a 8 socket Opteron machine (32 cores) you get

$ numactl --hardware
available: 8 nodes (0-7)
node 0 size: 16140 MB
node 0 free: 3670 MB
node 1 size: 16160 MB
node 1 free: 3472 MB
node 2 size: 16160 MB
node 2 free: 4749 MB
node 3 size: 16160 MB
node 3 free: 4542 MB
node 4 size: 16160 MB
node 4 free: 3110 MB
node 5 size: 16160 MB
node 5 free: 1963 MB
node 6 size: 16160 MB
node 6 free: 1715 MB
node 7 size: 16160 MB
node 7 free: 2862 MB
node distances:
node   0   1   2   3   4   5   6   7
   0:  10  20  20  20  20  20  20  20
   1:  20  10  20  20  20  20  20  20
   2:  20  20  10  20  20  20  20  20
   3:  20  20  20  10  20  20  20  20
   4:  20  20  20  20  10  20  20  20
   5:  20  20  20  20  20  10  20  20
   6:  20  20  20  20  20  20  10  20
   7:  20  20  20  20  20  20  20  10


Well, you learn something new every day!  On the new 32-core Opteron box 
we have here:


available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12
node 0 size: 8182 MB
node 0 free: 1994 MB
node 1 cpus: 16 20 24 28
node 1 size: 8192 MB
node 1 free: 2783 MB
node 2 cpus: 3 7 11 15
node 2 size: 8192 MB
node 2 free: 2961 MB
node 3 cpus: 19 23 27 31
node 3 size: 8192 MB
node 3 free: 5359 MB
node 4 cpus: 2 6 10 14
node 4 size: 8192 MB
node 4 free: 3030 MB
node 5 cpus: 18 22 26 30
node 5 size: 8192 MB
node 5 free: 4667 MB
node 6 cpus: 1 5 9 13
node 6 size: 8192 MB
node 6 free: 3240 MB
node 7 cpus: 17 21 25 29
node 7 size: 8192 MB
node 7 free: 4031 MB
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  16  16  22  16  22  16  22
  1:  16  10  16  22  22  16  22  16
  2:  16  16  10  16  16  16  16  22
  3:  22  22  16  10  16  16  22  16
  4:  16  22  16  16  10  16  16  16
  5:  22  16  16  16  16  10  22  22
  6:  16  22  16  22  16  22  10  16
  7:  22  16  22  16  16  22  16  10

The node distances on this box are less uniform than yours.
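
The uniformity test quoted above ("if the node distance matrix is not uniform, you have NUMA hardware") can be sketched as a small function. This is only an illustration of the heuristic, not part of any GHC or numactl API:

```haskell
-- Hardware is NUMA if some remote (off-diagonal) distance differs
-- from the local (diagonal) one.
isNUMA :: [[Int]] -> Bool
isNUMA m = or [ d /= (m !! i) !! i
              | (i, row) <- zip [0 ..] m
              , (j, d)   <- zip [0 ..] row
              , i /= j ]

main :: IO ()
main = do
  -- A single-node (non-NUMA) box has no off-diagonal entries at all:
  print (isNUMA [[10]])                                   -- False
  -- The 8-socket Opteron above (10 local, 20 remote) is NUMA:
  print (isNUMA [ [ if i == j then 10 else 20 | j <- [0 .. 7] ]
                | i <- [0 .. 7 :: Int] ])                 -- True
```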

Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [saj...@gmail.com: Google Summer of Code: a NUMA wishlist!]

2012-03-28 Thread Simon Marlow

On 27/03/2012 01:14, Sajith T S wrote:

Hi Simon,

Thanks for the reply.  It seems that forwarding the message here was a
very good idea!

Simon Marlow marlo...@gmail.com wrote:


  -- From a very recent discussion on parallel-haskell [4], we learn
 that RTS' NUMA support could be improved.  The hypothesis is that
 allocating nurseries per Capability might be a better plan than
 using global pool.  We might borrow/steal ideas from hwloc [5] for
 this.


I like this idea too (since I suggested it :-).


I guess you will also be available for eventual pestering about this
stuff, then? :)


Sure.  Do you have a NUMA machine to test on?

Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] Regarding Haskell FFI

2012-03-28 Thread Simon Marlow

On 27/03/2012 08:56, rajendra prasad wrote:

Hi,

I am trying to load the DLL(Wrapper.dll) in my code(Main.hs). When I am
placing the dll in local directory, I am able to load it through
following command:

ghci Main.hs -L. -lWrapper


But, I am not able to load it if I am putting it in some other
directory(../../bin). I used the following command:

ghci Main.hs -L../../bin/ -lWrapper

I doubt I am not using the correct way to specify the path of dll in the
command.

Please correct me if I am wrong.


What version of GHC is this?  We fixed some bugs in that area recently.

Cheers,
Simon



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Regarding Haskell FFI

2012-03-28 Thread Simon Marlow

I think the fix eventually made its way into 7.4.1.  This is the patch:

http://hackage.haskell.org/trac/ghc/changeset/d146fdbbf8941a8344f0ec300e79dbeabc08d1ea

Cheers,
Simon


On 28/03/2012 09:57, rajendra prasad wrote:

Hi,

I am using GHC version 7.0.4.

Thanks,
Rajendra




On Wed, Mar 28, 2012 at 2:09 PM, Simon Marlow marlo...@gmail.com
mailto:marlo...@gmail.com wrote:

On 27/03/2012 08:56, rajendra prasad wrote:

Hi,

I am trying to load the DLL(Wrapper.dll) in my code(Main.hs).
When I am
placing the dll in local directory, I am able to load it through
following command:

ghci Main.hs -L. -lWrapper


But, I am not able to load it if I am putting it in some other
directory(../../bin). I used the following command:

ghci Main.hs -L../../bin/ -lWrapper

I doubt I am not using the correct way to specify the path of
dll in the
command.

Please correct me if I am wrong.


What version of GHC is this?  We fixed some bugs in that area recently.

Cheers,
Simon






___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [saj...@gmail.com: Google Summer of Code: a NUMA wishlist!]

2012-03-26 Thread Simon Marlow

On 26/03/2012 04:25, Sajith T S wrote:


Date: Sun, 25 Mar 2012 22:49:52 -0400
From: Sajith T S saj...@gmail.com
To: The Haskell Cafe haskell-c...@haskell.org
Subject: Google Summer of Code: a NUMA wishlist!

Dear Cafe,

It's last minute-ish to bring this up (in my part of the world it's
still March 25), but graduate students are famously a busy and lazy
lot. :)  I study at Indiana University Bloomington, and I wish to
propose^W rush in this proposal and solicit feedback, mentors, etc
while I can.

Since student application deadline is April 6, I figure we can beat
this into a real proposal's shape by then.  This probably also falls
on the naive and ambitious side of things, and I might not even know
what I'm talking about, but let's see!  That's the idea of proposal,
yes?

Broadly, idea is to improve support for NUMA systems.  Specifically:

  -- Real physical processor affinity with forkOn [1].  Can we fire all
 CPUs if we want to?  (Currently, the number passed to forkOn is
 interpreted as number modulo the value returned by
 getNumCapabilities [2]).


You can get real processor affinity with +RTS -qa in combination with 
forkOn.
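
A minimal sketch of that combination (compile with -threaded and run with something like "+RTS -N4 -qa", so each capability is bound to an OS CPU; forkOn then interprets its argument modulo getNumCapabilities):

```haskell
import Control.Concurrent

main :: IO ()
main = do
  n    <- getNumCapabilities
  done <- newEmptyMVar
  -- One thread pinned to each capability; with -qa the capabilities
  -- themselves are bound to OS CPUs, giving real processor affinity.
  mapM_ (\i -> forkOn i (putMVar done i)) [0 .. n - 1]
  mapM_ (const (takeMVar done >>= print)) [0 .. n - 1]
```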



  -- Also kind of associated with the above: when launching processes,
 we might want to specify a list of CPUs rather than the number of
 CPUs.  Say, a -N [0,1,3] flag rather than -N 3 flag.  This shall
 enable us to gawk at real pretty htop [3] output.


I like that idea.


  -- From a very recent discussion on parallel-haskell [4], we learn
 that RTS' NUMA support could be improved.  The hypothesis is that
 allocating nurseries per Capability might be a better plan than
 using global pool.  We might borrow/steal ideas from hwloc [5] for
 this.


I like this idea too (since I suggested it :-).


  -- Finally, a logging/monitoring infrastructure to verify assumptions
 and determine if/how local work stays.


I'm not sure if you're suggesting a *new* logging/monitoring framework 
here, but in any case it would make much more sense to extend ghc-events 
and ThreadScope rather than building something new.  There is ongoing 
work to have ThreadScope understand the output of the Linux perf tool, 
which would give insight into CPU scheduling activity amongst other 
things.  Talk to Duncan Coutts dun...@well-typed.com about how far 
this is along and the best way for a GSoc project to help (usually it 
works best when the GSoc project is not dependent on, or depended on by, 
other ongoing projects - reducing synchronisation overhead and latency 
due to blocking is always good!).


Cheers,
Simon



(I would like to acknowledge my fellow conspirators and leave them
unnamed, lest they shall be embarrassed by my... naivete.)

Thanks,
Sajith.

[1] 
http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html#v:forkOn
[2] 
http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html#v:getNumCapabilities
[3] http://htop.sourceforge.net/
[4] 
http://groups.google.com/group/parallel-haskell/browse_thread/thread/7ec1ebc73dde8bbd
[5] http://www.open-mpi.org/projects/hwloc/




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


RE: String != [Char]

2012-03-26 Thread Simon Marlow
 The primary argument is to not break something that works well for most
 purposes, including teaching, at a huge cost of backwards compatibility
 for marginal if any real benefits.

I'm persuaded by this argument.  And I'm glad that teachers are speaking up in 
this debate - it's hard to get a balanced discussion on an obscure mailing list.

So I'm far from convinced that [Char] is a bad default for the String type.  
But it's important that as far as possible Text should not be a second class 
citizen, so I'd support adding OverloadedStrings to the language, and maybe 
looking at overloading some of the String APIs in the standard libraries.

Remember that FilePath is not part of the debate, since neither [Char] nor Text 
is a correct representation of FilePath.

If we want to do an evaluation of the pedagogical value of [Char] vs. Text, I 
suggest writing something like a regex matcher in both and comparing the two.
 
One more thing: historically, performance considerations have been given a 
fairly low priority in the language design process for Haskell, and rightly so. 
 That doesn't mean performance has been ignored altogether (for example, seq), 
but it is almost never the case that a concession in other language design 
principles (e.g. consistency, simplicity) is made for performance reasons 
alone.  We should remember, when thinking about changes to Haskell, that 
Haskell is the way it is because of this uncompromising attitude, and we should 
be glad that Haskell is not burdened with (many) legacy warts that were 
invented to work around performance problems that no longer exist.  I'm not 
saying that this means we should ignore Text as a performance hack, just that 
performance should not come at the expense of good language design.

Cheers,
Simon



___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: GHCi and line numbers (with ghc-7.4.1)

2012-03-23 Thread Simon Marlow

On 22/03/2012 11:36, Christopher Done wrote:

On 22 March 2012 12:13, Simon Marlow marlo...@gmail.com wrote:

On 20/03/2012 20:12, Simon Hengel wrote:

They are now incremented with each evaluated expression.


Why *are* they incremented with each evaluation? Surely the only use
for line numbers would be in multi-line statements:


Prelude> :{
Prelude| do x <- [1..10]
Prelude|    return y
Prelude| :}

interactive:6:11: Not in scope: `y'

Would it not make more sense to have

interactive:2:11: Not in scope: `y'

as it would do if compiling the file in a source file? From the older
GHCs, this always gives 1, indicating that multi-line statements are
somehow parsed and collapsed before being compiled, or maybe the line
number was just hard coded to 1.


One reason for having the line number is that it gets attached to 
declarations:


Prelude> let x = 3
Prelude> let y = 4
Prelude> :show bindings
x :: Integer = 3
y :: Integer = 4
Prelude> :i x
x :: Integer    -- Defined at interactive:20:5
Prelude> :i y
y :: Integer    -- Defined at interactive:21:5

I think another reason we added it was so that we could tell when a 
declaration has been shadowed:


Prelude> data T = A
Prelude> :i T
data T = A  -- Defined at interactive:25:6
Prelude> data T = B
Prelude> :i T
data T = B  -- Defined at interactive:27:6
Prelude>

admittedly it's not a super-useful feature, but if you're dealing with 
multiple bindings with the same name it does give you some confirmation 
that GHC is thinking about the same one that you are.


Cheers,
Simon




FWIW, in my Emacs mode (making good progress on adding to
haskell-mode) I use the column number in the REPL to highlight on the
line where the problem is (e.g. here
http://chrisdone.com/images/hs-repl-error-demo.png), for GHC 7.* with
proper multi-line support I will automatically wrap any multi-line
expressions entered in the REPL in :{ and :}, it would be cool for
line numbers in errors to be useful for that. (Arguably we should be
using the GHC API and Scion or something like it, but these change
quite often and are hard to support whereas interfacing with GHCi is
quite stable across around seven releases and just works.)

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users





Re: GHCi and line numbers (with ghc-7.4.1)

2012-03-22 Thread Simon Marlow

On 20/03/2012 20:12, Simon Hengel wrote:

Hi,
ghc --interactive now behaves different in regards to line numbers in
error messages than previous versions.

They are now incremented with each evaluated expression.

 $ ghc --interactive -ignore-dot-ghci
 Prelude> foo

 interactive:2:1: Not in scope: `foo'
 Prelude> bar

 interactive:3:1: Not in scope: `bar'

Is there a way to disable this (or alternatively reset the counter for a
running session)?


Sorry, there's no way to reset it at the moment.  What do you need that for?

Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


RE: String != [Char]

2012-03-20 Thread Simon Marlow
 On Mon, Mar 19, 2012 at 9:02 AM, Christian Siefkes christ...@siefkes.net
 wrote:
  On 03/19/2012 04:53 PM, Johan Tibell wrote:
  I've been thinking about this question as well. How about
 
  class IsString s where
      unpackCString :: Ptr Word8 -> CSize -> s
 
  What's the Ptr Word8 supposed to contain? A UTF-8 encoded string?
 
 Yes.
 
 We could make a distinction between byte and Unicode literals and have:
 
 class IsBytes a where
 unpackBytes :: Ptr Word8 -> Int -> a
 
 class IsText a where
 unpackText :: Ptr Word8 -> Int -> a
 
 In the latter the caller guarantees that the passed in pointer points to
 wellformed UTF-8 data.

Is there a reason not to put all these methods in the IsString class, with 
appropriate default definitions?  You would need a UTF-8 encoder (& decoder) of 
course, but it would reduce the burden on clients and improve backwards 
compatibility.
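
A sketch of what that single-class design might look like. The method names follow the thread, but the defaults shown here (a Latin-1 placeholder decoding) are purely illustrative; a real version would plug in a proper UTF-8 decoder:

```haskell
{-# LANGUAGE FlexibleInstances #-}
import Data.Char (chr)
import Data.Word (Word8)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr)
import System.IO.Unsafe (unsafePerformIO)

class IsString a where
  fromString  :: String -> a
  unpackBytes :: Ptr Word8 -> Int -> a
  unpackText  :: Ptr Word8 -> Int -> a
  -- Defaults route everything through fromString, so existing
  -- instances only have to define that one method.
  unpackBytes p n =
    fromString (map (chr . fromIntegral) (unsafePerformIO (peekArray n p)))
  unpackText = unpackBytes  -- placeholder: a real default would decode UTF-8

instance IsString [Char] where
  fromString = id
```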

Cheers,
Simon



___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


RE: What is a punctuation character?

2012-03-19 Thread Simon Marlow
 On Fri, Mar 16, 2012 at 6:49 PM, Ian Lynagh ig...@earth.li wrote:
  Hi Gaby,
 
  On Fri, Mar 16, 2012 at 06:29:24PM -0500, Gabriel Dos Reis wrote:
 
  OK, thanks!  I guess a take away from this discussion is that what is
  a punctuation is far less well defined than it appears...
 
  I'm not really sure what you're asking. Haskell's uniSymbol includes
  all Unicode characters (should that be codepoints? I'm not a Unicode
  expert) in the punctuation category; I'm not sure what the best
  reference is, but e.g. table 12 in
     http://www.unicode.org/reports/tr44/tr44-8.html#Property_Values
  lists a number of Px categories, and a meta-category P Punctuation.
 
 
  Thanks
  Ian
 
 
 Hi Ian,
 
 I guess what I am asking was partly summarized in Iavor's message.
 
 For me, the issue started with bullet number 4 in section 1.1
 
  http://www.haskell.org/onlinereport/intro.html#sect1.1
 
 which states that:
 
The lexical structure captures the concrete representation
of Haskell programs in text files.
 
 That combined with the opening section 2.1 (e.g. example of terminal
 syntax) and the fact that the grammar  routinely described two non-
 terminals ascXXX (for ASCII characters) and uniXXX for (Unicode character)
 suggested that the concrete syntax of Haskell programs in text files is in
 ASCII charset.  Note this does not conflict with the general statement
 that Haskell programs use the Unicode character because the uniXXX could
 use the ASCII charset to introduce Unicode characters -- this is not
 uncommon practice for programming languages using Unicode characters; see
 the link I gave earlier.
 
 However, if I understand Malcolm's message correctly, this is not the
 case.
 Contrary to what I quoted above, Chapter 2 does NOT specify the concrete
 representation of Haskell programs in text files.  What it does is to
 capture the structure of what is obtained from interpreting, *in some
 unspecified encoding or unspecified alphabet*,  the concrete
 representation of Haskell programs in text files.  This conclusion is
 unfortunate, but I believe it is correct.
 Since the encoding or the alphabet is unspecified, it is no longer
 necessarily the case that two Haskell implementations would agree on the
 same lexical interpretation when presented with the same exact text file
 containing  a Haskell program.
 
 In its current form, you are correct that the Report should say
 codepoint
 instead of characters.
 
 I join Iavor's request in clarifying the alphabet used in the grammar.

The report gives meaning to a sequence of codepoints only; it says nothing 
about how that sequence of codepoints is represented as a string of bytes in a 
file, nor does it say anything about what those files are called, or even 
whether there are files at all.

Perhaps some clarification is in order in a future revision, and we should use 
the correct terminology where appropriate.  We should also clarify that 
punctuation means exactly the Punctuation class.

With regards to normalisation and equivalence, my understanding is that Haskell 
does not support either: two identifiers are equal if and only if they are 
represented by the same sequence of codepoints.  Again, we could add a 
clarifying sentence to the report.
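
A one-line illustration of that rule: the precomposed and combining-mark spellings of "é" render identically but are different codepoint sequences, so Haskell would treat identifiers containing them as distinct:

```haskell
main :: IO ()
main = do
  print ("\x00e9" == "e\x0301")               -- False: compared by codepoints
  print (length "\x00e9", length "e\x0301")   -- (1,2): no normalisation
```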

Cheers,
Simon



___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: profiling and backtracing blues

2012-03-15 Thread Simon Marlow

On 14/03/12 22:32, Ranjit Jhala wrote:

Dear Simon,

I am indeed using the GHC API -- to crunch .hs source to CoreExpr,
which I then walk over to generate refinement type constraints and
so on.

In the past (with GHC 7.04) I *was* able to do some profiling -- to
hunt down a space leak. However, perhaps at that time I was not using
hscCompileCoreExpr but something else? However, it could also be
something silly like me not having built 7.4.1 with profiling support?

Specifically, here's I think, the key bits of GHC API code I'm using
(from the link you sent, I suspect 2 is the problem) but any clues
will be welcome!

1. To extract the mod_guts from the file fn

getGhcModGuts1 :: (GhcMonad m) => FilePath -> m ModGuts
getGhcModGuts1 fn = do
    liftIO $ deleteBinFiles fn
    target <- guessTarget fn Nothing
    addTarget target
    load LoadAllTargets
    modGraph <- depanal [] True
    case find ((== fn) . msHsFilePath) modGraph of
      Just modSummary -> do
        mod_guts <- coreModule `fmap`
                      (desugarModule =<< typecheckModule =<< parseModule modSummary)
        return mod_guts

2. To convert a raw string (e.g. "map" or "zipWith") to the corresponding Name
inside GHC.  I suspect this is the bit that touches the GHCi code -- because
that's where I extracted it from.  Is this what is causing the problem?

stringToNameEnv :: HscEnv -> String -> IO Name
stringToNameEnv env s
  = do L _ rn <- hscParseIdentifier env s
       (_, lookupres) <- tcRnLookupRdrName env rn
       case lookupres of
         Just (n:_) -> return n
         _          -> errorstar $ "Bare.lookupName cannot find name for: " ++ s


The code in (2) doesn't reach hscCompileCoreExpr.  In (1), the only way 
to get to hscCompileCoreExpr is by compiling a module that contains some 
Template Haskell or quasiquotes.  Could that be the case?  (the reason 
is that TH and QQ both need to compile some code and run it on the fly, 
which requires the interpreter, which is the bit that doesn't work with 
profiling).


Cheers,
Simon







On Mar 14, 2012, at 3:59 AM, Simon Marlow wrote:


On 13/03/2012 21:25, Ranjit Jhala wrote:

Hi all,

I'm trying to use the nifty backtracing mechanism in GHC 74.
AFAICT, this requires everything be built with profiling on),
but as a consequence, I hit this:

You can't call hscCompileCoreExpr in a profiled compiler

Any hints on whether there are work-arounds?


Can you give more details about what you're trying to do?  Are you using the 
GHC API in some way?

I'm afraid there's something of a deep limitation in that the interpreter that 
is used by GHCi and Template Haskell doesn't work with profiling:

  http://hackage.haskell.org/trac/ghc/ticket/3360

We think it is quite a lot of work to fix this.

Cheers,
Simon





___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: profiling and backtracing blues

2012-03-14 Thread Simon Marlow

On 13/03/2012 21:25, Ranjit Jhala wrote:

Hi all,

I'm trying to use the nifty backtracing mechanism in GHC 74.
AFAICT, this requires everything be built with profiling on),
but as a consequence, I hit this:

You can't call hscCompileCoreExpr in a profiled compiler

Any hints on whether there are work-arounds?


Can you give more details about what you're trying to do?  Are you using 
the GHC API in some way?


I'm afraid there's something of a deep limitation in that the 
interpreter that is used by GHCi and Template Haskell doesn't work with 
profiling:


  http://hackage.haskell.org/trac/ghc/ticket/3360

We think it is quite a lot of work to fix this.

Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Boxed foreign prim

2012-03-13 Thread Simon Marlow

On 12/03/2012 14:22, Edward Kmett wrote:

On Mon, Mar 12, 2012 at 6:45 AM, Simon Marlow marlo...@gmail.com
mailto:marlo...@gmail.com wrote:

But I can only pass unboxed types to foreign prim.

Is this an intrinsic limitation or just an artifact of the use cases
that have presented themselves to date?


It's an intrinsic limitation - the I# box is handled entirely at the
Haskell level, primitives only deal with primitive types.


Ah. I was reasoning by comparison to atomicModifyMutVar#, which deals
with unboxed polymorphic types, and even lies with a too general return
type. Though the result there is returned in an unboxed tuple, the
argument is passed unboxed.

Is that implemented specially?


I'm a little bit confused.

atomicModifyMutVar#
   :: MutVar# s a -> (a -> b) -> State# s -> (# State# s, c #)

Is the unboxed polymorphic type you're referring to the MutVar# s a? 
 Perhaps the confusion is around the term unboxed - we normally say 
that MutVar# is unlifted (no _|_), but it is not unboxed because its 
representation is a pointer to a heap object.



But anyway, I suspect your first definition of unsafeIndex will be
faster than the one using foreign import prim, because calling
out-of-line to do the indexing is slow.


Sure though, I suppose the balance may shift as the size of the
short vector grows. (e.g. with Johan it'd probably be 16 items).

Also pseq is slow - use seq instead.


Of course. I was being paranoid at the time and trying to get it to work
at all. ;)

what you really want is built-in support for unsafeField#, which is
certainly do-able.  It's very similar to dataToTag# in the way that
the argument is required to be evaluated - this is the main
fragility, unfortunately GHC doesn't have a way to talk about things
that are unlifted (except for the primitive unlifted types).  But it
just about works if you make sure there's a seq in the right place.


I'd be happy even if I had to seq the argument myself before applying
it, as I was trying above.


The problem is, that can't be done reliably.  For dataToTag# the 
compiler automatically inserts the seq just before code generation if it 
can't prove that the argument is already evaluated, I think we would 
want to do the same thing for unsafeField#.


See CorePrep.saturateDataToTag in the GHC sources.
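
For reference, a small example of the dataToTag# behaviour being compared to here; the explicit seq mirrors what CorePrep inserts when it cannot prove the argument is already evaluated:

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), dataToTag#)

data Colour = Red | Green | Blue

-- The seq guarantees the argument is evaluated before its tag is read;
-- dataToTag# on a thunk would read the thunk's header, not the tag.
tagOf :: Colour -> Int
tagOf c = c `seq` I# (dataToTag# c)

main :: IO ()
main = mapM_ (print . tagOf) [Red, Green, Blue]  -- 0, 1, 2
```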

Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Boxed foreign prim

2012-03-12 Thread Simon Marlow

On 09/03/2012 04:12, Edward Kmett wrote:

I'm currently working with a lot of very short arrays of fixed length
and as a thought experiment I thought I would try to play with fast
numeric field accessors

In particular, I'd like to use something like foreign prim to do
something like

  foreign import prim "cmm_getField" unsafeField# :: a -> Int# -> b

  unsafeField :: a -> Int -> b
  unsafeField a (I# i) = a' `pseq` unsafeField# a' i
    -- the pseq could be moved into the prim, I suppose.
    where a' = unsafeCoerce a

  fst :: (a,b) -> a
  fst = unsafeField 0

  snd :: (a,b) -> b
  snd = unsafeField 1

This becomes more reasonable to consider when you are forced to make
something like

  data V4 a = V4 a a a a

using

  unsafeIndex (V4 a _ _ _) 0 = a
  unsafeIndex (V4 _ b _ _) 1 = b
  unsafeIndex (V4 _ _ c _) 2 = c
  unsafeIndex (V4 _ _ _ d) 3 = d

rather than

  unsafeIndex :: V4 a -> Int -> a
  unsafeIndex = unsafeField

But I can only pass unboxed types to foreign prim.

Is this an intrinsic limitation or just an artifact of the use cases
that have presented themselves to date?


It's an intrinsic limitation - the I# box is handled entirely at the 
Haskell level, primitives only deal with primitive types.


But anyway, I suspect your first definition of unsafeIndex will be 
faster than the one using foreign import prim, because calling 
out-of-line to do the indexing is slow.  Also pseq is slow - use seq 
instead.


what you really want is built-in support for unsafeField#, which is 
certainly do-able.  It's very similar to dataToTag# in the way that the 
argument is required to be evaluated - this is the main fragility, 
unfortunately GHC doesn't have a way to talk about things that are 
unlifted (except for the primitive unlifted types).  But it just about 
works if you make sure there's a seq in the right place.


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Getting the file descriptor of a handle, without closing it

2012-03-12 Thread Simon Marlow

On 11/03/2012 01:31, Volker Wysk wrote:

Hi

This is an addition to my previous post.


This modified version of main seems to work:

main = do

fd <- unsafeWithHandleFd stdin return
putStrLn ("stdin: fd = " ++ show fd)

fd <- unsafeWithHandleFd stdout return
putStrLn ("stdout: fd = " ++ show fd)


The way I understand it, unsafeWithHandleFd's job is to keep a reference to
the hande, so it won't be garbage collected, while the action is still
running. Garbage collecting the handle would close it, as well as the
underlying file descriptor, while the latter is still in use by the action.
This can't happen as long as use of the file descriptor is encapsulated in the
action.

This encapsulation can be circumvented by returning the file descriptor, and
that's what the modified main function above does. This should usually never be
done.


Right.  The problem with this:

-- Blocks
   unsafeWithHandleFd stdout $ \fd ->
      putStrLn ("stdout: fd = " ++ show fd)

is that unsafeWithHandleFd is holding the lock on stdout, while you try 
to write to it with putStrLn.  The implementation of unsafeWithHandleFd 
could probably be fixed to avoid this - as you say, all it needs is to 
hold a reference to the Handle until the function has returned.  The 
usual way to hold a reference to something is to use touch#.
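
A sketch of that fix (illustrative, not the actual library implementation): touch the Handle after the action has returned, so the reference stays live across the action without holding the Handle's lock:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (touch#)
import GHC.IO (IO (IO))

-- Keep x reachable (from the GC's point of view) until act has finished.
keepAlive :: a -> IO b -> IO b
keepAlive x act = do
  r <- act
  IO (\s -> (# touch# x s, () #))  -- touch# is a no-op at runtime
  return r
```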



However, I want to use it with stdin, stdout and stderr, only.


Is there some reason you can't just use 0, 1, and 2?

 These three

should never be garbage collected, should they? I think it would be safe to
use unsafeWithHandleFd this way. Am I right?


I wouldn't do that, but you're probably right that it is safe right now. 
(but no guarantees that it will continue to work for ever.)


Cheers,
Simon


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: How unsafe are unsafeThawArray# and unsafeFreezeArray#

2012-03-09 Thread Simon Marlow

On 09/03/2012 01:13, Johan Tibell wrote:

Hi,

I just ran across some code that calls unsafeThawArray#, writeArray#,
and unsafeFreezeArray#, in that order. How unsafe is that?

  * Is it unsafe in the sense that if someone has a reference to the
original Array# they will see the value of that pure array change?


Yes.


  * Is it unsafe in the sense things will crash and burn?


No. (at least, we would consider it a bug if a crash was the result)
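
A small demonstration of that first kind of unsafety, using the analogous operations from the array package; the behaviour relies on the thawed array sharing storage with the original, which is exactly what makes the thaw "unsafe":

```haskell
import Data.Array (Array, elems, listArray)
import Data.Array.IO (IOArray, writeArray)
import Data.Array.Unsafe (unsafeThaw)

main :: IO ()
main = do
  let frozen = listArray (0, 2) "abc" :: Array Int Char
  m <- unsafeThaw frozen :: IO (IOArray Int Char)
  writeArray m 0 'X'
  -- Anyone still holding the original "pure" array sees the mutation:
  putStrLn (elems frozen)  -- prints "Xbc"
```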


I looked at the implementation of unsafeFreezeArray#, which is

 emitPrimOp [res] UnsafeFreezeArrayOp [arg] _
    = stmtsC [ setInfo arg (CmmLit (CmmLabel mkMAP_FROZEN_infoLabel)),
               CmmAssign (CmmLocal res) arg ]

but I couldn't find any implementation of unsafeThawArray#. Is it a
no-op? It seems to me that if unsafeFreezeArray# changes the head of
the array heap object so should unsafeThawArray#.


You'll find the implementation of unsafeThawArray# in rts/PrimOps.cmm, 
reproduced here for your enjoyment:


stg_unsafeThawArrayzh
{
  // SUBTLETY TO DO WITH THE OLD GEN MUTABLE LIST
  //
  // A MUT_ARR_PTRS lives on the mutable list, but a MUT_ARR_PTRS_FROZEN
  // normally doesn't.  However, when we freeze a MUT_ARR_PTRS, we leave
  // it on the mutable list for the GC to remove (removing something from
  // the mutable list is not easy).
  //
  // So that we can tell whether a MUT_ARR_PTRS_FROZEN is on the mutable list,
  // when we freeze it we set the info ptr to be MUT_ARR_PTRS_FROZEN0
  // to indicate that it is still on the mutable list.
  //
  // So, when we thaw a MUT_ARR_PTRS_FROZEN, we must cope with two cases:
  // either it is on a mut_list, or it isn't.  We adopt the convention that
  // the closure type is MUT_ARR_PTRS_FROZEN0 if it is on the mutable list,
  // and MUT_ARR_PTRS_FROZEN otherwise.  In fact it wouldn't matter if
  // we put it on the mutable list more than once, but it would get scavenged
  // multiple times during GC, which would be unnecessarily slow.
  //
  if (StgHeader_info(R1) != stg_MUT_ARR_PTRS_FROZEN0_info) {
SET_INFO(R1,stg_MUT_ARR_PTRS_DIRTY_info);
recordMutable(R1, R1);
// must be done after SET_INFO, because it ASSERTs closure_MUTABLE()
RET_P(R1);
  } else {
SET_INFO(R1,stg_MUT_ARR_PTRS_DIRTY_info);
RET_P(R1);
  }
}


Cheers,
Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: How unsafe are unsafeThawArray# and unsafeFreezeArray#

2012-03-09 Thread Simon Marlow

On 09/03/2012 11:44, John Meacham wrote:

Out of curiosity, is the reason you keep track of mutable vs. non-mutable
heap allocations to optimize the generational garbage collector? As in,
if a non-mutable value is placed in an older generation you don't need to
worry about it being updated with a link to a newer one? Or is there
another reason you keep track of it? Is it a pure optimization, or needed
for correctness?


It's for correctness - this is how we track all the pointers from the 
old generation to the new generation.


There are lots of different ways of doing generational GC write 
barriers, and indeeed GHC uses a mixture of different methods: a 
remembered set for updated thunks and mutated IORefs, and a card table 
for arrays.  Mutable arrays normally stay in the remembered set 
continuously, so that the write barrier for array mutation doesn't have 
to worry about adding the array to the remembered set, but it does 
modify the card table.  An immutable array can be dropped from the 
remembered set, but only once all its pointers are safely pointing to 
the old generation.  But then, unsafeThawArray has to put it back in the 
remembered set again.
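
The card-table half of that barrier amounts to mapping a mutated element's
index to a card and marking it dirty.  A toy sketch, in which the card size
(128 elements, i.e. 7 card bits) is an assumption for illustration; the
real constant lives in the RTS headers:

```haskell
import Data.Bits (shiftR)
import qualified Data.Set as Set

-- Assumed card size for illustration: one card per 2^7 = 128 elements.
cardBits :: Int
cardBits = 7

-- Map an array index to the card covering it.
cardFor :: Int -> Int
cardFor ix = ix `shiftR` cardBits

-- Collect the cards dirtied by a batch of writes: at GC time only these
-- slices of the array need rescanning, not the whole array.
dirtyCards :: [Int] -> Set.Set Int
dirtyCards = Set.fromList . map cardFor

main :: IO ()
main = print (Set.toList (dirtyCards [0, 5, 127, 128, 300]))
```

Writes to indices 0, 5 and 127 all fall in card 0, so a burst of mutation
in one region of a large array costs only a handful of dirty cards.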


Once upon a time all the mutable objects were in the remembered set, but 
gradually we've been moving away from this and adding them to the 
remembered set only when they get modified.  There are a few objects 
that still use the old method: TVars, for example.



A weakness of jhc right now is its stop the world garbage collector,
so far, I have been mitigating it by not creating garbage whenever
possible, I do an escape analysis and allocate values on the stack
when possible, and recognize linear uses of heap value in some
circumstances and re-use heap locations directly (like when a cons
cell is deconstructed and another is constructed right away I can
reuse the space in certain cases), but eventually a full GC needs to
run, and it blocks the whole runtime, which is particularly bad for
embedded targets (where jhc seems to be thriving at the moment). My
unsafeFreeze and unsafeThaw are currently NOPs. frozen arrays are just
a newtype of non-frozen ones (see
http://repetae.net/dw/darcsweb.cgi?r=jhc;a=headblob;f=/lib/jhc-prim/Jhc/Prim/Array.hs)


Interesting - we do generational GC instead of escape analysis and other 
compile-time GC techniques.  IMO, generational GC is a necessity, and 
once you have it, you don't need to worry so much about short-lived 
allocation because it all gets recycled in the cache.  (try +RTS -G1 
sometime to see the difference between generational GC and ordinary 
2-space copying).
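
A quick way to try that comparison yourself: compile any allocation-heavy
program and look at the GC statistics under the default generational
collector versus a single generation (the -G1 and -s RTS flags are real;
the program below is just a throwaway benchmark):

```haskell
-- Build with:  ghc -O2 Bench.hs
-- Then compare:
--   ./Bench +RTS -s       (default generational GC)
--   ./Bench +RTS -G1 -s   (single-generation two-space copying)
main :: IO ()
main = print (sum [1 .. 10000000 :: Int])
```

With -s, the runtime prints bytes allocated, copied, and GC pause times,
which makes the difference between the two configurations easy to see.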


On the other hand, generational GC does give rise to some strange 
performance characteristics, especially with mutable objects and 
sometimes laziness.  And it doesn't solve the pause time problem, except 
that the long pauses are less frequent.  Improving pause times is high 
up my list of things we want to do in the GC.


Cheers,
Simon


