Re: default instance for IsString

2012-04-25 Thread John Lato
> From: Yitzchak Gale 
>
> Erik Hesselink wrote:
>> I don't think IsString should be dismissed so easily.
>
> I'm just saying I don't want to be forced to use it.
> If others like it, I'm not dismissing it.
>
>> we have a couple of newtypes over Text that do different kinds of
>> normalization. An IsString instance for these is useful and total.
>
> True. Perhaps you'd be able to get IsBuiltinString instances
> for those too, using newtype deriving, if only the method
> names of IsBuiltinString are hidden and the class name is
> exported.
>
> If that doesn't work, I'm fine with using a quasiquoter for
> those instead. Or even just the usual newtype unwrapping
> and wrapping. And again, if you provide IsString and others
> want to use it, that's fine.

I don't see how it would be possible to use a hidden IsBuiltinString
as you describe without bringing Text into base (or alternatively not
providing Text support).  Perhaps unfortunately, I think that makes
this solution a non-starter.

I think a neater solution would be some sort of modular String typing,
as I'm pretty sure somebody else on this list already mentioned.
Perhaps a pragma like "DefaultString Data.Text.Text", which would mean
that string literals would be treated as the provided monomorphic
type, on a per-module basis?
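
For illustration, here's roughly what the status quo requires versus what the
pragma would buy (a sketch only; the pragma spelling is hypothetical):

  {-# LANGUAGE OverloadedStrings #-}
  import qualified Data.Text as T

  -- Today each literal is sugar for (fromString "..."), at a type the
  -- inference engine must discover, so annotations like this are common:
  greeting :: T.Text
  greeting = "hello"

  -- Under a hypothetical {-# DefaultString Data.Text.Text #-} pragma, the
  -- literal itself would be monomorphic Text, with no class method involved.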

John Lato



parallel garbage collection performance

2012-06-18 Thread John Lato
Hello,

I have a program that is intermittently experiencing performance
issues that I believe are related to parallel GC, and I was hoping to
get some advice on how I might improve it.  Essentially, any given
execution is either slow or fast (the same executable, without
recompiling), most often slow.  So far I can't find anything that
would trigger either case.  This is with ghc-7.4.2 on 64bit linux.
Here are the statistics from running with -N4 -A8m -s:

slow run:

  16,647,460,328 bytes allocated in the heap
 313,767,248 bytes copied during GC
  17,305,120 bytes maximum residency (22 sample(s))
 601,952 bytes maximum slop
  73 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1268 colls,  1267 par     8.62s     8.00s     0.0063s    0.0389s
  Gen  1        22 colls,    22 par     0.63s     0.60s     0.0275s    0.0603s

  Parallel GC work balance: 1.53 (39176141 / 25609887, ideal 4)

                        MUT time (elapsed)       GC time  (elapsed)
  Task  0 (worker) :    0.00s    (  0.01s)       0.00s    (  0.00s)
  Task  1 (worker) :    0.00s    ( 13.66s)       0.01s    (  0.04s)
  Task  2 (bound)  :    0.00s    ( 13.98s)       0.00s    (  0.00s)
  Task  3 (worker) :    0.00s    ( 18.14s)       0.16s    (  0.44s)
  Task  4 (worker) :    0.53s    ( 17.49s)       1.29s    (  4.25s)
  Task  5 (worker) :    0.00s    ( 17.45s)       1.25s    (  4.42s)
  Task  6 (worker) :    0.00s    ( 14.98s)       1.75s    (  6.90s)
  Task  7 (worker) :    0.00s    ( 21.87s)       0.02s    (  0.06s)
  Task  8 (worker) :    0.01s    ( 37.12s)       0.06s    (  0.17s)
  Task  9 (worker) :    0.00s    ( 21.41s)       4.88s    ( 15.99s)
  Task 10 (worker) :    0.84s    ( 43.06s)       1.99s    (  8.25s)
  Task 11 (bound)  :    6.39s    ( 51.13s)       0.06s    (  0.18s)
  Task 12 (worker) :    0.00s    (  0.00s)       8.04s    ( 21.42s)
  Task 13 (worker) :    0.43s    ( 28.38s)       8.14s    ( 22.94s)
  Task 14 (worker) :    5.35s    ( 29.30s)       5.81s    ( 22.02s)

  SPARKS: 7 (7 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.03s  (  0.01s elapsed)
  MUT     time   43.88s  ( 42.71s elapsed)
  GC      time    9.26s  (  8.60s elapsed)
  EXIT    time    0.01s  (  0.01s elapsed)
  Total   time   53.65s  ( 51.34s elapsed)

  Alloc rate    374,966,825 bytes per MUT second

  Productivity  82.7% of total user, 86.4% of total elapsed

gc_alloc_block_sync: 1388000
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

----------------------------------------
Fast run:

  42,061,441,560 bytes allocated in the heap
 725,062,720 bytes copied during GC
  36,963,480 bytes maximum residency (21 sample(s))
   1,382,536 bytes maximum slop
 141 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3206 colls,  3205 par     8.34s     1.87s     0.0006s    0.0089s
  Gen  1        21 colls,    21 par     0.76s     0.17s     0.0081s    0.0275s

  Parallel GC work balance: 1.78 (90535973 / 50955059, ideal 4)

                        MUT time (elapsed)       GC time  (elapsed)
  Task  0 (worker) :    0.00s    (  0.00s)       0.00s    (  0.00s)
  Task  1 (worker) :    0.00s    (  0.00s)       0.00s    (  0.00s)
  Task  2 (worker) :    0.00s    ( 11.50s)       0.00s    (  0.00s)
  Task  3 (worker) :    0.00s    ( 12.40s)       0.00s    (  0.00s)
  Task  4 (worker) :    0.58s    ( 12.20s)       0.59s    (  0.61s)
  Task  5 (bound)  :    0.00s    ( 12.89s)       0.00s    (  0.00s)
  Task  6 (worker) :    0.00s    ( 13.40s)       0.02s    (  0.02s)
  Task  7 (worker) :    0.00s    ( 14.66s)       0.00s    (  0.00s)
  Task  8 (worker) :    0.95s    ( 14.18s)       0.69s    (  0.76s)
  Task  9 (worker) :    2.82s    ( 13.50s)       1.37s    (  1.44s)
  Task 10 (worker) :    1.72s    ( 17.59s)       1.07s    (  1.16s)
  Task 11 (worker) :    3.99s    ( 24.68s)       0.37s    (  0.38s)
  Task 12 (worker) :    1.24s    ( 24.25s)       0.80s    (  0.82s)
  Task 13 (bound)  :    6.18s    ( 25.02s)       0.04s    (  0.04s)
  Task 14 (worker) :    1.46s    ( 23.42s)       1.59s    (  1.65s)
  Task 15 (worker) :    0.00s    (  0.00s)       0.66s    (  0.66s)
  Task 16 (worker) :   11.00s    ( 23.36s)       1.67s    (  1.70s)

  SPARKS: 28 (28 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.04s  (  0.02s elapsed)
  MUT     time   42.08s  ( 23.02s elapsed)
  GC      time    9.10s  (  2.04s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   51.69s  ( 25.09s elapsed)

  Alloc rate    987,695,300 bytes per MUT second

  Productivity  82.3% of total user, 169.6% of total elapsed

gc_alloc_block_sync: 164572
whitehole_spin: 0
gen[0].sync: 164
gen[1].sync: 18147

When I record an eventlog and view it with Threadscope, the slow run
shows long frequent pauses for GC, whereas on a fast run the GC pauses
are short and infrequent.

Re: parallel garbage collection performance

2012-06-18 Thread John Lato
Thanks for the suggestions.  I'll try them and report back.  Although
I've since found that out of 3 not-identical systems, this problem
only occurs on one.  So I may try different kernel/system libs and see
where that gets me.

-qg is funny.  My interpretation from the results so far is that, when
the parallel collector doesn't get stalled, it results in a big win.
But when parGC does stall, it's slower than disabling parallel gc
entirely.
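
For reference, the comparison is between runs of the following shape, where
./prog stands in for the actual executable:

  ./prog +RTS -N4 -A8m -s        # parallel GC enabled (the default)
  ./prog +RTS -N4 -A8m -qg -s    # parallel GC disabled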

I had thought the last core parallel slowdown problem was fixed a
while ago, but apparently not?

Thanks,
John

On Tue, Jun 19, 2012 at 8:49 AM, Ben Lippmeier  wrote:
>
> On 19/06/2012, at 24:48 , Tyson Whitehead wrote:
>
>> On June 18, 2012 04:20:51 John Lato wrote:
>>> Given this, can anyone suggest any likely causes of this issue, or
>>> anything I might want to look for?  Also, should I be concerned about
>>> the much larger gc_alloc_block_sync level for the slow run?  Does that
>>> indicate the allocator waiting to alloc a new block, or is it
>>> something else?  Am I on completely the wrong track?
>>
>> A total shot in the dark here, but wasn't there something about really bad
>> performance when you used all the CPUs on your machine under Linux?
>>
>> Presumably very tight coupling that is causing all the threads to stall
>> every time the OS needs to do something or something?
>
> This can be a problem for data parallel computations (like in Repa). In Repa 
> all threads in the gang are supposed to run for the same time, but if one 
> gets swapped out by the OS then the whole gang is stalled.
>
> I tend to get best results using -N7 for an 8 core machine.
>
> It is also important to enable thread affinity (with the -qa) flag.
>
> For a Repa program on an 8 core machine I use +RTS -N7 -qa -qg
>
> Ben.
>
>



Re: parallel garbage collection performance

2012-06-25 Thread John Lato
Thanks very much for this information.  My observations match your
recommendations, insofar as I can test them.

Cheers,
John

On Mon, Jun 25, 2012 at 11:42 PM, Simon Marlow  wrote:
> On 19/06/12 02:32, John Lato wrote:
>>
>> Thanks for the suggestions.  I'll try them and report back.  Although
>> I've since found that out of 3 not-identical systems, this problem
>> only occurs on one.  So I may try different kernel/system libs and see
>> where that gets me.
>>
>> -qg is funny.  My interpretation from the results so far is that, when
>> the parallel collector doesn't get stalled, it results in a big win.
>> But when parGC does stall, it's slower than disabling parallel gc
>> entirely.
>
>
> Parallel GC is usually a win for idiomatic Haskell code, it may or may not
> be a good idea for things like Repa - I haven't done much analysis of those
> types of programs yet.  Experiment with the -A flag, e.g. -A1m is often
> better than the default if your processor has a large cache.
>
> However, the parallel GC will be a problem if one or more of your cores is
> being used by other process(es) on the machine.  In that case, the GC
> synchronisation will stall and performance will go down the drain.  You can
> often see this on a ThreadScope profile as a big delay during GC while the
> other cores wait for the delayed core.  Make sure your machine is quiet
> and/or use one fewer cores than the total available.  It's not usually a
> good idea to use hyperthreaded cores either.
>
> I'm also seeing unpredictable performance on a 32-core AMD machine with
> NUMA.  I'd avoid NUMA for Haskell for the time being if you can.  Indeed you
> get unpredictable performance on this machine even for single-threaded code,
> because it makes a difference on which node the pages of your executable are
> cached (I heard a rumour that Linux has some kind of a fix for this in the
> pipeline, but I don't know the details).
>
>
>> I had thought the last core parallel slowdown problem was fixed a
>> while ago, but apparently not?
>
>
> We improved matters by inserting some "yield"s into the spinlock loops.
>  This helped a lot, but the problem still exists.
>
> Cheers,
>        Simon
>
>
>
>> Thanks,
>> John
>>
>> On Tue, Jun 19, 2012 at 8:49 AM, Ben Lippmeier  wrote:
>>>
>>>
>>> On 19/06/2012, at 24:48 , Tyson Whitehead wrote:
>>>
>>>> On June 18, 2012 04:20:51 John Lato wrote:
>>>>>
>>>>> Given this, can anyone suggest any likely causes of this issue, or
>>>>> anything I might want to look for?  Also, should I be concerned about
>>>>> the much larger gc_alloc_block_sync level for the slow run?  Does that
>>>>> indicate the allocator waiting to alloc a new block, or is it
>>>>> something else?  Am I on completely the wrong track?
>>>>
>>>>
>>>> A total shot in the dark here, but wasn't there something about really
>>>> bad
>>>> performance when you used all the CPUs on your machine under Linux?
>>>>
>>>> Presumably very tight coupling that is causing all the threads to stall
>>>> every time the OS needs to do something or something?
>>>
>>>
>>> This can be a problem for data parallel computations (like in Repa). In
>>> Repa all threads in the gang are supposed to run for the same time, but if
>>> one gets swapped out by the OS then the whole gang is stalled.
>>>
>>> I tend to get best results using -N7 for an 8 core machine.
>>>
>>> It is also important to enable thread affinity (with the -qa) flag.
>>>
>>> For a Repa program on an 8 core machine I use +RTS -N7 -qa -qg
>>>
>>> Ben.
>>>
>>>
>>
>
>



build failures when hiding non-visible imports

2012-08-16 Thread John Lato
Hello,

One of the issues I've noticed with ghc-7.6 is that a number of
packages fail due to problematic import statements.  For example, any
module which uses

> import Prelude hiding (catch)

now fails to build with the error

Module `Prelude' does not export `catch'

Of course fixing this example is relatively straightforward, but that
isn't always the case.
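
For example, the usual portable fix looks something like this (a sketch; it
assumes the package is built with cabal, which defines MIN_VERSION_base):

  {-# LANGUAGE CPP #-}
  #if !MIN_VERSION_base(4,6,0)
  import Prelude hiding (catch)
  #endif
  import Control.Exception (catch)

which is considerably noisier than the original one-line import.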

Would it be reasonable to change ghc's behavior to treat this as a
warning instead of an error?

Cheers,
John L.



Re: build failures when hiding non-visible imports

2012-08-20 Thread John Lato
> From: Brandon Allbery 
>
> On Sat, Aug 18, 2012 at 9:10 PM, Carter Schonwald <
> carter.schonw...@gmail.com> wrote:
>
>> meaning: flags for treating it as a warning vs as an error?  (pardon, i'm
>> over thinking ambiguity in phrasing).
>> if thats the desired difference, that sounds good to me!
>>
>
> I would expect it means that, having demoted it to a warning, we would have
> -fwarn-hiding-no-target / -fno-warn-hiding-no-target (or whatever we call
> it) as with all other warnings.
>
> For warning vs. error, it seems to me that should be more general:  perhaps
> taking any of the -f[no-]warn-* options and replacing "warn" with "err".

Yes.  To be concrete, this is what I would like to see.

In a statement of the form:

  import Module hiding (x)
where Module doesn't export x, ghc should report a warning instead of an error.

This warning would be enabled/disabled by the usual flags (I like
-fwarn-unused-import-hiding, but -fwarn-hiding-no-target is good too).

The warning would be on by default.

If a user wants this to be an error, I think -Werror should be
sufficient.  I am unable to think of any case where hiding a
non-visible symbol would lead to errors on its own, and any errors
likely to occur in tandem with this issue already have their own, more
helpful, error conditions (e.g. symbols not in scope, symbols in a
qualified import list not visible).

I agree with Ganesh's point that it would be beneficial to have this
available for ghc-7.6.1 if possible.

John L.



memory fragmentation with ghc-7.6.1

2012-09-20 Thread John Lato
Hello,

We've noticed that some applications exhibit significantly worse
memory usage when compiled with ghc-7.6.1 compared to ghc-7.4, leading
to out of memory errors in some cases.  Running one app with +RTS -s,
I see this:

ghc-7.4
 525,451,699,736 bytes allocated in the heap
  53,404,833,048 bytes copied during GC
  39,097,600 bytes maximum residency (2439 sample(s))
   1,547,040 bytes maximum slop
 628 MB total memory in use (0 MB lost due to fragmentation)

ghc-7.6
512,535,907,752 bytes allocated in the heap
  53,327,184,712 bytes copied during GC
  40,038,584 bytes maximum residency (2391 sample(s))
   1,456,472 bytes maximum slop
3414 MB total memory in use (2744 MB lost due to fragmentation)

The total memory in use (consistent with 'top's output) is much higher
when built with ghc-7.6, due entirely to fragmentation.

I've filed a bug report
(http://hackage.haskell.org/trac/ghc/ticket/7257,
http://hpaste.org/74987), but I was wondering if anyone else has
noticed this?  I'm not entirely sure what's triggering this behavior
(some applications work fine), although I suspect it has to do with
allocation of pinned memory.
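
For reference, the suspect allocation pattern is easy to reproduce in
miniature (a sketch, not our actual code):

  import qualified Data.ByteString as B

  -- ByteStrings live in pinned memory so they can be handed to foreign
  -- code; interleaving short- and long-lived chunks like these can leave
  -- holes in the block allocator that the copying GC cannot compact away.
  chunks :: [B.ByteString]
  chunks = [ B.replicate 4096 (fromIntegral i) | i <- [1 .. 10000 :: Int] ]

  main :: IO ()
  main = print (sum (map B.length chunks))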

John L.



Re: memory fragmentation with ghc-7.6.1

2012-09-20 Thread John Lato
Yes, that's my current understanding.  I see this with ByteString and
Data.Vector.Storable, but not
Data.Vector/Data.Vector.Unboxed/Data.Text.  As ByteStrings are pretty
widely used for IO, I expected that somebody else would have
experienced this too.

I would expect some memory fragmentation with pinned memory, but the
change from ghc-7.4 to ghc-7.6 is rather extreme (no fragmentation to
several GB).

John L.

On Fri, Sep 21, 2012 at 10:53 AM, Carter Schonwald
 wrote:
> So the problem is only with the data structures on the heap that are pinned
> in place to play nice with C?
>
> I'd be curious to understand the change too, though per se pinned memory (a
> la storable or or bytestring) will by definition cause memory fragmentation
> in a gc'd lang as a rule,  (or at least one like Haskell).
> -Carter
>
> On Thu, Sep 20, 2012 at 8:59 PM, John Lato  wrote:
>>
>> Hello,
>>
>> We've noticed that some applications exhibit significantly worse
>> memory usage when compiled with ghc-7.6.1 compared to ghc-7.4, leading
>> to out of memory errors in some cases.  Running one app with +RTS -s,
>> I see this:
>>
>> ghc-7.4
>>  525,451,699,736 bytes allocated in the heap
>>   53,404,833,048 bytes copied during GC
>>   39,097,600 bytes maximum residency (2439 sample(s))
>>1,547,040 bytes maximum slop
>>  628 MB total memory in use (0 MB lost due to fragmentation)
>>
>> ghc-7.6
>> 512,535,907,752 bytes allocated in the heap
>>   53,327,184,712 bytes copied during GC
>>   40,038,584 bytes maximum residency (2391 sample(s))
>>1,456,472 bytes maximum slop
>> 3414 MB total memory in use (2744 MB lost due to
>> fragmentation)
>>
>> The total memory in use (consistent with 'top's output) is much higher
>> when built with ghc-7.6, due entirely to fragmentation.
>>
>> I've filed a bug report
>> (http://hackage.haskell.org/trac/ghc/ticket/7257,
>> http://hpaste.org/74987), but I was wondering if anyone else has
>> noticed this?  I'm not entirely sure what's triggering this behavior
>> (some applications work fine), although I suspect it has to do with
>> allocation of pinned memory.
>>
>> John L.
>>
>
>



Re: GHC 7.8 release?

2013-02-07 Thread John Lato
I agree with Ian.  Mid-February is very soon, and there's a lot of stuff
that seems to just be coming in now.  That doesn't leave much time for
testing to get 7.8 out in sync with the platform.

Although my perspective is a bit colored by the last release.  Testing the
7.6.1 RC took several weeks for us because of the number of upstream
packages that needed to be updated (not all trivially).  By the time we
were prepared to begin testing our own systems 7.6.1 was already released,
and we couldn't use it because of a number of bugs (
http://hackage.haskell.org/trac/ghc/ticket/7257 was a blocker, but there
were others also).  Most of the bugs were fixed very quickly (thanks Simon
M. and Simon PJ!), but by then they were already in the wild.  If there had
been a bit more time to test 7.6.1, maybe some of those fixes would have
made it into the release.


John L.


On Thu, Feb 7, 2013 at 10:23 PM, Ian Lynagh  wrote:

>
> I'm not too optimistic we could actually get the final release out
> during February, assuming we want to allow a couple of weeks for people
> to test an RC.
>
> Does the Haskell Platform actually want to commit to using a GHC release
> with "tons of [new] stuff", that has had little testing, days or weeks
> after its release? I thought the idea was that it would favour
> known-good releases over the latest-and-greatest, but perhaps I
> misunderstood or the philosophy has changed.
>
>
> Thanks
> Ian
>
> On Thu, Feb 07, 2013 at 09:00:37AM -0500, Richard Eisenberg wrote:
> > Geoff's reasoning seems quite sound.
> > +1 for February release.
> >
> > On Feb 7, 2013, at 3:50 AM, Geoffrey Mainland 
> wrote:
> >
> > > In practice the versions of GHC that are widely used are those that are
> > > included in the platform. Maybe we should coordinate with their next
> > > release? They are targeting a May 6 release, and the release process is
> > > starting March 4, so it sounds like the original GHC release plan
> > > (February release) would be a good fit for the platform as it would
> > > allow library writers to catch up and ensure that STABLE was tested
> > > enough for inclusion in the platform. It would be a shame to miss the
> > > platform release.
> > >
> > > Geoff
> > >
> > > On 02/07/2013 08:25 AM, Simon Peyton-Jones wrote:
> > >> Dear GHC users,
> > >>
> > >> *
> > >> *
> > >>
> > >> *Carter*: Will this RTS update make it into ghc 7.8 update thats
> coming
> > >> up in the next monthish?
> > >>
> > >> *Andreas*: We are almost there - we are now trying to sort out a
> problem
> > >> on mac os x. It would be helpful to know if there is a cutoff date for
> > >> getting things into 7.8.
> > >>
> > >>
> > >>
> > >> Simon, Ian, and I have just been discussing 7.8, and would be
> interested
> > >> in what you guys think.
> > >>
> > >>
> > >> At ICFP we speculated that we’d make a release of GHC soon after
> > >> Christmas to embody tons of stuff that has been included since 7.6,
> > >> specifically:
> > >>
> > >> · major improvements in DPH (vectorisation avoidance, new
> > >> vectoriser)
> > >>
> > >> · type holes
> > >>
> > >> · rebindable list syntax
> > >>
> > >> · major changes to the type inference engine
> > >>
> > >> · type level natural numbers
> > >>
> > >> · overlapping type families
> > >>
> > >> · the new code generator
> > >>
> > >> · support for vector (SSE/AVX) instructions
> > >>
> > >>
> > >>
> > >> Whenever it comes it would definitely be great to include Andreas &
> > >> friends’ work:
> > >>
> > >> · Scheduler changes to the RTS to improve latency
> > >>
> > >>
> > >>
> > >> The original major reason for proposing a post-Xmas release was to get
> > >> DPH in a working state out into the wild.  However, making a proper
> > >> release imposes costs on everyone else.  Library authors have to
> scurry
> > >> around to make their libraries work, etc.   Some of the new stuff
> hasn’t
> > >> been in HEAD for that long, and hence has not been very thoroughly
> > >> tested.   (But of course making a release unleashes a huge wave of
> > >> testing that doesn’t happen otherwise.)
> > >>
> > >>
> > >>
> > >> So another alternative is to leave it all as HEAD, and wait another
> few
> > >> months before making a release.  You can still use all the new stuff
> by
> > >> compiling HEAD, or grabbing a snapshot distribution.  And it makes it
> > >> hard for the Haskell platform if GHC moves too fast. Many people are
> > >> still on 7.4.
> > >>
> > >>
> > >>
> > >> There seem to be pros and cons each way.  I don’t have a strong
> > >> opinion.  If you have a view, let us know.
> > >>
> > >>
> > >>
> > >> Simon
>
>

Re: GHC 7.8 release?

2013-02-10 Thread John Lato
While I'm notionally in favor of decoupling API-breaking changes from
non-API breaking changes, there are two major difficulties: GHC.Prim and
Template Haskell. Should a non-API-breaking change mean that GHC.Prim is
immutable?  If so, this greatly restricts GHC's development.  If not, it
means that a large chunk of hackage will become unbuildable due to deps on
vector and primitive.  With Template Haskell the situation is largely
similar, although the deps are different.

What I would like to see are more patch-level bugfix releases.  I suspect
the reason we don't have more is that making a release is a lot of work.
 So, Ian, what needs to happen to make more frequent patch releases
feasible?



On Mon, Feb 11, 2013 at 7:42 AM, Carter Schonwald <
carter.schonw...@gmail.com> wrote:

> Well said. Having a more aggressive release cycle is another interesting
> perspective.
> On Feb 10, 2013 6:21 PM, "Gabriel Dos Reis" 
> wrote:
>
>> On Sun, Feb 10, 2013 at 3:16 PM, Ian Lynagh  wrote:
>> > On Sun, Feb 10, 2013 at 09:02:18PM +, Simon Peyton-Jones wrote:
>> >>
>> >> You may ask what use is a GHC release that doesn't cause a wave of
>> updates?  And hence that doesn't work with at least some libraries.  Well,
>> it's a very useful forcing function to get new features actually out and
>> tested.
>> >
>> > But the way you test new features is to write programs that use them,
>> > and programs depend on libraries.
>> >
>> >
>> > Thanks
>> > Ian
>>
>> Releasing GHC early and often (possibly with API breakage) isn't
>> really the problem.  The real problem is how to coordinate with
>> library authors (e.g. Haskell Platform), etc.
>>
>> I suspect GHC should continue to offer a platform for research
>> and experiments. That is much harder if you curtail the ability to
>> release GHC early and often.
>>
>> -- Gaby
>>
>>
>
>
>


Re: Release plans

2013-03-20 Thread John Lato
On Thu, Mar 21, 2013 at 8:08 AM, Ian Lynagh  wrote:

>
> We've had long discussions about snapshot releases, and the tricky part
> is that while we would like people to be able to try out new GHC
> features, we don't want to add to the burden of library maintainers by
> requiring them to update their libraries to work with a new GHC release
> more than once a year.
>
> But perhaps we should announce a "library API freeze" some time before
> the first RC on a stable branch. That way people can safely update their
> dependencies at that point, and by the time the RC is out people testing
> the RC will be able to test more without running into problems
> installing libraries.
>

What would be ideal would be if this "library API freeze" coincided with
the snapshot (odd-numbered) release.  Then library maintainers would only
have to update once, and hopefully many of them would have their updates
available before ghc's stable release.

John L.


Re: Liberalising IncoherentInstances

2013-07-29 Thread John Lato
+1 to the original proposal and Edward's suggestion of emitting a warning.
I've occasionally wanted this behavior from IncoherentInstances as well.


On Mon, Jul 29, 2013 at 3:01 PM, Edward Kmett  wrote:

> I'll probably never use it, but I can't see any real problems with the
> proposal. In many ways it is what I always expected IncoherentInstances to
> be.
>
> One thing you might consider is that if you have to make an arbitrary
> instance selection at the end during compile time, making that emit a
> warning by default or at least under -Wall. That way it is clear when you
> are leaning on underdetermined semantics.
>
> -Edward
>
>
> On Sat, Jul 27, 2013 at 4:16 PM, Simon Peyton-Jones  > wrote:
>
>> Friends
>>
>> I've realised that GHC's -XIncoherentInstances flag is, I think,
>> over-conservative.  I propose to liberalise it a bit. This email describes
>> the issue.  Please yell if you think this is a bad idea.
>>
>> Simon
>>
>> Suppose we have
>>
>> class C a where { op :: a -> String }
>> instance C [a] where ...
>> instance C [Char] where ...
>>
>> f :: [b] -> String
>> f xs = "Result:" ++ op xs
>>
>> With -XOverlappingInstances, but without -XIncoherentInstances, f won't
>> compile.  Reason: if we call 'f' at Char (e.g.  f "foo") then you might
>> think we should use instance C [Char].  For example, if we inlined 'f' at
>> the call site, to get ("Result:" ++ op "foo"), we certainly then would use
>> the C [Char] instance, giving perhaps different results.  If we accept the
>> program as-is, we'll permanently commit 'f' to using the C [a] instance.
>>
>> The -XIncoherentInstances flag says "Go ahead and use an instance, even
>> if another instance might become relevant if you were to specialise or
>> inline the enclosing function."  The GHC user manual gives a more precise
>> spec [1].
>>
>> Now consider this
>> class D a b where { opD :: a -> b -> String }
>> instance D Int b where ...
>> instance D a Int where ...
>>
>> g (x::Int) = opD x x
>>
>> Here 'g' gives rise to a constraint (D Int Int), and that matches two
>> instance declarations.   So this is rejected regardless of flags.  We can
>> fix it up by adding
>> instance D Int Int where ...
>> but this is pretty tiresome in cases where it really doesn't matter which
>> instance you choose.  (And I have a use-case where it's more than tiresome
>> [2].)
>>
>> The underlying issue is similar to the previous example.  Before, there
>> was *potentially* more than one way to generate evidence for (C [b]); here
>> there is *actually* more than one instance.  In both cases the dynamic
>> semantics of the language are potentially affected by the choice -- but
>> -XIncoherentInstances says "I don't care".
>>
>>
>> So the change I propose is to make IncoherentInstances pick arbitrarily
>> among instances that match.  More precisely, when trying to find an
>> instance matching a target constraint (C tys),
>>
>> a) Find all instances matching (C tys); these are the candidates
>>
>> b) Eliminate any candidate X for which another candidate Y is
>>   strictly more specific (ie Y is a substitution instance of X),
>>   if either X or Y was compiled with -XOverlappingInstances
>>
>> c) Check that any non-candidate instances that *unify* with (C tys)
>>were compiled with -XIncoherentInstances
>>
>> d) If only one candidate remains, pick it.
>> Otherwise if all remaining candidates were compiled with
>> -XIncoherentInstances, pick an arbitrary candidate
>>
>> All of this is precisely as now, except for the "Otherwise" part of (d).
>>  One could imagine different flags for the test in (c) and (d) but I really
>> don't think it's worth it.
>>
>>
>> Incidentally, I think it'd be an improvement to localise the
>> Overlapping/Incoherent flags to particular instance declarations, via
>> pragmas, something like
>> instance C [a] where
>>   {-# ALLOW_OVERLAP #-}
>>   op x = 
>>
>> Similarly {-# ALLOW_INCOHERENT #-}.   Having -XOverlappingInstances for
>> the whole module is a bit crude, and might be missed when looking at an
>> instance.   How valuable would this be?
>>
>> [1]
>> http://www.haskell.org/ghc/docs/latest/html/users_guide/type-class-extensions.html#instance-overlap
>> [2] http://ghc.haskell.org/trac/ghc/wiki/NewtypeWrappers
>>
>>
>>
>>
>
>
>
>


RE: Giving function a larger arity

2013-11-11 Thread John Lato
Originally I thought Plan B would make more sense, but if Plan A were
implemented could this one-shot type annotation be unified with the state
hack? I'm envisioning something like RULES, where if the type matches ghc
knows it's a one-shot lambda.

I think it would be better to not do any analysis and leave this completely
up to the user. My intention is to get a mechanism to tell ghc it's okay to
recompute something in a lambda, essentially a manual state hack. I seem to
recall wanting this, but I don't remember the exact use case. It's possible
it was one-shot anyway.

John L.
On Nov 11, 2013 5:12 AM, "Simon Peyton-Jones"  wrote:

>  Strangely enough I’ve been thinking about eta expansion in the last day
> or two.  It’s one of GHC’s more delicate corners, because
>
> 1.   There can be big performance boosts
>
> 2.   If you push a redex inside a lambda
>
> But as you point out (2) may be arbitrarily bad unless you know the lambda
> is called at most once (is “one-shot”).
>
>
>
> There is really no good way to declare a lambda to be one-shot right now.
> As you discovered, full laziness tends to defeat your attempts to do so!
> (A workaround is to switch off full laziness, but that doesn’t feel right.)
>
>
>
> What is a more general solution?  I can think of two.
>
>
>
> A.  Declare one-shot-ness in the type.  Something like this:
>
> newtype OneShot a = OS a
>
> newtype Builder = Builder (OneShot (Ptr ()) -> IO (Ptr ()))
>
> plus telling GHC that anything with a OneShot type is a
> one-shot lambda.
>
>
>
> B.  Declaring one-shot-ness in the terms.  Something like
>
> ..Builder (\x {-# ONESHOT #-} -> blah)…
>
> That would declare this particular lambda to be one-shot,
> but not any other.
>
>
>
> Notes
>
> · Plan A would require a bit of fiddling to move your values in
> and out of the OneShot type.  And it’d make the terms a bit bigger at
> compile time.
>
> · Plan B is more explicit at all the use sites.
>
> · Both plans would be vulnerable to user error.  I could imagine
> an analysis that would guarantee that you met the one-shot claim; but it
> would necessarily be quite conservative.  That might or might not be OK
>
> · GHC already embodies a version of (A): the “state hack” means
> that  a lambda whose binder is a state token (State# RealWorld#) is treated
> as one-shot.  We have many bug reports describing when this hack makes
> things bad, but it is such a huge win for more programs that it is on by
> default.  (Your “rebuild” idea might work better with (State# RealWorld#
> -> Builder) rather than (() -> Builder) for that reason.)
>
>
>
> Simon
>
>
>
> *From:* Glasgow-haskell-users [mailto:
> glasgow-haskell-users-boun...@haskell.org] *On Behalf Of *Akio Takano
> *Sent:* 11 November 2013 09:19
> *To:* glasgow-haskell-users@haskell.org
> *Subject:* Giving function a larger arity
>
>
>
> Hi,
>
> I've been trying to get a certain type of programs compiled into efficient
> code, but I haven't been able to find a good way to do it, so I'm asking
> for help.
>
> Specifically, it involves a library that defines a newtype whose
> representation is a function. Attached Lib.hs is an example of such a
> library. It defines a newtype (Builder), and functions (fromInt, mappend)
> that deal with it.
>
> In user code I want to write a (often-recursive) function that produces a
> value of the newtype (the 'upto' function in arity.hs is an example). The
> problem is that I know that the resulting value will be used only once, and
> I'd like GHC to take advantage of it. In other words, I want the 'upto'
> function to get compiled into something that takes 4 arguments (Int#, Int#,
> Addr# and State#), rather than a binary function that returns a lambda.
>
> I understand that GHC does not do this by default for a good reason. It
> avoids potentially calling 'slightlyExpensive' more than once. However I
> need some way to get the larger arity, because the performance difference
> can be rather large (for example, this problem can add a lot of boxing to
> an otherwise allocation-free loop).
>
> One of my attempts was to have the library expose a function with which
> the user can tell GHC that re-computation is okay. Lib.rebuild is such a
> function, and the 'upto_rebuild' function demonstrates how to use it.
> Unfortunately this approach only worked when the full-laziness optimization
> was explicitly disabled.
>
> This problem happened many times to me. In particular Reader and State
> monads often triggered it.
>
> I'm using GHC 7.6.3.
>
> Any advice?
>
> Thank you,
> Takano Akio
>
>
>

Re: Giving function a larger arity

2013-11-11 Thread John Lato
Yes, that's what I meant.  I was thinking that from an implementation
perspective, it would be nice if all the one-shot hacks were in a single
place, and if plan A facilitated that it would be a good reason to support
that approach.  But I suppose checking an annotation is no harder than
checking a type, so it's probably irrelevant.
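
For reference, the fiddling Plan A implies on the user side would look
something like this sketch, in terms of Simon's hypothetical OneShot type
from the quoted message below (none of this exists yet):

  import Foreign.Ptr (Ptr)

  newtype OneShot a = OS a   -- lambdas at this type would be deemed one-shot
  newtype Builder = Builder (OneShot (Ptr ()) -> IO (Ptr ()))

  mkBuilder :: (Ptr () -> IO (Ptr ())) -> Builder
  mkBuilder f = Builder (\(OS p) -> f p)

  runBuilder :: Builder -> Ptr () -> IO (Ptr ())
  runBuilder (Builder g) p = g (OS p)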


On Mon, Nov 11, 2013 at 3:29 PM, Simon Peyton-Jones
wrote:

>  Well, Plan A is indeed an extended version of the ”state hack”. Instead
> of just (State# RealWorld#) being magically considered one-shot, we’d also
> add (OneShot t).
>
>
>
> Simon
>
>
>
> *From:* John Lato [mailto:jwl...@gmail.com]
> *Sent:* 11 November 2013 17:29
> *To:* Simon Peyton-Jones
> *Cc:* glasgow-haskell-users@haskell.org; Akio Takano
> *Subject:* RE: Giving function a larger arity
>
>
>
> Originally I thought Plan B would make more sense, but if Plan A were
> implemented could this one-shot type annotation be unified with the state
> hack? I'm envisioning something like RULES, where if the type matches ghc
> knows it's a one-shot lambda.
>
> I think it would be better to not do any analysis and leave this
> completely up to the user. My intention is to get a mechanism to tell ghc
> it's okay to recompute something in a lambda, essentially a manual state
> hack. I seem to recall wanting this, but I don't remember the exact use
> case. It's possible it was one-shot anyway.
>
> John L.
>
> On Nov 11, 2013 5:12 AM, "Simon Peyton-Jones" 
> wrote:
>
>  Strangely enough I’ve been thinking about eta expansion in the last day
> or two.  It’s one of GHC’s more delicate corners, because
>
> 1.   There can be big performance boosts
>
> 2.   If you push a redex inside a lambda
>
> But as you point out (2) may be arbitrarily bad unless you know the lambda
> is called at most once (is “one-shot”).
>
>
>
> There is really no good way to declare a lambda to be one-shot right now.
> As you discovered, full laziness tends to defeat your attempts to do so!
> (A workaround is to switch off full laziness, but that doesn’t feel right.)
>
>
>
> What is a more general solution?  I can think of two.
>
>
>
> A.  Declare one-shot-ness in the type.  Something like this:
>
> newtype OneShot a = OS a
>
> newtype Builder = Builder (OneShot (Ptr ()) -> IO (Ptr ()))
>
> plus telling GHC that anything with a OneShot type is a
> one-shot lambda.
>
>
>
> B.  Declaring one-shot-ness in the terms.  Something like
>
> ..Builder (\x {-# ONESHOT #-} -> blah)…
>
> That would declare this particular lambda to be one-shot,
> but not any other.
>
>
>
> Notes
>
> · Plan A would require a bit of fiddling to move your values in
> and out of the OneShot type.  And it’d make the terms a bit bigger at
> compile time.
>
> · Plan B is more explicit at all the use sites.
>
> · Both plans would be vulnerable to user error.  I could imagine
> an analysis that would guarantee that you met the one-shot claim; but it
> would necessarily be quite conservative.  That might or might not be OK
>
> · GHC already embodies a version of (A): the “state hack” means
> that  a lambda whose binder is a state token (State# RealWorld#) is treated
> as one-shot.  We have many bug reports describing when this hack makes
> things bad, but it is such a huge win for more programs that it is on by
> default.  (Your “rebuild” idea might work better with (State# RealWorld#
> -> Builder) rather than (() -> Builder) for that reason.)
>
>
>
> Simon
>
>
>
> *From:* Glasgow-haskell-users [mailto:
> glasgow-haskell-users-boun...@haskell.org] *On Behalf Of *Akio Takano
> *Sent:* 11 November 2013 09:19
> *To:* glasgow-haskell-users@haskell.org
> *Subject:* Giving function a larger arity
>
>
>
> Hi,
>
> I've been trying to get a certain type of programs compiled into efficient
> code, but I haven't been able to find a good way to do it, so I'm asking
> for help.
>
> Specifically, it involves a library that defines a newtype whose
> representation is a function. Attached Lib.hs is an example of such a
> library. It defines a newtype (Builder), and functions (fromInt, mappend)
> that deal with it.
>
> In user code I want to write a (often-recursive) function that produces a
> value of the newtype (the 'upto' function in arity.hs is an example). The
> problem is that I know that the resulting value will be used only once, and
> I'd like GHC to take advantage of it. In other words, I want the 'upto'
> function to get compiled into something that takes 4 arguments (Int#, Int#,
> Addr# and State#), rather than a binary function that returns a lambda.

memory ordering

2013-12-19 Thread John Lato
Hello,

I'm working on a lock-free algorithm that's meant to be used in a
concurrent setting, and I've run into a possible issue.

The crux of the matter is that a particular function needs to perform the
following:

> x <- MVector.read vec ix
> position <- readIORef posRef

and the algorithm is only safe if these two reads are not reordered (both
the vector and IORef are written to by other threads).

My concern is, according to standard Haskell semantics this should be safe,
as IO sequencing should guarantee that the reads happen in-order.  Of
course this also relies upon the architecture's memory model, but x86 also
guarantees that reads happen in order.  However doubts remain; I do not
have confidence that the code generator will handle this properly.  In
particular, LLVM may freely re-order loads of NotAtomic and Unordered
values.

The one hope I have is that ghc will preserve IO semantics through the
entire pipeline.  This seems like it would be necessary for proper handling
of exceptions, for example.  So, can anyone tell me if my worries are
unfounded, or if there's any way to ensure the behavior I want?  I could
change the readIORef to an atomicModifyIORef, which should issue an mfence,
but that seems a bit heavy-handed as just a read fence would be sufficient
(although even that seems more than necessary).

Thanks,
John L.


Re: memory ordering

2013-12-22 Thread John Lato
Hi Carter,

Atomics are more or less what I'm after.

I've taken a look at Ryan's work (
https://github.com/rrnewton/haskell-lockfree/), and there are certainly
some very useful items there, but I don't see anything quite like what I'm
looking for (except perhaps for a read barrier).

The problem is that I don't actually need atomic operations for this.  I'm
just doing reads after all.  My concern is that many optimization pipelines
(i.e. LLVM) don't guarantee ordering of reads unless you use atomic
variables.

The IORef docs warn that IORef operations may appear out-of-order depending
on the architecture's memory model.  On (newer) x86, loads won't move
relative to other loads, so that should be ok, and Haskell's semantics
should guarantee that two IO operations will happen in program order.

It's the Haskell semantics guarantee I'm concerned about; I guess I'm not
entirely sure I believe that it's implemented properly (although I have no
reason to believe it's wrong either).  Perhaps I'm just overly paranoid.

John Lato

On Fri, Dec 20, 2013 at 9:05 AM, Carter Schonwald <
carter.schonw...@gmail.com> wrote:

> Hey John, so you're wanting atomic reads and writes?
>
> I'm pretty sure that you want to use atomic memory operations for this.  I
> believe Ryan Newton has some tooling you can use right now for that.
>
>
> On Fri, Dec 20, 2013 at 3:57 AM, Christian Höner zu Siederdissen <
> choe...@tbi.univie.ac.at> wrote:
>
>> Hi John,
>>
>> I guess you probably want to "pseq x". See below for an example. Since
>> your 2nd
>> action does not depend on your 1st.
>>
>> Gruss,
>> Christian
>>
>>
>> import Debug.Trace
>> import GHC.Conc
>>
>> main = do
>>   x <- return (traceShow "1" $ 1::Int)
>>   -- x `pseq` print (2::Int)
>>   print (2::Int)
>>   print x
>>
>>
>> * John Lato  [20.12.2013 02:36]:
>> >Hello,
>> >
>> >I'm working on a lock-free algorithm that's meant to be used in a
>> >concurrent setting, and I've run into a possible issue.
>> >
>> >The crux of the matter is that a particular function needs to
>> perform the
>> >following:
>> >
>> >> x <- MVector.read vec ix
>> >> position <- readIORef posRef
>> >
>> >and the algorithm is only safe if these two reads are not reordered
>> (both
>> >the vector and IORef are written to by other threads).
>> >
>> >My concern is, according to standard Haskell semantics this should be
>> >safe, as IO sequencing should guarantee that the reads happen
>> in-order.
>> >Of course this also relies upon the architecture's memory model, but
>> x86
>> >also guarantees that reads happen in order.  However doubts remain;
>> I do
>> >not have confidence that the code generator will handle this
>> properly.  In
>> >particular, LLVM may freely re-order loads of NotAtomic and Unordered
>> >values.
>> >
>> >The one hope I have is that ghc will preserve IO semantics through
>> the
>> >entire pipeline.  This seems like it would be necessary for proper
>> >handling of exceptions, for example.  So, can anyone tell me if my
>> worries
>> >are unfounded, or if there's any way to ensure the behavior I want?
>>  I
>> >could change the readIORef to an atomicModifyIORef, which should
>> issue an
>> >mfence, but that seems a bit heavy-handed as just a read fence would
>> be
>> >sufficient (although even that seems more than necessary).
>> >
>> >Thanks,
>> >John L.
>>
>>
>>
>>
>>
>


Re: memory ordering

2013-12-30 Thread John Lato
Hi Edward,

Thanks very much for this reply, it answers a lot of questions I'd had.
 I'd hoped that ordering would be preserved through C--, but c'est la vie.
 Optimizing compilers are ever the bane of concurrent algorithms!

stg/SMP.h does define a loadLoadBarrier, which is exposed in Ryan Newton's
atomic-primops package.  From the docs, I think that's a general read
barrier, and should do what I want.  Assuming it works properly, of course.
 If I'm lucky it might even be optimized out.
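
Concretely, the plan is something like the following sketch (assuming
Data.Atomics.loadLoadBarrier from atomic-primops does what its docs say):

  import Data.Atomics (loadLoadBarrier)
  import Data.IORef (IORef, readIORef)
  import qualified Data.Vector.Mutable as MVector

  readPair :: MVector.IOVector a -> Int -> IORef b -> IO (a, b)
  readPair vec ix posRef = do
    x <- MVector.read vec ix
    loadLoadBarrier              -- keep the two loads ordered
    position <- readIORef posRef
    return (x, position)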

Thanks,
John


On Mon, Dec 30, 2013 at 6:04 AM, Edward Z. Yang  wrote:

> Hello John,
>
> Here are some prior discussions (which I will attempt to summarize
> below):
>
> http://www.haskell.org/pipermail/haskell-cafe/2011-May/091878.html
> http://www.haskell.org/pipermail/haskell-prime/2006-April/001237.html
> http://www.haskell.org/pipermail/haskell-prime/2006-March/001079.html
>
> The guarantees that Haskell and GHC give in this area are hand-wavy at
> best; at the moment, I don't think Haskell or GHC have a formal memory
> model—this seems to be an open research problem. (Unfortunately, AFAICT
> all the researchers working on relaxed memory models have their hands
> full with things like C++ :-)
>
> If you want to go ahead and build something that /just/ works for a
> /specific version/ of GHC, you will need to answer this question
> separately for every phase of the compiler.  For Core and STG, monads
> will preserve ordering, so there is no trouble.  However, for C--, we
> will almost certainly apply optimizations which reorder reads (look at
> CmmSink.hs).  To properly support your algorithm, you will have to add
> some new read barrier mach-ops, and teach the optimizer to respect them.
> (This could be fiendishly subtle; it might be better to give C-- a
> memory model first.)  These mach-ops would then translate into
> appropriate arch-specific assembly or LLVM instructions, preserving
> the guarantees further.
>
> This is not related to your original question, but the situation is a
> bit better with regards to reordering stores: we have a WriteBarrier
> MachOp, which in principle, prevents store reordering.  In practice, we
> don't seem to actually have any C-- optimizations that reorder stores.
> So, at least you can assume these will work OK!
>
> Hope this helps (and is not too inaccurate),
> Edward
>
> Excerpts from John Lato's message of 2013-12-20 09:36:11 +0800:
> > Hello,
> >
> > I'm working on a lock-free algorithm that's meant to be used in a
> > concurrent setting, and I've run into a possible issue.
> >
> > The crux of the matter is that a particular function needs to perform the
> > following:
> >
> > > x <- MVector.read vec ix
> > > position <- readIORef posRef
> >
> > and the algorithm is only safe if these two reads are not reordered (both
> > the vector and IORef are written to by other threads).
> >
> > My concern is, according to standard Haskell semantics this should be
> safe,
> > as IO sequencing should guarantee that the reads happen in-order.  Of
> > course this also relies upon the architecture's memory model, but x86
> also
> > guarantees that reads happen in order.  However doubts remain; I do not
> > have confidence that the code generator will handle this properly.  In
> > particular, LLVM may freely re-order loads of NotAtomic and Unordered
> > values.
> >
> > The one hope I have is that ghc will preserve IO semantics through the
> > entire pipeline.  This seems like it would be necessary for proper
> handling
> > of exceptions, for example.  So, can anyone tell me if my worries are
> > unfounded, or if there's any way to ensure the behavior I want?  I could
> > change the readIORef to an atomicModifyIORef, which should issue an
> mfence,
> > but that seems a bit heavy-handed as just a read fence would be
> sufficient
> > (although even that seems more than necessary).
> >
> > Thanks,
> > John L.
>


Re: memory ordering

2014-01-01 Thread John Lato
One point I'm getting from this discussion is that perhaps not much time
has been spent considering these issues in ghc backends. If so, it's
probably a good thing to work through it now.

For myself, I guess the only option I have now is to measure using
loadLoadBarrier and see if it's better or worse than calling
atomicModifyIORef.
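
For comparison, the heavyweight version would replace the plain read with
something like this (atomicModifyIORef implies a full fence, where only
load-load ordering is actually needed):

  import Data.IORef

  readPosFenced :: IORef Int -> IO Int
  readPosFenced posRef = atomicModifyIORef posRef (\p -> (p, p))
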
On Dec 31, 2013 6:42 AM, "Edward Z. Yang"  wrote:

> I was thinking about my response, and realized there was one major
> misleading thing in my description.  The load reordering I described
> applies to load instructions in C-- proper, i.e. things that show up
> in the C-- dup as:
>
> W_ x = I64[...addr...]
>
> Reads to IORefs and reads to vectors get compiled inline (as they
> eventually translate into inline primops), so my admonitions are
> applicable.
>
> However, the story with *foreign primops* (which is how loadLoadBarrier
> in atomic-primops is defined, how you might imagine defining a custom
> read function as a primop) is a little different.  First, what does a
> call to an foreign primop compile into? It is *not* inlined, so it will
> eventually get compiled into a jump (this could be a problem if you're
> really trying to squeeze out performance!)  Second, the optimizer is a
> bit more conservative when it comes to primop calls (internally referred
> to as "unsafe foreign calls"); at the moment, the optimizer assumes
> these foreign calls clobber heap memory, so we *automatically* will not
> push loads/stores beyond this boundary. (NB: We reserve the right to
> change this in the future!)
>
> This is probably why atomic-primops, as it is written today, seems to
> work OK, even in the presence of the optimizer.  But I also have a hard
> time believing it gives the speedups you want, due to the current
> design. (CC'd Ryan Newton, because I would love to be wrong here, and
> maybe he can correct me on this note.)
>
> Cheers,
> Edward
>
> P.S. loadLoadBarrier compiles to a no-op on x86 architectures, but
> because it's not inlined I think you will still end up with a jump (LLVM
> might be able to eliminate it).
>
> Excerpts from John Lato's message of 2013-12-31 03:01:58 +0800:
> > Hi Edward,
> >
> > Thanks very much for this reply, it answers a lot of questions I'd had.
> >  I'd hoped that ordering would be preserved through C--, but c'est la
> vie.
> >  Optimizing compilers are ever the bane of concurrent algorithms!
> >
> > stg/SMP.h does define a loadLoadBarrier, which is exposed in Ryan
> Newton's
> > atomic-primops package.  From the docs, I think that's a general read
> > barrier, and should do what I want.  Assuming it works properly, of
> course.
> >  If I'm lucky it might even be optimized out.
> >
> > Thanks,
> > John
> >
> > On Mon, Dec 30, 2013 at 6:04 AM, Edward Z. Yang  wrote:
> >
> > > Hello John,
> > >
> > > Here are some prior discussions (which I will attempt to summarize
> > > below):
> > >
> > > http://www.haskell.org/pipermail/haskell-cafe/2011-May/091878.html
> > >
> http://www.haskell.org/pipermail/haskell-prime/2006-April/001237.html
> > >
> http://www.haskell.org/pipermail/haskell-prime/2006-March/001079.html
> > >
> > > The guarantees that Haskell and GHC give in this area are hand-wavy at
> > > best; at the moment, I don't think Haskell or GHC have a formal memory
> > > model—this seems to be an open research problem. (Unfortunately, AFAICT
> > > all the researchers working on relaxed memory models have their hands
> > > full with things like C++ :-)
> > >
> > > If you want to go ahead and build something that /just/ works for a
> > > /specific version/ of GHC, you will need to answer this question
> > > separately for every phase of the compiler.  For Core and STG, monads
> > > will preserve ordering, so there is no trouble.  However, for C--, we
> > > will almost certainly apply optimizations which reorder reads (look at
> > > CmmSink.hs).  To properly support your algorithm, you will have to add
> > > some new read barrier mach-ops, and teach the optimizer to respect
> them.
> > > (This could be fiendishly subtle; it might be better to give C-- a
> > > memory model first.)  These mach-ops would then translate into
> > > appropriate arch-specific assembly or LLVM instructions, preserving
> > > the guarantees further.
> > >
> > > This is not related to your original question, but the situation is a
> > > bit better with regards to reordering stores: we have a WriteBarrier
> > > MachOp, which in principle, prevents store reordering.  In practice, we
> > > don't seem to actually have any C-- optimizations that reorder stores.
> > > So, at least you can assume these will work OK!
> > >
> > > Hope this helps (and is not too inaccurate),
> > > Edward
> > >
> > > Excerpts from John Lato's message of 2013-12-20 09:36:11 +0800:
> > > > Hello,
> > > >
> > > > I'm working on a lock-free algorithm that's meant to be used in a
> > > > concurrent setting, and I've run into a possible issue.
> > > >
> > > > The crux of the matter is that a particular function needs to
> > > > perform the following:

Re: Parallel building multiple targets

2014-01-05 Thread John Lato
(FYI, I expect I'm the source of the suggestion that ghc -M is broken)

First, just to clarify, I don't think ghc -M is obviously broken.  Rather,
I think it's broken in subtle, unobvious ways, such that trying to develop
a make-based project with ghc -M will fail at various times in a
non-obvious fashion, at least without substantial additional rules.  For an
example of some of the extra steps necessary to make something like this
work, see e.g. https://github.com/nh2/multishake (which is admittedly for a
more complicated setup, and also has some issues).  The especially
frustrating part is, just when you think you have everything working,
someone wants to add some other tool to a workflow (hsc2hs, .cmm files,
etc), and your build system doesn't support it.

ghc --make doesn't allow building several binaries in one run, however if
you use cabal all the separate runs will use a shared build directory, so
subsequent builds will be able to take advantage of the intermediate output
of the first build.  Of course you could do the same without cabal, but
it's a convenient way to create a common build directory and manage
multiple targets.  This is the approach I would take to building multiple
executables from the same source files.

ghc doesn't do any locking of build files AFAIK.  Running parallel ghc
commands for two main modules that have the same import, using the same
working directory, is not safe.  In pathological cases the two different
main modules may even generate different code *for the imported module*.
 This sort of situation can arise with the IncoherentInstances extension,
for example.

The obvious approach is of course to make a library out of your common
files.  This has the downsides of requiring a bit more work on the
developer's part, but if the common files are relatively stable it'll
probably lead to the fastest builds of your executables.  Also in this case
you could run multiple `ghc --make`s in parallel, using different build
directories, since they won't be rebuilding any common code.
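
For example, once the shared modules are in a library (or at least stable),
something like this is safe under make -j (a sketch; paths and target names
are made up):

  # Separate -outputdir per target, so the two ghc --make invocations
  # never write to the same .hi/.o files.
  prog1: ; ghc --make -isrc -outputdir build/prog1 -o prog1 src/Prog1.hs
  prog2: ; ghc --make -isrc -outputdir build/prog2 -o prog2 src/Prog2.hs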

John L.

On Sun, Jan 5, 2014 at 1:47 PM, Sami Liedes  wrote:

> Hi,
>
> I have a Haskell project where a number of executables are produced
> from mostly the same modules. I'm using a Makefile to enable parallel
> builds. I received advice[1] that ghc -M is broken, but that there
> is parallel ghc --make in HEAD.
>
> As far as I can tell, ghc --make does not allow building several
> binaries in one run, so I think it may not still be a full replacement
> for Makefiles.
>
> However I have a question about ghc --make that is also relevant
> without parallel ghc --make:
>
> If I have two main modules, prog1.hs and prog2.hs, which have mutual
> dependencies (for example, both import A from A.hs), is it safe to run
> "ghc --make prog1" in parallel with "ghc --make prog2"? IOW, is there
> some kind of locking to prevent both from building module A at the
> same time and interfering with each other?
>
> Is there a good way (either in current releases or HEAD) to build
> multiple binaries partially from the same sources in parallel?
>
> Sami
>
>
> [1]
> http://stackoverflow.com/questions/20938894/generating-correct-link-dependencies-for-ghc-and-makefile-style-builds
>


Re: Parallel building multiple targets

2014-01-05 Thread John Lato
On Sun, Jan 5, 2014 at 3:54 PM, Erik de Castro Lopo wrote:

> John Lato wrote:
>
> > ghc --make doesn't allow building several binaries in one run, however if
> > you use cabal all the separate runs will use a shared build directory, so
> > subsequent builds will be able to take advantage of the intermediate
> output
> > of the first build.
>
> As long as the ghc-options flags are the same for all the different
> component
> sections (I've been bitten by this).
>

Yes, good point.  This is one case where putting the common code in a
library will help a lot.


Re: Parallel building multiple targets

2014-01-22 Thread John Lato
On Wed, Jan 22, 2014 at 12:25 AM, Simon Marlow  wrote:

> On 05/01/2014 23:48, John Lato wrote:
>
>> (FYI, I expect I'm the source of the suggestion that ghc -M is broken)
>>
>> First, just to clarify, I don't think ghc -M is obviously broken.
>>   Rather, I think it's broken in subtle, unobvious ways, such that
>> trying to develop a make-based project with ghc -M will fail at various
>> times in a non-obvious fashion, at least without substantial additional
>> rules.
>>
>
> If I understand you correctly, you're not saying that ghc -M is broken,
> but that it would be easier to use if it did more.  Right?  Maybe you could
> make specific suggestions?  Saying it is "broken" is a bit FUD-ish.  We use
> it in GHC's build system, so by an existence proof it is certainly not
> broken.
>

This is more-or-less true.  To be a bit more precise, I'm saying that the
raw output from ghc -M is insufficient in several interesting/useful cases.
I'm also not convinced that it's worth anyone's time to add the necessary
features to ghc -M to cover these cases, or indeed that it's even
necessarily the correct place to fix it.

Some specific shortcomings of ghc -M are:

1.  no support for .cmm file dependencies (though in fairness, --make
doesn't track these either)
2.  no support for Language.Haskell.TH.Syntax.addDependentFile (see the
sketch after this list)
3.  no support for preprocessing, such as .hsc or #include'd files.  (I
think it would work to write a .hs file that uses -pgmF to call hsc2hs, but
that seems rather non-obvious and I'm not sure it's a good idea).
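
To make (2) concrete, here is a minimal sketch of a splice whose file
dependency ghc -M cannot see (module and file names are hypothetical):

    {-# LANGUAGE TemplateHaskell #-}
    module Config (configString) where

    import Language.Haskell.TH
    import Language.Haskell.TH.Syntax (addDependentFile, lift)

    -- Embeds the file's contents at compile time.  A module using this
    -- splice must be rebuilt whenever settings.txt changes, but ghc -M
    -- emits no Makefile rule for that dependency.
    configString :: Q Exp
    configString = do
      addDependentFile "config/settings.txt"
      s <- runIO (readFile "config/settings.txt")
      lift s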

However, these are all rather obviously fixable as part of the build
system.  For me, the worst problems have to do with cleaning.  If you're
using a Makefile, typically you want to leave intermediate object files
around and only rebuild them when the sources have changed.  However, there
are various issues with ghc batch-mode that make this difficult (e.g.
https://ghc.haskell.org/trac/ghc/ticket/8029 ).  The workarounds to deal
with this are not as straightforward.  The alternative is to live with the
occasional build error that can only be fixed by blowing away the entire
build dir (a remedy that I often need with ghc's source tree, as even make
maintainer-clean doesn't always cut it.  Hopefully my experience here is
unique, but I do not believe it is).

Also, the most common use case seems to be for parallel building of
modules.  As ghc-7.8 provides this with --make, I'd expect the demand for
ghc -M will be greatly reduced.  That's why I'm not certain it's worth the
time it would take to resolve these issues.

Cheers,
John


> Cheers,
> Simon

Re: Parallel building multiple targets

2014-01-23 Thread John Lato
On Jan 23, 2014 1:28 AM, "Simon Marlow"  wrote:
>
> On 23/01/14 03:52, John Lato wrote:
>
>> However, these are all rather obviously fixable as part of the build
>> system.  For me, the worst problems have to do with cleaning.  If you're
>> using a Makefile, typically you want to leave intermediate object files
>> around and only rebuild them when the sources have changed.  However,
>> there are various issues with ghc batch-mode that make this difficult
>> (e.g. https://ghc.haskell.org/trac/ghc/ticket/8029 ).  The workarounds
>> to deal with this are not as straightforward.  The alternative is to
>> live with the occasional build error that can only be fixed by blowing
>> away the entire build dir (a remedy that I often need with ghc's source
>> tree, as even make maintainer-clean doesn't always cut it.  Hopefully my
>> experience here is unique, but I do not believe it is).
>
>
> You said "various issues", but you've only mentioned *one* specific issue
so far: #8029, and we concluded that was not a bug, although I do see how
it could require manually deleting a .hi file if you have a module that
shadows a package module and then remove it.  This seems a rare occurrence
to me, but perhaps it is something you do often.  If it really hurts, then
you could have a way to tell your build system about a file when it is
removed from the project, so that it can delete the build artifacts that go
with it.

It seems uncommon until you're developing in a branch that does so, and try
to go back and forth between that branch and another.

>
> Anyway, are there other problems you'd like to bring to our attention?

If the one bug I linked to earlier is to be closed as "not a bug" (seems
correct to me), there doesn't seem much point to raising other issues
relating to out-of-date intermediate files.  The general solution is
exactly as you suggested, leading to an increasingly baroque build system.

I don't think any of this detracts from my original thrust, which is that
something that looks like an afternoon's work is much more complicated.
Plus, you'll end up fighting with/hacking on a build system instead of what
you meant to work on.

>
> Cheers,
> Simon

Re: Eta Reduction

2014-04-01 Thread John Lato
I think this is a great idea and should become a top priority. I would
probably start by switching to a type-class-based seq, after which perhaps
the next step forward would become more clear.

John L.
On Apr 1, 2014 2:54 AM, "Dan Doel"  wrote:

> In the past year or two, there have been multiple performance problems in
> various areas related to the fact that lambda abstraction is not free,
> though we tend to think of it as so. A major example of this was deriving
> of Functor. If we were to derive Functor for lists, we would end up with
> something like:
>
> instance Functor [] where
>   fmap _ [] = []
>   fmap f (x:xs) = f x : fmap (\y -> f y) xs
>
> This definition is O(n^2) when fully evaluated, because it causes O(n) eta
> expansions of f, so we spend time following indirections proportional to
> the depth of the element in the list. This has been fixed in 7.8, but
> there are other examples. I believe lens, [1] for instance, has some stuff
> in it that works very hard to avoid this sort of cost; and it's not always
> as easy to avoid as the above example. Composing with a newtype wrapper,
> for instance, causes an eta expansion that can only be seen as such at the
> core level.
>
> The obvious solution is: do eta reduction. However, this is not
> operationally sound currently. The problem is that seq is capable of
> telling the difference between the following two expressions:
>
> undefined
> \x -> undefined x
>
> The former causes seq to throw an exception, while the latter is considered
> defined enough to not do so. So, if we eta reduce, we can cause terminating
> programs to diverge if they make use of this feature.
>
> Luckily, there is a solution.
>
> Semantically one would usually identify the above two expressions. While
> I do believe one could construct a semantics that does distinguish them,
> it is not the usual practice. This suggests that there is a way to not
> distinguish them, perhaps even including seq. After all, the specification
> of seq is monotone and continuous regardless of whether we unify ⊥ with
> \x -> ⊥ x or insert an extra element for the latter.
>
> The currently problematic case is function spaces, so I'll focus on it. How
> should:
>
> seq :: (a -> b) -> c -> c
>
> act? Well, other than an obvious bottom, we need to emit bottom whenever
> our given function is itself bottom at every input. This may first seem
> like a problem, but it is actually quite simple. Without loss of
> generality, let us assume that we can enumerate the type a. Then, we can
> feed these values to the function, and check their results for bottom.
> Conal Elliot has prior art for this sort of thing with his unamb [2]
> package. For each value x :: a, simply compute 'f x `seq` ()' in parallel,
> and look for any successes. If we ever find one, we know the function is
> non-bottom, and we can return our value of c. If we never finish
> searching, then the function must be bottom, and seq should not terminate,
> so we have satisfied the specification.
>
> Now, some may complain about the enumeration above. But this, too, is a
> simple matter. It is well known that Haskell programs are denumerable. So
> it is quite easy to enumerate all Haskell programs that produce a value,
> check whether that value has the type we're interested in, and compute
> said value. All of this can be done in Haskell. Thus, every Haskell type
> is programmatically enumerable in Haskell, and we can use said enumeration
> in our implementation of seq for function types. I have discussed this
> with Russell O'Connor [3], and he assures me that this argument should be
> sufficient even if we consider semantic models of Haskell that contain
> values not denoted by any Haskell program, so we should be safe there.
>
> The one possible caveat is that this implementation of seq is not
> operationally uniform across all types, so the current fully polymorphic
> type of seq may not make sense. But moving back to a type class based
> approach isn't so bad, and this time we will have a semantically sound
> backing, instead of just having a class with the equivalent of the current
> magic function in it.
>
> Once this machinery is in place, we can eta reduce to our hearts'
> content, and not have to worry about breaking semantics. We'd no longer
> put the burden on programmers to use potentially unsafe hacks to avoid eta
> expansions. I apologize for not sketching an implementation of the above
> algorithm, but I'm sure it should be elementary enough to make it into GHC
> in the next couple versions. Everyone learns about this type of thing in
> university computer science programs, no?
>
> Thoughts? Comments? Questions?
>
> Cheers,
> -- Dan
>
> [1] http://hackage.haskell.org/package/lens
> [2] http://hackage.haskell.org/package/unamb
> [3] http://r6.ca/

Re: [Haskell-cafe] Eta Reduction

2014-04-01 Thread John Lato
Hi Edward,

Yes, I'm aware of that.  However, I thought Dan's proposal especially droll
given that changing seq to a class-based function would be sufficient to
make eta-reduction sound, given appropriate instances (or lack thereof).
Meaning we could leave the rest of the proposal unevaluated (lazily!).

And if somebody were to suggest that on a different day, +1 from me.
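
For concreteness, a minimal sketch of what a class-based seq might look
like (all names hypothetical):

    class Seq a where
      seq' :: a -> b -> b

    instance Seq Int where
      seq' x y = x `Prelude.seq` y  -- force the Int as before

    -- Deliberately no instance for (a -> b): if a function can never be
    -- forced, undefined and \x -> undefined x become indistinguishable,
    -- and eta reduction is semantics-preserving.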

John
On Apr 1, 2014 10:32 AM, "Edward Kmett"  wrote:

> John,
>
> Check the date and consider the process necessary to "enumerate all
> Haskell programs and check their types".
>
> -Edward
>
>
> On Tue, Apr 1, 2014 at 9:17 AM, John Lato  wrote:
>
>> I think this is a great idea and should become a top priority. I would
>> probably start by switching to a type-class-based seq, after which perhaps
>> the next step forward would become more clear.
>>
>> John L.

Re: [Haskell-cafe] Eta Reduction

2014-04-01 Thread John Lato
I'm already uneasy using bang patterns on polymorphic data because I don't
know exactly what they will accomplish.  Maybe they add too much strictness?
Not enough? Simply duplicate work?  Perhaps it's acceptable to remove that
feature entirely (although that may require adding extra strictness in a
lot of other places).

Alternatively, maybe it's enough to simply find a use for that
good-for-nothing syntax we already have?
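
For reference, the connection: a bang pattern desugars, roughly, to seq,
so a class-based seq would surface as a constraint (a sketch):

    {-# LANGUAGE BangPatterns #-}
    g :: Num a => a -> a
    g !x = x + 1
    -- desugars, roughly, to: g x = x `seq` (x + 1)
    -- so with a class-based seq it would instead need something like
    --   g :: (Seq a, Num a) => a -> a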

On Apr 1, 2014 5:32 PM, "Edward Kmett"  wrote:
>
> Unfortunately the old class based solution also carries other baggage,
like the old data type contexts being needed in the language for bang
patterns. :(
>
> -Edward

how to compile non-dynamic ghc-7.8.2 ?

2014-04-24 Thread John Lato
Hello,

I'd like to compile ghc-7.8.2 with DynamicGhcPrograms disabled (on 64-bit
linux).  I downloaded the source tarball, added

DYNAMIC_GHC_PROGRAMS = NO

to mk/build.mk, and did ./configure && make.

ghc builds and everything seems to work (cabal installed a bunch of
packages, ghci seems to work), however whenever I try to run Setup.hs
dynamically (either 'runghc Setup.hs configure' or loading it with ghci and
executing 'main') it dumps core.  Compiling Setup.hs works, and nothing
else has caused ghci to crash either (this email is a literate haskell file
equivalent to Setup.hs).

Building with DYNAMIC_GHC_PROGRAMS = YES works properly.

With that in mind, I have a few questions:

 How should I compile a non-dynamic ghc?
 Is this a bug in ghc?

Thanks,
John L.

* Setup.hs
> import Distribution.Simple
> main = defaultMain


Re: how to compile non-dynamic ghc-7.8.2 ?

2014-04-25 Thread John Lato
On Apr 25, 2014 5:36 AM, "Bertram Felgenhauer" <
bertram.felgenha...@googlemail.com> wrote:
>
> John Lato wrote:
> > I'd like to compile ghc-7.8.2 with DynamicGhcPrograms disabled (on
64-bit
> > linux).  I downloaded the source tarball, added
> >
> > DYNAMIC_GHC_PROGRAMS = NO
>
> I've had success with setting both
>
> DYNAMIC_BY_DEFAULT   = NO
> DYNAMIC_GHC_PROGRAMS = NO
>
> and removing the 'dyn' way altogether from GhcLibWays by setting
> GhcLibWays = v explicitely. I expect that the latter is optional;
> my goal was only to speed up the build.
>
> (The lines were copied from the perf-cross settings.)

I tried disabling dynamicByDefault also, but that didn't help. It may be a
problem with lib ways though, I'll try that.

Thanks, John

> Cheers,
>
> Bertram


Re: how to compile non-dynamic ghc-7.8.2 ?

2014-04-27 Thread John Lato
Hi Carter,

cabal-install-1.20.0.0, using Cabal 1.20.0.0.  But this also happened IIRC
with just ghc, no libraries installed, and no cabal anywhere on the path.

In any case, thus far the worst behavior I've seen from a too-old cabal is
a compile failure, not a core dump :)


On Fri, Apr 25, 2014 at 9:47 AM, Carter Schonwald <
carter.schonw...@gmail.com> wrote:

> @john, what version of cabal-install were you using? (i realize you're
> probably using the right one, but worth asking :) )


Re: how to compile non-dynamic ghc-7.8.2 ?

2014-04-29 Thread John Lato
Hi Simon,

Thanks very much for this response.  I believe you're correct; ghc -e
'System.Environment.getEnvironment' segfaults with my ghc build.

John


On Tue, Apr 29, 2014 at 10:36 AM, Simon Marlow  wrote:

> I think you are running into this: https://ghc.haskell.org/trac/ghc/ticket/8935
>
> It took me a *long* time to track that one down.  I still don't know what
> the root cause is, because I don't understand the system linker's behaviour
> here.  Given that other people are running into this, we ought to milestone
> it for 7.8.3 and do something about it.
>
> Cheers,
> Simon
>


vector and GeneralizedNewtypeDeriving

2014-05-13 Thread John Lato
Hello,

Prior to ghc-7.8, it was possible to do this:

> {-# LANGUAGE GeneralizedNewtypeDeriving, MultiParamTypeClasses #-}
> module M where
>
> import qualified Data.Vector.Generic.Base as G
> import qualified Data.Vector.Generic.Mutable as M
> import Data.Vector.Unboxed.Base -- provides MVector and Vector
>
> newtype Foo = Foo Int deriving (Eq, Show, Num,
> M.MVector MVector, G.Vector Vector, Unbox)

M.MVector is defined as

> class MVector v a where
> basicLength :: v s a -> Int
etc.

With ghc-7.8 this no longer compiles due to an unsafe coercion, as MVector
s Foo and MVector s Int have different types.  The error suggests trying
-XStandaloneDeriving to manually specify the context, however I don't see
any way that will help in this case.
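
For reference, the standalone attempt would look something like this,
and fails for the same reason (a sketch):

    {-# LANGUAGE StandaloneDeriving, GeneralizedNewtypeDeriving #-}
    deriving instance M.MVector MVector Foo
    -- rejected: GHC cannot coerce MVector s Int to MVector s Foo,
    -- because MVector is a data family and its parameters are nominal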

For that matter, I don't see any way to fix this in the vector package
either.  We might think to define

> type role M.MVector nominal representational

but that doesn't work as both parameters to M.MVector require a nominal
role (and it's probably not what we really want anyway).  Furthermore
Data.Vector.Unboxed.Base.MVector (which fills in at `v` in the instance) is
a data family, so we're stuck at that point also.

So given this situation, is there any way to automatically derive Vector
instances from newtypes?

tl;dr: I would really like to be able to do:

> coerce (someVector :: Vector Foo) :: Vector Int

am I correct that the current machinery isn't up to handling this?
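
In the meantime, the only escape hatch I see is unsafeCoerce, which
gives up exactly the checking that roles are meant to provide (a sketch;
it assumes the two representations really do match, and nothing verifies
that):

    import Unsafe.Coerce (unsafeCoerce)
    import Data.Vector.Unboxed (Vector)

    toIntVector :: Vector Foo -> Vector Int
    toIntVector = unsafeCoerce  -- unchecked: safe only while Vector Foo
                                -- and Vector Int are represented alike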

Thanks,
John


Re: vector and GeneralizedNewtypeDeriving

2014-05-13 Thread John Lato
Not by anything I've tried yet, no.


On Tue, May 13, 2014 at 10:40 PM, Carter Schonwald <
carter.schonw...@gmail.com> wrote:

> can you get the deriving to work on
> a newtype instance MVector s Foo = 
> ?


Re: vector and GeneralizedNewtypeDeriving

2014-05-14 Thread John Lato
Hi Richard,

Thanks for pointing me to the ticket; I agree that's the issue (although
I'm glad to have you and Simon confirm it).  I've summarized the issue and
raised the priority, and Simon linked to this thread.

I would have expected this to affect a lot of users, but as I
haven't heard many complaints (and nobody else said anything here!) maybe
the impact is smaller than I thought.

Thanks,
John


On Wed, May 14, 2014 at 6:02 AM, Richard Eisenberg wrote:

> Is this an instance of https://ghc.haskell.org/trac/ghc/ticket/8177 ? I
> think so.
>
> The problem boils down to the fact that Vector and MVector are data
> families and are thus (currently) exempted from the roles mechanism. (Or,
> more properly, may *only* have nominal roles.) There is no technical reason
> for this restriction. It's just that the feature would take a few solid
> days of work to implement and I wasn't aware of a concrete use case.
>
> Here is such a use case.
>
> If you agree that you've hit #8177, please post to that bug report and
> raise the priority to High -- being able to coerce Vectors seems very
> reasonable indeed, and we should support it. I doubt the feature will land
> in 7.8.3 (depending on the timeline for that release), but I'll get to it
> eventually. (Or, if you feel this is more critical in the larger picture,
> shout more loudly on the ticket and perhaps I can squeeze it in before
> 7.8.3.)
>
> Thanks,
> Richard


Re: vector and GeneralizedNewtypeDeriving

2014-05-14 Thread John Lato
Hi Richard,

Following your comment, I created a new ticket,
https://ghc.haskell.org/trac/ghc/ticket/9112, for this issue.

I'm not entirely sure I follow all the subtleties of your analysis, but I
think it's correct.


On Wed, May 14, 2014 at 9:10 PM, Bryan O'Sullivan wrote:

>
> On Wed, May 14, 2014 at 7:02 PM, John Lato  wrote:
>
>> I would have expected this would have affected a lot users, but as I
>> haven't heard many complaints (and nobody else said anything here!) maybe
>> the impact is smaller than I thought.
>>
>
> I think people just haven't migrated much to 7.8 yet. This is definitely a
> substantial problem.
>


Re: Future of DYNAMIC_GHC_PROGRAMS?

2014-05-24 Thread John Lato
On May 24, 2014 11:48 AM, "Simon Marlow"  wrote:
>
> On 19/05/2014 13:51, harry wrote:
>>
>> harry wrote
>>>
>>> I need to build GHC 7.8 so that Template Haskell will work without
shared
>>> libraries (due to a shortage of space).
>>>
>>> I understand that this can be done by turning off DYNAMIC_GHC_PROGRAMS
and
>>> associated build options. Is this possibility going to be kept going
>>> forward, or will it be deprecated once dynamic GHC is fully supported on
>>> all platforms?
>>
>>
>> PS This is for Linux x64.
>
>
> We may yet go back and turn DYNAMIC_GHC_PROGRAMS off by default; it has
> yet to be decided.  The worst situation would be to have to support both,
> so I imagine once we've decided one way or the other we'll deprecate the
> other method.
>
> Is it just shortage of space, or is there anything else that pushes you
> towards DYNAMIC_GHC_PROGRAMS=NO?  Isn't disk space cheap?
>
> Cheers,
> Simon

Speaking for myself, I've noticed compilation times can be much shorter
with DYNAMIC_GHC_PROGRAMS=NO.  On one project, using dynamic ghc added about
18 minutes to the build time (45 minutes vs. 27). That's significant enough
that we're leaning towards static ghc for now.


Re: __GLASGOW_HASKELL__=708?

2014-09-25 Thread John Lato
The value 708 is correct.  From the user's guide,
http://www.haskell.org/ghc/docs/latest/html/users_guide/options-phases.html#c-pre-processor
:

__GLASGOW_HASKELL__
For version x.y.z of GHC, the value of __GLASGOW_HASKELL__ is the integer
xyy (if y is a single digit, then a leading zero is added, so for example
in version 6.2 of GHC, __GLASGOW_HASKELL__==602). More information in
Section 1.4, “GHC version numbering policy”.
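
For example, version-conditional code tests it like this:

    {-# LANGUAGE CPP #-}
    #if __GLASGOW_HASKELL__ >= 708
    -- code that needs ghc 7.8 or later
    #else
    -- fallback for older compilers
    #endif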


On Fri, Sep 26, 2014 at 11:09 AM, Greg Fitzgerald  wrote:

> Using GHC 7.8.3 from the latest Haskell Platform on OS X 10.9.4, the
> __GLASGOW_HASKELL__ preprocessor symbol is being set to 708 instead of
> 783.  I'd guess I have some stale files lying around from previous
> versions of GHC or HP, but I can't seem to find them.  Any clues?
>
>
> $ cat wtf.hs
>
> {-# LANGUAGE CPP #-}
>
> $ ghc-7.8.3 -v -E wtf.hs 2>&1 | grep 708
>
> /usr/bin/gcc -E -undef -traditional -Wno-invalid-pp-token -Wno-unicode
> -Wno-trigraphs -I
>
> /Library/Frameworks/GHC.framework/Versions/7.8.3-x86_64/usr/lib/ghc-7.8.3/base-4.7.0.1/include
> -I
> /Library/Frameworks/GHC.framework/Versions/7.8.3-x86_64/usr/lib/ghc-7.8.3/integer-gmp-0.5.1.0/include
> -I
> /Library/Frameworks/GHC.framework/Versions/7.8.3-x86_64/usr/lib/ghc-7.8.3/include
> '-D__GLASGOW_HASKELL__=708' '-Ddarwin_BUILD_OS=1'
> '-Dx86_64_BUILD_ARCH=1' '-Ddarwin_HOST_OS=1' '-Dx86_64_HOST_ARCH=1'
> -U__PIC__ -D__PIC__ '-D__SSE__=1' '-D__SSE2__=1' -x assembler-with-cpp
> wtf.hs -o
> /var/folders/w7/_cxvr2k540163p59kwvqlzrcgn/T/ghc14288_0/ghc14288_1.hscpp
>
>
> Thanks,
> Greg


Re: Thread behavior in 7.8.3

2014-10-29 Thread John Lato
By any chance do the delays get shorter if you run your program with `+RTS
-C0.005` ?  If so, I suspect you're having a problem very similar to one
that we had with ghc-7.8 (7.6 too, but it's worse on ghc-7.8 for some
reason), involving possible misbehavior of the thread scheduler.
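
That is, run it as something like the following, where yourprog stands in
for your executable (the binary must be linked with -rtsopts for the RTS
flag to be accepted):

    ./yourprog +RTS -C0.005 -RTS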

On Wed, Oct 29, 2014 at 2:18 PM, Michael Jones  wrote:

> I have a general question about thread behavior in 7.8.3 vs 7.6.X
>
> I moved from 7.6 to 7.8 and my application behaves very differently. I
> have three threads, an application thread that plots data with wxhaskell or
> sends it over a network (depends on settings), a thread doing usb bulk
> writes, and a thread doing usb bulk reads. Data is moved around with TChan,
> and TVar is used for coordination.
>
> When the application was compiled with 7.6, my stream of usb traffic was
> smooth. With 7.8, there are lots of delays where nothing seems to be
> running. These delays are up to 40ms, whereas with 7.6 delays were 1ms or
> so.
>
> When I add -N2 or -N4, the 7.8 program runs fine. But on 7.6 it runs fine
> without -N2/4.
>
> The program is compiled -O2 with profiling. The -N2/4 version uses more
> memory,  but in both cases with 7.8 and with 7.6 there is no space leak.
>
> I tried to compile and use -ls so I could take a look with threadscope,
> but the application hangs and writes no data to the file. The CPU fans run
> wild like it is in an infinite loop. It at least pops an unpainted
> wxhaskell window, so it got partially running.
>
> One of my libraries uses option -fsimpl-tick-factor=200 to get around
> the simplifier's tick limit.
>
> What do I need to know about changes to threading and event logging
> between 7.6 and 7.8? Is there some general documentation somewhere that
> might help?
>
> I am on Ubuntu 14.04 LTS. I downloaded the 7.8 toolchain tarball and
> installed it myself, after removing 7.6 with apt-get.
>
> Any hints appreciated.
>
> Mike


Re: Thread behavior in 7.8.3

2014-10-29 Thread John Lato
I guess I should explain what that flag does...

The GHC RTS maintains capabilities; the number of capabilities is specified
by the `+RTS -N` option.  Each capability is a virtual machine that
executes Haskell code, and maintains its own runqueue of threads to process.

A capability will perform a context switch at the next heap block
allocation (every 4k of allocation) after the timer expires.  The timer
defaults to 20ms, and can be set by the -C flag.  Capabilities perform
context switches in other circumstances as well, such as when a thread
yields or blocks.

My guess is that either the context switching logic changed in ghc-7.8, or
possibly your code used to trigger a switch via some other mechanism (stack
overflow or something maybe?), but is optimized differently now so instead
it needs to wait for the timer to expire.

The problem we had was that a time-sensitive thread was getting scheduled
on the same capability as a long-running non-yielding thread, so the
time-sensitive thread had to wait for a context switch timeout (even though
there were free cores available!).  I expect even with -N4 you'll still see
occasional delays (perhaps <5% of calls).

We've solved our problem with judicious use of `forkOn`, but that won't
help at N1.
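
Roughly like this (a sketch; needs -threaded and -N2 or more, and the
two workloads are stand-ins):

    import Control.Concurrent

    main :: IO ()
    main = do
      _ <- forkOn 1 timedWork  -- pin the latency-sensitive loop to cap 1
      heavyWork                -- the long-running work stays elsewhere

    -- wakes up every millisecond
    timedWork :: IO ()
    timedWork = threadDelay 1000 >> timedWork

    -- long-running, rarely yielding work
    heavyWork :: IO ()
    heavyWork = print (sum [1 .. 10 ^ 8 :: Integer])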

We did see this behavior in 7.6, but it's definitely worse in 7.8.

Incidentally, has there been any interest in a work-stealing scheduler?
There was a discussion from about 2 years ago, in which Simon Marlow noted
it might be tricky, but it would definitely help in situations like this.

John L.

On Thu, Oct 30, 2014 at 8:02 AM, Michael Jones  wrote:

> John,
>
> Adding -C0.005 makes it much better. Using -C0.001 makes it behave more
> like -N4.
>
> Thanks. This saves my project, as I need to deploy on a single core Atom
> and was stuck.
>
> Mike


Re: Thread behavior in 7.8.3

2014-10-29 Thread John Lato
My understanding is that -fno-omit-yields is subtly different.  I think
that's for the case when a function loops without performing any heap
allocations, and thus would never yield even after the context switch
timeout.  In my case the looping function does perform heap allocations and
does eventually yield, just not until after the timeout.

Is that understanding correct?

(technically, doesn't it change to yielding after stack checks or something
like that?)
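
Concretely, the distinction as I understand it (a sketch, assuming -O):

    -- compiles to a non-allocating loop: without -fno-omit-yields it
    -- can hold its capability until it finishes
    spin :: Int -> Int
    spin n = if n == 0 then 0 else spin (n - 1)

    -- allocates on every iteration, so it does eventually yield, but
    -- only once the context-switch timer has expired
    busy :: Integer -> Integer
    busy n = if n == 0 then 0 else busy (n - 1)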



On Thu, Oct 30, 2014 at 8:24 AM, Edward Z. Yang  wrote:

> I don't think this is directly related to the problem, but if you have a
> thread that isn't yielding, you can force it to yield by using
> -fno-omit-yields on your code.  It won't help if the non-yielding code
> is in a library, and it won't help if the problem was that you just
> weren't setting timeouts finely enough (which sounds like what was
> happening). FYI.
>
> Edward


Re: GHC 7.8.3 thread hang

2014-11-11 Thread John Lato
The blocked on black hole message is very suspicious.  It means that thread
7 is blocked waiting for another thread to evaluate a thunk.  But in this
case, it's thread 7 that created that thunk and is supposed to be doing the
evaluating.  This is some evidence that Gregory's theory is correct and
your encode function loops somewhere.

On Wed Nov 12 2014 at 11:25:30 AM Michael Jones  wrote:

> Gregory,
>
> The 7.8.3 user guide says, under the -Msize option, that by default the
> heap is unlimited. I have several applications, and they all
> have messages like:
>
> 7fddc7bcd700: cap 2: waking up thread 7 on cap 2
> 7fddc7bcd700: cap 2: thread 4 stopped (yielding)
> 7fddcaad6740: cap 2: running thread 7 (ThreadRunGHC)
> 7fddcaad6740: cap 2: thread 7 stopped (heap overflow)
> 7fddcaad6740: cap 2: requesting parallel GC
> 7fddc5ffe700: cap 0: starting GC
> 7fddc57fd700: cap 1: starting GC
> 7fdda77fe700: cap 3: starting GC
> 7fddcaad6740: cap 2: starting GC
>
> I assumed that when the heap ran out of space, it caused a GC, or it
> enlarged the heap. The programs that have these messages run for very long
> periods of time, and when I heap profile them, they use about 500KB to 1MB
> over long periods of time, and are quite stable.
>
> As a test, I ran the hang application with profiling to see if memory
> jumps up before or after the hang.
>
> What I notice is the app moves along using about 800KB, then there is a
> spike to 2MB at the hang. So I believe you, but I am confused about the RTS
> behavior and how I can have all these overflow messages in a normal
> application and how to tell the difference between these routine messages
> vs a real heap problem.
>
> So, I dug deeper into the log. A normal execution for sending a command
> looks like:
>
> 7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
> 7f99e6ffd700: cap 0: thread 7 stopped (heap overflow)
> 7f99e6ffd700: cap 0: requesting parallel GC
> 7f99e6ffd700: cap 0: starting GC
> 7f99e6ffd700: cap 0: GC working
> 7f99e6ffd700: cap 0: GC idle
> 7f99e6ffd700: cap 0: GC done
> 7f99e6ffd700: cap 0: GC idle
> 7f99e6ffd700: cap 0: GC done
> 7f99e6ffd700: cap 0: all caps stopped for GC
> 7f99e6ffd700: cap 0: finished GC
> 7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
> 7f99e6ffd700: cap 0: sendCommand
> 7f99e6ffd700: cap 0: sendCommand: encoded
> 7f99e6ffd700: cap 0: sendCommand: size 4
> 7f99e6ffd700: cap 0: sendCommand: unpacked
> 7f99e6ffd700: cap 0: Sending command of size 4
> 7f99e6ffd700: cap 0: Sending command of size "\NUL\EOT"
> 7f99e6ffd700: cap 0: sendCommand: sent
> 7f99e6ffd700: cap 0: sendCommand: flushed
> 7f99e6ffd700: cap 0: thread 7 stopped (blocked on an MVar)
> 7f99e6ffd700: cap 0: running thread 2 (ThreadRunGHC)
> 7f99e6ffd700: cap 0: thread 2 stopped (yielding)
> 7f99e6ffd700: cap 0: running thread 45 (ThreadRunGHC)
> 7f99e6ffd700: cap 0: fetchTelemetryServer
> 7f99e6ffd700: cap 0: fetchTelemetryServer: got lock
>
> The thread is run, overflows, GC, runs, then blocks on an MVAr.
>
> For the hang case:
>
> 7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
> 7f99e6ffd700: cap 0: sendCommand
> 7f99e6ffd700: cap 0: thread 7 stopped (heap overflow)
> 7f99e6ffd700: cap 0: requesting parallel GC
> 7f99e6ffd700: cap 0: starting GC
> 7f99e6ffd700: cap 0: GC working
> 7f99e6ffd700: cap 0: GC idle
> 7f99e6ffd700: cap 0: GC done
> 7f99e6ffd700: cap 0: GC idle
> 7f99e6ffd700: cap 0: GC done
> 7f99e6ffd700: cap 0: all caps stopped for GC
> 7f99e6ffd700: cap 0: finished GC
> 7f9a05362a40: cap 0: running thread 1408 (ThreadRunGHC)
> 7f9a05362a40: cap 0: thread 1408 stopped (yielding)
> 7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
> 7f99e6ffd700: cap 0: thread 7 stopped (heap overflow)
> 7f99e6ffd700: cap 0: requesting parallel GC
> 7f99e6ffd700: cap 0: starting GC
> 7f99e6ffd700: cap 0: GC working
> 7f99e6ffd700: cap 0: GC idle
> 7f99e6ffd700: cap 0: GC done
> 7f99e6ffd700: cap 0: GC idle
> 7f99e6ffd700: cap 0: GC done
> 7f99e6ffd700: cap 0: all caps stopped for GC
> 7f99e6ffd700: cap 0: finished GC
> 7f99e6ffd700: cap 0: running thread 7 (ThreadRunGHC)
> 7f99e6ffd700: cap 0: thread 7 stopped (yielding)
> 7f99e6ffd700: cap 0: running thread 2 (ThreadRunGHC)
> 7f99e6ffd700: cap 0: thread 2 stopped (yielding)
> 7f99e6ffd700: cap 0: running thread 45 (ThreadRunGHC)
> 7f99e6ffd700: cap 0: fetchTelemetryServer
> 7f99e6ffd700: cap 0: fetchTelemetryServer: got lock
> ...
> 7f99e6ffd700: cap 0: fetchTelemetryServer: got lock
> 7f99e6ffd700: cap 0: fetchTelemetryServer: unlock
> 7f99e6ffd700: cap 0: fetchTelemetryServer
> 7f99e6ffd700: cap 0: fetchTelemetryServer: got lock
> 7f99e6ffd700: cap 0: fetchTelemetryServer: unlock
> 7f99e6ffd700: cap 0: fetchTelemetryServer
> 7f99e6ffd700: cap 0: thread 45 stopped (yielding)
> 7f9a05362a40: cap 0: running thread 1408 (ThreadRunGHC)
> 7f9a05362a40: cap 0: thread 1408 stopped (suspended while making a foreign
> call)
> 7f9a05362a40: cap 0: running thread 1408 (ThreadRunGHC)
> 7f9a05362a40: c

Re: What to do when garbage collector is slow?

2014-12-23 Thread John Lato
Can't try your code now,  but have you tried using threadscope? Just a
thought, but maybe the garbage collection is blocked waiting for a thread
to finish.
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: What to do when garbage collector is slow?

2014-12-23 Thread John Lato
Ah, just took a look.  I think my suggestion is unlikely to be correct.

On 08:40, Tue, Dec 23, 2014 John Lato  wrote:

> Can't try your code now,  but have you tried using threadscope? Just a
> thought, but maybe the garbage collection is blocked waiting for a thread
> to finish.
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: template haskell vs. -prof

2015-01-23 Thread John Lato
I agree that mixing template haskell with -prof can be tricky.  It's easier
if you turn off dynamic linking entirely.

As for multi-line string literals, I also think that an explicit syntax
would be nice.  Until then, I usually use:

unlines
  [ "Line 1"
  , "Line 2"
  ]

which ends up being pretty maintainable and easy to read.

On Fri Jan 23 2015 at 6:16:46 AM Evan Laforge  wrote:

> I ran into trouble compiling template haskell with -prof, and came
> across the ghc manual "7.9.4. Using Template Haskell with Profiling".
> Unfortunately I can't use its advice directly since I put profiling
> and non-profiling .o files into different directories.  But in
> principle it seems it should work, I just have to get ghc to load TH
> from the debug build directory, which is built with -dynamic, while
> continuing to load from the profile build directory.
>
> But are there flags to get it to do that?  I'm using "-osuf .hs.o
> -ibuild/profile/obj".  If I put ":build/debug/obj" on the -i line, it
> still seems to find the profiling one.  The ghc manual advice probably
> gets around it by using different -osufs... I guess TH somehow ignores
> -osuf?  Except when I compile the debug version with osuf, if finds
> them fine, so I don't really know how it works.
>
> Is there a way I can directly tell TH where to look?  It seems awkward
> to rely on all these implicit and seemingly undocumented heuristics.
>
> And, this is somewhat beside the point, but shouldn't TH theoretically
> be able to load directly from .hs and compile to bytecode like ghci
> can do if it doesn't find the .o file?
>
> And, even more beside the point, the only reason I'm messing with TH
> is for a really simple (one line) multi-line string literal
> quasiquote.  Surely I'm not the only person who would enjoy a
> -XMultiLineStringLiteral extension?  The alternative seems to be a
> program to add or strip all of the "\n\"s, and when I want to edit,
> copy out, strip, edit, paste back in, add back.  At that point maybe
> it's easier to just get used to all the \s... but then indentation is
> all a bit off due to the leading \.
> ___
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users@haskell.org
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: template haskell vs. -prof

2015-01-23 Thread John Lato
On 21:23, Fri, Jan 23, 2015 Evan Laforge  wrote:

On Sat, Jan 24, 2015 at 2:38 AM, John Lato  wrote:
> I agree that mixing template haskell with -prof can be tricky.  It's
easier if you turn
> off dynamic linking entirely.

But that's the thing, I do turn off dynamic linking because I have to
for -prof, but TH seems to require it.

 I mean to use a ghc that's been built without dynamic support.

 > unlines
>   [ "Line 1"
>   , "Line 2"
>   ]
>
> which ends up being pretty maintainable and easy to read.

Yeah, I use this one too.  It's ok for short things, but it can still
be annoying to edit.  My editor doesn't know how to do line wrapping
for it.  Then you can't just copy paste in and out.  And tabs get
messed up because you're already indented, and probably not in a
tabstop multiple... though I guess I could align it so it was.  And of
course since there's prefix gunk, folding doesn't work inside.

 Yeah, those are all problematic.
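
For what it's worth, the one-line quasiquoter you mention can be as
small as this (a sketch; module and name invented):

    module MultiLine (s) where

    import Language.Haskell.TH (stringE)
    import Language.Haskell.TH.Quote (QuasiQuoter (..))

    -- [s| ... |] turns its contents, newlines and all, into a string
    s :: QuasiQuoter
    s = QuasiQuoter
      { quoteExp  = stringE
      , quotePat  = error "s: not a pattern"
      , quoteType = error "s: not a type"
      , quoteDec  = error "s: not a declaration"
      }

The call site then needs QuasiQuotes enabled, which is the mild
downside compared to a real -XMultiLineStringLiteral.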
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Removing latency spikes. Garbage collector related?

2015-09-28 Thread John Lato
Try Greg's recommendations first.  If you still need to do more
investigation, I'd recommend that you look at some samples with either
threadscope or dumping the eventlog to text.  I really like
ghc-events-analyze, but it doesn't provide quite the same level of detail.
You may also want to dump some of your metrics into the eventlog, because
then you'll be able to see exactly how high latency episodes line up with
GC pauses.
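
Concretely, something along these lines (a sketch; traceEventIO lives
in Debug.Trace, and ghc-events-analyze understands the START/STOP
naming convention):

    import Debug.Trace (traceEventIO)

    -- wrap a latency-sensitive section with eventlog markers
    timed :: String -> IO a -> IO a
    timed label act = do
      traceEventIO ("START " ++ label)
      r <- act
      traceEventIO ("STOP " ++ label)
      return r

Run the program with +RTS -l and the markers will appear interleaved
with the GC events in the log.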

On Mon, Sep 28, 2015 at 1:02 PM Gregory Collins 
wrote:

>
> On Mon, Sep 28, 2015 at 9:08 AM, Will Sewell  wrote:
>
>> If it is the GC, then is there anything that can be done about it?
>
>
>- Increase value of -A (the default is too small) -- best value for
>this is L3 cache size of the chip
>- Increase value of -H (total heap size) -- this will use more ram but
>you'll run GC less often
>- This will sound flip, but: generate less garbage. Frequency of GC
>runs is proportional to the amount of garbage being produced, so if you can
>lower mutator allocation rate then you will also increase net productivity.
>Built-up thunks can transparently hide a lot of allocation so fire up the
>profiler and tighten those up (there's an 80-20 rule here). Reuse output
>buffers if you aren't already, etc.
>
> G
>
> --
> Gregory Collins 
> ___
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users


Re: Removing latency spikes. Garbage collector related?

2015-09-29 Thread John Lato
By dumping metrics, I mean essentially the same as the ghc-events-analyze
annotations, but with any extra information that is useful for the
investigation.  In particular, if you have a message id, include that.  You
may also want to annotate thread names with GHC.Conc.labelThread, and to
add more annotations to drill down if you uncover a problem area.

If I were investigating, I would take e.g. the five largest outliers, then
look in the (text) eventlog for those message ids, and see what happened
between the start and stop.  You'll likely want to track the thread states
(which is why I suggested you annotate the thread names).
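
As a sketch (names invented), the labelling and per-message annotation
might look like:

    import Control.Concurrent (forkIO, myThreadId)
    import Debug.Trace (traceEventIO)
    import GHC.Conc (labelThread)

    forkPublisher :: (Int -> IO ()) -> Int -> IO ()
    forkPublisher publish msgId = do
      _ <- forkIO $ do
        tid <- myThreadId
        labelThread tid "publisher"        -- name shows up in threadscope
        traceEventIO ("START msg " ++ show msgId)
        publish msgId
        traceEventIO ("STOP msg " ++ show msgId)
      return ()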

I'm not convinced it's entirely the GC; the latencies are larger than I
would expect from a GC pause (although lots of factors can affect that). I
suspect that either you have something causing abnormal GC spikes, or
there's a different cause.

On 04:15, Tue, Sep 29, 2015 Will Sewell  wrote:

> Thanks for the reply John. I will have a go at doing that. What do you
> mean exactly by dumping metrics, do you mean measuring the latency
> within the program, and dumping it if it exceeds a certain threshold?
>
> And from the answers I'm assuming you believe it is the GC that is
> most likely causing these spikes. I've never profiled Haskell code, so
> I'm not used to seeing what the effects of the GC actually are.
>
> On 28 September 2015 at 19:31, John Lato  wrote:
> > Try Greg's recommendations first.  If you still need to do more
> > investigation, I'd recommend that you look at some samples with either
> > threadscope or dumping the eventlog to text.  I really like
> > ghc-events-analyze, but it doesn't provide quite the same level of
> detail.
> > You may also want to dump some of your metrics into the eventlog, because
> > then you'll be able to see exactly how high latency episodes line up
> with GC
> > pauses.
> >
> > On Mon, Sep 28, 2015 at 1:02 PM Gregory Collins  >
> > wrote:
> >>
> >>
> >> On Mon, Sep 28, 2015 at 9:08 AM, Will Sewell  wrote:
> >>>
> >>> If it is the GC, then is there anything that can be done about it?
> >>
> >> Increase value of -A (the default is too small) -- best value for this
> is
> >> L3 cache size of the chip
> >> Increase value of -H (total heap size) -- this will use more ram but
> >> you'll run GC less often
> >> This will sound flip, but: generate less garbage. Frequency of GC runs
> is
> >> proportional to the amount of garbage being produced, so if you can
> lower
> >> mutator allocation rate then you will also increase net productivity.
> >> Built-up thunks can transparently hide a lot of allocation so fire up
> the
> >> profiler and tighten those up (there's an 80-20 rule here). Reuse output
> >> buffers if you aren't already, etc.
> >>
> >> G
> >>
> >> --
> >> Gregory Collins 
> >> ___
> >> Glasgow-haskell-users mailing list
> >> Glasgow-haskell-users@haskell.org
> >> http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users


cross module optimization issues

2008-11-15 Thread John Lato
Hello,

I have a problem with a package I'm working on, and I don't have any
idea how to sort out the current problem.

One part of my package is in one monolithic module, without an export
list, which works fine.  However, I've started to separate out
certain functions into another module and added an export list to one
of the modules, which dramatically decreases performance.  The memory
behavior (as shown by -hT) is also quite different, with substantial
memory usage by "FUN_2_0".  Are there any suggestions as to how I
could improve this?

Thanks,
John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: cross module optimization issues

2008-11-18 Thread John Lato
On Sat, Nov 15, 2008 at 10:09 PM, Don Stewart <[EMAIL PROTECTED]> wrote:
> jwlato:
>> Hello,
>>
>> I have a problem with a package I'm working on, and I don't have any
>> idea how to sort out the current problem.
>>
>> One part of my package is in one monolithic module, without an export
>> list, which works fine.  However, when I've started to separate out
>> certain functions into another module, and added an export list to one
>> of the modules, which dramatically decreases performance.  The memory
>> behavior (as shown by -hT) is also quite different, with substantial
>> memory usage by "FUN_2_0".  Are there any suggestions as to how I
>> could improve this?
>>
>
> Are you compiling with aggressive cross-module optimisations on (e.g.
> -O2)? You may have to add explicit inlining pragmas (check the Core
> output), to ensure key functions are exported in their entirety.
>

Thanks for the reply.

I'm compiling with -O2 -Wall.  After looking at the Core output, I
think I've found the key difference.  A function that is bound in a
"where" statement is different between the monolithic and split
sources.  I have no idea why, though.  I'll experiment with a few
different things to see if I can get this resolved.

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: cross module optimization issues

2008-11-21 Thread John Lato
On Wed, Nov 19, 2008 at 4:17 PM, Simon Peyton-Jones
<[EMAIL PROTECTED]> wrote:
> | I'm compiling with -O2 -Wall.  After looking at the Core output, I
> | think I've found the key difference.  A function that is bound in a
> | "where" statement is different between the monolithic and split
> | sources.  I have no idea why, though.  I'll experiment with a few
> | different things to see if I can get this resolved.
>
> In general, splitting code across modules should not make programs less 
> efficient -- as Don says, GHC does quite aggressive cross-module inlining.
>
> There is one exception, though.  If a non-exported non-recursive function is 
> called exactly once, then it is inlined *regardless of size*, because doing 
> so does not cause code duplication.  But if it's exported and is large, then 
> its inlining is not exposed -- and even if it were it might not be inlined, 
> because doing so duplicates its code an unknown number of times.  You can 
> change the threshold for (a) exposing and (b) using an inlining, with flags 
> -funfolding-creation-threshold and -funfolding-use-threshold respectively.
>
> If you find there's something else going on then I'm all ears.
>
> Simon
>

I did finally find the changes that make a difference.  I think it's
safe to say that I have no idea what's actually going on, so I'll just
report my results and let others try to figure it out.

I tried upping the thresholds mentioned, up to
-funfolding-creation-threshold 200 -funfolding-use-threshold 100.
This didn't seem to make any performance difference (I didn't check
the core output).

This project is based on Oleg's Iteratee code; I started using his
IterateeM.hs and Enumerator.hs files and added my own stuff to
Enumerator.hs (thanks Oleg, great work as always).  When I started
cleaning up by moving my functions from Enumerator.hs to MyEnum.hs, my
minimal test case increased from 19s to 43s.

I've found two factors that contributed.  When I was cleaning up, I
also removed a bunch of unused functions from IterateeM.hs (some of
the test functions and functions specific to his running example of
HTTP encoding).  When I added those functions back in, and added
INLINE pragmas to the exported functions in MyEnum.hs, I got the
performance back.

In general I hadn't added export lists to the modules yet, so all
functions should have been exported.

So it seems that somehow the unused functions in IterateeM.hs are
affecting how the functions I care about get implemented (or
exported).  I did not expect that.  Next step for me is to see what
happens if I INLINE the functions I'm exporting and remove the others,
I suppose.
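
(For anyone following along, by INLINE I mean the pragma form, here on a
toy stand-in for one of the real exported functions:

    module MyEnum (convStream) where

    -- a placeholder for one of the real enumerator functions
    convStream :: (a -> b) -> [a] -> [b]
    convStream f = map f
    {-# INLINE convStream #-}  -- keep the full unfolding in the .hi file

so the unfolding stays available across the module boundary.)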

Thank you Simon and Don for your advice, especially since I'm pretty
far over my head at this point.

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: cross module optimization issues

2008-11-23 Thread John Lato
On Sat, Nov 22, 2008 at 6:55 PM, Don Stewart <[EMAIL PROTECTED]> wrote:
> jwlato:
>
> Is this, since it is in IO code, a -fno-state-hack scenario?
> Simon  wrote recently about when and why -fno-state-hack would be
> needed, if you want to follow that up.
>
> -- Don
>

Unfortunately, -fno-state-hack doesn't seem to make much difference.
In any case, only the functions that actually do file IO are in the IO
monad; otherwise the functions use a generic Monad constraint.
Although you have reminded me that I should make a non-IO test case.

For Neil, and anyone else interested in looking at this, I'll put the
code and build instructions up later today.  I've just been cleaning
up some test cases to make it easier to run.

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Can't compile GHC 6.8.2

2008-11-25 Thread John Lato
> On Monday 24 November 2008 23:15, Barney Stratford wrote:
>> There's good news and bad news. The good news is that the compilation of
>> my shiny almost-new GHC is complete. The bad news is, it won't link.
>> It's grumbling about
>>
>> ld:
>> /System/Fink/src/fink.build/ghc-6.8.2-1/ghc-6.8.2/rts/libHSrts.a(PrimOps.o)
>> has external relocation entries in non-writable section (__TEXT,__text)
>> for symbols:
>> ___gmpn_cmp
--- etc. --

IMO, many Apple dev tools are insidiously broken (Apple's cpp is my
current nemesis).  Could you try using a different linker?  That might
be a sufficient workaround.  You could call gcc with the correct
arguments to complete the linking phase, rather than ld (or
vice-versa).

The only other suggestion I've seen is to re-order the exported
functions before compiling the library.  I don't know if this would
help in your case, but it has worked with other libraries.

That's the best I can suggest, anyway.

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: cross module optimization issues

2008-11-28 Thread John Lato
Neil, thank you very much for taking the time to look at this; I
greatly appreciate it.

One thing I don't understand is why the specializations are caused by
print_lines.  I suppose the optimizer can infer something which it
couldn't otherwise.

If I read this properly, the functions being specialized are liftI,
(>>=), return, and $f2.  One thing I'm not sure about is when INLINE
provides the desired optimal behavior, as opposed to SPECIALIZE.  The
monad functions are defined in the Monad instance, and thus aren't
currently INLINE'd or SPECIALIZE'd.  However, if they are separate
functions, would INLINE be sufficient?  Would that give the optimizer
enough to work with to derive the specializations on its own?  I'll
have some time to experiment with this myself tomorrow, but I'd
appreciate some direction (rather than guessing blindly).

What is "$f2"?  I've seen that appear before, but I'm not sure where
it comes from.
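
To make the INLINE-vs-SPECIALIZE question concrete, on a made-up
overloaded function I mean the difference between:

    sumIter :: (Monad m, Num a) => [a] -> m a
    sumIter = return . sum
    {-# SPECIALIZE sumIter :: [Double] -> IO Double #-}

and instead exposing the whole unfolding to callers with
{-# INLINE sumIter #-}, letting the optimizer specialize at each call
site.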

Thanks,
John

On Fri, Nov 28, 2008 at 10:31 AM, Simon Peyton-Jones
<[EMAIL PROTECTED]> wrote:
> The specialisations are indeed caused (indirectly) by the presence of 
> print_lines.  If print_lines is dead code (as it is when print_lines is not 
> exported), then there are no calls to the overloaded functions at these 
> specialised types, and so you don't get the specialised versions.  You can 
> get specialised versions by a SPECIALISE pragma, or SPECIALISE INSTANCE
>
> Does that make sense?
>
> Simon
>
> | -Original Message-
> | From: Neil Mitchell [mailto:[EMAIL PROTECTED]
> | Sent: 28 November 2008 09:48
> | To: Simon Peyton-Jones
> | Cc: John Lato; glasgow-haskell-users@haskell.org; Don Stewart
> | Subject: Re: cross module optimization issues
> |
> | Hi
> |
> | I've talked to John a bit, and discussed test cases etc. I've tracked
> | this down a little way.
> |
> | Given the attached file, compiling witih SHORT_EXPORT_LIST makes the
> | code go _slower_. By exporting the "print_lines" function the code
> | doubles in speed. This runs against everything I was expecting, and
> | that Simon has described.
> |
> | Taking a look at the .hi files for the two alternatives, there are two
> | differences:
> |
> | 1) In the faster .hi file, the body of print_lines is exported. This
> | is reasonable and expected.
> |
> | 2) In the faster .hi file, there are additional specialisations, which
> | seemingly have little/nothing to do with print_lines, but are omitted
> | if it is not exported:
> |
> | "SPEC >>= [GHC.IOBase.IO]" ALWAYS forall @ el
> |  $dMonad :: GHC.Base.Monad 
> GHC.IOBase.IO
> |   Sound.IterateeM.>>= @ GHC.IOBase.IO @ el $dMonad
> |   = Sound.IterateeM.a
> |   `cast`
> | (forall el1 a b.
> |  Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO a
> |  -> (a -> Sound.IterateeM.IterateeGM el1 GHC.IOBase.IO b)
> |  -> trans
> | (sym ((GHC.IOBase.:CoIO)
> |   (Sound.IterateeM.IterateeG el1 GHC.IOBase.IO b)))
> | (sym ((Sound.IterateeM.:CoIterateeGM) el1 GHC.IOBase.IO b)))
> |   @ el
> | "SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO]" ALWAYS forall @ el
> |  $dMonad ::
> | GHC.Base.Monad GHC.IOBase.IO
> |   Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
> |   = Sound.IterateeM.$s$f2 @ el
> | "SPEC Sound.IterateeM.$f2 [GHC.IOBase.IO]" ALWAYS forall @ el
> |  $dMonad ::
> | GHC.Base.Monad GHC.IOBase.IO
> |   Sound.IterateeM.$f2 @ GHC.IOBase.IO @ el $dMonad
> |   = Sound.IterateeM.$s$f21 @ el
> | "SPEC Sound.IterateeM.liftI [GHC.IOBase.IO]" ALWAYS forall @ el
> |@ a
> |$dMonad ::
> | GHC.Base.Monad GHC.IOBase.IO
> |   Sound.IterateeM.liftI @ GHC.IOBase.IO @ el @ a $dMonad
> |   = Sound.IterateeM.$sliftI @ el @ a
> | "SPEC return [GHC.IOBase.IO]" ALWAYS forall @ el
> | $dMonad :: GHC.Base.Monad
> | GHC.IOBase.IO
> |   Sound.IterateeM.return @ GHC.IOBase.IO @ el $dMonad
> |   = Sound.IterateeM.a7
> |   `cast`
> | (forall el1 a.
> |  a
> |  -> trans
> | (sym ((GHC.IOBase.:CoIO)
> |   (Sound.IterateeM.IterateeG el1 GHC.IOBase.IO a)))
> | (sym ((Sound.IterateeM.:CoIterateeGM) el1 GHC.IOBase.IO a)))
> |   @ el
> |
> | My guess is that these cause the slowdown - but is there any reason
> | that print_lines not being exported should cause t

Re: cross module optimization issues

2008-11-28 Thread John Lato
Yes, this does help, thank you.  I didn't know you could generate
specialized instances.  In fact, I was so sure that this was some
arcane feature that I immediately went to the GHC User Guide because I
didn't believe it was documented.

I immediately stumbled upon Section 8.13.9.

Thanks to everyone who helped me with this.  I think I've achieved a
small bit of enlightenment.

Cheers,
John

On Fri, Nov 28, 2008 at 2:46 PM, Simon Peyton-Jones
<[EMAIL PROTECTED]> wrote:
> The $f2 comes from the instance Monad (IterateeGM ...).
> print_lines uses a specialised version of that instance, namely
>Monad (IterateeGM el IO)
> The fact that print_lines uses it makes GHC generate a specialised version of 
> the instance decl.
>
> Even in the absence of print_lines you can generate the specialised instance 
> thus
>
> instance Monad m => Monad (IterateeGM el m) where
>{-# SPECIALISE instance Monad (IterateeGM el IO) #-}
>... methods...
>
> does that help?
>
> Simon
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GADT record syntax and contexts

2009-06-17 Thread John Lato
From the perspective of someone who doesn't use GADTs much, I find
(B) to be more clear.

John Lato

> SPJ wrote:
>
> Question for everyone:
>
>  * are (A) and (B) the only choices?
>  * do you agree (B) is best
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


question about -fno-pre-inlining

2009-06-18 Thread John Lato
Hello,

I was experimenting with compiler flags trying to tune some
performance and got something unexpected with the -fno-pre-inlining
flag.  I was hoping somebody here might be able to clarify an issue
for me.

When compiled with -fno-pre-inlining, my test program gives a
different result than compiled without (0.988... :: Double, compared
to 1.0).  It's numerical code, and was originally compiled with
-fexcess-precision, however I have tried both with and without
-fexcess-precision and the results are the same.  The only other
compiler flags in use are -O2 and --make.  Is this expected behavior
or a possible bug?  I believe the value with -fno-pre-inlining is
correct (and runs about 30% faster too).

This was done on an OSX 10.5 Macbook with GHC-6.10.3.  I could check
this on some other systems if it would be helpful.

Sincerely,
John Lato
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: question about -fno-pre-inlining

2009-06-18 Thread John Lato
Simon,

Thanks for the quick reply, and also the link.  I'll be sure to read
it.  I don't know what pre-inlining is; I was testing different
compiler options with acovea, which indicated the performance boost.
When I tried it myself, I noticed the differing value.

I'm pretty sure the affected code is in a library I'm developing.  If
I turn off pre-inlining when compiling the library I get the same
final value as when turning it off in just the test program, although
performance is markedly worse.  Unfortunately that doesn't narrow it
down much; there are several modules in the library.

The algorithm shouldn't be particularly sensitive.  I'm just normalizing
Ints to Doubles in the range +/- 1.0 and finding the maximum.
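
Roughly this shape, if it helps (a simplified sketch using Int16 in
place of my Int24):

    import Data.Int (Int16)

    normalize :: Int16 -> Double
    normalize i = fromIntegral i / negate (fromIntegral (minBound :: Int16))

    peak :: [Int16] -> Double
    peak = maximum . map (abs . normalize)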

I'm not using the FFI, but there are a few questionable tactics
employed.  In particular, I'm doing both:
1.  Casting Ptr's (in IO).
2.  Using an Int24 data type that has operations on unboxed Int#'s,
similar to Int16's implementation.

Of course the problem may be unrelated to both of these.  I just
wanted to find out if this was expected or not before I attempt to
isolate it, because that will take a bit of work.  I'll see what I can
do, but it may be a while before I make any progress.

Cheers,
John

On Thu, Jun 18, 2009 at 11:16 AM, Simon
Peyton-Jones wrote:
> John
>
> | When compiled with -fno-pre-inlining, my test program gives a
> | different result than compiled without (0.988... :: Double, compared
> | to 1.0).  It's numerical code, and was originally compiled with
>
> That's entirely unexpected. I am very surprised that turning off pre-inlining
> a) affects the results at all
> b) improves performance
>
> Of course this is a floating point program, where various numeric 
> transformations are invalid if you want bit-for-bit accuracy.  (eg addition 
> is not associative).   But a 2% change seems big, unless it's a very 
> sensitive algorithm.
>
> To find out what "pre-inlining" is read Section 5 of
> http://research.microsoft.com/en-us/um/people/simonpj/papers/inlining/inline-jfp.ps.gz
> It's called "PreInlineUnconditionally" there.
>
> I'm not sure how to proceed.  The more you can boil it down, the easier it'll 
> be to find out what is going on.  One way to do this is to make the program 
> smaller. But even finding out which function is sensitive to the setting of 
> -fno-pre-inlining would be interesting.  (You can't set this on a function by 
> function basis, so you'll have to split the module.)
>
> If you can make a self-contained test case, do make a Trac ticket for it.
>
> Are you using the FFI?
>
> All very odd.
>
> Simon
>
> | -Original Message-
> | From: glasgow-haskell-users-boun...@haskell.org 
> [mailto:glasgow-haskell-users-
> | boun...@haskell.org] On Behalf Of John Lato
> | Sent: 18 June 2009 09:58
> | To: glasgow-haskell-users@haskell.org
> | Subject: question about -fno-pre-inlining
> |
> | Hello,
> |
> | I was experimenting with compiler flags trying to tune some
> | performance and got something unexpected with the -fno-pre-inlining
> | flag.  I was hoping somebody here might be able to clarify an issue
> | for me.
> |
> | When compiled with -fno-pre-inlining, my test program gives a
> | different result than compiled without (0.988... :: Double, compared
> | to 1.0).  It's numerical code, and was originally compiled with
> | -fexcess-precision, however I have tried both with and without
> | -fexcess-precision and the results are the same.  The only other
> | compiler flags in use are -O2 and --make.  Is this expected behavior
> | or a possible bug?  I believe the value with -fno-pre-inlining is
> | correct (and runs about 30% faster too).
> |
> | This was done on an OSX 10.5 Macbook with GHC-6.10.3.  I could check
> | this on some other systems if it would be helpful.
> |
> | Sincerely,
> | John Lato
> | ___
> | Glasgow-haskell-users mailing list
> | Glasgow-haskell-users@haskell.org
> | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: question about -fno-pre-inlining

2009-06-26 Thread John Lato
In case anyone else was following this, I've discovered the source of
the differing output.  I had made some assumptions about when some
code would be executed based upon faulty reasoning.  Without
pre-inlining those assumptions happened to hold, but they did not when
pre-inlining was enabled.  Thanks to Echo Nolan who independently
discovered my error.

Now that I've fixed this issue, the output is the same (and correct)
regardless of compiler flags.  It's still faster without pre-inlining,
but that's a relatively minor problem in comparison.

Sincerely,
John

On Thu, Jun 18, 2009 at 11:16 AM, Simon
Peyton-Jones wrote:
> John
>
> | When compiled with -fno-pre-inlining, my test program gives a
> | different result than compiled without (0.988... :: Double, compared
> | to 1.0).  It's numerical code, and was originally compiled with
>
> That's entirely unexpected. I am very surprised that turning off pre-inlining
> a) affects the results at all
> b) improves performance
>
> Of course this is a floating point program, where various numeric 
> transformations are invalid if you want bit-for-bit accuracy.  (eg addition 
> is not associative).   But a 2% change seems big, unless it's a very 
> sensitive algorithm.
>
> To find out what "pre-inlining" is read Section 5 of
> http://research.microsoft.com/en-us/um/people/simonpj/papers/inlining/inline-jfp.ps.gz
> It's called "PreInlineUnconditionally" there.
>
> I'm not sure how to proceed.  The more you can boil it down, the easier it'll 
> be to find out what is going on.  One way to do this is to make the program 
> smaller. But even finding out which function is sensitive to the setting of 
> -fno-pre-inlining would be interesting.  (You can't set this on a function by 
> function basis, so you'll have to split the module.)
>
> If you can make a self-contained test case, do make a Trac ticket for it.
>
> Are you using the FFI?
>
> All very odd.
>
> Simon
>
> | -Original Message-
> | From: glasgow-haskell-users-boun...@haskell.org 
> [mailto:glasgow-haskell-users-
> | boun...@haskell.org] On Behalf Of John Lato
> | Sent: 18 June 2009 09:58
> | To: glasgow-haskell-users@haskell.org
> | Subject: question about -fno-pre-inlining
> |
> | Hello,
> |
> | I was experimenting with compiler flags trying to tune some
> | performance and got something unexpected with the -fno-pre-inlining
> | flag.  I was hoping somebody here might be able to clarify an issue
> | for me.
> |
> | When compiled with -fno-pre-inlining, my test program gives a
> | different result than compiled without (0.988... :: Double, compared
> | to 1.0).  It's numerical code, and was originally compiled with
> | -fexcess-precision, however I have tried both with and without
> | -fexcess-precision and the results are the same.  The only other
> | compiler flags in use are -O2 and --make.  Is this expected behavior
> | or a possible bug?  I believe the value with -fno-pre-inlining is
> | correct (and runs about 30% faster too).
> |
> | This was done on an OSX 10.5 Macbook with GHC-6.10.3.  I could check
> | this on some other systems if it would be helpful.
> |
> | Sincerely,
> | John Lato
> | ___
> | Glasgow-haskell-users mailing list
> | Glasgow-haskell-users@haskell.org
> | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Type checker's expected and inferred types (reformatted)

2009-10-28 Thread John Lato
> From: Isaac Dupree 
> David Menendez wrote:
>> On Sun, Oct 25, 2009 at 1:37 PM, Isaac Dupree
>>  wrote:
>>> David Menendez wrote:
 The expected type is what the context wants (it's *ex*ternal). The
 inferred type is what the expression itself has (it's *in*ternal).

 So inferring the type Maybe () for bar seems wrong.
>>> well, maybe GHC just gets it wrong enough of the time, that I got confused.
>>>
>>> Or maybe ... When there are bound variables interacting, on the inside and
>>> outside, it gets confusing.
>>>
>>>
>>> ghci:
>>> Prelude> \x -> (3+x) + (length x)
>>>
>>> :1:15:
>>>    Couldn't match expected type `[a]' against inferred type `Int'
>>>    In the second argument of `(+)', namely `(length x)'
>>>    In the expression: (3 + x) + (length x)
>>>    In the expression: \ x -> (3 + x) + (length x)
>>>
>>> Your explanation of "expected" and "inferred" could make sense to me if the
>>> error message followed the "Couldn't match" line with, instead,
>>>    "In the first argument of `length', namely `x'"
>>> because 'length' gives the context of expected list-type, but we've found
>>> out from elsewhere (a vague word) that 'x' needs to have type Int.
>>
>> This had me confused for a while, but I think I've worked out what's
>> happening. (+) is polymorphic,   ...
>
> Oh darn, it sounds like you're right. And polymorphism is so common.  I
> just came up with that example randomly as the first nontrivial
> type-error-with-a-lambda I could think of...

I think this is a great example of why the current type errors are not
as helpful as they could be.  The code where the type checker
determines 'x' has type [a] is several steps removed from where the
error arises.  This is how I understand this process (I've probably
left out some details):

1.  checker infers x :: [a] from 'length x'
2.  checker infers (3 + x) :: [a] from (+) and step 1
3.  checker infers the second (+) :: [a] -> [a] -> [a] from step 2
4.  conflict - checker expects (length x) :: [a] from step 3
 and infers (length x) :: Int from definition of 'length'.

Even with this simple example the error message given doesn't directly
point to the problem.  I don't think it's uncommon for there to be
more steps in practice.  I frequently find myself adding explicit type
signatures to let-bound functions to debug these.  This is a pain
because it also usually involves enabling ScopedTypeVariables.
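
For example, the debugging technique I mean looks like this (toy code):

    {-# LANGUAGE ScopedTypeVariables #-}

    f :: forall a. Num a => [a] -> a
    f xs = go xs
      where
        go :: [a] -> a   -- this 'a' is the outer one only with the extension
        go = sum

Without the annotation on 'go', an error anywhere in its body tends to
be reported far from its actual source.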

I personally would find it useful if error messages showed the
sequence of how the type checker determined the given context.
Especially if it would also do so for infinite type errors, which
don't provide much useful information to me.

Cheers,
John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Type checker's expected and inferred types (reformatted)

2009-10-29 Thread John Lato
On Thu, Oct 29, 2009 at 8:37 AM, Philip K.F.
 wrote:
> Dear GHCers,
>
>
> On Wed, 2009-10-28 at 12:14 +0000, John Lato wrote:
>> >> This had me confused for a while, but I think I've worked out what's
>> >> happening. (+) is polymorphic,   ...
>> >
>> > Oh darn, it sounds like you're right. And polymorphism is so common.  I
>> > just came up with that example randomly as the first nontrivial
>> > type-error-with-a-lambda I could think of...
>>
>> I think this is a great example of why the current type errors are not
>> as helpful as they could be.  The code where the type checker
>> determines 'x' has type [a] is several steps removed from where the
>> error arises.  This is how I understand this process (I've probably
>> left out some details):
>
> I am a little ambiguous on this issue; I usually find GHC's type errors
> make me realize what I did wrong very quickly, i.e. until I start
> messing with combinations of GADTs, type classes and type families. When
> I've looked at an error for too long without understanding what's
> happening, I usually look for ways to express the same problem in
> simpler constructs.
>
> This case has me stumped, though.
>
>> 1.  checker infers x :: [a] from 'length x'
>> 2.  checker infers (3 + x) :: [a] from (+) and step 1
>> 3.  checker infers the second (+) :: [a] -> [a] -> [a] from step 2
>
> Pardon? Judging from the error GHC gives, you must be right, but isn't
> *this* the point where things should go wrong? I'm not too intimately
> familiar with the type checker's internals, so this example leads me to
> speculate that "normal" types are inferred and checked *before* type
> class constraints are evaluated.

This "(+) :: [a] -> [a] -> [a]" seems wrong from an intuitive sense,
but the type checker doesn't know that.  We know that

  (+) :: Num t => t -> t -> t

It's completely legal (though maybe ill-advised) to construct a Num
instance for [a].  Even if that instance isn't in scope when this code
is compiled, it could be added later.
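
For instance, something like this type-checks (not that I recommend it):

    instance Num a => Num [a] where
      (+)         = zipWith (+)
      (*)         = zipWith (*)
      negate      = map negate
      abs         = map abs
      signum      = map signum
      fromInteger = repeat . fromInteger

and with it in scope, [1] + [2] happily evaluates to [3].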

I should probably wait for a GHC guru to respond to this one, because
I'm in the realm of speculation here, but I expect that "normal" types
need to be inferred, unified, and checked before type class
constraints can be applied.  In the case where a constraint is
necessary, either the type is a concrete type, e.g. Int, Char, for
which the class instance must be in scope, or the type is polymorphic,
e.g. [a], in which case the constraint must be passed up to the
context of where it's used.  The compiler doesn't know which is the
case (or exactly what context is necessary) until it's finished
checking the "normal" types.

> However, I would have wanted this
> error:
>
> Prelude> [1] + [2]
>
> :1:0:
>    No instance for (Num [t])
>      arising from a use of `+' at :1:0-8
>    Possible fix: add an instance declaration for (Num [t])
>    In the expression: [1] + [2]
>    In the definition of `it': it = [1] + [2]
>
> In other words: isn't the problem in this case that the type checker
> does not gather all information (no instance of type class Num) to give
> the proper error? Is gathering type class information after "normal"
> types have already conflicted even possible?
>

Just because a class instance isn't in scope doesn't mean it will never be
in scope.  In order for this to work, you'd need to distinguish
between code that will never be linked to from elsewhere and code that
will.  I don't think this is possible without completely changing the
compilation chain.  Ghci can do it because, as an interpreter, it
doesn't produce object code to be linked to anyway.

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Easily generating efficient instances for classes

2010-03-01 Thread John Lato
> From: Christian H?ner zu Siederdissen
>
> Hi,
>
> I am thinking about how to easily generate instances for a class. Each
> instance is a tuple with 1 or more elements. In addition there is a
> second tuple with the same number of elements but different type. This
> means getting longer and longer chains of something like (...,x3*x2,x2,0).
>
> - template haskell?
> - CPP and macros?
>
> Consider arrays with fast access like Data.Vector, but with higher
> dimensionality. Basically, I want (!) to fuse when used in Data.Vector
> code.

(shameless plug) You may want to look at my AdaptiveTuple package,
which does something very similar to this.  I used Template Haskell
because AFAIK neither generic approaches nor DrIFT/Derive will
generate data decls.

If all you need are the instances, then DrIFT or Derive would be my
recommendations.

Cheers,
John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Can't install Criterion package on ghc ..

2010-04-13 Thread John Lato
> From: Mozhgan kabiri 
>
> Hi,
>
> I am trying to install Criterion package, but I keep getting an error and I
> can't figure it out why it is like this !!
>
> mozh...@mozhgan-kch:~$ cabal install Criterion
> Resolving dependencies...
> Configuring vector-algorithms-0.3...
> Preprocessing library vector-algorithms-0.3...
> Building vector-algorithms-0.3...
...

> [8 of 9] Compiling Data.Vector.Algorithms.Intro (
> Data/Vector/Algorithms/Intro.hs, dist/build/Data/Vector/Algorithms/Intro.o )
> ghc: panic! (the 'impossible' happened)
>  (GHC version 6.10.4 for i386-unknown-linux):
>    idInfo co{v a9WB} [tv]

This is definitely a bug in GHC, most likely related to type families
and fixed in GHC-6.12.x

If you can upgrade to ghc-6.12.1, that should solve this problem.  If
you need to remain on ghc-6.10.4, try installing an older version of
criterion.  You can do this with 'cabal install criterion-0.4.1.0'
Criterion-0.5 is the first version to depend upon vector-algorithms,
so any previous version has a chance of working.

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Can't install Criterion package on ghc ..

2010-04-14 Thread John Lato
I'm not the original reporter, but I can confirm that I am able to
install Criterion-0.5.0 with ghc-6.12.1 on OSX 10.6.  One datum,
anyway.

The problem isn't with criterion itself, but with vector-algorithms.
The vector library relies heavily on type families, which have dodgy
support in ghc-6.10.

John

On Wed, Apr 14, 2010 at 3:24 PM, Simon Peyton-Jones
 wrote:
> Can you check?  If it still happens with 6.12, please submit a bug report.
> If 6.10 can't compile criterion, you might want to add a constraint to the 
> Cabal meta-data.
>
> Simon
>
> | -Original Message-
> | From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-haskell-
> | users-boun...@haskell.org] On Behalf Of John Lato
> | Sent: 13 April 2010 16:15
> | To: Mozhgan kabiri
> | Cc: glasgow-haskell-users@haskell.org
> | Subject: Re: Can't install Criterion package on ghc ..
> |
> | > From: Mozhgan kabiri 
> | >
> | > Hi,
> | >
> | > I am trying to install Criterion package, but I keep getting an error and 
> I
> | > can't figure it out why it is like this !!
> | >
> | > mozh...@mozhgan-kch:~$ cabal install Criterion
> | > Resolving dependencies...
> | > Configuring vector-algorithms-0.3...
> | > Preprocessing library vector-algorithms-0.3...
> | > Building vector-algorithms-0.3...
> | ...
> |
> | > [8 of 9] Compiling Data.Vector.Algorithms.Intro (
> | > Data/Vector/Algorithms/Intro.hs, dist/build/Data/Vector/Algorithms/Intro.o
> | )
> | > ghc: panic! (the 'impossible' happened)
> | >  (GHC version 6.10.4 for i386-unknown-linux):
> | >    idInfo co{v a9WB} [tv]
> |
> | This is definitely a bug in GHC, most likely related to type families
> | and fixed in GHC-6.12.x
> |
> | If you can upgrade to ghc-6.12.1, that should solve this problem.  If
> | you need to remain on ghc-6.10.4, try installing an older version of
> | criterion.  You can do this with 'cabal install criterion-0.4.1.0'
> | Criterion-0.5 is the first version to depend upon vector-algorithms,
> | so any previous version has a chance of working.
> |
> | John
> | ___
> | Glasgow-haskell-users mailing list
> | Glasgow-haskell-users@haskell.org
> | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
>
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Parallel Haskell: 2-year project to push real world use

2010-05-05 Thread John Lato
> From: Roman Leshchinskiy 


Following on this discussion, I have an algorithm that currently uses
BLAS to do the heavy work.  I'd like to try to get it working with DPH
or Repa, although my prior attempts have been less than successful.

I have a vector of vectors where each element depends upon the
previous two; I can use zipWithP to generate each successive element,
but I don't see how to create the entire structure.  I could use
"unfold" if one were provided.  The best approach I can think of is to
create a sequential list of element-vectors.
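
For comparison, with plain Data.Vector (not DPH) I can express the
recurrence with unfoldrN, roughly like this (names invented):

    import qualified Data.Vector as V

    -- assumes n >= 2; each element is built from the previous two
    recurrence :: Int -> (a -> a -> a) -> a -> a -> V.Vector a
    recurrence n step x0 x1 =
      V.fromList [x0, x1] V.++ V.unfoldrN (n - 2) gen (x0, x1)
      where
        gen (a, b) = let c = step a b in Just (c, (b, c))

(e.g. recurrence 10 (+) 1 1 gives Fibonacci numbers), where in my case
the elements are themselves vectors.  I haven't found anything
equivalent in Data.Array.Parallel.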

Also, where is scanP?  I don't see it in Data.Array.Parallel.Prelude
(GHC-6.12.1).

Cheers,
John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Dynamic libraries and GHCi

2010-05-18 Thread John Lato
> From: Simon Marlow 
>>
>
>> But currently there is one problem with "GhcShared=YES": with this
>> option, the stage-2 compiler gets linked dynamically but the
>> corresponding inplace shell wrapper does not set (DY)LD_LIBRARY_PATH,
>> thus ./inplace/bin/ghc-stage2 doesn't run at all. I could work around
>> this by manually symlinking all the dynamic libraries to ./inplace/lib
>> and setting (DY)LD_LIBRARY_PATH to there, but obvisouly there should
>> be a solution better than this.
>
> On Linux we link the binary using -rpath (I know OS X doesn't have
> -rpath).  This is another issue we need to resolve before we can switch
> to a dynamically-linked GHCi.
>
> Basically there's a fair bit to do to make a dynamic GHCi a reality, and
> before we can do anything there are some tricky decisions to make.
>

When you say OSX doesn't have -rpath, do you mean there's some problem
with using -rpath on OSX?  I have code that links to a 3rd-party
library (libcudart.dylib) at runtime and it seems to use -rpath fine.
This is with ghc-6.12.1.

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Dynamic libraries and GHCi

2010-05-20 Thread John Lato
On Thu, May 20, 2010 at 1:07 PM, Brandon S. Allbery KF8NH
 wrote:
> On May 20, 2010, at 06:23 , Simon Marlow wrote:
>>
>> On 18/05/2010 17:48, John Lato wrote:
>>>>
>>>> From: Simon Marlow
>>>>>
>>>>> But currently there is one problem with "GhcShared=YES": with this
>>>>> option, the stage-2 compiler gets linked dynamically but the
>>>>> corresponding inplace shell wrapper does not set (DY)LD_LIBRARY_PATH,
>>>>> thus ./inplace/bin/ghc-stage2 doesn't run at all. I could work around
>>>>> this by manually symlinking all the dynamic libraries to ./inplace/lib
>>>>> and setting (DY)LD_LIBRARY_PATH to there, but obvisouly there should
>>>>> be a solution better than this.
>>>>
>>>> On Linux we link the binary using -rpath (I know OS X doesn't have
>>>> -rpath).  This is another issue we need to resolve before we can switch
>>>> to a dynamically-linked GHCi.
>>>
>>> When you say OSX doesn't have -rpath, do you mean there's some problem
>>> with using -rpath on OSX?  I have code that links to a 3rd-party
>>> library (libcudart.dylib) at runtime and it seems to use -rpath fine.
>>> This is with ghc-6.12.1.
>>
>> It was my understanding that OS X doesn't have -rpath from reading
>>
>> http://hackage.haskell.org/trac/ghc/wiki/SharedLibraries/Management
>>
>> and from what I remember Mac people saying in the past.  Or maybe they
>> were just saying that -rpath is not the right thing on OS X because
>> libraries themselves have paths baked in.
>
>
> The latter.  Also, I'd recommend DYLD_FALLBACK_LIBRARY_PATH per Apple
> recommendations, as you can otherwise get weird results.

Apparently -rpath was only added in 10.5, so the former was previously
true also.  According to the ld manpage, -rpath only has an effect
when the library's load path begins with @rpath/, which seems to break
the usual OS X loading scheme so you then have to use -rpath.

According to the dyld docs, it looks like -rpath is meant to be used
as a last resort when other mechanisms aren't feasible.  In
particular, "The use of @rpath is most useful when  you  have  a
complex  directory structure  of  programs  and  dylibs  which can be
installed anywhere, but keep their relative positions." [1]

I suppose I'd agree that it's best avoided on OSX if it can be helped.

John

[1] 
http://developer.apple.com/mac/library/documentation/Darwin/Reference/ManPages/man1/dyld.1.html#//apple_ref/doc/man/1/dyld
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Dynamic libraries and GHCi

2010-05-20 Thread John Lato
On Thu, May 20, 2010 at 1:42 PM, Brandon S. Allbery KF8NH
 wrote:
> On May 20, 2010, at 08:29 , John Lato wrote:
>>
>> On Thu, May 20, 2010 at 1:07 PM, Brandon S. Allbery KF8NH
>>  wrote:
>>>
>>> On May 20, 2010, at 06:23 , Simon Marlow wrote:
>>>>
>>>> On 18/05/2010 17:48, John Lato wrote:
>>>>>>
>>>>>> From: Simon Marlow
>>>>>>>
>>>>>>> But currently there is one problem with "GhcShared=YES": with this
>>>>>>> option, the stage-2 compiler gets linked dynamically but the
>>>>>>> corresponding inplace shell wrapper does not set (DY)LD_LIBRARY_PATH,
>>>>>>> thus ./inplace/bin/ghc-stage2 doesn't run at all. I could work around
>>>>>>> this by manually symlinking all the dynamic libraries to
>>>>>>> ./inplace/lib
>>>>>>> and setting (DY)LD_LIBRARY_PATH to there, but obvisouly there should
>>>>>>> be a solution better than this.
>>>>>>
>>>>>> On Linux we link the binary using -rpath (I know OS X doesn't have
>>>>>> -rpath).  This is another issue we need to resolve before we can
>>>>>> switch
>>>>>> to a dynamically-linked GHCi.
>>>>>
>>>>> When you say OSX doesn't have -rpath, do you mean there's some problem
>>>>> with using -rpath on OSX?  I have code that links to a 3rd-party
>>>>> library (libcudart.dylib) at runtime and it seems to use -rpath fine.
>>>>> This is with ghc-6.12.1.
>>>>
>>>> It was my understanding that OS X doesn't have -rpath from reading
>>>>
>>>> http://hackage.haskell.org/trac/ghc/wiki/SharedLibraries/Management
>>>>
>>>> and from what I remember Mac people saying in the past.  Or maybe they
>>>> were just saying that -rpath is not the right thing on OS X because
>>>> libraries themselves have paths baked in.
>>>
>>> The latter.  Also, I'd recommend DYLD_FALLBACK_LIBRARY_PATH per Apple
>>> recommendations, as you can otherwise get weird results.
>>
>> According to the dyld docs, it looks like -rpath is meant to be used
>> as a last resort when other mechanisms aren't feasible.  In
>> particular, "The use of @rpath is most useful when  you  have  a
>> complex  directory structure  of  programs  and  dylibs  which can be
>> installed anywhere, but keep their relative positions." [1]
>>
>> I suppose I'd agree that it's best avoided on OSX if it can be helped.
>
> Interesting, considering that Apple's recommended use case appears to be
> exactly what we're looking for.

In that case, maybe -rpath is the right thing to do?  I don't know
much about Apple's dev. guidelines, so I won't make any
recommendations.

>
> Oh, also? @rpath isn't mentioned in dyld(1) on Leopard and is barely
> referenced in ld(1), which is why I missed it (and probably part of why
> -rpath was ignored).
>

At my most charitable, I'd describe Apple's docs for command-line
tools as incomplete at best.

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: link problem under macosx

2010-11-02 Thread John Lato
>
> From: Christian Maeder 
>
> Am 02.11.2010 11:48, schrieb Christian Maeder:
> > Hi,
> >
> > after installing
> >
> http://lambda.galois.com/hp-tmp/2010.2.0.0/haskell-platform-2010.2.0.0.i386.dmg
> > and various more libraries using cabal, we get the following linker
> > error below.
> >
> > A simple hello program compiles and links fine and uses
> > /usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version
> > 7.0.0) as shown by "otool -L".
> >
> > Does someone have an explanation or solution?
>
> adding "-L/usr/lib" as first argument to ghc solved the problem.
> Another cabal-package (gtk and friends) used /opt/local/lib under
> library-dirs:.
>
> Are there better workarounds?
>

Judging from the /opt/*, it looks like you're using macports for gtk.  The
libiconv provided by macports is incompatible with the system libiconv, and
trying to mix the two is a doomed effort.

If you're using a gtk2hs from macports (or gtk+ from macports) it will link
to /opt/local/lib/libiconv, but HP and ghc core libraries will link to
/usr/lib/libiconv.  This means that various haskell packages will be
entirely incompatible with each other, and problems will manifest as these
linker errors.

Since Apple seems disinclined to fix the system's libiconv, and macports
projects refuse to use it, the only real solution is to use either HP
without macports or the macports GHC without HP.  Personally I chose to use
the HP and built gtk+ using http://gtk-osx.sourceforge.net/ , which is
moderately painful but has been stable once it's built.

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: link problem under macosx

2010-11-03 Thread John Lato
On Wed, Nov 3, 2010 at 9:55 AM, Christian Maeder
wrote:

> Am 02.11.2010 18:03, schrieb Thorkil Naur:
> > Hello,
> >
> > On Tue, Nov 02, 2010 at 01:03:04PM +0100, Christian Maeder wrote:
> >> ...
> >> Are there better workarounds?
> >
> > I am not sure about that, I assume that you have looked at
> http://hackage.haskell.org/trac/ghc/ticket/4068?
>
> no, I found Simon Michael's message
> http://www.mail-archive.com/haskell-c...@haskell.org/msg81961.html
> by chance.
>
> I did not try out his extra-lib-dirs proposal, since all cabal packages
> were already installed.
>
> And I agree with him that it should be documented somewhere more
> prominent and that avoiding macports is no (good) solution.
>
> I'll add his proposal to your (closed) ticket to increase the hit rate.
>
> Cheers Christian
>

His proposed solution works until you try to link a Haskell project to a
macports lib that requires libiconv.  It's also inconvenient that you'll
sometimes need to unpack hackage code and manually edit the .cabal file.

If you want to use macports, the only real solution is to build a GHC+libs
that prefers /opt/local/ to the system-installed locations.  The macports
GHC does this, or you can try to compile it yourself with appropriate flags
to configure (whatever they may be).
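
If I remember right, GHC's configure accepts flags for pointing at a
particular iconv, so the incantation would be something like (unverified;
check ./configure --help for the exact spelling):

./configure --with-iconv-includes=/opt/local/include \
            --with-iconv-libraries=/opt/local/lib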

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: link problem under macosx

2010-11-03 Thread John Lato
>
> From: Simon Michael 
>
> On 11/2/10 10:20 AM, John Lato wrote:
> > Since Apple seems disinclined to fix the system's libiconv, and macports
> projects refuse to use it, the only real
> > solution is to use either HP without macports or the macports GHC without
> HP.  Personally I chose to use the HP and
>
> Not so, as mentioned you just need to make sure /usr/lib is in the link
> path before /opt/local/lib. Add -L/usr/lib to
> your build flags for ghc --make, and put:
>
> extra-lib-dirs: /usr/lib
> extra-lib-dirs: /opt/local/lib
>
> in that order in ~/.cabal/config for cabal build.
>
> This is a very FAQ and needs to be documented somewhere more obvious, I
> wonder where. cabal-install and GHC release notes ?
>

As I mentioned in another reply, this worked for me until I tried to link to
a macports lib (I think some regex lib) that actually used the macports
libiconv (a lot of them don't even if they pull in the dependency).  Then I
was really stuck because I had dependencies on two different libiconv's, and
neither one would link properly.

As such, I consider this a pretty fragile "solution".

John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


os x 64-bit?

2010-11-09 Thread John Lato
Hello,

I was wondering if there is a status report anywhere of progress towards
making ghc compile 64-bit on Snow Leopard.  There are a few trac tickets
that seem related:

4163: http://hackage.haskell.org/trac/ghc/ticket/4163
2965: (not sure, trac says the database is locked when I try to look at this
one)

Is Ian working on this, or anyone else?  I'd like to help if I can, and I
was wondering where would be the best place to start?  I probably won't
actually contribute much, but this seems as good a way as any for me to
start working with ghc.

Cheers,
John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Glasgow-haskell-users] os x 64-bit?

2010-11-11 Thread John Lato
Hi Greg,

Thanks very much for all your work on this, and to Barney Stratford too.  This
is great news.  I'm doing a lot of numerical work with Doubles at the
moment, so progress is quite welcome!

I can try building 7.0.1-RC using a 32-bit 6.12.3, although probably not
until the weekend.  I'll let you know how it goes.

If there are any special build instructions for this, maybe you could update
trac #2965 with details?

Cheers,
John

From: Gregory Wright 
>
> Hi,
>
> I built ghc 7.0.1-rc2 yesterday 64-bit on Snow Leopard.  Much of the work
> in getting ghc to build 64-bit was done by Barney Stratford; the MacPorts
> ghc 6.10.4 has built successfully in 64 bit mode for a number of months.
>
> Until just a few weeks ago 6.12.x and HEAD wouldn't build 64-bit because
> of changes in ghci.  These changes were in part to remove the limitation
> that modules loaded by ghci had to be located below the 2 GB address
> boundary.  Unfortunately, they revealed code paths in ghc's Mach-O linker
> that were never tested.  (See bug #4318.  Other bugs may be partial
> duplicates; the tickets probably need to be examined to check which ones
> are distinct.)
>
> The good news is that the linker patches were done just in time for 7.0.1.
> You should be all set to try the release candidates on Snow Leopard 64-bit.
>
> I built my 7.0.1-rc2 using a 64-bit 6.10.4.  My original 64-bit 6.10.4
> was bootstrapped using a 32-bit 6.10.4 compiled on Leopard.  Reports of
> trouble building the 7.0.1 release candidates using a 32-bit bootstrap
> compiler would be especially useful.
>
> Best Wishes,
> Greg
>
>
> On 11/9/10 12:48 PM, Brian Bloniarz wrote:
> > On 11/09/2010 02:36 AM, John Lato wrote:
> >> I was wondering if there is a status report anywhere of progress towards
> >> making ghc compile 64-bit on Snow Leopard.  There are a few trac tickets
> >> that seem related:
> > I think http://hackage.haskell.org/trac/ghc/ticket/3472
> > is related if you haven't seen it.
> >
> > I'm not working on this, though I am interested in helping enable
> > cross-compilation of GHC in general. I have been working on one facet
> > though: hsc2hs. hsc2hs is one of barriers to cross-compilation because
> > it requires compiling and running a .c file on the target machine
> > (because it needs target-specific information like the layout of
> > structures; GHC's mkDerivedConstants also has the same problem).
> >
> > I have a proof-of-concept patch which can do hsc2hs codegeneration
> > without running anything. This uses the same approach that autoconf
> > uses for cross-compilation. I'll try to post it within the next few
> > days -- if anybody finds this approach interesting please let me know.
> >
> > Thanks,
> > -Brian
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: building ghc-7.1.20110125 under Mac OS X

2011-02-01 Thread John Lato
>
> Subject: building ghc-7.1.20110125 under Mac OS X
>
> hi list.
>
> i have to build ghc-7.1.20110125 under mac os x, so i grabbed the stable
> snapshot. Everything builds fine but the resulting compiler has problems
>  with ld. It passes gcc flags to ld like "-march=-i686". Any ideas?
>
> BTW while still here. Are there any specific docs available on building
> 64-bit mac os x compiler?
>

Hi Pavel,

I'm not aware of any specific docs for building 64-bit on os x, but I didn't
have any difficulty last time I tried.  The key point is you need a 64-bit
bootstrap compiler (probably until
http://hackage.haskell.org/trac/ghc/ticket/3472 is fixed), however there are
binary distributions of ghc you can use for this.  After you install a
64-bit binary, go to the ghc src and do

./configure --with-ghc=/path/to/ghc64bin/ && make

and it should work.

N.B. don't install 32-bit and 64-bit builds of the same ghc version, because
the libraries will clobber each other.  If you want both, change the version
string before you install the second arch.

Cheers,
John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: memory slop (was: Using the GHC heap profiler)

2011-03-22 Thread John Lato
Hi Tim,

Sorry I can't tell you more about slop (I know less than you at this point),
but I do see the problem.  You're reading each line from a Handle as a
String (bad), then creating ByteStrings from that string with BS.pack
(really bad).  You want to read a ByteString (or Data.Text, or other compact
representation) directly from the handle without going through an
intervening string format.  Also, you'll be better off using a real parser
instead of "read", which is very difficult to use robustly.
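
For instance, the lines of the file can be read with no intermediate
Strings at all; if slurping the whole ~6MB file at once is acceptable, a
one-line sketch is

> ls <- fmap BS.lines (BS.readFile path)

and each line can then be parsed in place rather than packed afterwards.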

John L.


> From: Tim Docker 
> Subject: memory slop (was: Using the GHC heap profiler)
> To: glasgow-haskell-users@haskell.org
> Message-ID: <4d895bb0.1080...@dockerz.net>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>
> On Mon, Mar 21, 2011 at 9:59 AM, I wrote:
> >
> > My question on the ghc heap profiler on stack overflow:
> >
> >
> http://stackoverflow.com/questions/5306717/how-should-i-interpret-the-output-of-the-ghc-heap-profiler
> >
> > remains unanswered :-( Perhaps that's not the best forum. Is there
> someone
> > here prepared to explain how the memory usage in the heap profiler
> relates
> > to the  "Live Bytes" count shown in the garbage collection statistics?
>
> I've made a little progress on this. I've simplified my program down to
> a simple executable that loads a bunch of data into an in-memory map,
> and then writes it out again. I've added calls to `seq` to ensure that
> laziness is not causing excessing memory consumption. When I run this on
> my sample data set, it takes ~7 cpu seconds, and uses ~120 MB of vm.  An
> equivalent python script takes ~2 secs and ~19MB of vm :-(.
>
> The code is below. I'm mostly concerned with the memory usage rather
> than performance at this stage.  What is interesting is that when I turn
> on garbage collection statistics (+RTS -s), I see this:
>
>   10,089,324,996 bytes allocated in the heap
>      201,018,116 bytes copied during GC
>       12,153,592 bytes maximum residency (8 sample(s))
>       59,325,408 bytes maximum slop
>              114 MB total memory in use (1 MB lost due to fragmentation)
>
>   Generation 0: 19226 collections, 0 parallel,  1.59s,  1.64s elapsed
>   Generation 1:     8 collections, 0 parallel,  0.04s,  0.04s elapsed
>
>   INIT  time    0.00s  (  0.00s elapsed)
>   MUT   time    5.84s  (  5.96s elapsed)
>   GC    time    1.63s  (  1.68s elapsed)
>   EXIT  time    0.00s  (  0.00s elapsed)
>   Total time    7.47s  (  7.64s elapsed)
>
>   %GC time      21.8%  (22.0% elapsed)
>
>   Alloc rate    1,726,702,840 bytes per MUT second
>
>   Productivity  78.2% of total user, 76.5% of total elapsed
>
> This seems strange. The maximum residency of 12MB sounds about correct
> for my data. But what's with the 59MB of "slop"? According to the ghc docs:
>
> | The "bytes maximum slop" tells you the most space that is ever wasted
> | due to the way GHC allocates memory in blocks. Slop is memory at the
> | end of a block that was wasted. There's no way to control this; we
> | just like to see how much memory is being lost this way.
>
> There's this page also:
>
> http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/Slop
>
> but it doesn't really make things clearer for me.
>
> Is the slop number above likely to be a significant contribution to net
> memory usage? Are there any obvious reasons why the code below could be
> generating so much? The data file in question has 61k lines, and is <6MB
> in total.
>
> Thanks,
>
> Tim
>
>  Map2.hs 
>
> module Main where
>
> import qualified Data.Map as Map
> import qualified Data.ByteString.Char8 as BS
> import System.Environment
> import System.IO
>
> type MyMap = Map.Map BS.ByteString BS.ByteString
>
> foldLines :: (a -> String -> a) -> a -> Handle -> IO a
> foldLines f a h = do
>     eof <- hIsEOF h
>     if eof
>       then return a
>       else do
>         l <- hGetLine h
>         let a' = f a l
>         a' `seq` foldLines f a' h
>
> undumpFile :: FilePath -> IO MyMap
> undumpFile path = do
>     h <- openFile path ReadMode
>     m <- foldLines addv Map.empty h
>     hClose h
>     return m
>   where
>     addv m "" = m
>     addv m s  = let (k,v) = readKV s
>                 in k `seq` v `seq` Map.insert k v m
>
>     readKV s = let (ks,vs) = read s in (BS.pack ks, BS.pack vs)
>
> dump :: [(BS.ByteString,BS.ByteString)] -> IO ()
> dump vs = mapM_ putV vs
>   where
>     putV (k,v) = putStrLn (show (BS.unpack k, BS.unpack v))
>
> main :: IO ()
> main = do
>     args <- getArgs
>     case args of
>       [path] -> do
>         v <- undumpFile path
>         dump (Map.toList v)
>         return ()
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: memory slop (was: Using the GHC heap profiler)

2011-03-22 Thread John Lato
Minor update, here's how I would handle this problem (using uu-parsinglib
and the latest ListLike, mostly untested):


import Data.ListLike (fromString, CharString (..))
import Text.ParserCombinators.UU
import Text.ParserCombinators.UU.BasicInstances
import Text.ParserCombinators.UU.Utils

-- change the local bindings in undumpFile to:

addv m s | BS.null s = m
addv m s = let (k,v) = readKV s
           in Map.insert k v m

readKV :: BS.ByteString -> (BS.ByteString, BS.ByteString)
readKV s = let [ks,vs] = parse (pTuple [pQuotedString, pQuotedString])
                               (createStr (LineColPos 0 0 0) $ CS s)
               unCSf   = BS.drop 1 . BS.init . unCS
           in (unCSf ks, unCSf vs)


And of course change the type of "foldLines" and use
BS.hGetLine, both to enable ByteString IO.
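
Something like this (untested, but it's just the original loop with the
String bits swapped out):

> -- same structure as before, but reads strict ByteStrings directly
> foldLines :: (a -> BS.ByteString -> a) -> a -> Handle -> IO a
> foldLines f a h = do
>     eof <- hIsEOF h
>     if eof
>       then return a
>       else do
>         l <- BS.hGetLine h
>         let a' = f a l
>         a' `seq` foldLines f a' h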

To use uu-parsinglib's character parsers (e.g. pTuple) with ByteStrings, you
need a newtype wrapper such as CharString from ListLike; "CS" and "unCS"
wrap and unwrap the type.  The "unCSf" function removes the starting and
trailing quotes in addition to unwrapping.  This is still quick-and-dirty in
that there's no error recovery, but that's easy to add; see the
uu-parsinglib documentation and examples, particularly "pEnd".

I think this will make a significant difference to your application.

John L.

> Date: Tue, 22 Mar 2011 20:32:16 -0600
> From: Tim Docker 
> Subject: memory slop (was: Using the GHC heap profiler)
> To: glasgow-haskell-users@haskell.org
> Message-ID: <4d895bb0.1080...@dockerz.net>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>  Map2.hs 
>
> module Main where
>
> import qualified Data.Map as Map
> import qualified Data.ByteString.Char8 as BS
> import System.Environment
> import System.IO
>
> type MyMap = Map.Map BS.ByteString BS.ByteString
>
> foldLines :: (a -> String -> a) -> a -> Handle -> IO a
> foldLines f a h = do
>     eof <- hIsEOF h
>     if eof
>       then return a
>       else do
>         l <- hGetLine h
>         let a' = f a l
>         a' `seq` foldLines f a' h
>
> undumpFile :: FilePath -> IO MyMap
> undumpFile path = do
>     h <- openFile path ReadMode
>     m <- foldLines addv Map.empty h
>     hClose h
>     return m
>   where
>     addv m "" = m
>     addv m s  = let (k,v) = readKV s
>                 in k `seq` v `seq` Map.insert k v m
>
>     readKV s = let (ks,vs) = read s in (BS.pack ks, BS.pack vs)
>
> dump :: [(BS.ByteString,BS.ByteString)] -> IO ()
> dump vs = mapM_ putV vs
>   where
>     putV (k,v) = putStrLn (show (BS.unpack k, BS.unpack v))
>
> main :: IO ()
> main = do
>     args <- getArgs
>     case args of
>       [path] -> do
>         v <- undumpFile path
>         dump (Map.toList v)
>         return ()
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


shared libraries on OS X

2011-05-28 Thread John Lato
Hello,

I recently tried to build ghc on OS X Snow Leopard as 64-bit with shared
library support.  I had to self-compile gmp and modify mk/build.mk (I later
saw that Edward Amsden blogged about the same experience,
http://blog.edwardamsden.com/2011/04/howto-install-ghc-703-on-os-x-64-bit.html),
and it seemed to work, but executables don't run.  For example, with this
small program:

> import qualified Data.Vector.Unboxed as V
> main = let vec = V.replicate 10 (1 :: Int) in print $ V.sum vec

I get this result:

Mac-1:~ johnlato$ ghc -O -dynamic foo.hs
[1 of 1] Compiling Main ( foo.hs, foo.o )
Linking foo ...
Mac-1:~ johnlato$ ./foo
dyld: Library not loaded:
/private/var/folders/aJ/aJF0t1uBF7WDCz1PZV0A0U+++TI/-Tmp-/vector-0.7.0.176669/vector-0.7.0.1/dist/build/libHSvector-0.7.0.1-ghc7.0.3.dylib
  Referenced from: /Users/johnlato/./foo
  Reason: image not found
Trace/BPT trap

It seems that dyld is looking into build folders for the libraries.  If I
set DYLD_LIBRARY_PATH before compiling it appears to work:

Mac-1:~ johnlato$ export DYLD_LIBRARY_PATH=~/.cabal/lib/vector-0.7.0.1/ghc-7.0.3/:~/.cabal/lib/primitive-0.3.1/ghc-7.0.3/
Mac-1:~ johnlato$ ghc -O -dynamic foo.hs
[1 of 1] Compiling Main ( foo.hs, foo.o )
Linking foo ...
Mac-1:~ johnlato$ ./foo
10

This seems to be required for any libraries I've installed via cabal-install
--user, which quickly becomes onerous.

Could anyone give me some advice on how to make this work properly (e.g.
without manually setting DYLD_LIBRARY_PATH)?

Thanks,
John
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: GHC and Haskell 98

2011-06-20 Thread John Lato
>
> From: Bas van Dijk 
>
> On 17 June 2011 16:47, Simon Peyton-Jones  wrote:
> > So: Under Plan A, some Hackage packages will become un-compilable,
> >     and will require source code changes to fix them.  I do not have
> >     any idea how many Hackage packages would fail in this way.
>
> Of the 372 direct reverse dependencies of haskell98:
>
>
> http://bifunctor.homelinux.net/~roel/cgi-bin/hackage-scripts/revdeps/haskell98-1.1.0.1#direct
>
> there are 344 which also depend on base (See http://hpaste.org/47933
> for calculating the intersection).
>

Is it easy to check, out of those 344, how many would build if the
dependency on haskell98 were removed?  I suspect it's not needed for the
majority of cases.

+1 for Plan A, but interested in mitigating the negative consequences.

(Bas, your link doesn't work for me BTW; I can't resolve the IP.  Maybe
it's my uni's DNS cache.)

John Lato
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Runtime performance degradation for multi-threaded C FFI callback

2012-01-23 Thread John Lato
Hi Simon,

I'm not certain that your explanation matches what I observed.

All of my tests were done on a 4-core machine, executing with "+RTS
-N", which should be the same as "+RTS -N4" I believe.

With 1 Haskell thread (the main thread) and 4 process threads (via
pthreads), I saw a significant performance degradation compared to 5
Haskell threads (main + 4 via forkIO) and 4 process threads.  As I
understand your explanation, if C callbacks are scheduled according to
available capabilities, there should be no difference between these
situations.

I observed this with GHC-7.2.1; however, Daniel Fischer reported that,
with ghc-7.2.2, he observed different behavior (which matches your
explanation AFAICT).  Is it possible that the scheduling of callbacks
into Haskell changed between those versions?

Thanks,
John L.

> From: Simon Marlow 
> Subject: Re: Runtime performance degradation for multi-threaded C FFI
>        callback
> To: Sanket Agrawal 
> Cc: glasgow-haskell-users 
> Message-ID: <4f1d2f4d.9050...@gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> On 21/01/2012 15:35, Sanket Agrawal wrote:
>> Hi Edward,
>>
>> I was just going to get back to you about it. I did find out that the
>> issue was indeed one GHC thread dealing with 5 C threads for callback
>> (1:5 mapping) - so, the C threads were blocking on callback waiting for
>> the only GHC thread to be available. I updated the code to do 1:1
>> mapping - 5 GHC threads for 5 C threads. That proved to be almost
>> linearly scalable.
>
> This is almost right, except that your callbacks are not waiting for a
> GHC *thread*, but what we call a "capability", which is roughly speaking
> "permission to execute Haskell code".  The +RTS -N option chooses the
> number of capabilities.
>
> I expect that with -N1, your program is spending a lot of time just
> switching between the different OS threads.
>
> It's possible that we could make the runtime more flexible here.  I
> recently made it possible to modify the number of capabilities at
> runtime, so it's conceivable that the runtime could automatically add
> capabilities if it is being called from multiple OS threads.
>
>> John Lato suggested the above approach two days back, but I didn't get
>> to test the idea until now.
>>
>> It doesn't seem to matter whether number of GHC threads are increased,
>> if the mapping between GHC threads and C threads is not 1:1. I got 1:1
>> mapping by doing forkIO for each C thread. Is it really possible to do
>> 7:5 mapping (that is 7 GHC threads to choose from, for 5 C threads
>> during callback)? I can't think of a way to do it. Not that I need it. I
>> am just curious if that is possible.
>
> Just think of +RTS -N7 as being 7 *locks*, not 7 threads.  Then it makes
> perfect sense to have 7 locks available for 5 threads.
>
> Cheers,
>        Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Runtime performance degradation for multi-threaded C FFI callback

2012-01-23 Thread John Lato
I agree the OS scheduler is likely to contribute to our different
observations.  I'll try to test with ghc-7.4-rc1 tonight to see if I
get similar results to 7.2.1.

If you want to see some code I'll post it, although I doubt it's
necessary.  I would appreciate it if you (or someone else in the know)
could answer a question for me: does the GHC runtime handle scheduling
of code from Haskell threads (forkIO) and foreign callbacks (via
FunPtr's) in the same way, or are there restrictions on which
capability may handle one or the other (ignoring bound threads and the
like)?

Thank you,
John L.

On Mon, Jan 23, 2012 at 1:26 PM, Simon Marlow  wrote:
> I'll need to analyse the program to see what's going on.  There was a small
> change to the scheduler between 7.2.1 and 7.2.2 that could conceivably have
> made a difference in this scenario, but it was aimed at fixing a bug rather
> than improving performance.
>
> Another possibility is a difference in OS scheduling behaviour between yours
> and Daniel Fischer's setup.  In microbenchmarks like this, it's easy for a
> difference in OS scheduling behaviour to make a large difference in
> performance if it happens consistently.
>
> Cheers,
>        Simon

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Is it true that an exception is always terminates the thread?

2012-01-24 Thread John Lato
> From: Heka Treep 
> Subject: Re: Is it true that an exception is always terminates the
>        thread?
> To: "Edward Z. Yang" 
> Cc: glasgow-haskell-users 
> Message-ID:
>        
> Content-Type: text/plain; charset=ISO-8859-1
>
> 2012/1/23, Edward Z. Yang :
>> Excerpts from Heka Treep's message of Mon Jan 23 13:56:47 -0500 2012:
>>> adding the message queue (with Chan, MVar or STM) for each process will
>>> not
>>> help in this kind of imitation.
>>
>> Why not? Instead of returning a thread ID, send the write end of a Chan
>> which the thread is waiting on.  You can send messages (normal or
>> errors) using it.
>>
>> Edward
>>
>
> Yes, one can write this:

(others have commented on your actor implementation already)

I'm not certain I understand your comment about synchronization; the
STM implementation handles all of that.  Unless you mean that you'd
rather not write the "atomically"'s when writing to the TChan.  But
you can define:

> (!) :: TChan a -> a -> IO ()
> chan ! msg = atomically $ writeTChan chan msg

This allows you to write:

> test = do
>  mbox <- spawn actor
>  mbox ! "1"
>  mbox ! "2"
>  mbox ! "3"

which seems to be exactly what you want.
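
For reference, a minimal "spawn" to go with this could be written as
follows (a sketch, assuming an actor is simply a function that reads its
mailbox):

> import Control.Concurrent (forkIO)
> import Control.Concurrent.STM
>
> spawn :: (TChan a -> IO ()) -> IO (TChan a)
> spawn actor = do
>     mbox <- newTChanIO       -- create the mailbox
>     _ <- forkIO (actor mbox) -- run the actor in its own thread
>     return mbox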

For the record, it's probably possible to do this with async
exceptions, but I would not want to maintain it.  For one, async
exceptions are how GHC implements a lot of thread management stuff
(e.g. the ThreadKilled exception).  You would need to be careful that
your code doesn't interfere with that.  Another concern is the thread
mask state, which needs to be handled carefully.  For example, if you
perform an "interruptible operation" while processing the message
(e.g. blocking IO), another message could be received at that point,
which I believe would abort processing of the first message.  If you
use "uninterruptibleMask", then as I read the docs you can't block *at
all* without making the thread unkillable.

Doing this with async exceptions would be tricky to get right.  STM is
the best approach.

John L.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: parallelizing ghc

2012-01-26 Thread John Lato
> From: Evan Laforge 
>
> On Wed, Jan 25, 2012 at 11:42 AM, Ryan Newton  wrote:
>>> package list for me.  The time is going to be dominated by linking,
>>> which is single threaded anyway, so either way works.
>>
>> What is the state of incremental linkers?  I thought those existed now.
>
> I think in some specific cases.  I've heard there's a microsoft one?
> It would be windows only of course.  Is anyone using that with ghc?
>
> gold is supposed to be multi-threaded and fast (don't know about
> incremental), but once again it's ELF-only.  I've heard a few people
> talking about gold with ghc, but I don't know what the results were.
>
> Unfortunately I'm on OS X, I don't know about any incremental or
> multithreaded linking here.

Neither do I.  On my older machine with 2GB RAM, builds are often
dominated by ld because it starts thrashing.  And not many linkers
target Mach-O.

I've been toying with building my own ld replacement.  I don't know
anything about linkers, but I'd say at least even odds that I can do
better than this.

John L.

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users