[Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-18 Thread Florian Hartwig
Hi everyone,
I would like to do the GSoC project outlined in
http://hackage.haskell.org/trac/summer-of-code/ticket/1608

One of Haskell's great strengths is its support for all kinds of concurrent
and parallel programmming, so I think that the Haskell ecosystem would
benefit from having a wider range of efficient concurrent data structures.
Also, more selfishly, I remember reading the article in CACM last summer and
thinking that the whole idea of updating data structures directly using
atomic compare-and-swap was really cool, so I would love to have an excuse
to play around with it.

Some (initial) ideas for lock-free data structures worth implementing:
1. A lock-free priority queue, e.g. using the approach based on skip-lists
explained in [2]
2. Stacks, see [1] and [3]
3. Hash tables [4]
but if anyone has other suggestions, I would obviously happy to hear them.

GSoC stretches over 13 weeks. I would estimate that implementing a data
structure, writing tests, benchmarks, documentation etc. should not take more
than 3 weeks (it is supposed to be full-time work, after all), which means
that I could implement 4 of them in the time available and still have some
slack.

About me: My name is Florian Hartwig, I'm a fifth year (Master's) student in
Computing Science at the University of Glasgow. I've been using Haskell for a
bit more than two years now (both for university courses and my recreational
programming) and I'm currently using it for my Master's project, so I won't
have to spend any time learning the Haskell basics.
I don't have any other plans for the summer that might conflict with the
project.

I'd be thankful for any thoughts, questions and suggestions.
Cheers,
Florian

[1] 
http://cacm.acm.org/magazines/2011/3/105308-data-structures-in-the-multicore-age/fulltext
[2] http://www.sciencedirect.com/science/article/pii/S0743731504002333
[3] http://dl.acm.org/citation.cfm?id=1007944
[4] http://dl.acm.org/citation.cfm?id=564881

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data

2012-03-29 Thread John Lato
Slightly related: I think it would be interesting to compare a
Disruptor-based concurrency communications mechanism and compare it to
e.g. TChans

1.  Disruptor - http://code.google.com/p/disruptor/

> From: Ryan Newton 
>
>> I think so. Atomically reading and writing a single memory location
>>> (which CAS does) is just a very simple transaction. But using a CAS
>>> instruction should be more efficient, since STM has to maintain a
>>> transaction log and commit transactions, which creates some overhead.
>>>
>>
>> Ah, I see. In that case, it may be worthwhile to implement the CAS
>> instruction in terms of STM as well and measure the performance difference
>> this makes for the final data structure. After all, STM is a lot more
>> compositional than CAS, so I'd like to know whether the loss of
>> expressiveness is worth the gain in performance.
>
>
> There's one annoying hitch with doing apples-to-apples comparisons here.
>
> The problem is that CAS falls short of being a hardware-accelerated version
> of a simple transaction (read TVar, (==) against expected value,
> conditionally update TVar), because CAS is based on pointer equality rather
> than true equality.  (eq? rather than equal? for any Schemers out there.)
>
> For this reason our "Fake" version of CAS -- for older GHCs and for
> performance comparison -- has to use reallyUnsafePointerEquality#:
>
>
> http://hackage.haskell.org/packages/archive/IORefCAS/0.2/doc/html/Data-CAS-Internal-Fake.html
>
> But we do provide a "CASable" type class in that package which is precisely
> for comparing the performance of:
>
>   - Hardware CAS
>   - Fake CAS -- atomicModifyIORef + ptrEquality
>   - Foreign CAS -- gcc intrinsic + function call

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data

2012-03-30 Thread Krzysztof Skrzętnicki
Hi,

I was interested in Distruptor few months ago and started to write a
Haskell implementation. Soon I discovered the lack of native CAS operations
and abandoned project for a while. I don't really have time to develop it
further right now, but the code is available:
https://github.com/Tener/disruptor-hs

The code as it is now is mostly benchmarking code. I think it is worth
trying out.

You mention benchmarking TChans: one particular problem with TChans and
Chans is that they are unbounded. If the producers outpace consumers it
soon ends with memory exhaustion.

In the process of writing above code I discovered particularly simple data
structure that performs suprisingly well (full implementation:
https://github.com/Tener/disruptor-hs/blob/master/Test/ManyQueue.hs) :

type Queue a = [MVar a]
mkQueue size = cycle `fmap` (replicateM size newEmptyMVar)


The throutput is very high, it is bounded and scales well with the number
of producer/consumers threads. In the presence of multiple
consumers/producers it's not a FIFO queue though, but rather a kind of
buffer with funny ordering.

Best regards,
Krzysztof Skrzętnicki

On Fri, Mar 30, 2012 at 00:03, John Lato  wrote:

> Slightly related: I think it would be interesting to compare a
> Disruptor-based concurrency communications mechanism and compare it to
> e.g. TChans
>
> 1.  Disruptor - http://code.google.com/p/disruptor/
>
> > From: Ryan Newton 
> >
> >> I think so. Atomically reading and writing a single memory location
> >>> (which CAS does) is just a very simple transaction. But using a CAS
> >>> instruction should be more efficient, since STM has to maintain a
> >>> transaction log and commit transactions, which creates some overhead.
> >>>
> >>
> >> Ah, I see. In that case, it may be worthwhile to implement the CAS
> >> instruction in terms of STM as well and measure the performance
> difference
> >> this makes for the final data structure. After all, STM is a lot more
> >> compositional than CAS, so I'd like to know whether the loss of
> >> expressiveness is worth the gain in performance.
> >
> >
> > There's one annoying hitch with doing apples-to-apples comparisons here.
> >
> > The problem is that CAS falls short of being a hardware-accelerated
> version
> > of a simple transaction (read TVar, (==) against expected value,
> > conditionally update TVar), because CAS is based on pointer equality
> rather
> > than true equality.  (eq? rather than equal? for any Schemers out there.)
> >
> > For this reason our "Fake" version of CAS -- for older GHCs and for
> > performance comparison -- has to use reallyUnsafePointerEquality#:
> >
> >
> >
> http://hackage.haskell.org/packages/archive/IORefCAS/0.2/doc/html/Data-CAS-Internal-Fake.html
> >
> > But we do provide a "CASable" type class in that package which is
> precisely
> > for comparing the performance of:
> >
> >   - Hardware CAS
> >   - Fake CAS -- atomicModifyIORef + ptrEquality
> >   - Foreign CAS -- gcc intrinsic + function call
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data

2012-03-31 Thread wren ng thornton

On 3/30/12 4:27 AM, Krzysztof Skrzętnicki wrote:

You mention benchmarking TChans: one particular problem with TChans and
Chans is that they are unbounded. If the producers outpace consumers it
soon ends with memory exhaustion.


If that's the case, then you should consider TBChan[1] which is a 
bounded TChan, to solve exactly that problem. (There's also TMChan for 
closeable channels, and TBMChan for bounded closeable channels, as needed.)



[1] http://hackage.haskell.org/package/stm-chans

--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-18 Thread Chris Smith
On Mar 18, 2012 6:39 PM, "Florian Hartwig" 
wrote:
> GSoC stretches over 13 weeks. I would estimate that implementing a data
> structure, writing tests, benchmarks, documentation etc. should not take
more
> than 3 weeks (it is supposed to be full-time work, after all), which means
> that I could implement 4 of them in the time available and still have some
> slack.

Don't underestimate the time required for performance tuning, and be
careful to leave yourself learning time, unless you have already
extensively used ThreadScope, read GHC Core, and worked with low-level
strictness, unpacking, possibly even rewrite rules.  I suspect that the
measurable performance benefit from lockless data structures might be
tricky to tease out of the noise created by unintentional strictness or
unboxing issues.  And we'd be much happier with one or two really
production quality implementations than even six or seven at a student
project level.

-- 
Chris Smith
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-18 Thread Florian Hartwig
On 19 March 2012 00:59, Chris Smith  wrote:
> On Mar 18, 2012 6:39 PM, "Florian Hartwig" 
> wrote:
>> GSoC stretches over 13 weeks. I would estimate that implementing a data
>> structure, writing tests, benchmarks, documentation etc. should not take
>> more
>> than 3 weeks (it is supposed to be full-time work, after all), which means
>> that I could implement 4 of them in the time available and still have some
>> slack.
>
> Don't underestimate the time required for performance tuning, and be careful
> to leave yourself learning time, unless you have already extensively used
> ThreadScope, read GHC Core, and worked with low-level strictness, unpacking,
> possibly even rewrite rules.  I suspect that the measurable performance
> benefit from lockless data structures might be tricky to tease out of the
> noise created by unintentional strictness or unboxing issues.  And we'd be
> much happier with one or two really production quality implementations than
> even six or seven at a student project level.
>
> --
> Chris Smith

Thank you, Hofstadter's law definitely rears its head in many of my projects.
I do have some experience with ThreadScope and strictness issues, but
you I agree that I'm probably underestimating the time I need to
learn.
I also agree that my focus would be on quality rather than quantity. I
quite like the modularity of this project, because it minimises the
chance of having a lot of half-finished but useless code at the end of
summer.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-19 Thread Gregory Collins
A lock-free concurrent queue alone would be worth a summer project IMO.

G

On Mon, Mar 19, 2012 at 3:25 AM, Florian Hartwig <
florian.j.hart...@gmail.com> wrote:

> On 19 March 2012 00:59, Chris Smith  wrote:
> > On Mar 18, 2012 6:39 PM, "Florian Hartwig" 
> > wrote:
> >> GSoC stretches over 13 weeks. I would estimate that implementing a data
> >> structure, writing tests, benchmarks, documentation etc. should not take
> >> more
> >> than 3 weeks (it is supposed to be full-time work, after all), which
> means
> >> that I could implement 4 of them in the time available and still have
> some
> >> slack.
> >
> > Don't underestimate the time required for performance tuning, and be
> careful
> > to leave yourself learning time, unless you have already extensively used
> > ThreadScope, read GHC Core, and worked with low-level strictness,
> unpacking,
> > possibly even rewrite rules.  I suspect that the measurable performance
> > benefit from lockless data structures might be tricky to tease out of the
> > noise created by unintentional strictness or unboxing issues.  And we'd
> be
> > much happier with one or two really production quality implementations
> than
> > even six or seven at a student project level.
> >
> > --
> > Chris Smith
>
> Thank you, Hofstadter's law definitely rears its head in many of my
> projects.
> I do have some experience with ThreadScope and strictness issues, but
> you I agree that I'm probably underestimating the time I need to
> learn.
> I also agree that my focus would be on quality rather than quantity. I
> quite like the modularity of this project, because it minimises the
> chance of having a lot of half-finished but useless code at the end of
> summer.
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>



-- 
Gregory Collins 
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-19 Thread Florian Hartwig
On 19 March 2012 09:56, Gregory Collins  wrote:
> A lock-free concurrent queue alone would be worth a summer project IMO.
>
> G

Ryan Newton is already doing that
(https://github.com/rrnewton/haskell-lockfree-queue).

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-19 Thread Ryan Newton
The MichaelScott lockefree queues in that repository pass tests and should
work.  Additional stress testing and feedback would be appreciated.  There
are some other queues in the literature that might be worth implementing
but I think other data structures are higher priority.

As Adam Foltzer mentioned in the trac ticket a really good structure would
be the concurrent bags from this paper:

   http://www.cse.chalmers.se/~tsigas/papers/Lock%20Free%20Bag%20SPAA11.pdf

We separately did a C implementation of those and they performed really
well in our bake-off of producer consumer data structures (vs. TBB queues,
and many other implementations).  By the way, we can share the code for
this little bake-off as a performance baseline for the Haskell versions.

I'm less familiar with the literature on concurrent hash tables.  TBB's
have been a little bit underwhelming in performance.  But it's definitely
an important structure.   Ditto for priority queues.

In any case, I welcome your interest in the topic, Florian!

Cheers,
   -Ryan



On Mon, Mar 19, 2012 at 7:33 AM, Florian Hartwig <
florian.j.hart...@gmail.com> wrote:

> On 19 March 2012 09:56, Gregory Collins  wrote:
> > A lock-free concurrent queue alone would be worth a summer project IMO.
> >
> > G
>
> Ryan Newton is already doing that
> (https://github.com/rrnewton/haskell-lockfree-queue).
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-19 Thread Florian Hartwig
On 19 March 2012 11:46, Ryan Newton  wrote:
> As Adam Foltzer mentioned in the trac ticket a really good structure would
> be the concurrent bags from this paper:
>
>    http://www.cse.chalmers.se/~tsigas/papers/Lock%20Free%20Bag%20SPAA11.pdf
>
> We separately did a C implementation of those and they performed really well
> in our bake-off of producer consumer data structures (vs. TBB queues, and
> many other implementations).

I've just read the paper and they look both useful and interesting to
implement. Adam mentioned that GHC would need to be extended first,
and I can't really judge how much work that would be. Can anyone chime
in on how feasible that is?

> I'm less familiar with the literature on concurrent hash tables.  TBB's have
> been a little bit underwhelming in performance.  But it's definitely an
> important structure.   Ditto for priority queues.

The data structures I mentioned were just my initial ideas, I'm open
to any other suggestions. In my (limited) experience with parallel
programming, queues and priority queues tend to come up quite a bit,
but I'd be happy to get input from anyone with more experience
regarding what would be useful/important.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-20 Thread Gregory Collins
On Tue, Mar 20, 2012 at 2:53 AM, Florian Hartwig <
florian.j.hart...@gmail.com> wrote:

> On 19 March 2012 11:46, Ryan Newton  wrote:
> > As Adam Foltzer mentioned in the trac ticket a really good structure
> would
> > be the concurrent bags from this paper:
> >
> >
> http://www.cse.chalmers.se/~tsigas/papers/Lock%20Free%20Bag%20SPAA11.pdf
>

Looks like they rely on thread-local storage, which would have to be worked
around in Haskell somehow.


> I've just read the paper and they look both useful and interesting to
> implement. Adam mentioned that GHC would need to be extended first,
> and I can't really judge how much work that would be. Can anyone chime
> in on how feasible that is?
>

GHC already has a CAS primitive on MutVar#, it just needs to be extended to
MutableArray# and MutableByteArray# (at all of the bit-widths the CAS
instruction would support, e.g. see readWordXxArray# in
http://www.haskell.org/ghc/docs/latest/html/libraries/ghc-prim-0.2.0.0/GHC-Prim.html).
The implementation should be almost identical to casMutVar#.

In particular I think you need:

casMutArray# :: MutableArray# s a -> Int# -> a -> a -> State# s -> (#
State# s, Int#, a #)
casWord16MutByteArray :: MutableByteArray# s -> Int# -> Word# -> Word#
-> State# s -> (# State# s, Int#, Word#)

and equivalents for Word32. Word64, Int16, Int32, Int64.

G
-- 
Gregory Collins 
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-28 Thread Florian Hartwig
Hi again,
I just submitted my proposal on the GSoC website. You can find it here:
http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/florianhartwig/1
I would be very grateful if someone could read over it and tell me if
it makes sense and if/how it could be improved.
Cheers,
Florian

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-28 Thread Ryan Newton
> GHC already has a CAS primitive on MutVar#, it just needs to be extended
> to MutableArray# and MutableByteArray# (at all of the bit-widths the CAS
> instruction would support, e.g. see readWordXxArray# in
> http://www.haskell.org/ghc/docs/latest/html/libraries/ghc-prim-0.2.0.0/GHC-Prim.html).
> The implementation should be almost identical to casMutVar#.
> In particular I think you need:
> casMutArray# :: MutableArray# s a -> Int# -> a -> a -> State# s -> (#
> State# s, Int#, a #)
> casWord16MutByteArray :: MutableByteArray# s -> Int# -> Word# -> Word#
> -> State# s -> (# State# s, Int#, Word#)
>

FYI, I started working on adding these.  I'm hoping to have it working in
GHC HEAD for any students who need to use it.  To my knowledge the only two
patches required to implement casMutVar# were these two (plus the
preexisting cas() definition in SMP.h):


https://github.com/ghc/ghc/commit/521b792553bacbdb0eec138b150ab0626ea6f36b

https://github.com/ghc/ghc/commit/606f6e1cfcb2e79abaadcc5ed643817d2a4585d8

The latter is a bugfix to the former.

Florian, your proposal looks good to me (
http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/florianhartwig/1).
 You touched on the major things we need to know.

I just read in your proposal that you started looking into the casMutArray#
issue as well.  How far have you gotten with that?  Do you want to work on
this together a bit?

I've got an implementation of a casArray# primop that passes a basic test,
but I'm not sure if the GC write barrier is correct:


https://github.com/rrnewton/ghc/commit/18ed460be111b47a759486677960093d71eef386

The ByteArray versions will be more annoying, requiring more variations,
but they are also less essential, because the user can always use
ForeignPtr and bits-atomic in this case, and I believe for our concurrent
data structures we want to store arbitrary pointers (hence casArray#).

Cheers,
  -Ryan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-29 Thread Heinrich Apfelmus

Florian Hartwig wrote:

Hi everyone,
I would like to do the GSoC project outlined in
http://hackage.haskell.org/trac/summer-of-code/ticket/1608

One of Haskell's great strengths is its support for all kinds of concurrent
and parallel programmming, so I think that the Haskell ecosystem would
benefit from having a wider range of efficient concurrent data structures.
Also, more selfishly, I remember reading the article in CACM last summer and
thinking that the whole idea of updating data structures directly using
atomic compare-and-swap was really cool, so I would love to have an excuse
to play around with it.

Some (initial) ideas for lock-free data structures worth implementing:
1. A lock-free priority queue, e.g. using the approach based on skip-lists
explained in [2]
2. Stacks, see [1] and [3]
3. Hash tables [4]
but if anyone has other suggestions, I would obviously happy to hear them.


Since I don't know much about concurrency, I have a simple question: 
what is the difference between atomic compare-and-swap and software 
transactional memory? Both are lock-free? Is it possible to implement 
every data structure based on CAS in terms of STM? What are the 
drawbacks? The other way round?



Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-29 Thread Eugene Kirpichov
Though of course you can implement CAS in terms of STM, CAS is much
more low-level and will probably be many times (though not
asymptotically) faster.

On Thu, Mar 29, 2012 at 12:01 PM, Heinrich Apfelmus
 wrote:
> Florian Hartwig wrote:
>>
>> Hi everyone,
>> I would like to do the GSoC project outlined in
>> http://hackage.haskell.org/trac/summer-of-code/ticket/1608
>>
>> One of Haskell's great strengths is its support for all kinds of
>> concurrent
>> and parallel programmming, so I think that the Haskell ecosystem would
>> benefit from having a wider range of efficient concurrent data structures.
>> Also, more selfishly, I remember reading the article in CACM last summer
>> and
>> thinking that the whole idea of updating data structures directly using
>> atomic compare-and-swap was really cool, so I would love to have an excuse
>> to play around with it.
>>
>> Some (initial) ideas for lock-free data structures worth implementing:
>> 1. A lock-free priority queue, e.g. using the approach based on skip-lists
>> explained in [2]
>> 2. Stacks, see [1] and [3]
>> 3. Hash tables [4]
>> but if anyone has other suggestions, I would obviously happy to hear them.
>
>
> Since I don't know much about concurrency, I have a simple question: what is
> the difference between atomic compare-and-swap and software transactional
> memory? Both are lock-free? Is it possible to implement every data structure
> based on CAS in terms of STM? What are the drawbacks? The other way round?
>
>
> Best regards,
> Heinrich Apfelmus
>
> --
> http://apfelmus.nfshost.com
>
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe



-- 
Eugene Kirpichov
Principal Engineer, Mirantis Inc. http://www.mirantis.com/
Editor, http://fprog.ru/

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-29 Thread Florian Hartwig
On 29 March 2012 05:57, Ryan Newton  wrote:
> I just read in your proposal that you started looking into the casMutArray#
> issue as well.  How far have you gotten with that?  Do you want to work on
> this together a bit?
>
> I've got an implementation of a casArray# primop that passes a basic test,
> but I'm not sure if the GC write barrier is correct:
>
>
> https://github.com/rrnewton/ghc/commit/18ed460be111b47a759486677960093d71eef386

The write barrier is what I got stuck on as well. I don't know much
about Haskell's GC. I'm going to read up on it, but my Master's
project is due in in 3 weeks, so it's difficult to find the time right
now. I'd be happy to work with you on this, but it'll probably have to
wait until then.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-29 Thread Florian Hartwig
On 29 March 2012 at 12:01 PM, Heinrich Apfelmus
 wrote:
> Since I don't know much about concurrency, I have a simple question: what is
> the difference between atomic compare-and-swap and software transactional
> memory? Both are lock-free?

Well, terminology-wise it would probably be more accurate to say that
you can implement lock-free algorithms using both (i.e. it's not
really CAS and STM that are lock-free, but the algorithms implemented
using them). But maybe this is a pedantic distinction.

As to the difference between the two: CAS is just a machine
instruction that takes an old "expected" value, a new value and an
address and updates the memory pointed to by the address iff it
contains the old/expected value. All this happens atomically, so you
avoid the potential conflicts between threads concurrently reading
from and writing to the same address.

STM is much higher-level. It's an abstraction that allows the
programmer to treat any sequence of reads and writes as a single
atomic operation. This is, as the name says, implemented in software.

So while the two are related, CAS is a machine primitive that works
for a single operation and on a single word while STM is a software
abstraction that isolates sequences of operations on multiple memory
locations from each other.

> Is it possible to implement every data structure
> based on CAS in terms of STM? What are the drawbacks? The other way round?

I think so. Atomically reading and writing a single memory location
(which CAS does) is just a very simple transaction. But using a CAS
instruction should be more efficient, since STM has to maintain a
transaction log and commit transactions, which creates some overhead.

I hope this makes some sense.
Cheers,
Florian

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-29 Thread Heinrich Apfelmus

Florian Hartwig wrote:

Heinrich Apfelmus wrote:

So while the two are related, CAS is a machine primitive that works
for a single operation and on a single word while STM is a software
abstraction that isolates sequences of operations on multiple memory
locations from each other.


Is it possible to implement every data structure
based on CAS in terms of STM? What are the drawbacks? The other way round?


I think so. Atomically reading and writing a single memory location
(which CAS does) is just a very simple transaction. But using a CAS
instruction should be more efficient, since STM has to maintain a
transaction log and commit transactions, which creates some overhead.


Ah, I see. In that case, it may be worthwhile to implement the CAS 
instruction in terms of STM as well and measure the performance 
difference this makes for the final data structure. After all, STM is a 
lot more compositional than CAS, so I'd like to know whether the loss of 
expressiveness is worth the gain in performance.



Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-29 Thread Gregory Collins
On Thu, Mar 29, 2012 at 6:57 AM, Ryan Newton  wrote:

> The ByteArray versions will be more annoying, requiring more variations,
> but they are also less essential, because the user can always use
> ForeignPtr and bits-atomic in this case, and I believe for our concurrent
> data structures we want to store arbitrary pointers (hence casArray#).
>

This is true, although using bits-atomic does a function call (i.e the
calls are not inlined), which would be pretty bad for performance.

G
-- 
Gregory Collins 
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-29 Thread Ryan Newton
On Thu, Mar 29, 2012 at 9:01 AM, Gregory Collins wrote:

> On Thu, Mar 29, 2012 at 6:57 AM, Ryan Newton  wrote:
>
>> The ByteArray versions will be more annoying, requiring more variations,
>> but they are also less essential, because the user can always use
>> ForeignPtr and bits-atomic in this case, and I believe for our concurrent
>> data structures we want to store arbitrary pointers (hence casArray#).
>>
>
> This is true, although using bits-atomic does a function call (i.e the
> calls are not inlined), which would be pretty bad for performance.


Yes, absolutely... I'd like to add the byte array versions.  Actually,
those don't have GC write barriers so they should be much easier to get
right!
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-29 Thread Ryan Newton
>
> I think so. Atomically reading and writing a single memory location
>> (which CAS does) is just a very simple transaction. But using a CAS
>> instruction should be more efficient, since STM has to maintain a
>> transaction log and commit transactions, which creates some overhead.
>>
>
> Ah, I see. In that case, it may be worthwhile to implement the CAS
> instruction in terms of STM as well and measure the performance difference
> this makes for the final data structure. After all, STM is a lot more
> compositional than CAS, so I'd like to know whether the loss of
> expressiveness is worth the gain in performance.


There's one annoying hitch with doing apples-to-apples comparisons here.

The problem is that CAS falls short of being a hardware-accelerated version
of a simple transaction (read TVar, (==) against expected value,
conditionally update TVar), because CAS is based on pointer equality rather
than true equality.  (eq? rather than equal? for any Schemers out there.)

For this reason our "Fake" version of CAS -- for older GHCs and for
performance comparison -- has to use reallyUnsafePointerEquality#:


http://hackage.haskell.org/packages/archive/IORefCAS/0.2/doc/html/Data-CAS-Internal-Fake.html

But we do provide a "CASable" type class in that package which is precisely
for comparing the performance of:

   - Hardware CAS
   - Fake CAS -- atomicModifyIORef + ptrEquality
   - Foreign CAS -- gcc intrinsic + function call

Cheers,
   -Ryan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-03-29 Thread Tim Harris (RESEARCH)
Hi,

Somewhat related to this...

Next month we have a paper coming up at EuroSys about a middle-ground between 
using STM and programming directly with CAS:

http://research.microsoft.com/en-us/um/people/tharris/papers/2012-eurosys.pdf

This was done in the context of shared memory data structures in C/C++, rather 
than Haskell. 

It might be interesting to see how the results carry over to Haskell, e.g. 
adding short forms of specialized transactions that interact correctly with 
normal STM-Haskell transactions.

In the paper we have some examples of using short specialized transactions for 
the fast paths in data structures, while keeping the full STM available as a 
fall-back for expressing the cases that cannot be implemented using short 
transactions.

 --Tim




-Original Message-
From: haskell-cafe-boun...@haskell.org 
[mailto:haskell-cafe-boun...@haskell.org] On Behalf Of Heinrich Apfelmus
Sent: 29 March 2012 13:30
To: haskell-cafe@haskell.org
Subject: Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

Florian Hartwig wrote:
> Heinrich Apfelmus wrote:
> 
> So while the two are related, CAS is a machine primitive that works 
> for a single operation and on a single word while STM is a software 
> abstraction that isolates sequences of operations on multiple memory 
> locations from each other.
> 
>> Is it possible to implement every data structure based on CAS in 
>> terms of STM? What are the drawbacks? The other way round?
> 
> I think so. Atomically reading and writing a single memory location 
> (which CAS does) is just a very simple transaction. But using a CAS 
> instruction should be more efficient, since STM has to maintain a 
> transaction log and commit transactions, which creates some overhead.

Ah, I see. In that case, it may be worthwhile to implement the CAS instruction 
in terms of STM as well and measure the performance difference this makes for 
the final data structure. After all, STM is a lot more compositional than CAS, 
so I'd like to know whether the loss of expressiveness is worth the gain in 
performance.


Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-04-05 Thread Ben
perhaps it is too late to suggest things for GSOC --

but stephen tetley on a different thread pointed at aaron turon's work, which 
there's a very interesting new concurrency framework he calls "reagents" which 
seems to give the best of all worlds : it is declarative and compositional like 
STM, but gives performance akin to hand-coded lock-free data structures.  he 
seems to have straddled the duality of isolation vs message-passing nicely, and 
can subsume things like actors and the join calculus.

http://www.ccs.neu.edu/home/turon/reagents.pdf

he has a BSD licensed library in scala at

https://github.com/aturon/ChemistrySet

if someone doesn't want to pick this up for GSOC i might have a hand at 
implementing it myself.

b

On Mar 29, 2012, at 6:46 AM, Tim Harris (RESEARCH) wrote:

> Hi,
> 
> Somewhat related to this...
> 
> Next month we have a paper coming up at EuroSys about a middle-ground between 
> using STM and programming directly with CAS:
> 
> http://research.microsoft.com/en-us/um/people/tharris/papers/2012-eurosys.pdf
> 
> This was done in the context of shared memory data structures in C/C++, 
> rather than Haskell. 
> 
> It might be interesting to see how the results carry over to Haskell, e.g. 
> adding short forms of specialized transactions that interact correctly with 
> normal STM-Haskell transactions.
> 
> In the paper we have some examples of using short specialized transactions 
> for the fast paths in data structures, while keeping the full STM available 
> as a fall-back for expressing the cases that cannot be implemented using 
> short transactions.
> 
> --Tim
> 
> 
> 
> 
> -Original Message-
> From: haskell-cafe-boun...@haskell.org 
> [mailto:haskell-cafe-boun...@haskell.org] On Behalf Of Heinrich Apfelmus
> Sent: 29 March 2012 13:30
> To: haskell-cafe@haskell.org
> Subject: Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures
> 
> Florian Hartwig wrote:
>> Heinrich Apfelmus wrote:
>> 
>> So while the two are related, CAS is a machine primitive that works 
>> for a single operation and on a single word while STM is a software 
>> abstraction that isolates sequences of operations on multiple memory 
>> locations from each other.
>> 
>>> Is it possible to implement every data structure based on CAS in 
>>> terms of STM? What are the drawbacks? The other way round?
>> 
>> I think so. Atomically reading and writing a single memory location 
>> (which CAS does) is just a very simple transaction. But using a CAS 
>> instruction should be more efficient, since STM has to maintain a 
>> transaction log and commit transactions, which creates some overhead.
> 
> Ah, I see. In that case, it may be worthwhile to implement the CAS 
> instruction in terms of STM as well and measure the performance difference 
> this makes for the final data structure. After all, STM is a lot more 
> compositional than CAS, so I'd like to know whether the loss of 
> expressiveness is worth the gain in performance.
> 
> 
> Best regards,
> Heinrich Apfelmus
> 
> --
> http://apfelmus.nfshost.com
> 
> 
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
> 
> 
> 
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-04-05 Thread Ben Gamari
Ben  writes:

> perhaps it is too late to suggest things for GSOC --
>
> but stephen tetley on a different thread pointed at aaron turon's
> work, which there's a very interesting new concurrency framework he
> calls "reagents" which seems to give the best of all worlds : it is
> declarative and compositional like STM, but gives performance akin to
> hand-coded lock-free data structures.  he seems to have straddled the
> duality of isolation vs message-passing nicely, and can subsume things
> like actors and the join calculus.
>
> http://www.ccs.neu.edu/home/turon/reagents.pdf
>
> he has a BSD licensed library in scala at
>
> https://github.com/aturon/ChemistrySet
>
> if someone doesn't want to pick this up for GSOC i might have a hand
> at implementing it myself.
>
Keep use in the loop if you do. I have a very nice application that has
been needing a nicer approach to concurrency than IORefs but
really can't afford STM.

Cheers,

- Ben


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-04-05 Thread Ryan Newton
+1 -- the reagents model is interesting and it would be good to see a
Haskell implementation.


On Thu, Apr 5, 2012 at 3:05 PM, Ben Gamari  wrote:

> Ben  writes:
>
> > perhaps it is too late to suggest things for GSOC --
> >
> > but stephen tetley on a different thread pointed at aaron turon's
> > work, which there's a very interesting new concurrency framework he
> > calls "reagents" which seems to give the best of all worlds : it is
> > declarative and compositional like STM, but gives performance akin to
> > hand-coded lock-free data structures.  he seems to have straddled the
> > duality of isolation vs message-passing nicely, and can subsume things
> > like actors and the join calculus.
> >
> > http://www.ccs.neu.edu/home/turon/reagents.pdf
> >
> > he has a BSD licensed library in scala at
> >
> > https://github.com/aturon/ChemistrySet
> >
> > if someone doesn't want to pick this up for GSOC i might have a hand
> > at implementing it myself.
> >
> Keep use in the loop if you do. I have a very nice application that has
> been needing a nicer approach to concurrency than IORefs but
> really can't afford STM.
>
> Cheers,
>
> - Ben
>
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Google Summer of Code - Lock-free data structures

2012-04-06 Thread Heinrich Apfelmus

Ben wrote:

perhaps it is too late to suggest things for GSOC --

but stephen tetley on a different thread pointed at aaron turon's
work, which there's a very interesting new concurrency framework he
calls "reagents" which seems to give the best of all worlds : it is
declarative and compositional like STM, but gives performance akin to
hand-coded lock-free data structures.  he seems to have straddled the
duality of isolation vs message-passing nicely, and can subsume
things like actors and the join calculus.

http://www.ccs.neu.edu/home/turon/reagents.pdf

he has a BSD licensed library in scala at

https://github.com/aturon/ChemistrySet

if someone doesn't want to pick this up for GSOC i might have a hand
at implementing it myself.


That looks great! While I didn't take the time to understand the 
concurrency model in detail, the overall idea is to use arrows that can 
be run atomically


   runAtomically :: Reagent a b -> (a -> IO b)

This is very similar to STM: combining computations within the 
monad/arrow is atomic while combining computations "outside" the 
monad/arrow can interleave them.


   runAtomically (f . g)  -- atomic
   runAtomically f . runAtomically g  -- interleaving


Actually, it turns out that the  Reagent  arrow is also a monad, but the 
author seems to claim that the "static" arrow style enables certain 
optimizations. I haven't checked his model in detail to see whether this 
is really the case and how exactly it differs from STM, but we know that 
situations like this happen for parser combinators. Maybe it's enough to 
recast reagents as an applicative functor?


To summarize: the way I understand it is that it's apparently possible 
to improve the STM monad by turning it into an arrow. (I refer to "STM" 
in a very liberal sense here: whether memory is transactional or not is 
unimportant, the only thing that matters is a computation that composes 
atomically.)



Best regards,
Heinrich Apfelmus

--
http://apfelmus.nfshost.com


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe