Re: Re[6]: [Haskell-cafe] Boxed Mutable Arrays

2009-12-16 Thread Don Stewart
brad.larsen:
 On Tue, Dec 15, 2009 at 11:33 AM, Serguey Zefirov sergu...@gmail.com wrote:
  If the number of buckets was fixed, one could use an array of STRefs
  to lists.  I believe this would avoid the bug from Ticket #650?
  now i see what you mean. no, i mean trivial transformation. #650 says
  about slow GC. why it's slow? because once you made any update to the
  array, the entire array is marked as updated and scanned on next minor GC
  (which occurs after every 512 kbytes allocated, afaik). let's replace
  big array (say, of 10,000 elements) with array of 100 arrays of 100
  elements each. now, between minor GCs only some of arrays will be
  changed and entire amount of memory to be scanned will become less
 
  Data.IntMap?
 
 
 I want to implement a more-or-less traditional, mutable, imperative
 hash table in the ST monad.  Hence, I'm not considering Data.IntMap
 and other persistent tree structures for its implementation, although
 I have thought about it.
 
 The bug described in Ticket #650, AFAICS, prevents implementation of a
 reasonable, generic hash table in Haskell.  :-(

You can certainly implement it, it just requires that you increase the
heap size to a bit bigger than your hash to reduce the pressure on the
GC.

But note, Simon's making progress,
http://hackage.haskell.org/trac/ghc/ticket/650#comment:17

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[6]: [Haskell-cafe] Boxed Mutable Arrays

2009-12-16 Thread Brad Larsen
On Wed, Dec 16, 2009 at 11:27 AM, Don Stewart d...@galois.com wrote:
[...]
 The bug described in Ticket #650, AFAICS, prevents implementation of a
 reasonable, generic hash table in Haskell.  :-(

 You can certainly implement it, it just requires that you increase the
 heap size to a bit bigger than your hash to reduce the pressure on the
 GC.

 But note, Simon's making progress,
    http://hackage.haskell.org/trac/ghc/ticket/650#comment:17

 -- Don

Nice!  Even if it is only an improvement in the constant factor, and
not a genuine fix for the problem, it sounds like a big improvement.

Sincerely,
Brad
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[6]: [Haskell-cafe] Boxed Mutable Arrays

2009-12-15 Thread Serguey Zefirov
 If the number of buckets was fixed, one could use an array of STRefs
 to lists.  I believe this would avoid the bug from Ticket #650?
 now i see what you mean. no, i mean trivial transformation. #650 says
 about slow GC. why it's slow? because once you made any update to the
 array, the entire array is marked as updated and scanned on next minor GC
 (which occurs after every 512 kbytes allocated, afaik). let's replace
 big array (say, of 10,000 elements) with array of 100 arrays of 100
 elements each. now, between minor GCs only some of arrays will be
 changed and entire amount of memory to be scanned will become less

Data.IntMap?
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[6]: [Haskell-cafe] Boxed Mutable Arrays

2009-12-15 Thread Brad Larsen
On Tue, Dec 15, 2009 at 11:33 AM, Serguey Zefirov sergu...@gmail.com wrote:
 If the number of buckets was fixed, one could use an array of STRefs
 to lists.  I believe this would avoid the bug from Ticket #650?
 now i see what you mean. no, i mean trivial transformation. #650 says
 about slow GC. why it's slow? because once you made any update to the
 array, the entire array is marked as updated and scanned on next minor GC
 (which occurs after every 512 kbytes allocated, afaik). let's replace
 big array (say, of 10,000 elements) with array of 100 arrays of 100
 elements each. now, between minor GCs only some of arrays will be
 changed and entire amount of memory to be scanned will become less

 Data.IntMap?


I want to implement a more-or-less traditional, mutable, imperative
hash table in the ST monad.  Hence, I'm not considering Data.IntMap
and other persistent tree structures for its implementation, although
I have thought about it.

The bug described in Ticket #650, AFAICS, prevents implementation of a
reasonable, generic hash table in Haskell.  :-(
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[6]: [Haskell-cafe] Boxed Mutable Arrays

2009-12-15 Thread Serguey Zefirov
 now i see what you mean. no, i mean trivial transformation. #650 says
 about slow GC. why it's slow? because once you made any update to the
 array, the entire array is marked as updated and scanned on next minor GC
 (which occurs after every 512 kbytes allocated, afaik). let's replace
 big array (say, of 10,000 elements) with array of 100 arrays of 100
 elements each. now, between minor GCs only some of arrays will be
 changed and entire amount of memory to be scanned will become less
 Data.IntMap?
 I want to implement a more-or-less traditional, mutable, imperative
 hash table in the ST monad.  Hence, I'm not considering Data.IntMap
 and other persistent tree structures for its implementation, although
 I have thought about it.
 The bug described in Ticket #650, AFAICS, prevents implementation of a
 reasonable, generic hash table in Haskell.  :-(

Data.IntMap is just a limit of what Bulat suggested.

So what was you thoughts about Data.IntMap in mutable hashmap?
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[6]: [Haskell-cafe] Boxed Mutable Arrays

2009-12-15 Thread Brad Larsen
Serguey,

On Tue, Dec 15, 2009 at 1:07 PM, Serguey Zefirov sergu...@gmail.com wrote:
 now i see what you mean. no, i mean trivial transformation. #650 says
 about slow GC. why it's slow? because once you made any update to the
 array, the entire array is marked as updated and scanned on next minor GC
 (which occurs after every 512 kbytes allocated, afaik). let's replace
 big array (say, of 10,000 elements) with array of 100 arrays of 100
 elements each. now, between minor GCs only some of arrays will be
 changed and entire amount of memory to be scanned will become less
 Data.IntMap?
 I want to implement a more-or-less traditional, mutable, imperative
 hash table in the ST monad.  Hence, I'm not considering Data.IntMap
 and other persistent tree structures for its implementation, although
 I have thought about it.
 The bug described in Ticket #650, AFAICS, prevents implementation of a
 reasonable, generic hash table in Haskell.  :-(

 Data.IntMap is just a limit of what Bulat suggested.

 So what was you thoughts about Data.IntMap in mutable hashmap?


I have considered using Data.IntMap to implement a sort of faux hash
table, e.g., introduce a Hashable class, and then use an IntMap to
lists of Hashable.  The result would be a pure, persistent ``hash
table''.  In such a case, however, lookup/insert/delete operations
would have average complexity logarithmic in the number of buckets.

Similarly, one could use an IntMap of STRefs to lists of items, but
that is bizarre.  If the number of buckets is fixed, it would be
better to use an immutable array of STRefs to lists.

One problem with using an IntMap of STRefs to lists or an immutable
array of STRefs to lists as part of a hash table implementation is
that you get an added indirection compared to a boxed STArray of
lists, which would be bad for performance.

To reiterate my desire:  I want an efficient, generic, mutable hash
table data structure for instances where one is already working in ST.
 This data structure should be written in regular Haskell, without
resorting to the FFI.  In cases where one is already writing code
using mutable state, a good hash table implementation could perform
better in terms of time  memory than IntMap, Map, etc.

Sincerely,
Brad
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe