Re: [Haskell-cafe] weak pointers and memoization (was Re: memoization)

2009-09-18 Thread Peter Verswyvelen
I would also like to see a solution for problems like these.

Haskell provides a lot of nice memoizing / caching data structures -
like a trie - but the ones I know indeed keep growing, so no garbage
collection takes place?

It would be nice to have a data structure that performs caching but
does not grow unlimited.

I had a similar problem with stable names; it is not possible to check
if a stable name is still alive.

On Fri, Sep 18, 2009 at 1:39 AM, Rodney Price rodpr...@raytheon.com wrote:
 In my case, the results of each computation are used to generate a node
 in a graph structure (dag).  The key, oddly, is a hash of a two-tuple
 that gets stored in the data structure after the computation of the
 node finishes.  If I don't memoize the function to build a node, the
 cost of generating the tree is exponential; if I do, it's somewhere
 between linear and quadratic.

 Another process prunes parts of this graph structure as time goes on.
 The entire data structure is intended to be persistent, lasting for
 days at a time in a server-like application.  If the parts pruned
 aren't garbage collected, the space leak will eventually be
 catastrophic.  Either the memo table or the graph structure itself will
 outgrow available memory.

 -Rod


 On Thu, 17 Sep 2009 13:32:13 -0400
 Job Vranish jvran...@gmail.com wrote:

 What are you trying to use this for? It seems to me that for memo
 tables you almost never have references to they keys outside the
 lookup table since the keys are usually computed right at the last
 minute, and then discarded (otherwise it might be easier to just
 cache stuff outside the function).

 For example with a naive fibs, the values you are passing in are
 computed, and probably don't exist before you do the recursive call,
 and then are discarded shortly afterward.

 It seems like putting a cap on the cache size, and then just
 overwriting old entries would be better.
 Am I missing something?

 - Job



 On Wed, Sep 16, 2009 at 4:48 PM, Rodney Price rodpr...@raytheon.com
 wrote:

  How does garbage collection work in an example like the one below?
  You memoize a function with some sort of lookup table, which stores
  function arguments as keys and function results as values.  As long
  as the function remains in scope, the keys in the lookup table
  remain in memory, which means that the keys themselves always
  remain reachable and they cannot be garbage collected.  Right?
 
  So what do you do in the case where you know that, after some
  period of time, some entries in the lookup table will never be
  accessed?  That is, there are no references to the keys for some
  entries remaining, except for the references in the lookup table
  itself.  You'd like to allow the memory occupied by the keys to be
  garbage collected.  Otherwise, if the function stays around for a
  long time, the size of the lookup table always grows.  How do you
  avoid the space leak?
 
  I notice that there is a function in Data.IORef,
 
  mkWeakIORef :: IORef a - IO () - IO (Weak (IORef a))
 
  which looks promising.  In the code below, however, there's only one
  IORef, so either the entire table gets garbage collected or none of
  it does.
 
  I've been reading the paper Stretching the storage manager: weak
  pointers and stable names in Haskell, which seems to answer my
  question.  When I attempt to run the memoization code in the paper
  on the simple fib example, I find that -- apparently due to lazy
  evaluation -- no new entries are entered into the lookup table, and
  therefore no lookups are ever successful!
 
  So apparently there is some interaction between lazy evaluation and
  garbage collection that I don't understand.  My head hurts.  Is it
  necessary to make the table lookup operation strict?  Or is it
  something entirely different that I am missing?
 
  -Rod
 
 
  On Thu, 10 Sep 2009 18:33:47 -0700
  Ryan Ingram ryani.s...@gmail.com wrote:
 
  
   memoIO :: Ord a = (a - b) - IO (a - IO b)
   memoIO f = do
      cache - newIORef M.empty
      return $ \x - do
          m - readIORef cache
          case M.lookup x m of
              Just y - return y
              Nothing - do let res = f x
                            writeIORef cache $ M.insert x res m
                            return res
  
   memo :: Ord a = (a - b) - (a - b)
   memo f = unsafePerformIO $ do
       fmemo - memoIO f
       return (unsafePerformIO . fmemo)
  
   I don't think there is any valid transformation that breaks this,
   since the compiler can't lift anything through unsafePerformIO.
   Am I mistaken?
  
     -- ryan
 
  ___
  Haskell-Cafe mailing list
  Haskell-Cafe@haskell.org
  http://www.haskell.org/mailman/listinfo/haskell-cafe
 

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list

Re: [Haskell-cafe] weak pointers and memoization (was Re: memoization)

2009-09-18 Thread Job Vranish
Yeah it seems like the general solution to the problem would be some sort of
map-like datastructure that you add items via a key/value pair, and if the
key gets GC'd, that entry gets removed from the structure.

I've been wanting something like this as well, but didn't know about weak
references so I didn't know if it was possible, but I think I could make
something like this now. I'll give it a shot and let you guys know how it
goes.

Rodney could you post your memo code that uses the weak references?

- Job

On Fri, Sep 18, 2009 at 7:56 AM, Peter Verswyvelen bugf...@gmail.comwrote:

 I would also like to see a solution for problems like these.

 Haskell provides a lot of nice memoizing / caching data structures -
 like a trie - but the ones I know indeed keep growing, so no garbage
 collection takes place?

 It would be nice to have a data structure that performs caching but
 does not grow unlimited.

 I had a similar problem with stable names; it is not possible to check
 if a stable name is still alive.

 On Fri, Sep 18, 2009 at 1:39 AM, Rodney Price rodpr...@raytheon.com
 wrote:
  In my case, the results of each computation are used to generate a node
  in a graph structure (dag).  The key, oddly, is a hash of a two-tuple
  that gets stored in the data structure after the computation of the
  node finishes.  If I don't memoize the function to build a node, the
  cost of generating the tree is exponential; if I do, it's somewhere
  between linear and quadratic.
 
  Another process prunes parts of this graph structure as time goes on.
  The entire data structure is intended to be persistent, lasting for
  days at a time in a server-like application.  If the parts pruned
  aren't garbage collected, the space leak will eventually be
  catastrophic.  Either the memo table or the graph structure itself will
  outgrow available memory.
 
  -Rod
 
 
  On Thu, 17 Sep 2009 13:32:13 -0400
  Job Vranish jvran...@gmail.com wrote:
 
  What are you trying to use this for? It seems to me that for memo
  tables you almost never have references to they keys outside the
  lookup table since the keys are usually computed right at the last
  minute, and then discarded (otherwise it might be easier to just
  cache stuff outside the function).
 
  For example with a naive fibs, the values you are passing in are
  computed, and probably don't exist before you do the recursive call,
  and then are discarded shortly afterward.
 
  It seems like putting a cap on the cache size, and then just
  overwriting old entries would be better.
  Am I missing something?
 
  - Job
 
 
 
  On Wed, Sep 16, 2009 at 4:48 PM, Rodney Price rodpr...@raytheon.com
  wrote:
 
   How does garbage collection work in an example like the one below?
   You memoize a function with some sort of lookup table, which stores
   function arguments as keys and function results as values.  As long
   as the function remains in scope, the keys in the lookup table
   remain in memory, which means that the keys themselves always
   remain reachable and they cannot be garbage collected.  Right?
  
   So what do you do in the case where you know that, after some
   period of time, some entries in the lookup table will never be
   accessed?  That is, there are no references to the keys for some
   entries remaining, except for the references in the lookup table
   itself.  You'd like to allow the memory occupied by the keys to be
   garbage collected.  Otherwise, if the function stays around for a
   long time, the size of the lookup table always grows.  How do you
   avoid the space leak?
  
   I notice that there is a function in Data.IORef,
  
   mkWeakIORef :: IORef a - IO () - IO (Weak (IORef a))
  
   which looks promising.  In the code below, however, there's only one
   IORef, so either the entire table gets garbage collected or none of
   it does.
  
   I've been reading the paper Stretching the storage manager: weak
   pointers and stable names in Haskell, which seems to answer my
   question.  When I attempt to run the memoization code in the paper
   on the simple fib example, I find that -- apparently due to lazy
   evaluation -- no new entries are entered into the lookup table, and
   therefore no lookups are ever successful!
  
   So apparently there is some interaction between lazy evaluation and
   garbage collection that I don't understand.  My head hurts.  Is it
   necessary to make the table lookup operation strict?  Or is it
   something entirely different that I am missing?
  
   -Rod
  
  
   On Thu, 10 Sep 2009 18:33:47 -0700
   Ryan Ingram ryani.s...@gmail.com wrote:
  
   
memoIO :: Ord a = (a - b) - IO (a - IO b)
memoIO f = do
   cache - newIORef M.empty
   return $ \x - do
   m - readIORef cache
   case M.lookup x m of
   Just y - return y
   Nothing - do let res = f x
 writeIORef cache $ M.insert x res m
 return res
   
memo 

Re: [Haskell-cafe] weak pointers and memoization (was Re: memoization)

2009-09-18 Thread Job Vranish
Hey it works :D
Here is a proof of concept:
http://gist.github.com/189104

Maybe later today I'll try to make a version that can be safely used outside
IO.

- Job


On Fri, Sep 18, 2009 at 10:19 AM, Job Vranish jvran...@gmail.com wrote:

 Yeah it seems like the general solution to the problem would be some sort
 of map-like datastructure that you add items via a key/value pair, and if
 the key gets GC'd, that entry gets removed from the structure.

 I've been wanting something like this as well, but didn't know about weak
 references so I didn't know if it was possible, but I think I could make
 something like this now. I'll give it a shot and let you guys know how it
 goes.

 Rodney could you post your memo code that uses the weak references?

 - Job


 On Fri, Sep 18, 2009 at 7:56 AM, Peter Verswyvelen bugf...@gmail.comwrote:

 I would also like to see a solution for problems like these.

 Haskell provides a lot of nice memoizing / caching data structures -
 like a trie - but the ones I know indeed keep growing, so no garbage
 collection takes place?

 It would be nice to have a data structure that performs caching but
 does not grow unlimited.

 I had a similar problem with stable names; it is not possible to check
 if a stable name is still alive.

 On Fri, Sep 18, 2009 at 1:39 AM, Rodney Price rodpr...@raytheon.com
 wrote:
  In my case, the results of each computation are used to generate a node
  in a graph structure (dag).  The key, oddly, is a hash of a two-tuple
  that gets stored in the data structure after the computation of the
  node finishes.  If I don't memoize the function to build a node, the
  cost of generating the tree is exponential; if I do, it's somewhere
  between linear and quadratic.
 
  Another process prunes parts of this graph structure as time goes on.
  The entire data structure is intended to be persistent, lasting for
  days at a time in a server-like application.  If the parts pruned
  aren't garbage collected, the space leak will eventually be
  catastrophic.  Either the memo table or the graph structure itself will
  outgrow available memory.
 
  -Rod
 
 
  On Thu, 17 Sep 2009 13:32:13 -0400
  Job Vranish jvran...@gmail.com wrote:
 
  What are you trying to use this for? It seems to me that for memo
  tables you almost never have references to they keys outside the
  lookup table since the keys are usually computed right at the last
  minute, and then discarded (otherwise it might be easier to just
  cache stuff outside the function).
 
  For example with a naive fibs, the values you are passing in are
  computed, and probably don't exist before you do the recursive call,
  and then are discarded shortly afterward.
 
  It seems like putting a cap on the cache size, and then just
  overwriting old entries would be better.
  Am I missing something?
 
  - Job
 
 
 
  On Wed, Sep 16, 2009 at 4:48 PM, Rodney Price rodpr...@raytheon.com
  wrote:
 
   How does garbage collection work in an example like the one below?
   You memoize a function with some sort of lookup table, which stores
   function arguments as keys and function results as values.  As long
   as the function remains in scope, the keys in the lookup table
   remain in memory, which means that the keys themselves always
   remain reachable and they cannot be garbage collected.  Right?
  
   So what do you do in the case where you know that, after some
   period of time, some entries in the lookup table will never be
   accessed?  That is, there are no references to the keys for some
   entries remaining, except for the references in the lookup table
   itself.  You'd like to allow the memory occupied by the keys to be
   garbage collected.  Otherwise, if the function stays around for a
   long time, the size of the lookup table always grows.  How do you
   avoid the space leak?
  
   I notice that there is a function in Data.IORef,
  
   mkWeakIORef :: IORef a - IO () - IO (Weak (IORef a))
  
   which looks promising.  In the code below, however, there's only one
   IORef, so either the entire table gets garbage collected or none of
   it does.
  
   I've been reading the paper Stretching the storage manager: weak
   pointers and stable names in Haskell, which seems to answer my
   question.  When I attempt to run the memoization code in the paper
   on the simple fib example, I find that -- apparently due to lazy
   evaluation -- no new entries are entered into the lookup table, and
   therefore no lookups are ever successful!
  
   So apparently there is some interaction between lazy evaluation and
   garbage collection that I don't understand.  My head hurts.  Is it
   necessary to make the table lookup operation strict?  Or is it
   something entirely different that I am missing?
  
   -Rod
  
  
   On Thu, 10 Sep 2009 18:33:47 -0700
   Ryan Ingram ryani.s...@gmail.com wrote:
  
   
memoIO :: Ord a = (a - b) - IO (a - IO b)
memoIO f = do
   cache - newIORef M.empty
   return $ \x - do

Re: [Haskell-cafe] weak pointers and memoization (was Re: memoization)

2009-09-17 Thread Job Vranish
What are you trying to use this for? It seems to me that for memo tables you
almost never have references to they keys outside the lookup table since the
keys are usually computed right at the last minute, and then discarded
(otherwise it might be easier to just cache stuff outside the function).

For example with a naive fibs, the values you are passing in are computed,
and probably don't exist before you do the recursive call, and then are
discarded shortly afterward.

It seems like putting a cap on the cache size, and then just overwriting old
entries would be better.
Am I missing something?

- Job



On Wed, Sep 16, 2009 at 4:48 PM, Rodney Price rodpr...@raytheon.com wrote:

 How does garbage collection work in an example like the one below?  You
 memoize a function with some sort of lookup table, which stores function
 arguments as keys and function results as values.  As long as the
 function remains in scope, the keys in the lookup table remain in
 memory, which means that the keys themselves always remain reachable
 and they cannot be garbage collected.  Right?

 So what do you do in the case where you know that, after some period of
 time, some entries in the lookup table will never be accessed?  That is,
 there are no references to the keys for some entries remaining, except
 for the references in the lookup table itself.  You'd like to allow the
 memory occupied by the keys to be garbage collected.  Otherwise, if the
 function stays around for a long time, the size of the lookup table
 always grows.  How do you avoid the space leak?

 I notice that there is a function in Data.IORef,

 mkWeakIORef :: IORef a - IO () - IO (Weak (IORef a))

 which looks promising.  In the code below, however, there's only one
 IORef, so either the entire table gets garbage collected or none of it
 does.

 I've been reading the paper Stretching the storage manager: weak
 pointers and stable names in Haskell, which seems to answer my
 question.  When I attempt to run the memoization code in the paper on
 the simple fib example, I find that -- apparently due to lazy
 evaluation -- no new entries are entered into the lookup table, and
 therefore no lookups are ever successful!

 So apparently there is some interaction between lazy evaluation and
 garbage collection that I don't understand.  My head hurts.  Is it
 necessary to make the table lookup operation strict?  Or is it
 something entirely different that I am missing?

 -Rod


 On Thu, 10 Sep 2009 18:33:47 -0700
 Ryan Ingram ryani.s...@gmail.com wrote:

 
  memoIO :: Ord a = (a - b) - IO (a - IO b)
  memoIO f = do
 cache - newIORef M.empty
 return $ \x - do
 m - readIORef cache
 case M.lookup x m of
 Just y - return y
 Nothing - do let res = f x
   writeIORef cache $ M.insert x res m
   return res
 
  memo :: Ord a = (a - b) - (a - b)
  memo f = unsafePerformIO $ do
  fmemo - memoIO f
  return (unsafePerformIO . fmemo)
 
  I don't think there is any valid transformation that breaks this,
  since the compiler can't lift anything through unsafePerformIO.  Am I
  mistaken?
 
-- ryan

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] weak pointers and memoization (was Re: memoization)

2009-09-17 Thread Rodney Price
In my case, the results of each computation are used to generate a node
in a graph structure (dag).  The key, oddly, is a hash of a two-tuple
that gets stored in the data structure after the computation of the
node finishes.  If I don't memoize the function to build a node, the
cost of generating the tree is exponential; if I do, it's somewhere
between linear and quadratic.

Another process prunes parts of this graph structure as time goes on.
The entire data structure is intended to be persistent, lasting for
days at a time in a server-like application.  If the parts pruned
aren't garbage collected, the space leak will eventually be
catastrophic.  Either the memo table or the graph structure itself will
outgrow available memory.

-Rod


On Thu, 17 Sep 2009 13:32:13 -0400
Job Vranish jvran...@gmail.com wrote:

 What are you trying to use this for? It seems to me that for memo
 tables you almost never have references to they keys outside the
 lookup table since the keys are usually computed right at the last
 minute, and then discarded (otherwise it might be easier to just
 cache stuff outside the function).
 
 For example with a naive fibs, the values you are passing in are
 computed, and probably don't exist before you do the recursive call,
 and then are discarded shortly afterward.
 
 It seems like putting a cap on the cache size, and then just
 overwriting old entries would be better.
 Am I missing something?
 
 - Job
 
 
 
 On Wed, Sep 16, 2009 at 4:48 PM, Rodney Price rodpr...@raytheon.com
 wrote:
 
  How does garbage collection work in an example like the one below?
  You memoize a function with some sort of lookup table, which stores
  function arguments as keys and function results as values.  As long
  as the function remains in scope, the keys in the lookup table
  remain in memory, which means that the keys themselves always
  remain reachable and they cannot be garbage collected.  Right?
 
  So what do you do in the case where you know that, after some
  period of time, some entries in the lookup table will never be
  accessed?  That is, there are no references to the keys for some
  entries remaining, except for the references in the lookup table
  itself.  You'd like to allow the memory occupied by the keys to be
  garbage collected.  Otherwise, if the function stays around for a
  long time, the size of the lookup table always grows.  How do you
  avoid the space leak?
 
  I notice that there is a function in Data.IORef,
 
  mkWeakIORef :: IORef a - IO () - IO (Weak (IORef a))
 
  which looks promising.  In the code below, however, there's only one
  IORef, so either the entire table gets garbage collected or none of
  it does.
 
  I've been reading the paper Stretching the storage manager: weak
  pointers and stable names in Haskell, which seems to answer my
  question.  When I attempt to run the memoization code in the paper
  on the simple fib example, I find that -- apparently due to lazy
  evaluation -- no new entries are entered into the lookup table, and
  therefore no lookups are ever successful!
 
  So apparently there is some interaction between lazy evaluation and
  garbage collection that I don't understand.  My head hurts.  Is it
  necessary to make the table lookup operation strict?  Or is it
  something entirely different that I am missing?
 
  -Rod
 
 
  On Thu, 10 Sep 2009 18:33:47 -0700
  Ryan Ingram ryani.s...@gmail.com wrote:
 
  
   memoIO :: Ord a = (a - b) - IO (a - IO b)
   memoIO f = do
  cache - newIORef M.empty
  return $ \x - do
  m - readIORef cache
  case M.lookup x m of
  Just y - return y
  Nothing - do let res = f x
writeIORef cache $ M.insert x res m
return res
  
   memo :: Ord a = (a - b) - (a - b)
   memo f = unsafePerformIO $ do
   fmemo - memoIO f
   return (unsafePerformIO . fmemo)
  
   I don't think there is any valid transformation that breaks this,
   since the compiler can't lift anything through unsafePerformIO.
   Am I mistaken?
  
 -- ryan
 
  ___
  Haskell-Cafe mailing list
  Haskell-Cafe@haskell.org
  http://www.haskell.org/mailman/listinfo/haskell-cafe
 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] weak pointers and memoization (was Re: memoization)

2009-09-16 Thread Rodney Price
How does garbage collection work in an example like the one below?  You
memoize a function with some sort of lookup table, which stores function
arguments as keys and function results as values.  As long as the
function remains in scope, the keys in the lookup table remain in
memory, which means that the keys themselves always remain reachable
and they cannot be garbage collected.  Right?

So what do you do in the case where you know that, after some period of
time, some entries in the lookup table will never be accessed?  That is,
there are no references to the keys for some entries remaining, except
for the references in the lookup table itself.  You'd like to allow the
memory occupied by the keys to be garbage collected.  Otherwise, if the
function stays around for a long time, the size of the lookup table
always grows.  How do you avoid the space leak?

I notice that there is a function in Data.IORef,

mkWeakIORef :: IORef a - IO () - IO (Weak (IORef a))

which looks promising.  In the code below, however, there's only one
IORef, so either the entire table gets garbage collected or none of it
does.

I've been reading the paper Stretching the storage manager: weak
pointers and stable names in Haskell, which seems to answer my
question.  When I attempt to run the memoization code in the paper on
the simple fib example, I find that -- apparently due to lazy
evaluation -- no new entries are entered into the lookup table, and
therefore no lookups are ever successful!

So apparently there is some interaction between lazy evaluation and
garbage collection that I don't understand.  My head hurts.  Is it
necessary to make the table lookup operation strict?  Or is it
something entirely different that I am missing?

-Rod


On Thu, 10 Sep 2009 18:33:47 -0700
Ryan Ingram ryani.s...@gmail.com wrote:

 
 memoIO :: Ord a = (a - b) - IO (a - IO b)
 memoIO f = do
cache - newIORef M.empty
return $ \x - do
m - readIORef cache
case M.lookup x m of
Just y - return y
Nothing - do let res = f x
  writeIORef cache $ M.insert x res m
  return res
 
 memo :: Ord a = (a - b) - (a - b)
 memo f = unsafePerformIO $ do
 fmemo - memoIO f
 return (unsafePerformIO . fmemo)
 
 I don't think there is any valid transformation that breaks this,
 since the compiler can't lift anything through unsafePerformIO.  Am I
 mistaken?
 
   -- ryan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe