---------- Forwarded message ----------
From: Dmitriy Lyubimov <[email protected]>
Date: Thu, Jun 11, 2015 at 12:05 PM
Subject: Re: [mahout] Mahout 0.10.x ora 0515 MAHOUT-1660 MAHOUT-1713 MAHOUT-1714 MAHOUT-1715 MAHOUT-1716 MAHOUT-1717 MAHOUT-1718 MAHOUT-1719 MAHOUT-1720 MAHOUT-1721 MAHOUT-1722 MAHOUT-1723 MAHOUT-1724 MAHOUT-1725 MAHOUT-1726 MAHOUT-1727 MAHOUT-1728 MAHOUT-1729 MAHOUT-1730 MAHOUT-1731 MAHOUT-1732 (#135)
To: apache/mahout <reply+0007fbffaee3ec1297f829d4cef35d71fe241e110a902c0192cf0000000111919b2592a170ce01ec2...@reply.github.com>

Yes. It lazily puts the input into the cache, with MEMORY_ONLY, if the input has not been cached yet, so as to prevent partition recomputation during multiple passes over the input. If the input is already in the cache (put there before the call), then it has no additional effect.

I was thinking about the situation where functions need to go over their inputs multiple times, and decided that they do need to take the initiative if it has not been taken yet, since the user has no idea when an input is going to be needed more than once. Otherwise it may lead to a performance degradation that would be hard to track down.

On the other hand, it's my understanding that in Spark 1.2 unpersist is now reference-queue-aware, i.e. it will know to garbage-collect an RDD from the cache when the JVM garbage collector says there are no more references to the RDD (in our case, the checkpointed matrix reference). As to how well it works in practice, I did not investigate, but it has not caused a problem for me so far in my otherwise stressful tests.

On Thu, Jun 11, 2015 at 11:53 AM, Andrew Musselman <[email protected]> wrote:

> In math-scala/src/main/scala/org/apache/mahout/math/decompositions/DSSVD.scala
> <https://github.com/apache/mahout/pull/135#discussion_r32254971>:
>
> > @@ -43,18 +46,22 @@ object DSSVD {
> >        case (keys, blockA) =>
> >          val blockY = blockA %*% Matrices.symmetricUniformView(n, r, omegaSeed)
> >          keys -> blockY
> > -    }
> > +    }.checkpoint()
>
> This puts results into a cache?
>
> --
> Reply to this email directly or view it on GitHub
> <https://github.com/apache/mahout/pull/135/files#r32254971>.
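
For illustration only, the "cache the input if nobody has cached it yet" behaviour described in the reply at the top of this message can be modelled with a small dependency-free Scala sketch. This is a toy model, not Spark or Mahout code: `Dataset`, `cache`, `collect`, `computeCount` and `sumAndMax` are all hypothetical names invented here, and the real `checkpoint()` implementation may differ.

```scala
// Toy model only: every name below is hypothetical, not Spark or Mahout API.
object LazyCacheSketch {

  /** A dataset that is recomputed on every pass unless caching was requested. */
  final class Dataset(compute: () => Vector[Int]) {
    private var cacheRequested = false
    private var cached: Option[Vector[Int]] = None
    var computeCount = 0 // number of times the expensive computation ran

    def isCacheRequested: Boolean = cacheRequested

    /** Lazy: only records the request; data is stored on the next pass. */
    def cache(): Unit = cacheRequested = true

    /** One pass over the data, served from the cache when available. */
    def collect(): Vector[Int] = cached.getOrElse {
      computeCount += 1
      val data = compute()
      if (cacheRequested) cached = Some(data)
      data
    }
  }

  /** A multi-pass routine that takes the initiative to cache its input. */
  def sumAndMax(input: Dataset): (Int, Int) = {
    // No additional effect if the caller already requested caching.
    if (!input.isCacheRequested) input.cache()
    val sum = input.collect().sum // pass 1: computes and stores
    val max = input.collect().max // pass 2: served from the cache
    (sum, max)
  }

  def main(args: Array[String]): Unit = {
    val ds = new Dataset(() => Vector(3, 1, 4, 1, 5))
    val (sum, max) = sumAndMax(ds)
    println(s"sum=$sum max=$max computed ${ds.computeCount} time(s)")
  }
}
```

In this model the two passes inside `sumAndMax` trigger a single computation, whereas two `collect()` calls on a dataset that was never cached trigger two; that hidden second computation is the hard-to-trace slowdown the reply is trying to avoid.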
