Very nice... How would you feel about writing some docs on this? tg
On Tue, Jul 21, 2020 at 1:54 AM Baeriswyl Kuno SBB CFF FFS (Extern) <
kuno.baeris...@sbb.ch> wrote:

> Hello Andrew,
>
> Thanks for your hint.
>
> Yes, that's the way I found too.
>
> def createIndexMap(x : CheckpointedDrm[Int]) : RDD[(Int, Int)] = {
>   // keep only the keys of rows whose first element is positive
>   val xIndexFiltered = x.rdd
>     .filter(r => r._2.get(0) > 0)
>     .map(r => r._1)
>
>   // pair each surviving old row key with its new index
>   xIndexFiltered.zipWithIndex
>     .map(r => (r._1, r._2.toInt))
> }
>
> First, I filter the DRM and create a map from old to new indexes, as you
> mentioned.
>
> By applying joins with this index map, I can reduce the rows in my DRM
> according to a certain condition, do some more calculation, and map the
> newly calculated values back to the original DRM.
>
> Like:
>
> def mergeDrm(drmOrig : CheckpointedDrm[Int], drmFiltriert : CheckpointedDrm[Int],
>              indexMapping : RDD[(Int, Int)]) : CheckpointedDrm[Int] = {
>   drmWrap (
>     drmOrig.rdd
>       .map(r => Pair(r._1, r._2))
>       // attach the new index (if any) to every original row
>       .leftOuterJoin(indexMapping.map(r => Pair(r._1, r._2)))
>       // re-key by the optional new index
>       .map(r => Pair(r._2._2, (r._1, r._2._1)))
>       // join with the filtered DRM on the new index
>       .leftOuterJoin(drmFiltriert.rdd.map(r => Pair(Option(r._1), r._2)))
>       // take the recalculated row where one exists, else keep the original
>       .map(r => (r._2._1._1, r._2._2.getOrElse(r._2._1._2)))
>   )
> }
>
> Regards
>
> Kuno
>
> -----Original Message-----
> From: Andrew Musselman <andrew.mussel...@gmail.com>
> Sent: Tuesday, July 7, 2020 23:16
> To: user@mahout.apache.org
> Subject: Re: How to do logical subsetting in Mathout
>
> Kuno, thanks for your note. I don't know of an equivalent function out of
> the box, but if you want to get the indices where a condition is true you
> could try something in Scala like:
>
> myList.zipWithIndex.collect { case (item, index) if item > 1 => index }
>
> Hope this is helpful.
>
> On Wed, Jun 10, 2020 at 2:53 AM Baeriswyl Kuno SBB CFF FFS (Extern) <
> kuno.baeris...@sbb.ch> wrote:
>
> > Hi all,
> >
> > I've bumped into Mahout because I need to migrate an R script that
> > includes matrix algebra to a Spark cluster.
> >
> > Mahout's Scala/Spark bindings provide all of the operations except
> > logical subsetting.
> >
> > Example:
> >
> > x1 = c(1.0,4.0,2.0,5.0)
> > x2 = c(0,0,0,0)
> > x2[x1 > 1] = 2
> >
> > This would set the value 2 in rows 2, 3 and 4.
> >
> > Is there an equivalent function in Mahout?
> >
> > Thanks.
> >
> > Kuno
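
For anyone reading this thread later: the filter, re-index and merge-back steps discussed above can be tried out on plain Scala collections before moving to DRMs. The following is only a minimal sketch with no Spark or Mahout involved. The object name `LogicalSubsetting`, the helper `mergeBack`, and the collection-based signature of `createIndexMap` are made up for illustration and are not part of the Mahout API or of the code posted above.

```scala
// Plain-Scala sketch (no Spark, no Mahout) of the filter / re-index / merge-back idea.
// All names here are hypothetical and chosen for illustration only.
object LogicalSubsetting {

  // Map each original (zero-based) row index that satisfies `keep`
  // to a new dense index, in order of appearance.
  def createIndexMap(values: Seq[Double])(keep: Double => Boolean): Map[Int, Int] =
    values.zipWithIndex
      .collect { case (v, oldIdx) if keep(v) => oldIdx }
      .zipWithIndex
      .toMap

  // Write the recalculated values back into their original positions;
  // rows that were filtered out keep their original value.
  def mergeBack(orig: Seq[Double], updated: Seq[Double], indexMap: Map[Int, Int]): Seq[Double] =
    orig.zipWithIndex.map { case (v, oldIdx) =>
      indexMap.get(oldIdx).map(newIdx => updated(newIdx)).getOrElse(v)
    }

  // The R example from the original question.
  val x1: Seq[Double] = Seq(1.0, 4.0, 2.0, 5.0)
  val x2: Seq[Double] = Seq(0.0, 0.0, 0.0, 0.0)

  // Rows where x1 > 1.
  val indexMap: Map[Int, Int] = createIndexMap(x1)(_ > 1)

  // Stand-in for the "more calculation" step: here simply the constant 2.
  val updated: Seq[Double] = Seq.fill(indexMap.size)(2.0)

  // Counterpart of x2[x1 > 1] = 2.
  val result: Seq[Double] = mergeBack(x2, updated, indexMap)
}
```

Note that the indices here are zero-based, so R's rows 2, 3 and 4 show up as old indices 1, 2 and 3. In the distributed version, the `Map` lookup is what the two `leftOuterJoin` calls do.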