Very nice... How would you feel about writing some docs on this?

tg


On Tue, Jul 21, 2020 at 1:54 AM Baeriswyl Kuno SBB CFF FFS (Extern) <
kuno.baeris...@sbb.ch> wrote:

> Hello Andrew,
> thanks for your hint.
>
> Yes, that's the way I found too.
>
> // Map each original row key whose first element is positive to a new,
> // consecutive row index.
> def createIndexMap(x: CheckpointedDrm[Int]): RDD[(Int, Int)] = {
>   val xIndexFiltered = x.rdd
>     .filter(r => r._2.get(0) > 0)
>     .map(r => r._1)
>
>   xIndexFiltered.zipWithIndex
>     .map(r => (r._1, r._2.toInt))
> }
>
> First, I filter the DRM and create a map with old and new indexes, as you
> mentioned.
>
> By joining against this index map, I can reduce the rows in my DRM
> according to a certain condition, do some more calculations and map the
> newly calculated values back to the original DRM.
>
> Like:
> def mergeDrm(drmOrig: CheckpointedDrm[Int],
>              drmFiltriert: CheckpointedDrm[Int],
>              indexMapping: RDD[(Int, Int)]): CheckpointedDrm[Int] = {
>   drmWrap(
>     drmOrig.rdd
>       // attach the new index (if any) to every original row
>       .leftOuterJoin(indexMapping)
>       // re-key by the new index, keeping the original key and row
>       .map(r => (r._2._2, (r._1, r._2._1)))
>       .leftOuterJoin(drmFiltriert.rdd.map(r => (Option(r._1), r._2)))
>       // take the recalculated row where one exists, else the original row
>       .map(r => (r._2._1._1, r._2._2.getOrElse(r._2._1._2)))
>   )
> }
>
> Regards
>
> Kuno
>
>
>
> -----Original Message-----
> From: Andrew Musselman <andrew.mussel...@gmail.com>
> Sent: Tuesday, July 7, 2020 23:16
> To: user@mahout.apache.org
> Subject: Re: How to do logical subsetting in Mathout
>
> Kuno, thanks for your note. I don't know of an equivalent function out of
> the box, but if you want to get the indices where a condition is true you
> could try something in Scala like:
>
> myList.zipWithIndex.collect { case (item, index) if item > 1 => index }
>
> Hope this is helpful.
>
> On Wed, Jun 10, 2020 at 2:53 AM Baeriswyl Kuno SBB CFF FFS (Extern) <
> kuno.baeris...@sbb.ch> wrote:
>
> > Hi all,
> >
> > I've bumped into Mahout because I need to migrate an R script
> > that includes matrix algebra to a Spark cluster.
> >
> > Mahout's Scala/Spark bindings provide all of the operations I need,
> > except for logical subsetting.
> >
> > Example:
> >
> > x1 = c(1.0,4.0,2.0,5.0)
> > x2 = c(0,0,0,0)
> > x2[x1 > 1] = 2
> >
> > This sets the value 2 in rows 2, 3 and 4 of x2.
> >
> > Is there an equivalent function in Mahout?
> >
> >
> > Thanks.
> >
> > Kuno
> >
> >
>
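
For readers who want to try the idea from this thread without a cluster, here is a minimal sketch of the R expression x2[x1 > 1] = 2 on plain Scala collections, built on the zipWithIndex hint quoted above. It is an illustration only and not part of Mahout: the object name LogicalSubsetting and the helpers indicesWhere and assignWhere are hypothetical, and no Mahout or Spark calls are used.

```scala
// Hypothetical helper object for illustration; plain Scala collections
// stand in for a distributed DRM here.
object LogicalSubsetting {
  // Zero-based indices at which the predicate holds (the zipWithIndex hint).
  def indicesWhere(xs: Seq[Double])(p: Double => Boolean): Seq[Int] =
    xs.zipWithIndex.collect { case (item, index) if p(item) => index }

  // Analogue of R's target[p(cond)] = value: overwrite the entries of target
  // at every position where the matching entry of cond satisfies p.
  def assignWhere(target: Seq[Double], cond: Seq[Double], value: Double)
                 (p: Double => Boolean): Seq[Double] = {
    require(target.length == cond.length, "vectors must have equal length")
    target.zip(cond).map { case (t, c) => if (p(c)) value else t }
  }

  // The vectors from the original question.
  val x1: Seq[Double] = Seq(1.0, 4.0, 2.0, 5.0)
  val x2: Seq[Double] = Seq(0.0, 0.0, 0.0, 0.0)

  // Positions where x1 > 1, and x2 after the masked assignment.
  val hits: Seq[Int] = indicesWhere(x1)(_ > 1)
  val updated: Seq[Double] = assignWhere(x2, x1, 2.0)(_ > 1)
}
```

Note that Scala indices are zero-based, so the indices 1, 2 and 3 in hits correspond to R's rows 2, 3 and 4. On a DRM the same masking has to be expressed through joins on the row keys, as in the createIndexMap and mergeDrm functions quoted above.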
