Dmitriy Lyubimov created MAHOUT-1597:
----------------------------------------

             Summary: A + 1.0 (element-wise scalar operation) gives wrong result 
if rdd is missing rows, Spark side
                 Key: MAHOUT-1597
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1597
             Project: Mahout
          Issue Type: Bug
    Affects Versions: 0.9
            Reporter: Dmitriy Lyubimov
            Assignee: Dmitriy Lyubimov
             Fix For: 1.0


{code}
    // In-core equivalent of drmA; rows 1 and 2 are implied all-zero rows.
    // (Assumed definition -- inCoreA was referenced but not defined in the snippet.)
    val inCoreA = dense((1, 2, 3), (0, 0, 0), (0, 0, 0), (3, 4, 5))

    // Concoct an rdd with missing rows
    val aRdd: DrmRdd[Int] = sc.parallelize(
      0 -> dvec(1, 2, 3) ::
          3 -> dvec(3, 4, 5) :: Nil
    ).map { case (key, vec) => key -> (vec: Vector) }

    val drmA = drmWrap(rdd = aRdd)

    val controlB = inCoreA + 1.0

    val drmB = drmA + 1.0

    // Compare the collected result against the in-core control
    (drmB.collect - controlB).norm should be < 1e-10
{code}

The check above should not fail.

It was failing because the elementwise scalar operator evaluates only the rows physically present in the dataset.

In the case of Int-keyed row matrices, there may be implied rows that are not present in the RDD (here, rows 1 and 2, which are implicitly all zeros).

Our goal is to detect this condition and materialize the missing rows before applying physical operators that do not handle implied missing rows (for A + 1.0, every implied zero row must become a row of ones).
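The mechanics of the bug can be modeled outside Mahout with plain Python dicts (a sketch of the failure mode, not the actual Spark-side fix; the names and layout here are illustrative only):

```python
# A row-keyed matrix stored sparsely: keys 1 and 2 are implied all-zero rows.
present_rows = {0: [1.0, 2.0, 3.0], 3: [3.0, 4.0, 5.0]}
nrow, ncol = 4, 3

# Buggy behavior: the elementwise operator maps only over rows that
# physically exist, so implied rows never receive the +1.0.
buggy = {k: [x + 1.0 for x in v] for k, v in present_rows.items()}

# Correct behavior: materialize every implied row (all zeros) first,
# then apply the scalar operation.
correct = {
    k: [x + 1.0 for x in present_rows.get(k, [0.0] * ncol)]
    for k in range(nrow)
}

print(sorted(buggy))   # [0, 3] -- rows 1 and 2 are missing entirely
print(correct[1])      # [1.0, 1.0, 1.0]
```

After the fix, an implied zero row correctly becomes a row of ones, which is what the in-core control matrix produces.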



--
This message was sent by Atlassian JIRA
(v6.2#6252)
