Re: sc.makeRDD bug with NumericRange
Looks like NumericRange in Scala is just a joke.

scala> val x = 0.0 to 1.0 by 0.1
x: scala.collection.immutable.NumericRange[Double] = NumericRange(0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999)

scala> x.take(3)
res1: scala.collection.immutable.NumericRange[Double] = NumericRange(0.0, 0.1, 0.2)

scala> x.drop(3)
res2: scala.collection.immutable.NumericRange[Double] = NumericRange(0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999)

So far so good.

scala> x.drop(3).take(3)
res3: scala.collection.immutable.NumericRange[Double] = NumericRange(0.30000000000000004, 0.4)

Why only two values? Where's 0.5?

scala> x.drop(6)
res4: scala.collection.immutable.NumericRange[Double] = NumericRange(0.6000000000000001, 0.7000000000000001, 0.8, 0.9)

And where did the last value disappear now?

You have to approach Scala with a healthy amount of distrust. You're on the right track with toArray.

On Fri, Apr 18, 2014 at 8:01 PM, Mark Hamstra <m...@clearstorydata.com> wrote:

> Please file an issue: Spark Project JIRA: https://issues.apache.org/jira/browse/SPARK
>
> On Fri, Apr 18, 2014 at 10:25 AM, Aureliano Buendia <buendia...@gmail.com> wrote:
>
>> Hi,
>>
>> I just noticed that sc.makeRDD() does not include all values when given input of type NumericRange. Try this in the Spark shell:
>>
>> $ MASTER=local[4] bin/spark-shell
>> scala> sc.makeRDD(0.0 to 1 by 0.1).collect().length
>> 8
>>
>> The expected length is 11.
>>
>> This works correctly when launching Spark with only one core:
>>
>> $ MASTER=local[1] bin/spark-shell
>> scala> sc.makeRDD(0.0 to 1 by 0.1).collect().length
>> 11
>>
>> This also works correctly when using toArray():
>>
>> $ MASTER=local[4] bin/spark-shell
>> scala> sc.makeRDD((0.0 to 1 by 0.1).toArray).collect().length
>> 11
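(For context on why local[4] specifically loses values: Spark splits the input Seq positionally into one slice per partition, and that slicing goes through the collection's own slice/drop/take, so a NumericRange that miscounts under drop and take silently loses elements. Below is a simplified sketch of positional slicing in the spirit of Spark's ParallelCollectionRDD, not its actual code; with a well-behaved Seq nothing is lost.)

```scala
// Simplified sketch (not Spark's actual implementation): partition i
// receives the elements in [i*len/n, (i+1)*len/n).
def slice[T](seq: Seq[T], numSlices: Int): Seq[Seq[T]] =
  (0 until numSlices).map { i =>
    val start = ((i.toLong * seq.length) / numSlices).toInt
    val end   = (((i + 1).toLong * seq.length) / numSlices).toInt
    seq.slice(start, end)
  }

// With a plain Vector-backed Seq, all 11 elements survive 4-way slicing:
val xs = (0 to 10).map(_ * 0.1)      // stand-in for 0.0 to 1.0 by 0.1
val total = slice(xs, 4).map(_.size).sum  // 11
```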
Re: sc.makeRDD bug with NumericRange
To make up for mocking Scala, I've filed a bug (https://issues.scala-lang.org/browse/SI-8518) and will try to patch this.

On Fri, Apr 18, 2014 at 9:24 PM, Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:
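(For anyone curious about the root cause: the range's element count is derived from floating-point arithmetic, and repeatedly adding 0.1 drifts away from the exact decimal values, so a length computation like (end - start) / step can round the wrong way. A quick illustration with plain Double arithmetic, no NumericRange needed:)

```scala
// Adding 0.1 ten times does not land exactly on 1.0:
val acc = (1 to 10).foldLeft(0.0)((sum, _) => sum + 0.1)
println(acc)         // 0.9999999999999999
println(acc == 1.0)  // false

// And 3 * 0.1 overshoots the closest Double to 0.3:
println(3 * 0.1)     // 0.30000000000000004
```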
Re: sc.makeRDD bug with NumericRange
Good catch, Daniel. Looks like this is a Scala bug, not a Spark one. Still, Spark users have to be careful not to use NumericRange.

On Fri, Apr 18, 2014 at 9:05 PM, Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:

> To make up for mocking Scala, I've filed a bug (https://issues.scala-lang.org/browse/SI-8518) and will try to patch this.
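(Until a Scala fix lands, one way to stay safe is to never let floating-point arithmetic determine the element count: iterate over an integer Range and scale, or materialize the range into a concrete collection before handing it to makeRDD, as suggested above. A sketch; the sc.makeRDD line is commented out because it assumes a live SparkContext:)

```scala
// Index over integers and scale, so the count is exact by construction:
val xs = (0 to 10).map(_ * 0.1)  // always 11 elements
assert(xs.length == 11)

// Or force the NumericRange into an Array first, per the workaround above:
// sc.makeRDD((0.0 to 1.0 by 0.1).toArray)  // requires a SparkContext
```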