[ https://issues.apache.org/jira/browse/SPARK-16875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405711#comment-15405711 ]

Apache Spark commented on SPARK-16875:
--------------------------------------

User 'zhengruifeng' has created a pull request for this issue:
https://github.com/apache/spark/pull/14478

> Add args checking for DataSet randomSplit and sample
> ----------------------------------------------------
>
>                 Key: SPARK-16875
>                 URL: https://issues.apache.org/jira/browse/SPARK-16875
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: zhengruifeng
>            Priority: Minor
>
> {code}
> scala> data
> res73: org.apache.spark.sql.DataFrame = [label: double, features: vector]
>
> scala> data.count
> res74: Long = 150
>
> scala> val s = data.randomSplit(Array(1, 2, -0.01))
> s: Array[org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]] =
> Array([label: double, features: vector], [label: double, features: vector],
> [label: double, features: vector])
>
> scala> s(0).count
> res75: Long = 51
>
> scala> s(2).count
> 16/08/03 18:28:27 ERROR Executor: Exception in task 0.0 in stage 76.0 (TID 66)
> java.lang.IllegalArgumentException: requirement failed: Upper bound (1.0033444816053512) must be <= 1.0
>         at scala.Predef$.require(Predef.scala:224)
>
> scala> data.sample(false, -0.01)
> res80: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [label: double, features: vector]
>
> scala> data.sample(false, -0.01).count
> 16/08/03 18:30:33 ERROR Executor: Exception in task 0.0 in stage 84.0 (TID 71)
> java.lang.IllegalArgumentException: requirement failed: Lower bound (0.0) must be <= upper bound (-0.01)
> {code}
>
> {{val s = data.randomSplit(Array(1, 2, -0.01))}} succeeds, and even {{s(0)}} can be used in the following lines; the negative weight is only detected when {{s(2)}} is evaluated. {{randomSplit}} should validate its weights eagerly, at the call site. Likewise, {{data.sample(false, -0.01)}} should fail immediately instead of deferring the error to the first action.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
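The kind of eager validation the issue asks for can be sketched as standalone `require` checks. This is a minimal illustration, not the actual code in the linked pull request; the object and method names (`SplitArgs`, `validateWeights`, `validateFraction`) are hypothetical:

```scala
// Hypothetical sketch of eager argument checking for randomSplit/sample,
// mirroring the failures shown in the transcript above.
object SplitArgs {

  // randomSplit weights: each weight must be nonnegative and the sum positive,
  // so the normalized split bounds stay within [0, 1].
  def validateWeights(weights: Array[Double]): Unit = {
    require(weights.forall(_ >= 0.0),
      s"Weights must be nonnegative, but got ${weights.mkString("[", ", ", "]")}")
    require(weights.sum > 0.0,
      s"Sum of weights must be positive, but got ${weights.mkString("[", ", ", "]")}")
  }

  // sample fraction: must be nonnegative; without replacement it must also
  // be at most 1.0 (a row cannot be kept more than once).
  def validateFraction(withReplacement: Boolean, fraction: Double): Unit = {
    require(fraction >= 0.0,
      s"Fraction must be nonnegative, but got $fraction")
    if (!withReplacement) {
      require(fraction <= 1.0,
        s"Fraction must be <= 1.0 without replacement, but got $fraction")
    }
  }
}
```

With checks like these, `randomSplit(Array(1, 2, -0.01))` and `sample(false, -0.01)` would throw `IllegalArgumentException` at the call site rather than inside an executor task during the first action.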