[jira] [Created] (SPARK-21550) approxQuantiles throws "next on empty iterator" on empty data

peay (JIRA) Thu, 27 Jul 2017 10:00:38 -0700

peay created SPARK-21550:
----------------------------

             Summary: approxQuantiles throws "next on empty iterator" on empty data
                 Key: SPARK-21550
                 URL: https://issues.apache.org/jira/browse/SPARK-21550
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.1.0
            Reporter: peay



The documentation says:
{code}
null and NaN values will be removed from the numerical column before calculation.
If the dataframe is empty or the column only contains null or NaN, an empty array is returned.
{code}
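
To make the documented contract concrete, here is a minimal pure-Python sketch of the behavior the documentation describes. It is not Spark's implementation: `reference_quantiles` is a hypothetical name, and it computes exact nearest-rank quantiles rather than the approximate summary Spark uses. It only illustrates the null/NaN filtering and the empty-result case.

```python
import math


def reference_quantiles(values, probabilities):
    """Exact quantiles following the documented null/NaN contract.

    Hypothetical reference helper, not Spark's algorithm.
    """
    # Drop None (null) and NaN before any calculation, as the docs describe.
    cleaned = sorted(
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    )
    # Documented behavior: nothing left to summarize means an empty result.
    if not cleaned:
        return []
    # Nearest-rank quantile: smallest value whose rank is at least p * n.
    n = len(cleaned)
    return [cleaned[max(0, math.ceil(p * n) - 1)] for p in probabilities]
```

Under this reading of the documentation, an empty input or an input holding only null/NaN values yields an empty list instead of an exception.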

However, this small PySpark example
{code}
from pyspark.sql.functions import col  # needed for col("id")
sql_context.range(10).filter(col("id") == 42).approxQuantile("id", [0.99], 0.001)
{code}

throws

{code}
Py4JJavaError: An error occurred while calling o3493.approxQuantile.
: java.util.NoSuchElementException: next on empty iterator
        at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
        at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
        at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:63)
        at scala.collection.IterableLike$class.head(IterableLike.scala:107)
        at scala.collection.mutable.ArrayOps$ofRef.scala$collection$IndexedSeqOptimized$$super$head(ArrayOps.scala:186)
        at scala.collection.IndexedSeqOptimized$class.head(IndexedSeqOptimized.scala:126)
        at scala.collection.mutable.ArrayOps$ofRef.head(ArrayOps.scala:186)
        at scala.collection.TraversableLike$class.last(TraversableLike.scala:431)
        at scala.collection.mutable.ArrayOps$ofRef.scala$collection$IndexedSeqOptimized$$super$last(ArrayOps.scala:186)
        at scala.collection.IndexedSeqOptimized$class.last(IndexedSeqOptimized.scala:132)
        at scala.collection.mutable.ArrayOps$ofRef.last(ArrayOps.scala:186)
        at org.apache.spark.sql.catalyst.util.QuantileSummaries.query(QuantileSummaries.scala:207)
        at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$multipleApproxQuantiles$1$$anonfun$apply$1.apply$mcDD$sp(StatFunctions.scala:92)
        at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$multipleApproxQuantiles$1$$anonfun$apply$1.apply(StatFunctions.scala:92)
        at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$multipleApproxQuantiles$1$$anonfun$apply$1.apply(StatFunctions.scala:92)
{code}
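
The trace shows `QuantileSummaries.query` calling `last` on an empty array. Until that is handled inside Spark, a caller-side guard is one possible workaround. The sketch below is an assumption, not a fix from this report: `approx_quantile_or_empty` is a hypothetical helper that relies only on `DataFrame.na.drop`, `DataFrame.head`, and `DataFrame.approxQuantile`, and it costs one extra small Spark job for the emptiness check.

```python
def approx_quantile_or_empty(df, column, probabilities, relative_error):
    """Hypothetical helper: return [] instead of raising when no usable rows exist."""
    # DataFrame.na.drop removes rows whose `column` is null or NaN.
    usable = df.na.drop(subset=[column])
    # head(1) returns a list with at most one row; empty means nothing to summarize.
    if not usable.head(1):
        return []
    # At least one usable value exists, so delegate to Spark as usual.
    return df.approxQuantile(column, probabilities, relative_error)
```

With the reproduction above, this would be called as `approx_quantile_or_empty(sql_context.range(10).filter(col("id") == 42), "id", [0.99], 0.001)` and is intended to return an empty list, matching the documented behavior.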



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
