Joseph K. Bradley created SPARK-5532:
----------------------------------------

             Summary: Repartitioning DataFrame causes saveAsParquetFile to fail with VectorUDT
                 Key: SPARK-5532
                 URL: https://issues.apache.org/jira/browse/SPARK-5532
             Project: Spark
          Issue Type: Bug
          Components: MLlib, SQL
    Affects Versions: 1.3.0
            Reporter: Joseph K. Bradley


Deterministic failure:
{code}
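import org.apache.spark.mllib.linalg._
import org.apache.spark.sql.SQLContext
// `sc` is an existing SparkContext (e.g. the one provided by spark-shell)
val sqlContext = new SQLContext(sc)
import sqlContext._
// One-row DataFrame: a Double label and a Vector (VectorUDT) features column
val data = sc.parallelize(Seq((1.0, Vectors.dense(1,2,3)))).toDataFrame("label", "features")
// Fails when the DataFrame is repartitioned before writing
data.repartition(1).saveAsParquetFile("blah")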
{code}
Without the repartition(1) call, saveAsParquetFile succeeds.



