Jan Vršovský created SPARK-21723:
------------------------------------

             Summary: Can't write LibSVM - key not found: numFeatures
                 Key: SPARK-21723
                 URL: https://issues.apache.org/jira/browse/SPARK-21723
             Project: Spark
          Issue Type: Bug
          Components: Input/Output, ML
    Affects Versions: 2.2.0, 2.3.0
            Reporter: Jan Vršovský

Writing a dataset in LibSVM format raises the exception
{{java.util.NoSuchElementException: key not found: numFeatures}}

This happens only when the dataset was NOT previously read from LibSVM format (otherwise numFeatures is already present in its metadata).

Steps to reproduce:
{{
import org.apache.spark.ml.linalg.Vectors
import spark.implicits._  // needed for toDF; already in scope in spark-shell
val rawData = Seq(
  (1.0, Vectors.sparse(3, Seq((0, 2.0), (1, 3.0)))),
  (4.0, Vectors.sparse(3, Seq((0, 5.0), (2, 6.0)))))
val dfTemp = spark.sparkContext.parallelize(rawData).toDF("label", "features")
dfTemp.coalesce(1).write.format("libsvm").save("...filename...")
}}

A PR with a fix and a unit test is ready.
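
Until the fix is available, a possible workaround is to attach the missing metadata by hand. This is only a sketch: it assumes the LibSVM writer looks for nothing more than a positive numFeatures entry in the metadata of the features column, as the exception suggests. The names meta and dfWithMeta are illustrative, and the value 3 simply matches the vector size used in the reproduction above.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.MetadataBuilder

// Reuses the existing session when run inside spark-shell.
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

// Same data as in the reproduction steps.
val rawData = Seq(
  (1.0, Vectors.sparse(3, Seq((0, 2.0), (1, 3.0)))),
  (4.0, Vectors.sparse(3, Seq((0, 5.0), (2, 6.0)))))
val dfTemp = spark.sparkContext.parallelize(rawData).toDF("label", "features")

// Assumption: the writer reads "numFeatures" from the features column metadata.
// 3 is the dimension of the sparse vectors above.
val meta = new MetadataBuilder().putLong("numFeatures", 3L).build()

// Re-alias the features column so that it carries the metadata.
val dfWithMeta = dfTemp.withColumn("features", col("features").as("features", meta))

dfWithMeta.coalesce(1).write.format("libsvm").save("...filename...")
```

The metadata can be inspected before writing with dfWithMeta.schema("features").metadata.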