darion yaphet created SPARK-21066:
-------------------------------------

             Summary: LibSVM load just one input file
                 Key: SPARK-21066
                 URL: https://issues.apache.org/jira/browse/SPARK-21066
             Project: Spark
          Issue Type: Bug
          Components: ML
    Affects Versions: 2.1.1
            Reporter: darion yaphet

Currently, when using SVM to train on a dataset, we found that the LibSVM input is limited to a single file. The relevant source code is as follows:

{{{
val path = if (dataFiles.length == 1) {
  dataFiles.head.getPath.toUri.toString
} else if (dataFiles.isEmpty) {
  throw new IOException("No input path specified for libsvm data")
} else {
  throw new IOException("Multiple input paths are not supported for libsvm data.")
}
}}}

A file stored on a distributed file system such as HDFS is typically split into multiple pieces, so I think this limit is unnecessary. We could join the input paths into a single comma-separated string.
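
To make the proposal concrete, here is a minimal sketch of the comma-joining idea, not an actual patch. The object `LibSVMPathSketch` and the helper `joinInputPaths` are hypothetical names introduced only for illustration, and `inputPaths` stands in for the string form of each entry in `dataFiles`. It assumes that whatever reads the resulting string accepts a comma-separated list of paths.

```scala
import java.io.IOException

// Hypothetical sketch: not part of Spark's source.
object LibSVMPathSketch {

  // `inputPaths` is assumed to hold the URI string of each input file,
  // i.e. what `dataFiles.map(_.getPath.toUri.toString)` would produce.
  def joinInputPaths(inputPaths: Seq[String]): String = {
    if (inputPaths.isEmpty) {
      // Keep the existing error for the empty case.
      throw new IOException("No input path specified for libsvm data")
    }
    // One path is returned unchanged; several are joined with commas.
    inputPaths.mkString(",")
  }

  def main(args: Array[String]): Unit = {
    // Example paths are made up for illustration.
    val parts = Seq("hdfs://nn/data/part-00000", "hdfs://nn/data/part-00001")
    println(joinInputPaths(parts))
  }
}
```

With this shape the single-file case behaves as before, and only the "multiple input paths" branch changes from throwing to joining.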