Can you share some sample data?
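
In particular, it would help to see which label values actually occur in the file. Below is a rough sketch of how to check, assuming a spark-shell session where `sc` is already defined and reusing the path from your message. The names `data`, `labelCounts`, `toZeroOne` and `remapped` are hypothetical, and the remapping step only makes sense if the labels turn out to be -1/+1, which I have not verified for this file.

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.rdd.RDD

// Assumes spark-shell, where `sc` (SparkContext) is already defined.
val data: RDD[LabeledPoint] =
  MLUtils.loadLibSVMFile(sc, "s3n://logistic-regression/epsilon_normalized")

// Count how many examples carry each distinct label value.
val labelCounts = data.map(_.label).countByValue()
labelCounts.foreach { case (label, n) => println(s"label=$label count=$n") }

// Print a few parsed records to use as sample data.
data.take(5).foreach(println)

// Hypothetical helper: map a -1/+1 label to 0/1.
// Only applicable if the counts above show -1/+1 labels.
def toZeroOne(label: Double): Double = if (label > 0) 1.0 else 0.0

val remapped: RDD[LabeledPoint] =
  data.map(p => LabeledPoint(toZeroOne(p.label), p.features))
```

As far as I know, the input validation in LogisticRegressionWithSGD accepts only the labels 0 and 1, and loadLibSVMFile keeps the labels exactly as they are written in the file, so the label counts are the first thing to look at.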

On Sun, Mar 15, 2015 at 8:51 PM, Rohit U <rjupadhy...@gmail.com> wrote:

> Hi,
>
> I am trying to run LogisticRegressionWithSGD on an RDD of LabeledPoints
> loaded using loadLibSVMFile:
>
> val logistic: RDD[LabeledPoint] = MLUtils.loadLibSVMFile(sc,
> "s3n://logistic-regression/epsilon_normalized")
>
> val model = LogisticRegressionWithSGD.train(logistic, 100)
>
> It gives an input validation error after about 10 minutes:
>
> org.apache.spark.SparkException: Input validation failed.
>     at org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:162)
>     at org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm.run(GeneralizedLinearAlgorithm.scala:146)
>     at org.apache.spark.mllib.classification.LogisticRegressionWithSGD$.train(LogisticRegression.scala:157)
>     at org.apache.spark.mllib.classification.LogisticRegressionWithSGD$.train(LogisticRegression.scala:192)
>
> From reading this bug report
> (https://issues.apache.org/jira/browse/SPARK-2575), since I am loading a
> LibSVM-format file, the dataset should contain only 0/1 labels, so I should
> not be hitting the issue described there. Is there something else I'm
> missing here?
>
> Thanks!
>
