[jira] [Created] (SPARK-16377) Spark MLlib: MultilayerPerceptronClassifier - error while training

Mikhail Shiryaev (JIRA) Tue, 05 Jul 2016 03:30:52 -0700

Mikhail Shiryaev created SPARK-16377:
----------------------------------------

Summary: Spark MLlib: MultilayerPerceptronClassifier - error while
training
Key: SPARK-16377
URL: https://issues.apache.org/jira/browse/SPARK-16377
Project: Spark
Issue Type: Bug
Components: ML, MLlib
Affects Versions: 1.5.2
Reporter: Mikhail Shiryaev

Hi,

I am trying to train a model with MultilayerPerceptronClassifier.

It works on the sample data from
data/mllib/sample_multiclass_classification_data.txt (4 features, 3 classes)
with layers [4, 4, 3].
But when I use other input files with different numbers of features and classes
(for example, from
https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html),
I get errors.

Example:
Input file aloi (128 features, 1000 classes, layers [128, 128, 1000]):

with block size = 1:
ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to Infinity
ERROR LBFGS: Failure! Resetting history: breeze.optimize.FirstOrderException: Line search failed
ERROR LBFGS: Failure again! Giving up and returning. Maybe the objective is just poorly behaved?

with default block size = 128:
java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(Layer.scala:629)
    at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(Layer.scala:628)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3.apply(Layer.scala:628)
    at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3.apply(Layer.scala:624)

Even if I modify the sample_multiclass_classification_data.txt file (renaming
each 4th feature index to 5) and run with layers [5, 5, 3], I get the same
errors as for the file above.
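
Before blaming the classifier, it can help to confirm that the layer sizes really match the input file. The following is a minimal sketch in plain Python, not part of Spark: the helper names summarize_libsvm and check_layers are hypothetical, and it assumes that the first layer should equal the largest feature index in the LIBSVM file and that labels are expected to be integers in the range 0 to (number of classes - 1). It only reports possible mismatches; an empty result does not prove the data is fine.

```python
from typing import Iterable, List, Set, Tuple


def summarize_libsvm(lines: Iterable[str]) -> Tuple[int, Set[int]]:
    """Return (largest 1-based feature index, set of integer labels)."""
    max_index = 0
    labels: Set[int] = set()
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        label, *pairs = line.split()
        labels.add(int(float(label)))
        for pair in pairs:
            index, _value = pair.split(":", 1)
            max_index = max(max_index, int(index))
    return max_index, labels


def check_layers(layers: List[int], max_index: int, labels: Set[int]) -> List[str]:
    """List possible mismatches between the layer sizes and the data."""
    problems = []
    # Assumption: the input layer should equal the largest feature index.
    # For sparse data this index is only a lower bound on the feature count.
    if layers[0] != max_index:
        problems.append(
            f"input layer is {layers[0]} but largest feature index is {max_index}")
    if layers[-1] != len(labels):
        problems.append(
            f"output layer is {layers[-1]} but {len(labels)} distinct labels found")
    # Assumption: labels are expected to be 0 .. layers[-1] - 1.
    if labels and (min(labels) < 0 or max(labels) >= layers[-1]):
        problems.append(
            f"labels span {min(labels)}..{max(labels)}, outside 0..{layers[-1] - 1}")
    return problems
```

For example, check_layers([5, 5, 3], *summarize_libsvm(open(path))) could be run on the modified sample file, where path is a placeholder for the local file location.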

To summarize:
I can't run training with the default block size and more than 4 features.
If I set the block size to 1, some processing happens, but I get errors from
LBFGS.
This is reproducible with Spark 1.5.2 and with the master branch on GitHub (as of
July 4th).

Has anybody already encountered this behavior?
Is there a bug in MultilayerPerceptronClassifier, or am I using it incorrectly?

Thanks.

