[ https://issues.apache.org/jira/browse/SPARK-16377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Chammas updated SPARK-16377:
-------------------------------------
    Component/s:     (was: MLilb)
                 MLlib

> Spark MLlib: MultilayerPerceptronClassifier - error while training
> ------------------------------------------------------------------
>
>                 Key: SPARK-16377
>                 URL: https://issues.apache.org/jira/browse/SPARK-16377
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, MLlib
>    Affects Versions: 1.5.2
>            Reporter: Mikhail Shiryaev
>
> Hi,
> I am trying to train a model with MultilayerPerceptronClassifier.
> It works on the sample data from
> data/mllib/sample_multiclass_classification_data.txt with 4 features,
> 3 classes and layers [4, 4, 3].
> But when I use other input files with different numbers of features and
> classes (for example, from
> https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html),
> I get errors.
>
> Example:
> Input file aloi (128 features, 1000 classes, layers [128, 128, 1000]):
>
> with block size = 1:
> ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to Infinity
> ERROR LBFGS: Failure! Resetting history: breeze.optimize.FirstOrderException: Line search failed
> ERROR LBFGS: Failure again! Giving up and returning. Maybe the objective is just poorly behaved?
>
> with default block size = 128:
> java.lang.ArrayIndexOutOfBoundsException
>   at java.lang.System.arraycopy(Native Method)
>   at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(Layer.scala:629)
>   at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(Layer.scala:628)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3.apply(Layer.scala:628)
>   at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3.apply(Layer.scala:624)
>
> Even if I modify the sample_multiclass_classification_data.txt file
> (renumbering every 4th feature as the 5th) and run with layers [5, 5, 3],
> I get the same errors as for the file above.
>
> To summarize:
> I can't run training with the default block size and more than 4 features.
> If I set the block size to 1, some processing happens, but then I get
> errors from LBFGS.
> This is reproducible with Spark 1.5.2 and with the master branch on GitHub
> (as of July 4th).
>
> Has anybody already encountered this behavior?
> Is there a bug in MultilayerPerceptronClassifier, or am I using it
> incorrectly?
> Thanks.
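
The report compares the layer sizes with the number of features and classes in each input file. A minimal Python sketch of that comparison for a LIBSVM-format file follows. It assumes that the first layer should equal the largest feature index in the file and that labels should be integers in the range 0 to (last layer size - 1). The helper names `summarize_libsvm` and `check_layers` are hypothetical and are not part of Spark. This is only a sanity check on the input data and may not explain the failures reported above.

```python
from typing import Iterable, List, NamedTuple, Set


class LibsvmSummary(NamedTuple):
    max_feature_index: int  # largest 1-based feature index seen in the data
    labels: Set[int]        # distinct class labels seen in the data


def summarize_libsvm(lines: Iterable[str]) -> LibsvmSummary:
    """Scan LIBSVM-format lines of the form '<label> <index>:<value> ...'."""
    max_index = 0
    labels: Set[int] = set()
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # skip blank lines
        fields = line.split()
        labels.add(int(float(fields[0])))
        for field in fields[1:]:
            index, _value = field.split(":", 1)
            max_index = max(max_index, int(index))
    return LibsvmSummary(max_index, labels)


def check_layers(summary: LibsvmSummary, layers: List[int]) -> List[str]:
    """Return human-readable mismatches; an empty list means none were found."""
    problems = []
    # Assumption: the input layer size should match the feature count.
    if summary.max_feature_index != layers[0]:
        problems.append(
            f"input layer is {layers[0]} but the largest feature index is "
            f"{summary.max_feature_index}"
        )
    # Assumption: labels are 0-based and smaller than the output layer size.
    out_of_range = sorted(
        label for label in summary.labels if not 0 <= label < layers[-1]
    )
    if out_of_range:
        problems.append(
            f"labels outside 0..{layers[-1] - 1}, for example {out_of_range[:5]}"
        )
    return problems


if __name__ == "__main__":
    # Two made-up rows for illustration only; not taken from the report.
    rows = ["0 1:0.5 2:-0.1 4:1.0", "2 2:0.3 3:0.7"]
    summary = summarize_libsvm(rows)
    print(check_layers(summary, [4, 4, 3]))
    print(check_layers(summary, [5, 5, 2]))
```

To check a real file, pass an open file handle instead of the list, for example `summarize_libsvm(open("aloi"))`, and compare the result with the layers used for training, such as [128, 128, 1000].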