[ https://issues.apache.org/jira/browse/SPARK-16377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363458#comment-15363458 ]

Mikhail Shiryaev commented on SPARK-16377:
------------------------------------------

The original issue with ArrayIndexOutOfBoundsException was due to a bug in my
code (an inconsistency between the layers and the real feature count).
The issue with "ERROR StrongWolfeLineSearch" isn't reproducible yet.
Sorry for taking up your time, and thank you for the quick responses.
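
For anyone hitting the same exception: a minimal sketch (assuming the Spark 2.x DataFrame API, the sample data file bundled with Spark, and illustrative parameter values) of how the layers parameter has to line up with the feature and class counts:

{code:scala}
import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
import org.apache.spark.sql.SparkSession

object MlpLayersSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("MlpLayersSketch").getOrCreate()

    // Sample data shipped with Spark: 4 features, 3 classes.
    val data = spark.read.format("libsvm")
      .load("data/mllib/sample_multiclass_classification_data.txt")

    // layers.head must equal the number of features in the data and
    // layers.last must equal the number of classes; a mismatch (e.g. the data
    // really has 5 features while layers starts with 4) can surface as
    // ArrayIndexOutOfBoundsException during the internal data stacking.
    val layers = Array[Int](4, 4, 3)

    val trainer = new MultilayerPerceptronClassifier()
      .setLayers(layers)
      .setBlockSize(128) // default block size
      .setSeed(1234L)
      .setMaxIter(100)

    val model = trainer.fit(data)
    println(s"Trained MLP with layers ${model.layers.mkString(", ")}")

    spark.stop()
  }
}
{code}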

> Spark MLlib: MultilayerPerceptronClassifier - error while training
> ------------------------------------------------------------------
>
>                 Key: SPARK-16377
>                 URL: https://issues.apache.org/jira/browse/SPARK-16377
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, MLlib
>    Affects Versions: 1.5.2
>            Reporter: Mikhail Shiryaev
>
> Hi, 
> I am trying to train a model with MultilayerPerceptronClassifier. 
> It works on the sample data from 
> data/mllib/sample_multiclass_classification_data.txt with 4 features, 3 
> classes and layers [4, 4, 3]. 
> But when I try to use other input files with different feature and class 
> counts (for example from 
> https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html), 
> I get errors. 
> Example: 
> Input file aloi (128 features, 1000 classes, layers [128, 128, 1000]): 
> with block size = 1: 
> ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to Infinity 
> ERROR LBFGS: Failure! Resetting history: breeze.optimize.FirstOrderException: Line search failed 
> ERROR LBFGS: Failure again! Giving up and returning. Maybe the objective is just poorly behaved? 
> with default block size = 128: 
>   java.lang.ArrayIndexOutOfBoundsException 
>     at java.lang.System.arraycopy(Native Method) 
>     at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(Layer.scala:629) 
>     at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(Layer.scala:628) 
>     at scala.collection.immutable.List.foreach(List.scala:381) 
>     at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3.apply(Layer.scala:628) 
>     at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3.apply(Layer.scala:624) 
> Even if I modify the sample_multiclass_classification_data.txt file 
> (renaming all 4-th features to 5-th) and run with layers [5, 5, 3], I get 
> the same errors as for the file above. 
> So to summarize: 
> I can't run training with the default block size and more than 4 features. 
> If I set the block size to 1, training gets further but I get the errors 
> from LBFGS shown above. 
> It is reproducible with Spark 1.5.2 and with the master branch from GitHub 
> (as of July 4th). 
> Has anybody already encountered such behavior? 
> Is there a bug in MultilayerPerceptronClassifier, or am I using it incorrectly? 
> Thanks.


