Re: restarting jenkins build system tomorrow (7/8) ~930am PDT
Thanks, Shane! BTW, it's getting serious, e.g. https://github.com/apache/spark/pull/28969: the tests there have not been able to pass for 7 days. Hopefully restarting the machines will make the current situation better :-)

Separately, I am working on a PR to run the Spark tests in GitHub Actions. We could hopefully use GitHub Actions and Jenkins together in the meantime.

On Thu, Jul 9, 2020 at 1:07 AM shane knapp ☠ wrote:
> this will be happening tomorrow... today is Meeting Hell Day[tm].
Re: restarting jenkins build system tomorrow (7/8) ~930am PDT
this will be happening tomorrow... today is Meeting Hell Day[tm].

On Tue, Jul 7, 2020 at 1:59 PM shane knapp ☠ wrote:
> i wasn't able to get to it today, so i'm hoping to squeeze in a quick trip
> to the colo tomorrow morning. if not, then first thing thursday.

--
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
Re: Issue on SPARK 3.0.0 loading MultilayerPerceptronClassificationModel
Yeah, that's a bug; I can reproduce it. Can you open a JIRA? It works in Scala, so it must be an issue with the Python wrapper. The serialized model itself is fine; the problem is in loading it back. I think it's because MultilayerPerceptronParams extends HasSolver, whose default is 'auto', and the Python wrapper doesn't fully override that default, so the loaded model picks up 'auto', which isn't a valid solver value for MLP. Huaxin, maybe you have some insight? I think you have worked on this code recently.

On Wed, Jul 8, 2020 at 4:05 AM Steve Taylor wrote:
> I believe I have discovered a bug when loading
> MultilayerPerceptronClassificationModel in Spark 3.0.0 (Scala 2.12) which is
> not present in at least Spark 2.4.3 (Scala 2.11). When running
> MultilayerPerceptronClassificationModel.load(...) and then model.transform(df)
> I get the following error: IllegalArgumentException:
> MultilayerPerceptronClassifier_8055d1368e78 parameter solver given invalid
> value auto.
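A quick, untested sketch to illustrate the suspected mismatch; it assumes the model has been saved to Save_location as in the report below, and that the loaded model exposes the solver param through the usual Params accessors:

from pyspark.ml.classification import MultilayerPerceptronClassificationModel

loaded = MultilayerPerceptronClassificationModel.load(Save_location)
# On 3.0.0 this is expected to report 'auto' (the generic HasSolver default),
# while the Scala-side default for MLP is 'l-bfgs'.
print(loaded.getOrDefault(loaded.solver))
print(loaded.explainParam(loaded.solver))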
Issue on SPARK 3.0.0 loading MultilayerPerceptronClassificationModel
Hi,

I'm not sure if this is the right place to raise this; if not, hopefully you can direct me to the right place.

I believe I have discovered a bug when loading MultilayerPerceptronClassificationModel in Spark 3.0.0, Scala 2.12, which I have tested and can see is not present in at least Spark 2.4.3, Scala 2.11. (I'm not sure if the Scala version is important.)

I am using PySpark on a Databricks cluster and importing the library "from pyspark.ml.classification import MultilayerPerceptronClassificationModel".

When running model = MultilayerPerceptronClassificationModel.load(...) and then model.transform(df), I get the following error:

IllegalArgumentException: MultilayerPerceptronClassifier_8055d1368e78 parameter solver given invalid value auto.

This issue can be easily replicated by running the example given in the Spark documentation:
http://spark.apache.org/docs/latest/ml-classification-regression.html#multilayer-perceptron-classifier

Then adding save-model, load-model, and transform statements as follows:

from pyspark.ml.classification import MultilayerPerceptronClassifier
from pyspark.ml.evaluation import MulticlassClassificationEvaluator

# Load training data
data = spark.read.format("libsvm")\
    .load("data/mllib/sample_multiclass_classification_data.txt")

# Split the data into train and test
splits = data.randomSplit([0.6, 0.4], 1234)
train = splits[0]
test = splits[1]

# specify layers for the neural network:
# input layer of size 4 (features), two intermediate of size 5 and 4
# and output of size 3 (classes)
layers = [4, 5, 4, 3]

# create the trainer and set its parameters
trainer = MultilayerPerceptronClassifier(maxIter=100, layers=layers, blockSize=128, seed=1234)

# train the model
model = trainer.fit(train)

# compute accuracy on the test set
result = model.transform(test)
predictionAndLabels = result.select("prediction", "label")
evaluator = MulticlassClassificationEvaluator(metricName="accuracy")
print("Test set accuracy = " + str(evaluator.evaluate(predictionAndLabels)))

from pyspark.ml.classification import MultilayerPerceptronClassifier, MultilayerPerceptronClassificationModel

# save the trained model and load it back (Save_location is a filesystem path)
model.save(Save_location)
model2 = MultilayerPerceptronClassificationModel.load(Save_location)

# this transform raises the IllegalArgumentException shown above
result_from_loaded = model2.transform(test)
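A possible stopgap until this is fixed (an untested sketch, not an official workaround): align the Python-side default for solver with the Scala default before calling transform(). This reuses test and Save_location from the snippet above and leans on the internal _setDefault helper, so treat it as a temporary measure; solver only affects training, so the trained weights should be unchanged.

from pyspark.ml.classification import MultilayerPerceptronClassificationModel

model2 = MultilayerPerceptronClassificationModel.load(Save_location)
# override the inherited 'auto' default with a value the MLP solver validator accepts
model2._setDefault(solver="l-bfgs")
result_from_loaded = model2.transform(test)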