Re: Using pyspark with Spark 2.4.3, a MultiLayerPerceptron model gives inconsistent outputs if a large amount of data is fed into it and at least one of the model outputs is fed to a Python UDF.

2020-07-21 Thread Ben Smith
I can also recreate this with the very latest master branch (3.1.0-SNAPSHOT) if I compile it locally.

Re: Using pyspark with Spark 2.4.3, a MultiLayerPerceptron model gives inconsistent outputs if a large amount of data is fed into it and at least one of the model outputs is fed to a Python UDF.

2020-07-20 Thread Ben Smith
Thanks for that. I have played with this a bit more after your feedback and found: - I can only recreate the problem with Python 3.6+. Switching between Python 2.7, 3.6 and 3.7, the problem occurs with Python 3.6 and 3.7 but not with Python 2.7. - I have used

Using pyspark with Spark 2.4.3, a MultiLayerPerceptron model gives inconsistent outputs if a large amount of data is fed into it and at least one of the model outputs is fed to a Python UDF.

2020-07-17 Thread Ben Smith
Hi, I am having an issue that looks like a potentially serious bug in Spark 2.4.3, as it impacts data accuracy. I have searched the Spark Jira and mailing lists as best I can and cannot find any reference to anyone else having this issue. I am not sure if this would be suitable for raising as a bug