Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16782
**[Test build #73709 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73709/testReport)**
for PR 16782 at commit
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/16782
That's correct @jkbradley, thanks for clearing that up. I should have
been clearer in the description. I'll go ahead and remove the static
`_input_kwargs` and update the remaining uses.
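
For readers following the thread, here is a minimal sketch of what moving `_input_kwargs` from a static attribute to the instance could look like. It assumes the wrapper is only ever applied to instance methods; `Stage` is a hypothetical stand-in for an ML class, not a real PySpark class, and this is not claimed to be the exact code in the PR.

```python
import functools


def keyword_only(func):
    """Sketch of a keyword-only decorator that stores kwargs on the instance."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        # Stored on `self`, so instances built in different threads
        # no longer share a single slot on the wrapper function.
        self._input_kwargs = kwargs
        return func(self, **kwargs)

    return wrapper


class Stage(object):  # hypothetical stand-in for an ML stage
    @keyword_only
    def __init__(self, inputCol=None, outputCol=None):
        # Only the params the caller explicitly passed appear here.
        self.params = dict(self._input_kwargs)


a = Stage(inputCol="text")
b = Stage(outputCol="features")
```

Each instance keeps its own copy of the user-supplied arguments, so constructing `b` cannot overwrite what `a` received.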
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16782
> it leaves in place the static class variable for all other ML classes
that use the wrapper, and those classes continue to use the static class
variable.
I think this was discussed
Github user avi8tr commented on the issue:
https://github.com/apache/spark/pull/16782
Hi, thanks for explaining that there is a purpose for the retention and
passing of the user-supplied arguments outside of the function call (while not
changing the public api). This fix enabling
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/16782
Hi @avi8tr , what exactly about this proposed fix is not thread-safe?
`_input_kwargs` also performs another function which is to only contain the
params explicitly set by the user. These get
Github user avi8tr commented on the issue:
https://github.com/apache/spark/pull/16782
This patch is not a solution for PySpark users because all of the ML stages
in the pipeline are also not thread-safe in their creation due to this same
wrapper. Note that the wrapper does two
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16782
I'm OK with the current solution, though if it's easy to check using
`inspect` then that seems nice to do.
If there are cases in which the wrapper is still not thread-safe, then
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/16782
Also, using the `inspect` module, it would be possible to check whether the
wrapped function is a method. Then we wouldn't need to just make that
assumption.
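
A rough sketch of such a check, under stated assumptions: in Python 3, a function inside a class body is still a plain function at decoration time, so `inspect.ismethod` would return `False` there. The helper below instead looks at the first parameter name, which is a convention-based heuristic rather than a guarantee. `looks_like_method`, `Example`, and `free_function` are hypothetical names used only for illustration.

```python
import inspect


def looks_like_method(func):
    """Heuristic: True if the first parameter of `func` is named `self`."""
    params = list(inspect.signature(func).parameters.values())
    return bool(params) and params[0].name == "self"


class Example(object):  # hypothetical class for illustration
    def fit(self, dataset=None):
        return dataset


def free_function(dataset=None):
    return dataset
```

A decorator could call `looks_like_method(func)` once, when it is applied, and raise early if the wrapped callable does not look like an instance method.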
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/16782
Thanks @jkbradley and @davies for reviewing. This fix still seems a little
hacky to me and you could still possibly run into trouble if you call a nested
wrapped function and don't consume the
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/16782
Thanks @BryanCutler for the patch! The fix looks reasonable to me, but let
me try to check with @davies to confirm.
If this is the right approach, then I think we should update the other
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16782
**[Test build #3586 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3586/testReport)**
for PR 16782 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16782
**[Test build #3586 has
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3586/testReport)**
for PR 16782 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16782
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72292/
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/16782
Merged build finished. Test PASSed.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16782
**[Test build #72292 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72292/testReport)**
for PR 16782 at commit
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/16782
Ping @holdenk @davies . I reproduced the code in the JIRA and found that
kwargs from one thread were getting overwritten by another, causing a
`ml.Pipeline` to be constructed with incorrect
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/16782
**[Test build #72292 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72292/testReport)**
for PR 16782 at commit