Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18945
**[Test build #82025 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82025/testReport)**
for PR 18945 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18945
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82025/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18945
Merged build finished. Test FAILed.
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18945
**[Test build #82025 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82025/testReport)**
for PR 18945 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18945
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18945
**[Test build #82024 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82024/testReport)**
for PR 18945 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18945
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82024/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18945
**[Test build #82024 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82024/testReport)**
for PR 18945 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
@logannc, mind adding the JIRA number in this PR title as described in the
guideline? Please take a look - http://spark.apache.org/contributing.html.
I'd read the comments carefully
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
ok to test
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/18945
@logannc There are pandas related tests in `python/pyspark/sql/tests.py`.
---
Github user logannc commented on the issue:
https://github.com/apache/spark/pull/18945
Hm. Where would I add tests?
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/18945
@HyukjinKwon I can take over this if @logannc can't find time to continue
it.
---
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/18945
We also need a proper test for this.
---
Github user logannc commented on the issue:
https://github.com/apache/spark/pull/18945
Sorry I fell off the face of the earth. I finally had some time to sit down
and do this. I took your suggestions but implemented them a little differently.
Unless I've made a dumb mistake, I think I
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
@BryanCutler, @a10y and @viirya, would you guys be interested in this and
have some time to take over this with the different approach we discussed above
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
gentle ping @logannc.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
Hey @logannc, have you had some time to work on this? I want to fix this
issue ASAP. Otherwise, would anyone here be interested in submitting another PR
for the other approach?
---
Github user logannc commented on the issue:
https://github.com/apache/spark/pull/18945
Sorry for the delay. Things got busy and now there is the storm in Houston.
Will update this per these suggestions soon.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
BTW, I think it'd be nicer if we can go with the approach above ^ (checking
the null in data and setting the correct type). I am okay with any form for the
approach above as we have a decent
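A minimal sketch of that null-aware idea, with hypothetical helper and variable names (not code from this PR; the real change would live in `toPandas` in `python/pyspark/sql/dataframe.py`): apply a corrected NumPy dtype to a column only when that column holds no nulls, since NumPy integer dtypes cannot represent missing values.

```python
import pandas as pd

def corrected_dtypes(pdf, corrections):
    """Return the subset of dtype corrections that are safe to apply.

    `corrections` maps column name -> target NumPy dtype string,
    e.g. {"age": "int32"}. A correction is skipped when the column
    contains nulls, because casting NaN to an integer dtype fails.
    """
    safe = {}
    for name, dtype in corrections.items():
        if not pdf[name].isnull().any():
            safe[name] = dtype
    return safe

pdf = pd.DataFrame({"a": [1.0, 2.0], "b": [1.0, None]})
print(corrected_dtypes(pdf, {"a": "int32", "b": "int32"}))  # only "a" is safe
```

Columns with nulls simply keep pandas' default float64, matching the non-Arrow behaviour being discussed.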
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/18945
Thanks for clarifying @HyukjinKwon, I see what you mean now. Since pandas
will iterate over `self.collect()` anyway, I don't think your solution would
impact performance at all, right? So your
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
Yea, I think it is basically similar idea with
https://github.com/apache/spark/pull/18945#discussion_r134033952.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
Ah, I should have been clearer. I thought of something like ...
```python
dtype = {}
for field in self.schema:
    pandas_type = _to_corrected_pandas_type(field.dataType)
    if pandas_type is not None:
        dtype[field.name] = pandas_type
```
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/18945
> Another rough thought for a feasible way I could think to keep the
current behaviour (to be more specific, to match the types with / without Arrow
optimization, IIUC) is, to make a generator
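The quoted comment is cut off above. As a purely hypothetical sketch of a single-pass "generator" variant (all names here are invented, not from the PR): wrap the collected rows in a generator that records which columns ever contain a null, so dtype corrections can be skipped for those columns afterwards, without iterating the data twice.

```python
def rows_tracking_nulls(rows, null_seen):
    """Yield rows unchanged while recording, per column index,
    whether a null was seen. `null_seen` is a mutable list of bools,
    one entry per column, updated as a side effect of iteration."""
    for row in rows:
        for i, value in enumerate(row):
            if value is None:
                null_seen[i] = True
        yield row

rows = [(1, "x"), (None, "y")]
null_seen = [False, False]
consumed = list(rows_tracking_nulls(rows, null_seen))
print(null_seen)  # [True, False]
```

After the generator is exhausted (e.g. by `pd.DataFrame.from_records`), `null_seen` tells us which columns must keep a null-safe dtype.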
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
@logannc, mind adding the JIRA number in this PR title as described in the
guideline?
I definitely also think this needs a JIRA, as before and after are not
virtually the same, and it looks a
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18945
I agree with @BryanCutler in general. `None` makes more sense to me to
infer the type in this case.
Another rough thought for a feasible way I could think to keep the current
behaviour
Github user logannc commented on the issue:
https://github.com/apache/spark/pull/18945
I read the contributing guide. It said that simple changes didn't need a
JIRA. Certainly this code change is quite simple; I just wasn't sure whether
there would be enough discussion to warrant a JIRA.
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/18945
There should be no way an error like this is raised during the `toPandas()`
call, so I am thinking that if there is a nullable int column, we should not
try to change its type in
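The failure mode under discussion is easy to reproduce in plain pandas with a recent release (a standalone illustration, not code from this PR): a column of ints containing a null comes back as float64 with NaN, and force-casting it to an integer dtype raises.

```python
import pandas as pd

# An int column that contains a null is promoted to float64 by pandas,
# because NumPy's int64 has no representation for missing values.
s = pd.Series([1, 2, None])
print(s.dtype)  # float64

# An unconditional dtype "correction" back to int64 then fails:
try:
    s.astype("int64")
    print("cast succeeded")
except ValueError as exc:
    print("cast failed:", exc)
```

This is why an unconditional `astype` on every IntegerType column blows up as soon as the data contains a null.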
Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/18945
@logannc , thanks for this. You bring up a big issue here that I think was
overlooked when this code was added in Spark. I filed a JIRA for this
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18945
Can one of the admins verify this patch?
---