Github user clarkfitzg commented on the issue:
https://github.com/apache/spark/pull/14783
This patch only handled the raw columns, not the vector / array value
columns. So maybe the original JIRA should stay open, or we should create
another one specific to this.
---
If your project is
Github user catlain commented on the issue:
https://github.com/apache/spark/pull/14783
done
[jira](https://issues.apache.org/jira/browse/SPARK-16785)
---
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/14783
@catlain could you please open a JIRA?
Something like this one, with the component set to SparkR:
https://issues.apache.org/jira/browse/SPARK-21068?filter=12333531
---
Github user catlain commented on the issue:
https://github.com/apache/spark/pull/14783
I still have this issue when the input data is an array column whose
vectors have different lengths, like:
```
test1
key value
1 4dda7d68a202e9e3
```
Github user clarkfitzg commented on the issue:
https://github.com/apache/spark/pull/14783
Thanks!
---
Github user shivaram commented on the issue:
https://github.com/apache/spark/pull/14783
Thanks for the update. LGTM. Merging this to master and branch-2.0
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65027/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #65027 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65027/consoleFull)**
for PR 14783 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #65027 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65027/consoleFull)**
for PR 14783 at commit
Github user shivaram commented on the issue:
https://github.com/apache/spark/pull/14783
Sorry for the delay @clarkfitzg - The code change looks pretty good to me.
I just had one question about mixed type columns.
---
Github user clarkfitzg commented on the issue:
https://github.com/apache/spark/pull/14783
I'm presenting something related to this on Thursday; it would be nice to
tell the audience this patch made it in. Can I do anything to help this along?
---
Github user shivaram commented on the issue:
https://github.com/apache/spark/pull/14783
@sun-rui Any other comments?
---
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/14783
LGTM
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64756/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64756 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64756/consoleFull)**
for PR 14783 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64756 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64756/consoleFull)**
for PR 14783 at commit
Github user shivaram commented on the issue:
https://github.com/apache/spark/pull/14783
Jenkins, retest this please
---
Github user shivaram commented on the issue:
https://github.com/apache/spark/pull/14783
Sorry, I think this was a break that I just fixed in #14904.
Jenkins, retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64737/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64737 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64737/consoleFull)**
for PR 14783 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64737 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64737/consoleFull)**
for PR 14783 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64712/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64712 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64712/consoleFull)**
for PR 14783 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64712 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64712/consoleFull)**
for PR 14783 at commit
Github user felixcheung commented on the issue:
https://github.com/apache/spark/pull/14783
Should we have a test against a DataFrame with a binary column?
---
Github user clarkfitzg commented on the issue:
https://github.com/apache/spark/pull/14783
Yes, this is only for a bug fix. @shivaram mentioned in a previous email
exchange it would be good to see some performance benchmarks as well.
---
Github user sun-rui commented on the issue:
https://github.com/apache/spark/pull/14783
@clarkfitzg, your patch is a bug fix, not a performance improvement,
right? If so, since there is no performance regression according to your
benchmark, let's focus on the functionality. We
Github user clarkfitzg commented on the issue:
https://github.com/apache/spark/pull/14783
@shivaram what do you think?
---
Github user clarkfitzg commented on the issue:
https://github.com/apache/spark/pull/14783
Tried some more benchmarks today. Didn't see any difference in speed before
/ after patch. Observing the processes as they run I see the vast majority of
time spent in the local R process, while
Github user clarkfitzg commented on the issue:
https://github.com/apache/spark/pull/14783
Not sure why these timings are so bad. Found out today that by using bytes
and calling directly into Java's `org.apache.spark.api.r.RRDD` these can be
improved by 2 orders of magnitude.
---
Github user clarkfitzg commented on the issue:
https://github.com/apache/spark/pull/14783
This change doesn't appear to make any difference in speed.
```
# Wed Aug 24 14:12:12 KST 2016
# Benchmarking performance before and after dapplyCollect patch
#
```
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64337/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64337 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64337/consoleFull)**
for PR 14783 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64337 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64337/consoleFull)**
for PR 14783 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64335/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64335 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64335/consoleFull)**
for PR 14783 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14783
**[Test build #64335 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64335/consoleFull)**
for PR 14783 at commit
Github user clarkfitzg commented on the issue:
https://github.com/apache/spark/pull/14783
My pleasure. Let me know if / when I should squash these commits or rebase.
Working on some before and after benchmarks now.
---
Github user shivaram commented on the issue:
https://github.com/apache/spark/pull/14783
Jenkins, ok to test
---
Github user shivaram commented on the issue:
https://github.com/apache/spark/pull/14783
Thanks @clarkfitzg -- I'll take a look at this tomorrow
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14783
Can one of the admins verify this patch?
---