Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
@gatorsmile ok, I'll close this for now. Thanks!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/14038
Could we close this PR for now? We can revisit it later.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75132/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #75132 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75132/testReport)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #75132 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75132/testReport)**
for PR 14038 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
Jenkins, retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74964/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #74964 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74964/testReport)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #74964 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74964/testReport)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #74927 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74927/testReport)**
for PR 14038 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74927/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #74927 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74927/testReport)**
for PR 14038 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
@liancheng ping
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
@liancheng ping
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
@liancheng ping
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71899/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #71899 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71899/testReport)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #71899 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71899/testReport)**
for PR 14038 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
@liancheng Could you check this?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71391/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #71391 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71391/testReport)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #71391 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71391/testReport)**
for PR 14038 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
Jenkins, retest this please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71389/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #71389 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71389/testReport)**
for PR 14038 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71388/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #71388 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71388/testReport)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #71388 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71388/testReport)**
for PR 14038 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71384/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #71384 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71384/testReport)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #71384 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71384/testReport)**
for PR 14038 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
Yeah, for data files it's okay to filter out names starting with '_' and '.'.
But the file patterns of metadata depend on the file format, as suggested in
Github user rxin commented on the issue:
https://github.com/apache/spark/pull/14038
Do we have a strong need for this? Anything wrong with just filtering out
all file names that start with _ and .?
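
For reference, the name-only rule asked about here can be sketched as below. This is a minimal Python illustration, not Spark's actual code; the function names `is_hidden` and `visible_files` and the sample paths are hypothetical.

```python
import posixpath


def is_hidden(path: str) -> bool:
    """Return True if the last path component starts with '_' or '.'."""
    name = posixpath.basename(path.rstrip("/"))
    return name.startswith(("_", "."))


def visible_files(paths):
    """Keep only paths whose file name does not start with '_' or '.'."""
    return [p for p in paths if not is_hidden(p)]


# Hypothetical listing of an output directory.
listing = [
    "/data/part-00000.parquet",
    "/data/_SUCCESS",
    "/data/.part-00000.parquet.crc",
    "/data/_metadata",
]
print(visible_files(listing))
```

The rule needs only the file name, so it costs nothing beyond the listing itself. It also drops `_metadata`, which is the kind of format-specific metadata file maropu's reply is concerned with.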
---
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14038
@maropu if you create a PR for your work I'll comment on it
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
@liancheng I'm not sure whether the original motivation is still alive in
SPARK-16317, but if I need to do something, please let me know. I made new
code based on this PR
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
Understood, though this seems to be a difficult issue because I'm not 100%
sure how rich the filter interface needs to be (at least `timestamp` and
`file type` are not used for now when
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14038
There's no performance problem from filtering just on names. It's when
people try to filter on more complex things (file type, timestamp) they need to
call `getFileStatus(path)` and that's
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
If my understanding is correct, `PathFilter` is not passed into
`FileSystem.listFiles` inside `ListingFileCatalog#listLeafFiles`. Even so,
does the performance degradation you pointed out still occur?
---
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14038
Oh, I don't want to take on any more work... I just think you should make
the predicate passed in something that goes `FileStatus => Boolean` instead of
`String => Boolean`, and doing the
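
A rough sketch of what a status-based predicate could look like, written in Python for brevity. The `FileStatus` class here is a hypothetical stand-in, not Hadoop's class, and `from_name_filter`, `modified_since`, and `all_of` are illustrative names only.

```python
from dataclasses import dataclass
from typing import Callable
import posixpath


@dataclass(frozen=True)
class FileStatus:
    """Hypothetical stand-in for the metadata a listing returns per entry."""
    path: str
    is_directory: bool
    modification_time: int  # milliseconds since the epoch
    length: int             # size in bytes


StatusFilter = Callable[[FileStatus], bool]


def from_name_filter(name_filter: Callable[[str], bool]) -> StatusFilter:
    """Lift a name-only predicate into a predicate over FileStatus."""
    return lambda status: name_filter(posixpath.basename(status.path))


def modified_since(cutoff_ms: int) -> StatusFilter:
    """Accept entries modified at or after cutoff_ms, using the given status."""
    return lambda status: status.modification_time >= cutoff_ms


def all_of(*filters: StatusFilter) -> StatusFilter:
    """Combine predicates with logical AND."""
    return lambda status: all(f(status) for f in filters)


not_hidden = from_name_filter(lambda name: not name.startswith(("_", ".")))
recent_data = all_of(not_hidden, modified_since(1_000))

statuses = [
    FileStatus("/data/part-0", False, 2_000, 10),
    FileStatus("/data/_SUCCESS", False, 2_000, 0),
    FileStatus("/data/part-1", False, 500, 10),
]
print([s.path for s in statuses if recent_data(s)])
```

Any existing name-only filter still fits through `from_name_filter`, while filters on timestamp or type read fields the caller already holds instead of asking the file system again.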
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
@steveloughran Thanks for the comment and the good suggestion. It seems you'd
be better off opening a new JIRA ticket to discuss it further there. BTW, do you
know how the recursion you pointed out makes a big impact
Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/14038
Path filtering in Hadoop FS calls on anything other than filename is very
suboptimal; in #14731 you can see where the filtering has been postponed until
after the listing, when the full
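
One reading of the cost argument, as a toy Python model. `CountingFS` is a made-up in-memory class that only counts simulated calls; it is not the Hadoop API, and its method names merely echo `getFileStatus`.

```python
from dataclasses import dataclass, field


@dataclass
class CountingFS:
    """Toy in-memory file system that counts simulated remote calls."""
    mtimes: dict  # path -> modification time
    calls: dict = field(default_factory=lambda: {"list": 0, "status": 0})

    def list_paths(self):
        """One call that returns paths only."""
        self.calls["list"] += 1
        return sorted(self.mtimes)

    def list_status(self):
        """One call that returns (path, mtime) pairs for every entry."""
        self.calls["list"] += 1
        return sorted(self.mtimes.items())

    def get_file_status(self, path):
        """One call per path, as a filter that sees only a path must make."""
        self.calls["status"] += 1
        return self.mtimes[path]


def filter_during_listing(fs, cutoff):
    """The filter sees only the path, so it looks up each status again."""
    return [p for p in fs.list_paths() if fs.get_file_status(p) >= cutoff]


def filter_after_listing(fs, cutoff):
    """The filter runs on the statuses the listing already returned."""
    return [p for p, mtime in fs.list_status() if mtime >= cutoff]


files = {"/d/a": 5, "/d/b": 1, "/d/c": 9}
fs_during, fs_after = CountingFS(dict(files)), CountingFS(dict(files))
during = filter_during_listing(fs_during, 5)
after = filter_after_listing(fs_after, 5)
print(during == after, fs_during.calls, fs_after.calls)
```

Both approaches select the same files, but the path-only filter adds one status lookup per entry, which is the overhead that matters when each lookup is a remote call.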
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
ping @rxin @liancheng
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64075/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #64075 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64075/consoleFull)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #64075 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64075/consoleFull)**
for PR 14038 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64062/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #64062 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64062/consoleFull)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #64062 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64062/consoleFull)**
for PR 14038 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/64049/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #64049 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64049/consoleFull)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #64049 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/64049/consoleFull)**
for PR 14038 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
ping @rxin
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61846/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #61846 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61846/consoleFull)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #61846 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61846/consoleFull)**
for PR 14038 at commit
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
okay, updated.
---
Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/14038
cc @rxin
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
@liancheng okay, re-check please.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61750/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #61750 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61750/consoleFull)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #61750 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61750/consoleFull)**
for PR 14038 at commit
Github user liancheng commented on the issue:
https://github.com/apache/spark/pull/14038
Left some comments, the overall structure looks good. Thanks!
---
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/14038
@liancheng Could you review this after v2.0 is released?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/14038
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61701/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #61701 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61701/consoleFull)**
for PR 14038 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/14038
**[Test build #61701 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61701/consoleFull)**
for PR 14038 at commit