Github user koertkuipers commented on the issue:
https://github.com/apache/spark/pull/23052
It is pretty common for us to write an empty DataFrame to Parquet and later
read it back in.
The same goes for writing to CSV with a header and reading it back in (with type
inference disabled, we assume
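The CSV round trip mentioned above can be sketched with the Python standard library alone. This is not Spark code, and the helper names `write_csv` and `read_columns` are hypothetical. It only shows why a header-only file still carries the column names, while a file with no header line carries nothing to read back.

```python
import csv
import io


def write_csv(columns, rows, header=True):
    """Serialize rows to CSV text, optionally preceded by a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    if header:
        writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


def read_columns(text):
    """Return the column names from the first line, or None if the text has no lines."""
    reader = csv.reader(io.StringIO(text))
    return next(reader, None)


if __name__ == "__main__":
    with_header = write_csv(["id", "name"], [])
    without_header = write_csv(["id", "name"], [], header=False)
    # An empty dataset written with a header still yields its column names.
    print(read_columns(with_header))
    # Without a header there is nothing left to recover the columns from.
    print(read_columns(without_header))
```

With type inference disabled, every recovered column would simply be treated as a string, so the header line is the only schema information needed.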
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/23052
Merged to master
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99407/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99407 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99407/testReport)**
for PR 23052 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99407 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99407/testReport)**
for PR 23052 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99380/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99380 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99380/testReport)**
for PR 23052 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99380 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99380/testReport)**
for PR 23052 at commit
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23052
jenkins, retest this, please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99372/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99372 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99372/testReport)**
for PR 23052 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99372 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99372/testReport)**
for PR 23052 at commit
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23052
Actually, it needs changes similar to those in
https://github.com/apache/spark/pull/23130
---
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23052
> seems like a real failure
I am looking at it. It seems the test is not deterministic.
---
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23052
seems like a real failure
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99361/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99361 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99361/testReport)**
for PR 23052 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99361 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99361/testReport)**
for PR 23052 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23052
retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99354/
Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99354 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99354/testReport)**
for PR 23052 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99354 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99354/testReport)**
for PR 23052 at commit
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23052
retest this please
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99221/
Test FAILed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test FAILed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99221 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99221/testReport)**
for PR 23052 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99221 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99221/testReport)**
for PR 23052 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
There are two more things to deal with:
https://github.com/apache/spark/pull/23052#issuecomment-440687200 comment
will still be valid - at least it should be double checked because
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/23052
First of all, sometimes we do need to write "empty" files, so that we can
infer the schema of a Parquet directory. An empty Parquet file is not really empty, as
it has a header/footer.
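To illustrate the header/footer point, here is a minimal Python sketch using only the standard library (it is not Spark or Parquet library code). It relies on the Parquet framing: a file starts with the 4-byte magic `PAR1` and ends with the footer metadata, a 4-byte little-endian footer length, and `PAR1` again. The function name `footer_length` and the fake footer bytes are assumptions made for the example; a real footer holds Thrift-encoded metadata, including the schema.

```python
import os
import struct
import tempfile

MAGIC = b"PAR1"  # Parquet files start and end with these 4 bytes


def footer_length(path):
    """Return the footer metadata length of a Parquet-framed file,
    or None if the framing is missing (e.g. a zero-byte file)."""
    size = os.path.getsize(path)
    # Smallest possible framing: leading magic + footer length + trailing magic.
    if size < 12:
        return None
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)
        tail = f.read(8)
    if head != MAGIC or tail[4:] != MAGIC:
        return None
    return struct.unpack("<I", tail[:4])[0]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        zero = os.path.join(d, "zero.parquet")
        open(zero, "wb").close()  # truly empty: no schema can be recovered

        framed = os.path.join(d, "framed.parquet")
        fake_footer = b"m" * 10  # stand-in bytes, NOT real Parquet metadata
        with open(framed, "wb") as f:
            f.write(MAGIC + fake_footer + struct.pack("<I", len(fake_footer)) + MAGIC)

        print(footer_length(zero))
        print(footer_length(framed))
```

A file with zero rows but a valid footer therefore still has a non-zero size, which is what makes schema inference over a directory of such files possible.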
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
cc @cloud-fan as well
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
Also, Parquet does not always write empty files. It does not write
empty files when DataFrames are created from emptyRDD (the one pointed out in
the PR link I gave). We should match
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
@MaxGekk I didn't mean to block this PR. Since we're going ahead for 3.0,
it should be good to match and fix the behaviours across data sources. For
instance, CSV should still be able to read
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23052
I have read the tickets you pointed out but haven't found anything that could
potentially block the changes. One of the corner cases is saving an empty
DataFrame. In this case, no files would be written, but
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99107/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99107 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99107/testReport)**
for PR 23052 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #99107 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99107/testReport)**
for PR 23052 at commit
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
I think now is a good time to match the behaviours.
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
Another related attempt: https://github.com/apache/spark/pull/13252
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
One attempt to add some tests for reading/writing empty DataFrames was here,
FYI: https://github.com/apache/spark/pull/13253
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
Which should be ... this https://github.com/apache/spark/pull/12855
---
Github user MaxGekk commented on the issue:
https://github.com/apache/spark/pull/23052
> Similar changes were proposed in Parquet a few years ago (by me) and
> reverted.
What was the main reason for reverting it? If possible, could you give me
a link to your PR?
---
Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/23052
@MaxGekk, this is actually a fairly important behaviour change. It
basically means we're unable to read the empty files back. Similar changes were
proposed in Parquet a few years ago (by me) and
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Merged build finished. Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98887/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #98887 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98887/testReport)**
for PR 23052 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/23052
**[Test build #98887 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98887/testReport)**
for PR 23052 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Can one of the admins verify this patch?
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/23052
Can one of the admins verify this patch?
---