[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815763#comment-16815763
]
Victor Tso commented on SPARK-20144:
This one was clearly decided against. I ended up writing my own
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815755#comment-16815755
]
David Greenberg commented on SPARK-20144:
Hello, this issue is also a major one for me. Almost
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697410#comment-16697410
]
Dongjoon Hyun commented on SPARK-20144:
Sorry, [~darabos]. IMHO, the proposed way is not
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697319#comment-16697319
]
Daniel Darabos commented on SPARK-20144:
So where do we go from here? Should I try to find a
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650817#comment-16650817
]
Daniel Darabos commented on SPARK-20144:
Thanks, those are good questions.
# The global option
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650779#comment-16650779
]
Dongjoon Hyun commented on SPARK-20144:
[~silvermast] and [~darabos].
1. The proposed
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650722#comment-16650722
]
Daniel Darabos commented on SPARK-20144:
Yeah, I'm not too happy about the alphabetical ordering
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650704#comment-16650704
]
Victor Tso commented on SPARK-20144:
It should, because by convention the parquet files are 0-padded
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650642#comment-16650642
]
Dongjoon Hyun commented on SPARK-20144:
For me, I don't think that PR resolves this issue,
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650492#comment-16650492
]
Daniel Darabos commented on SPARK-20144:
Thanks Victor! I've expanded the test with a case where
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16650303#comment-16650303
]
Victor Tso commented on SPARK-20144:
I looked at the PR and liked what I saw. I would only suggest
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642401#comment-16642401
]
Daniel Darabos commented on SPARK-20144:
Sorry, I had an idea for a quick fix for this and sent
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642038#comment-16642038
]
Apache Spark commented on SPARK-20144:
User 'darabos' has created a pull request for this issue:
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642036#comment-16642036
]
Apache Spark commented on SPARK-20144:
User 'darabos' has created a pull request for this issue:
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494852#comment-16494852
]
sam commented on SPARK-20144:
Regarding the original issue of sorting, I agree with [~srowen] in that it
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494810#comment-16494810
]
Unai Sarasola commented on SPARK-20144:
But if you want to have an exact copy of your data in
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203882#comment-16203882
]
sam commented on SPARK-20144:
I think this is a regression. We used to be able to easily control the number
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15996517#comment-15996517
]
Bill commented on SPARK-20144:
Increasing {{spark.sql.files.openCostInBytes}} prevents the individual
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15961170#comment-15961170
]
Andrew Ash commented on SPARK-20144:
This is a regression from 1.6 to the 2.x line. [~marmbrus]
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956030#comment-15956030
]
Li Jin commented on SPARK-20144:
> When you save the sorted data into Parquet, only the data in
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15954559#comment-15954559
]
Liang-Chi Hsieh commented on SPARK-20144:
I don't think the API makes any guarantee about the data
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951202#comment-15951202
]
Li Jin commented on SPARK-20144:
Thanks Sean! I appreciate your time and help very much.
>
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951095#comment-15951095
]
Sean Owen commented on SPARK-20144:
Probably best to wait for an informed opinion but I would assume
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951084#comment-15951084
]
Li Jin commented on SPARK-20144:
Also, I am not sure about "If the data were sorted, sorting would be
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15951073#comment-15951073
]
Li Jin commented on SPARK-20144:
I totally agree that correctness takes precedence. If sorting is the only
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950988#comment-15950988
]
Sean Owen commented on SPARK-20144:
If the data were sorted, sorting would be pretty cheap, in
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950979#comment-15950979
]
Li Jin commented on SPARK-20144:
Thanks for getting back to me.
Sorting in this case will just add extra
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950627#comment-15950627
]
Sean Owen commented on SPARK-20144:
If you need a particular ordering, I think you need to sort. I am
[
https://issues.apache.org/jira/browse/SPARK-20144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950221#comment-15950221
]
Li Jin commented on SPARK-20144:
Ping, anyone? This is a pretty big blocker for us.
> spark.read.parquet