triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to
organize file partitions by file path
URL: https://github.com/apache/spark/pull/25556#issuecomment-524783325
Hi @cloud-fan :) We can use partitionBy to do this, but I don't do that for
two reasons:
1. partitionBy
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to
organize file partitions by file path
URL: https://github.com/apache/spark/pull/25556#issuecomment-524560474
> > I have added a test for the change; it shows that it works in line with
expectations.
>
> I meant a r
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to
organize file partitions by file path
URL: https://github.com/apache/spark/pull/25556#issuecomment-524560341
@srowen @cloud-fan emmm, actually I don't assume that similarly-named
paths tend to be stored in the same loc
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to
organize file partitions by file path
URL: https://github.com/apache/spark/pull/25556#issuecomment-524290002
> @triplesheep Thanks for the work. Could you provide benchmark results to
prove that it is helpful in certa
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to
organize file partitions by file path
URL: https://github.com/apache/spark/pull/25556#issuecomment-524289704
> A few seconds ago, I left a comment but deleted it. Sorry about that.
Never mind :)
---
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to
organize file partitions by file path
URL: https://github.com/apache/spark/pull/25556#issuecomment-524289597
> Hm, I'm uneasy that this may introduce behavior and performance changes,
but I don't have a specific prob
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to
organize file partitions by file path
URL: https://github.com/apache/spark/pull/25556#issuecomment-524143163
cc @srowen @dongjoon-hyun @gengliangwang