[GitHub] [spark] triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path

2019-08-26 Thread GitBox
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path URL: https://github.com/apache/spark/pull/25556#issuecomment-524783325 Hi, @cloud-fan :) We can use partitionBy to do this, but I don't do that for two reasons: 1. partitionBy

[GitHub] [spark] triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path

2019-08-24 Thread GitBox
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path URL: https://github.com/apache/spark/pull/25556#issuecomment-524560474 > > I have added a test for the change; it shows that it works in line with expectations. > > I meant a r

[GitHub] [spark] triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path

2019-08-24 Thread GitBox
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path URL: https://github.com/apache/spark/pull/25556#issuecomment-524560341 @srowen @cloud-fan emmm, actually I don't assume that similarly-named paths tend to be stored in the same loc

[GitHub] [spark] triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path

2019-08-23 Thread GitBox
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path URL: https://github.com/apache/spark/pull/25556#issuecomment-524290002 > @triplesheep Thanks for the work. Could you provide benchmark result to prove that it is helpful in certa

[GitHub] [spark] triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path

2019-08-23 Thread GitBox
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path URL: https://github.com/apache/spark/pull/25556#issuecomment-524289704 > A few seconds ago, I left a comment but deleted it. Sorry about that. Never mind :)

[GitHub] [spark] triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path

2019-08-23 Thread GitBox
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path URL: https://github.com/apache/spark/pull/25556#issuecomment-524289597 > Hm, I'm uneasy that this may introduce behavior and performance changes, but I don't have a specific prob

[GitHub] [spark] triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path

2019-08-22 Thread GitBox
triplesheep commented on issue #25556: [SPARK-28853][SQL] Support conf to organize file partitions by file path URL: https://github.com/apache/spark/pull/25556#issuecomment-524143163 cc @srowen @dongjoon-hyun @gengliangwang