[GitHub] [spark] holdenk commented on pull request #29179: [WIP][SPARK-32381][CORE][SQL] Explore allowing parallel listing & non-location sensitive listing in core

2020-08-20 Thread GitBox
holdenk commented on pull request #29179: URL: https://github.com/apache/spark/pull/29179#issuecomment-677906964 Closing since @sunchao is doing follow-on work in a separate PR :)

[GitHub] [spark] holdenk commented on pull request #29179: [WIP][SPARK-32381][CORE][SQL] Explore allowing parallel listing & non-location sensitive listing in core

2020-07-24 Thread GitBox
holdenk commented on pull request #29179: URL: https://github.com/apache/spark/pull/29179#issuecomment-663753558
> > Interesting. Is this specific to the S3A impl or is there a higher base class? I want to make it work with multiple file formats if possible.
>
> it's in hadoop common

[GitHub] [spark] holdenk commented on pull request #29179: [WIP][SPARK-32381][CORE][SQL] Explore allowing parallel listing & non-location sensitive listing in core

2020-07-23 Thread GitBox
holdenk commented on pull request #29179: URL: https://github.com/apache/spark/pull/29179#issuecomment-663255596
> There's potential here, I'm curious about the numbers
>
> * try and do incremental result generation through remote iterators, yield etc. That way the ability to do async