Re: Resolving all JIRAs affecting EOL releases

2019-09-07 Thread Takeshi Yamamuro
Thanks for the kind explanation. I see, and it looks OK to me.
On Sun, Sep 8, 2019 at 1:54 PM Hyukjin Kwon wrote:
> Thanks for checking it.
>
> I think it's fine for the two reasons below:
>
> 1. It has another condition for such cases - one year time range.
> Basically, such PRs have not been

Re: Resolving all JIRAs affecting EOL releases

2019-09-07 Thread Hyukjin Kwon
Thanks for checking it.
I think it's fine for the two reasons below:
1. It has another condition for such cases - one year time range. Basically, such PRs have not been merged for one year, so I believe they are unlikely to be merged soon. The JIRA status will be updated when such PRs are merged

Re: Resolving all JIRAs affecting EOL releases

2019-09-07 Thread Takeshi Yamamuro
Hi Hyukjin, I checked the entries in the list and found that some have 'In-Progress' as their status and have open PRs (e.g., SPARK-25211). Can we also close these PRs as part of the bulk close? (But we might need to check the

Resolving all JIRAs affecting EOL releases

2019-09-07 Thread Hyukjin Kwon
Hi all, We have previously resolved JIRAs that target EOL releases (up to Spark 2.2.x) in order to keep the backlog at a manageable size. Since Spark 2.3.4 will be an EOL release, I plan to do this again in roughly a week. The JIRAs that have not been updated for the last year and have an affected version of EOL

Re: DataFrameReader bottleneck in DataSource#checkAndGlobPathIfNecessary when reading S3 files

2019-09-07 Thread Steve Loughran
On Fri, Sep 6, 2019 at 10:56 PM Arwin Tio wrote:
> I think the problem is calling globStatus to expand all 300K files.
>
> In my particular case I did not use any glob patterns, so my bottleneck
> came from the FileSystem#exists specifically. I do concur that the
> globStatus expansion could

[Spark SQL] Anyone interested in doing SQL between two Hive metastores?

2019-09-07 Thread angers . zhu
Best Regards, Angers