ASF board report draft for August

2019-08-12 Thread Matei Zaharia
Hi all, It’s time to submit our quarterly report to the ASF board again this Wednesday. Here is my draft about what’s new — feel free to suggest changes. Apache Spark is a fast and general engine for large-scale data processing. It offers high-level APIs in Java, Scala, Python and

Re: displaying "Test build" in PR

2019-08-12 Thread Shane Knapp
When you create a PR, the Jenkins pull request builder job polls roughly every 5 minutes and triggers jobs based on PR creation, approval to test, code updates, etc. On Mon, Aug 12, 2019 at 11:25 AM Younggyu Chun wrote: > Hi All, > > I have a quick question about PR. Once I create a PR I'm not

displaying "Test build" in PR

2019-08-12 Thread Younggyu Chun
Hi All, I have a quick question about PRs. Once I create a PR, I'm not able to see whether "Test build" is being processed; it only shows up a few minutes or hours later. Is it possible to see whether "Test build" is being processed right after the PR is created? Thank you, Younggyu Chun

Re: [SPARK-23207] Repro

2019-08-12 Thread Yuanjian Li
Hi Tyson, Thanks for reporting this! I reproduced it locally based on your code with some changes, keeping only the job that produces the wrong answer. The code is below: import scala.sys.process._ import org.apache.spark.TaskContext val res = spark.range(0, 1 * 1, 1).map{ x => (x % 1000, x)} // kill

Re: [SPARK-23207] Repro

2019-08-12 Thread Wenchen Fan
Hi Tyson, Thanks for reporting it! I quickly checked the related scheduler code but can't find an obvious place that could go wrong with a cached RDD. Sean said that he can't reproduce it, but the second job fails. This is actually expected. We need a lot more changes to completely fix this problem,