Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-06 Thread Dong Joon Hyun
+1 (non-binding) I built and tested on CentOS 7.3.1611 / OpenJDK 1.8.131 / R 3.3.3 with “-Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver -Psparkr”. Java/Scala/R tests passed as expected. There are two minor things. 1. For the deprecation documentation issue

Re: a stage can belong to more than one job please?

2017-06-06 Thread 萝卜丝炒饭
Hi Mark, Thanks. ---Original--- From: "Mark Hamstra" Date: 2017/6/6 23:27:43 To: "dev"; Cc: "user"; Subject: Re: a stage can belong to more than one job please? Yes, a Stage can be part of more than one Job. The

Re: SQL TIMESTAMP semantics vs. SPARK-18350

2017-06-06 Thread Zoltan Ivanfi
Hi Michael, To answer this I think we should distinguish between the long-term fix and the short-term fix. If I understand the replies correctly, everyone agrees that the desired long-term fix is to have two separate SQL types (TIMESTAMP [WITH|WITHOUT] TIME ZONE). Because of having separate types,
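The distinction between the two proposed types can be sketched outside of Spark. The example below is only a general illustration using Python's standard `datetime` module, under the assumption that WITH TIME ZONE behaves like an instant and WITHOUT TIME ZONE like a wall-clock reading; it does not reproduce Spark's behavior or the details of SPARK-18350, and the two UTC offsets are arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Instant-like value (analogous to TIMESTAMP WITH TIME ZONE): one point in time.
instant = datetime(2017, 6, 6, 12, 0, tzinfo=timezone.utc)

# Wall-clock value (analogous to TIMESTAMP WITHOUT TIME ZONE): a calendar
# reading with no zone attached.
wall_clock = datetime(2017, 6, 6, 12, 0)

# Two arbitrary session zones, chosen only for illustration.
plus_two = timezone(timedelta(hours=2))
minus_seven = timezone(timedelta(hours=-7))

# The same instant displays differently per zone but still compares equal.
instant_in_plus_two = instant.astimezone(plus_two)
instant_in_minus_seven = instant.astimezone(minus_seven)
print(instant_in_plus_two.isoformat())
print(instant_in_minus_seven.isoformat())

# The same wall-clock reading denotes different instants once a zone is attached.
reading_in_plus_two = wall_clock.replace(tzinfo=plus_two)
reading_in_minus_seven = wall_clock.replace(tzinfo=minus_seven)
gap = reading_in_minus_seven - reading_in_plus_two
print(gap)
```

The instant keeps its identity while its displayed fields change; the wall-clock reading keeps its displayed fields while the instant it denotes changes with the zone.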

[build system] RISELab is @ the spark summit, come say hi!

2017-06-06 Thread shane knapp
we've got a booth in the expo center, feel free to stop by, say hi and get some stickers! (complaining about jenkins is also welcome, and i will happily join in!) :) shane (formerly amplab, now riselab)

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-06 Thread Holden Karau
+1 pip install to local virtual env works, no local version string (was blocking the pypi upload). On Tue, Jun 6, 2017 at 8:03 AM, Felix Cheung wrote: > All tasks on the R QA umbrella are completed > SPARK-20512 > > We can close this.

Re: a stage can belong to more than one job please?

2017-06-06 Thread Mark Hamstra
Yes, a Stage can be part of more than one Job. The jobIds field of Stage is used repeatedly in the DAGScheduler. On Tue, Jun 6, 2017 at 5:04 AM, 萝卜丝炒饭 <1427357...@qq.com> wrote: > Hi all, > > I read some of the Spark code about Stage. > > The constructor of Stage keeps the first job ID the stage was
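The relationship described in this reply can be sketched with a toy model. The Python below is not Spark's code: the `Stage` class, its `first_job_id` and `job_ids` fields, and the `register_job` helper are hypothetical stand-ins that loosely mirror the first job ID and the jobIds field mentioned in the thread.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Stage:
    """Toy stand-in for a scheduler stage; not Spark's actual class."""
    stage_id: int
    first_job_id: int  # the job that first caused this stage to be created
    job_ids: Set[int] = field(default_factory=set)  # every job using this stage

    def __post_init__(self) -> None:
        # The creating job is always one of the stage's jobs.
        self.job_ids.add(self.first_job_id)


def register_job(stage: Stage, job_id: int) -> None:
    """Record that another job depends on an already-created stage."""
    stage.job_ids.add(job_id)


# A stage created for job 1 is later reused by job 2.
shared = Stage(stage_id=0, first_job_id=1)
register_job(shared, 2)
print(sorted(shared.job_ids))
```

In this sketch the constructor argument records only where the stage came from, while the set tracks every job that currently needs it, which is why both can coexist.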

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-06 Thread Felix Cheung
All tasks on the R QA umbrella are completed (SPARK-20512). We can close this. From: Sean Owen Sent: Tuesday, June 6, 2017 1:16 AM Subject: Re: [VOTE] Apache Spark 2.2.0 (RC4) To: Michael Armbrust

Performance regression for partitioned parquet data

2017-06-06 Thread Bertrand Bossy
Hi, since moving from Spark 2.0 to 2.1, we have experienced a performance regression when reading a large, partitioned parquet dataset: we observe many (hundreds of) very short jobs executing before the job that reads the data starts. I looked into this issue and pinned it down to

a stage can belong to more than one job please?

2017-06-06 Thread 萝卜丝炒饭
Hi all, I read some of the Spark code about Stage. The constructor of Stage keeps the first job ID the stage was part of. Does that mean a stage can belong to more than one job? I also find that the member jobIds is never used, which looks strange. Thanks in advance.

Are release docs part of a release?

2017-06-06 Thread Sean Owen
That's good, but I think we should agree on whether release docs are part of a release. It's important for reasoning about releases. To be clear, you're suggesting that, say, right now you are OK with updating this page with a few more paragraphs?

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-06 Thread Nick Pentreath
Now, on the subject of (ML) QA JIRAs. From the ML side, I believe they are required (I think others such as Joseph will agree and in fact have already said as much). Most are marked as Blockers, though of those the Python API coverage is strictly not a Blocker as we will never hold the release

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-06 Thread Nick Pentreath
The website updates for ML QA (SPARK-20507) are not *actually* critical as the project website certainly can be updated separately from the source code guide and is not part of the release to be voted on. In future that particular work item for the QA process could be marked down in priority, and

Re: [VOTE] Apache Spark 2.2.0 (RC4)

2017-06-06 Thread Sean Owen
On Tue, Jun 6, 2017 at 1:06 AM Michael Armbrust wrote: > Regarding the readiness of this and previous RCs. I did cut RC1 & RC2 > knowing that they were unlikely to pass. That said, I still think these > early RCs are valuable. I know several users that wanted to test