Re: [VOTE] Apache Spark 2.2.0 (RC6)

2017-06-30 Thread Michael Armbrust
I'll kick off the vote with a +1.

On Fri, Jun 30, 2017 at 6:44 PM, Michael Armbrust wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 2.2.0. The vote is open until Friday, July 7th, 2017 at 18:00 PST and
> passes if a majority of at least 3 +1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 2.2.0
> [ ] -1 Do not release this package because ...
>
>
> To learn more about Apache Spark, please see https://spark.apache.org/
>
> The tag to be voted on is v2.2.0-rc6 (a2c7b2133cfee7fa9abfaa2bfbfb637155466783)
>
> List of JIRA tickets resolved can be found with this filter.
>
> The release files, including signatures, digests, etc. can be found at:
> https://home.apache.org/~pwendell/spark-releases/spark-2.2.0-rc6-bin/
>
> Release artifacts are signed with the following key:
> https://people.apache.org/keys/committer/pwendell.asc
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1245/
>
> The documentation corresponding to this release can be found at:
> https://people.apache.org/~pwendell/spark-releases/spark-2.2.0-rc6-docs/
>
>
> *FAQ*
>
> *How can I help test this release?*
>
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload, running it on this release candidate, and then
> reporting any regressions.
>
> *What should happen to JIRA tickets still targeting 2.2.0?*
>
> Committers should look at those and triage them. Extremely important bug fixes,
> documentation, and API tweaks that impact compatibility should be worked on
> immediately. Please retarget everything else to 2.3.0 or 2.2.1.
>
> *But my bug isn't fixed!??!*
>
> In order to make timely releases, we will typically not hold the release
> unless the bug in question is a regression from 2.1.1.
>


[VOTE] Apache Spark 2.2.0 (RC6)

2017-06-30 Thread Michael Armbrust
Please vote on releasing the following candidate as Apache Spark version
2.2.0. The vote is open until Friday, July 7th, 2017 at 18:00 PST and
passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 2.2.0
[ ] -1 Do not release this package because ...


To learn more about Apache Spark, please see https://spark.apache.org/

The tag to be voted on is v2.2.0-rc6 (a2c7b2133cfee7fa9abfaa2bfbfb637155466783)

List of JIRA tickets resolved can be found with this filter.

The release files, including signatures, digests, etc. can be found at:
https://home.apache.org/~pwendell/spark-releases/spark-2.2.0-rc6-bin/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1245/

The documentation corresponding to this release can be found at:
https://people.apache.org/~pwendell/spark-releases/spark-2.2.0-rc6-docs/


*FAQ*

*How can I help test this release?*

If you are a Spark user, you can help us test this release by taking an
existing Spark workload, running it on this release candidate, and then
reporting any regressions.
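
For example, an sbt-built project can be pointed at the staging repository
listed above and rebuilt against the candidate artifacts. This is only a
sketch of one way to do it; the spark-sql module is an arbitrary choice, and
your workload's existing Spark dependencies would be listed instead:

    // build.sbt fragment -- a sketch for compiling/running an existing workload against RC6
    resolvers += "Apache Spark 2.2.0 RC6 staging" at "https://repository.apache.org/content/repositories/orgapachespark-1245/"

    // pick the Spark modules your workload already uses; spark-sql is just an example
    libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0"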

*What should happen to JIRA tickets still targeting 2.2.0?*

Committers should look at those and triage them. Extremely important bug fixes,
documentation, and API tweaks that impact compatibility should be worked on
immediately. Please retarget everything else to 2.3.0 or 2.2.1.

*But my bug isn't fixed!??!*

In order to make timely releases, we will typically not hold the release
unless the bug in question is a regression from 2.1.1.


Improvement for memory config.

2017-06-30 Thread jinxing
1. For executor memory, we have spark.executor.memory for the heap size and
spark.memory.offHeap.size for the off-heap size, and these two together make up
the total memory consumption of each executor process.
From the user's side, what they care about is the total memory consumption,
regardless of whether it is on-heap or off-heap, so it seems friendlier to
expose only one memory config. Can we merge the two configs into one and hide
the complexity inside the internal system? (See the config sketch at the end of
this message for the two knobs as they stand today.)
2. spark.memory.offHeap.size was originally designed for the MemoryManager,
which manages off-heap memory that Spark explicitly allocates itself when
creating its own buffers/pages or caching blocks; it does not account for
off-heap memory used by lower-level code or third-party libraries, for example
Netty. But spark.memory.offHeap.size and spark.memory.offHeap.enabled are more
or less confusing. Sometimes a user will ask: "I've already set
spark.memory.offHeap.enabled to false, so why is Netty reading remote blocks
into off-heap memory?" Also, I think we need to document
spark.memory.offHeap.size and spark.memory.offHeap.enabled more thoroughly on
http://spark.apache.org/docs/latest/configuration.html
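
To make the two points above concrete, here is a minimal sketch (mine, not
part of the original message) of the knobs involved, assuming current Spark
configuration keys; the values are illustrative only, and
spark.shuffle.io.preferDirectBufs is the separate setting that asks Netty to
prefer on-heap buffers, independently of spark.memory.offHeap.enabled:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // Point 1: heap and off-heap are sized separately today; the executor's
      // total footprint is roughly the sum of the two (plus JVM overhead),
      // which is the number users actually reason about.
      .set("spark.executor.memory", "4g")           // on-heap size per executor
      .set("spark.memory.offHeap.enabled", "true")  // enable Spark-managed off-heap
      .set("spark.memory.offHeap.size", "2g")       // Spark-managed off-heap size
      // Point 2: the two settings above only govern memory allocated by Spark's
      // own MemoryManager. Netty's direct buffers for shuffle and remote block
      // transfer are controlled separately, e.g.:
      .set("spark.shuffle.io.preferDirectBufs", "false") // ask Netty to allocate on-heap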