Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-25 Thread Holden Karau
Hmm, this isn’t the first time we’ve had mirror issues, if that’s the case.
Maybe we should log the IP when this happens so we can report it to
infra?
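
A minimal sketch of what logging the mirror's address could look like, assuming the mirror URL is already in hand. `mirrorAddresses` is a hypothetical helper, not something in the test suite; it only relies on `java.net.URI` and `java.net.InetAddress` from the JDK.

```scala
import java.net.{InetAddress, URI}

object MirrorLogging {
  // Hypothetical helper: resolve the host part of a mirror URL to its IP addresses.
  def mirrorAddresses(mirrorUrl: String): Seq[String] = {
    val host = new URI(mirrorUrl).getHost
    InetAddress.getAllByName(host).map(_.getHostAddress).toSeq
  }

  def main(args: Array[String]): Unit = {
    // An IP literal is used here so the example does not depend on DNS.
    val url = "http://127.0.0.1/spark/spark-2.0.2/spark-2.0.2-bin-hadoop2.7.tgz"
    println(s"Mirror $url resolved to: ${mirrorAddresses(url).mkString(", ")}")
  }
}
```

Logging the resolved addresses next to the failing URL would give infra something concrete to look at when a single mirror misbehaves.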

On Sat, Nov 25, 2017 at 7:47 PM Felix Cheung  wrote:

> Ah sorry, digging through the history, it looks like this was changed
> relatively recently and should only download previous releases.
>
> Perhaps we are intermittently hitting a mirror that doesn’t have the
> files?
>
>
>
> https://github.com/apache/spark/commit/daa838b8886496e64700b55d1301d348f1d5c9ae
>
>
> On Sat, Nov 25, 2017 at 10:36 AM Felix Cheung 
> wrote:
>
>> Thanks Sean.
>>
>> For the second one, it looks like the
>> HiveExternalCatalogVersionsSuite is trying to download the release tgz
>> from the official Apache mirror, which won’t work unless the release is
>> actually released?
>>
>> import scala.sys.process._  // needed for .!! on a Seq[String]
>>
>> val preferredMirror =
>>   Seq("wget", "https://www.apache.org/dyn/closer.lua?preferred=true", "-q", "-O", "-").!!.trim
>> val url = s"$preferredMirror/spark/spark-$version/spark-$version-bin-hadoop2.7.tgz"
>>
>> It’s probably getting an error page instead.
>>
>>
>> On Sat, Nov 25, 2017 at 10:28 AM Sean Owen  wrote:
>>
>>> I hit the same StackOverflowError as in the previous RC test, but,
>>> pretty sure this is just because the increased thread stack size JVM flag
>>> isn't applied consistently. This seems to resolve it:
>>>
>>> https://github.com/apache/spark/pull/19820
>>>
>>> This wouldn't block release IMHO.
>>>
>>>
>>> I am currently investigating this failure though -- seems like the
>>> mechanism that downloads Spark tarballs needs fixing, or updating, in the
>>> 2.2 branch?
>>>
>>> HiveExternalCatalogVersionsSuite:
>>>
>>> gzip: stdin: not in gzip format
>>>
>>> tar: Child returned status 1
>>>
>>> tar: Error is not recoverable: exiting now
>>>
>>> *** RUN ABORTED ***
>>>
>>>   java.io.IOException: Cannot run program "./bin/spark-submit" (in
>>> directory "/tmp/test-spark/spark-2.0.2"): error=2, No such file or directory
>>>
>>> On Sat, Nov 25, 2017 at 12:34 AM Felix Cheung 
>>> wrote:
>>>
>>>> Please vote on releasing the following candidate as Apache Spark
>>>> version 2.2.1. The vote is open until Friday December 1, 2017 at
>>>> 8:00:00 am UTC and passes if a majority of at least 3 PMC +1 votes are
>>>> cast.
>>>>
>>>>
>>>> [ ] +1 Release this package as Apache Spark 2.2.1
>>>>
>>>> [ ] -1 Do not release this package because ...
>>>>
>>>>
>>>> To learn more about Apache Spark, please see https://spark.apache.org/
>>>>
>>>>
>>>> The tag to be voted on is v2.2.1-rc2
>>>> https://github.com/apache/spark/tree/v2.2.1-rc2  (
>>>> e30e2698a2193f0bbdcd4edb884710819ab6397c)
>>>>
>>>> List of JIRA tickets resolved in this release can be found here
>>>> https://issues.apache.org/jira/projects/SPARK/versions/12340470
>>>>
>>>>
>>>> The release files, including signatures, digests, etc. can be found at:
>>>> https://dist.apache.org/repos/dist/dev/spark/spark-2.2.1-rc2-bin/
>>>>
>>>> Release artifacts are signed with the following key:
>>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>>
>>>> The staging repository for this release can be found at:
>>>> https://repository.apache.org/content/repositories/orgapachespark-1257/
>>>>
>>>> The documentation corresponding to this release can be found at:
>>>>
>>>> https://dist.apache.org/repos/dist/dev/spark/spark-2.2.1-rc2-docs/_site/index.html
>>>>
>>>>
>>>> *FAQ*
>>>>
>>>> *How can I help test this release?*
>>>>
>>>> If you are a Spark user, you can help us test this release by taking an
>>>> existing Spark workload and running on this release candidate, then
>>>> reporting any regressions.
>>>>
>>>> If you're working in PySpark you can set up a virtual env and install
>>>> the current RC and see if anything important breaks; in Java/Scala you
>>>> can add the staging repository to your project's resolvers and test with the
>>>> RC (make sure to clean up the artifact cache before/after so you don't end
>>>> up building with an out-of-date RC going forward).
>>>>
>>>> *What should happen to JIRA tickets still targeting 2.2.1?*
>>>>
>>>> Committers should look at those and triage. Extremely important bug
>>>> fixes, documentation, and API tweaks that impact compatibility should be
>>>> worked on immediately. Everything else please retarget to 2.2.2.
>>>>
>>>> *But my bug isn't fixed!??!*
>>>>
>>>> In order to make timely releases, we will typically not hold the
>>>> release unless the bug in question is a regression from 2.2.0. That being
>>>> said if there is something which is a regression from 2.2.0 that has not
>>>> been correctly targeted please ping a committer to help target the issue
>>>> (you can see the open issues listed as impacting Spark 2.2.1 / 2.2.2
>>>> here
 
 

Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-25 Thread Felix Cheung
Ah sorry, digging through the history, it looks like this was changed
relatively recently and should only download previous releases.

Perhaps we are intermittently hitting a mirror that doesn’t have the files?


https://github.com/apache/spark/commit/daa838b8886496e64700b55d1301d348f1d5c9ae
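
If an intermittently incomplete mirror is the cause, one possible mitigation is to try several base URLs in order and keep the first that works. The sketch below is only an illustration of that idea, not what the suite does: `firstSuccessful` and `fakeFetch` are hypothetical names, the first candidate URL is a placeholder, and using `https://archive.apache.org/dist` as a fallback is an assumption.

```scala
import scala.util.{Failure, Success, Try}

object MirrorFallback {
  // Try each candidate base URL in order; return the first result that succeeds.
  // `fetch` is a hypothetical download function supplied by the caller.
  def firstSuccessful[A](candidates: Seq[String])(fetch: String => Try[A]): Option[(String, A)] =
    candidates.iterator
      .map(base => base -> fetch(base))
      .collectFirst { case (base, Success(result)) => (base, result) }

  def main(args: Array[String]): Unit = {
    val version = "2.0.2"
    val candidates = Seq(
      "https://mirror.example.org/apache", // placeholder for the preferred mirror
      "https://archive.apache.org/dist"    // assumed fallback that keeps old releases
    )
    // Stub standing in for a real download: only the archive "has" the file.
    val fakeFetch: String => Try[String] = base =>
      if (base.contains("archive")) {
        Success(s"$base/spark/spark-$version/spark-$version-bin-hadoop2.7.tgz")
      } else {
        Failure(new java.io.IOException(s"not found on $base"))
      }

    println(firstSuccessful(candidates)(fakeFetch))
  }
}
```

Because the candidates are walked through an iterator, later mirrors are only contacted when the earlier ones fail.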


On Sat, Nov 25, 2017 at 10:36 AM Felix Cheung 
wrote:

> Thanks Sean.
>
> For the second one, it looks like the HiveExternalCatalogVersionsSuite is
> trying to download the release tgz from the official Apache mirror, which
> won’t work unless the release is actually released?
>
> import scala.sys.process._  // needed for .!! on a Seq[String]
>
> val preferredMirror =
>   Seq("wget", "https://www.apache.org/dyn/closer.lua?preferred=true", "-q", "-O", "-").!!.trim
> val url = s"$preferredMirror/spark/spark-$version/spark-$version-bin-hadoop2.7.tgz"
>
> It’s probably getting an error page instead.
>
>
> On Sat, Nov 25, 2017 at 10:28 AM Sean Owen  wrote:
>
>> I hit the same StackOverflowError as in the previous RC test, but, pretty
>> sure this is just because the increased thread stack size JVM flag isn't
>> applied consistently. This seems to resolve it:
>>
>> https://github.com/apache/spark/pull/19820
>>
>> This wouldn't block release IMHO.
>>
>>
>> I am currently investigating this failure though -- seems like the
>> mechanism that downloads Spark tarballs needs fixing, or updating, in the
>> 2.2 branch?
>>
>> HiveExternalCatalogVersionsSuite:
>>
>> gzip: stdin: not in gzip format
>>
>> tar: Child returned status 1
>>
>> tar: Error is not recoverable: exiting now
>>
>> *** RUN ABORTED ***
>>
>>   java.io.IOException: Cannot run program "./bin/spark-submit" (in
>> directory "/tmp/test-spark/spark-2.0.2"): error=2, No such file or directory
>>
>> On Sat, Nov 25, 2017 at 12:34 AM Felix Cheung 
>> wrote:
>>
>>> Please vote on releasing the following candidate as Apache Spark version
>>> 2.2.1. The vote is open until Friday December 1, 2017 at 8:00:00 am UTC
>>> and passes if a majority of at least 3 PMC +1 votes are cast.
>>>
>>>
>>> [ ] +1 Release this package as Apache Spark 2.2.1
>>>
>>> [ ] -1 Do not release this package because ...
>>>
>>>
>>> To learn more about Apache Spark, please see https://spark.apache.org/
>>>
>>>
>>> The tag to be voted on is v2.2.1-rc2
>>> https://github.com/apache/spark/tree/v2.2.1-rc2  (
>>> e30e2698a2193f0bbdcd4edb884710819ab6397c)
>>>
>>> List of JIRA tickets resolved in this release can be found here
>>> https://issues.apache.org/jira/projects/SPARK/versions/12340470
>>>
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/spark-2.2.1-rc2-bin/
>>>
>>> Release artifacts are signed with the following key:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1257/
>>>
>>> The documentation corresponding to this release can be found at:
>>>
>>> https://dist.apache.org/repos/dist/dev/spark/spark-2.2.1-rc2-docs/_site/index.html
>>>
>>>
>>> *FAQ*
>>>
>>> *How can I help test this release?*
>>>
>>> If you are a Spark user, you can help us test this release by taking an
>>> existing Spark workload and running on this release candidate, then
>>> reporting any regressions.
>>>
>>> If you're working in PySpark you can set up a virtual env and install
>>> the current RC and see if anything important breaks; in Java/Scala you
>>> can add the staging repository to your project's resolvers and test with the
>>> RC (make sure to clean up the artifact cache before/after so you don't end
>>> up building with an out-of-date RC going forward).
>>>
>>> *What should happen to JIRA tickets still targeting 2.2.1?*
>>>
>>> Committers should look at those and triage. Extremely important bug
>>> fixes, documentation, and API tweaks that impact compatibility should be
>>> worked on immediately. Everything else please retarget to 2.2.2.
>>>
>>> *But my bug isn't fixed!??!*
>>>
>>> In order to make timely releases, we will typically not hold the release
>>> unless the bug in question is a regression from 2.2.0. That being said if
>>> there is something which is a regression from 2.2.0 that has not been
>>> correctly targeted please ping a committer to help target the issue (you
>>> can see the open issues listed as impacting Spark 2.2.1 / 2.2.2 here
>>> 
>>> .
>>>
>>> *What are the unresolved issues targeted for 2.2.1?*
>>>
>>> At the time of writing, there is one intermittent failure SPARK-20201
>>> 

Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-25 Thread Felix Cheung
Thanks Sean.

For the second one, it looks like the HiveExternalCatalogVersionsSuite is
trying to download the release tgz from the official Apache mirror, which
won’t work unless the release is actually released?

import scala.sys.process._  // needed for .!! on a Seq[String]

val preferredMirror =
  Seq("wget", "https://www.apache.org/dyn/closer.lua?preferred=true", "-q", "-O", "-").!!.trim
val url = s"$preferredMirror/spark/spark-$version/spark-$version-bin-hadoop2.7.tgz"

It’s probably getting an error page instead.
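
One way to tell an error page from a real tarball before handing it to tar is to look at the first two bytes, since every gzip stream starts with 0x1f 0x8b. This is a rough sketch under that assumption only; `looksLikeGzip` is a hypothetical helper, not part of the suite.

```scala
object GzipCheck {
  // Every gzip stream starts with the two magic bytes 0x1f 0x8b.
  def looksLikeGzip(header: Array[Byte]): Boolean =
    header.length >= 2 &&
      (header(0) & 0xff) == 0x1f &&
      (header(1) & 0xff) == 0x8b

  def main(args: Array[String]): Unit = {
    // Stand-in for an HTML error page returned by a mirror.
    val htmlErrorPage = "<!DOCTYPE html><html>...".getBytes("UTF-8")
    // First bytes of a gzip stream (magic number plus the deflate method byte).
    val gzipHeader = Array(0x1f.toByte, 0x8b.toByte, 0x08.toByte)

    println(looksLikeGzip(htmlErrorPage)) // false
    println(looksLikeGzip(gzipHeader))    // true
  }
}
```

A check like this would turn "gzip: stdin: not in gzip format" into a clearer message about the download itself.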


On Sat, Nov 25, 2017 at 10:28 AM Sean Owen  wrote:

> I hit the same StackOverflowError as in the previous RC test, but, pretty
> sure this is just because the increased thread stack size JVM flag isn't
> applied consistently. This seems to resolve it:
>
> https://github.com/apache/spark/pull/19820
>
> This wouldn't block release IMHO.
>
>
> I am currently investigating this failure though -- seems like the
> mechanism that downloads Spark tarballs needs fixing, or updating, in the
> 2.2 branch?
>
> HiveExternalCatalogVersionsSuite:
>
> gzip: stdin: not in gzip format
>
> tar: Child returned status 1
>
> tar: Error is not recoverable: exiting now
>
> *** RUN ABORTED ***
>
>   java.io.IOException: Cannot run program "./bin/spark-submit" (in
> directory "/tmp/test-spark/spark-2.0.2"): error=2, No such file or directory
>
> On Sat, Nov 25, 2017 at 12:34 AM Felix Cheung 
> wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.2.1. The vote is open until Friday December 1, 2017 at 8:00:00 am UTC
>> and passes if a majority of at least 3 PMC +1 votes are cast.
>>
>>
>> [ ] +1 Release this package as Apache Spark 2.2.1
>>
>> [ ] -1 Do not release this package because ...
>>
>>
>> To learn more about Apache Spark, please see https://spark.apache.org/
>>
>>
>> The tag to be voted on is v2.2.1-rc2
>> https://github.com/apache/spark/tree/v2.2.1-rc2  (
>> e30e2698a2193f0bbdcd4edb884710819ab6397c)
>>
>> List of JIRA tickets resolved in this release can be found here
>> https://issues.apache.org/jira/projects/SPARK/versions/12340470
>>
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/spark-2.2.1-rc2-bin/
>>
>> Release artifacts are signed with the following key:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1257/
>>
>> The documentation corresponding to this release can be found at:
>>
>> https://dist.apache.org/repos/dist/dev/spark/spark-2.2.1-rc2-docs/_site/index.html
>>
>>
>> *FAQ*
>>
>> *How can I help test this release?*
>>
>> If you are a Spark user, you can help us test this release by taking an
>> existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install the
>> current RC and see if anything important breaks; in Java/Scala you can
>> add the staging repository to your project's resolvers and test with the RC
>> (make sure to clean up the artifact cache before/after so you don't end up
>> building with an out-of-date RC going forward).
>>
>> *What should happen to JIRA tickets still targeting 2.2.1?*
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should be
>> worked on immediately. Everything else please retarget to 2.2.2.
>>
>> *But my bug isn't fixed!??!*
>>
>> In order to make timely releases, we will typically not hold the release
>> unless the bug in question is a regression from 2.2.0. That being said if
>> there is something which is a regression from 2.2.0 that has not been
>> correctly targeted please ping a committer to help target the issue (you
>> can see the open issues listed as impacting Spark 2.2.1 / 2.2.2 here
>> 
>> .
>>
>> *What are the unresolved issues targeted for 2.2.1?*
>>
>> At the time of writing, there is one intermittent failure, SPARK-20201,
>> which we have been tracking since 2.2.0.
>>
>>


Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-25 Thread Sean Owen
I hit the same StackOverflowError as in the previous RC test, but, pretty
sure this is just because the increased thread stack size JVM flag isn't
applied consistently. This seems to resolve it:

https://github.com/apache/spark/pull/19820

This wouldn't block release IMHO.
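
To see whether a forked test JVM actually received a larger thread stack size, one option is to inspect its startup arguments for `-Xss`. The sketch below assumes the flag is passed as a plain `-Xss<size>` argument and that the last occurrence is the one that matters; `stackSizeFlag` is a hypothetical helper and the `4096k` value in the usage note is only an example.

```scala
import java.lang.management.ManagementFactory

object StackSizeCheck {
  // Return the value of the last -Xss flag in a list of JVM arguments, if any.
  def stackSizeFlag(jvmArgs: Seq[String]): Option[String] =
    jvmArgs.filter(_.startsWith("-Xss")).lastOption.map(_.stripPrefix("-Xss"))

  def main(args: Array[String]): Unit = {
    // Arguments the current JVM was started with (excluding main-method args).
    val jvmArgs = ManagementFactory.getRuntimeMXBean.getInputArguments
      .toArray(new Array[String](0))
      .toSeq

    stackSizeFlag(jvmArgs) match {
      case Some(size) => println(s"Thread stack size flag present: -Xss$size")
      case None       => println("No -Xss flag; the JVM default stack size applies")
    }
  }
}
```

For example, `stackSizeFlag(Seq("-Xmx3g", "-Xss4096k"))` yields `Some("4096k")`, while a JVM started without the flag yields `None`.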


I am currently investigating this failure though -- seems like the
mechanism that downloads Spark tarballs needs fixing, or updating, in the
2.2 branch?

HiveExternalCatalogVersionsSuite:

gzip: stdin: not in gzip format

tar: Child returned status 1

tar: Error is not recoverable: exiting now

*** RUN ABORTED ***

  java.io.IOException: Cannot run program "./bin/spark-submit" (in
directory "/tmp/test-spark/spark-2.0.2"): error=2, No such file or directory
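
The IOException above is a knock-on effect: extraction failed, so `bin/spark-submit` never existed. A small guard like the following sketch could make that explicit; `requireSparkSubmit` is a hypothetical helper, and the directory layout is taken from the error message.

```scala
import java.io.File
import java.nio.file.Files

object ExtractionCheck {
  // Hypothetical helper: verify that an unpacked Spark directory contains bin/spark-submit.
  def requireSparkSubmit(sparkHome: File): Either[String, File] = {
    val submit = new File(sparkHome, "bin/spark-submit")
    if (submit.isFile) Right(submit)
    else Left(s"${submit.getPath} is missing; the download or extraction of " +
      s"${sparkHome.getName} probably failed")
  }

  def main(args: Array[String]): Unit = {
    // An empty temp directory stands in for a failed extraction.
    val emptyDir = Files.createTempDirectory("test-spark").toFile
    println(requireSparkSubmit(emptyDir))
  }
}
```

Checking right after the tar step would report the failed download instead of aborting later with "No such file or directory".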

On Sat, Nov 25, 2017 at 12:34 AM Felix Cheung 
wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 2.2.1. The vote is open until Friday December 1, 2017 at 8:00:00 am UTC
> and passes if a majority of at least 3 PMC +1 votes are cast.
>
>
> [ ] +1 Release this package as Apache Spark 2.2.1
>
> [ ] -1 Do not release this package because ...
>
>
> To learn more about Apache Spark, please see https://spark.apache.org/
>
>
> The tag to be voted on is v2.2.1-rc2
> https://github.com/apache/spark/tree/v2.2.1-rc2  (
> e30e2698a2193f0bbdcd4edb884710819ab6397c)
>
> List of JIRA tickets resolved in this release can be found here
> https://issues.apache.org/jira/projects/SPARK/versions/12340470
>
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/spark-2.2.1-rc2-bin/
>
> Release artifacts are signed with the following key:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1257/
>
> The documentation corresponding to this release can be found at:
>
> https://dist.apache.org/repos/dist/dev/spark/spark-2.2.1-rc2-docs/_site/index.html
>
>
> *FAQ*
>
> *How can I help test this release?*
>
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install the
> current RC and see if anything important breaks; in Java/Scala you can
> add the staging repository to your project's resolvers and test with the RC
> (make sure to clean up the artifact cache before/after so you don't end up
> building with an out-of-date RC going forward).
>
> *What should happen to JIRA tickets still targeting 2.2.1?*
>
> Committers should look at those and triage. Extremely important bug fixes,
> documentation, and API tweaks that impact compatibility should be worked on
> immediately. Everything else please retarget to 2.2.2.
>
> *But my bug isn't fixed!??!*
>
> In order to make timely releases, we will typically not hold the release
> unless the bug in question is a regression from 2.2.0. That being said if
> there is something which is a regression from 2.2.0 that has not been
> correctly targeted please ping a committer to help target the issue (you
> can see the open issues listed as impacting Spark 2.2.1 / 2.2.2 here
> 
> .
>
> *What are the unresolved issues targeted for 2.2.1?*
>
> At the time of writing, there is one intermittent failure, SPARK-20201,
> which we have been tracking since 2.2.0.
>
>
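
For the Java/Scala route mentioned in the FAQ of the quoted vote email, an sbt build could point at the staging repository roughly as follows. This is a sketch, not an official recipe: the resolver name, the chosen module (`spark-sql`), and the Scala version are assumptions, as is the idea that the RC artifacts are staged under the final version number 2.2.1. Only the repository URL comes from the vote email.

```scala
// build.sbt fragment (a sketch; adjust modules and Scala version to your project)
scalaVersion := "2.11.8" // assumption: a Scala 2.11 build of Spark

// Staging repository URL taken from the vote email.
resolvers += "Apache Spark 2.2.1 RC2 staging" at
  "https://repository.apache.org/content/repositories/orgapachespark-1257/"

// Assumption: the RC artifacts are staged under the final version number.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.1"
```

As the FAQ notes, clearing the local artifact cache before and after testing avoids building against a stale RC later.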