Re: Signing releases with pwendell or release manager's key?

2017-09-17 Thread Patrick Wendell
Spark's release pipeline is automated, and part of that automation includes
securely injecting this key for the purpose of signing. I asked the ASF to
provide a service account key several years ago but they suggested that we
use a key attributed to an individual even if the process is automated.

I believe other projects that release with high frequency also have
automated the signing process.

This key is injected during the build process. A really ambitious release
manager could reverse engineer this in a way that reveals the private key.
However, someone who is a release manager can already do quite a few
nefarious things anyway.
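
As a rough illustration of the signing step (not the actual Jenkins job, whose
generating scripts are not public), the sketch below shows how a release
script could take the signing key as a parameter instead of always using one
fixed key. The function names and the key id passed in are hypothetical; the
gpg options used (--batch, --yes, --local-user, --armor, --output,
--detach-sign) are standard GnuPG flags.

```python
import subprocess
from typing import List


def build_sign_command(artifact: str, key_id: str) -> List[str]:
    """Build a gpg command that writes a detached, ASCII-armored signature.

    key_id is whatever identifies the release manager's key in the local
    keyring (hypothetical parameter; the real jobs may be wired differently).
    """
    if not key_id:
        raise ValueError("a signing key id is required")
    return [
        "gpg", "--batch", "--yes",
        "--local-user", key_id,         # sign with this key, not the default
        "--armor",                      # ASCII-armored output
        "--output", artifact + ".asc",  # signature goes next to the artifact
        "--detach-sign", artifact,
    ]


def sign_artifact(artifact: str, key_id: str) -> None:
    # Runs gpg; the secret key for key_id must be in the local keyring.
    subprocess.run(build_sign_command(artifact, key_id), check=True)
```

With something like this, the job would pass the current release manager's
key id rather than relying on a single shared key.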

It is true that we trust all previous release managers instead of only one.
We could probably rotate the Jenkins credentials periodically in order to
compensate for this, if we think this is a nontrivial risk.
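
One way a voter could check which key actually signed an artifact is to parse
GnuPG's machine-readable status output. The sketch below is a minimal example
under assumptions: it expects the text printed by
"gpg --status-fd 1 --verify <file>.asc <file>", and the function names are
hypothetical helpers, not part of any Spark tooling.

```python
from typing import Optional


def signer_fingerprint(status_output: str) -> Optional[str]:
    """Return the fingerprint from the first VALIDSIG status line, if any.

    In gpg's status format, the first field after VALIDSIG is the
    fingerprint of the key that made the signature (possibly a subkey).
    """
    for line in status_output.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "[GNUPG:]" and parts[1] == "VALIDSIG":
            return parts[2].upper()
    return None


def signed_by(status_output: str, expected_fingerprint: str) -> bool:
    # Fingerprints are often printed with spaces; normalize before comparing.
    expected = expected_fingerprint.replace(" ", "").upper()
    return signer_fingerprint(status_output) == expected
```

A check like this only tells you which key signed the files; whether that key
belongs to the person who ran the release is the policy question discussed in
this thread.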

- Patrick

On Sun, Sep 17, 2017 at 7:04 PM, Holden Karau  wrote:

> Would any of Patrick/Josh/Shane (or other PMC folks with
> understanding/opinions on this setup) care to comment? If this is a
> blocking issue I can cancel the current release vote thread while we
> discuss this some more.
>
> On Fri, Sep 15, 2017 at 5:18 PM Holden Karau  wrote:
>
>> Oh yes and to keep people more informed I've been updating a PR for the
>> release documentation as I go to write down some of this unwritten
>> knowledge -- https://github.com/apache/spark-website/pull/66
>>
>>
>> On Fri, Sep 15, 2017 at 5:12 PM Holden Karau 
>> wrote:
>>
>>> Also continuing the discussion from the vote threads, Shane probably has
>>> the best idea on the ACLs for Jenkins so I've CC'd him as well.
>>>
>>>
>>> On Fri, Sep 15, 2017 at 5:09 PM Holden Karau 
>>> wrote:
>>>
 Changing the release jobs, beyond the available parameters, right now
 depends on Josh Rosen, as there are some scripts which generate the jobs
 which aren't public. I've done temporary fixes in the past with the Python
 packaging but my understanding is that in the medium term it requires
 access to the scripts.

 So +CC Josh.

 On Fri, Sep 15, 2017 at 4:38 PM Ryan Blue  wrote:

> I think this needs to be fixed. It's true that there are barriers to
> publication, but the signature is what we use to authenticate Apache
> releases.
>
> If Patrick's key is available on Jenkins for any Spark committer to
> use, then the chances of a compromise are much higher than for a normal RM
> key.
>
> rb
>
> On Fri, Sep 15, 2017 at 12:34 PM, Sean Owen 
> wrote:
>
>> Yeah I had meant to ask about that in the past. While I presume
>> Patrick consents to this and all that, it does mean that anyone with
>> access to said Jenkins scripts can create a signed Spark release,
>> regardless of who they are.
>>
>> I haven't thought through whether that's a theoretical issue we can
>> ignore or something we need to fix up. For example you can't get a
>> release on the ASF mirrors without more authentication.
>>
>> How hard would it be to make the script take in a key? It sort of
>> looks like the script already takes GPG_KEY, but I don't know how to
>> modify the jobs. I suppose it would be ideal, in any event, for the
>> actual release manager to sign.
>>
>> On Fri, Sep 15, 2017 at 8:28 PM Holden Karau 
>> wrote:
>>
>>> That's a good question. I built the release candidate; however, the
>>> Jenkins scripts don't take a parameter for configuring who signs them.
>>> Instead, they always sign with Patrick's key. You can see this from
>>> previous releases which were managed by other folks but still signed by
>>> Patrick.
>>>
>>> On Fri, Sep 15, 2017 at 12:16 PM, Ryan Blue 
>>> wrote:
>>>
 The signature is valid, but why was the release signed with Patrick
 Wendell's private key? Did Patrick build the release candidate?

>>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
 --
 Twitter: https://twitter.com/holdenkarau



Re: Signing releases with pwendell or release manager's key?

2017-09-17 Thread Holden Karau
Would any of Patrick/Josh/Shane (or other PMC folks with
understanding/opinions on this setup) care to comment? If this is a
blocking issue I can cancel the current release vote thread while we
discuss this some more.

On Fri, Sep 15, 2017 at 5:18 PM Holden Karau  wrote:

> Oh yes and to keep people more informed I've been updating a PR for the
> release documentation as I go to write down some of this unwritten
> knowledge -- https://github.com/apache/spark-website/pull/66
>
>
> On Fri, Sep 15, 2017 at 5:12 PM Holden Karau  wrote:
>
>> Also continuing the discussion from the vote threads, Shane probably has
>> the best idea on the ACLs for Jenkins so I've CC'd him as well.
>>
>>
>> On Fri, Sep 15, 2017 at 5:09 PM Holden Karau 
>> wrote:
>>
>>> Changing the release jobs, beyond the available parameters, right now
>>> depends on Josh Rosen, as there are some scripts which generate the jobs
>>> which aren't public. I've done temporary fixes in the past with the Python
>>> packaging but my understanding is that in the medium term it requires
>>> access to the scripts.
>>>
>>> So +CC Josh.
>>>
>>> On Fri, Sep 15, 2017 at 4:38 PM Ryan Blue  wrote:
>>>
 I think this needs to be fixed. It's true that there are barriers to
 publication, but the signature is what we use to authenticate Apache
 releases.

 If Patrick's key is available on Jenkins for any Spark committer to
 use, then the chances of a compromise are much higher than for a normal RM
 key.

 rb

 On Fri, Sep 15, 2017 at 12:34 PM, Sean Owen  wrote:

> Yeah I had meant to ask about that in the past. While I presume
> Patrick consents to this and all that, it does mean that anyone with
> access to said Jenkins scripts can create a signed Spark release,
> regardless of who they are.
>
> I haven't thought through whether that's a theoretical issue we can
> ignore or something we need to fix up. For example you can't get a release
> on the ASF mirrors without more authentication.
>
> How hard would it be to make the script take in a key? It sort of
> looks like the script already takes GPG_KEY, but I don't know how to
> modify the jobs. I suppose it would be ideal, in any event, for the
> actual release manager to sign.
>
> On Fri, Sep 15, 2017 at 8:28 PM Holden Karau 
> wrote:
>
>> That's a good question. I built the release candidate; however, the
>> Jenkins scripts don't take a parameter for configuring who signs them.
>> Instead, they always sign with Patrick's key. You can see this from
>> previous releases which were managed by other folks but still signed by
>> Patrick.
>>
>> On Fri, Sep 15, 2017 at 12:16 PM, Ryan Blue 
>> wrote:
>>
>>> The signature is valid, but why was the release signed with Patrick
>>> Wendell's private key? Did Patrick build the release candidate?
>>>
>>


 --
 Ryan Blue
 Software Engineer
 Netflix

-- 
Twitter: https://twitter.com/holdenkarau


Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-17 Thread yuming wang
Yes, it doesn't work in 2.1.0 or 2.1.1. I created a PR for this:
https://github.com/apache/spark/pull/19259


> On Sep 17, 2017, at 16:14, Sean Owen wrote:
> 
> So, didn't work in 2.1.0 or 2.1.1? If it's not a regression and not critical, 
> it shouldn't block a release. It seems like this can only affect Docker 
> and/or Oracle JDBC? Well, if we need to roll another release anyway, seems OK.
> 
> On Sun, Sep 17, 2017 at 6:06 AM Xiao Li wrote:
> This is a bug introduced in 2.1. It works fine in 2.0
> 
> 2017-09-16 16:15 GMT-07:00 Holden Karau:
> Ok :) Was this working in 2.1.1?
> 
> On Sat, Sep 16, 2017 at 3:59 PM Xiao Li wrote:
> Still -1
> 
> Unable to pass the tests in my local environment. Opened a JIRA:
> https://issues.apache.org/jira/browse/SPARK-22041
> 
> - SPARK-16625: General data types to be mapped to Oracle *** FAILED ***
> 
>   types.apply(9).equals(org.apache.spark.sql.types.DateType) was false 
> (OracleIntegrationSuite.scala:158)
> 
> Xiao
> 
> 
> 2017-09-15 17:35 GMT-07:00 Ryan Blue:
> -1 (with my Apache member hat on, non-binding)
> 
> I'll continue discussion in the other thread, but I don't think we should 
> share signing keys.
> 
> On Fri, Sep 15, 2017 at 5:14 PM, Holden Karau wrote:
> Indeed it's limited to people with login permissions on the Jenkins host
> (and perhaps further limited, I'm not certain). Shane probably knows more 
> about the ACLs, so I'll ask him in the other thread for specifics.
> 
> This is maybe branching a bit from the question of the current RC though, so 
> I'd suggest we continue this discussion on the thread Sean Owen made.
> 
> On Fri, Sep 15, 2017 at 4:04 PM Ryan Blue wrote:
> I'm not familiar with the release procedure, can you send a link to this 
> Jenkins job? Can anyone run this job, or is it limited to committers?
> 
> rb
> 
> On Fri, Sep 15, 2017 at 12:28 PM, Holden Karau wrote:
> That's a good question. I built the release candidate; however, the Jenkins
> scripts don't take a parameter for configuring who signs them. Instead, they
> always sign with Patrick's key. You can see this from previous releases
> which were managed by other folks but still signed by Patrick.
> 
> On Fri, Sep 15, 2017 at 12:16 PM, Ryan Blue wrote:
> The signature is valid, but why was the release signed with Patrick Wendell's 
> private key? Did Patrick build the release candidate?
> 
> rb
> 
> On Fri, Sep 15, 2017 at 6:36 AM, Denny Lee wrote:
> +1 (non-binding)
> 
> On Thu, Sep 14, 2017 at 10:57 PM Felix Cheung wrote:
> +1 tested SparkR package on Windows, r-hub, Ubuntu.
> 
> _
> From: Sean Owen
> Sent: Thursday, September 14, 2017 3:12 PM
> Subject: Re: [VOTE] Spark 2.1.2 (RC1)
> To: Holden Karau
> 
> 
> 
> +1
> Very nice. The sigs and hashes look fine, it builds fine for me on Debian 
> Stretch with Java 8, yarn/hive/hadoop-2.7 profiles, and passes tests. 
> 
> Yes as you say, no outstanding issues except for this which doesn't look 
> critical, as it's not a regression.
> 
> SPARK-21985 PySpark PairDeserializer is broken for double-zipped RDDs
> 
> 
> On Thu, Sep 14, 2017 at 7:47 PM Holden Karau wrote:
> Please vote on releasing the following candidate as Apache Spark version 
> 2.1.2. The vote is open until Friday September 22nd at 18:00 PST and passes 
> if a majority of at least 3 +1 PMC votes are cast.
> 
> [ ] +1 Release this package as Apache Spark 2.1.2
> [ ] -1 Do not release this package because ...
> 
> 
> To learn more about Apache Spark, please see https://spark.apache.org/ 
> 
> 
> The tag to be voted on is v2.1.2-rc1
> (6f470323a0363656999dd36cb33f528afe627c12)
> 
> List of JIRA tickets resolved in this release can be found with this filter. 
> 
> 
> The release files, including signatures, digests, etc. can be found at:
> https://home.apache.org/~pwendell/spark-releases/spark-2.1.2-rc1-bin/ 
> 

Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-17 Thread Sean Owen
So, didn't work in 2.1.0 or 2.1.1? If it's not a regression and not
critical, it shouldn't block a release. It seems like this can only affect
Docker and/or Oracle JDBC? Well, if we need to roll another release anyway,
seems OK.

On Sun, Sep 17, 2017 at 6:06 AM Xiao Li  wrote:

> This is a bug introduced in 2.1. It works fine in 2.0
>
> 2017-09-16 16:15 GMT-07:00 Holden Karau :
>
>> Ok :) Was this working in 2.1.1?
>>
>> On Sat, Sep 16, 2017 at 3:59 PM Xiao Li  wrote:
>>
>>> Still -1
>>>
>>> Unable to pass the tests in my local environment. Opened a JIRA:
>>> https://issues.apache.org/jira/browse/SPARK-22041
>>>
>>> - SPARK-16625: General data types to be mapped to Oracle *** FAILED ***
>>>
>>>   types.apply(9).equals(org.apache.spark.sql.types.DateType) was false
>>> (OracleIntegrationSuite.scala:158)
>>>
>>> Xiao
>>>
>>> 2017-09-15 17:35 GMT-07:00 Ryan Blue :
>>>
 -1 (with my Apache member hat on, non-binding)

 I'll continue discussion in the other thread, but I don't think we
 should share signing keys.

 On Fri, Sep 15, 2017 at 5:14 PM, Holden Karau 
 wrote:

> Indeed it's limited to people with login permissions on the Jenkins
> host (and perhaps further limited, I'm not certain). Shane probably knows
> more about the ACLs, so I'll ask him in the other thread for specifics.
>
> This is maybe branching a bit from the question of the current RC
> though, so I'd suggest we continue this discussion on the thread Sean Owen
> made.
>
> On Fri, Sep 15, 2017 at 4:04 PM Ryan Blue  wrote:
>
>> I'm not familiar with the release procedure, can you send a link to
>> this Jenkins job? Can anyone run this job, or is it limited to
>> committers?
>>
>> rb
>>
>> On Fri, Sep 15, 2017 at 12:28 PM, Holden Karau 
>> wrote:
>>
>>> That's a good question. I built the release candidate; however, the
>>> Jenkins scripts don't take a parameter for configuring who signs them.
>>> Instead, they always sign with Patrick's key. You can see this from
>>> previous releases which were managed by other folks but still signed by
>>> Patrick.
>>>
>>> On Fri, Sep 15, 2017 at 12:16 PM, Ryan Blue 
>>> wrote:
>>>
 The signature is valid, but why was the release signed with Patrick
 Wendell's private key? Did Patrick build the release candidate?

 rb

 On Fri, Sep 15, 2017 at 6:36 AM, Denny Lee 
 wrote:

> +1 (non-binding)
>
> On Thu, Sep 14, 2017 at 10:57 PM Felix Cheung <
> felixcheun...@hotmail.com> wrote:
>
>> +1 tested SparkR package on Windows, r-hub, Ubuntu.
>>
>> _
>> From: Sean Owen 
>> Sent: Thursday, September 14, 2017 3:12 PM
>> Subject: Re: [VOTE] Spark 2.1.2 (RC1)
>> To: Holden Karau , 
>>
>>
>>
>> +1
>> Very nice. The sigs and hashes look fine, it builds fine for me
>> on Debian Stretch with Java 8, yarn/hive/hadoop-2.7 profiles, and
>> passes tests.
>>
>> Yes as you say, no outstanding issues except for this which
>> doesn't look critical, as it's not a regression.
>>
>> SPARK-21985 PySpark PairDeserializer is broken for double-zipped
>> RDDs
>>
>>
>> On Thu, Sep 14, 2017 at 7:47 PM Holden Karau <
>> hol...@pigscanfly.ca> wrote:
>>
>>> Please vote on releasing the following candidate as Apache
>>> Spark version 2.1.2. The vote is open until Friday September
>>> 22nd at 18:00 PST and passes if a majority of at least 3 +1 PMC
>>> votes are cast.
>>>
>>> [ ] +1 Release this package as Apache Spark 2.1.2
>>> [ ] -1 Do not release this package because ...
>>>
>>>
>>> To learn more about Apache Spark, please see
>>> https://spark.apache.org/
>>>
>>> The tag to be voted on is v2.1.2-rc1
>>> (6f470323a0363656999dd36cb33f528afe627c12)
>>>
>>> List of JIRA tickets resolved in this release can be found with
>>> this filter.
>>> 
>>>
>>> The release files, including signatures, digests, etc. can be
>>> found at:
>>>
>>> https://home.apache.org/~pwendell/spark-releases/spark-2.1.2-rc1-bin/
>>>
>>> Release artifacts are signed with the