Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-12 Thread Krishna Sankar
I think the key is to vote a specific set of source tarballs without any
binary artifacts. The specific binaries are useful but shouldn't be part of
the voting process. Makes sense, we really cannot prove (and no need to)
that the  binaries do not contain malware, but the source can be proven to
be clean by inspection, I assume.
Cheers


On Mon, Oct 12, 2015 at 6:56 AM, Tom Graves 
wrote:

> I know there are multiple things being talked about here, but  I agree
> with Patrick here, we vote on the source distribution - src tarball (and of
> course the tag should match).  Perhaps in principle we vote on all the
> other specific binary distributions since they are generated from source
> tarball but that isn't the main thing and I surely don't test and verify
> each one of those.
>
> Tom
>
>
>
> On Monday, October 12, 2015 12:13 AM, Sean Owen 
> wrote:
>
>
> No we are voting on the artifacts being released (too) in principle.
> Although of course the artifacts should be a deterministic function of the
> source at a certain point in time.
> I think the concern is about putting Spark binaries or its dependencies
> into a source release. That should not happen, but it is not what has
> happened here.
>
> On Mon, Oct 12, 2015, 6:03 AM Patrick Wendell  wrote:
>
> Oh I see - yes it's the build/. I always thought release votes related to
> a source tag rather than specific binaries. But maybe we can just fix it in
> 1.5.2 if there is concern about mutating binaries. It seems reasonable to
> me.
>
> For tests... in the past we've tried to avoid having jars inside of the
> source tree, including some effort to generate jars on the fly which a lot
> of our tests use. I am not sure whether it's a firm policy that you can't
> have jars in test folders, though. If it is, we could probably do some
> magic to get rid of these few ones that have crept in.
>
> - Patrick
>
> On Sun, Oct 11, 2015 at 9:57 PM, Sean Owen  wrote:
>
> Agree, but we are talking about the build/ bit right?
> I don't agree that it invalidates the release, which is probably the more
> important idea. As a point of process, you would not want to modify and
> republish the artifact that was already released after being voted on -
> unless it was invalid in which case we spin up 1.5.1.1 or something.
> But that build/ directory should go in future releases.
> I think he is talking about more than this though and the other jars look
> like they are part of tests, and still nothing to do with Spark binaries.
> Those can and should stay.
>
> On Mon, Oct 12, 2015, 5:35 AM Patrick Wendell  wrote:
>
> I think Daniel is correct here. The source artifact incorrectly includes
> jars. It is inadvertent and not part of our intended release process. This
> was something I noticed in Spark 1.5.0 and filed a JIRA and was fixed by
> updating our build scripts to fix it. However, our build environment was
> not using the most current version of the build scripts. See related links:
>
> https://issues.apache.org/jira/browse/SPARK-10511
> https://github.com/apache/spark/pull/8774/files
>
> I can update our build environment and we can repackage the Spark 1.5.1
> source tarball so that it does not include the jars.
>
>
> - Patrick
>
> On Sun, Oct 11, 2015 at 8:53 AM, Sean Owen  wrote:
>
> Daniel: we did not vote on a tag. Please again read the VOTE email I
> linked to you:
>
>
> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none
>
> among other things, it contains a link to the concrete source (and
> binary) distribution under vote:
>
> http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/
>
> You can still examine it, sure.
>
> Dependencies are *not* bundled in the source release. You're again
> misunderstanding what you are seeing. Read my email again.
>
> I am still pretty confused about what the problem is. This is entirely
> business as usual for ASF projects. I'll follow up with you offline if
> you have any more doubts.
>
> On Sun, Oct 11, 2015 at 4:49 PM, Daniel Gruno 
> wrote:
> > Here's my issue:
> >
> > How am I to audit that the dependencies you bundle are in fact what you
> > claim they are?  How do I know they don't contain malware or - in light
> > of recent events - emissions test rigging? ;)
> >
> > I am not interested in a git tag - that means nothing in the ASF voting
> > process, you cannot vote on a tag, only on a release candidate. The VCS
> > in use is irrelevant in this issue. If you can point me to a release
> > candidate archive that was voted upon and does not contain binary
> > applications, all is well.
> >
> > If there is no such thing, and we cannot come to an understanding, I
> > will exercise my ASF Members' rights and bring this to the attention of
> > the board of directors and ask for a clarification of the legality of
> 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-12 Thread Tom Graves
I know there are multiple things being talked about here, but  I agree with 
Patrick here, we vote on the source distribution - src tarball (and of course 
the tag should match).  Perhaps in principle we vote on all the other specific 
binary distributions since they are generated from source tarball but that 
isn't the main thing and I surely don't test and verify each one of those.
Tom 


 On Monday, October 12, 2015 12:13 AM, Sean Owen  wrote:
   

 No we are voting on the artifacts being released (too) in principle. Although 
of course the artifacts should be a deterministic function of the source at a 
certain point in time. I think the concern is about putting Spark binaries or 
its dependencies into a source release. That should not happen, but it is not 
what has happened here.

On Mon, Oct 12, 2015, 6:03 AM Patrick Wendell  wrote:

Oh I see - yes it's the build/. I always thought release votes related to a 
source tag rather than specific binaries. But maybe we can just fix it in 1.5.2 
if there is concern about mutating binaries. It seems reasonable to me.
For tests... in the past we've tried to avoid having jars inside of the source 
tree, including some effort to generate jars on the fly which a lot of our 
tests use. I am not sure whether it's a firm policy that you can't have jars in 
test folders, though. If it is, we could probably do some magic to get rid of 
these few ones that have crept in.
- Patrick
On Sun, Oct 11, 2015 at 9:57 PM, Sean Owen  wrote:

Agree, but we are talking about the build/ bit right? I don't agree that it 
invalidates the release, which is probably the more important idea. As a point 
of process, you would not want to modify and republish the artifact that was 
already released after being voted on - unless it was invalid in which case we 
spin up 1.5.1.1 or something. But that build/ directory should go in future 
releases. I think he is talking about more than this though and the other jars 
look like they are part of tests, and still nothing to do with Spark binaries. 
Those can and should stay.

On Mon, Oct 12, 2015, 5:35 AM Patrick Wendell  wrote:

I think Daniel is correct here. The source artifact incorrectly includes jars. 
It is inadvertent and not part of our intended release process. This was 
something I noticed in Spark 1.5.0 and filed a JIRA and was fixed by updating 
our build scripts to fix it. However, our build environment was not using the 
most current version of the build scripts. See related links:
https://issues.apache.org/jira/browse/SPARK-10511
https://github.com/apache/spark/pull/8774/files
I can update our build environment and we can repackage the Spark 1.5.1 source 
tarball so that it does not include the jars.

- Patrick
On Sun, Oct 11, 2015 at 8:53 AM, Sean Owen  wrote:

Daniel: we did not vote on a tag. Please again read the VOTE email I
linked to you:

http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none

among other things, it contains a link to the concrete source (and
binary) distribution under vote:

http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/

You can still examine it, sure.

Dependencies are *not* bundled in the source release. You're again
misunderstanding what you are seeing. Read my email again.

I am still pretty confused about what the problem is. This is entirely
business as usual for ASF projects. I'll follow up with you offline if
you have any more doubts.

On Sun, Oct 11, 2015 at 4:49 PM, Daniel Gruno  wrote:
> Here's my issue:
>
> How am I to audit that the dependencies you bundle are in fact what you
> claim they are?  How do I know they don't contain malware or - in light
> of recent events - emissions test rigging? ;)
>
> I am not interested in a git tag - that means nothing in the ASF voting
> process, you cannot vote on a tag, only on a release candidate. The VCS
> in use is irrelevant in this issue. If you can point me to a release
> candidate archive that was voted upon and does not contain binary
> applications, all is well.
>
> If there is no such thing, and we cannot come to an understanding, I
> will exercise my ASF Members' rights and bring this to the attention of
> the board of directors and ask for a clarification of the legality of this.
>
> I find it highly irregular. Perhaps it is something some projects do in
> the Java community, but that doesn't make it permissible in my view.
>
> With regards,
> Daniel.
>
>
> On 10/11/2015 05:42 PM, Sean Owen wrote:
>> Still confused. Why are you saying we didn't vote on an archive? refer
>> to the email I linked, which includes both the git tag and a link to
>> all generated artifacts (also in my email).
>>
>> So, there are two things at play here:
>>
>> First, I am not sure what you mean that a source distro can't have
>> binary files. It's supposed to have the 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Sean Owen
Still confused. Why are you saying we didn't vote on an archive? refer
to the email I linked, which includes both the git tag and a link to
all generated artifacts (also in my email).

So, there are two things at play here:

First, I am not sure what you mean that a source distro can't have
binary files. It's supposed to have the source code of Spark, and
shouldn't contain binary Spark. Nothing you listed are Spark binaries.
However, a distribution might have a lot of things in it that support
the source build, like copies of tools, test files, etc.  That
explains I think the first couple lines that you identified.

Still, I am curious why you are saying that would invalidate a source
release? I have never heard anything like that.

Second, I do think there are some binaries in here that aren't
supposed to be there, like the build/ directory stuff. IIRC these were
included accidentally and won't be in the next release. At least, I
don't see why they need to be bundled. These are just local copies of
third party tools though, and don't really matter. As it happens, the
licenses that get distributed with the source distro even cover all of
this stuff. I think that's not supposed to be there, but, also don't
see it's 'invalid' as a result.
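
[Editorial sketch, not part of the original message.] The distinction drawn
above, between jars that are test files and jars that are local copies of
third-party tools under build/, can be illustrated by grouping the reported
paths by where they sit in the tree. This is only a rough sketch: the function
name, the role labels, and the directory heuristics ("build" as the first
directory under the archive root, "test" or "test_support" anywhere in the
path) are assumptions made for illustration, not project policy.

```python
from collections import defaultdict


def group_by_role(paths):
    """Group archive paths into rough roles based on their directory names.

    The role labels and heuristics are hypothetical and only illustrate the
    test-file versus build-tool distinction discussed in the thread.
    """
    groups = defaultdict(list)
    for path in paths:
        parts = path.split("/")
        # parts[0] is the archive root directory, e.g. "spark-1.5.1".
        if len(parts) > 1 and parts[1] == "build":
            role = "build-tool copy"
        elif "test" in parts or "test_support" in parts:
            role = "test fixture"
        else:
            role = "other"
        groups[role].append(path)
    return dict(groups)


if __name__ == "__main__":
    reported = [
        "spark-1.5.1/sql/hive/src/test/resources/TestUDTF.jar",
        "spark-1.5.1/R/pkg/inst/test_support/sparktestjar_2.10-1.0.jar",
        "spark-1.5.1/build/zinc-0.3.5.3/lib/zinc.jar",
    ]
    for role, members in group_by_role(reported).items():
        print(role, len(members))
```

Anything that lands in the "other" group would be the case worth a closer
look, since it fits neither explanation given in the message.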


On Sun, Oct 11, 2015 at 4:33 PM, Daniel Gruno  wrote:
> On 10/11/2015 05:29 PM, Sean Owen wrote:
>> Of course, but what's making you think this was a binary-only
>> distribution?
>
> I'm not saying binary-only, I am saying your source release contains
> binary programs, which would invalidate a release vote. Is there a
> release candidate package, that is voted on (saying you have a git tag
> does not satisfy this criteria, you need to vote on an actual archive of
> files, otherwise there is no cogent proof of the release being from that
> specific git tag).
>
> Here's what I found in your source release:
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/sql/hive/src/test/resources/data/files/TestSerDe.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/sql/hive/src/test/resources/regression-test-SPARK-8489/test.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/sql/hive/src/test/resources/TestUDTF.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/R/pkg/inst/test_support/sparktestjar_2.10-1.0.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/zinc-0.3.5.3/lib/scala-reflect.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/zinc-0.3.5.3/lib/sbt-interface.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/zinc-0.3.5.3/lib/compiler-interface-sources.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/zinc-0.3.5.3/lib/incremental-compiler.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/zinc-0.3.5.3/lib/scala-compiler.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/zinc-0.3.5.3/lib/zinc.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/zinc-0.3.5.3/lib/scala-library.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/misc/scala-devel/plugins/continuations.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/lib/scala-reflect.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/lib/akka-actors.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/lib/typesafe-config.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/lib/scala-actors-migration.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/lib/scala-actors.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/lib/scalap.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/lib/scala-swing.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/lib/scala-compiler.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/lib/scala-library.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/src/scala-reflect-src.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/src/scala-swing-src.jar
>
> Binary application (application/jar; charset=binary) found in
> spark-1.5.1/build/scala-2.10.4/src/scalap-src.jar
>
> Binary application (application/jar; charset=binary) found in
> 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Daniel Gruno
Out of curiosity: How can you vote on a release that contains 34 binary files? 
Surely a source code release should only contain source code and not binaries, 
as you cannot verify the content of these.

Looking forward to a response.

With regards,
Daniel.

On 10/2/2015, 4:42:31 AM, Reynold Xin  wrote: 
> Hi All,
> 
> Spark 1.5.1 is a maintenance release containing stability fixes. This
> release is based on the branch-1.5 maintenance branch of Spark. We
> *strongly recommend* all 1.5.0 users to upgrade to this release.
> 
> The full list of bug fixes is here: http://s.apache.org/spark-1.5.1
> 
> http://spark.apache.org/releases/spark-release-1-5-1.html
> 
> 
> (note: it can take a few hours for everything to be propagated, so you
> might get 404 on some download links, but everything should be in maven
> central already)
> 

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Sean Owen
The Spark releases include a source distribution and several binary
distributions. This is pretty normal for Apache projects. What are you
referring to here?

On Sun, Oct 11, 2015 at 3:26 PM, Daniel Gruno  wrote:
> Out of curiosity: How can you vote on a release that contains 34 binary 
> files? Surely a source code release should only contain source code and not 
> binaries, as you cannot verify the content of these.
>
> Looking forward to a response.
>
> With regards,
> Daniel.
>
> On 10/2/2015, 4:42:31 AM, Reynold Xin  wrote:
>> Hi All,
>>
>> Spark 1.5.1 is a maintenance release containing stability fixes. This
>> release is based on the branch-1.5 maintenance branch of Spark. We
>> *strongly recommend* all 1.5.0 users to upgrade to this release.
>>
>> The full list of bug fixes is here: http://s.apache.org/spark-1.5.1
>>
>> http://spark.apache.org/releases/spark-release-1-5-1.html
>>
>>
>> (note: it can take a few hours for everything to be propagated, so you
>> might get 404 on some download links, but everything should be in maven
>> central already)
>>
>



Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Daniel Gruno
On 10/11/2015 05:12 PM, Sean Owen wrote:
> The Spark releases include a source distribution and several binary
> distributions. This is pretty normal for Apache projects. What are you
> referring to here?

Surely the _source_ distribution does not contain binaries? How else can
you vote on a release if you don't know what it contains?

You can produce convenience downloads that contain binary files, yes,
but surely you need a source-only package which is the one you vote on,
that does not contain any binaries. Do you have such a thing? And where
may I find it?

With regards,
Daniel.

> 
> On Sun, Oct 11, 2015 at 3:26 PM, Daniel Gruno  wrote:
>> Out of curiosity: How can you vote on a release that contains 34 binary 
>> files? Surely a source code release should only contain source code and not 
>> binaries, as you cannot verify the content of these.
>>
>> Looking forward to a response.
>>
>> With regards,
>> Daniel.
>>
>> On 10/2/2015, 4:42:31 AM, Reynold Xin  wrote:
>>> Hi All,
>>>
>>> Spark 1.5.1 is a maintenance release containing stability fixes. This
>>> release is based on the branch-1.5 maintenance branch of Spark. We
>>> *strongly recommend* all 1.5.0 users to upgrade to this release.
>>>
>>> The full list of bug fixes is here: http://s.apache.org/spark-1.5.1
>>>
>>> http://spark.apache.org/releases/spark-release-1-5-1.html
>>>
>>>
>>> (note: it can take a few hours for everything to be propagated, so you
>>> might get 404 on some download links, but everything should be in maven
>>> central already)
>>>
>>



Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Sean Owen
Daniel: we did not vote on a tag. Please again read the VOTE email I
linked to you:

http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none

among other things, it contains a link to the concrete source (and
binary) distribution under vote:

http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/

You can still examine it, sure.

Dependencies are *not* bundled in the source release. You're again
misunderstanding what you are seeing. Read my email again.

I am still pretty confused about what the problem is. This is entirely
business as usual for ASF projects. I'll follow up with you offline if
you have any more doubts.

On Sun, Oct 11, 2015 at 4:49 PM, Daniel Gruno  wrote:
> Here's my issue:
>
> How am I to audit that the dependencies you bundle are in fact what you
> claim they are?  How do I know they don't contain malware or - in light
> of recent events - emissions test rigging? ;)
>
> I am not interested in a git tag - that means nothing in the ASF voting
> process, you cannot vote on a tag, only on a release candidate. The VCS
> in use is irrelevant in this issue. If you can point me to a release
> candidate archive that was voted upon and does not contain binary
> applications, all is well.
>
> If there is no such thing, and we cannot come to an understanding, I
> will exercise my ASF Members' rights and bring this to the attention of
> the board of directors and ask for a clarification of the legality of this.
>
> I find it highly irregular. Perhaps it is something some projects do in
> the Java community, but that doesn't make it permissible in my view.
>
> With regards,
> Daniel.
>
>
> On 10/11/2015 05:42 PM, Sean Owen wrote:
>> Still confused. Why are you saying we didn't vote on an archive? refer
>> to the email I linked, which includes both the git tag and a link to
>> all generated artifacts (also in my email).
>>
>> So, there are two things at play here:
>>
>> First, I am not sure what you mean that a source distro can't have
>> binary files. It's supposed to have the source code of Spark, and
>> shouldn't contain binary Spark. Nothing you listed are Spark binaries.
>> However, a distribution might have a lot of things in it that support
>> the source build, like copies of tools, test files, etc.  That
>> explains I think the first couple lines that you identified.
>>
>> Still, I am curious why you are saying that would invalidate a source
>> release? I have never heard anything like that.
>>
>> Second, I do think there are some binaries in here that aren't
>> supposed to be there, like the build/ directory stuff. IIRC these were
>> included accidentally and won't be in the next release. At least, I
>> don't see why they need to be bundled. These are just local copies of
>> third party tools though, and don't really matter. As it happens, the
>> licenses that get distributed with the source distro even cover all of
>> this stuff. I think that's not supposed to be there, but, also don't
>> see it's 'invalid' as a result.
>>
>>
>> On Sun, Oct 11, 2015 at 4:33 PM, Daniel Gruno  wrote:
>>> On 10/11/2015 05:29 PM, Sean Owen wrote:
>>>> Of course, but what's making you think this was a binary-only
>>>> distribution?
>>>
>>> I'm not saying binary-only, I am saying your source release contains
>>> binary programs, which would invalidate a release vote. Is there a
>>> release candidate package, that is voted on (saying you have a git tag
>>> does not satisfy this criteria, you need to vote on an actual archive of
>>> files, otherwise there is no cogent proof of the release being from that
>>> specific git tag).
>>>
>>> Here's what I found in your source release:
>>>
>>> Binary application (application/jar; charset=binary) found in
>>> spark-1.5.1/sql/hive/src/test/resources/data/files/TestSerDe.jar
>>>
>>> Binary application (application/jar; charset=binary) found in
>>> spark-1.5.1/sql/hive/src/test/resources/regression-test-SPARK-8489/test.jar
>>>
>>> Binary application (application/jar; charset=binary) found in
>>> spark-1.5.1/sql/hive/src/test/resources/TestUDTF.jar
>>>
>>> Binary application (application/jar; charset=binary) found in
>>> spark-1.5.1/R/pkg/inst/test_support/sparktestjar_2.10-1.0.jar
>>>
>>> Binary application (application/jar; charset=binary) found in
>>> spark-1.5.1/build/zinc-0.3.5.3/lib/scala-reflect.jar
>>>
>>> Binary application (application/jar; charset=binary) found in
>>> spark-1.5.1/build/zinc-0.3.5.3/lib/sbt-interface.jar
>>>
>>> Binary application (application/jar; charset=binary) found in
>>> spark-1.5.1/build/zinc-0.3.5.3/lib/compiler-interface-sources.jar
>>>
>>> Binary application (application/jar; charset=binary) found in
>>> spark-1.5.1/build/zinc-0.3.5.3/lib/incremental-compiler.jar
>>>
>>> Binary application (application/jar; charset=binary) found in
>>> spark-1.5.1/build/zinc-0.3.5.3/lib/scala-compiler.jar
>>>
>>> Binary application 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Nicholas Chammas
You can find the source tagged for release on GitHub, as was clearly
linked to in the thread to vote on the release (titled "[VOTE] Release
Apache Spark 1.5.1 (RC1)").

Is there something about that thread that was unclear?

Nick


On Sun, Oct 11, 2015 at 11:23 AM Daniel Gruno  wrote:

> On 10/11/2015 05:12 PM, Sean Owen wrote:
> > The Spark releases include a source distribution and several binary
> > distributions. This is pretty normal for Apache projects. What are you
> > referring to here?
>
> Surely the _source_ distribution does not contain binaries? How else can
> you vote on a release if you don't know what it contains?
>
> You can produce convenience downloads that contain binary files, yes,
> but surely you need a source-only package which is the one you vote on,
> that does not contain any binaries. Do you have such a thing? And where
> may I find it?
>
> With regards,
> Daniel.
>
> >
> > On Sun, Oct 11, 2015 at 3:26 PM, Daniel Gruno wrote:
> >> Out of curiosity: How can you vote on a release that contains 34 binary
> >> files? Surely a source code release should only contain source code and not
> >> binaries, as you cannot verify the content of these.
> >>
> >> Looking forward to a response.
> >>
> >> With regards,
> >> Daniel.
> >>
> >> On 10/2/2015, 4:42:31 AM, Reynold Xin  wrote:
> >>> Hi All,
> >>>
> >>> Spark 1.5.1 is a maintenance release containing stability fixes. This
> >>> release is based on the branch-1.5 maintenance branch of Spark. We
> >>> *strongly recommend* all 1.5.0 users to upgrade to this release.
> >>>
> >>> The full list of bug fixes is here: http://s.apache.org/spark-1.5.1
> >>>
> >>> http://spark.apache.org/releases/spark-release-1-5-1.html
> >>>
> >>>
> >>> (note: it can take a few hours for everything to be propagated, so you
> >>> might get 404 on some download links, but everything should be in maven
> >>> central already)
> >>>
> >>
>


Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Sean Owen
Of course, but what's making you think this was a binary-only
distribution? The downloads page points you directly to the source
distro: http://spark.apache.org/downloads.html

Look for the last vote, and you'll find it was of course a vote on
source (and binary) artifacts:
http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none
http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/

On Sun, Oct 11, 2015 at 4:23 PM, Daniel Gruno  wrote:
> On 10/11/2015 05:12 PM, Sean Owen wrote:
>> The Spark releases include a source distribution and several binary
>> distributions. This is pretty normal for Apache projects. What are you
>> referring to here?
>
> Surely the _source_ distribution does not contain binaries? How else can
> you vote on a release if you don't know what it contains?
>
> You can produce convenience downloads that contain binary files, yes,
> but surely you need a source-only package which is the one you vote on,
> that does not contain any binaries. Do you have such a thing? And where
> may I find it?
>
> With regards,
> Daniel.
>
>>
>> On Sun, Oct 11, 2015 at 3:26 PM, Daniel Gruno  wrote:
>>> Out of curiosity: How can you vote on a release that contains 34 binary 
>>> files? Surely a source code release should only contain source code and not 
>>> binaries, as you cannot verify the content of these.
>>>
>>> Looking forward to a response.
>>>
>>> With regards,
>>> Daniel.
>>>
>>> On 10/2/2015, 4:42:31 AM, Reynold Xin  wrote:
>>>> Hi All,
>>>>
>>>> Spark 1.5.1 is a maintenance release containing stability fixes. This
>>>> release is based on the branch-1.5 maintenance branch of Spark. We
>>>> *strongly recommend* all 1.5.0 users to upgrade to this release.
>>>>
>>>> The full list of bug fixes is here: http://s.apache.org/spark-1.5.1
>>>>
>>>> http://spark.apache.org/releases/spark-release-1-5-1.html
>>>>
>>>>
>>>> (note: it can take a few hours for everything to be propagated, so you
>>>> might get 404 on some download links, but everything should be in maven
>>>> central already)
>>>>
>>>



Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Daniel Gruno
On 10/11/2015 05:29 PM, Sean Owen wrote:
> Of course, but what's making you think this was a binary-only
> distribution? 

I'm not saying binary-only, I am saying your source release contains
binary programs, which would invalidate a release vote. Is there a
release candidate package, that is voted on (saying you have a git tag
does not satisfy this criteria, you need to vote on an actual archive of
files, otherwise there is no cogent proof of the release being from that
specific git tag).

Here's what I found in your source release:

Binary application (application/jar; charset=binary) found in
spark-1.5.1/sql/hive/src/test/resources/data/files/TestSerDe.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/sql/hive/src/test/resources/regression-test-SPARK-8489/test.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/sql/hive/src/test/resources/TestUDTF.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/R/pkg/inst/test_support/sparktestjar_2.10-1.0.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/zinc-0.3.5.3/lib/scala-reflect.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/zinc-0.3.5.3/lib/sbt-interface.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/zinc-0.3.5.3/lib/compiler-interface-sources.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/zinc-0.3.5.3/lib/incremental-compiler.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/zinc-0.3.5.3/lib/scala-compiler.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/zinc-0.3.5.3/lib/zinc.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/zinc-0.3.5.3/lib/scala-library.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/misc/scala-devel/plugins/continuations.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/lib/scala-reflect.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/lib/akka-actors.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/lib/typesafe-config.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/lib/scala-actors-migration.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/lib/scala-actors.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/lib/scalap.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/lib/scala-swing.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/lib/scala-compiler.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/lib/scala-library.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/src/scala-reflect-src.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/src/scala-swing-src.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/src/scalap-src.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/src/scala-actors-src.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/src/scala-partest-src.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/src/scala-library-src.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/src/fjbg-src.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/src/scala-compiler-src.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/scala-2.10.4/src/msil-src.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/apache-maven-3.3.3/boot/plexus-classworlds-2.5.2.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/apache-maven-3.3.3/lib/guava-18.0.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/apache-maven-3.3.3/lib/wagon-http-2.9-shaded.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/apache-maven-3.3.3/lib/jsr250-api-1.0.jar

Binary application (application/jar; charset=binary) found in
spark-1.5.1/build/apache-maven-3.3.3/lib/javax.inject-1.jar
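
A listing like the one above can be reproduced with a short script. The thread does not say which tool produced it, so this is only a minimal sketch under stated assumptions: it treats any tarball member that starts with the zip magic bytes as a jar, and the function names and the demo tarball are hypothetical, not part of the Spark release tooling.

```python
import io
import tarfile

ZIP_MAGIC = b"PK\x03\x04"  # jar files are zip archives and start with these bytes


def find_jars(tar_bytes):
    """Return the names of tarball members that look like jar/zip archives."""
    hits = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, and other non-file members
            handle = tar.extractfile(member)
            if handle is not None and handle.read(4) == ZIP_MAGIC:
                hits.append(member.name)
    return sorted(hits)


def make_demo_tarball():
    """Build a tiny in-memory tarball (hypothetical contents) for illustration."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in [
            ("demo-1.0/build/tool.jar", ZIP_MAGIC + b"\x00" * 16),
            ("demo-1.0/README.md", b"plain text\n"),
        ]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


if __name__ == "__main__":
    for name in find_jars(make_demo_tarball()):
        print("Binary application found in", name)
```

To check a real source release, one would read the downloaded tarball's bytes and pass them to `find_jars` in place of the demo data.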



> The downloads page points you directly to the source
> distro: http://spark.apache.org/downloads.html
> 
> Look for the last vote, and you'll find it was of course a vote on
> source (and binary) artifacts:
> 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Daniel Gruno
Here's my issue:

How am I to audit that the dependencies you bundle are in fact what you
claim they are?  How do I know they don't contain malware or - in light
of recent events - emissions test rigging? ;)

I am not interested in a git tag - that means nothing in the ASF voting
process, you cannot vote on a tag, only on a release candidate. The VCS
in use is irrelevant in this issue. If you can point me to a release
candidate archive that was voted upon and does not contain binary
applications, all is well.

If there is no such thing, and we cannot come to an understanding, I
will exercise my ASF Members' rights and bring this to the attention of
the board of directors and ask for a clarification of the legality of this.

I find it highly irregular. Perhaps it is something some projects do in
the Java community, but that doesn't make it permissible in my view.

With regards,
Daniel.


On 10/11/2015 05:42 PM, Sean Owen wrote:
> Still confused. Why are you saying we didn't vote on an archive? refer
> to the email I linked, which includes both the git tag and a link to
> all generated artifacts (also in my email).
> 
> So, there are two things at play here:
> 
> First, I am not sure what you mean that a source distro can't have
> binary files. It's supposed to have the source code of Spark, and
> shouldn't contain binary Spark. Nothing you listed are Spark binaries.
> However, a distribution might have a lot of things in it that support
> the source build, like copies of tools, test files, etc.  That
> explains I think the first couple lines that you identified.
> 
> Still, I am curious why you are saying that would invalidate a source
> release? I have never heard anything like that.
> 
> Second, I do think there are some binaries in here that aren't
> supposed to be there, like the build/ directory stuff. IIRC these were
> included accidentally and won't be in the next release. At least, I
> don't see why they need to be bundled. These are just local copies of
> third party tools though, and don't really matter. As it happens, the
> licenses that get distributed with the source distro even cover all of
> this stuff. I think that's not supposed to be there, but, also don't
> see it's 'invalid' as a result.
> 
> 
> On Sun, Oct 11, 2015 at 4:33 PM, Daniel Gruno  wrote:
>> On 10/11/2015 05:29 PM, Sean Owen wrote:
>>> Of course, but what's making you think this was a binary-only
>>> distribution?
>>
>> I'm not saying binary-only, I am saying your source release contains
>> binary programs, which would invalidate a release vote. Is there a
>> release candidate package, that is voted on (saying you have a git tag
>> does not satisfy this criteria, you need to vote on an actual archive of
>> files, otherwise there is no cogent proof of the release being from that
>> specific git tag).
>>
>> Here's what I found in your source release:
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/sql/hive/src/test/resources/data/files/TestSerDe.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/sql/hive/src/test/resources/regression-test-SPARK-8489/test.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/sql/hive/src/test/resources/TestUDTF.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/R/pkg/inst/test_support/sparktestjar_2.10-1.0.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/zinc-0.3.5.3/lib/scala-reflect.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/zinc-0.3.5.3/lib/sbt-interface.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/zinc-0.3.5.3/lib/compiler-interface-sources.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/zinc-0.3.5.3/lib/incremental-compiler.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/zinc-0.3.5.3/lib/scala-compiler.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/zinc-0.3.5.3/lib/zinc.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/zinc-0.3.5.3/lib/scala-library.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/scala-2.10.4/misc/scala-devel/plugins/continuations.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/scala-2.10.4/lib/scala-reflect.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/scala-2.10.4/lib/akka-actors.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/scala-2.10.4/lib/typesafe-config.jar
>>
>> Binary application (application/jar; charset=binary) found in
>> spark-1.5.1/build/scala-2.10.4/lib/scala-actors-migration.jar
>>
>> Binary application 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Sean Owen
Agree, but we are talking about the build/ bit right?

I don't agree that it invalidates the release, which is probably the more
important idea. As a point of process, you would not want to modify and
republish the artifact that was already released after being voted on -
unless it was invalid in which case we spin up 1.5.1.1 or something.

But that build/ directory should go in future releases.

I think he is talking about more than this though and the other jars look
like they are part of tests, and still nothing to do with Spark binaries.
Those can and should stay.

On Mon, Oct 12, 2015, 5:35 AM Patrick Wendell  wrote:

> I think Daniel is correct here. The source artifact incorrectly includes
> jars. It is inadvertent and not part of our intended release process. This
> was something I noticed in Spark 1.5.0 and filed a JIRA and was fixed by
> updating our build scripts to fix it. However, our build environment was
> not using the most current version of the build scripts. See related links:
>
> https://issues.apache.org/jira/browse/SPARK-10511
> https://github.com/apache/spark/pull/8774/files
>
> I can update our build environment and we can repackage the Spark 1.5.1
> source tarball. To not include sources.
>
>
> - Patrick
>
> On Sun, Oct 11, 2015 at 8:53 AM, Sean Owen  wrote:
>
>> Daniel: we did not vote on a tag. Please again read the VOTE email I
>> linked to you:
>>
>>
>> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none
>>
>> among other things, it contains a link to the concrete source (and
>> binary) distribution under vote:
>>
>> http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/
>>
>> You can still examine it, sure.
>>
>> Dependencies are *not* bundled in the source release. You're again
>> misunderstanding what you are seeing. Read my email again.
>>
>> I am still pretty confused about what the problem is. This is entirely
>> business as usual for ASF projects. I'll follow up with you offline if
>> you have any more doubts.
>>
>> On Sun, Oct 11, 2015 at 4:49 PM, Daniel Gruno 
>> wrote:
>> > Here's my issue:
>> >
>> > How am I to audit that the dependencies you bundle are in fact what you
>> > claim they are?  How do I know they don't contain malware or - in light
>> > of recent events - emissions test rigging? ;)
>> >
>> > I am not interested in a git tag - that means nothing in the ASF voting
>> > process, you cannot vote on a tag, only on a release candidate. The VCS
>> > in use is irrelevant in this issue. If you can point me to a release
>> > candidate archive that was voted upon and does not contain binary
>> > applications, all is well.
>> >
>> > If there is no such thing, and we cannot come to an understanding, I
>> > will exercise my ASF Members' rights and bring this to the attention of
>> > the board of directors and ask for a clarification of the legality of this.
>> >
>> > I find it highly irregular. Perhaps it is something some projects do in
>> > the Java community, but that doesn't make it permissible in my view.
>> >
>> > With regards,
>> > Daniel.
>> >
>> >
>> > On 10/11/2015 05:42 PM, Sean Owen wrote:
>> >> Still confused. Why are you saying we didn't vote on an archive? refer
>> >> to the email I linked, which includes both the git tag and a link to
>> >> all generated artifacts (also in my email).
>> >>
>> >> So, there are two things at play here:
>> >>
>> >> First, I am not sure what you mean that a source distro can't have
>> >> binary files. It's supposed to have the source code of Spark, and
>> >> shouldn't contain binary Spark. Nothing you listed are Spark binaries.
>> >> However, a distribution might have a lot of things in it that support
>> >> the source build, like copies of tools, test files, etc.  That
>> >> explains I think the first couple lines that you identified.
>> >>
>> >> Still, I am curious why you are saying that would invalidate a source
>> >> release? I have never heard anything like that.
>> >>
>> >> Second, I do think there are some binaries in here that aren't
>> >> supposed to be there, like the build/ directory stuff. IIRC these were
>> >> included accidentally and won't be in the next release. At least, I
>> >> don't see why they need to be bundled. These are just local copies of
>> >> third party tools though, and don't really matter. As it happens, the
>> >> licenses that get distributed with the source distro even cover all of
>> >> this stuff. I think that's not supposed to be there, but, also don't
>> >> see it's 'invalid' as a result.
>> >>
>> >>
>> >> On Sun, Oct 11, 2015 at 4:33 PM, Daniel Gruno wrote:
>> >>> On 10/11/2015 05:29 PM, Sean Owen wrote:
>> >>>> Of course, but what's making you think this was a binary-only
>> >>>> distribution?
>> >>>
>> >>> I'm not saying binary-only, I am saying your source release contains
>> >>> binary programs, which would 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Patrick Wendell
I think Daniel is correct here. The source artifact incorrectly includes
jars. It is inadvertent and not part of our intended release process. This
was something I noticed in Spark 1.5.0 and filed a JIRA and was fixed by
updating our build scripts to fix it. However, our build environment was
not using the most current version of the build scripts. See related links:

https://issues.apache.org/jira/browse/SPARK-10511
https://github.com/apache/spark/pull/8774/files

I can update our build environment and we can repackage the Spark 1.5.1
source tarball. To not include sources.

- Patrick

On Sun, Oct 11, 2015 at 8:53 AM, Sean Owen  wrote:

> Daniel: we did not vote on a tag. Please again read the VOTE email I
> linked to you:
>
>
> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none
>
> among other things, it contains a link to the concrete source (and
> binary) distribution under vote:
>
> http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/
>
> You can still examine it, sure.
>
> Dependencies are *not* bundled in the source release. You're again
> misunderstanding what you are seeing. Read my email again.
>
> I am still pretty confused about what the problem is. This is entirely
> business as usual for ASF projects. I'll follow up with you offline if
> you have any more doubts.
>
> On Sun, Oct 11, 2015 at 4:49 PM, Daniel Gruno 
> wrote:
> > Here's my issue:
> >
> > How am I to audit that the dependencies you bundle are in fact what you
> > claim they are?  How do I know they don't contain malware or - in light
> > of recent events - emissions test rigging? ;)
> >
> > I am not interested in a git tag - that means nothing in the ASF voting
> > process, you cannot vote on a tag, only on a release candidate. The VCS
> > in use is irrelevant in this issue. If you can point me to a release
> > candidate archive that was voted upon and does not contain binary
> > applications, all is well.
> >
> > If there is no such thing, and we cannot come to an understanding, I
> > will exercise my ASF Members' rights and bring this to the attention of
> > the board of directors and ask for a clarification of the legality of this.
> >
> > I find it highly irregular. Perhaps it is something some projects do in
> > the Java community, but that doesn't make it permissible in my view.
> >
> > With regards,
> > Daniel.
> >
> >
> > On 10/11/2015 05:42 PM, Sean Owen wrote:
> >> Still confused. Why are you saying we didn't vote on an archive? refer
> >> to the email I linked, which includes both the git tag and a link to
> >> all generated artifacts (also in my email).
> >>
> >> So, there are two things at play here:
> >>
> >> First, I am not sure what you mean that a source distro can't have
> >> binary files. It's supposed to have the source code of Spark, and
> >> shouldn't contain binary Spark. Nothing you listed are Spark binaries.
> >> However, a distribution might have a lot of things in it that support
> >> the source build, like copies of tools, test files, etc.  That
> >> explains I think the first couple lines that you identified.
> >>
> >> Still, I am curious why you are saying that would invalidate a source
> >> release? I have never heard anything like that.
> >>
> >> Second, I do think there are some binaries in here that aren't
> >> supposed to be there, like the build/ directory stuff. IIRC these were
> >> included accidentally and won't be in the next release. At least, I
> >> don't see why they need to be bundled. These are just local copies of
> >> third party tools though, and don't really matter. As it happens, the
> >> licenses that get distributed with the source distro even cover all of
> >> this stuff. I think that's not supposed to be there, but, also don't
> >> see it's 'invalid' as a result.
> >>
> >>
> >> On Sun, Oct 11, 2015 at 4:33 PM, Daniel Gruno wrote:
> >>> On 10/11/2015 05:29 PM, Sean Owen wrote:
> >>>> Of course, but what's making you think this was a binary-only
> >>>> distribution?
> >>>
> >>> I'm not saying binary-only, I am saying your source release contains
> >>> binary programs, which would invalidate a release vote. Is there a
> >>> release candidate package, that is voted on (saying you have a git tag
> >>> does not satisfy this criteria, you need to vote on an actual archive of
> >>> files, otherwise there is no cogent proof of the release being from that
> >>> specific git tag).
> >>>
> >>> Here's what I found in your source release:
> >>>
> >>> Binary application (application/jar; charset=binary) found in
> >>> spark-1.5.1/sql/hive/src/test/resources/data/files/TestSerDe.jar
> >>>
> >>> Binary application (application/jar; charset=binary) found in
> >>> spark-1.5.1/sql/hive/src/test/resources/regression-test-SPARK-8489/test.jar
> >>>
> >>> Binary application (application/jar; charset=binary) found in
> >>> 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Patrick Wendell
*to not include binaries.

On Sun, Oct 11, 2015 at 9:35 PM, Patrick Wendell  wrote:

> I think Daniel is correct here. The source artifact incorrectly includes
> jars. It is inadvertent and not part of our intended release process. This
> was something I noticed in Spark 1.5.0 and filed a JIRA and was fixed by
> updating our build scripts to fix it. However, our build environment was
> not using the most current version of the build scripts. See related links:
>
> https://issues.apache.org/jira/browse/SPARK-10511
> https://github.com/apache/spark/pull/8774/files
>
> I can update our build environment and we can repackage the Spark 1.5.1
> source tarball. To not include sources.
>
> - Patrick
>
> On Sun, Oct 11, 2015 at 8:53 AM, Sean Owen  wrote:
>
>> Daniel: we did not vote on a tag. Please again read the VOTE email I
>> linked to you:
>>
>>
>> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none
>>
>> among other things, it contains a link to the concrete source (and
>> binary) distribution under vote:
>>
>> http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/
>>
>> You can still examine it, sure.
>>
>> Dependencies are *not* bundled in the source release. You're again
>> misunderstanding what you are seeing. Read my email again.
>>
>> I am still pretty confused about what the problem is. This is entirely
>> business as usual for ASF projects. I'll follow up with you offline if
>> you have any more doubts.
>>
>> On Sun, Oct 11, 2015 at 4:49 PM, Daniel Gruno 
>> wrote:
>> > Here's my issue:
>> >
>> > How am I to audit that the dependencies you bundle are in fact what you
>> > claim they are?  How do I know they don't contain malware or - in light
>> > of recent events - emissions test rigging? ;)
>> >
>> > I am not interested in a git tag - that means nothing in the ASF voting
>> > process, you cannot vote on a tag, only on a release candidate. The VCS
>> > in use is irrelevant in this issue. If you can point me to a release
>> > candidate archive that was voted upon and does not contain binary
>> > applications, all is well.
>> >
>> > If there is no such thing, and we cannot come to an understanding, I
>> > will exercise my ASF Members' rights and bring this to the attention of
>> > the board of directors and ask for a clarification of the legality of this.
>> >
>> > I find it highly irregular. Perhaps it is something some projects do in
>> > the Java community, but that doesn't make it permissible in my view.
>> >
>> > With regards,
>> > Daniel.
>> >
>> >
>> > On 10/11/2015 05:42 PM, Sean Owen wrote:
>> >> Still confused. Why are you saying we didn't vote on an archive? refer
>> >> to the email I linked, which includes both the git tag and a link to
>> >> all generated artifacts (also in my email).
>> >>
>> >> So, there are two things at play here:
>> >>
>> >> First, I am not sure what you mean that a source distro can't have
>> >> binary files. It's supposed to have the source code of Spark, and
>> >> shouldn't contain binary Spark. Nothing you listed are Spark binaries.
>> >> However, a distribution might have a lot of things in it that support
>> >> the source build, like copies of tools, test files, etc.  That
>> >> explains I think the first couple lines that you identified.
>> >>
>> >> Still, I am curious why you are saying that would invalidate a source
>> >> release? I have never heard anything like that.
>> >>
>> >> Second, I do think there are some binaries in here that aren't
>> >> supposed to be there, like the build/ directory stuff. IIRC these were
>> >> included accidentally and won't be in the next release. At least, I
>> >> don't see why they need to be bundled. These are just local copies of
>> >> third party tools though, and don't really matter. As it happens, the
>> >> licenses that get distributed with the source distro even cover all of
>> >> this stuff. I think that's not supposed to be there, but, also don't
>> >> see it's 'invalid' as a result.
>> >>
>> >>
>> >> On Sun, Oct 11, 2015 at 4:33 PM, Daniel Gruno wrote:
>> >>> On 10/11/2015 05:29 PM, Sean Owen wrote:
>> >>>> Of course, but what's making you think this was a binary-only
>> >>>> distribution?
>> >>>
>> >>> I'm not saying binary-only, I am saying your source release contains
>> >>> binary programs, which would invalidate a release vote. Is there a
>> >>> release candidate package, that is voted on (saying you have a git tag
>> >>> does not satisfy this criteria, you need to vote on an actual archive of
>> >>> files, otherwise there is no cogent proof of the release being from that
>> >>> specific git tag).
>> >>>
>> >>> Here's what I found in your source release:
>> >>>
>> >>> Binary application (application/jar; charset=binary) found in
>> >>> spark-1.5.1/sql/hive/src/test/resources/data/files/TestSerDe.jar
>> >>>
>> >>> Binary application 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Patrick Wendell
Oh I see - yes it's the build/. I always thought release votes related to a
source tag rather than specific binaries. But maybe we can just fix it in
1.5.2 if there is concern about mutating binaries. It seems reasonable to
me.

For tests... in the past we've tried to avoid having jars inside of the
source tree, including some effort to generate jars on the fly which a lot
of our tests use. I am not sure whether it's a firm policy that you can't
have jars in test folders, though. If it is, we could probably do some
magic to get rid of these few ones that have crept in.

- Patrick

On Sun, Oct 11, 2015 at 9:57 PM, Sean Owen  wrote:

> Agree, but we are talking about the build/ bit right?
>
> I don't agree that it invalidates the release, which is probably the more
> important idea. As a point of process, you would not want to modify and
> republish the artifact that was already released after being voted on -
> unless it was invalid in which case we spin up 1.5.1.1 or something.
>
> But that build/ directory should go in future releases.
>
> I think he is talking about more than this though and the other jars look
> like they are part of tests, and still nothing to do with Spark binaries.
> Those can and should stay.
>
> On Mon, Oct 12, 2015, 5:35 AM Patrick Wendell  wrote:
>
>> I think Daniel is correct here. The source artifact incorrectly includes
>> jars. It is inadvertent and not part of our intended release process. This
>> was something I noticed in Spark 1.5.0 and filed a JIRA and was fixed by
>> updating our build scripts to fix it. However, our build environment was
>> not using the most current version of the build scripts. See related links:
>>
>> https://issues.apache.org/jira/browse/SPARK-10511
>> https://github.com/apache/spark/pull/8774/files
>>
>> I can update our build environment and we can repackage the Spark 1.5.1
>> source tarball. To not include sources.
>>
>>
>> - Patrick
>>
>> On Sun, Oct 11, 2015 at 8:53 AM, Sean Owen  wrote:
>>
>>> Daniel: we did not vote on a tag. Please again read the VOTE email I
>>> linked to you:
>>>
>>>
>>> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none
>>>
>>> among other things, it contains a link to the concrete source (and
>>> binary) distribution under vote:
>>>
>>> http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/
>>>
>>> You can still examine it, sure.
>>>
>>> Dependencies are *not* bundled in the source release. You're again
>>> misunderstanding what you are seeing. Read my email again.
>>>
>>> I am still pretty confused about what the problem is. This is entirely
>>> business as usual for ASF projects. I'll follow up with you offline if
>>> you have any more doubts.
>>>
>>> On Sun, Oct 11, 2015 at 4:49 PM, Daniel Gruno 
>>> wrote:
>>> > Here's my issue:
>>> >
>>> > How am I to audit that the dependencies you bundle are in fact what you
>>> > claim they are?  How do I know they don't contain malware or - in light
>>> > of recent events - emissions test rigging? ;)
>>> >
>>> > I am not interested in a git tag - that means nothing in the ASF voting
>>> > process, you cannot vote on a tag, only on a release candidate. The VCS
>>> > in use is irrelevant in this issue. If you can point me to a release
>>> > candidate archive that was voted upon and does not contain binary
>>> > applications, all is well.
>>> >
>>> > If there is no such thing, and we cannot come to an understanding, I
>>> > will exercise my ASF Members' rights and bring this to the attention of
>>> > the board of directors and ask for a clarification of the legality of this.
>>> >
>>> > I find it highly irregular. Perhaps it is something some projects do in
>>> > the Java community, but that doesn't make it permissible in my view.
>>> >
>>> > With regards,
>>> > Daniel.
>>> >
>>> >
>>> > On 10/11/2015 05:42 PM, Sean Owen wrote:
>>> >> Still confused. Why are you saying we didn't vote on an archive? refer
>>> >> to the email I linked, which includes both the git tag and a link to
>>> >> all generated artifacts (also in my email).
>>> >>
>>> >> So, there are two things at play here:
>>> >>
>>> >> First, I am not sure what you mean that a source distro can't have
>>> >> binary files. It's supposed to have the source code of Spark, and
>>> >> shouldn't contain binary Spark. Nothing you listed are Spark binaries.
>>> >> However, a distribution might have a lot of things in it that support
>>> >> the source build, like copies of tools, test files, etc.  That
>>> >> explains I think the first couple lines that you identified.
>>> >>
>>> >> Still, I am curious why you are saying that would invalidate a source
>>> >> release? I have never heard anything like that.
>>> >>
>>> >> Second, I do think there are some binaries in here that aren't
>>> >> supposed to be there, like the build/ directory stuff. IIRC these were
>>> >> 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Patrick Wendell
Yeah I mean I definitely think we're not violating the *spirit* of the "no
binaries" policy, in that we do not include any binary code that is used at
runtime. This is because the binaries we distribute relate only to build
and testing.

Whether we are violating the *letter* of the policy, I'm not so sure. In
the very strictest interpretation of "there cannot be any binary files in
your downloaded tarball" - we aren't honoring that. We got a lot of people
complaining about the sbt jar for instance when we were in the incubator. I
found those complaints a little pedantic, but we ended up removing it from
our source tree and adding things to download it for the user.

- Patrick

On Sun, Oct 11, 2015 at 10:12 PM, Sean Owen  wrote:

> No we are voting on the artifacts being released (too) in principle.
> Although of course the artifacts should be a deterministic function of the
> source at a certain point in time.
>
> I think the concern is about putting Spark binaries or its dependencies
> into a source release. That should not happen, but it is not what has
> happened here.
>
> On Mon, Oct 12, 2015, 6:03 AM Patrick Wendell  wrote:
>
>> Oh I see - yes it's the build/. I always thought release votes related to
>> a source tag rather than specific binaries. But maybe we can just fix it in
>> 1.5.2 if there is concern about mutating binaries. It seems reasonable to
>> me.
>>
>> For tests... in the past we've tried to avoid having jars inside of the
>> source tree, including some effort to generate jars on the fly which a lot
>> of our tests use. I am not sure whether it's a firm policy that you can't
>> have jars in test folders, though. If it is, we could probably do some
>> magic to get rid of these few ones that have crept in.
>>
>> - Patrick
>>
>> On Sun, Oct 11, 2015 at 9:57 PM, Sean Owen  wrote:
>>
>>> Agree, but we are talking about the build/ bit right?
>>>
>>> I don't agree that it invalidates the release, which is probably the
>>> more important idea. As a point of process, you would not want to modify
>>> and republish the artifact that was already released after being voted on -
>>> unless it was invalid in which case we spin up 1.5.1.1 or something.
>>>
>>> But that build/ directory should go in future releases.
>>>
>>> I think he is talking about more than this though and the other jars
>>> look like they are part of tests, and still nothing to do with Spark
>>> binaries. Those can and should stay.
>>>
>>> On Mon, Oct 12, 2015, 5:35 AM Patrick Wendell 
>>> wrote:
>>>
>>>> I think Daniel is correct here. The source artifact incorrectly
>>>> includes jars. It is inadvertent and not part of our intended release
>>>> process. This was something I noticed in Spark 1.5.0 and filed a JIRA and
>>>> was fixed by updating our build scripts to fix it. However, our build
>>>> environment was not using the most current version of the build scripts.
>>>> See related links:
>>>>
>>>> https://issues.apache.org/jira/browse/SPARK-10511
>>>> https://github.com/apache/spark/pull/8774/files
>>>>
>>>> I can update our build environment and we can repackage the Spark 1.5.1
>>>> source tarball. To not include sources.
>>>>
>>>>
>>>> - Patrick
>>>>
>>>> On Sun, Oct 11, 2015 at 8:53 AM, Sean Owen  wrote:
>>>>
>>>>> Daniel: we did not vote on a tag. Please again read the VOTE email I
>>>>> linked to you:
>>>>>
>>>>>
>>>>> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none
>>>>>
>>>>> among other things, it contains a link to the concrete source (and
>>>>> binary) distribution under vote:
>>>>>
>>>>> http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/
>>>>>
>>>>> You can still examine it, sure.
>>>>>
>>>>> Dependencies are *not* bundled in the source release. You're again
>>>>> misunderstanding what you are seeing. Read my email again.
>>>>>
>>>>> I am still pretty confused about what the problem is. This is entirely
>>>>> business as usual for ASF projects. I'll follow up with you offline if
>>>>> you have any more doubts.
>>>>>
>>>>> On Sun, Oct 11, 2015 at 4:49 PM, Daniel Gruno 
>>>>> wrote:
>>>>> > Here's my issue:
>>>>> >
>>>>> > How am I to audit that the dependencies you bundle are in fact what you
>>>>> > claim they are?  How do I know they don't contain malware or - in light
>>>>> > of recent events - emissions test rigging? ;)
>>>>> >
>>>>> > I am not interested in a git tag - that means nothing in the ASF voting
>>>>> > process, you cannot vote on a tag, only on a release candidate. The VCS
>>>>> > in use is irrelevant in this issue. If you can point me to a release
>>>>> > candidate archive that was voted upon and does not contain binary
>>>>> > applications, all is well.
>>>>> >
>>>>> > If there is no such thing, and we cannot come to an understanding, I
>>>>> > will exercise my ASF 

Re: [ANNOUNCE] Announcing Spark 1.5.1

2015-10-11 Thread Sean Owen
No we are voting on the artifacts being released (too) in principle.
Although of course the artifacts should be a deterministic function of the
source at a certain point in time.

I think the concern is about putting Spark binaries or its dependencies
into a source release. That should not happen, but it is not what has
happened here.
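
One way to read "a deterministic function of the source" is that packaging the same source twice yields byte-identical archives, which can be checked by comparing digests. The sketch below illustrates that idea only; it is not how the Spark release scripts work. The helper names, the choice of an uncompressed ustar archive, and the sample file names are all assumptions made for illustration.

```python
import hashlib
import io
import tarfile


def build_archive(files):
    """Pack {name: bytes} into an uncompressed tar with normalized metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed member order
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = 0  # no build-time timestamps
            info.uid = info.gid = 0  # no builder-specific ownership
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def digest(data):
    """Hex SHA-256 digest of the archive bytes."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    source = {"README": b"hi\n", "src/Main.scala": b"object Main\n"}
    first = digest(build_archive(source))
    second = digest(build_archive(dict(reversed(list(source.items())))))
    print("reproducible:", first == second)
```

The archive is left uncompressed here because gzip output can embed a timestamp, which would have to be pinned separately.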

On Mon, Oct 12, 2015, 6:03 AM Patrick Wendell  wrote:

> Oh I see - yes it's the build/. I always thought release votes related to
> a source tag rather than specific binaries. But maybe we can just fix it in
> 1.5.2 if there is concern about mutating binaries. It seems reasonable to
> me.
>
> For tests... in the past we've tried to avoid having jars inside of the
> source tree, including some effort to generate jars on the fly which a lot
> of our tests use. I am not sure whether it's a firm policy that you can't
> have jars in test folders, though. If it is, we could probably do some
> magic to get rid of these few ones that have crept in.
>
> - Patrick
>
> On Sun, Oct 11, 2015 at 9:57 PM, Sean Owen  wrote:
>
>> Agreed, but we are talking about the build/ bit, right?
>>
>> I don't agree that it invalidates the release, which is probably the more
>> important idea. As a point of process, you would not want to modify and
>> republish the artifact that was already released after being voted on -
>> unless it was invalid in which case we spin up 1.5.1.1 or something.
>>
>> But that build/ directory should go in future releases.
>>
>> I think he is talking about more than this though and the other jars look
>> like they are part of tests, and still nothing to do with Spark binaries.
>> Those can and should stay.
>>
>> On Mon, Oct 12, 2015, 5:35 AM Patrick Wendell  wrote:
>>
>>> I think Daniel is correct here. The source artifact incorrectly includes
>>> jars. It is inadvertent and not part of our intended release process. I
>>> noticed this in Spark 1.5.0 and filed a JIRA, which was fixed by updating
>>> our build scripts. However, our build environment was not using the most
>>> current version of the build scripts. See related links:
>>>
>>> https://issues.apache.org/jira/browse/SPARK-10511
>>> https://github.com/apache/spark/pull/8774/files
>>>
>>> I can update our build environment and we can repackage the Spark 1.5.1
>>> source tarball so that it does not include the jars.
>>>
>>>
>>> - Patrick
>>>
>>> On Sun, Oct 11, 2015 at 8:53 AM, Sean Owen  wrote:
>>>
>>>> Daniel: we did not vote on a tag. Please again read the VOTE email I
>>>> linked to you:
>>>>
>>>> http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-5-1-RC1-tt14310.html#none
>>>>
>>>> among other things, it contains a link to the concrete source (and
>>>> binary) distribution under vote:
>>>>
>>>> http://people.apache.org/~pwendell/spark-releases/spark-1.5.1-rc1-bin/
>>>>
>>>> You can still examine it, sure.
>>>>
>>>> Dependencies are *not* bundled in the source release. You're again
>>>> misunderstanding what you are seeing. Read my email again.
>>>>
>>>> I am still pretty confused about what the problem is. This is entirely
>>>> business as usual for ASF projects. I'll follow up with you offline if
>>>> you have any more doubts.
>>>>
>>>> On Sun, Oct 11, 2015 at 4:49 PM, Daniel Gruno wrote:
>>>> > Here's my issue:
>>>> >
>>>> > How am I to audit that the dependencies you bundle are in fact what you
>>>> > claim they are?  How do I know they don't contain malware or - in light
>>>> > of recent events - emissions test rigging? ;)
>>>> >
>>>> > I am not interested in a git tag - that means nothing in the ASF voting
>>>> > process, you cannot vote on a tag, only on a release candidate. The VCS
>>>> > in use is irrelevant in this issue. If you can point me to a release
>>>> > candidate archive that was voted upon and does not contain binary
>>>> > applications, all is well.
>>>> >
>>>> > If there is no such thing, and we cannot come to an understanding, I
>>>> > will exercise my ASF Members' rights and bring this to the attention of
>>>> > the board of directors and ask for a clarification of the legality of
>>>> > this.
>>>> >
>>>> > I find it highly irregular. Perhaps it is something some projects do in
>>>> > the Java community, but that doesn't make it permissible in my view.
>>>> >
>>>> > With regards,
>>>> > Daniel.
>>>> >
>>>> >
>>>> > On 10/11/2015 05:42 PM, Sean Owen wrote:
>>>> >> Still confused. Why are you saying we didn't vote on an archive?
>>>> >> refer to the email I linked, which includes both the git tag and a
>>>> >> link to all generated artifacts (also in my email).
>>>> >>
>>>> >> So, there are two things at play here:
>>>> >>
>>>> >> First, I am not sure what you mean that a source distro can't have
>>>> >> binary files. It's supposed to have the source code of Spark, and
>>>> >> shouldn't contain binary 

[ANNOUNCE] Announcing Spark 1.5.1

2015-10-01 Thread Reynold Xin
Hi All,

Spark 1.5.1 is a maintenance release containing stability fixes. This
release is based on the branch-1.5 maintenance branch of Spark. We
*strongly recommend* that all 1.5.0 users upgrade to this release.

The full list of bug fixes is here: http://s.apache.org/spark-1.5.1

http://spark.apache.org/releases/spark-release-1-5-1.html


(Note: it can take a few hours for everything to propagate, so you
might get a 404 on some download links, but everything should already be
in Maven Central.)