Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-28 Thread Patrick Wendell
Hey Folks,

I updated the site to include the Apache mirror list for each file. I
actually put it as the first download location, before the direct
download link.

I played around with the mirror network a bit, and the performance was not
bad, based on sampling from a few vantage points. I found it to be
always worse than CloudFront, but typically not *much* worse. So I
actually think that if we can find a way to link directly to the
nearest Apache mirror, we could just remove the CloudFront link
entirely.

I looked into it, and apparently we're not the first Apache project to
have this problem. Many of the bigger projects already use some fancy
selection to embed a direct link to the closest mirror:

http://httpd.apache.org/download.cgi
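
A minimal sketch of how a download page or script might turn a mirror selection into one direct link. It assumes the mirror script can return its choice as JSON (for example through an `as_json=1` style query parameter) with a `preferred` mirror base URL and a `path_info` file path. That parameter, both field names, and the sample reply are assumptions made for illustration, not something confirmed in this thread.

```python
import json


def preferred_mirror_url(closer_json: str) -> str:
    """Build a direct download URL from a closer.cgi-style JSON reply."""
    reply = json.loads(closer_json)
    # Assumed fields: "preferred" is the mirror base URL picked for the
    # client, "path_info" is the requested file path relative to that base.
    base = reply["preferred"].rstrip("/")
    path = reply["path_info"].lstrip("/")
    return f"{base}/{path}"


# Hypothetical reply, written by hand for illustration (not captured output).
sample_reply = json.dumps({
    "preferred": "http://mirror.example.org/apache/",
    "path_info": "incubator/spark/spark-0.8.0-incubating/"
                 "spark-0.8.0-incubating-bin-cdh4.tgz",
})
direct_url = preferred_mirror_url(sample_reply)
```

Joining the two fields on the client side keeps the posted link stable while the mirror choice stays with the ASF script.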

- Patrick

On Fri, Sep 27, 2013 at 10:10 AM, Chris Mattmann  wrote:
> Yeah the sigs are usually available from here:
>
> http://www.apache.org/dist//../
>
> So for us:
>
> http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/
>
>
> Cheers,
> Chris
>
>
>
> -Original Message-
> From: Matei Zaharia 
> Reply-To: "dev@spark.incubator.apache.org" 
> Date: Friday, September 27, 2013 10:06 AM
> To: "dev@spark.incubator.apache.org" 
> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
>
>>If the mirrors don't have the signatures, then we should probably link to
>>the mirrors and the signatures separately. It's definitely important to
>>have a link to the mirrors so people can get this through ASF
>>infrastructure without hitting only the main server.
>>
>>It's true that they don't seem to have them, even for other projects --
>>for example check out
>>http://mirror.tcpdiag.net/apache/hadoop/common/hadoop-2.0.5-alpha/.
>>
>>Matei
>>
>>On Sep 27, 2013, at 12:04 AM, Patrick Wendell  wrote:
>>
>>> On Thu, Sep 26, 2013 at 8:27 PM, Chris Mattmann 
>>>wrote:
>>>> Hey Matei yep they have the signatures on them too.
>>>>
>>>> Cheers,
>>>> Chris
>>>
>>> Chris - I checked a bunch of the mirrors and none of them have the
>>> signatures... am I looking in the wrong place?
>>>
>>>
>>> http://www.interior-dsgn.com/apache/incubator/spark/spark-0.8.0-incubating/
>>> http://mirror.tcpdiag.net/apache/incubator/spark/spark-0.8.0-incubating/
>>> http://apache.tradebit.com/pub/incubator/spark/spark-0.8.0-incubating/
>>>
>>> In response to the comment about thousands of downloads - I didn't
>>> mean at all to suggest that Apache couldn't handle this number, in
>>> fact, I'm sure they can! I just wanted to point out that we are
>>> keeping in mind the existing habits and processes of our user base,
>>> and trying to make the transition smooth for them.
>>>
>>> - Patrick
>>
>
>


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-27 Thread Chris Mattmann
Yeah the sigs are usually available from here:

http://www.apache.org/dist//../

So for us:

http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/
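
A rough sketch of the "bits from a mirror, hashes from dist" workflow: download the artifact from any mirror, then compare its digest with the value published under the dist directory above. The function names, the default `sha512` algorithm, and the assumption that the published file holds only a hex digest are illustrative; the actual checksum files may use a different algorithm or layout.

```python
import hashlib
import hmac


def file_digest(path: str, algorithm: str = "sha512") -> str:
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, published_hex: str,
                             algorithm: str = "sha512") -> bool:
    """Compare a local file against a digest copied from the dist site."""
    # Published digests are often upper-case or split by spaces and newlines,
    # so strip whitespace and lower-case before comparing.
    normalized = "".join(published_hex.split()).lower()
    return hmac.compare_digest(file_digest(path, algorithm), normalized)
```

A matching digest only shows the mirror copy is byte-identical to the published one; checking the `.asc` signature against the project KEYS file is a separate step.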


Cheers,
Chris



-Original Message-
From: Matei Zaharia 
Reply-To: "dev@spark.incubator.apache.org" 
Date: Friday, September 27, 2013 10:06 AM
To: "dev@spark.incubator.apache.org" 
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

>If the mirrors don't have the signatures, then we should probably link to
>the mirrors and the signatures separately. It's definitely important to
>have a link to the mirrors so people can get this through ASF
>infrastructure without hitting only the main server.
>
>It's true that they don't seem to have them, even for other projects --
>for example check out
>http://mirror.tcpdiag.net/apache/hadoop/common/hadoop-2.0.5-alpha/.
>
>Matei
>
>On Sep 27, 2013, at 12:04 AM, Patrick Wendell  wrote:
>
>> On Thu, Sep 26, 2013 at 8:27 PM, Chris Mattmann 
>>wrote:
>>> Hey Matei yep they have the signatures on them too.
>>> 
>>> Cheers,
>>> Chris
>> 
>> Chris - I checked a bunch of the mirrors and none of them have the
>> signatures... am I looking in the wrong place?
>> 
>> 
>> http://www.interior-dsgn.com/apache/incubator/spark/spark-0.8.0-incubating/
>> http://mirror.tcpdiag.net/apache/incubator/spark/spark-0.8.0-incubating/
>> http://apache.tradebit.com/pub/incubator/spark/spark-0.8.0-incubating/
>> 
>> In response to the comment about thousands of downloads - I didn't
>> mean at all to suggest that Apache couldn't handle this number, in
>> fact, I'm sure they can! I just wanted to point out that we are
>> keeping in mind the existing habits and processes of our user base,
>> and trying to make the transition smooth for them.
>> 
>> - Patrick
>




Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-27 Thread Matei Zaharia
If the mirrors don't have the signatures, then we should probably link to the 
mirrors and the signatures separately. It's definitely important to have a link 
to the mirrors so people can get this through ASF infrastructure without 
hitting only the main server.

It's true that they don't seem to have them, even for other projects -- for 
example check out 
http://mirror.tcpdiag.net/apache/hadoop/common/hadoop-2.0.5-alpha/.

Matei

On Sep 27, 2013, at 12:04 AM, Patrick Wendell  wrote:

> On Thu, Sep 26, 2013 at 8:27 PM, Chris Mattmann  wrote:
>> Hey Matei yep they have the signatures on them too.
>> 
>> Cheers,
>> Chris
> 
> Chris - I checked a bunch of the mirrors and none of them have the
> signatures... am I looking in the wrong place?
> 
> http://www.interior-dsgn.com/apache/incubator/spark/spark-0.8.0-incubating/
> http://mirror.tcpdiag.net/apache/incubator/spark/spark-0.8.0-incubating/
> http://apache.tradebit.com/pub/incubator/spark/spark-0.8.0-incubating/
> 
> In response to the comment about thousands of downloads - I didn't
> mean at all to suggest that Apache couldn't handle this number, in
> fact, I'm sure they can! I just wanted to point out that we are
> keeping in mind the existing habits and processes of our user base,
> and trying to make the transition smooth for them.
> 
> - Patrick



Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Patrick Wendell
On Thu, Sep 26, 2013 at 8:27 PM, Chris Mattmann  wrote:
> Hey Matei yep they have the signatures on them too.
>
> Cheers,
> Chris

Chris - I checked a bunch of the mirrors and none of them have the
signatures... am I looking in the wrong place?

http://www.interior-dsgn.com/apache/incubator/spark/spark-0.8.0-incubating/
http://mirror.tcpdiag.net/apache/incubator/spark/spark-0.8.0-incubating/
http://apache.tradebit.com/pub/incubator/spark/spark-0.8.0-incubating/

In response to the comment about thousands of downloads - I didn't
mean at all to suggest that Apache couldn't handle this number; in
fact, I'm sure they can! I just wanted to point out that we are
keeping in mind the existing habits and processes of our user base,
and trying to make the transition smooth for them.

- Patrick


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Reynold Xin
Is there a way we can track the number of downloads with the apache mirrors?

On Thursday, September 26, 2013, Chris Mattmann wrote:

> Hey Matei yep they have the signatures on them too.
>
> Cheers,
> Chris
>
>
> -Original Message-
> From: Matei Zaharia >
> Reply-To: "dev@spark.incubator.apache.org " <
> dev@spark.incubator.apache.org >
> Date: Thursday, September 26, 2013 8:11 PM
> To: "dev@spark.incubator.apache.org" 
> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
>
> >Maybe we can replace the link to "official Apache download site" in the
> >release notes to point to the mirrors? Do the mirrors all have signatures
> >on them too?
> >
> >Matei
> >
> >On Sep 26, 2013, at 10:59 PM, Andy Konwinski 
> >wrote:
> >
> >> Thanks Roman and Chris,
> >>
> >> I see here http://www.apache.org/dev/release.html#mirroring that
> >>"Project
> >> download pages must link to the mirrors" but I don't see anything about
> >> ordering.
> >>
> >> I'm definitely +1 for including a link to the apache mirrors as required
> >> and providing the Cloudfront link first since this seems to satisfy the
> >> apache requirements and provide a better experience for users.
> >>
> >> Patrick. Thanks again for all your hard work on this release and for
> >> pushing back on parts of the Apache process as you go. That's how
> >> do-ocracies stay healthy and evolve.
> >> On Sep 26, 2013 7:23 PM, "Mattmann, Chris A (398J)" <
> >> chris.a.mattm...@jpl.nasa.gov> wrote:
> >>
> >>> Hi Patrick will reply in more detail later but please know that
> >>>linking to
> >>> the apache download page is not a request it's a requirement. I will
> >>> explain more in a bit.
> >>>
> >>> Cheers,
> >>> Chris
> >>>
> >>> Sent from my iPhone
> >>>
> >>> On Sep 26, 2013, at 8:09 PM, "Patrick Wendell" 
> >>>wrote:
> >>>
> >>>> Chris et al,
> >>>>
> >>>> I'm -1 on this because it has many negative consequences for our
> >>> existing users:
> >>>>
> >>>> 1. Users who do automated downloads based on our posted URL's (of
> >>>> which we get many thousands each release) will no longer work. Now if
> >>>> they do "wget XXX" with our posted link, it will fail in a weird way
> >>>> to due to the redirect page. Is there a version of the closer.cgi
> >>>> script which just performs 302 redirects instead of asking me to click
> >>>> on a link?
> >>>>
> >>>> 2. All other users have to click through an additional page to
> >>>> download the software.
> >>>>
> >>>> 3. Amazon Cloudfront is, as a whole, much more reliable and higher
> >>>> bandwidth than the mirror network.
> >>>>
> >>>> These are my concerns, that basically we're causing our users to have
> >>>> a much worse experience. I've identified these concerns with moving to
> >>>> the apache mirror, but perhaps I've overlooked some benefits that
> >>>> would counteract these. Are there benefits?
> >>>>
> >>>> I completely agree that we need to send users to the signatures and
> >>>> hashes at the Apache release site (to verify the release). So I did
> >>>> add the link to this directly adjacent to the download.
> >>>>
> >>>> - Patrick
> >>>>
> >>>> On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann 
> >>> wrote:
> >>>>> Hey Guys,
> >>>>>
> >>>>> Yep the link should by the dyn/closer.cgi link on the website and +1
> >>>>> to Roman's comment about auditing spark-project.org links to be
> >>> replaced
> >>>>> with ASF counterparts.
> >>>>>
> >>>>> Cheers,
> >>>>> Chris
> >>>>>
> >>>>>
> >>>>>
> >>>>> -Original Message



-- 

--
Reynold Xin, AMPLab, UC Berkeley
http://rxin.org


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Chris Mattmann
Hey Matei yep they have the signatures on them too.

Cheers,
Chris


-Original Message-
From: Matei Zaharia 
Reply-To: "dev@spark.incubator.apache.org" 
Date: Thursday, September 26, 2013 8:11 PM
To: "dev@spark.incubator.apache.org" 
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

>Maybe we can replace the link to "official Apache download site" in the
>release notes to point to the mirrors? Do the mirrors all have signatures
>on them too?
>
>Matei
>
>On Sep 26, 2013, at 10:59 PM, Andy Konwinski 
>wrote:
>
>> Thanks Roman and Chris,
>> 
>> I see here http://www.apache.org/dev/release.html#mirroring that
>>"Project
>> download pages must link to the mirrors" but I don't see anything about
>> ordering.
>> 
>> I'm definitely +1 for including a link to the apache mirrors as required
>> and providing the Cloudfront link first since this seems to satisfy the
>> apache requirements and provide a better experience for users.
>> 
>> Patrick. Thanks again for all your hard work on this release and for
>> pushing back on parts of the Apache process as you go. That's how
>> do-ocracies stay healthy and evolve.
>> On Sep 26, 2013 7:23 PM, "Mattmann, Chris A (398J)" <
>> chris.a.mattm...@jpl.nasa.gov> wrote:
>> 
>>> Hi Patrick will reply in more detail later but please know that
>>>linking to
>>> the apache download page is not a request it's a requirement. I will
>>> explain more in a bit.
>>> 
>>> Cheers,
>>> Chris
>>> 
>>> Sent from my iPhone
>>> 
>>> On Sep 26, 2013, at 8:09 PM, "Patrick Wendell" 
>>>wrote:
>>> 
>>>> Chris et al,
>>>> 
>>>> I'm -1 on this because it has many negative consequences for our
>>> existing users:
>>>> 
>>>> 1. Users who do automated downloads based on our posted URL's (of
>>>> which we get many thousands each release) will no longer work. Now if
>>>> they do "wget XXX" with our posted link, it will fail in a weird way
>>>> to due to the redirect page. Is there a version of the closer.cgi
>>>> script which just performs 302 redirects instead of asking me to click
>>>> on a link?
>>>> 
>>>> 2. All other users have to click through an additional page to
>>>> download the software.
>>>> 
>>>> 3. Amazon Cloudfront is, as a whole, much more reliable and higher
>>>> bandwidth than the mirror network.
>>>> 
>>>> These are my concerns, that basically we're causing our users to have
>>>> a much worse experience. I've identified these concerns with moving to
>>>> the apache mirror, but perhaps I've overlooked some benefits that
>>>> would counteract these. Are there benefits?
>>>> 
>>>> I completely agree that we need to send users to the signatures and
>>>> hashes at the Apache release site (to verify the release). So I did
>>>> add the link to this directly adjacent to the download.
>>>> 
>>>> - Patrick
>>>> 
>>>> On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann 
>>> wrote:
>>>>> Hey Guys,
>>>>> 
>>>>> Yep the link should by the dyn/closer.cgi link on the website and +1
>>>>> to Roman's comment about auditing spark-project.org links to be
>>> replaced
>>>>> with ASF counterparts.
>>>>> 
>>>>> Cheers,
>>>>> Chris
>>>>> 
>>>>> 
>>>>> 
>>>>> -Original Message-
>>>>> From: Patrick Wendell 
>>>>> Reply-To: "dev@spark.incubator.apache.org" <
>>> dev@spark.incubator.apache.org>
>>>>> Date: Wednesday, September 25, 2013 4:08 PM
>>>>> To: "dev@spark.incubator.apache.org" 
>>>>> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
>>>>> 
>>>>>> Yep, we definitely need to just directly point people the location
>>>>>>at
>>>>>> apache.org where they can find the hashes. I just updated the
>>>>>>release
>>>>>> notes and downloads page to point to that site.
>>>>>> 
>>>>>> I just wanted to point out that mirroring these through a CDN seems
>>>>>> philosophically the same as mirroring t

Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Chris Mattmann
Hi Andy,

-Original Message-

From: Andy Konwinski 
Reply-To: "dev@spark.incubator.apache.org" 
Date: Thursday, September 26, 2013 7:59 PM
To: "dev@spark.incubator.apache.org" 
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

>Thanks Roman and Chris,
>
>I see here http://www.apache.org/dev/release.html#mirroring that "Project
>download pages must link to the mirrors" but I don't see anything about
>ordering.

Technically we ought to promote Apache's mirroring system as the
project-endorsed home for the project. As an Apache member and someone who
values what the Foundation does for its projects and communities, I don't
think that's much to ask.

If you guys feel strongly about listing the CloudFront link first, I'm
open to it. I would just appreciate seeing some existing data showing that
you guys have users who have tried the ASF mirroring system and that it
hasn't performed as well as the Amazon one.

>
>I'm definitely +1 for including a link to the apache mirrors as required
>and providing the Cloudfront link first since this seems to satisfy the
>apache requirements and provide a better experience for users.
>
>Patrick. Thanks again for all your hard work on this release and for
>pushing back on parts of the Apache process as you go. That's how
>do-ocracies stay healthy and evolve.

Hear, hear. This project doesn't have a "boss" and it's not me :)
I'm just trying to spread my Apache knowledge and help you guys wear your
Apache hats too since the project lives at the ASF now. I think you'll find
the benefits of wearing those hats are many :)

Cheers,
Chris

>On Sep 26, 2013 7:23 PM, "Mattmann, Chris A (398J)" <
>chris.a.mattm...@jpl.nasa.gov> wrote:
>
>> Hi Patrick will reply in more detail later but please know that linking
>>to
>> the apache download page is not a request it's a requirement. I will
>> explain more in a bit.
>>
>> Cheers,
>> Chris
>>
>> Sent from my iPhone
>>
>> On Sep 26, 2013, at 8:09 PM, "Patrick Wendell" 
>>wrote:
>>
>> > Chris et al,
>> >
>> > I'm -1 on this because it has many negative consequences for our
>> existing users:
>> >
>> > 1. Users who do automated downloads based on our posted URL's (of
>> > which we get many thousands each release) will no longer work. Now if
>> > they do "wget XXX" with our posted link, it will fail in a weird way
>> > to due to the redirect page. Is there a version of the closer.cgi
>> > script which just performs 302 redirects instead of asking me to click
>> > on a link?
>> >
>> > 2. All other users have to click through an additional page to
>> > download the software.
>> >
>> > 3. Amazon Cloudfront is, as a whole, much more reliable and higher
>> > bandwidth than the mirror network.
>> >
>> > These are my concerns, that basically we're causing our users to have
>> > a much worse experience. I've identified these concerns with moving to
>> > the apache mirror, but perhaps I've overlooked some benefits that
>> > would counteract these. Are there benefits?
>> >
>> > I completely agree that we need to send users to the signatures and
>> > hashes at the Apache release site (to verify the release). So I did
>> > add the link to this directly adjacent to the download.
>> >
>> > - Patrick
>> >
>> > On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann 
>> wrote:
>> >> Hey Guys,
>> >>
>> >> Yep the link should by the dyn/closer.cgi link on the website and +1
>> >> to Roman's comment about auditing spark-project.org links to be
>> replaced
>> >> with ASF counterparts.
>> >>
>> >> Cheers,
>> >> Chris
>> >>
>> >>
>> >>
>> >> -Original Message-
>> >> From: Patrick Wendell 
>> >> Reply-To: "dev@spark.incubator.apache.org" <
>> dev@spark.incubator.apache.org>
>> >> Date: Wednesday, September 25, 2013 4:08 PM
>> >> To: "dev@spark.incubator.apache.org" 
>> >> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
>> >>
>> >>> Yep, we definitely need to just directly point people the location
>>at
>> >>> apache.org where they can find the hashes. I just updated the
>>release
>> >>> notes and downloads page to point to that site.
>> >>>

Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Chris Mattmann
Hey Guys,

OK flight landed, so have a sec to reply in more detail:



-Original Message-
From: Patrick Wendell 
Reply-To: "dev@spark.incubator.apache.org" 
Date: Thursday, September 26, 2013 7:02 PM
To: "dev@spark.incubator.apache.org" 
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

>Chris et al,
>
>I'm -1 on this because it has many negative consequences for our existing
>users:
>
>1. Users who do automated downloads based on our posted URL's (of
>which we get many thousands each release) will no longer work.

Apache also has many tens of thousands of downloads each release, depending
on the project. The best example I can think of is OpenOffice, which receives
10M downloads/day IIRC. Yes, OOo has some special downloading infra help too,
but beyond that, popular projects like Apache Lucene and Solr regularly see
4000+ downloads per day and the ASF mirroring system works fine.


> Now if
>they do "wget XXX" with our posted link, it will fail in a weird way
>to due to the redirect page. Is there a version of the closer.cgi
>script which just performs 302 redirects instead of asking me to click
>on a link?

You can do something like, e.g.:

curl "http://www.apache.org/dyn/closer.cgi/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz" | grep http | grep tgz | sort -n
(and then some HTML strip magic)
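
The HTML-stripping step could be sketched roughly as below, using only the Python standard library. It assumes the mirror page lists each candidate as an ordinary `<a href="...">` link ending in `.tgz`; the class name, function name, and sample page are hypothetical and written by hand, not captured closer.cgi output.

```python
from html.parser import HTMLParser


class TgzLinkCollector(HTMLParser):
    """Collect unique href values that end in .tgz, in page order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            is_tarball = name == "href" and value and value.endswith(".tgz")
            if is_tarball and value not in self.links:
                self.links.append(value)


def extract_tgz_links(html: str) -> list:
    """Return the .tgz links found in an HTML page, without duplicates."""
    collector = TgzLinkCollector()
    collector.feed(html)
    return collector.links


# Hand-written stand-in for a mirror selection page (assumption).
sample_page = """
<p>Suggested mirror:
<a href="http://mirror.example.org/a/spark.tgz">spark.tgz</a></p>
<ul>
<li><a href="http://mirror.example.org/a/spark.tgz">spark.tgz</a></li>
<li><a href="http://backup.example.net/b/spark.tgz">spark.tgz</a></li>
<li><a href="http://www.example.org/dist/KEYS">KEYS</a></li>
</ul>
"""
mirror_links = extract_tgz_links(sample_page)
```

An automated fetcher could then try `mirror_links[0]` first and fall back to the later entries if that mirror is unreachable.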

Even better if you have Apache Tika installed, something like:

tika -t "http://www.apache.org/dyn/closer.cgi/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz" | grep http | grep tgz

produces:

http://mirror.nexcess.net/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.nexcess.net/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.cs.utah.edu/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.tradebit.com/pub/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.carfab.com/apachesoftware/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.petsads.us/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.trieuvan.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirrors.ibiblio.org/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.olnevhost.net/pub/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://psg.mtu.edu/pub/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.claz.org/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.metrocast.net/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.mirrors.lucidnetworks.net/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirrors.gigenet.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.poolsaboveground.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.bizdirusa.com/mirrors/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.sdunix.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://download.nextag.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.motorlogy.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.cc.columbia.edu/pub/software/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.tcpdiag.net/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.mirrors.hoobly.com/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.eng.lsu.edu/mirrors/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.mesi.com.ar/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.symnds.com/software/Apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.reverse.net/pub/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.osuosl.org/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.interior-dsgn.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.cogentco.com/pub/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.spinellicreations.com/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-

Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Matei Zaharia
Maybe we can replace the link to "official Apache download site" in the release 
notes to point to the mirrors? Do the mirrors all have signatures on them too?

Matei

On Sep 26, 2013, at 10:59 PM, Andy Konwinski  wrote:

> Thanks Roman and Chris,
> 
> I see here http://www.apache.org/dev/release.html#mirroring that "Project
> download pages must link to the mirrors" but I don't see anything about
> ordering.
> 
> I'm definitely +1 for including a link to the apache mirrors as required
> and providing the Cloudfront link first since this seems to satisfy the
> apache requirements and provide a better experience for users.
> 
> Patrick. Thanks again for all your hard work on this release and for
> pushing back on parts of the Apache process as you go. That's how
> do-ocracies stay healthy and evolve.
> On Sep 26, 2013 7:23 PM, "Mattmann, Chris A (398J)" <
> chris.a.mattm...@jpl.nasa.gov> wrote:
> 
>> Hi Patrick will reply in more detail later but please know that linking to
>> the apache download page is not a request it's a requirement. I will
>> explain more in a bit.
>> 
>> Cheers,
>> Chris
>> 
>> Sent from my iPhone
>> 
>> On Sep 26, 2013, at 8:09 PM, "Patrick Wendell"  wrote:
>> 
>>> Chris et al,
>>> 
>>> I'm -1 on this because it has many negative consequences for our
>> existing users:
>>> 
>>> 1. Users who do automated downloads based on our posted URL's (of
>>> which we get many thousands each release) will no longer work. Now if
>>> they do "wget XXX" with our posted link, it will fail in a weird way
>>> to due to the redirect page. Is there a version of the closer.cgi
>>> script which just performs 302 redirects instead of asking me to click
>>> on a link?
>>> 
>>> 2. All other users have to click through an additional page to
>>> download the software.
>>> 
>>> 3. Amazon Cloudfront is, as a whole, much more reliable and higher
>>> bandwidth than the mirror network.
>>> 
>>> These are my concerns, that basically we're causing our users to have
>>> a much worse experience. I've identified these concerns with moving to
>>> the apache mirror, but perhaps I've overlooked some benefits that
>>> would counteract these. Are there benefits?
>>> 
>>> I completely agree that we need to send users to the signatures and
>>> hashes at the Apache release site (to verify the release). So I did
>>> add the link to this directly adjacent to the download.
>>> 
>>> - Patrick
>>> 
>>> On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann 
>> wrote:
>>>> Hey Guys,
>>>> 
>>>> Yep the link should by the dyn/closer.cgi link on the website and +1
>>>> to Roman's comment about auditing spark-project.org links to be
>> replaced
>>>> with ASF counterparts.
>>>> 
>>>> Cheers,
>>>> Chris
>>>> 
>>>> 
>>>> 
>>>> -Original Message-
>>>> From: Patrick Wendell 
>>>> Reply-To: "dev@spark.incubator.apache.org" <
>> dev@spark.incubator.apache.org>
>>>> Date: Wednesday, September 25, 2013 4:08 PM
>>>> To: "dev@spark.incubator.apache.org" 
>>>> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
>>>> 
>>>>> Yep, we definitely need to just directly point people the location at
>>>>> apache.org where they can find the hashes. I just updated the release
>>>>> notes and downloads page to point to that site.
>>>>> 
>>>>> I just wanted to point out that mirroring these through a CDN seems
>>>>> philosophically the same as mirroring through Apache, since in neither
>>>>> case do we expect the users to trust the artifact they download. We
>>>>> just need to be more explicit that we are, indeed, mirroring and
>>>>> explain that the trusted root is at apache.org
>>>>> 
>>>>> - Patrick
>>>>> 
>>>>> On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik 
>> wrote:
>>>>>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell 
>>>>>> wrote:
>>>>>>> Hey we've actually distributed our artifacts through amazon
>> cloudfront
>>>>>>> in the past (and that is where the website links redirect to).
>>>>>>> 
>>>>>>> Since the apache mirrors don't distribute signatures anyways,
>>>>>> 
>>>>>> True, but apache dist does. IOW, it is not uncommon for those
>>>>>> having an automated build/fetching systems to get bits from
>>>>>> one of the mirrors and then get the hashes directly from dist.
>>>>>> 
>>>>>> In your current case, I don't think I know of a way to do that.
>>>>>> 
>>>>>> Now, you may say that the current CDN you guys are you using
>>>>>> is functioning like a mirror -- well, I'd say that it needs to be
>>>>>> called out like one then.
>>>>>> 
>>>>>> Otherwise, as a naive user I *really* have to guess where
>>>>>> to get the hashes.
>>>>>> 
>>>>>>> what is the difference between linking to an apache mirror vs using a
>>>>>>> more
>>>>>>> robust CDN? If people want to verify the downloads they need to go to
>>>>>>> the apache root in either case.
>>>>>>> 
>>>>>>> Is this just a cultural thing or is there some security reason?
>>>>>> 
>>>>>> A bit of both I guess.
>>>>>> 
>>>>>> Thanks,
>>>>>> Roman.
>>>> 
>>>> 
>> 



Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Andy Konwinski
Thanks Roman and Chris,

I see here http://www.apache.org/dev/release.html#mirroring that "Project
download pages must link to the mirrors" but I don't see anything about
ordering.

I'm definitely +1 for including a link to the Apache mirrors as required
and providing the CloudFront link first, since this seems to satisfy the
Apache requirements and provide a better experience for users.

Patrick, thanks again for all your hard work on this release and for
pushing back on parts of the Apache process as you go. That's how
do-ocracies stay healthy and evolve.
On Sep 26, 2013 7:23 PM, "Mattmann, Chris A (398J)" <
chris.a.mattm...@jpl.nasa.gov> wrote:

> Hi Patrick will reply in more detail later but please know that linking to
> the apache download page is not a request it's a requirement. I will
> explain more in a bit.
>
> Cheers,
> Chris
>
> Sent from my iPhone
>
> On Sep 26, 2013, at 8:09 PM, "Patrick Wendell"  wrote:
>
> > Chris et al,
> >
> > I'm -1 on this because it has many negative consequences for our
> existing users:
> >
> > 1. Users who do automated downloads based on our posted URL's (of
> > which we get many thousands each release) will no longer work. Now if
> > they do "wget XXX" with our posted link, it will fail in a weird way
> > to due to the redirect page. Is there a version of the closer.cgi
> > script which just performs 302 redirects instead of asking me to click
> > on a link?
> >
> > 2. All other users have to click through an additional page to
> > download the software.
> >
> > 3. Amazon Cloudfront is, as a whole, much more reliable and higher
> > bandwidth than the mirror network.
> >
> > These are my concerns, that basically we're causing our users to have
> > a much worse experience. I've identified these concerns with moving to
> > the apache mirror, but perhaps I've overlooked some benefits that
> > would counteract these. Are there benefits?
> >
> > I completely agree that we need to send users to the signatures and
> > hashes at the Apache release site (to verify the release). So I did
> > add the link to this directly adjacent to the download.
> >
> > - Patrick
> >
> > On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann 
> wrote:
> >> Hey Guys,
> >>
> >> Yep the link should by the dyn/closer.cgi link on the website and +1
> >> to Roman's comment about auditing spark-project.org links to be
> replaced
> >> with ASF counterparts.
> >>
> >> Cheers,
> >> Chris
> >>
> >>
> >>
> >> -Original Message-
> >> From: Patrick Wendell 
> >> Reply-To: "dev@spark.incubator.apache.org" <
> dev@spark.incubator.apache.org>
> >> Date: Wednesday, September 25, 2013 4:08 PM
> >> To: "dev@spark.incubator.apache.org" 
> >> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
> >>
> >>> Yep, we definitely need to just directly point people the location at
> >>> apache.org where they can find the hashes. I just updated the release
> >>> notes and downloads page to point to that site.
> >>>
> >>> I just wanted to point out that mirroring these through a CDN seems
> >>> philosophically the same as mirroring through Apache, since in neither
> >>> case do we expect the users to trust the artifact they download. We
> >>> just need to be more explicit that we are, indeed, mirroring and
> >>> explain that the trusted root is at apache.org
> >>>
> >>> - Patrick
> >>>
> >>> On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik 
> wrote:
> >>>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell 
> >>>> wrote:
> >>>>> Hey we've actually distributed our artifacts through amazon
> cloudfront
> >>>>> in the past (and that is where the website links redirect to).
> >>>>>
> >>>>> Since the apache mirrors don't distribute signatures anyways,
> >>>>
> >>>> True, but apache dist does. IOW, it is not uncommon for those
> >>>> having an automated build/fetching systems to get bits from
> >>>> one of the mirrors and then get the hashes directly from dist.
> >>>>
> >>>> In your current case, I don't think I know of a way to do that.
> >>>>
> >>>> Now, you may say that the current CDN you guys are you using
> >>>> is functioning like a mirror -- well, I'd say that it needs to be
> >>>> called out like one then.
> >>>>
> >>>> Otherwise, as a naive user I *really* have to guess where
> >>>> to get the hashes.
> >>>>
> >>>>> what is the difference between linking to an apache mirror vs using a
> >>>>> more
> >>>>> robust CDN? If people want to verify the downloads they need to go to
> >>>>> the apache root in either case.
> >>>>>
> >>>>> Is this just a cultural thing or is there some security reason?
> >>>>
> >>>> A bit of both I guess.
> >>>>
> >>>> Thanks,
> >>>> Roman.
> >>
> >>
>


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Mattmann, Chris A (398J)
Hi Patrick, I will reply in more detail later, but please know that linking to
the Apache download page is not a request; it's a requirement. I will explain
more in a bit.

Cheers,
Chris


On Sep 26, 2013, at 8:09 PM, "Patrick Wendell"  wrote:

> Chris et al,
> 
> I'm -1 on this because it has many negative consequences for our existing 
> users:
> 
> 1. Users who do automated downloads based on our posted URL's (of
> which we get many thousands each release) will no longer work. Now if
> they do "wget XXX" with our posted link, it will fail in a weird way
> due to the redirect page. Is there a version of the closer.cgi
> script which just performs 302 redirects instead of asking me to click
> on a link?
> 
> 2. All other users have to click through an additional page to
> download the software.
> 
> 3. Amazon Cloudfront is, as a whole, much more reliable and higher
> bandwidth than the mirror network.
> 
> These are my concerns, that basically we're causing our users to have
> a much worse experience. I've identified these concerns with moving to
> the apache mirror, but perhaps I've overlooked some benefits that
> would counteract these. Are there benefits?
> 
> I completely agree that we need to send users to the signatures and
> hashes at the Apache release site (to verify the release). So I did
> add the link to this directly adjacent to the download.
> 
> - Patrick
> 
> On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann  wrote:
>> Hey Guys,
>> 
>> Yep the link should be the dyn/closer.cgi link on the website and +1
>> to Roman's comment about auditing spark-project.org links to be replaced
>> with ASF counterparts.
>> 
>> Cheers,
>> Chris
>> 
>> 
>> 
>> -----Original Message-----
>> From: Patrick Wendell 
>> Reply-To: "dev@spark.incubator.apache.org" 
>> Date: Wednesday, September 25, 2013 4:08 PM
>> To: "dev@spark.incubator.apache.org" 
>> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
>> 
>>> Yep, we definitely need to just directly point people the location at
>>> apache.org where they can find the hashes. I just updated the release
>>> notes and downloads page to point to that site.
>>> 
>>> I just wanted to point out that mirroring these through a CDN seems
>>> philosophically the same as mirroring through Apache, since in neither
>>> case do we expect the users to trust the artifact they download. We
>>> just need to be more explicit that we are, indeed, mirroring and
>>> explain that the trusted root is at apache.org
>>> 
>>> - Patrick
>>> 
>>> On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik  wrote:
>>>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell 
>>>> wrote:
>>>>> Hey we've actually distributed our artifacts through amazon cloudfront
>>>>> in the past (and that is where the website links redirect to).
>>>>> 
>>>>> Since the apache mirrors don't distribute signatures anyways,
>>>> 
>>>> True, but apache dist does. IOW, it is not uncommon for those
>>>> having an automated build/fetching systems to get bits from
>>>> one of the mirrors and then get the hashes directly from dist.
>>>> 
>>>> In your current case, I don't think I know of a way to do that.
>>>> 
>>>> Now, you may say that the current CDN you guys are you using
>>>> is functioning like a mirror -- well, I'd say that it needs to be
>>>> called out like one then.
>>>> 
>>>> Otherwise, as a naive user I *really* have to guess where
>>>> to get the hashes.
>>>> 
>>>>> what is the difference between linking to an apache mirror vs using a
>>>>> more
>>>>> robust CDN? If people want to verify the downloads they need to go to
>>>>> the apache root in either case.
>>>>> 
>>>>> Is this just a cultural thing or is there some security reason?
>>>> 
>>>> A bit of both I guess.
>>>> 
>>>> Thanks,
>>>> Roman.
>> 
>> 


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Roman Shaposhnik
On Thu, Sep 26, 2013 at 7:02 PM, Patrick Wendell  wrote:
> Chris et al,
>
> I'm -1 on this because it has many negative consequences for our existing 
> users:

Nobody is saying that closer.cgi should be the only link. But it should
be the leading link. IOW, if you could say something like:

click [closer.cgi |here] to download the XXX release of Spark from
the Apache Software Foundation mirror network or [here] to download
it from Amazon's CDN it would be fine.

> 3. Amazon Cloudfront is, as a whole, much more reliable and higher
> bandwidth than the mirror network.

My biggest problem with Amazon is that it seems to disallow listing.
That said -- it doesn't matter if it becomes a secondary link that
you make available.

Hope this helps.

Thanks,
Roman.


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Patrick Wendell
Chris et al,

I'm -1 on this because it has many negative consequences for our existing users:

1. Automated downloads based on our posted URLs (of
which we get many thousands each release) will no longer work. Now if
users run "wget XXX" with our posted link, it will fail in a weird way
due to the redirect page. Is there a version of the closer.cgi
script which just performs 302 redirects instead of asking me to click
on a link?

2. All other users have to click through an additional page to
download the software.

3. Amazon Cloudfront is, as a whole, much more reliable and higher
bandwidth than the mirror network.

These are my concerns, that basically we're causing our users to have
a much worse experience. I've identified these concerns with moving to
the apache mirror, but perhaps I've overlooked some benefits that
would counteract these. Are there benefits?

I completely agree that we need to send users to the signatures and
hashes at the Apache release site (to verify the release). So I did
add the link to this directly adjacent to the download.

- Patrick
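
To make concern 1 concrete: a script that saves whatever the posted URL
returns can end up with an HTML mirror-selection page on disk instead of
the tarball. Below is a minimal sketch of a guard an automated download
could run before unpacking. It assumes the release artifact is a
gzip-compressed .tgz and relies only on the standard two-byte gzip
signature; the function names are hypothetical and not part of any Spark
or ASF tooling.

```python
# Hypothetical helpers: check that a downloaded "tarball" really starts
# with the gzip signature rather than, say, an HTML redirect page.

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(first_bytes):
    """Return True if the given leading bytes carry the gzip signature."""
    return first_bytes[:2] == GZIP_MAGIC


def file_looks_like_gzip(path):
    """Read the first two bytes of a file on disk and test them."""
    with open(path, "rb") as f:
        return looks_like_gzip(f.read(2))
```

A fetch script could call file_looks_like_gzip() on the saved file and
stop with a clear error instead of failing later inside tar.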

On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann  wrote:
> Hey Guys,
>
> Yep the link should be the dyn/closer.cgi link on the website and +1
> to Roman's comment about auditing spark-project.org links to be replaced
> with ASF counterparts.
>
> Cheers,
> Chris
>
>
>
> -----Original Message-----
> From: Patrick Wendell 
> Reply-To: "dev@spark.incubator.apache.org" 
> Date: Wednesday, September 25, 2013 4:08 PM
> To: "dev@spark.incubator.apache.org" 
> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
>
>>Yep, we definitely need to just directly point people the location at
>>apache.org where they can find the hashes. I just updated the release
>>notes and downloads page to point to that site.
>>
>>I just wanted to point out that mirroring these through a CDN seems
>>philosophically the same as mirroring through Apache, since in neither
>>case do we expect the users to trust the artifact they download. We
>>just need to be more explicit that we are, indeed, mirroring and
>>explain that the trusted root is at apache.org
>>
>>- Patrick
>>
>>On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik  wrote:
>>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell 
>>>wrote:
>>>> Hey we've actually distributed our artifacts through amazon cloudfront
>>>> in the past (and that is where the website links redirect to).
>>>>
>>>> Since the apache mirrors don't distribute signatures anyways,
>>>
>>> True, but apache dist does. IOW, it is not uncommon for those
>>> having an automated build/fetching systems to get bits from
>>> one of the mirrors and then get the hashes directly from dist.
>>>
>>> In your current case, I don't think I know of a way to do that.
>>>
>>> Now, you may say that the current CDN you guys are you using
>>> is functioning like a mirror -- well, I'd say that it needs to be
>>> called out like one then.
>>>
>>> Otherwise, as a naive user I *really* have to guess where
>>> to get the hashes.
>>>
>>>> what is the difference between linking to an apache mirror vs using a
>>>>more
>>>> robust CDN? If people want to verify the downloads they need to go to
>>>> the apache root in either case.
>>>>
>>>> Is this just a cultural thing or is there some security reason?
>>>
>>> A bit of both I guess.
>>>
>>> Thanks,
>>> Roman.
>
>


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Chris Mattmann
Hey Guys,

Yep, the link should be the dyn/closer.cgi link on the website, and +1
to Roman's comment about auditing spark-project.org links to be replaced
with ASF counterparts.

Cheers,
Chris



-----Original Message-----
From: Patrick Wendell 
Reply-To: "dev@spark.incubator.apache.org" 
Date: Wednesday, September 25, 2013 4:08 PM
To: "dev@spark.incubator.apache.org" 
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

>Yep, we definitely need to just directly point people the location at
>apache.org where they can find the hashes. I just updated the release
>notes and downloads page to point to that site.
>
>I just wanted to point out that mirroring these through a CDN seems
>philosophically the same as mirroring through Apache, since in neither
>case do we expect the users to trust the artifact they download. We
>just need to be more explicit that we are, indeed, mirroring and
>explain that the trusted root is at apache.org
>
>- Patrick
>
>On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik  wrote:
>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell 
>>wrote:
>>> Hey we've actually distributed our artifacts through amazon cloudfront
>>> in the past (and that is where the website links redirect to).
>>>
>>> Since the apache mirrors don't distribute signatures anyways,
>>
>> True, but apache dist does. IOW, it is not uncommon for those
>> having an automated build/fetching systems to get bits from
>> one of the mirrors and then get the hashes directly from dist.
>>
>> In your current case, I don't think I know of a way to do that.
>>
>> Now, you may say that the current CDN you guys are you using
>> is functioning like a mirror -- well, I'd say that it needs to be
>> called out like one then.
>>
>> Otherwise, as a naive user I *really* have to guess where
>> to get the hashes.
>>
>>> what is the difference between linking to an apache mirror vs using a
>>>more
>>> robust CDN? If people want to verify the downloads they need to go to
>>> the apache root in either case.
>>>
>>> Is this just a cultural thing or is there some security reason?
>>
>> A bit of both I guess.
>>
>> Thanks,
>> Roman.




Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-26 Thread Henry Saputra
Yes, mirroring from a CDN is technically the same, but we want to make sure
users know the link comes from an ASF domain =)

So we need to update the
http://spark.incubator.apache.org/releases/spark-release-0-8-0.html
link to download src and binary distributions.

- Henry

On Wed, Sep 25, 2013 at 4:08 PM, Patrick Wendell  wrote:
> Yep, we definitely need to just directly point people the location at
> apache.org where they can find the hashes. I just updated the release
> notes and downloads page to point to that site.
>
> I just wanted to point out that mirroring these through a CDN seems
> philosophically the same as mirroring through Apache, since in neither
> case do we expect the users to trust the artifact they download. We
> just need to be more explicit that we are, indeed, mirroring and
> explain that the trusted root is at apache.org
>
> - Patrick
>
> On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik  wrote:
>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell  wrote:
>>> Hey we've actually distributed our artifacts through amazon cloudfront
>>> in the past (and that is where the website links redirect to).
>>>
>>> Since the apache mirrors don't distribute signatures anyways,
>>
>> True, but apache dist does. IOW, it is not uncommon for those
>> having an automated build/fetching systems to get bits from
>> one of the mirrors and then get the hashes directly from dist.
>>
>> In your current case, I don't think I know of a way to do that.
>>
>> Now, you may say that the current CDN you guys are you using
>> is functioning like a mirror -- well, I'd say that it needs to be
>> called out like one then.
>>
>> Otherwise, as a naive user I *really* have to guess where
>> to get the hashes.
>>
>>> what is the difference between linking to an apache mirror vs using a more
>>> robust CDN? If people want to verify the downloads they need to go to
>>> the apache root in either case.
>>>
>>> Is this just a cultural thing or is there some security reason?
>>
>> A bit of both I guess.
>>
>> Thanks,
>> Roman.


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-25 Thread Patrick Wendell
Yep, we definitely need to just directly point people to the location at
apache.org where they can find the hashes. I just updated the release
notes and downloads page to point to that site.

I just wanted to point out that mirroring these through a CDN seems
philosophically the same as mirroring through Apache, since in neither
case do we expect the users to trust the artifact they download. We
just need to be more explicit that we are, indeed, mirroring and
explain that the trusted root is at apache.org.

- Patrick

On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik  wrote:
> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell  wrote:
>> Hey we've actually distributed our artifacts through amazon cloudfront
>> in the past (and that is where the website links redirect to).
>>
>> Since the apache mirrors don't distribute signatures anyways,
>
> True, but apache dist does. IOW, it is not uncommon for those
> having an automated build/fetching systems to get bits from
> one of the mirrors and then get the hashes directly from dist.
>
> In your current case, I don't think I know of a way to do that.
>
> Now, you may say that the current CDN you guys are you using
> is functioning like a mirror -- well, I'd say that it needs to be
> called out like one then.
>
> Otherwise, as a naive user I *really* have to guess where
> to get the hashes.
>
>> what is the difference between linking to an apache mirror vs using a more
>> robust CDN? If people want to verify the downloads they need to go to
>> the apache root in either case.
>>
>> Is this just a cultural thing or is there some security reason?
>
> A bit of both I guess.
>
> Thanks,
> Roman.


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-25 Thread Roman Shaposhnik
On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell  wrote:
> Hey we've actually distributed our artifacts through amazon cloudfront
> in the past (and that is where the website links redirect to).
>
> Since the apache mirrors don't distribute signatures anyways,

True, but apache dist does. IOW, it is not uncommon for those
having automated build/fetching systems to get bits from
one of the mirrors and then get the hashes directly from dist.

In your current case, I don't think I know of a way to do that.

Now, you may say that the current CDN you guys are using
is functioning like a mirror -- well, I'd say that it needs to be
called out like one then.

Otherwise, as a naive user I *really* have to guess where
to get the hashes.

> what is the difference between linking to an apache mirror vs using a more
> robust CDN? If people want to verify the downloads they need to go to
> the apache root in either case.
>
> Is this just a cultural thing or is there some security reason?

A bit of both I guess.

Thanks,
Roman.
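
The "bits from a mirror, hashes from dist" pattern described above can be
sketched as follows. This is only an illustration under assumptions: the
artifact is already on disk (fetched from any mirror or CDN), the caller
has already extracted the hex digest published on apache.org, and the
algorithm shown (SHA-512) is a placeholder for whatever the release
actually publishes. The function names are hypothetical.

```python
import hashlib


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published_digest(path, published, algorithm="sha512"):
    """Compare a local file against a digest obtained from the trusted site.

    Whitespace and letter case in the published value are ignored, since
    published digests are often wrapped or grouped for readability.
    """
    normalized = "".join(published.split()).lower()
    return file_digest(path, algorithm) == normalized
```

The point of the split is that the mirror or CDN only has to be fast; the
digest (or signature) fetched from the trusted root is what decides
whether the download is accepted.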


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-25 Thread Patrick Wendell
Hey we've actually distributed our artifacts through amazon cloudfront
in the past (and that is where the website links redirect to).

Since the apache mirrors don't distribute signatures anyways, what is
the difference between linking to an apache mirror vs using a more
robust CDN? If people want to verify the downloads they need to go to
the apache root in either case.

Is this just a cultural thing or is there some security reason?

- Patrick

On Wed, Sep 25, 2013 at 3:45 PM, Roman Shaposhnik  wrote:
> On Wed, Sep 25, 2013 at 3:40 PM, Henry Saputra  
> wrote:
>> Was there announcement that 0.8 artifact had been pushed to
>> http://www.apache.org/dist/incubator/spark ?
>>
>> I thought the link should points to
>> http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz
>
> For the freshly released bits it is typically better to point to 
> dyn/closer.cgi
> unless you want to start building negative karma with ASF infra ;-)
>
> Thanks,
> Roman.


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-25 Thread Roman Shaposhnik
On Wed, Sep 25, 2013 at 3:40 PM, Henry Saputra  wrote:
> Was there announcement that 0.8 artifact had been pushed to
> http://www.apache.org/dist/incubator/spark ?
>
> I thought the link should points to
> http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz

For the freshly released bits it is typically better to point to dyn/closer.cgi
unless you want to start building negative karma with ASF infra ;-)

Thanks,
Roman.


Re: Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-25 Thread Henry Saputra
Was there an announcement that the 0.8 artifact had been pushed to
http://www.apache.org/dist/incubator/spark ?

I thought the link should point to
http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz


- Henry

On Wed, Sep 25, 2013 at 3:24 PM, Roman Shaposhnik  wrote:
> Hi!
>
> I see that the current download link published here:
>   http://spark.incubator.apache.org/releases/spark-release-0-8-0.html
> leads to:
>   http://spark-project.org/download/spark-0.8.0-incubating.tgz
>
> This needs to be corrected to be (roughly):
>
> http://www.apache.org/dyn/closer.cgi/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz
>
> In fact, at some point it may be worth auditing your website
> source and eliminate references to spark-project.org that
> should really be pointing back to ASF.
>
> Thanks,
> Roman.


Spark 0.8.0: bits need to come from ASF infrastructure

2013-09-25 Thread Roman Shaposhnik
Hi!

I see that the current download link published here:
  http://spark.incubator.apache.org/releases/spark-release-0-8-0.html
leads to:
  http://spark-project.org/download/spark-0.8.0-incubating.tgz

This needs to be corrected to be (roughly):
   
http://www.apache.org/dyn/closer.cgi/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz

In fact, at some point it may be worth auditing your website
source and eliminating references to spark-project.org that
should really be pointing back to ASF.

Thanks,
Roman.
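
For a website audit like the one suggested above, the two URL shapes that
appear in this thread differ only in their path prefix (/dist/ versus
/dyn/closer.cgi/). A minimal sketch of that rewrite follows; the function
name is hypothetical, and it assumes the remainder of the path is carried
over unchanged, as in the example URLs quoted in this thread.

```python
from urllib.parse import urlsplit, urlunsplit

DIST_PREFIX = "/dist/"
CLOSER_PREFIX = "/dyn/closer.cgi/"


def to_closer_url(dist_url):
    """Rewrite an apache.org /dist/ URL into its /dyn/closer.cgi/ form."""
    parts = urlsplit(dist_url)
    if not parts.path.startswith(DIST_PREFIX):
        raise ValueError("not a /dist/ URL: " + dist_url)
    new_path = CLOSER_PREFIX + parts.path[len(DIST_PREFIX):]
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )
```

Applied to the dist link Henry mentions elsewhere in the thread, this
yields the closer.cgi link given above.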