Re: Spark 0.8.0: bits need to come from ASF infrastructure
Hey Folks,

I updated the site to include the Apache mirror list for each file. I actually put it as the first download location, before the direct download link.

I played around with the mirror network a bit, and the performance was not bad, based on sampling from a few vantage points. I found it to be always worse than CloudFront, but typically not *much* worse. So I actually think that if we can find a way to have a direct link to the nearest Apache mirror, we could just remove the CloudFront link entirely.

I looked into it and apparently we're not the first Apache project to have this problem. Many of the bigger projects already use some fancy selection to embed a direct link to the closest mirror:

http://httpd.apache.org/download.cgi

- Patrick

On Fri, Sep 27, 2013 at 10:10 AM, Chris Mattmann wrote:
> Yeah the sigs are usually available from here:
>
> http://www.apache.org/dist//../
>
> So for us:
>
> http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/
>
> Cheers,
> Chris
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Yeah the sigs are usually available from here:

http://www.apache.org/dist//../

So for us:

http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/

Cheers,
Chris

-----Original Message-----
From: Matei Zaharia
Reply-To: "dev@spark.incubator.apache.org"
Date: Friday, September 27, 2013 10:06 AM
To: "dev@spark.incubator.apache.org"
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

>If the mirrors don't have the signatures, then we should probably link to
>the mirrors and the signatures separately. It's definitely important to
>have a link to the mirrors so people can get this through ASF
>infrastructure without hitting only the main server.
>
>It's true that they don't seem to have them, even for other projects --
>for example check out
>http://mirror.tcpdiag.net/apache/hadoop/common/hadoop-2.0.5-alpha/.
>
>Matei
Re: Spark 0.8.0: bits need to come from ASF infrastructure
If the mirrors don't have the signatures, then we should probably link to the mirrors and the signatures separately. It's definitely important to have a link to the mirrors so people can get this through ASF infrastructure without hitting only the main server.

It's true that they don't seem to have them, even for other projects -- for example check out http://mirror.tcpdiag.net/apache/hadoop/common/hadoop-2.0.5-alpha/.

Matei

On Sep 27, 2013, at 12:04 AM, Patrick Wendell wrote:

> On Thu, Sep 26, 2013 at 8:27 PM, Chris Mattmann wrote:
>> Hey Matei yep they have the signatures on them too.
>>
>> Cheers,
>> Chris
>
> Chris - I checked a bunch of the mirrors and none of them have the
> signatures... am I looking in the wrong place?
>
> http://www.interior-dsgn.com/apache/incubator/spark/spark-0.8.0-incubating/
> http://mirror.tcpdiag.net/apache/incubator/spark/spark-0.8.0-incubating/
> http://apache.tradebit.com/pub/incubator/spark/spark-0.8.0-incubating/
>
> In response to the comment about thousands of downloads - I didn't
> mean at all to suggest that Apache couldn't handle this number, in
> fact, I'm sure they can! I just wanted to point out that we are
> keeping in mind the existing habits and processes of our user base,
> and trying to make the transition smooth for them.
>
> - Patrick
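Matei's suggestion of linking to the mirrors and the signatures separately implies a two-source check: take the bits from any mirror, but take the digest from apache.org. A minimal sketch of the comparison step follows. It assumes a SHA-512 digest and GNU coreutils' `sha512sum`; the thread does not say which digest files were published for this release, and `verify_digest` is a hypothetical helper name, not an ASF tool.

```shell
#!/usr/bin/env bash
# Sketch: check a file fetched from any mirror against a digest that was
# obtained separately from the trusted site.
# verify_digest is a hypothetical helper; it assumes sha512sum is installed.
verify_digest() {
  local file="$1" expected="$2" actual
  # sha512sum prints "<hex digest>  <file name>"; keep only the digest.
  actual=$(sha512sum "$file" 2>/dev/null | awk '{print $1}')
  # An empty result means the file could not be read or hashed.
  [ -n "$actual" ] || return 2
  # Published digests are sometimes upper-case, so normalise before comparing.
  expected=$(printf '%s' "$expected" | tr 'A-F' 'a-f')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}
```

In use, the tarball would come from a mirror and the expected digest from the release directory under http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/. A PGP signature check with `gpg --verify` would follow the same pattern, with the signature file fetched from dist rather than from the mirror.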
Re: Spark 0.8.0: bits need to come from ASF infrastructure
On Thu, Sep 26, 2013 at 8:27 PM, Chris Mattmann wrote:
> Hey Matei yep they have the signatures on them too.
>
> Cheers,
> Chris

Chris - I checked a bunch of the mirrors and none of them have the signatures... am I looking in the wrong place?

http://www.interior-dsgn.com/apache/incubator/spark/spark-0.8.0-incubating/
http://mirror.tcpdiag.net/apache/incubator/spark/spark-0.8.0-incubating/
http://apache.tradebit.com/pub/incubator/spark/spark-0.8.0-incubating/

In response to the comment about thousands of downloads - I didn't mean at all to suggest that Apache couldn't handle this number, in fact, I'm sure they can! I just wanted to point out that we are keeping in mind the existing habits and processes of our user base, and trying to make the transition smooth for them.

- Patrick
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Is there a way we can track the number of downloads with the Apache mirrors?

On Thursday, September 26, 2013, Chris Mattmann wrote:

> Hey Matei yep they have the signatures on them too.
>
> Cheers,
> Chris

--
Reynold Xin, AMPLab, UC Berkeley
http://rxin.org
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Hey Matei, yep, they have the signatures on them too.

Cheers,
Chris

-----Original Message-----
From: Matei Zaharia
Reply-To: "dev@spark.incubator.apache.org"
Date: Thursday, September 26, 2013 8:11 PM
To: "dev@spark.incubator.apache.org"
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

>Maybe we can replace the link to "official Apache download site" in the
>release notes to point to the mirrors? Do the mirrors all have signatures
>on them too?
>
>Matei
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Hi Andy,

-----Original Message-----
From: Andy Konwinski
Reply-To: "dev@spark.incubator.apache.org"
Date: Thursday, September 26, 2013 7:59 PM
To: "dev@spark.incubator.apache.org"
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

>Thanks Roman and Chris,
>
>I see here http://www.apache.org/dev/release.html#mirroring that "Project
>download pages must link to the mirrors" but I don't see anything about
>ordering.

Technically we ought to promote Apache's mirroring system as the Project Endorsed home for the project. As an Apache member and someone who values what the Foundation does for its projects and communities, I don't think that's much to ask.

If you guys feel strongly about putting the CloudFront link first, I'm open to it. I would just appreciate seeing some existing data showing that you have users who have tried the ASF mirroring system and found that it hasn't performed as well as the Amazon one.

>
>I'm definitely +1 for including a link to the apache mirrors as required
>and providing the Cloudfront link first since this seems to satisfy the
>apache requirements and provide a better experience for users.
>
>Patrick. Thanks again for all your hard work on this release and for
>pushing back on parts of the Apache process as you go. That's how
>do-ocracies stay healthy and evolve.

Hear, hear. This project doesn't have a "boss" and it's not me :) I'm just trying to spread my Apache knowledge and help you guys wear your Apache hats too, since the project lives at the ASF now. I think you'll find the benefits of wearing those hats are many :)

Cheers,
Chris
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Hey Guys,

OK, flight landed, so I have a sec to reply in more detail:

-----Original Message-----
From: Patrick Wendell
Reply-To: "dev@spark.incubator.apache.org"
Date: Thursday, September 26, 2013 7:02 PM
To: "dev@spark.incubator.apache.org"
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

>Chris et al,
>
>I'm -1 on this because it has many negative consequences for our existing
>users:
>
>1. Users who do automated downloads based on our posted URL's (of
>which we get many thousands each release) will no longer work.

Apache also has many 10s of thousands of downloads each release, depending on the project. The best example I can think of is Open Office, which receives 10M downloads/day IIRC -- yes, OOo has some special downloading infra help too, but beyond that, popular projects like Apache Lucene and Solr regularly see 4000+ downloads per day and the ASF mirroring system works fine.

>Now if
>they do "wget XXX" with our posted link, it will fail in a weird way
>due to the redirect page. Is there a version of the closer.cgi
>script which just performs 302 redirects instead of asking me to click
>on a link?

You can do something like:

curl "http://www.apache.org/dyn/closer.cgi/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz" | grep http | grep tgz | sort -n

(and then some HTML strip magic)

Even better, if you have Apache Tika installed, something like:

tika -t "http://www.apache.org/dyn/closer.cgi/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz" | grep http | grep tgz

produces:

http://mirror.nexcess.net/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.nexcess.net/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.cs.utah.edu/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.tradebit.com/pub/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.carfab.com/apachesoftware/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.petsads.us/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.trieuvan.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirrors.ibiblio.org/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.olnevhost.net/pub/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://psg.mtu.edu/pub/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.claz.org/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.metrocast.net/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.mirrors.lucidnetworks.net/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirrors.gigenet.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.poolsaboveground.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.bizdirusa.com/mirrors/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.sdunix.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://download.nextag.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.motorlogy.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.cc.columbia.edu/pub/software/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.tcpdiag.net/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.mirrors.hoobly.com/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.eng.lsu.edu/mirrors/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.mesi.com.ar/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.symnds.com/software/Apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.reverse.net/pub/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.osuosl.org/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://www.interior-dsgn.com/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://mirror.cogentco.com/pub/apache/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz
http://apache.spinellicreations.com/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-
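The "HTML strip magic" that Chris leaves open in the curl pipeline above could look roughly like the sketch below. It assumes the mirror-selection page carries its links as double-quoted `href` attributes, which the thread does not confirm, and `extract_mirror_links` is a hypothetical helper name rather than anything the ASF ships.

```shell
#!/usr/bin/env bash
# Sketch: read a mirror-selection HTML page on stdin and print the unique
# links that end in the requested artifact name.
# extract_mirror_links is a hypothetical helper name.
extract_mirror_links() {
  local artifact="$1"   # e.g. spark-0.8.0-incubating-bin-cdh4.tgz
  # grep -o prints only the matched href attributes, one per line,
  # which discards the surrounding HTML markup.
  grep -o 'href="http[^"]*"' \
    | sed -e 's/^href="//' -e 's/"$//' \
    | awk -v suffix="/${artifact}" '
        # Keep only URLs whose final path component is the artifact,
        # so that e.g. a ".tgz.asc" link is not mistaken for the tarball.
        length($0) >= length(suffix) &&
        substr($0, length($0) - length(suffix) + 1) == suffix' \
    | sort -u
}

# Intended use (needs network access, so it is left commented out):
# curl -s "http://www.apache.org/dyn/closer.cgi/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating-bin-cdh4.tgz" \
#   | extract_mirror_links spark-0.8.0-incubating-bin-cdh4.tgz
```

Using `sort -u` rather than `sort -n` also removes repeated mirrors, such as the duplicated first entry in the output above.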
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Maybe we can replace the link to "official Apache download site" in the release notes to point to the mirrors? Do the mirrors all have signatures on them too?

Matei

On Sep 26, 2013, at 10:59 PM, Andy Konwinski wrote:

> Thanks Roman and Chris,
>
> I see here http://www.apache.org/dev/release.html#mirroring that "Project
> download pages must link to the mirrors" but I don't see anything about
> ordering.
>
> I'm definitely +1 for including a link to the apache mirrors as required
> and providing the Cloudfront link first since this seems to satisfy the
> apache requirements and provide a better experience for users.
>
> Patrick. Thanks again for all your hard work on this release and for
> pushing back on parts of the Apache process as you go. That's how
> do-ocracies stay healthy and evolve.
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Thanks Roman and Chris,

I see here http://www.apache.org/dev/release.html#mirroring that "Project download pages must link to the mirrors" but I don't see anything about ordering.

I'm definitely +1 for including a link to the Apache mirrors as required and providing the CloudFront link first, since this seems to satisfy the Apache requirements and provide a better experience for users.

Patrick, thanks again for all your hard work on this release and for pushing back on parts of the Apache process as you go. That's how do-ocracies stay healthy and evolve.

On Sep 26, 2013 7:23 PM, "Mattmann, Chris A (398J)" <chris.a.mattm...@jpl.nasa.gov> wrote:

> Hi Patrick will reply in more detail later but please know that linking to
> the apache download page is not a request it's a requirement. I will
> explain more in a bit.
>
> Cheers,
> Chris
>
> Sent from my iPhone
>
> On Sep 26, 2013, at 8:09 PM, "Patrick Wendell" wrote:
>
> > Chris et al,
> >
> > I'm -1 on this because it has many negative consequences for our
> > existing users:
> >
> > 1. Users who do automated downloads based on our posted URL's (of
> > which we get many thousands each release) will no longer work. Now if
> > they do "wget XXX" with our posted link, it will fail in a weird way
> > due to the redirect page. Is there a version of the closer.cgi
> > script which just performs 302 redirects instead of asking me to click
> > on a link?
> >
> > 2. All other users have to click through an additional page to
> > download the software.
> >
> > 3. Amazon Cloudfront is, as a whole, much more reliable and higher
> > bandwidth than the mirror network.
> >
> > These are my concerns, that basically we're causing our users to have
> > a much worse experience. I've identified these concerns with moving to
> > the apache mirror, but perhaps I've overlooked some benefits that
> > would counteract these. Are there benefits?
> >
> > I completely agree that we need to send users to the signatures and
> > hashes at the Apache release site (to verify the release). So I did
> > add the link to this directly adjacent to the download.
> >
> > - Patrick
> >
> > On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann wrote:
> >> Hey Guys,
> >>
> >> Yep the link should be the dyn/closer.cgi link on the website and +1
> >> to Roman's comment about auditing spark-project.org links to be replaced
> >> with ASF counterparts.
> >>
> >> Cheers,
> >> Chris
> >>
> >> -----Original Message-----
> >> From: Patrick Wendell
> >> Reply-To: "dev@spark.incubator.apache.org" <dev@spark.incubator.apache.org>
> >> Date: Wednesday, September 25, 2013 4:08 PM
> >> To: "dev@spark.incubator.apache.org"
> >> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
> >>
> >>> Yep, we definitely need to just directly point people to the location at
> >>> apache.org where they can find the hashes. I just updated the release
> >>> notes and downloads page to point to that site.
> >>>
> >>> I just wanted to point out that mirroring these through a CDN seems
> >>> philosophically the same as mirroring through Apache, since in neither
> >>> case do we expect the users to trust the artifact they download. We
> >>> just need to be more explicit that we are, indeed, mirroring and
> >>> explain that the trusted root is at apache.org
> >>>
> >>> - Patrick
> >>>
> >>> On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik wrote:
> >>>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell wrote:
> >>>>> Hey we've actually distributed our artifacts through amazon cloudfront
> >>>>> in the past (and that is where the website links redirect to).
> >>>>>
> >>>>> Since the apache mirrors don't distribute signatures anyways,
> >>>>
> >>>> True, but apache dist does. IOW, it is not uncommon for those
> >>>> having an automated build/fetching systems to get bits from
> >>>> one of the mirrors and then get the hashes directly from dist.
> >>>>
> >>>> In your current case, I don't think I know of a way to do that.
> >>>>
> >>>> Now, you may say that the current CDN you guys are using
> >>>> is functioning like a mirror -- well, I'd say that it needs to be
> >>>> called out like one then.
> >>>>
> >>>> Otherwise, as a naive user I *really* have to guess where
> >>>> to get the hashes.
> >>>>
> >>>>> what is the difference between linking to an apache mirror vs using a
> >>>>> more robust CDN? If people want to verify the downloads they need to go to
> >>>>> the apache root in either case.
> >>>>>
> >>>>> Is this just a cultural thing or is there some security reason?
> >>>>
> >>>> A bit of both I guess.
> >>>>
> >>>> Thanks,
> >>>> Roman.
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Hi Patrick,

I will reply in more detail later, but please know that linking to the
Apache download page is not a request, it's a requirement. I will explain
more in a bit.

Cheers,
Chris

Sent from my iPhone

On Sep 26, 2013, at 8:09 PM, "Patrick Wendell" wrote:

> Chris et al,
>
> I'm -1 on this because it has many negative consequences for our existing
> users:
>
> 1. Automated downloads based on our posted URLs (of which we get many
> thousands each release) will no longer work. Now if users do "wget XXX"
> with our posted link, it will fail in a weird way due to the redirect
> page. Is there a version of the closer.cgi script which just performs
> 302 redirects instead of asking me to click on a link?
>
> 2. All other users have to click through an additional page to
> download the software.
>
> 3. Amazon Cloudfront is, as a whole, much more reliable and higher
> bandwidth than the mirror network.
>
> These are my concerns, that basically we're causing our users to have
> a much worse experience. I've identified these concerns with moving to
> the apache mirror, but perhaps I've overlooked some benefits that
> would counteract these. Are there benefits?
>
> I completely agree that we need to send users to the signatures and
> hashes at the Apache release site (to verify the release). So I did
> add the link to this directly adjacent to the download.
>
> - Patrick
>
> On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann wrote:
>> Hey Guys,
>>
>> Yep the link should be the dyn/closer.cgi link on the website and +1
>> to Roman's comment about auditing spark-project.org links to be replaced
>> with ASF counterparts.
Re: Spark 0.8.0: bits need to come from ASF infrastructure
On Thu, Sep 26, 2013 at 7:02 PM, Patrick Wendell wrote:
> Chris et al,
>
> I'm -1 on this because it has many negative consequences for our existing
> users:

Nobody is saying that closer.cgi should be the only link. But it should be
the leading link. IOW, if you could say something like:

    click [closer.cgi |here] to download the XXX release of Spark from the
    Apache Software Foundation mirror network or [here] to download it
    from Amazon's CDN

it would be fine.

> 3. Amazon Cloudfront is, as a whole, much more reliable and higher
> bandwidth than the mirror network.

My biggest problem with Amazon is that it seems to disallow listing.
That said -- it doesn't matter if it becomes a secondary link that you
make available.

Hope this helps.

Thanks,
Roman.
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Chris et al,

I'm -1 on this because it has many negative consequences for our existing
users:

1. Automated downloads based on our posted URLs (of which we get many
thousands each release) will no longer work. Now if users do "wget XXX"
with our posted link, it will fail in a weird way due to the redirect
page. Is there a version of the closer.cgi script which just performs
302 redirects instead of asking me to click on a link?

2. All other users have to click through an additional page to
download the software.

3. Amazon Cloudfront is, as a whole, much more reliable and higher
bandwidth than the mirror network.

These are my concerns, that basically we're causing our users to have
a much worse experience. I've identified these concerns with moving to
the apache mirror, but perhaps I've overlooked some benefits that
would counteract these. Are there benefits?

I completely agree that we need to send users to the signatures and
hashes at the Apache release site (to verify the release). So I did
add the link to this directly adjacent to the download.

- Patrick

On Thu, Sep 26, 2013 at 3:50 PM, Chris Mattmann wrote:
> Hey Guys,
>
> Yep the link should be the dyn/closer.cgi link on the website and +1
> to Roman's comment about auditing spark-project.org links to be replaced
> with ASF counterparts.
>
> Cheers,
> Chris
>
> -----Original Message-----
> From: Patrick Wendell
> Reply-To: "dev@spark.incubator.apache.org"
> Date: Wednesday, September 25, 2013 4:08 PM
> To: "dev@spark.incubator.apache.org"
> Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure
>
>> Yep, we definitely need to just directly point people to the location at
>> apache.org where they can find the hashes. I just updated the release
>> notes and downloads page to point to that site.
>>
>> I just wanted to point out that mirroring these through a CDN seems
>> philosophically the same as mirroring through Apache, since in neither
>> case do we expect the users to trust the artifact they download. We
>> just need to be more explicit that we are, indeed, mirroring and
>> explain that the trusted root is at apache.org.
>>
>> - Patrick
>>
>> On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik wrote:
>>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell wrote:
>>>> Hey we've actually distributed our artifacts through amazon cloudfront
>>>> in the past (and that is where the website links redirect to).
>>>>
>>>> Since the apache mirrors don't distribute signatures anyways,
>>>
>>> True, but apache dist does. IOW, it is not uncommon for those
>>> having automated build/fetching systems to get bits from
>>> one of the mirrors and then get the hashes directly from dist.
>>>
>>> In your current case, I don't think I know of a way to do that.
>>>
>>> Now, you may say that the current CDN you guys are using
>>> is functioning like a mirror -- well, I'd say that it needs to be
>>> called out like one then.
>>>
>>> Otherwise, as a naive user I *really* have to guess where
>>> to get the hashes.
>>>
>>>> what is the difference between linking to an apache mirror vs using a
>>>> more robust CDN? If people want to verify the downloads they need to
>>>> go to the apache root in either case.
>>>>
>>>> Is this just a cultural thing or is there some security reason?
>>>
>>> A bit of both I guess.
>>>
>>> Thanks,
>>> Roman.
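On the "wget XXX" concern in point 1, here is a minimal sketch of how a script might get a direct mirror URL instead of the HTML selection page. It rests on an assumption that should be checked against ASF infra documentation: that closer.cgi accepts `path` and `as_json=1` query parameters and answers with JSON containing `preferred` and `path_info` fields. The function names and the example mirror host are hypothetical.

```python
import json
import urllib.parse
import urllib.request

CLOSER_CGI = "http://www.apache.org/dyn/closer.cgi"


def fetch_mirror_info(path, timeout=30):
    """Ask the mirror selector for its choice as JSON.

    Assumption: the script supports `path` and `as_json=1`; verify this
    with ASF infra before relying on it.
    """
    query = urllib.parse.urlencode({"path": path, "as_json": "1"})
    with urllib.request.urlopen(f"{CLOSER_CGI}?{query}", timeout=timeout) as response:
        return json.load(response)


def direct_url(mirror_info):
    """Join the preferred mirror root and the artifact path into one URL.

    Assumption: the JSON carries `preferred` (mirror root) and
    `path_info` (artifact path relative to that root).
    """
    base = mirror_info["preferred"].rstrip("/")
    path = mirror_info["path_info"].lstrip("/")
    return f"{base}/{path}"
```

An automated downloader would call `direct_url(fetch_mirror_info(...))` and then fetch the resulting URL, still taking the signatures and hashes from apache.org rather than from the mirror.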
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Hey Guys,

Yep the link should be the dyn/closer.cgi link on the website and +1
to Roman's comment about auditing spark-project.org links to be replaced
with ASF counterparts.

Cheers,
Chris

-----Original Message-----
From: Patrick Wendell
Reply-To: "dev@spark.incubator.apache.org"
Date: Wednesday, September 25, 2013 4:08 PM
To: "dev@spark.incubator.apache.org"
Subject: Re: Spark 0.8.0: bits need to come from ASF infrastructure

> Yep, we definitely need to just directly point people to the location at
> apache.org where they can find the hashes. I just updated the release
> notes and downloads page to point to that site.
>
> I just wanted to point out that mirroring these through a CDN seems
> philosophically the same as mirroring through Apache, since in neither
> case do we expect the users to trust the artifact they download. We
> just need to be more explicit that we are, indeed, mirroring and
> explain that the trusted root is at apache.org.
>
> - Patrick
>
> On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik wrote:
>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell wrote:
>>> Hey we've actually distributed our artifacts through amazon cloudfront
>>> in the past (and that is where the website links redirect to).
>>>
>>> Since the apache mirrors don't distribute signatures anyways,
>>
>> True, but apache dist does. IOW, it is not uncommon for those
>> having automated build/fetching systems to get bits from
>> one of the mirrors and then get the hashes directly from dist.
>>
>> In your current case, I don't think I know of a way to do that.
>>
>> Now, you may say that the current CDN you guys are using
>> is functioning like a mirror -- well, I'd say that it needs to be
>> called out like one then.
>>
>> Otherwise, as a naive user I *really* have to guess where
>> to get the hashes.
>>
>>> what is the difference between linking to an apache mirror vs using a
>>> more robust CDN? If people want to verify the downloads they need to
>>> go to the apache root in either case.
>>>
>>> Is this just a cultural thing or is there some security reason?
>>
>> A bit of both I guess.
>>
>> Thanks,
>> Roman.
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Yes, mirroring from a CDN is technically the same, but we want to make
sure users know the link comes from an ASF domain =)

So we need to update the links on
http://spark.incubator.apache.org/releases/spark-release-0-8-0.html
for downloading the src and binary distributions.

- Henry

On Wed, Sep 25, 2013 at 4:08 PM, Patrick Wendell wrote:
> Yep, we definitely need to just directly point people to the location at
> apache.org where they can find the hashes. I just updated the release
> notes and downloads page to point to that site.
>
> I just wanted to point out that mirroring these through a CDN seems
> philosophically the same as mirroring through Apache, since in neither
> case do we expect the users to trust the artifact they download. We
> just need to be more explicit that we are, indeed, mirroring and
> explain that the trusted root is at apache.org.
>
> - Patrick
>
> On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik wrote:
>> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell wrote:
>>> Hey we've actually distributed our artifacts through amazon cloudfront
>>> in the past (and that is where the website links redirect to).
>>>
>>> Since the apache mirrors don't distribute signatures anyways,
>>
>> True, but apache dist does. IOW, it is not uncommon for those
>> having automated build/fetching systems to get bits from
>> one of the mirrors and then get the hashes directly from dist.
>>
>> In your current case, I don't think I know of a way to do that.
>>
>> Now, you may say that the current CDN you guys are using
>> is functioning like a mirror -- well, I'd say that it needs to be
>> called out like one then.
>>
>> Otherwise, as a naive user I *really* have to guess where
>> to get the hashes.
>>
>>> what is the difference between linking to an apache mirror vs using a
>>> more robust CDN? If people want to verify the downloads they need to
>>> go to the apache root in either case.
>>>
>>> Is this just a cultural thing or is there some security reason?
>>
>> A bit of both I guess.
>>
>> Thanks,
>> Roman.
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Yep, we definitely need to just directly point people to the location at
apache.org where they can find the hashes. I just updated the release
notes and downloads page to point to that site.

I just wanted to point out that mirroring these through a CDN seems
philosophically the same as mirroring through Apache, since in neither
case do we expect the users to trust the artifact they download. We
just need to be more explicit that we are, indeed, mirroring and
explain that the trusted root is at apache.org.

- Patrick

On Wed, Sep 25, 2013 at 3:56 PM, Roman Shaposhnik wrote:
> On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell wrote:
>> Hey we've actually distributed our artifacts through amazon cloudfront
>> in the past (and that is where the website links redirect to).
>>
>> Since the apache mirrors don't distribute signatures anyways,
>
> True, but apache dist does. IOW, it is not uncommon for those
> having automated build/fetching systems to get bits from
> one of the mirrors and then get the hashes directly from dist.
>
> In your current case, I don't think I know of a way to do that.
>
> Now, you may say that the current CDN you guys are using
> is functioning like a mirror -- well, I'd say that it needs to be
> called out like one then.
>
> Otherwise, as a naive user I *really* have to guess where
> to get the hashes.
>
>> what is the difference between linking to an apache mirror vs using a
>> more robust CDN? If people want to verify the downloads they need to
>> go to the apache root in either case.
>>
>> Is this just a cultural thing or is there some security reason?
>
> A bit of both I guess.
>
> Thanks,
> Roman.
Re: Spark 0.8.0: bits need to come from ASF infrastructure
On Wed, Sep 25, 2013 at 3:48 PM, Patrick Wendell wrote:
> Hey we've actually distributed our artifacts through amazon cloudfront
> in the past (and that is where the website links redirect to).
>
> Since the apache mirrors don't distribute signatures anyways,

True, but apache dist does. IOW, it is not uncommon for those
having automated build/fetching systems to get bits from
one of the mirrors and then get the hashes directly from dist.

In your current case, I don't think I know of a way to do that.

Now, you may say that the current CDN you guys are using
is functioning like a mirror -- well, I'd say that it needs to be
called out like one then.

Otherwise, as a naive user I *really* have to guess where
to get the hashes.

> what is the difference between linking to an apache mirror vs using a
> more robust CDN? If people want to verify the downloads they need to
> go to the apache root in either case.
>
> Is this just a cultural thing or is there some security reason?

A bit of both I guess.

Thanks,
Roman.
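The workflow Roman describes (bits from a mirror, hashes from dist) could look roughly like the sketch below. It is only an illustration under stated assumptions: the thread does not say which digest algorithm or checksum file layout the release uses, so the algorithm is a parameter and the parser assumes a "<hex digest>  <filename>" layout. All function names are hypothetical.

```python
import hashlib
import hmac
import re
import urllib.request


def fetch(url, timeout=30):
    """Download a URL and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def parse_digest(text):
    """Extract the leading hex digest from checksum-file text.

    Assumption: the file starts with the hex digest, optionally followed
    by a filename. Other layouts exist and would need their own parser.
    """
    tokens = text.split()
    if not tokens or not re.fullmatch(r"[0-9a-fA-F]+", tokens[0]):
        raise ValueError("no leading hex digest found")
    return tokens[0].lower()


def matches_published_digest(artifact, digest_text, algorithm="sha512"):
    """Compare the artifact's digest with the one published on dist."""
    actual = hashlib.new(algorithm, artifact).hexdigest()
    return hmac.compare_digest(actual, parse_digest(digest_text))


def verify_release(mirror_url, digest_url, algorithm="sha512"):
    """Fetch the bits from a mirror or CDN and the digest from dist."""
    artifact = fetch(mirror_url)
    digest_text = fetch(digest_url).decode("ascii")
    return matches_published_digest(artifact, digest_text, algorithm)
```

A digest match only shows that the downloaded file equals what the digest file describes. Checking the release signature is a separate step that this sketch does not cover.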
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Hey we've actually distributed our artifacts through amazon cloudfront
in the past (and that is where the website links redirect to).

Since the apache mirrors don't distribute signatures anyways, what is
the difference between linking to an apache mirror vs using a more
robust CDN? If people want to verify the downloads they need to go to
the apache root in either case.

Is this just a cultural thing or is there some security reason?

- Patrick

On Wed, Sep 25, 2013 at 3:45 PM, Roman Shaposhnik wrote:
> On Wed, Sep 25, 2013 at 3:40 PM, Henry Saputra wrote:
>> Was there an announcement that the 0.8 artifact had been pushed to
>> http://www.apache.org/dist/incubator/spark ?
>>
>> I thought the link should point to
>> http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz
>
> For the freshly released bits it is typically better to point to
> dyn/closer.cgi unless you want to start building negative karma
> with ASF infra ;-)
>
> Thanks,
> Roman.
Re: Spark 0.8.0: bits need to come from ASF infrastructure
On Wed, Sep 25, 2013 at 3:40 PM, Henry Saputra wrote:
> Was there an announcement that the 0.8 artifact had been pushed to
> http://www.apache.org/dist/incubator/spark ?
>
> I thought the link should point to
> http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz

For the freshly released bits it is typically better to point to
dyn/closer.cgi unless you want to start building negative karma
with ASF infra ;-)

Thanks,
Roman.
Re: Spark 0.8.0: bits need to come from ASF infrastructure
Was there an announcement that the 0.8 artifact had been pushed to
http://www.apache.org/dist/incubator/spark ?

I thought the link should point to
http://www.apache.org/dist/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz

- Henry

On Wed, Sep 25, 2013 at 3:24 PM, Roman Shaposhnik wrote:
> Hi!
>
> I see that the current download link published here:
>     http://spark.incubator.apache.org/releases/spark-release-0-8-0.html
> leads to:
>     http://spark-project.org/download/spark-0.8.0-incubating.tgz
>
> This needs to be corrected to be (roughly):
>     http://www.apache.org/dyn/closer.cgi/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz
>
> In fact, at some point it may be worth auditing your website
> source and eliminating references to spark-project.org that
> should really be pointing back to ASF.
>
> Thanks,
> Roman.
Spark 0.8.0: bits need to come from ASF infrastructure
Hi!

I see that the current download link published here:
    http://spark.incubator.apache.org/releases/spark-release-0-8-0.html
leads to:
    http://spark-project.org/download/spark-0.8.0-incubating.tgz

This needs to be corrected to be (roughly):
    http://www.apache.org/dyn/closer.cgi/incubator/spark/spark-0.8.0-incubating/spark-0.8.0-incubating.tgz

In fact, at some point it may be worth auditing your website
source and eliminating references to spark-project.org that
should really be pointing back to ASF.

Thanks,
Roman.
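The two requests in this message (point downloads at closer.cgi, and audit the site for spark-project.org links) can be sketched in a few lines. The URL roots come from the links quoted in the message; the function names, the regular expression, and the idea of scanning page source as a string are illustrative assumptions, not a description of how the Spark site is actually built.

```python
import re
from urllib.parse import urlsplit

# Roots taken from the URLs quoted in the message above.
CLOSER_ROOT = "http://www.apache.org/dyn/closer.cgi"
DIST_ROOT = "http://www.apache.org/dist"

# Deliberately simple URL matcher; good enough for an audit pass.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def asf_urls(project_path, release, filename):
    """Return the mirror-selector URL and the dist URL for one artifact."""
    relative = "/".join([project_path.strip("/"), release, filename])
    return {
        "mirror": f"{CLOSER_ROOT}/{relative}",
        "dist": f"{DIST_ROOT}/{relative}",
    }


def find_links_to_host(page_source, host="spark-project.org"):
    """List URLs in page_source whose host is `host` or a subdomain of it."""
    hits = []
    for url in URL_PATTERN.findall(page_source):
        hostname = urlsplit(url).hostname or ""
        if hostname == host or hostname.endswith("." + host):
            hits.append(url)
    return hits
```

Running `find_links_to_host` over each page of the website source would give a list of links to review by hand, since some references may be legitimate.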