Re: Enable choosing mirror for updates.xml (was: Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment)

2021-06-03 Thread Matthias Bläsing
Hi,

as there was _zero_ feedback, I just killed the experiment. I'll leave
the code in the tools repository.

Greetings

Matthias

On Tuesday, 02.03.2021 at 20:08 +0100, Matthias Bläsing wrote:
> Hi,
> 
> On Thursday, 30.01.2020, 14:55 +, Jean-Marc Borer wrote:
> > 
> > This is a bit of a complaint addressed to the NB infra community. When you
> > are stuck behind a corporate proxy that is very aggressively filtering web
> > content, mirrors are a nightmare for you. In my case we need to whitelist
> > every single server hosting java binaries otherwise we get empty files. As
> > such, it is not viable to declare every single mirror (the company won't
> > accept that).
> > 
> > So my request is: please for NB IDE updates and their core components,
> > never ever use mirrors or provide an option to always point to the same
> > mirrored site.
> > 
> > This may sound silly, but it is a real issue within enterprises and
> > therefore hinders the adoption of NB IDE far more than that of the
> > alternatives, which is really a pity...
> > 
> 
> this is a late reply, but I finally managed to get this into a state
> I'm willing to put into production and get it onto the 
> netbeans-vm1.apache.org server.
> 
> So that no one can hold this against me: THIS IS EXPERIMENTAL!
> 
> The problem is that we have to serve the update catalog from a trusted
> source, but still want to use the mirror infrastructure to distribute
> load around the globe. The idea to fix this is to allow the user to
> choose a mirror, while still fetching the catalog from netbeans-vm.
> 
> The code can be found here:
> 
> https://github.com/apache/netbeans-tools
> 
> Instructions for using it can be found here:
> 
> https://netbeans-vm1.apache.org/uc-proxy-chooser/mirror-list.php
> 
> Basically, this URL:
> 
> https://netbeans-vm1.apache.org/uc/12.2/updates.xml.gz?{$netbeans.hash.code}
> 
> becomes this:
> 
> https://netbeans-vm1.apache.org/uc-proxy-chooser/12.2/updates.xml.gz?{$netbeans.hash.code}&mirror=
> 
> The PHP Script behind this uses the original update catalog (with
> cryptographic hashes) and patches the URLs in it to directly point to
> the chosen mirror.
> 
> For anyone interested in the generated content, I found this command
> line helpful:
> 
> curl 'https://doppel-helix.eu/uc-proxy-chooser/12.2/updates.xml.gz?mirror=ftp-stud.hs-esslingen.de' | gzip -d | less
> 
> 
> With this modification, two servers need to be whitelisted:
> 
> - the chosen mirror host (for active releases)
> - archive.apache.org (for older releases)
> 
> 
> I would be interested in feedback.
> 
> Greetings
> 
> Matthias
> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@netbeans.apache.org
> For additional commands, e-mail: dev-h...@netbeans.apache.org
> 
> For further information about the NetBeans mailing lists, visit:
> https://cwiki.apache.org/confluence/display/NETBEANS/Mailing+lists
> 
> 
> 








Enable choosing mirror for updates.xml (was: Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment)

2021-03-02 Thread Matthias Bläsing
Hi,

On Thursday, 30.01.2020, 14:55 +, Jean-Marc Borer wrote:
> 
> This is a bit of a complaint addressed to the NB infra community. When you
> are stuck behind a corporate proxy that is very aggressively filtering web
> content, mirrors are a nightmare for you. In my case we need to whitelist
> every single server hosting java binaries otherwise we get empty files. As
> such, it is not viable to declare every single mirror (the company won't
> accept that).
> 
> So my request is: please for NB IDE updates and their core components,
> never ever use mirrors or provide an option to always point to the same
> mirrored site.
> 
> This may sound silly, but it is a real issue within enterprises and
> therefore hinders the adoption of NB IDE far more than that of the
> alternatives, which is really a pity...
> 

this is a late reply, but I finally managed to get this into a state
I'm willing to put into production and get it onto the 
netbeans-vm1.apache.org server.

So that no one can hold this against me: THIS IS EXPERIMENTAL!

The problem is that we have to serve the update catalog from a trusted
source, but still want to use the mirror infrastructure to distribute
load around the globe. The idea to fix this is to allow the user to
choose a mirror, while still fetching the catalog from netbeans-vm.

The code can be found here:

https://github.com/apache/netbeans-tools

Instructions for using it can be found here:

https://netbeans-vm1.apache.org/uc-proxy-chooser/mirror-list.php

Basically, this URL:

https://netbeans-vm1.apache.org/uc/12.2/updates.xml.gz?{$netbeans.hash.code}

becomes this:

https://netbeans-vm1.apache.org/uc-proxy-chooser/12.2/updates.xml.gz?{$netbeans.hash.code}&mirror=
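
To make the URL change concrete, here is a minimal Python sketch of the
rewrite. It is only an illustration: the function name chooser_url is made
up for this example, the mirror host is just a sample value, and the sketch
assumes the `mirror` parameter carries a plain host name.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def chooser_url(catalog_url, mirror_host):
    """Turn a /uc/... catalog URL into its /uc-proxy-chooser/... counterpart."""
    parts = urlsplit(catalog_url)
    if not parts.path.startswith("/uc/"):
        raise ValueError("not an update-center catalog URL: " + catalog_url)
    path = "/uc-proxy-chooser/" + parts.path[len("/uc/"):]
    # Keep the existing query (the IDE's hash-code placeholder) untouched
    # and append the mirror host as an additional parameter.
    mirror = "mirror=" + quote(mirror_host, safe="")
    query = parts.query + "&" + mirror if parts.query else mirror
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


print(chooser_url(
    "https://netbeans-vm1.apache.org/uc/12.2/updates.xml.gz?{$netbeans.hash.code}",
    "ftp-stud.hs-esslingen.de",
))
```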

The PHP Script behind this uses the original update catalog (with
cryptographic hashes) and patches the URLs in it to directly point to
the chosen mirror.

For anyone interested in the generated content, I found this command
line helpful:

curl 'https://doppel-helix.eu/uc-proxy-chooser/12.2/updates.xml.gz?mirror=ftp-stud.hs-esslingen.de' | gzip -d | less


With this modification, two servers need to be whitelisted:

- the chosen mirror host (for active releases)
- archive.apache.org (for older releases)
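
The rule behind these two entries can be sketched as a small lookup. This
is an illustration under assumptions, not the PHP script's code: the
function name, the mirror table and the path layouts (`/apache/...` on the
mirror, `/dist/...` on the archive) are hypothetical, and rejecting unknown
mirror hosts is a choice made for this sketch.

```python
ARCHIVE_BASE = "https://archive.apache.org/dist/"  # assumed archive layout

# Hypothetical mirror table: host name -> base URL of its Apache tree.
MIRRORS = {
    "ftp-stud.hs-esslingen.de": "https://ftp-stud.hs-esslingen.de/apache/",
}


def nbm_base_url(release, mirror, mirrors, active_releases):
    """Return the directory URL from which the nbms of `release` are served."""
    suffix = f"netbeans/netbeans/{release}/nbms/"
    if release not in active_releases:
        # Archived release: the requested mirror is ignored.
        return ARCHIVE_BASE + suffix
    if mirror not in mirrors:
        # Only hosts from the known mirror list are accepted.
        raise ValueError(f"unknown mirror: {mirror}")
    return mirrors[mirror] + suffix


print(nbm_base_url("12.2", "ftp-stud.hs-esslingen.de", MIRRORS, {"12.2"}))
print(nbm_base_url("11.2", "ftp-stud.hs-esslingen.de", MIRRORS, {"12.2"}))
```

Checking the host against a fixed table also means the `mirror` parameter
cannot be used to point the catalog at an arbitrary server.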


I would be interested in feedback.

Greetings

Matthias







Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment

2020-04-14 Thread Neil C Smith
Hi,

On Tue, 14 Apr 2020 at 12:27, antonio wrote:
> I think it's better to generate an "updates.xml" file for each mirror,
> containing full URLs for the given mirror, and generate them _once_, and
> not for _every_ user request. That would be 226 [1] updates.xml files. We
> can generate a webpage that allows users to select a mirror of their choice.

That would be a good idea.  Or at least some caching so that they're
not generated each time, but possibly validated on a regular basis?
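
A minimal sketch of such a cache, assuming a hypothetical
generate(release, mirror) function that builds the patched catalog and a
fixed maximum age after which an entry is rebuilt. The class and parameter
names are made up for illustration, and an in-memory dictionary stands in
for whatever storage the server would really use.

```python
import time


class CatalogCache:
    """Keeps generated catalogs in memory and regenerates them when stale."""

    def __init__(self, generate, max_age_seconds, clock=time.monotonic):
        self._generate = generate        # hypothetical: (release, mirror) -> bytes
        self._max_age = max_age_seconds
        self._clock = clock              # injectable so expiry can be tested
        self._entries = {}               # (release, mirror) -> (created_at, data)

    def get(self, release, mirror):
        key = (release, mirror)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]              # still fresh: no regeneration
        data = self._generate(release, mirror)
        self._entries[key] = (now, data)
        return data
```

With this shape the expensive generation runs at most once per release,
mirror and validity period, regardless of how often the IDEs poll.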

Looking at the closer.cgi [1] script I'm wondering whether, if we are
going down this route, we could generate and cache update catalog
files by country code (cca2) as well? That would require a GeoIP
lookup on every request for the catalog, though. On the other hand, all
nbm requests would bypass the redirects - pros and cons.

[1] https://svn.apache.org/repos/asf/infrastructure/site/trunk/content/dyn/closer.lua

Best wishes,

Neil






Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment

2020-04-14 Thread antonio

Hi Matthias,

Thanks for the explanation, crystal clear now.

I think it's better to generate an "updates.xml" file for each mirror,
containing full URLs for the given mirror, and generate them _once_, and
not for _every_ user request. That would be 226 [1] updates.xml files. We
can generate a webpage that allows users to select a mirror of their choice.

Then Jean-Marc can download this mirror-specific updates.xml file (for
the mirror of his IT department's choice), add it to his update center in
the IDE, and download files directly from that mirror.


Would this work?

Regarding the "fragile update code" I think we should think about a more 
Apache-centric solution to the problem, and refactor/review the code, 
right? Maybe 13.0 is a good target?


Kind regards,
Antonio


[1]
http://www.apache.org/mirrors/dist.html
Currently there're 226 Apache full mirrors.







Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment

2020-04-14 Thread Neil C Smith
On Tue, 14 Apr 2020 at 11:13, Matthias Bläsing wrote:
> Yes - it is the _catalog_, not every nbm. Yes, this still needs work (for
> example, instead of loading the catalog and working on the DOM, a
> streaming parser should be able to do the modification on the fly). The
> alternative would be to generate updates.xml files for every mirror.

But the catalog is the thing that's polled continuously, whereas nbms
are rarely touched. The additional overhead should at least be checked
for, although maybe not much different to the plugin centre?

> Sorry - no idea what this means. Do you mean that closer.lua will
> reject requests because a request limit is reached?

Actually it's the redirect into the VM that gets rejected after roughly
50 nbms, although I think there is potential for closer.lua to reject too.
Anything that could get the IDE and Ant scripts to download nbms
directly from one mirror rather than continuously via all redirects
would be an improvement when multiple nbms are being downloaded in one
batch.

I did a test uninstalling and reinstalling an entire cluster a while
back and that also failed, although might not in all cases.

> > It might be good if the IDE code followed redirects only for the
> > updates.xml, and treated relative links as relative to the catalog
> > endpoint anyway?
>
> This is already the case - the urls are resolved relative to the update
> center URL. BUT that URL must not be redirected to mirrors, as it is
> our trust anchor.

As above, the resolution is a problem - but yes, I made a mistake in
what I said, and it's not like I fixed the .htaccess in the first
place :-)  - we were serving the catalog from the mirrors prior to
11.1.

Somehow, we could do with a situation where a mirror is looked up
once, and all nbms are downloaded directly from it, while of course
still keeping the catalog as the trust anchor. Maybe your approach is
then the only right one left.

> No, we must _never_ redirect the IDE to fetch the updates.xml from an
> untrusted source and the mirror network must be considered untrusted.

Well, I meant end users doing that, not us - but good point, it's not
really the thing to recommend.

That leaves pointing people to try using
https://dist.apache.org/repos/dist/release/netbeans/netbeans/11.3/nbms/updates.xml.gz
if stuck, which might get frowned upon.  :-\

And of course, if we push updated nbms they are manually added into
the catalog on the VM at the moment - so the above won't get updates
either.

Best wishes,

Neil






Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment

2020-04-14 Thread Matthias Bläsing
On Tuesday, 14.04.2020, 10:20 +0100, Neil C Smith wrote:
> On Mon, 13 Apr 2020 at 20:44, Matthias Bläsing wrote:
> > Fetch the base updates.xml:
> ...
> > The difference is that the "distribution" attributes are relative in the
> > base version and fully qualified in the mirror version.
> ...
> > Does this look sane?
> 
> If you're thinking of generating every request for updates.xml?  Not really?!

Yes - it is the _catalog_, not every nbm. Yes, this still needs work (for
example, instead of loading the catalog and working on the DOM, a
streaming parser should be able to do the modification on the fly). The
alternative would be to generate updates.xml files for every mirror.
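
A rough sketch of what the streaming variant could look like, written in
Python rather than PHP. It is an illustration only: the class and function
names and the mirror base URL are assumptions, and the sketch rewrites every
"distribution" attribute it sees, whatever element carries it. Note that a
SAX pipeline like this does not copy a DOCTYPE declaration to the output,
so a real implementation would have to emit it separately.

```python
import io
import xml.sax
from urllib.parse import urljoin
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class MirrorRewriter(XMLGenerator):
    """Copies the XML event stream to `out`, patching distribution URLs."""

    def __init__(self, out, mirror_base):
        super().__init__(out, encoding="utf-8", short_empty_elements=True)
        self._mirror_base = mirror_base

    def startElement(self, name, attrs):
        if "distribution" in attrs:
            patched = dict(attrs.items())
            # Relative values are resolved against the mirror directory;
            # values that are already absolute stay as they are.
            patched["distribution"] = urljoin(
                self._mirror_base, patched["distribution"])
            attrs = AttributesImpl(patched)
        super().startElement(name, attrs)


def rewrite_stream(source, sink, mirror_base):
    """Parse `source` event by event and write the patched XML to `sink`."""
    xml.sax.parse(source, MirrorRewriter(sink, mirror_base))


catalog = io.BytesIO(
    b'<module_updates><module codenamebase="org.example" '
    b'distribution="org-example.nbm"/></module_updates>')
patched_out = io.StringIO()
rewrite_stream(catalog, patched_out, "https://mirror.example/nbms/")
print(patched_out.getvalue())
```

Because elements are written as soon as they are read, memory use does not
grow with the size of the catalog, unlike the DOM approach.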

> The relative / base URL difference in the XML can be a problem - eg.
> using the platform Ant scripts to create the platform will fail due to
> redirects maxing out downloading NBMs.  I'm not sure if installing
> whole clusters in the IDE would trigger the same issue.  In some ways
> we only redirect via NetBeans VM as a statistics gathering mechanism.

Sorry - no idea what this means. Do you mean that closer.lua will
reject requests because a request limit is reached?

> It might be good if the IDE code followed redirects only for the
> updates.xml, and treated relative links as relative to the catalog
> endpoint anyway?

This is already the case - the urls are resolved relative to the update
center URL. BUT that URL must not be redirected to mirrors, as it is
our trust anchor.

> I agree with Antonio that this is probably better handled in the end
> user's IDE itself.  With Apache mirrors now moving to https, changing
> the update centre to directly reference a mirror may be the best
> short-term workaround - eg.
> https://www.mirrorservice.org/sites/ftp.apache.org/netbeans/netbeans/11.3/nbms/updates.xml.gz

No, we must _never_ redirect the IDE to fetch the updates.xml from an
untrusted source and the mirror network must be considered untrusted.

> This obviously doesn't handle mirrors going away after next release is
> done - would need some logic to fall back to default on error if
> providing a UI for this?

Only if we trust the mirrors.

Greetings

Matthias







Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment

2020-04-14 Thread Matthias Bläsing
Hi,

On Tuesday, 14.04.2020, 06:02 +0200, antonio wrote:
> I see your point, I just can't understand why this computes the URLs on
> the server side (in the netbeans-vm php file) and not on the client side
> (in the IDE itself).
> 
> Wouldn't it be easier to let the user choose her preferred mirror (from a
> drop down list, for example) and then compute the appropriate URLs from 
> that? I have a limited understanding on the internals of the update 
> center logic, so I usually get lost on this.

ok, the problem is this:

updates.xml is our trust anchor. It must come from a trusted source
(apache infrastructure) and that is not the case for the mirror
network.

In updates.xml, the URLs of the nbms are specified relative to the
location of the updates.xml file.
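
A small Python illustration of that resolution rule, using standard URL
joining. The nbm file name is made up, and the IDE's own implementation is
Java and may differ in details; the point is only that a relative entry ends
up on the same host as the catalog, while an absolute entry does not.

```python
from urllib.parse import urljoin

CATALOG_URL = "https://netbeans-vm1.apache.org/uc/12.2/updates.xml.gz"


def resolve_nbm(catalog_url, distribution):
    """Resolve a catalog entry the way a relative link is resolved."""
    return urljoin(catalog_url, distribution)


# Relative entry: the request goes to the catalog host first.
print(resolve_nbm(CATALOG_URL, "org-example.nbm"))
# Absolute entry: the request goes straight to the named host.
print(resolve_nbm(CATALOG_URL, "https://mirror.example/nbms/org-example.nbm"))
```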

This is where the current redirection magic comes in. When nbms are
requested the request is passed through a redirect cascade, that ends
in closer.lua, which redirects the request to the "right" mirror.

The problems are:

a) the right mirror can't be found when requested via IPv6
b) the right mirror sometimes needs to be one specific mirror
c) if the user's location can't be determined, "preferred" in closer.lua
   is ignored

So instead of relying on closer.lua, the script moves the decision to
the user, who can specify the mirror to use via the `mirror` parameter.

The PHP script modifies the XML structure to hold absolute URLs instead
of relative ones.

This could be moved into the NetBeans code, but then the already fragile
autoupdate code would need a special case just to deal with the
peculiarities of the Apache infrastructure. We saw in the past that
nearly nobody is willing to go into the autoupdate code and I don't see
this changing anytime soon.

It would still be helpful to create the right URL in the IDE, but
without the server support, this won't fly.

Greetings

Matthias











Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment

2020-04-14 Thread Neil C Smith
On Mon, 13 Apr 2020 at 20:44, Matthias Bläsing wrote:
> Fetch the base updates.xml:
...
> The difference is that the "distribution" attributes are relative in the
> base version and fully qualified in the mirror version.
...
> Does this look sane?

If you're thinking of generating every request for updates.xml?  Not really?!

The relative / base URL difference in the XML can be a problem - eg.
using the platform Ant scripts to create the platform will fail due to
redirects maxing out downloading NBMs.  I'm not sure if installing
whole clusters in the IDE would trigger the same issue.  In some ways
we only redirect via NetBeans VM as a statistics gathering mechanism.

It might be good if the IDE code followed redirects only for the
updates.xml, and treated relative links as relative to the catalog
endpoint anyway?

I agree with Antonio that this is probably better handled in the end
user's IDE itself.  With Apache mirrors now moving to https, changing
the update centre to directly reference a mirror may be the best
short-term workaround - eg.
https://www.mirrorservice.org/sites/ftp.apache.org/netbeans/netbeans/11.3/nbms/updates.xml.gz

This obviously doesn't handle mirrors going away after next release is
done - would need some logic to fall back to default on error if
providing a UI for this?

But we could add instructions for affected users to change UC links to
mirrors in 12.0 and get reports back on issues before potentially
progressing with UI in an update release?  And in doing so deciding
whether we need the statistics gathering aspect of this anyway.

Best wishes,

Neil






Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment

2020-04-13 Thread antonio

Hi Matthias,

Sorry for the late reply.

I see your point, I just can't understand why this computes the URLs on 
the server side (in the netbeans-vm php file) and not on the client side 
(in the IDE itself).


Wouldn't it be easier to let the user choose her preferred mirror (from a
drop down list, for example) and then compute the appropriate URLs from 
that? I have a limited understanding on the internals of the update 
center logic, so I usually get lost on this.


Thanks,
Antonio







Re: Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment

2020-04-13 Thread Matthias Bläsing
Hi all,

On Thursday, 30.01.2020, 14:55 +, Jean-Marc Borer wrote:
> This is a bit of a complaint addressed to the NB infra community. When you
> are stuck behind a corporate proxy that is very aggressively filtering web
> content, mirrors are a nightmare for you. In my case we need to whitelist
> every single server hosting java binaries otherwise we get empty files. As
> such, it is not viable to declare every single mirror (the company won't
> accept that).
> 
> So my request is: please for NB IDE updates and their core components,
> never ever use mirrors or provide an option to always point to the same
> mirrored site.

I implemented a first variant:

https://github.com/matthiasblaesing/netbeans-tools/tree/proxy-chooser
https://github.com/matthiasblaesing/netbeans-tools/commit/ab41d0dd364a9135675cdf5331c3a0500cb63a46

The result can be seen here:

NetBeans 11.3:

Fetch the base updates.xml:

https://doppel-helix.eu/apache/netbeans/netbeans/11.3/nbms/updates.php

Fetch an updates.xml, that uses the mirror ftp.halifax.rwth-aachen.de
(for a list see http://www.apache.org/mirrors/):

https://doppel-helix.eu/apache/netbeans/netbeans/11.3/nbms/updates.php?mirror=ftp.halifax.rwth-aachen.de

For archived versions, the "mirror" entry is ignored and the source is
moved to archive:

https://doppel-helix.eu/apache/netbeans/netbeans/11.2/nbms/updates.php?mirror=ftp.halifax.rwth-aachen.de


The difference is that the "distribution" attributes are relative in the
base version and fully qualified in the mirror version.
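
For illustration, the transformation can be sketched in a few lines of
Python using a DOM-style parse. This is not the PHP script from the
repository: the element name `module`, the function name and the mirror
base URL are assumptions made for this example, and serialising the tree
this way drops any DOCTYPE declaration of the input.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def patch_catalog(catalog_xml, mirror_base):
    """Return the catalog with relative distribution URLs made absolute."""
    root = ET.fromstring(catalog_xml)
    for module in root.iter("module"):
        dist = module.get("distribution")
        if dist:
            # urljoin leaves already-absolute URLs alone and resolves
            # relative ones against the mirror directory.
            module.set("distribution", urljoin(mirror_base, dist))
    return ET.tostring(root, encoding="unicode")


base_version = (
    '<module_updates>'
    '<module codenamebase="org.example" distribution="org-example.nbm"/>'
    '</module_updates>'
)
print(patch_catalog(base_version, "https://mirror.example/netbeans/11.3/nbms/"))
```

Only the download locations change; any hashes carried in the catalog stay
as they are, so the downloaded files can still be checked against them.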

@all:

Does this look sane?

@Antonio, Neil: Here is the setup I used to simulate:

$docRoot/apache/netbeans/netbeans/11.3/nbms/updates.xml
    holds the updates.xml from NB 11.3
$docRoot/apache/netbeans/netbeans/11.2/nbms/updates.xml
    holds the updates.xml from NB 11.2
$docRoot/apache/proxy-chooser/updates.php
    holds the script from the above git repository; the folder is
    writeable by the webserver

The .htaccess file reads:

RewriteEngine on
RewriteRule !/apache/netbeans/netbeans/11.2/nbms/updates.php.*$ apache/proxy-chooser/updates.php$1 [L]
RewriteRule !/apache/netbeans/netbeans/11.3/nbms/updates.php.*$ apache/proxy-chooser/updates.php$1 [L]

Would that work on the netbeans-vm? Do you have alternatives or improvements to suggest?

If you have questions, feel free to ask!

Matthias







Fwd: Update centers based on mirrors are a nightmare to support in an enterprise environment

2020-01-30 Thread Jean-Marc Borer
Hi guys,

This is a bit of a complaint addressed to the NB infra community. When you
are stuck behind a corporate proxy that is very aggressively filtering web
content, mirrors are a nightmare for you. In my case we need to whitelist
every single server hosting java binaries otherwise we get empty files. As
such, it is not viable to declare every single mirror (the company won't
accept that).

So my request is: please for NB IDE updates and their core components,
never ever use mirrors or provide an option to always point to the same
mirrored site.

This may sound silly, but it is a real issue within enterprises and
therefore hinders the adoption of NB IDE far more than that of the
alternatives, which is really a pity...

Regards,

JMB