I have seen three basic approaches to rewriting proxy servers in the wild,
each with its own strengths and weaknesses:
1) Proxy by port
This is the original EZproxy model, where each proxied resource gets its own
port number. This runs afoul of firewall rules that block ports other than
80/443, and it creates a problem for SSL access: clients try both HTTP and
HTTPS on the same port number, and EZproxy is not set up to differentiate
between the two protocols on the same port. With more and more resources
moving to HTTPS, the end of this solution as a viable option is in sight.
2) Proxy by hostname
This is the current preferred EZproxy model, as it addresses the HTTP(S) port
issue, but as you have identified, it creates a hostname mangling issue
instead. Now I’m curious myself about how EZproxy handles an SSL site whose
hostname already contains a hyphen when HttpsHyphens is enabled. I /think/ it
does the right thing by mapping the hostname back to the original internally,
as a “-” in hostnames for release versioning is how the Google App Engine
platform works, but I have not explicitly investigated that.
3) Proxy by path
A different proxy product that we use, Muse Proxy from Edulib, uses proxy by
path, where the original website URL is deconstructed and passed to the proxy
server as query arguments. This approach has worked fairly well, as it cleanly
avoids the hostname mangling issues. Some of the new “single page web apps”
that use JavaScript routing patterns can be tricky, though, so the vendor has
added proxy by hostname support as a fallback option for those sites.
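To make the three models concrete, here is a rough Python sketch of how one
origin URL might be rewritten under each. The function names, the port table,
the /proxy path and the query argument names are all made up for illustration;
this is not the actual URL format of EZproxy or Muse Proxy. The proxy hostname
is the example one used elsewhere in this thread.

```python
from urllib.parse import urlsplit, urlencode

PROXY_HOST = "ezproxy.yourlib.org"  # example hostname from this thread


def proxy_by_port(url, port_table):
    """Each origin host is reached through its own port on the proxy.

    port_table is a hypothetical mapping of origin hostname -> proxy port.
    """
    parts = urlsplit(url)
    port = port_table[parts.hostname]
    return f"http://{PROXY_HOST}:{port}{parts.path or '/'}"


def proxy_by_hostname(url, hyphens=False):
    """The origin hostname becomes a prefix of the proxy hostname.

    With hyphens=True the dots are replaced by "-" so the whole origin
    name collapses into a single DNS label.
    """
    parts = urlsplit(url)
    host = parts.hostname.replace(".", "-") if hyphens else parts.hostname
    return f"{parts.scheme}://{host}.{PROXY_HOST}{parts.path or '/'}"


def proxy_by_path(url):
    """The origin URL is taken apart and passed as query arguments.

    The /proxy path and the argument names are assumptions.
    """
    parts = urlsplit(url)
    query = urlencode({
        "scheme": parts.scheme,
        "host": parts.hostname,
        "path": parts.path or "/",
    })
    return f"https://{PROXY_HOST}/proxy?{query}"


if __name__ == "__main__":
    url = "https://www.somedb.com/search"
    print(proxy_by_port(url, {"www.somedb.com": 2048}))
    print(proxy_by_hostname(url))
    print(proxy_by_hostname(url, hyphens=True))
    print(proxy_by_path(url))
```

Only the proxy by hostname variants change the hostname the browser sees,
which is why that model is the one that interacts with wildcard certificates.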
So there is no perfect solution, but some work better than others. I’m looking
forward to expanding our use of the proxy by path approach, as that is a very
clean approach to this problem, and it seems to have fewer caveats than the
other two approaches.
--
Andrew Anderson, Director of Development, Library and Information Resources
Network, Inc.
http://www.lirn.net/ | http://www.twitter.com/LIRNnotes |
http://www.facebook.com/LIRNnotes
On Dec 18, 2014, at 17:04, Stuart A. Yeates syea...@gmail.com wrote:
It appears that the core of my problem was that I was unaware of
Option HttpsHyphens / NoHttpsHyphens
which toggle between proxying on
https://www.somedb.com.ezproxy.yourlib.org
and
https://www-somedb-com.ezproxy.yourlib.org
and allow infinitely nested domains to be proxied using a simple
wildcard cert by collapsing each hostname into a single label.
The paranoid in me is screaming that there's an interesting brokenness
in here when a separate hosted resource is at https://www-somedb.com/,
but I'm trying to overlook that.
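That worry can be shown with a small Python sketch. It assumes the naive
mapping "replace every dot with a hyphen", which is only a guess at the
behaviour and not a description of how EZproxy actually implements
HttpsHyphens. The helper names are made up.

```python
def hyphenate(hostname):
    """Naive mapping: collapse a hostname into one label."""
    return hostname.replace(".", "-")


def build_reverse_table(origin_hosts):
    """Map each hyphenated label back to its origin host.

    Returns (table, collisions). A label lands in collisions when two
    different origin hosts hyphenate to the same string, so the mapping
    cannot be reversed from the label alone.
    """
    table = {}
    collisions = {}
    for host in origin_hosts:
        label = hyphenate(host)
        if label in table and table[label] != host:
            collisions.setdefault(label, {table[label]}).add(host)
        else:
            table[label] = host
    return table, collisions


if __name__ == "__main__":
    table, collisions = build_reverse_table(
        ["www.somedb.com", "www-somedb.com"]
    )
    print(table)
    print(collisions)
```

Both hosts collapse to www-somedb-com, so a proxy using this naive mapping
would have to keep its own table of configured origins and could serve only
one of the two under that label.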
cheers
stuart
--
...let us be heard from red core to black sky
On Mon, Dec 15, 2014 at 9:24 AM, Stuart A. Yeates syea...@gmail.com wrote:
Some resources are available only via HTTPS. Previously we used a
wildcard certificate. I can't swear that it was ever tested as
working, but we weren't getting any complaints.
Recently browser security has been tightened, RFC 6125 has appeared
and been implemented, and proxying of HTTPS resources with a naive
wildcard cert no longer works (we're getting complaints and are able
to duplicate the issues).
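A simplified Python sketch of the wildcard matching rule shows why the naive
cert fails. It assumes the common reading of RFC 6125 in which "*" is honoured
only as the entire left-most label and stands for exactly one label. Real
clients have further rules (partial-label wildcards, internationalised names)
that are ignored here, and the function name is made up.

```python
def wildcard_matches(pattern, hostname):
    """Return True if a certificate name pattern covers hostname.

    Simplification: "*" counts as a wildcard only when it is the whole
    left-most label, and it matches exactly one label. A "*" anywhere
    else is compared literally, so it never matches a real hostname.
    """
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


if __name__ == "__main__":
    cert = "*.ezproxy.yourlib.org"
    for host in (
        "somedb.ezproxy.yourlib.org",
        "www.somedb.com.ezproxy.yourlib.org",
    ):
        print(host, wildcard_matches(cert, host))
```

Under this rule a proxied name with several labels in front of the proxy
domain is not covered by a single-level wildcard, which matches the failures
described above.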
At
https://security.stackexchange.com/questions/10538/what-certificates-are-needed-for-multi-level-subdomains
there is an interesting solution with multiple wildcards in the same
cert:
foo.com
*.foo.com
*.*.foo.com
...
There is also the possibility that we can just grep the logs for every
machine name ever accessed and generate a huge list.
Has anyone tried these options? Successes? Failures? Thoughts?
cheers
stuart
--
...let us be heard from red core to black sky