There are three basic approaches to rewriting proxy servers that I have seen in 
the wild, each with its own strengths and weaknesses:

1) Proxy by port

This is the original EZproxy model, where each proxied resource gets its own 
port number.  This runs afoul of firewall rules that block traffic to ports 
other than 80/443, and it creates a problem for SSL access: clients try both 
HTTP and HTTPS against the same port number, and EZproxy is not set up to 
differentiate between the two protocols on the same port.  With more and more 
resources moving to HTTPS, the end of this solution as a viable option is in 
sight.

2) Proxy by hostname

This is the current preferred EZproxy model, as it addresses the HTTP(S) port 
issue, but as you have identified, it creates a hostname mangling issue 
instead.  Now I’m curious myself about how EZproxy handles an SSL site whose 
hostname already contains a hyphen when HttpsHyphens is enabled.  I /think/ it 
does the right thing by mapping the hostname back to the original internally, 
since a “-” in hostnames for release versioning is how the Google App Engine 
platform works, but I have not explicitly investigated that.
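
To make the hostname mangling concrete, here is a minimal Python sketch.  It 
assumes a naive dots-to-hyphens rewrite, which is my own simplification and 
not necessarily what EZproxy does internally.  The function names are made up 
for illustration, and the wildcard check only handles a whole-label "*" in the 
left-most position.  The hostnames are the ones from the quoted message below.

```python
def hyphenate(origin_host: str, proxy_suffix: str = "ezproxy.yourlib.org") -> str:
    """Naive dots-to-hyphens rewrite (an assumption, NOT EZproxy's real algorithm)."""
    return origin_host.replace(".", "-") + "." + proxy_suffix


def wildcard_matches(pattern: str, host: str) -> bool:
    """Simplified wildcard check: a left-most '*' covers exactly one label."""
    p_labels = pattern.lower().split(".")
    h_labels = host.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False  # '*' never spans more than one label
    if p_labels[0] != "*" and p_labels[0] != h_labels[0]:
        return False
    return p_labels[1:] == h_labels[1:]


if __name__ == "__main__":
    cert = "*.ezproxy.yourlib.org"
    # Dotted form has extra labels, so a single wildcard does not cover it.
    print(wildcard_matches(cert, "www.somedb.com.ezproxy.yourlib.org"))
    # Hyphenated form is a single label under the proxy domain.
    print(wildcard_matches(cert, hyphenate("www.somedb.com")))
    # With the naive rewrite, two different origins land on the same name.
    print(hyphenate("www.somedb.com") == hyphenate("www-somedb.com"))
```

The last line is the collision case: with a rewrite this naive, the proxy 
would need some extra bookkeeping (or an escaping rule for existing hyphens) to 
map the name back to the right origin.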

3) Proxy by path

A different proxy product that we use, Muse Proxy from Edulib, leverages proxy 
by path, where the original website URL is deconstructed and passed to the 
proxy server as query arguments.  This approach has worked fairly well as it 
cleanly avoids the hostname mangling issues, though some of the new “single 
page web apps” that use JavaScript routing patterns can be interesting, so the 
vendor has added proxy by hostname support as an option for those sites as a 
fallback.
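
As a rough illustration of the proxy by path idea, here is a short Python 
sketch.  The proxy base URL and the query argument names (scheme, host, path, 
query) are hypothetical and are not Muse Proxy's actual URL format; the point 
is only that the origin hostname travels as data, so the proxy's own hostname 
never changes and a single ordinary certificate covers it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Hypothetical proxy entry point; not a real Muse Proxy endpoint.
PROXY_BASE = "https://proxy.yourlib.org/fetch"


def to_proxy_url(origin_url: str) -> str:
    """Deconstruct the origin URL and pass its parts as query arguments."""
    parts = urlsplit(origin_url)
    args = {
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": parts.path or "/",
        "query": parts.query,
    }
    return PROXY_BASE + "?" + urlencode(args)


def from_proxy_url(proxy_url: str) -> str:
    """Rebuild the origin URL on the proxy side from the query arguments."""
    args = parse_qs(urlsplit(proxy_url).query, keep_blank_values=True)
    return urlunsplit((
        args["scheme"][0],
        args["host"][0],
        args["path"][0],
        args["query"][0],
        "",  # fragments never reach the server, so none is restored
    ))


if __name__ == "__main__":
    origin = "https://www-somedb.com/search?q=rfc+6125"
    proxied = to_proxy_url(origin)
    print(proxied)
    print(from_proxy_url(proxied) == origin)
```

Because nothing is rewritten in the hostname, www.somedb.com and 
www-somedb.com stay distinct without any special handling.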

So there is no perfect solution, but some work better than others.  I’m looking 
forward to expanding our use of the proxy by path approach, as that is a very 
clean approach to this problem, and it seems to have fewer caveats than the 
other two approaches.

-- 
Andrew Anderson, Director of Development, Library and Information Resources 
Network, Inc.
http://www.lirn.net/ | http://www.twitter.com/LIRNnotes | 
http://www.facebook.com/LIRNnotes

On Dec 18, 2014, at 17:04, Stuart A. Yeates <syea...@gmail.com> wrote:

> It appears that the core of my problem was that I was unaware of
> 
> Option HttpsHyphens / NoHttpsHyphens
> 
> which toggle between proxying on
> 
> https://www.somedb.com.ezproxy.yourlib.org
> 
> and
> 
> https://www-somedb-com.ezproxy.yourlib.org
> 
> and allows infinitely nested domains to be proxied using a simple
> wildcard cert by compressing things.
> 
> The paranoid in me is screaming that there's an interesting brokenness
> in here when a separate hosted resource is at https://www-somedb.com/,
> but I'm trying to overlook that.
> 
> cheers
> stuart
> --
> ...let us be heard from red core to black sky
> 
> 
> On Mon, Dec 15, 2014 at 9:24 AM, Stuart A. Yeates <syea...@gmail.com> wrote:
>> Some resources are available only via HTTPS. Previously we used a
>> wildcard certificate; I can't swear that it was ever tested as
>> working, but we weren't getting any complaints.
>> 
>> Recently browser security has been tightened, RFC 6125 has appeared
>> and been implemented, and proxying of HTTPS resources with a naive
>> wildcard cert no longer works (we're getting complaints and are able
>> to duplicate the issues).
>> 
>> At 
>> https://security.stackexchange.com/questions/10538/what-certificates-are-needed-for-multi-level-subdomains
>> there is an interesting solution with multiple wildcards in the same
>> cert:
>> 
>> foo.com
>> *.foo.com
>> *.*.foo.com
>> ...
>> 
>> There is also the possibility that we can just grep the logs for every
>> machine name ever accessed and generate a huge list.
>> 
>> Has anyone tried these options? Successes? Failures? Thoughts?
>> 
>> cheers
>> stuart
>> 
>> 
>> --
>> ...let us be heard from red core to black sky
