>>> Supporting PAC by polipo would ease the forwarding.

>> It's been a while since I looked at it, but if memory serves, the 
>> "proxy.pac" file is written in Javascript.

> The idea of using it within a proxy as the upstream proxy selection 
> mechanism is interesting, but I agree it's not worth the complexity. 
> I've also seen some pretty nasty proxy.pac abuse, with 100's of lines of 
> js for IMHO no good reason. Using js just invites such abuse.

Uh-uh.  On the other hand, Javascript allows neat tricks such as 
implementing load balancing in just a few lines of code -- without a full 
programming language, you'd need to hard-wire that in your browser.  (For 
the idea of a full programming language at the wrong layer followed to its 
logical extreme, have a look at "NeFS":

     ftp://ftp.funet.fi/pub/networking/documents/historical/nefs.doc.ps.Z

which makes NeWS (q.v.) seem almost reasonable.  Note that it's a really 
old-fashioned PS file, generated by FrameMaker 2, with the pages in the 
inverse order; you'll probably need to convert it to PDF in order to read it.)


> I'd much rather see some sort of generic regex url matching together 
> with support for applying these for selecting parent proxies, caching 
> policy settings, and whatever else you might want to vary per url. 
> Something similar, but less overkill complex, to squid's acl approach.

I wouldn't be entirely opposed to a clean implementation of that landing 
in Polipo -- but I won't do it myself.  Especially since regexps are not 
quite trivial to optimise.

In Polipo's URL matcher (forbidden.c), I distinguish between domains and 
regexps.  I first check for a domain match (using a binary search in 
O(log n)), and only after that fails to match do I go to the regexp matcher. 
The reason for that is that general-purpose regexp libraries are optimised for 
space and convenience, not for time.  If you do some profiling, you'll 
realise that with just a few hundred regexps, Polipo is spending much of 
its time in the regexp matcher.
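
The two-stage lookup described above can be sketched as follows.  This is 
an illustration only, not the code in forbidden.c: the domain table, the 
pattern list and the names matcher_init() and url_matches() are all 
hypothetical, and it assumes exact host matches and POSIX extended regexps 
via <regex.h>.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical domain table; must be sorted in strcmp() order for bsearch(). */
static const char *const domains[] = {
    "ads.example.com",
    "tracker.example.net",
    "www.example.org",
};
#define NDOMAINS (sizeof(domains) / sizeof(domains[0]))

/* Hypothetical pattern list, written as POSIX extended regexps. */
static const char *const patterns[] = {
    "/banners?/",
    "[?&]utm_[a-z]+=",
};
#define NPATTERNS (sizeof(patterns) / sizeof(patterns[0]))

static regex_t compiled[NPATTERNS];
static int compiled_ok = 0;

/* bsearch() comparator: the key is a string, the element a pointer to one. */
static int cmp_domain(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Compile every pattern once.  Returns 0 on success, -1 on failure. */
int matcher_init(void)
{
    size_t i;
    for (i = 0; i < NPATTERNS; i++) {
        if (regcomp(&compiled[i], patterns[i], REG_EXTENDED | REG_NOSUB) != 0) {
            while (i > 0)
                regfree(&compiled[--i]);
            return -1;
        }
    }
    compiled_ok = 1;
    return 0;
}

/* Returns 1 if the host or the URL matches, 0 otherwise. */
int url_matches(const char *host, const char *url)
{
    size_t i;

    /* Cheap path first: O(log n) binary search over the sorted domains. */
    if (bsearch(host, domains, NDOMAINS, sizeof(domains[0]), cmp_domain))
        return 1;

    /* Expensive path: only reached when the domain lookup fails. */
    if (!compiled_ok)
        return 0;
    for (i = 0; i < NPATTERNS; i++)
        if (regexec(&compiled[i], url, 0, NULL, 0) == 0)
            return 1;
    return 0;
}
```

Call matcher_init() once at startup, then url_matches(host, url) per 
request.  The ordering is the point: the regexp loop costs one regexec() 
per pattern, so anything that can be answered by the domain table should be.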

Writing a really fast regexp library is not exactly rocket science -- you 
need to generate a DFA, then minimise it, and interpret the result in 
a really tight loop.  That's much simpler than the kind of magic that 
LALR/GLR parser generators do on their push-down automaton, but for some 
reason I was unable to find a DFA library that did what I needed for 
Polipo, hence the above-mentioned hack.
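
To make the "tight loop" concrete, here is a minimal sketch of the 
interpretation step only.  It assumes the DFA has already been generated 
and minimised; the table below is built by hand for the toy pattern 
"input contains the substring ads", and the names dfa_init() and 
dfa_match() are hypothetical.

```c
#include <stddef.h>

/* States: 0 = nothing seen, 1 = "a", 2 = "ad", 3 = "ads" (accepting). */
enum { NSTATES = 4, ACCEPT = 3 };

/* One row per state, one column per input byte. */
static unsigned char delta[NSTATES][256];

/* Fill the transition table by hand; a real library would generate it. */
void dfa_init(void)
{
    int c;
    for (c = 0; c < 256; c++) {
        delta[0][c] = 0;      /* default: fall back to the start state */
        delta[1][c] = 0;
        delta[2][c] = 0;
        delta[3][c] = 3;      /* accepting state is absorbing */
    }
    delta[0]['a'] = 1;
    delta[1]['a'] = 1;        /* "aa": still one 'a' seen */
    delta[1]['d'] = 2;
    delta[2]['a'] = 1;        /* "ada": restart from one 'a' seen */
    delta[2]['s'] = 3;
}

/* Run the DFA over len bytes of s; returns 1 if it ends in ACCEPT. */
int dfa_match(const char *s, size_t len)
{
    unsigned state = 0;
    size_t i;

    /* The whole matcher: one table lookup per input byte, no backtracking. */
    for (i = 0; i < len; i++)
        state = delta[state][(unsigned char)s[i]];
    return state == ACCEPT;
}
```

After dfa_init(), dfa_match("loads", 5) follows 0, 0, 1, 2, 3 and accepts.  
The cost per byte is constant regardless of how many patterns were merged 
into the automaton, which is what makes this approach attractive compared 
with running each regexp in turn.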

-- Juliusz
