[Mongrel] [ANN] Mongrel 0.3.15 PR -- All The Fixes Good For You

Zed A. Shaw Wed, 15 Nov 2006 18:04:31 -0800

Hi folks,

Getting much much much closer to the 1.0 release.  I have some documentation to 
work on tonight, and I need to go through the patch queue one more time, but 
I've put up another pre-release for people to test.


What this pre-release does is pull together the various patches, monkey 
patching, and alternatives that make Mongrel either faster or more stable.  It 
is also the start of an effort to get Mongrel's CGIWrapper to handle the mime 
type decoding as well.

Full (lame ass svk style) ChangeLog is at 
http://mongrel.rubyforge.org/releases/ChangeLog
It's also tagged as 0.3.15 in svn for people who need that.

The better explanation of the changes is:

* Uses Mentalguy's Optimized Sync for locking rather than Mutex or Sync.  
***THIS BREAKS WINDOWS***
* Includes a monkey patched version of Mutex.  If you see memory leaks, throw in a 
require 'mutex_fix' and see if they magically go away.
* Still depends on the cgi multipart fix.  I'll be just putting this into 
Mongrel directly when I work on the new CGIWrapper functionality.


CGI WRAPPER PLANS

When you get this new version (and read the ChangeLog) you'll see mention of a 
BMH implementation.  This is the Boyer-Moore-Horspool algorithm for finding one 
string in another:

http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Horspool_algorithm

I took the above code (which I guess is alright) and modified the hell out of 
it so that you can pass successive chunks of the haystack and it'll find all 
the needles ultra fast even across chunk boundaries.  My initial performance 
measurements put it at about 3.84G/second. Yes, as in 26 seconds to process 
100G of data with 99000 mime boundaries in it.  Commence the arguing.
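
For anyone who wants to poke at the idea without reading the C, here is a rough 
Ruby sketch of a Horspool search that accepts the haystack in chunks.  It is 
not the Mongrel::BMHSearch code or its API: the ChunkedHorspool class, feed, 
and offsets are names made up for illustration, and it assumes matches do not 
overlap (fine for mime boundaries).  It simply carries the unscanned tail of 
each chunk over to the next one, which is one simple way to catch a needle 
that straddles a chunk boundary.

```ruby
# Hypothetical sketch only; NOT the real Mongrel::BMHSearch API.
class ChunkedHorspool
  attr_reader :offsets  # absolute stream offsets of every match found so far

  def initialize(needle)
    @needle = needle.b
    @len = @needle.bytesize
    # Bad-character table: bytes not in the needle shift by the full length.
    @skip = Array.new(256, @len)
    # Every needle byte except the last sets a shorter shift.
    (0...(@len - 1)).each { |i| @skip[@needle.getbyte(i)] = @len - 1 - i }
    @tail = "".b   # bytes from earlier chunks that are not fully scanned yet
    @base = 0      # absolute offset of the first byte of @tail
    @offsets = []
  end

  # Scan the next chunk of the haystack; returns self so calls can be chained.
  def feed(chunk)
    buf = @tail + chunk.b
    pos = 0
    while pos + @len <= buf.bytesize
      if buf.byteslice(pos, @len) == @needle
        @offsets << @base + pos
        pos += @len  # assumption: matches never overlap
      else
        # Horspool shift, keyed on the byte under the needle's last position.
        pos += @skip[buf.getbyte(pos + @len - 1)]
      end
    end
    # Fewer than @len bytes remain; keep them for the next chunk.
    @tail = buf.byteslice(pos, buf.bytesize - pos)
    @base += pos
    self
  end
end

search = ChunkedHorspool.new("--boundary")
search.feed("aa--bou").feed("ndary--xx--boundary")
p search.offsets
```

The first needle in that example is split across the two chunks and still 
gets found.  Pure Ruby like this will be nowhere near the C numbers above.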

Why all the fuss?  The plan is to finally put the last nail in cgi.rb's coffin 
with this tasty bit of C code.  It is implemented in 0.3.15 mongrel as the 
Mongrel::BMHSearch class (feel free to play with it and break it).  With this 
class, Mongrel can now stream out multipart mime uploads *and* record the 
locations of the boundary string as it streams.  When the file is fully 
uploaded it can then go back through and carve the result without further 
scanning.
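
The carving step can be pictured with a few lines of Ruby.  This is only a 
sketch under assumptions: carve is a made-up helper, the data is held in a 
string rather than a file, and the offsets are taken as already recorded 
during the upload.  It returns the raw bytes between consecutive boundaries 
and leaves the part headers and CRLFs for a real parser to deal with.

```ruby
# Hypothetical helper; not part of Mongrel.  Slices out the raw segments
# between recorded boundary offsets without scanning the data again.
def carve(data, offsets, boundary_len)
  offsets.each_cons(2).map do |start, stop|
    from = start + boundary_len          # first byte after this boundary
    data.byteslice(from, stop - from)    # everything up to the next boundary
  end
end

body = "--B\r\none\r\n--B\r\ntwo\r\n--B--".b
parts = carve(body, [0, 10, 20], 3)      # offsets recorded while streaming
p parts
```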

This is coming from my work at http://www.travelistic.com/ where I'm doing a 
specialized upload server, and from Ezra's work at Engineyard, where we're 
finding that the main bottleneck for large file uploads is now just waiting 
around for cgi.rb to do its lame find-mime-boundaries thing on giant files.  
This is all CPU bound, so a new algorithm was in order.  This is also where the 
current cgi.rb security hole is, so I'm hoping to just eliminate that and 
improve cgi multipart performance dramatically.

So, stay tuned.  Mongrel may soon get another boost in this particular domain, 
and then doing uploads will be very nice and fast without using much ram (like 
fastcgi).  If people think this little class is handy outside of Mongrel then 
I'll consider breaking it out.


THE 1.0 PLAN

Still don't have the 1.0 plan formalized, but my big list is:

1) Update the docs (apache needs an update, etc.)
2) Get this last change to CGIWrapper so that multipart mime is fast and cgi.rb 
is no longer needed.
3) Final extensive testing and little bug fixes for all.
4) Any other requests?

Keep in mind that the only core functionality change will be to CGIWrapper.  
This is really the last thing I feel will make Mongrel 1.0 ready.

Test away and let me know if anything bad happens.  BUT DON'T TEST ON 
PRODUCTION.  Yes, some dumbasses do that.

-- 
Zed A. Shaw, MUDCRAP-CE Master Black Belt Sifu
http://www.zedshaw.com/
http://www.awprofessional.com/title/0321483502 -- The Mongrel Book
http://mongrel.rubyforge.org/
http://www.lingr.com/room/3yXhqKbfPy8 -- Come get help.