Jimmy Soho <[email protected]> wrote:
> Hi All,
>
> I was wondering what would happen when large files were uploaded to
> our system in parallel to endpoints that don't process file uploads.
> In particular I was wondering if we're vulnerable to a simple DoS
> attack.
nginx will protect you by buffering large requests to disk, so slow
requests are taken care of (of course you may still run out of disk
space).

> The setup I tested with was nginx v1.2.4 with upload module (v2.2.0)
> configured only for location /uploads with 2 unicorn (v4.3.1) workers
> with timeout 30 secs, all running on 1 small unix box.
>
> In a few terminals I started this command 3 times in parallel:
>
>   $ curl -i -F importer_input=@/Users/admin/largefile.tar.gz \
>       https://mywebserver.com/doesnotexist
>
> In a browser I then tried to go to a page that would be served by a
> unicorn worker.
>
> My expectation was that I would not get to see the web page, as all
> unicorn workers would be busy receiving / saving the upload.  As
> discussed, for example, in this article:
> http://stackoverflow.com/questions/9592664/unicorn-rails-large-uploads
> Or as https://github.com/dwilkie/carrierwave_direct describes it:
>
>   "Processing and saving file uploads are typically long running tasks
>   and should be done in a background process."

That is true.  It's good to move slow jobs to background processes if
possible, if the bottleneck is either:

  a) your application processing
  b) the storage destination of your app (e.g. cloud storage)

However, if your only bottleneck is client <-> your app, then nginx
will take care of that part for you.

> But I don't see this.  The page is served just fine in my setup.  The
> requests for the file uploads appear in the nginx access log at the
> same time the curl upload command eventually finishes minutes later
> client side, and then each is handed off to a unicorn/rack worker
> process, which quickly returns a 404 page not found.  Response times
> of less than 50ms.
>
> What am I missing here?  I'm starting to wonder what's the use of the
> nginx upload module?  My understanding was that its use was to keep
> unicorn workers available as long as a file upload was in progress,
> but it seems that without that module it does the same thing.
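For case a), moving the slow work off the request cycle can be sketched
with nothing but the Ruby standard library: the request only enqueues,
and a background thread drains the queue.  This is a toy illustration
only (the file path and the "processing" are made up); a real app would
reach for a job system such as Resque or delayed_job instead:

```ruby
require "thread"

jobs = Queue.new     # filled by the request cycle
results = Queue.new  # stand-in for wherever results really go

worker = Thread.new do
  # pop returns nil on our shutdown sentinel, ending the loop
  while (path = jobs.pop)
    # stand-in for slow processing (virus scan, resize, cloud upload...)
    results.push("processed #{path}")
  end
end

# In the request cycle: enqueue and return to the client immediately.
jobs.push("/tmp/largefile.tar.gz")

jobs.push(nil)  # shut the worker down
worker.join
```

The point is only that the unicorn worker is freed as soon as the push
returns; durability and retries are what the real job systems add.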
I'm not familiar with the nginx upload module, but stock nginx will
already do full request buffering for you.  It looks like the nginx
upload module [1] is mostly meant for standalone apps written for
nginx, and not for when nginx is used as a proxy for a Rails app...

[1] http://www.grid.net.ru/nginx/upload.en.html

> Another question (more an nginx question though I guess): is there a
> way to kill an upload request as early as possible if the request is
> not made against known / accepted URI locations, instead of waiting
> for it to be completely uploaded to our system and/or waiting for it
> to reach the unicorn workers?

I'm not sure if nginx has this functionality, but unicorn lazily
buffers uploads.  So your upload will be fully read by nginx, but
unicorn will only read the uploaded request body if your application
wants to read it.

Unfortunately, I think most application frameworks (e.g. Rails) will
attempt to do all the multipart parsing up front.  To get around this,
you'll probably want some middleware along the following lines (placed
in front of whichever part of your stack calls
Rack::Multipart.parse_multipart):

  class BadUploadStopper
    def initialize(app)
      @app = app
    end

    def call(env)
      case env["REQUEST_METHOD"]
      when "POST", "PUT"
        case env["PATH_INFO"]
        when "/upload_allowed"
          @app.call(env) # forward to the app
        else
          # bad path, don't waste time with @app.call
          # (Content-Type is required by the Rack spec for a 403)
          [ 403, { "Content-Type" => "text/plain" }, [ "Go away\n" ] ]
        end
      else
        @app.call(env) # forward to the app
      end
    end
  end

------------------- config.ru ---------------------
use BadUploadStopper
run YourApp.new

_______________________________________________
Unicorn mailing list - [email protected]
http://rubyforge.org/mailman/listinfo/mongrel-unicorn
Do not quote signatures (like this one) or top post when replying
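P.S. You can sanity-check the routing logic of such a middleware
without booting a server by driving it directly with hand-built env
hashes.  The class is repeated below so the snippet runs on its own,
and the downstream "app" is just a stub lambda (both are assumptions
for illustration, not part of any real stack):

```ruby
class BadUploadStopper
  def initialize(app)
    @app = app
  end

  def call(env)
    case env["REQUEST_METHOD"]
    when "POST", "PUT"
      case env["PATH_INFO"]
      when "/upload_allowed"
        @app.call(env) # forward to the app
      else
        # bad path, don't waste time with @app.call
        [ 403, { "Content-Type" => "text/plain" }, [ "Go away\n" ] ]
      end
    else
      @app.call(env) # forward to the app
    end
  end
end

# Stub downstream app that always answers 200.
app = lambda { |env| [ 200, { "Content-Type" => "text/plain" }, [ "OK\n" ] ] }
stopper = BadUploadStopper.new(app)

# POST to the allowed path is forwarded to the app...
allowed  = stopper.call("REQUEST_METHOD" => "POST", "PATH_INFO" => "/upload_allowed")
# ...a POST anywhere else is rejected before the body is ever read...
rejected = stopper.call("REQUEST_METHOD" => "POST", "PATH_INFO" => "/doesnotexist")
# ...and non-upload methods pass through regardless of path.
readonly = stopper.call("REQUEST_METHOD" => "GET",  "PATH_INFO" => "/doesnotexist")
```

Note the rejected request never touches env["rack.input"], which is
the whole point: the body stays unread.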
