Hi Lee,

On Tue, Mar 6, 2012 at 8:37 PM, Lee Hambley <lee.hamb...@gmail.com> wrote:
> I believe it has to do with the possibility that on a newly deployed server
> the symlink might not exist in the first place. So a `mv` would fail.

The destination symlink location not existing? That wouldn't cause a
mv to fail, would it? Or the one which is being created by
deploy:create_symlink before it's moved into place? That possibility
doesn't make sense to me either.

To be clear, I don't mean mv the old symlink out of the way. I mean mv
the new one over the top. A rename is atomic, so every read of the
location falls on one side of it or the other and sees either the old
link or the new one; none of them fail. (Well, actually it varies by
*nix flavour - OS X, for example, broke this specification right up
until Lion: http://www.weirdnet.nl/apple/rename.html)
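
For illustration, here is a minimal sketch of that swap in Python rather
than shell. It is not what Capistrano does; the function name and the
temporary link name are made up for the example. One caveat if you do it
with mv itself: when the destination is a symlink to a directory, plain
mv moves the source into that directory instead of replacing the link
(GNU mv needs -T to avoid this), which is why the sketch calls rename
directly via os.replace().

```python
import os


def swap_symlink(target, link_path):
    """Repoint link_path at target without a window where it is missing."""
    # Hypothetical temporary name, placed next to the real link so the
    # rename stays within one directory (and therefore one filesystem).
    tmp_path = f"{link_path}.tmp.{os.getpid()}"
    os.symlink(target, tmp_path)
    try:
        # os.replace() calls rename(2), which acts on the symlink itself
        # rather than following it, and swaps the name in a single step.
        os.replace(tmp_path, link_path)
    except OSError:
        os.unlink(tmp_path)  # do not leave the temporary link behind
        raise
```

Calling swap_symlink("b", "current") also works when current does not
exist yet, since the rename then simply creates it.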

> If this is really a reproducible problem for you (it's never been a problem
> anyone has reported in 5 years of Capistrano being mainstream)

It's a race condition, so it's not easily reproducible under
production conditions. This one is pretty small too: you're unlikely
to see it "in the wild", unless your request rate is phenomenal, or
you're not deploying a website. Then again, in testing, it's extremely
reproducible, so I can't help but think it must be causing the odd
404. Bear in mind that this bug is likely to be under-reported
by its nature: your next deploy would probably go perfectly, and you'd
have to be watching the logs very closely (even after several hundred
successful deploys) to be able to correlate one or two failed requests
back to a deployment problem.

So, to test: first, I set up a simple loop performing the symlink swap
using the shell, just like Capistrano does.

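# Two fake releases, each holding one file that names the release
mkdir a b; echo -n a > a/foo; echo -n b > b/foo; ln -s a current
# Flip "current" between them forever using rm followed by ln
while true; do
  rm current && ln -s a current
  rm current && ln -s b current
done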

Then I tried to repeatedly read the value in current/foo in another
thread (https://gist.github.com/1985290). At full bore, the reader
doesn't get through even a single swap cleanly: the Python loop always
manages to open the file in the gap between the rm and the ln, and
throws an IO error on the first attempt.
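
A reader along these lines can be sketched as follows. This is not the
code from the gist; the function name and the duration parameter are
assumptions for the example, and it counts failures rather than stopping
at the first one.

```python
import time


def count_failed_reads(path, duration=1.0):
    """Read path as fast as possible for `duration` seconds.

    Returns (attempts, failures). A failure is any OSError raised by
    open() or read(), e.g. FileNotFoundError while the symlink is gone.
    """
    attempts = failures = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        attempts += 1
        try:
            with open(path) as f:
                f.read()
        except OSError:
            failures += 1
    return attempts, failures
```

Run it as count_failed_reads("current/foo") while the swap loop is
going; a non-zero failure count means a read landed in the gap.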

The symlink is only unreadable for a few thousandths of a second; my
read loop was running at about 11 kHz. But if you're aiming for something
like C10K (or not serving web at all, but using Capistrano as a
generic deployment tool), then it begins to be more likely that you'll
see problems. And this isn't even considering some of the weirder
stuff that could happen: an interrupt causing execution jitter between
the rm and ln, or the filesystem being marked as read-only between
them, etc.

> you can easily override `deploy:create_symlink` to meet your requirements

Thanks, yeah, I will be. I was only asking to see if there was a
definite purpose or rationale behind it, or whether it had just always
been like that.


Dom