On Sun, Oct 31, 2010 at 09:28:50AM -0200, Rodrigo Rosenfeld Rosas wrote:
>  On 30-10-2010 15:55, Aaron Patterson wrote:
> >...
> >>When listening to Yehuda's talk about development on the client
> >>side, which was really great by the way, I got really worried when
> >>he commented about some Rails validations not being concurrent-safe
> >>nor even thread-safe.
> >>
> >>While I can understand it is hard to guarantee uniqueness validation
> >>among different server instances, this can be easily avoided in a
> >>single server configuration with config.threadsafe! enabled.
> >>
> >>...
> >>I've just read the documentation for validates_uniqueness_of and it
> >>explains well the problem:
> >>
> >>http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#method-i-validates_uniqueness_of
> >>
> >>But I was thinking if Rails could provide some way for avoiding
> >>dealing with exceptions when using a single multi-thread server
> >>environment. For instance:
> >>
> >>@record.validate :lock =>  @shared_lock do
> >>   # code run if @record.valid?
> >>   ...
> >>   @record.save
> >>end or (render :edit; return)
> >>redirect_to :list
> >>...
> >>
> >>The :lock parameter should be optional and an internal lock (one per
> >>model, maybe) should be used when not specified in the case it is
> >>not necessary to share this code with another block which could also
> >>affect the record, for instance.
> >A shared lock like this would work if you only had one process running.
> >It won't work for people that run multiple processes as they won't share
> >the lock.
> 
> Yes, I know, but for lots of applications, a single multi-thread
> server would suffice and Ruby locks are probably much faster than
> database locks and don't have the database locks shortcomings.
> 
> For instance, usually I use PostgreSQL. There is no LOCK instruction
> in ANSI SQL. PostgreSQL only support locking an entire table, and it
> is not possible to name the locks (maybe an alternative could be the
> pg_advisory_lock function added in 8.2). So, it is probably better
> to rely on another kind of shared locks between different servers,
> but probably this won't be as efficient as it can be in a single
> multi-thread server, even when using a solution similar to
> memcached, I mean a socket in-memory lock server.

So why don't use use a mutex and wrap your saves with that?  I don't see
a reason to change AR API for this.

> >>An error should raise if 'config.threadsafe!' is not enabled and it
> >>should be pointed out in the docs that this won't work for multiple
> >>servers setup, for avoiding confusing end-developer users. Maybe a
> >>warning instead of an error should suffice, for allowing this usage
> >>by plugins that don't have control of the deployment decisions.
> >>
> >>If the user calls 'save' directly without validating first, the
> >>validation and save operations shoud be atomic in this case. So,
> >>'save' should also support the :lock parameter.
> >>
> >>Is this reasonable or am I missing something?
> >Even people running multi threaded servers run multiple processes.
> >You could use the database connection to create a shared lock.
> >
> >Though, IMHO if you want to guarantee a unique column, you should add a
> >unique index on the column.  The uniqueness validation should work most
> >of the time, and for edge cases, the user could see an exception.
> 
> Yes, the unique index should exist anyway. Maybe it would be great
> to create a 'db:create_unique_indexes' Rake task for creating a
> migration to add the missing unique indexes to a new migration. But
> for a single multi-thread server, such a lock would avoid the
> exception generation in the corner case. But my concerns aren't
> specific to unique validations only. For instance, imagine a
> situation where a user needs credits to do something and the system
> provides the services 'put_credit' and 'consume_credit'.
> 
> # current user's credit = 5
> 
> # consume_credit is launched
> [consume_service] (render :error; return) unless @user.credit > 0
> [consume_service] do_consume_credit
> 
> # put_credit is launched in another thread
> [put_credit_service] @user.credit += 1 # 5 + 1 = 6
> [put_credit_service] @user.save
> 
> [consume_service] @user.credit -= 1 # 5 - 1 = 4
> [consume_service] @user.credit = 4
> 
> The correct expected user's credit should obviously be 5 instead of
> 4. For this simple case, a manual SQL UPDATE would solve this
> concurrency problem (UPDATE user SET credit = credit + 1). Also,
> this situation would probably be modeled as 'user has_many credits'
> and this situation wouldn't happen, but you get the idea.

Yes, this would seem reasonable if AR objects were thread safe, but they
aren't.  If you're going to use a threaded server, you need to know what
data structures are thread safe and which ones aren't.  Then you need to
design your system accordingly.

My point is that we can document what is thread safe and what isn't.
Then *you* can figure out how to handle concurrency.

> The problem could easily be avoided in the above situation with a
> shared named lock between the two services. In this case, a normal
> Ruby lock could be used instead of modifying Rails API to support
> locks. But the advantage of improving the API is that the user can
> choose if he wants to enable a shared lock between multiple servers
> (more expensive) or using a less expensive Ruby lock in some
> configuration file that could be changed later. Also, maybe the
> Rails community can come with a better implementation of locking and
> it would be easier to apply it instead of having to modify all
> manually put locking code.

You can decide this without changes to the AR api.  Just write a method
that returns a lock depending on a configuration file:

  def gimme_lock
    case YAML.load_file('/some/config')[Rails.env]['lock_type']
    when 'mutex'
      ...
    when 'database'
      ...
    else
      ...
    end
  end

  def safe_update
    gimme_lock.synchronize do
      yield
    end
  end

  safe_update { mymodel.save }

This seems like a feature easily implemented without the help of rails.

> Maybe we could start with a new Concurrency guide on Rails Guides.

Probably documenting which things are thread safe and which things
aren't would be good.  For now you can assume no AR objects are thread
safe.  :-)

> >If you're really paranoid, you could implement it now with something
> >like this (note this is mysql specific):
> >
> >def shared_lock(name)
> >   r = AR::Base.connection.execute("SELECT GET_LOCK('#{name}', 2)")
> >   # ... make sure to check the return value ...
> >   yield
> >   ensure
> >   AR::Base.connection.execute("SELECT RELEASE_LOCK('#{name}')")
> >end
> >
> 
> I'm just curious. What would happen in this case (MySQL) if the
> application is killed after the lock is acquired but before the
> ensure block is executed? The connection would probably be closed.
> Would this free the lock too?

Yes.  MySQL frees the lock when the connection is closed:

  
http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_get-lock

> >def some_function
> >   # this function must calculate a value that you can reproduce across
> >   # servers and processes
> >end
> >
> >shared_lock(some_function) do
> >   my_model.save!
> >end
> >
> >Read here for more info:
> >
> >   
> > http://dev.mysql.com/doc/refman/4.1/en/miscellaneous-functions.html#function_get-lock
> >
> >There may be a way to do this in a non-database specific way (a lock
> >server or something).
> 
> Maybe a lock server of just temporary file locking would fit better.
> But then, what would happen again in the case I presented above
> regarding killing the app between acquiring and releasing the lock?

That's something you need to work out with your lock server.  :-)

-- 
Aaron Patterson
http://tenderlovemaking.com/

Attachment: pgpdAM8OEVpB5.pgp
Description: PGP signature

Reply via email to