Daniel A. wrote in post #968711:
> Hello everyone! I've a question that I hope someone in here would be
> able to
> answer. Today I was writing a simple test for a method that takes the
> last
> "n" (where n is an integer) users that signed up on my web app.
>
> I know it's a simple method, but then I started to think about the
> optimal
> way to query the DB, and I tried different options. The first thing I
> did
> was create 1000 users through a simple:
>
> 1000.times {Factory(:student)}
>
> The difference between the first and the last records is about 2
> minutes.
>
> first created_at: 2010-12-15 20:35:41
> last created_at: 2010-12-15 20:37:51
>
> And this is what I've got on the console when using
> ActiveRecord::Base.uncached:
>
> *User.order("created_at DESC").limit(5).reverse*
> User Load (*7.4ms*)  SELECT "users".* FROM "users" ORDER BY created_at
> DESC
> LIMIT 5
>
> *User.order("created_at ASC").last(5)*
> User Load (*1179.1ms*)  SELECT "users".* FROM "users" ORDER BY
> created_at
> ASC
>
> *User.last(5)*  *# I didn't even know I could add a parameter to last*
> User Load (*1171.2ms*)  SELECT "users".* FROM "users"
>
> *User.all.last(5)*
> User Load (*1177.3ms*)  SELECT "users".* FROM "users"
>
> Well, there's obviously a huge performance impact between querying the
> DB with the first one and the rest.

Only the first of your four tested queries has a "LIMIT 5" in the SQL.

With that SQL, only 5 records are transfered to the Ruby array. This
is because the "limit(5)" function does that (I assume on an
ActiveRecord::Relation object) and can thus modify the SQL query.

When the last(5) is called on the resulting ActiveRecord::Relation
object,
it acts more like the Array#last method and operates on the data that is
already fully loaded from the database. So, this would transfer all 1000
records to a Ruby array and then in Ruby select 5 records out of that
array.

I presume that is the problem. For a database, processing 1000 records
should not be a real problem. But transferring them to the ruby program
may take some time.

In almost all cases, the construct which triggers a LIMIT on the
database
is the better one, _if_ you can do an ORDER BY that can be expressed
in the database (you can use database functions in the database to add
some flexibility there). This is typically used when you want to
paginate
through a very large dataset, because you do not want to load all the
records into ruby, only the small subset to render that 1 paginated
page.

So, I see two proper solutions where you clearly see that the order and
the limit are applied.

> User.all(:order => "created_at DESC", :limit => 5)
> User.order("created_at DESC").limit(5)

They both result in:
User Load (0.5ms)  SELECT "users".* FROM "users" ORDER BY created_at 
DESC LIMIT 5
User Load (0.3ms)  SELECT "users".* FROM "users" ORDER BY created_at 
DESC LIMIT 5

-- 
Posted via http://www.ruby-forum.com/.

-- 
You received this message because you are subscribed to the Google Groups "Ruby 
on Rails: Talk" group.
To post to this group, send email to rubyonrails-t...@googlegroups.com.
To unsubscribe from this group, send email to 
rubyonrails-talk+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/rubyonrails-talk?hl=en.

Reply via email to