Daniel A. wrote in post #968711: > Hello everyone! I've a question that I hope someone in here would be > able to > answer. Today I was writing a simple test for a method that takes the > last > "n" (where n is an integer) users that signed up on my web app. > > I know it's a simple method, but then I started to think about the > optimal > way to query the DB, and I tried different options. The first thing I > did > was create 1000 users through a simple: > > 1000.times {Factory(:student)} > > The difference between the first and the last records is about 2 > minutes. > > first created_at: 2010-12-15 20:35:41 > last created_at: 2010-12-15 20:37:51 > > And this is what I've got on the console when using > ActiveRecord::Base.uncached: > > *User.order("created_at DESC").limit(5).reverse* > User Load (*7.4ms*) SELECT "users".* FROM "users" ORDER BY created_at > DESC > LIMIT 5 > > *User.order("created_at ASC").last(5)* > User Load (*1179.1ms*) SELECT "users".* FROM "users" ORDER BY > created_at > ASC > > *User.last(5)* *# I didn't even know I could add a parameter to last* > User Load (*1171.2ms*) SELECT "users".* FROM "users" > > *User.all.last(5)* > User Load (*1177.3ms*) SELECT "users".* FROM "users" > > Well, there's obviously a huge performance impact between querying the > DB with the first one and the rest.
Only the first of your four tested queries has a "LIMIT 5" in the SQL. With that SQL, only 5 records are transfered to the Ruby array. This is because the "limit(5)" function does that (I assume on an ActiveRecord::Relation object) and can thus modify the SQL query. When the last(5) is called on the resulting ActiveRecord::Relation object, it acts more like the Array#last method and operates on the data that is already fully loaded from the database. So, this would transfer all 1000 records to a Ruby array and then in Ruby select 5 records out of that array. I presume that is the problem. For a database, processing 1000 records should not be a real problem. But transferring them to the ruby program may take some time. In almost all cases, the construct which triggers a LIMIT on the database is the better one, _if_ you can do an ORDER BY that can be expressed in the database (you can use database functions in the database to add some flexibility there). This is typically used when you want to paginate through a very large dataset, because you do not want to load all the records into ruby, only the small subset to render that 1 paginated page. So, I see two proper solutions where you clearly see that the order and the limit are applied. > User.all(:order => "created_at DESC", :limit => 5) > User.order("created_at DESC").limit(5) They both result in: User Load (0.5ms) SELECT "users".* FROM "users" ORDER BY created_at DESC LIMIT 5 User Load (0.3ms) SELECT "users".* FROM "users" ORDER BY created_at DESC LIMIT 5 -- Posted via http://www.ruby-forum.com/. -- You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails-t...@googlegroups.com. To unsubscribe from this group, send email to rubyonrails-talk+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en.