On Jul 22, 2015, at 9:10 PM, Mohit Sindhwani <mo_m...@onghu.com> wrote: > We have tried this and the query is quite a bit slower. Filtering to the > last 900k records before doing the recorded_on part helped speed it up.
I don't understand how that could possibly be the case if there's an index on recorded_on. > Your email got me going back to look at all the parts again since obviously > the query should be using the index and it was still slow. Further search > last night made me realize that it's not the indexes that are a problem. The > problem is the count(distinct group_id) part which seems to be quite slow in > PostgreSQL. This is a lot faster: > select count(*) from > (select distinct group_id from > data_store_v2 where recorded_on >= '2015-06-06') td; > than: > select count(distinct group_id) from > data_store_v2 where recorded_on >= '2015-06-06'; > > as explained here: > https://www.periscope.io/blog/use-subqueries-to-count-distinct-50x-faster.html > > So, I guess the real problem was being masked by something else and an > incorrect assumption on my part :) I would expect the select count(distinct...) to be a major contributor to the time taken by the query, just given the amount of work it must do. The select count(*) from (select distinct...) alternative is a nice tip :) > > Thanks for the analysis :D > I do understand SQL and I thought I'm not throwing things together... for my > understanding, which column was unnecessary? I thought we needed all: > > group_id for counting the distinct group_id > > recorded_on for the subsequent query on it > > id only for getting the most recent records I apologize--I misread the query structure. I got it into my head as: select ... from (select ... from ... where recorded_on ... order by ... limit ...) I think you can see how THAT query would have better fit my description of being poorly constructed. I'm glad that my somewhat off-base pontification still managed to point you in a useful direction! -- Scott Ribe scott_r...@elevated-dev.com http://www.elevated-dev.com/ https://www.linkedin.com/in/scottribe/ (303) 722-0567 voice -- You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To unsubscribe from this group and stop receiving emails from it, send an email to rubyonrails-talk+unsubscr...@googlegroups.com. To post to this group, send email to rubyonrails-talk@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/rubyonrails-talk/1DD7FF4B-12E3-479B-977F-048C512FF5C1%40elevated-dev.com. For more options, visit https://groups.google.com/d/optout.