First I would work to learn patterns in time increments, which occur most 
often, and plan from there.  Then I would design with regard to DB code and 
servers, optimal Master-slave replication.  Pretty sure that in replication 
WHERE  clauses like (where recorded_on >= (now()  - '24 hr'::INTERVAL) 
could be eliminated at the user level. But I am not certain.


On Tuesday, July 21, 2015 at 10:31:20 AM UTC-4, Mohit Sindhwani wrote:
>
> Hello! 
>
> We are thinking of different ways to handle hot data in our system.  We 
> get a lot of information that's very relevant for periods of time.  So, 
> for example, we have some data that we query on the basis of 
>  > counts over last 30 minutes 
>  > data collected in the last 30 minutes 
>  > items collected today 
>  > items collected within the past week 
> ...and so on. 
>
> We are looking at different strategies to keep this data updated and to 
> manage scaling the database.  So, we are looking at horizontal sharding 
> [splitting data into multiple tables that inherit from a master and have 
> exclusion constraints] and are also considering having tables that hold 
> data as: 
>  > All the data 
>  > Data for the last 24 hours 
>  > Data for the last 2 hours 
> That way, when we want something recent, we would just query the most 
> recent table, but in the few occasions that we need something more, we 
> go to the larger tables that are sharded. 
>
> Just for reference, we are doing something like this: 
> select count(*), count(distinct group_id) from 
> (select group_id, recorded_on from data_store_v2 order by id DESC limit 
> 900000) td 
> where recorded_on >= (now()  - '24 hr'::INTERVAL); 
>
> We are getting 800,000 data items a day right now and the above query 
> takes around 14 seconds on a single table that is not sharded and has 
> around 268million records.  Every week, this table becomes slightly 
> slower since we add close to 6 million records every week. 
>
> I've read this: 
>
> https://www.amberbit.com/blog/2014/2/4/postgresql-awesomeness-for-rails-developers/
>  
> and am looking at ways that everything is managed under Rails, if 
> possible. 
>
> So, the questions (and thanks for reading this far) are: 
>  > What is a good way to do this while still working within ActiveRecord? 
>  > This is a capability that we'd like to attach to any model that might 
> need it.  What would be a good way to approach a gem for it? 
>
> I'm sure there will be a few more questions as we progress on this. 
>
> Thanks, 
> Mohit. 
>
>
>

-- 
You received this message because you are subscribed to the Google Groups "Ruby 
on Rails: Talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to rubyonrails-talk+unsubscr...@googlegroups.com.
To post to this group, send email to rubyonrails-talk@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/rubyonrails-talk/111e5bc4-1268-4549-8486-eca3717fd445%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to