Re: Questions on Table design for time series data

2012-10-04 Thread Karthikeyan Muthukumarasamy
Jacques: I think you misunderstood my statement. I was only asking you to reconsider my questions on the assumption that I have seen the Jive video, since there are some differences between our case and Jive's. I completely understand that all this is a voluntary effort and my sincere thanks

Re: Questions on Table design for time series data

2012-10-04 Thread Karthikeyan Muthukumarasamy
Thanks, Eugeny. We are currently running some experiments based on your suggestions! On Thu, Oct 4, 2012 at 2:20 AM, Eugeny Morozov emoro...@griddynamics.com wrote: I'd suggest thinking about manual major compactions and splits. Using manual compactions and bulkload allows you to split HFiles

Re: Questions on Table design for time series data

2012-10-03 Thread Jacques
I would suggest you watch this video: http://www.cloudera.com/resource/video-hbasecon-2012-real-performance-gains-with-real-time-data/ The Jive guys solved a lot of the problems you're talking about and discuss them in that case study. On Wed, Oct 3, 2012 at 6:27 AM, Karthikeyan Muthukumarasamy

Re: Questions on Table design for time series data

2012-10-03 Thread Karthikeyan Muthukumarasamy
Hi Jacques, Thanks for the response! Yes, I have seen the video before. It suggests using a TTL-based retention implementation. In their use case, Jive has a fixed retention, say 3 months, so they can pre-create regions for that many buckets; their bucket id is DAY_OF_YEAR%retention_in_days. But,
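
The bucket id formula quoted above, DAY_OF_YEAR%retention_in_days, can be sketched roughly as follows. This is only an illustration of the arithmetic, not Jive's code: the function names and the zero-padded three-digit prefix format are assumptions made for the example, and nothing here calls HBase.

```python
from datetime import date


def bucket_id(day: date, retention_in_days: int) -> int:
    """Return DAY_OF_YEAR % retention_in_days for the given day."""
    if retention_in_days <= 0:
        raise ValueError("retention_in_days must be positive")
    day_of_year = day.timetuple().tm_yday  # 1..365, or 1..366 in a leap year
    return day_of_year % retention_in_days


def bucket_prefixes(retention_in_days: int) -> list[str]:
    """List every possible bucket as a zero-padded prefix.

    The three-digit padding is an assumption for this sketch; it keeps
    prefixes in numeric order when sorted as strings, as long as the
    retention is below 1000 days.
    """
    return [f"{b:03d}" for b in range(retention_in_days)]


if __name__ == "__main__":
    # With a 90-day retention, 2012-10-03 is day 277 of the year.
    print(bucket_id(date(2012, 10, 3), 90))
    print(bucket_prefixes(90)[:5])
```

Because the set of buckets is fixed once the retention is fixed, a list like the one from `bucket_prefixes` could serve as candidate split points when pre-creating regions.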

Re: Questions on Table design for time series data

2012-10-03 Thread Jacques
We're all volunteers here so we don't always have the time to fully understand and plan others' schemas. In general, your questions seemed to worry about a lot of things that may or may not matter depending on the specifics of your implementation. Without knowing those specifics it is hard

Re: Questions on Table design for time series data

2012-10-03 Thread Eugeny Morozov
I'd suggest thinking about manual major compactions and splits. Using manual compactions and bulkload allows you to split HFiles manually. For example, if you would like to read the last 3 months more often than all other data, then you could have three HFiles, one for each month, and one HFile for everything else.
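
One way to read the layout described above is as a grouping rule: each of the last three months gets its own group, and everything else shares one. The sketch below shows only that grouping decision. The function name, the "other" label, and the choice to count the current month as one of the three are assumptions for illustration; it does not perform compactions or bulk loads and does not call any HBase API.

```python
from datetime import date


def hfile_group(row_day: date, today: date, recent_months: int = 3) -> str:
    """Pick the group a row's date would fall into.

    Rows from the current month and the (recent_months - 1) months before
    it get a per-month label such as "2012-10"; all remaining rows share
    the single catch-all label "other".
    """
    months_ago = (today.year - row_day.year) * 12 + (today.month - row_day.month)
    if 0 <= months_ago < recent_months:
        return f"{row_day.year:04d}-{row_day.month:02d}"
    return "other"


if __name__ == "__main__":
    today = date(2012, 10, 3)
    for d in (date(2012, 10, 1), date(2012, 8, 15), date(2012, 7, 31)):
        print(d, hfile_group(d, today))
```

With this rule there are at most four groups at any time, which matches the "three plus one" layout in the message.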