Hi there,

On top of what Vladimir already said...

re:  "Table1: 80 m records say Author, Table2 : 5k records say Category"

Just 80 million records?  HBase tends to be overkill for relatively low
data volumes.

But if you wish to proceed down this path, to extend what was already said:
rather than thinking of it in terms of an RDBMS two-table design, create a
pre-joined table that has data from both tables as the query target.
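
To make the pre-joined idea concrete, here is a rough sketch in plain Python
(not HBase client code). It assumes a composite row key of the form
author|category|publication_id, so that one range scan on the author prefix
returns everything already grouped by category. The key layout, the "|"
delimiter, and the names PreJoinedTable and publications_by_category are my
own illustrative assumptions, not anything from your schema.

```python
from bisect import bisect_left
from collections import defaultdict

SEP = "|"  # hypothetical delimiter; assumes it never appears inside an ID


def row_key(author, category, pub_id):
    """Build the assumed composite row key: author|category|pub_id."""
    return SEP.join((author, category, pub_id))


class PreJoinedTable:
    """Toy stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # sorted list of row keys
        self._rows = {}   # row key -> publication details

    def put(self, author, category, pub_id, details):
        key = row_key(author, category, pub_id)
        if key not in self._rows:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._rows[key] = details

    def scan_prefix(self, prefix):
        """Yield (key, details) for every row whose key starts with prefix."""
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._rows[self._keys[i]]
            i += 1


def publications_by_category(table, author):
    """One prefix scan replaces the get + scan + intersection steps."""
    result = defaultdict(list)
    for key, details in table.scan_prefix(author + SEP):
        _, category, pub_id = key.split(SEP)
        result[category].append((pub_id, details))
    return dict(result)
```

The trade-off is the usual one: the join work moves to write time (each
publication is stored once per author/category pair), and in exchange the
read path becomes a single range scan, which is the kind of query HBase
handles well.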


As for the LRU cache, "premature optimization is the root of all evil".
:-)

Best of luck!


On 2/24/14, 4:38 PM, "Vikram Singh Chandel" <vikramsinghchan...@gmail.com>
wrote:

>Hi Vladimir,
>We are planning to have around 40 GB for L1 and 150 GB for L2, and when this
>size is breached we have to start cleaning L1 and L2.
>For this cleaning (deletion of records) I need the LRU info at record
>level, i.e. delete all records which have not been used in the past 15 days
>or longer.
>We will save this LRU info in a Metric column family.
>
>What we thought of is using a Post Get Observer to write the value to the
>Last Read column of the Metric column family.
>We will later use this info for deletion of records.
>
>Is there any other, simpler way? As you said, block cache is at table level
>(if I am correct), but we need the info at record level.
>
>Thanks
>
>
>On Tue, Feb 25, 2014 at 1:42 AM, Vladimir Rodionov
><vrodio...@carrieriq.com>wrote:
>
>> I recommend you work a little bit more on design.
>> NoSQL in general and HBase in particular are not very good at joining
>> tables, but very good at point and range queries.
>>
>> Sure, you can do some optimizations in your current approach: create the
>> CACHE table as IN_MEMORY, set TTL to, say, 1 day (or less, depending
>> on the data volume you are able to store) and utilize HBase's internal
>> block cache (which is LRU) for that table.
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: vrodio...@carrieriq.com
>>
>> ________________________________________
>> From: Vikram Singh Chandel [vikramsinghchan...@gmail.com]
>> Sent: Monday, February 24, 2014 11:38 AM
>> To: user@hbase.apache.org
>> Subject: Re: How to get Last access time of a record
>>
>> Hi Vladimir,
>>
>> We are going to implement a cache in HBase; let me give you an example.
>>
>> We have two tables
>> Table1: 80 m records say Author
>> Table2 : 5k records say Category
>> query : Get details of all publications by Author XYZ broken down by
>> Category
>>
>> We fire a get on Table 1 to get a list of publication ids (hashed).
>> Then we do a scan on Table 2 to get the list of publications for each
>> category, and then we do an intersection
>> of both lists and in the end get the details from the publication table.
>>
>> Now suppose the same query comes again: instead of doing all this
>> computation again, we are going to save the intersected results
>> in a table we are calling L2 Cache (there's an L1 also).
>>
>> Hope you have got an idea of what we are trying to achieve.
>> Now, if you can help, please do.
>>
>>
>>
>>
>>
>> On Tue, Feb 25, 2014 at 12:20 AM, Vladimir Rodionov <
>> vrodio...@carrieriq.com
>> > wrote:
>>
>> > Interesting. You want to use HBase as a cache. What data are you going to
>> > cache? Is it some kind of cold storage
>> > on tapes or Blu-ray discs? Just curious.
>> >
>> > Best regards,
>> > Vladimir Rodionov
>> > Principal Platform Engineer
>> > Carrier IQ, www.carrieriq.com
>> > e-mail: vrodio...@carrieriq.com
>> >
>> > ________________________________________
>> > From: Vikram Singh Chandel [vikramsinghchan...@gmail.com]
>> > Sent: Monday, February 24, 2014 4:25 AM
>> > To: user@hbase.apache.org
>> > Subject: Re: How to get Last access time of a record
>> >
>> > Hi
>> > HBase provides a cache on unprocessed data; we are implementing a second
>> > level of caching on processed data,
>> > e.g. on intersected data between two tables, or on post-processed data.
>> >
>> >
>> > On Mon, Feb 24, 2014 at 5:02 PM, haosdent <haosd...@gmail.com> wrote:
>> >
>> > > HBase already maintains a cache.
>> > >
>> > > >we can get last accessed time for a record
>> > >
>> > > I think you could get this from your application level.
>> > >
>> > >
>> > > On Mon, Feb 24, 2014 at 7:21 PM, Vikram Singh Chandel <
>> > > vikramsinghchan...@gmail.com> wrote:
>> > >
>> > > > Hi
>> > > >
>> > > > We are planning to implement a caching mechanism for our HBase data
>> > > > model; for that we have to remove the *LRU (least recently used)
>> > > > records* from the cached table.
>> > > >
>> > > > Is there any way by which we can get the last accessed time for a
>> > > > record? Primarily the access will be
>> > > > using *Range Scan and Get*.
>> > > >
>> > > > --
>> > > > *Regards*
>> > > >
>> > > > *VIKRAM SINGH CHANDEL*
>> > > >
>> > > > Please do not print this email unless it is absolutely necessary. Reduce.
>> > > > Reuse. Recycle. Save our planet.
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Best Regards,
>> > > Haosdent Huang
>> > >
>> >
>> >
>> >
>> > --
>> > *Regards*
>> >
>> > *VIKRAM SINGH CHANDEL*
>> >
>> > Confidentiality Notice:  The information contained in this message,
>> > including any attachments hereto, may be confidential and is intended to be
>> > read only by the individual or entity to whom this message is addressed. If
>> > the reader of this message is not the intended recipient or an agent or
>> > designee of the intended recipient, please note that any review, use,
>> > disclosure or distribution of this message or its attachments, in any form,
>> > is strictly prohibited.  If you have received this message in error, please
>> > immediately notify the sender and/or notificati...@carrieriq.com and
>> > delete or destroy any copy of this message and its attachments.
>> >
>>
>>
>>
>> --
>> *Regards*
>>
>> *VIKRAM SINGH CHANDEL*
>>
>
>
>
>-- 
>*Regards*
>
>*VIKRAM SINGH CHANDEL*
