Sorry, the problem was caused by a bug in my code. It works now, and I can identify the right row.
In fact, my row keys are encoded as reversehost...@date1-date2, like
these:

www.google....@200801-200802
www.google....@200902-200904
www.google....@201001-201002

To identify the right row for the request www.google....@200901, I have
to access two rows to find the closest date interval:

getClosestRowBefore(www.google....@200901)
    => www.google....@200801-200802
getScanner(www.google....@200801-200802*).next()
    => www.google....@200902-200904

I use this method and it works, but it is really slow, even if I make the
same requests several times! All my columns are block cached. Do these
two methods benefit from block caching?

Thank you for your time!

Jérôme

----- Original Message -----
From: stack <[email protected]>
Date: Friday, February 27, 2009 7:10 pm
Subject: Re: Re : Re: Re : Re: Table design question

> getClosestRowBefore should work. What are you supplying for row? Does
> the column you ask for exist?
>
> What happens if you open a scanner at the (non-existent) row
> 'www.google.com@'?
>
> St.Ack
>
> On Fri, Feb 27, 2009 at 8:02 AM, <[email protected]> wrote:
>
> > Hi,
> >
> > Following the discussion with Stack, I have modified the way I insert
> > data in hbase.
> >
> > Now, I insert data in an htable using u...@date as row key.
> > Like this:
> >
> > Case 3:
> > BatchUpdate update = new BatchUpdate("www.google....@20090218");
> > update.put("content:",
> >     "1ffe36e5b13f28e69c2886f40fd3fcea2ce05d030b508c11d714dead5d69000f");
> > update.put("type:", "text/html");
> > table.commit(update);
> >
> > I want to access these rows, but with inexact keys.
> > If I have inserted these rows:
> >
> > www.google....@200801
> > www.google....@200901
> > www.google....@201001
> >
> > and make this request:
> >
> > www.google....@200902
> >
> > I would like to find the row with the specified url at the closest
> > date to 200902 (www.google....@200901 in my case).
> >
> > So, I thought I could use the method
> > HTable.getClosestRowBefore(byte[] row, byte[] column) to identify a
> > row whose key is less than the requested one, and then scan to
> > identify precisely the right row.
> >
> > In fact, this method always returns the row with the null key if I
> > request a row that doesn't exactly match an inserted one.
> >
> > Is there really a way to make this kind of request in hbase?
> >
> > Jérôme Thièvre
> >
> > ----- Original Message -----
> > From: stack <[email protected]>
> > Date: Wednesday, February 18, 2009 10:48 pm
> > Subject: Re: Re : Re: Table design question
> >
> > > On Wed, Feb 18, 2009 at 10:29 AM, <[email protected]> wrote:
> > >
> > > > > Currently we can only return records at an explicit date or
> > > > > older, not newer.
> > > > >
> > > > > > Each record is made of 10 columns, and each insert is of the
> > > > > > type:
> > > > > >
> > > > > > insertRecord(url, date, record);
> > > > > >
> > > > > > There are several possible designs for my record table:
> > > > > >
> > > > > > 1. RowKey=url, and all columns are labelled with the same
> > > > > >    date.
> > > > > >
> > > > > > 2. RowKey=url, and we use the timestamp and version support
> > > > > >    of hbase; column names are the column family names (no
> > > > > >    label).
> > > > > >
> > > > > > 3. RowKey=url+date, and column names are the column family
> > > > > >    names (no label).
> > > > >
> > > > > Examples please (I've only had one cup of coffee so far this
> > > > > morning).
> > > >
> > > > Suppose the column families are: {'content:', 'type:'}
> > > > I want to insert a new record with url www.google.com at date
> > > > 20090218:
> > > >
> > > > Case 1:
> > > > BatchUpdate update = new BatchUpdate("www.google.com");
> > > > update.put("content:20090218",
> > > >     "1ffe36e5b13f28e69c2886f40fd3fcea2ce05d030b508c11d714dead5d69000f");
> > > > update.put("type:20090218", "text/html");
> > > > table.commit(update);
> > > >
> > > > Case 2: implies using hbase versioning
> > > > BatchUpdate update = new BatchUpdate("www.google.com",
> > > >     toTimestamp(20090218));
> > > > update.put("content:",
> > > >     "1ffe36e5b13f28e69c2886f40fd3fcea2ce05d030b508c11d714dead5d69000f");
> > > > update.put("type:", "text/html");
> > > > table.commit(update);
> > >
> > > I like this schema best.
> > >
> > > But both case 1 and 2 will have issues in current hbase if there
> > > are thousands of versions (to be fixed in 0.20.0). Just a heads up.
> > >
> > > > Case 3:
> > > > BatchUpdate update = new BatchUpdate("www.google....@20090218");
> > > > update.put("content:",
> > > >     "1ffe36e5b13f28e69c2886f40fd3fcea2ce05d030b508c11d714dead5d69000f");
> > > > update.put("type:", "text/html");
> > > > table.commit(update);
> > >
> > > This will work fine in current hbase, even with thousands of
> > > versions.
> > >
> > > > Is it possible (or will it be) to load column names without
> > > > loading cell content? Same question for the timestamp?
> > >
> > > Cell has to have something in it.
> > >
> > > Or do you mean query hbase to find the list of columns in a row
> > > without returning data? If the latter is your question, no, there
> > > is no way to get the listing without getting the payload too.
> > >
> > > St.Ack
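
For readers following the thread, the two-step lookup described in the top reply (closest row before the requested key, then the next row in key order) can be sketched without an HBase cluster. This is only an illustration under stated assumptions: a java.util.TreeMap stands in for the sorted table, floorKey plays the role of getClosestRowBefore, higherKey plays the role of the scanner's next row, and the host "org.example.www" is a made-up key prefix. Measuring distance in months and preferring the earlier interval on a tie are also assumptions, not something the thread specifies.

```java
import java.util.TreeMap;

class ClosestIntervalLookup {

    /** Converts a yyyymm integer such as 200901 to a month count. */
    static int toMonths(int yyyymm) {
        return (yyyymm / 100) * 12 + (yyyymm % 100);
    }

    /** Distance in months from date to the interval in rowKey (0 if inside). */
    static int distance(String rowKey, int date) {
        String interval = rowKey.substring(rowKey.lastIndexOf('@') + 1);
        String[] bounds = interval.split("-");
        int start = toMonths(Integer.parseInt(bounds[0]));
        int end = toMonths(Integer.parseInt(bounds[1]));
        int d = toMonths(date);
        if (d < start) return start - d;
        if (d > end) return d - end;
        return 0;
    }

    /**
     * Returns the key "host@date1-date2" whose interval is closest to date,
     * or null if the table holds no row for this host.
     * The TreeMap is an assumed stand-in for a table sorted by row key.
     */
    static String findClosestRow(TreeMap<String, byte[]> table,
                                 String host, int date) {
        String prefix = host + "@";
        String probe = prefix + date;

        // Step 1: greatest key <= probe (analogue of getClosestRowBefore).
        String before = table.floorKey(probe);
        // Step 2: smallest key > probe (analogue of the scanner's next row).
        String after = table.higherKey(probe);

        // Discard neighbours that belong to a different host.
        if (before != null && !before.startsWith(prefix)) before = null;
        if (after != null && !after.startsWith(prefix)) after = null;

        if (before == null) return after;
        if (after == null) return before;
        // Assumption: on a tie, prefer the earlier interval.
        return distance(before, date) <= distance(after, date) ? before : after;
    }

    public static void main(String[] args) {
        TreeMap<String, byte[]> table = new TreeMap<>();
        table.put("org.example.www@200801-200802", new byte[0]);
        table.put("org.example.www@200902-200904", new byte[0]);
        table.put("org.example.www@201001-201002", new byte[0]);

        // prints org.example.www@200902-200904
        System.out.println(findClosestRow(table, "org.example.www", 200901));
    }
}
```

Both candidate rows are adjacent in key order, so the lookup touches at most two rows per request, which matches the access pattern described above.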
