Lars, should we always consider disabling Nagle? What's the downside?

JM

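For context on the question above: the tcpnodelay settings discussed in the thread below correspond, as far as I can tell, to the standard TCP_NODELAY socket option. Here is a minimal Java sketch of that option on a plain, unconnected socket. The class and method names are made up for illustration, and this is not HBase's client code. The usual trade-off is that with Nagle's algorithm disabled, small writes go out immediately (lower latency) at the cost of more small packets on the wire.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

// Hypothetical helper for illustration only; not part of HBase.
final class NoDelaySketch {

    // Sets TCP_NODELAY on an unconnected socket and reads the option back.
    // TCP_NODELAY = true means Nagle's algorithm is disabled.
    static boolean noDelayRoundTrip() {
        try (Socket socket = new Socket()) {
            socket.setTcpNoDelay(true);
            return socket.getTcpNoDelay();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_NODELAY enabled: " + noDelayRoundTrip());
    }
}
```
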
2013/2/9, Varun Sharma <va...@pinterest.com>:
> Yeah, I meant true...
>
> On Sat, Feb 9, 2013 at 12:17 AM, lars hofhansl <la...@apache.org> wrote:
>
>> Should be set to true. If tcpnodelay is set to true, Nagle's algorithm
>> is disabled.
>>
>> -- Lars
>>
>> ________________________________
>> From: Varun Sharma <va...@pinterest.com>
>> To: user@hbase.apache.org; lars hofhansl <la...@apache.org>
>> Sent: Saturday, February 9, 2013 12:11 AM
>> Subject: Re: Get on a row with multiple columns
>>
>> Okay I did my research - these need to be set to false. I agree.
>>
>> On Sat, Feb 9, 2013 at 12:05 AM, Varun Sharma <va...@pinterest.com> wrote:
>>
>> I have ipc.client.tcpnodelay, ipc.server.tcpnodelay set to false and the
>> hbase one - [hbase].ipc.client.tcpnodelay set to true. Do these induce
>> network latency?
>> >
>> >On Fri, Feb 8, 2013 at 11:57 PM, lars hofhansl <la...@apache.org> wrote:
>> >
>> >Sorry.. I meant set these two config parameters to true (not false as I
>> >state below).
>> >>
>> >>----- Original Message -----
>> >>From: lars hofhansl <la...@apache.org>
>> >>To: "user@hbase.apache.org" <user@hbase.apache.org>
>> >>Cc:
>> >>Sent: Friday, February 8, 2013 11:41 PM
>> >>Subject: Re: Get on a row with multiple columns
>> >>
>> >>Only somewhat related. Seeing the magic 40ms random read time there.
>> >>Did you disable Nagle's?
>> >>(set hbase.ipc.client.tcpnodelay and ipc.server.tcpnodelay to false in
>> >>hbase-site.xml).
>> >>
>> >>________________________________
>> >>From: Varun Sharma <va...@pinterest.com>
>> >>To: user@hbase.apache.org; lars hofhansl <la...@apache.org>
>> >>Sent: Friday, February 8, 2013 10:45 PM
>> >>Subject: Re: Get on a row with multiple columns
>> >>
>> >>The use case is like your twitter feed. Tweets from people you follow.
>> >>When someone unfollows, you need to delete a bunch of his tweets from
>> >>the following feed. So, it's frequent, and we are essentially running
>> >>into some extreme corner cases like the one above. We need high write
>> >>throughput for this, since when someone tweets, we need to fan out the
>> >>tweet to all the followers. We need the ability to do fast deletes
>> >>(unfollow) and fast adds (follow) and also be able to do fast random
>> >>gets - when a real user loads the feed. I doubt we will be able to play
>> >>much with the schema here since we need to support a bunch of use cases.
>> >>
>> >>@lars: It does not take 30 seconds to place 300 delete markers. It
>> >>takes 30 seconds to first find which of those 300 pins are in the set
>> >>of columns present - this invokes 300 gets - and then place the
>> >>appropriate delete markers. Note that we can have tens of thousands of
>> >>columns in a single row so a single get is not cheap.
>> >>
>> >>If we were to just place delete markers, that is very fast. But when we
>> >>started doing that, our random read performance suffered because of too
>> >>many delete markers. The 90th percentile on random reads shot up from
>> >>40 milliseconds to 150 milliseconds, which is not acceptable for our
>> >>use case.
>> >>
>> >>Thanks
>> >>Varun
>> >>
>> >>On Fri, Feb 8, 2013 at 10:33 PM, lars hofhansl <la...@apache.org> wrote:
>> >>
>> >>> Can you organize your columns and then delete by column family?
>> >>>
>> >>> deleteColumn without specifying a TS is expensive, since HBase first
>> >>> has to figure out what the latest TS is.
>> >>>
>> >>> Should be better in 0.94.1 or later since deletes are batched like
>> >>> Puts (still need to retrieve the latest version, though).
>> >>>
>> >>> In 0.94.3 or later you can also use the BulkDeleteEndpoint, which
>> >>> basically lets you specify a scan condition and then place a specific
>> >>> delete marker for all KVs encountered.
>> >>>
>> >>> If you wanted to get really fancy, you could hook up a coprocessor to
>> >>> the compaction process and simply filter all KVs you no longer want
>> >>> (without ever placing any delete markers).
>> >>>
>> >>> Are you saying it takes 15 seconds to place 300 version delete
>> >>> markers?!
>> >>>
>> >>> -- Lars
>> >>>
>> >>> ________________________________
>> >>> From: Varun Sharma <va...@pinterest.com>
>> >>> To: user@hbase.apache.org
>> >>> Sent: Friday, February 8, 2013 10:05 PM
>> >>> Subject: Re: Get on a row with multiple columns
>> >>>
>> >>> We are given a set of 300 columns to delete. I tested two cases:
>> >>>
>> >>> 1) deleteColumns() - with the 's'
>> >>>
>> >>> This function simply adds delete markers for 300 columns. In our case,
>> >>> typically only a fraction of these columns are actually present - 10.
>> >>> After starting to use deleteColumns, we started seeing a drop in
>> >>> cluster-wide random read performance - 90th percentile latency
>> >>> worsened, and so did the 99th, probably because of having to traverse
>> >>> delete markers. I attribute this to the profusion of delete markers in
>> >>> the cluster. Major compactions slowed down by almost 50 percent,
>> >>> probably because of having to clean out significantly more delete
>> >>> markers.
>> >>>
>> >>> 2) deleteColumn()
>> >>>
>> >>> Ended up with intolerable 15 second calls, which clogged all the
>> >>> handlers, making the cluster pretty much unresponsive.
>> >>>
>> >>> On Fri, Feb 8, 2013 at 9:55 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> >>>
>> >>> > For the 300 column deletes, can you show us how the Delete(s) are
>> >>> > constructed?
>> >>> >
>> >>> > Do you use this method?
>> >>> >
>> >>> > public Delete deleteColumns(byte [] family, byte [] qualifier) {
>> >>> >
>> >>> > Thanks
>> >>> >
>> >>> > On Fri, Feb 8, 2013 at 9:44 PM, Varun Sharma <va...@pinterest.com> wrote:
>> >>> >
>> >>> > > So a Get call with multiple columns on a single row should be much
>> >>> > > faster than independent Get(s) on each of those columns for that
>> >>> > > row. I am basically seeing severely poor performance (~ 15 seconds)
>> >>> > > for certain deleteColumn() calls and I am seeing that there is a
>> >>> > > prepareDeleteTimestamps() function in HRegion.java which first
>> >>> > > tries to locate the column by doing individual gets on each column
>> >>> > > you want to delete (I am doing 300 column deletes). Now, I think
>> >>> > > this should ideally be 1 get call with the batch of 300 columns, so
>> >>> > > that one scan can retrieve the columns, and the columns that are
>> >>> > > found are then deleted.
>> >>> > >
>> >>> > > Before I try this fix, I wanted to get an opinion on whether it
>> >>> > > will make a difference to batch the get(), and it seems from your
>> >>> > > answer that it should.
>> >>> > >
>> >>> > > On Fri, Feb 8, 2013 at 9:34 PM, lars hofhansl <la...@apache.org> wrote:
>> >>> > >
>> >>> > > > Everything is stored as a KeyValue in HBase.
>> >>> > > > The Key part of a KeyValue contains the row key, column family,
>> >>> > > > column name, and timestamp in that order.
>> >>> > > > Each column family has its own store and store files.
>> >>> > > >
>> >>> > > > So in a nutshell a get is executed by starting a scan at the row
>> >>> > > > key (which is a prefix of the key) in each store (CF) and then
>> >>> > > > scanning forward in each store until the next row key is reached.
>> >>> > > > (In reality it is a bit more complicated due to multiple
>> >>> > > > versions, skipping columns, etc.)
>> >>> > > >
>> >>> > > > -- Lars
>> >>> > > > ________________________________
>> >>> > > > From: Varun Sharma <va...@pinterest.com>
>> >>> > > > To: user@hbase.apache.org
>> >>> > > > Sent: Friday, February 8, 2013 9:22 PM
>> >>> > > > Subject: Re: Get on a row with multiple columns
>> >>> > > >
>> >>> > > > Sorry, I was a little unclear with my question.
>> >>> > > >
>> >>> > > > Let's say you have
>> >>> > > >
>> >>> > > > Get get = new Get(row);
>> >>> > > > get.addColumn("1");
>> >>> > > > get.addColumn("2");
>> >>> > > > .
>> >>> > > > .
>> >>> > > > .
>> >>> > > >
>> >>> > > > When HBase internally executes the batch get, it will seek to
>> >>> > > > column "1". Since data is lexicographically sorted, it does not
>> >>> > > > need to seek from the beginning to get to "2"; it can continue
>> >>> > > > seeking forward, since column "2" will always be after column
>> >>> > > > "1". I want to know whether this is how a multicolumn get on a
>> >>> > > > row works or not.
>> >>> > > >
>> >>> > > > Thanks
>> >>> > > > Varun
>> >>> > > >
>> >>> > > > On Fri, Feb 8, 2013 at 9:08 PM, Marcos Ortiz <mlor...@uci.cu> wrote:
>> >>> > > >
>> >>> > > > > Like Ishan said, a get returns an instance of the Result class.
>> >>> > > > > The utility methods that you can use are:
>> >>> > > > > byte[] getValue(byte[] family, byte[] qualifier)
>> >>> > > > > byte[] value()
>> >>> > > > > byte[] getRow()
>> >>> > > > > int size()
>> >>> > > > > boolean isEmpty()
>> >>> > > > > KeyValue[] raw() # Like Ishan said, all data here is sorted
>> >>> > > > > List<KeyValue> list()
>> >>> > > > >
>> >>> > > > > On 02/08/2013 11:29 PM, Ishan Chhabra wrote:
>> >>> > > > >
>> >>> > > > >> Based on what I read in Lars' book, a get will return a Result,
>> >>> > > > >> which is internally a KeyValue[]. This KeyValue[] is sorted by
>> >>> > > > >> the key and you access this array using the raw or list methods
>> >>> > > > >> on the Result object.
>> >>> > > > >>
>> >>> > > > >> On Fri, Feb 8, 2013 at 5:40 PM, Varun Sharma <va...@pinterest.com> wrote:
>> >>> > > > >>
>> >>> > > > >>> +user
>> >>> > > > >>>
>> >>> > > > >>> On Fri, Feb 8, 2013 at 5:38 PM, Varun Sharma <va...@pinterest.com> wrote:
>> >>> > > > >>>
>> >>> > > > >>>> Hi,
>> >>> > > > >>>>
>> >>> > > > >>>> When I do a Get on a row with multiple column qualifiers, do
>> >>> > > > >>>> we sort the column qualifiers and make use of the sorted
>> >>> > > > >>>> order when we get the results?
>> >>> > > > >>>>
>> >>> > > > >>>> Thanks
>> >>> > > > >>>> Varun
>> >>> > > > >>>>
>> >>> > > > > --
>> >>> > > > > Marcos Ortiz Valmaseda,
>> >>> > > > > Product Manager && Data Scientist at UCI
>> >>> > > > > Blog: http://marcosluis2186.posterous.com
>> >>> > > > > Twitter: @marcosluis2186 <http://twitter.com/marcosluis2186>
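
A note on the original question in the quoted thread (whether a multi-column Get can exploit sorted order): the sketch below is a toy model, not HBase code. It assumes a row can be pictured as a sorted map from qualifier to value, and shows how a sorted set of wanted qualifiers can be collected in one forward pass instead of one lookup from the start per column. The names SortedRowSketch and singlePassGet are invented for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: a "row" is a sorted map of qualifier -> value.
final class SortedRowSketch {

    // Collects the wanted qualifiers in a single forward pass over the row.
    // Both inputs are sorted, so neither side is ever rewound.
    static List<String> singlePassGet(TreeMap<String, String> row,
                                      SortedSet<String> wanted) {
        List<String> found = new ArrayList<>();
        Iterator<String> it = wanted.iterator();
        String target = it.hasNext() ? it.next() : null;
        for (Map.Entry<String, String> cell : row.entrySet()) {
            // Skip wanted qualifiers that sort before the current cell:
            // they cannot appear later in the row.
            while (target != null && target.compareTo(cell.getKey()) < 0) {
                target = it.hasNext() ? it.next() : null;
            }
            if (target == null) {
                break; // nothing left to look for
            }
            if (target.equals(cell.getKey())) {
                found.add(cell.getKey() + "=" + cell.getValue());
                target = it.hasNext() ? it.next() : null;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        TreeMap<String, String> row = new TreeMap<>();
        row.put("1", "a");
        row.put("2", "b");
        row.put("5", "c");
        SortedSet<String> wanted = new TreeSet<>();
        wanted.add("2");
        wanted.add("3"); // not present in the row
        wanted.add("5");
        System.out.println(singlePassGet(row, wanted));
    }
}
```

The same idea is why batching the 300 lookups into one request should help in principle: the row is walked once rather than 300 times.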