Re: Custom Filter and SEEK_NEXT_USING_HINT issue

2013-01-21 Thread Eugeny Morozov
> before the particular one. > > > On Mon, Jan 21, 2013 at 4:22 AM, Michael Segel wrote: > > > If it's the same class and it's not a patch, then the first class loaded > > wins. > > > > So if you have a Class Foo and HBase has a Class Foo, your code will

Re: Custom Filter and SEEK_NEXT_USING_HINT issue

2013-01-21 Thread Eugeny Morozov
class loaded > wins. > > So if you have a Class Foo and HBase has a Class Foo, your code will never > see the light of day. > > Perhaps I'm stating the obvious but it's something to think about when > working with Hadoop. > > On Jan 19, 2013, at 3:36 AM, Eugeny Moroz

Re: Custom Filter and SEEK_NEXT_USING_HINT issue

2013-01-20 Thread Eugeny Morozov
ter filter's getHint method) know exactly where to go? On Sat, Jan 19, 2013 at 5:16 PM, Ted wrote: > In your original email you said the first key looked like the start key of a > region, can you verify that? > > Thanks > > On Jan 19, 2013, at 1:36 AM, Eugeny Morozov > wrot

Re: Custom Filter and SEEK_NEXT_USING_HINT issue

2013-01-19 Thread Eugeny Morozov
3:56 AM, Ted Yu wrote: > To my knowledge CDH-4.1.2 is based on HBase 0.92.x > > Looks like you were using patch from HBASE-6509 which was integrated to > trunk only. > Please confirm. > > Copying Alex who wrote the patch. > > Cheers > > On Fri, Jan 18, 2013 at

Custom Filter and SEEK_NEXT_USING_HINT issue

2013-01-18 Thread Eugeny Morozov
Hi, folks! The HBase, Hadoop, etc. version is CDH-4.1.2. I'm using a custom FuzzyRowFilter, which I got from http://blog.sematext.com/2012/08/09/consider-using-fuzzyrowfilter-when-in-need-for-secondary-indexes-in-hbase/ and suddenly, after quite some time, we found that it had started losing data. Basically the i

Re: Many scanner opening

2012-12-23 Thread Eugeny Morozov
looks like it's enough to get contention =) On Thu, Dec 20, 2012 at 10:51 PM, lars hofhansl wrote: > Cool. > > You probably made it less likely that your scanners will scan the same > HFile in parallel. > > -- Lars > > > > ________

Re: Many scanner opening

2012-12-20 Thread Eugeny Morozov
wrote: > You might have run into HBASE-7336. > (Not available in any official release, yet) > > If you're using 0.94 (and probably 0.92) you can just apply this patch > (it's safe and simple). > > > > > From: Eugeny Morozov

Many scanner opening

2012-12-18 Thread Eugeny Morozov
Hello! We recently faced an issue: the more map tasks are completed, the longer it takes to complete one more map task. In our architecture we have two scanners that read the table. The first one, called the 'outer' scanner, reads the table and filters some rowkeys. These rowkeys are used a

Re: Debugging Coprocessor code in Eclipse

2012-10-16 Thread Eugeny Morozov
Anil, you could also get some benefit from using HBaseTestingUtility. It can run an HBase cluster in standalone mode, all in one JVM. Of course, it requires some code to create tables, assign the coprocessor to a table, and populate it with data, and then to run client code against it. All of

Re: Does TotalOrderPartitioner refresh its partitions selection tree

2012-10-09 Thread Eugeny Morozov
Chris, In this case nothing scary actually happens. * If the partitions are the same, then HBase simply copies all your HFiles during the bulk-loading procedure. * If the partitions have changed, then it still copies them, but in addition, some of these files (according to the number of split regions) would be al

Re: Questions on Table design for time series data

2012-10-03 Thread Eugeny Morozov
I'd suggest thinking about manual major compactions and splits. Using manual compactions and bulk load lets you split HFiles manually. For example, if you want to read the last 3 months more often than all the other data, then you could have three HFiles, one for each month, and one HFile for everything else. U

Re: Bulk Loads and Updates

2012-10-03 Thread Eugeny Morozov
Hi! Sure, you do, but don't forget to sort all KV pairs before putting them into the context, or else you'd get some "unsorted" exception. If they are completely identical and you need to reduce the number of identical lines, you could use a Combiner, but its behavior is not deterministic, so basically there i
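
The sorting requirement mentioned in this reply can be illustrated with a small language-neutral sketch. It is not HBase code: the `KV` tuple and the helper names are hypothetical, and the ordering (row, family, qualifier ascending as unsigned bytes, newest timestamp first) is a simplified model of how HBase orders KeyValues that ignores the KeyValue type field.

```python
from typing import List, NamedTuple


class KV(NamedTuple):
    """Hypothetical stand-in for an HBase KeyValue (not the real class)."""
    row: bytes
    family: bytes
    qualifier: bytes
    timestamp: int
    value: bytes


def kv_sort_key(kv: KV):
    # Python compares bytes lexicographically as unsigned values.
    # The timestamp is negated so that newer versions sort first.
    return (kv.row, kv.family, kv.qualifier, -kv.timestamp)


def sorted_for_output(kvs: List[KV]) -> List[KV]:
    """Return the KV pairs in the order a writer would expect them."""
    return sorted(kvs, key=kv_sort_key)


def is_sorted(kvs: List[KV]) -> bool:
    """Check the order up front instead of failing later in the writer."""
    keys = [kv_sort_key(kv) for kv in kvs]
    return all(a <= b for a, b in zip(keys, keys[1:]))
```

In a real job the same idea would be applied to the KeyValue objects just before they are written to the context, for instance by collecting them in a sorted collection with the appropriate comparator.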

Re: Distribution of regions to servers

2012-09-27 Thread Eugeny Morozov
> That is why I want to do some management of regions to get better load > balance based on large queries. > > Hope it makes sense to you. > > Best Wishes > Dan Han > > > On Wed, Sep 26, 2012 at 3:19 PM, Eugeny Morozov > wrote: > > > Dan, > > > > I have

Re: Distribution of regions to servers

2012-09-26 Thread Eugeny Morozov
Dan, I have additional questions. What is the access pattern of your queries? For example, PrefixFilters have to be applied to all KeyValue pairs in the HFiles, which could be slow. Or, for example, the scanner's setCaching option can decrease the number of network hops needed to get data from the RegionServer. Additi
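
The effect of the setCaching option on network hops can be estimated with simple arithmetic. The sketch below is a rough model only, assuming each scanner round trip returns at most `caching` rows; it ignores region boundaries, result-size limits, and the final call that discovers the end of the scan. The function name is hypothetical.

```python
def estimated_round_trips(total_rows: int, caching: int) -> int:
    """Rough number of scanner round trips needed to fetch total_rows rows."""
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if caching < 1:
        raise ValueError("caching must be at least 1")
    # Integer ceiling division: each round trip carries up to `caching` rows.
    return -(-total_rows // caching)
```

Under this model, fetching 10,000 rows with a caching value of 1 takes 10,000 round trips, while a caching value of 500 takes 20, at the cost of more memory per batch on both client and server.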

Re: Simple way to unit test HBase Map reduce jobs?

2012-09-24 Thread Eugeny Morozov
Hi, Elazar, I've found that MRUnit is a pretty convenient way to test MR jobs. On the other hand, there is HBaseTestingUtility, which is helpful for running a miniCluster. Hope this helps. On Mon, Sep 24, 2012 at 8:43 PM, Elazar Leibovich wrote: > Is there a way similar to miniserver to test map reduce

Re: scan.setTimeRange performance

2012-09-24 Thread Eugeny Morozov
n a public web server. > > I don't have much to contribute except than to point to a recent > conversation that you can find here: > http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/28722 > > Hope this helps, > > J-D > > On Fri, Sep 21, 2012 at 5:03 AM

scan.setTimeRange performance

2012-09-21 Thread Eugeny Morozov
Hello! It is known, and I saw it in the code, that the time range set by scan.setTimeRange is used to filter out HFiles before they are scanned. This means that a subsequent scanner.next should take almost no time if I set the time range far in the future. I am sure that I do not have HFiles that fall int
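
The HFile filtering described in this message amounts to an interval-overlap check. The sketch below models it in plain Python and is not the HBase implementation: `StoreFile` and its `min_ts`/`max_ts` fields are hypothetical stand-ins for the per-file timestamp metadata, and the scan range is treated as half-open, [min_stamp, max_stamp), as in scan.setTimeRange.

```python
from typing import List, NamedTuple


class StoreFile(NamedTuple):
    """Hypothetical model of an HFile with its timestamp bounds (inclusive)."""
    name: str
    min_ts: int
    max_ts: int


def overlaps(f: StoreFile, min_stamp: int, max_stamp: int) -> bool:
    # The file is relevant only if its [min_ts, max_ts] interval
    # intersects the scan's [min_stamp, max_stamp) interval.
    return f.max_ts >= min_stamp and f.min_ts < max_stamp


def files_to_scan(files: List[StoreFile], min_stamp: int, max_stamp: int) -> List[StoreFile]:
    """Keep only the files whose timestamp bounds intersect the scan range."""
    return [f for f in files if overlaps(f, min_stamp, max_stamp)]
```

With a time range set far in the future, no file passes the check in this model, which is why the scan would be expected to return almost immediately.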