Re: Hbase Question
Hi,

No matter how many versions of an HBase class are packaged in your jar, the classloader will load the first one it finds on the classpath. You could consider OSGi (a kind of module system).

2017-11-17 18:57 GMT+08:00 apple:
> Hi,
> I want to synchronize data between HBase 0.9 and HBase 1.2.
> I have found several ways to do it:
> 1. Replication (needs modification)
> 2. Sync the HLog before it is deleted to the HDFS .oldlog (needs modification)
> 3. The client writes data to both HBase clusters
> 4. The client writes data to Kafka, which is consumed into both HBase clusters
> This is a good choice to satisfy your scenario.
> But I think the biggest question is how one Java client can use two
> hbase-client jars. They must conflict. What can I do?
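
On the jar conflict discussed above: the clash comes from a single class loader resolving each class name only once. What a module system such as OSGi automates can be sketched by hand with one class loader per client jar. The following is a minimal sketch under stated assumptions, not a tested recipe for HBase: the class name IsolatedLoaders and the jar paths are hypothetical, and each loader would also need all of that client's dependencies.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;

class IsolatedLoaders {

    /** Builds a loader that sees only the given jars plus the JDK's own classes. */
    static URLClassLoader forJars(Path... jars) {
        URL[] urls = new URL[jars.length];
        for (int i = 0; i < jars.length; i++) {
            try {
                urls[i] = jars[i].toUri().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad jar path: " + jars[i], e);
            }
        }
        // The parent is the loader above the application loader, so classes on
        // the application classpath (such as another HBase version) stay invisible.
        ClassLoader parent = ClassLoader.getSystemClassLoader().getParent();
        return new URLClassLoader(urls, parent);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical jar locations (assumption): substitute the real client jars.
        try (URLClassLoader oldClient = forJars(Paths.get("lib-old", "hbase-client-old.jar"));
             URLClassLoader newClient = forJars(Paths.get("lib-new", "hbase-client-new.jar"))) {
            // Each loader resolves class names independently of the other.
            System.out.println("old client loader sees: " + oldClient.getURLs()[0]);
            System.out.println("new client loader sees: " + newClient.getURLs()[0]);
        }
    }
}
```

Code loaded this way has to be called through reflection or through an interface that lives in a loader both sides share, which is a large part of the bookkeeping a module system takes over.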
Re: HBase Question
Meh. Go to Hive instead.

> On Mar 13, 2015, at 11:35 AM, Abraham Tom wrote:
>
> If you are comfortable with SQL, I would look into Phoenix:
> http://phoenix.apache.org/index.html
>
> On Thu, Mar 12, 2015 at 10:00 PM, Sudeep Pandey wrote:
>
>> Hello:
>>
>> If I am unable to do Java coding and prefer the HBase shell for HBase
>> work/interactions, will I be able to do all operations?
>> i.e.
>>
>> Is Java coding (Client API) needed to do something in HBase which is not
>> possible with HBase shell commands?
>>
>> Thank You,
>> Sudeep Pandey

Michael Segel
Re: HBase Question
Have you looked at the REST API? Can that be an option for you?

On 2015-03-13 11:28, "Sudeep Pandey" wrote:
> Hello:
>
> If I am unable to do Java coding and prefer the HBase shell for HBase
> work/interactions, will I be able to do all operations?
> i.e.
>
> Is Java coding (Client API) needed to do something in HBase which is not
> possible with HBase shell commands?
>
> Thank You,
> Sudeep Pandey
Re: HBase Question
If you are comfortable with SQL, I would look into Phoenix:
http://phoenix.apache.org/index.html

On Thu, Mar 12, 2015 at 10:00 PM, Sudeep Pandey wrote:
> Hello:
>
> If I am unable to do Java coding and prefer the HBase shell for HBase
> work/interactions, will I be able to do all operations?
> i.e.
>
> Is Java coding (Client API) needed to do something in HBase which is not
> possible with HBase shell commands?
>
> Thank You,
> Sudeep Pandey

--
Abraham Tom
Re: HBase Question
We usually try to have a shell way of doing all public-facing operations. In particular, if something shows up in the ref guide [1] without a shell way to do it, I'd consider that a bug.

The one big caveat is that the shell is not performant for inserting or fetching data. Those functions are really only set up in the shell for works-at-all operational testing.

Would working in a scripting language like Python be easier? We have a couple of options for accessing HBase from non-JVM languages.

[1]: http://hbase.apache.org/book.html

On Fri, Mar 13, 2015 at 12:00 AM, Sudeep Pandey wrote:
> Hello:
>
> If I am unable to do Java coding and prefer the HBase shell for HBase
> work/interactions, will I be able to do all operations?
> i.e.
>
> Is Java coding (Client API) needed to do something in HBase which is not
> possible with HBase shell commands?
>
> Thank You,
> Sudeep Pandey

--
Sean
Re: Hbase question
HBase serves the purpose if you run it on HDFS, since HDFS distributes the data, and you can also write HBase MapReduce code in addition to using the HBase APIs. Please check the following links for HBase MapReduce coding:

http://hbase.apache.org/book/mapreduce.example.html
http://hbase.apache.org/book/ops_mgt.html
http://hbase.apache.org/book/mapreduce.html

Thanks & Regards
Shashwat Shriparv

On Sun, Apr 21, 2013 at 10:47 PM, Ted Yu wrote:
> HBase relies on HDFS features heavily.
> HBase also supports running MapReduce jobs.
>
> You can find examples in these places (0.94 codebase):
>
> ./security/src/test/java/org/apache/hadoop/hbase/mapreduce
> ./src/examples/mapreduce/org/apache/hadoop/hbase/mapreduce
> ./src/main/java/org/apache/hadoop/hbase/mapreduce
> ./src/main/resources/org/apache/hadoop/hbase/mapreduce
> ./src/test/java/org/apache/hadoop/hbase/mapreduce
Re: Hbase question
HBase relies on HDFS features heavily.
HBase also supports running MapReduce jobs.

You can find examples in these places (0.94 codebase):

./security/src/test/java/org/apache/hadoop/hbase/mapreduce
./src/examples/mapreduce/org/apache/hadoop/hbase/mapreduce
./src/main/java/org/apache/hadoop/hbase/mapreduce
./src/main/resources/org/apache/hadoop/hbase/mapreduce
./src/test/java/org/apache/hadoop/hbase/mapreduce

On Sun, Apr 21, 2013 at 12:53 AM, Rami Mankevich wrote:
> I have an additional question:
> HBase is built on top of Hadoop.
> Does HBase use only HDFS from Hadoop, or does it use the MapReduce job
> engine as well?
> Thanks a lot!
Re: Hbase question
Hello Rami,

HBase is not built on top of Hadoop in the strict sense. HDFS is not a must, but it gives you a better storage option (courtesy of HDFS's distributed storage, scalability, etc.). You could use HBase with other file systems as well, even with your local FS. And you could definitely use MR jobs to efficiently process the data stored in your HBase store, but MR is again not a must. Beyond this, HBase also provides other APIs, such as Java and Thrift.

HTH

Warm Regards,
Tariq

On Sun, Apr 21, 2013 at 1:23 PM, Rami Mankevich wrote:
> I have an additional question:
> HBase is built on top of Hadoop.
> Does HBase use only HDFS from Hadoop, or does it use the MapReduce job
> engine as well?
> Thanks a lot!
RE: Hbase question
I have an additional question:
HBase is built on top of Hadoop.
Does HBase use only HDFS from Hadoop, or does it use the MapReduce job engine as well?
Thanks a lot!
Re: Hbase question
Gary has provided a nice summary of things to watch out for.

One more thing I want to mention is that care should be taken w.r.t. coordinating the progress of the thread pool with normal region operations. There are already many threads running in the region server JVM. Adding one more thread pool may make resource consumption more complex.
What if the custom processing cannot keep up with the rate at which asynchronous requests are queued?

W.r.t. a thread pool, you can refer to the following code in HRegion:

    ThreadPoolExecutor storeOpenerThreadPool =
        getStoreOpenAndCloseThreadPool(
            "StoreOpenerThread-" + this.getRegionNameAsString());
    CompletionService completionService =
        new ExecutorCompletionService(storeOpenerThreadPool);

Cheers

On Tue, Apr 9, 2013 at 11:57 AM, Gary Helmling wrote:
> Hi Rami,
>
> One thing to note for RegionObservers is that each table region gets its
> own instance of each configured coprocessor. So if your cluster has N
> regions per region server, with your RegionObserver loaded on all tables,
> then each region server will have N instances of your coprocessor. You
> should just be aware of this in case you, say, create a thread pool in your
> coprocessor constructor. An alternative in this case is to use a singleton
> class per region server (aka per JVM) to manage the resources.
>
> You do want to be sure that all threads are daemon threads, so that they
> don't block region server shutdown. Or else you'll need to ensure you
> properly stop/join all the threads you've spawned on shutdown.
> RegionServerObserver.preStopRegionServer() may help there.
>
> --gh
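
The question raised above, what happens when the custom processing cannot keep up with the rate at which requests are queued, has one common answer in plain java.util.concurrent: bound the queue and decide explicitly what to do on overflow. The sketch below is an illustration only and uses no HBase API. The class name BoundedAsyncUpdater, the pool and queue sizes, and the policy of dropping and counting rejected work are all assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class BoundedAsyncUpdater {
    private final AtomicLong dropped = new AtomicLong();
    private final ThreadPoolExecutor pool;

    BoundedAsyncUpdater(int threads, int queueCapacity) {
        pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                // Bounded queue: pending work cannot grow without limit.
                new ArrayBlockingQueue<>(queueCapacity),
                // Daemon threads, so they never block JVM shutdown.
                r -> {
                    Thread t = new Thread(r, "async-updater");
                    t.setDaemon(true);
                    return t;
                },
                // Overflow policy (assumption): drop the update and count it.
                (task, executor) -> dropped.incrementAndGet());
    }

    void submit(Runnable update) {
        pool.execute(update);
    }

    long droppedCount() {
        return dropped.get();
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        BoundedAsyncUpdater updater = new BoundedAsyncUpdater(1, 1);
        CountDownLatch gate = new CountDownLatch(1);
        Runnable slowUpdate = () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        updater.submit(slowUpdate); // handed straight to the single worker
        updater.submit(slowUpdate); // waits in the queue
        updater.submit(slowUpdate); // queue is full: counted as dropped
        System.out.println("dropped=" + updater.droppedCount()); // prints dropped=1
        gate.countDown();
        updater.shutdown();
    }
}
```

Dropping is only one possible policy. Blocking the caller or running the task on the calling thread would trade lost updates for slower user responses, which is the very thing the asynchronous design was meant to avoid.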
Re: Hbase question
Hi Rami,

One thing to note for RegionObservers is that each table region gets its own instance of each configured coprocessor. So if your cluster has N regions per region server, with your RegionObserver loaded on all tables, then each region server will have N instances of your coprocessor. You should just be aware of this in case you, say, create a thread pool in your coprocessor constructor. An alternative in this case is to use a singleton class per region server (aka per JVM) to manage the resources.

You do want to be sure that all threads are daemon threads, so that they don't block region server shutdown. Or else you'll need to ensure you properly stop/join all the threads you've spawned on shutdown. RegionServerObserver.preStopRegionServer() may help there.

--gh

On Tue, Apr 9, 2013 at 11:40 AM, Ted Yu wrote:
> Rami:
> Can you tell us what coprocessor hook you plan to use?
>
> Thanks
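
The two suggestions above, a singleton that owns the threads and daemon threads throughout, can be sketched with the standard library alone. This is a hedged illustration, not HBase code: the class name SharedCoprocessorPool and the pool size of 4 are assumptions. Note also that a Java static is unique per class loader, so whether this really yields one pool per JVM depends on how the coprocessor classes are loaded.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedCoprocessorPool {
    private static final AtomicInteger THREAD_IDS = new AtomicInteger();

    // Every thread is a daemon, so the pool cannot hold up process shutdown.
    private static final ThreadFactory DAEMON_FACTORY = r -> {
        Thread t = new Thread(r, "coproc-shared-" + THREAD_IDS.incrementAndGet());
        t.setDaemon(true);
        return t;
    };

    // Holder idiom: the pool is created lazily and only once per class loader,
    // no matter how many coprocessor instances ask for it.
    private static final class Holder {
        static final ExecutorService POOL = Executors.newFixedThreadPool(4, DAEMON_FACTORY);
    }

    private SharedCoprocessorPool() {
    }

    static ExecutorService get() {
        return Holder.POOL;
    }

    /** Stops the workers; could be called from a shutdown hook of your choice. */
    static void shutdown() {
        Holder.POOL.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        Callable<Boolean> probe = () -> Thread.currentThread().isDaemon();
        Future<Boolean> onDaemon = get().submit(probe);
        System.out.println("daemon=" + onDaemon.get()); // prints daemon=true
        shutdown();
    }
}
```

Each coprocessor instance would call SharedCoprocessorPool.get() instead of building its own pool in its constructor, so N regions share 4 threads rather than creating N pools. Wiring shutdown() into RegionServerObserver.preStopRegionServer(), as suggested above, is not shown here.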
Re: Hbase question
Rami:
Can you tell us what coprocessor hook you plan to use?

Thanks

On Tue, Apr 9, 2013 at 10:51 AM, Rami Mankevich wrote:
> First of all - thanks for the quick response.
>
> Basically, the threads I want to open are for my own internal structure
> updates, and I guess they have no relation to HBase internal structures.
> All I want is to initiate some asynchronous structure updates as part
> of coprocessor execution, in order not to block the user response.
>
> The only reason I was asking is to be sure HBase will not kill those
> threads.
> As I understand it, there shouldn't be any issue with that. Am I correct?
>
> In addition - is there any HBase thread pool I can use?
>
> Thanks
RE: Hbase question
First of all - thanks for the quick response.

Basically, the threads I want to open are for my own internal structure updates, and I guess they have no relation to HBase internal structures.
All I want is to initiate some asynchronous structure updates as part of coprocessor execution, in order not to block the user response.

The only reason I was asking is to be sure HBase will not kill those threads.
As I understand it, there shouldn't be any issue with that. Am I correct?

In addition - is there any HBase thread pool I can use?

Thanks

From: Andrew Purtell [mailto:apurt...@apache.org]
Sent: Tuesday, April 09, 2013 6:53 PM
To: Rami Mankevich
Cc: apurt...@apache.org
Subject: Re: Hbase question

Hi Rami,

It is no problem to create threads in a coprocessor, as a generic answer. More specifically, there could be issues depending on exactly what you want to do, since coprocessor code changes HBase internals. Perhaps you could say a bit more. I also encourage you to ask this question on user@hbase.apache.org so other contributors can chime in too.

On Tuesday, April 9, 2013, Rami Mankevich wrote:
Hey
According to the HBase documentation you are one of the contributors to the HBase project.
I would like to raise a question on which nobody has been able to advise me:

In the context of coprocessors, I want to start some threads.
Do you see any problems with that?

Thanks

--
Best regards,

   - Andy
RE: Hbase Question
Dear Yong,

How do I distribute my data in the cluster?
Note that I am using Cloudera Manager 4.1.

Thanks in advance :D

> Date: Fri, 28 Dec 2012 20:38:22 +0100
> Subject: Re: Hbase Question
> From: yongyong...@gmail.com
> To: user@hbase.apache.org
>
> I think you should take a look at your row-key design and distribute
> your data evenly across your cluster. As you mentioned, even when you
> added more nodes there was no improvement in performance. Maybe you
> have one node that is a hot spot while the other nodes have no work to do.
>
> regards!
>
> Yong
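
One generic way to act on the row-key advice quoted above is salting: prefixing each key with a small bucket number derived from the key, so that sequential keys no longer sort next to each other. The sketch below illustrates only the key transformation and assumes nothing about the actual 'patient' table. The class name SaltedRowKey, the "patient-%06d" key format and the bucket count of 8 are hypothetical.

```java
import java.util.Locale;

class SaltedRowKey {

    /** Prefixes the key with a two-digit bucket computed from the key itself. */
    static String salt(String rowKey, int buckets) {
        if (buckets < 1 || buckets > 100) {
            throw new IllegalArgumentException("buckets must be in 1..100");
        }
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(rowKey.hashCode(), buckets);
        return String.format(Locale.ROOT, "%02d-%s", bucket, rowKey);
    }

    /** Recovers the original key by removing the "NN-" prefix. */
    static String unsalt(String saltedKey) {
        return saltedKey.substring(3);
    }

    public static void main(String[] args) {
        // Hypothetical sequential keys that would otherwise sort into one region.
        for (int i = 1; i <= 5; i++) {
            String key = String.format(Locale.ROOT, "patient-%06d", i);
            System.out.println(key + " -> " + salt(key, 8));
        }
    }
}
```

The trade-off is on the read side: a range scan over the original key order has to be issued once per bucket and merged by the client, and the table needs regions split along the bucket prefixes before the writes actually land on different servers.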
Re: Hbase Question
I think you should take a look at your row-key design and distribute your data evenly across your cluster. As you mentioned, even when you added more nodes there was no improvement in performance. Maybe you have one node that is a hot spot while the other nodes have no work to do.

regards!

Yong

On Tue, Dec 25, 2012 at 3:31 AM, 周梦想 wrote:
> Hi Dalia,
>
> I think you can make a small sample of the table to do the test; then
> you'll find the difference between scan and count,
> because with a small sample you can count the rows by hand.
>
> Best regards,
> Andy
Re: Hbase Question
Hi Dalia,

I think you can make a small sample of the table to do the test; then you'll find the difference between scan and count, because with a small sample you can count the rows by hand.

Best regards,
Andy

2012/12/24 Dalia Sobhy
> Dear all,
>
> I have 50,000 rows with diagnosis qualifier = "cardiac", and another 50,000
> rows with "renal".
>
> When I type this in the HBase shell,
>
> import org.apache.hadoop.hbase.filter.CompareFilter
> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> import org.apache.hadoop.hbase.filter.SubstringComparator
> import org.apache.hadoop.hbase.util.Bytes
>
> scan 'patient', { COLUMNS => "info:diagnosis", FILTER =>
>     SingleColumnValueFilter.new(Bytes.toBytes('info'),
>         Bytes.toBytes('diagnosis'),
>         CompareFilter::CompareOp.valueOf('EQUAL'),
>         SubstringComparator.new('cardiac'))}
>
> Output = 50,000 rows
>
> import org.apache.hadoop.hbase.filter.CompareFilter
> import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
> import org.apache.hadoop.hbase.filter.SubstringComparator
> import org.apache.hadoop.hbase.util.Bytes
>
> count 'patient', { COLUMNS => "info:diagnosis", FILTER =>
>     SingleColumnValueFilter.new(Bytes.toBytes('info'),
>         Bytes.toBytes('diagnosis'),
>         CompareFilter::CompareOp.valueOf('EQUAL'),
>         SubstringComparator.new('cardiac'))}
>
> Output = 100,000 rows
>
> I get the same result even when I use the HBase Java API with an
> AggregationClient instance, with coprocessor aggregation enabled for
> the table:
>
> rowCount = aggregationClient.rowCount(TABLE_NAME, null, scan)
>
> Also, when I measure the performance after adding more nodes,
> the operation takes the same time.
>
> So any advice please?
>
> I have been struggling with all this mess for a couple of weeks.
>
> Thanks,
Re: HBase question
> I have 2 questions:
> 1. Does HBase support scanning data of a row key by column?

You mean secondary indexes? No: http://hbase.apache.org/book.html#secondary.indices

> 2. Which design is better? I think that design 2 is better when a user has
> a large number of followers.

I cover a bunch of designs in this very recent thread: http://search-hadoop.com/m/AXRdP1KKR5T1

J-D
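
Since HBase has no built-in secondary indexes, one workaround is for the application to maintain an index table itself. The sketch below shows only the idea and is not taken from the linked thread. Two sorted maps stand in for HBase tables (both keep keys in sorted order), and the class name ManualSecondaryIndex, the NUL separator and the sample data are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class ManualSecondaryIndex {
    // Separator between value and row key; assumes neither contains NUL.
    private static final char SEP = '\u0000';

    // Stand-ins for two tables, each sorted by key as HBase rows are.
    private final TreeMap<String, String> dataTable = new TreeMap<>();  // rowKey -> value
    private final TreeMap<String, String> indexTable = new TreeMap<>(); // value+SEP+rowKey -> rowKey

    /** Writes the row and keeps the index entry in step with it. */
    void put(String rowKey, String value) {
        String old = dataTable.put(rowKey, value);
        if (old != null) {
            indexTable.remove(old + SEP + rowKey); // drop the stale index entry
        }
        indexTable.put(value + SEP + rowKey, rowKey);
    }

    /** Finds row keys by value with a prefix range scan over the index. */
    List<String> rowKeysWithValue(String value) {
        String start = value + SEP;
        String stop = value + (char) (SEP + 1);
        return new ArrayList<>(indexTable.subMap(start, true, stop, false).values());
    }

    public static void main(String[] args) {
        ManualSecondaryIndex idx = new ManualSecondaryIndex();
        idx.put("u1", "alice");
        idx.put("u2", "bob");
        idx.put("u3", "alice");
        System.out.println(idx.rowKeysWithValue("alice")); // prints [u1, u3]
    }
}
```

Against a real cluster the two writes go to different tables and are not atomic, so the index can briefly disagree with the data. That is one reason the design question above has no single answer.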