Re: Reply: hbase jvm problem

2013-04-07 Thread jian fan
Liang: Thanks! Regards Jian Fan 2013/4/8 ramkrishna vasudevan > The above blog is informative. Thanks for the link. > > Regards > Ram > > > On Mon, Apr 8, 2013 at 7:59 AM, 谢良 wrote: > > > Would you have a chance to test w/o UseAdaptiveSizePolicy option? > > There's a related hotspot iss

Re: schema design: rows vs wide columns

2013-04-07 Thread ramkrishna vasudevan
"So things are fine as long as all CFs have roughly the same size. But if you have one that gets a lot of data and many others that are smaller, we'd end up with a lot of unnecessary and small store files from the smaller CFs." This is true. I am not sure about other reasons. We anyway ensure

RE: Disabling balancer permanently in HBase

2013-04-07 Thread Anoop Sam John
HBASE-6260 made the balancer state persist in ZK so that a restart of the Master won't be an issue. But this is available in 0.95 only. Just an FYI. -Anoop- From: Jean-Marc Spaggiari [jean-m...@spaggiari.org] Sent: Monday, April 08, 2013

Re: schema design: rows vs wide columns

2013-04-07 Thread lars hofhansl
I think the main problem is that all CFs have to be flushed if one gets large enough to require a flush. (Does anyone remember why exactly that is? And do we still need that now that the memstoreTS is stored in the HFiles?) So things are fine as long as all CFs have roughly the same size. But i
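The effect described above can be illustrated with a toy model. This is only a sketch under the assumption stated in the message (once one CF's memstore is large enough to flush, every CF in the region is flushed with it). `FlushModel`, its parameters, and the write rates are hypothetical illustrations, not HBase code or measured numbers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlushModel {
    /**
     * Toy model (hypothetical, not HBase code): in each round every column
     * family buffers writesPerRound.get(cf) bytes in its memstore. As soon as
     * any memstore reaches flushSize, ALL memstores are flushed and each
     * non-empty one becomes one store file.
     * Returns the sizes of the store files written per column family.
     */
    static Map<String, List<Long>> simulate(Map<String, Long> writesPerRound,
                                            long flushSize, int rounds) {
        Map<String, Long> memstore = new LinkedHashMap<>();
        Map<String, List<Long>> storeFiles = new LinkedHashMap<>();
        for (String cf : writesPerRound.keySet()) {
            memstore.put(cf, 0L);
            storeFiles.put(cf, new ArrayList<>());
        }
        for (int r = 0; r < rounds; r++) {
            boolean flushNeeded = false;
            for (Map.Entry<String, Long> e : writesPerRound.entrySet()) {
                long size = memstore.get(e.getKey()) + e.getValue();
                memstore.put(e.getKey(), size);
                if (size >= flushSize) {
                    flushNeeded = true;
                }
            }
            if (flushNeeded) {
                // One CF hit the limit, so every CF is flushed together.
                for (String cf : writesPerRound.keySet()) {
                    long size = memstore.get(cf);
                    if (size > 0) {
                        storeFiles.get(cf).add(size);
                    }
                    memstore.put(cf, 0L);
                }
            }
        }
        return storeFiles;
    }

    public static void main(String[] args) {
        Map<String, Long> writes = new LinkedHashMap<>();
        writes.put("big", 16L * 1024 * 1024); // assumed: 16 MB per round
        writes.put("small", 64L * 1024);      // assumed: 64 KB per round
        Map<String, List<Long>> files = simulate(writes, 128L * 1024 * 1024, 80);
        for (Map.Entry<String, List<Long>> e : files.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size()
                    + " store files, first file " + e.getValue().get(0) + " bytes");
        }
    }
}
```

With these assumed rates the small CF ends up with as many store files as the big one, but each of its files is only a fraction of the flush size, which is the "unnecessary and small store files" case from the message.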

Re: Reply: hbase jvm problem

2013-04-07 Thread ramkrishna vasudevan
The above blog is informative. Thanks for the link. Regards Ram On Mon, Apr 8, 2013 at 7:59 AM, 谢良 wrote: > Would you have a chance to test w/o UseAdaptiveSizePolicy option? > There's a related hotspot issue discussed several days ago: > http://marc.info/?l=openjdk-serviceability-dev&m=136367

Re: schema design: rows vs wide columns

2013-04-07 Thread ramkrishna vasudevan
I agree with Andrew here, and Stack's comment on FB usage with 15 CFs is interesting. Whenever people read that line in the doc, they asked why it is so, and I was also thinking that the restriction of having at most 3 CFs was one factor which sometimes made schema design a bit challengin

Re: Essential column family performance

2013-04-07 Thread lars hofhansl
Looking at the joined scanner test code, it sets it up such that 1% of the rows match, which would somewhat be in line with James' results. In my own testing a while ago I found a 100% improvement with 0% match. -- Lars From: Ted Yu To: user@hbase.apache.or

RE: HBase tasks

2013-04-07 Thread Azuryy Yu
I guess he has only one CF, which is in memory, so he called it an in-memory table. --Send from my Sony mobile. On Apr 8, 2013 11:46 AM, "Anoop Sam John" wrote: > Hi > >But what to do, if I have an HBase in-memory table, > Why do you say in-memory table? All the data in memory? Can you explain a bit > about

RE: HBase tasks

2013-04-07 Thread Anoop Sam John
Hi >But what to do, if I have an HBase in-memory table, Why do you say in-memory table? All the data in memory? Can you explain a bit about this? Yes, there is an MR job to scan the HBase table data (full or part). When you say you want to retrieve data fast, what is the amount of data? How many regions

Re: ANN: hbase-0.95.0, the first in our 0.95 "Development" Series, is available for download

2013-04-07 Thread Azuryy Yu
I noticed in the release notes that you've added PB-based RPC. On Mon, Apr 8, 2013 at 11:25 AM, Azuryy Yu wrote: > What's the difference between > > hbase-0.95.0-hadoop1-bin.tar.gz > > and hbase-0.95.0-hadoop2-bin.tar.

Re: ANN: hbase-0.95.0, the first in our 0.95 "Development" Series, is available for download

2013-04-07 Thread Azuryy Yu
What's the difference between hbase-0.95.0-hadoop1-bin.tar.gz and hbase-0.95.0-hadoop2-bin.tar.gz? Is that hadoop2-bin compiled usin

Reply: hbase jvm problem

2013-04-07 Thread 谢良
Would you have a chance to test w/o UseAdaptiveSizePolicy option? There's a related hotspot issue discussed several days ago: http://marc.info/?l=openjdk-serviceability-dev&m=136367606426463&w=1 Best, Liang From: jian fan [xiaofanhb...@gmail.com] Sent: 2013

Re: hbase jvm problem

2013-04-07 Thread Jean-Marc Spaggiari
Might be better (recommended). Can you try with the Oracle one? On Apr 7, 2013 22:06, "jian fan" wrote: > I am using java-1.6.0-openjdk.x86_64. Must it be the Oracle JVM? > > 2013/4/8 Jean-Marc Spaggiari > > > Hi Jian, > > > > Which JVM are you using? Have you tried with the latest 1.6 Oracle JVM? >

Re: hbase jvm problem

2013-04-07 Thread jian fan
I am using java-1.6.0-openjdk.x86_64. Must it be the Oracle JVM? 2013/4/8 Jean-Marc Spaggiari > Hi Jian, > > Which JVM are you using? Have you tried with the latest 1.6 Oracle JVM? > > JM > > 2013/4/7 jian fan : > > Hi guys: > > > >I get a JVM error with this information: > > > >2013-04-07 23:3

Re: hbase jvm problem

2013-04-07 Thread Jean-Marc Spaggiari
Hi Jian, Which JVM are you using? Have you tried with the latest 1.6 Oracle JVM? JM 2013/4/7 jian fan : > Hi guys: > >I get a JVM error with this information: > >2013-04-07 23:36:58,344 FATAL > org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server > a2,60020,13653231

hbase jvm problem

2013-04-07 Thread jian fan
Hi guys: I get a JVM error with this information: 2013-04-07 23:36:58,344 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server a2,60020,1365323145468: Unhandled exception: committed = 4190785536 should be < max = 4187619328 java.lang.IllegalArgumentException:
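The wording of the abort message matches the argument check in the JDK's `java.lang.management.MemoryUsage` constructor, which rejects a committed size larger than a defined max. Assuming that is where the exception originates (the stack trace is cut off here, so this is not confirmed), the sketch below feeds the two reported numbers to that constructor and then prints the values the running JVM reports. `HeapUsageCheck` and `validate` are hypothetical names used only for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapUsageCheck {
    // Values copied from the abort message in the report above.
    static final long REPORTED_COMMITTED = 4190785536L;
    static final long REPORTED_MAX = 4187619328L;

    /**
     * Returns the message of the IllegalArgumentException that MemoryUsage
     * throws for the given sizes, or null if the sizes are accepted.
     */
    static String validate(long committed, long max) {
        try {
            new MemoryUsage(0L, 0L, committed, max);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("reported sizes rejected with: "
                + validate(REPORTED_COMMITTED, REPORTED_MAX));
        System.out.println("committed exceeds max by "
                + (REPORTED_COMMITTED - REPORTED_MAX) + " bytes");

        // What the current JVM reports for its own heap.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println("this JVM: committed=" + heap.getCommitted()
                + " max=" + heap.getMax());
    }
}
```

Running the last part on the affected region server's JVM would show whether the heap's committed size can drift above max there, which is what the replies asking about the JVM vendor and the UseAdaptiveSizePolicy option are trying to narrow down.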

Re: Disabling balancer permanently in HBase

2013-04-07 Thread Jean-Marc Spaggiari
2 other options: 1) Build your own balancer which always returns null and set it with hbase.master.loadbalancer.class; (*) 2) Give a dummy non-existing class for hbase.master.loadbalancer.class ? (**) (*) hbase.master.loadbalancer.class is still missing in the documentation. HBASE-7296 has been op

Re: ANN: hbase-0.95.0, the first in our 0.95 "Development" Series, is available for download

2013-04-07 Thread Stack
0.95.0 is available also in the apache maven repository now but only for hadoop1. We will include hadoop2 in later releases when hadoop-2.0.4 is more than just a SNAPSHOT. Meantime, for those interested in an hbase-0.95.0 built against hadoop2, I've uploaded a SNAPSHOT which hopefully is enough f

Re: schema design: rows vs wide columns

2013-04-07 Thread Viral Bajaria
I think this whole idea of "don't go over a certain number of column families" was a 2+ year old story. I remember hearing numbers like 5 or 6 (not 3) come up when talking at Hadoop conferences with engineers who were at companies that were heavy HBase users. I agree with Andrew's suggestion that we

Re: Essential column family performance

2013-04-07 Thread Ted Yu
I have attached 5416-TestJoinedScanners-0.94.txt to HBASE-5416 for your reference. On my MacBook, I got the following results from the test: 2013-04-07 16:08:17,474 INFO [main] regionserver.TestJoinedScanners(157): Slow scanner finished in 7.973822 seconds, got 100 rows ... 2013-04-07 16:08:17,9

Re: Essential column family performance

2013-04-07 Thread Ted Yu
Looking at https://issues.apache.org/jira/secure/attachment/12564340/5416-0.94-v3.txt, I found that it didn't contain TestJoinedScanners which shows difference in scanner performance: LOG.info((slow ? "Slow" : "Joined") + " scanner finished in " + Double.toString(timeSec) + " seconds, g

Re: schema design: rows vs wide columns

2013-04-07 Thread Andrew Purtell
Is there a pointer to evidence/experiment backed analysis of this question? I'm sure there is some basis for this text in the book but I recommend we strike it. We could replace it with YCSB or LoadTestTool driven latency graphs for different workloads maybe. Although that would also be a big simpl

Re: schema design: rows vs wide columns

2013-04-07 Thread Stack
On Sun, Apr 7, 2013 at 3:27 PM, Ted Yu wrote: > From http://hbase.apache.org/book.html#number.of.cfs : > > HBase currently does not do well with anything above two or three column > families so keep the number of column families in your schema low. > We should add more to that section. FB run w

Re: schema design: rows vs wide columns

2013-04-07 Thread Ted Yu
From http://hbase.apache.org/book.html#number.of.cfs : HBase currently does not do well with anything above two or three column families so keep the number of column families in your schema low. Cheers On Sun, Apr 7, 2013 at 3:04 PM, Stack wrote: > On Sun, Apr 7, 2013 at 11:58 AM, Ted wrote:

Re: Disabling balancer permanently in HBase

2013-04-07 Thread Stack
Try setting hbase.balancer.period to a very high number in your hbase-site.xml: http://hbase.apache.org/book.html#hbase.master.dns.nameserver St.Ack On Sun, Apr 7, 2013 at 3:14 PM, Akshay Singh wrote: > Hi, > > I am trying to permanently switch off the balancer in HBase, as my request > dis
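A minimal sketch of what that suggestion could look like in hbase-site.xml. The property name comes from the message above; the value is an assumption: the period is in milliseconds, and if the setting is read as a 32-bit integer, 2147483647 (roughly 24.8 days) is the largest value that fits. This stretches the interval between balancer runs rather than disabling the balancer outright.

```xml
<!-- hbase-site.xml on the Master: make the periodic balancer run very rarely. -->
<property>
  <name>hbase.balancer.period</name>
  <!-- Assumed value: largest 32-bit int, in milliseconds (about 24.8 days). -->
  <value>2147483647</value>
</property>
```

The Master would need a restart to pick up the change.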

Disabling balancer permanently in HBase

2013-04-07 Thread Akshay Singh
Hi, I am trying to permanently switch off the balancer in HBase, as my request distribution is not uniform across the data. I understand that this can be done by setting balance_switch to false in the hbase shell: hbase(main):023:0> balance_switch false However, the value of balance_switch is reset b

Re: schema design: rows vs wide columns

2013-04-07 Thread Stack
On Sun, Apr 7, 2013 at 11:58 AM, Ted wrote: > With regard to number of column families, 3 is the recommended maximum. > How did you come up w/ the number '3'? Is it a 'hard' 3? Or does it depend? If the latter, on what does it depend? Thanks, St.Ack

ANN: hbase-0.95.0, the first in our 0.95 "Development" Series, is available for download

2013-04-07 Thread Stack
The first release in a series of "Development" releases, hbase-0.95.0 is available for download from your favorite Apache mirror. See: http://www.apache.org/dyn/closer.cgi/hbase/ (It may take a few hours for the release to show up everywhere) About 1500 issues have been closed against this 0.

Re: schema design: rows vs wide columns

2013-04-07 Thread Ted
If you store service Id by month, how do you deal with a time range in a query that spans partial month(s)? With regard to number of column families, 3 is the recommended maximum. Cheers On Apr 7, 2013, at 1:03 AM, shawn du wrote: > Hello, > > I am new to HBase, but I have some experience o

Re: Essential column family performance

2013-04-07 Thread James Taylor
Yes, on 0.94.6. We have our own custom filter derived from FilterBase, so filterIfMissing isn't the issue - the results of the scan are correct. I can see that if the essential column family has more data compared to the non essential column family that the results would eventually even out. I

schema design: rows vs wide columns

2013-04-07 Thread shawn du
Hello, I am new to HBase, but I have some experience with Cassandra. The official documentation says to prefer rows over columns. I don't know whether I should follow that. This is my use case: I have hundreds of services. Each service is stored by a number (service id). We try to

Re: Essential column family performance

2013-04-07 Thread Ted Yu
James: Your test was based on 0.94.6.1, right ? What Filter were you using ? If you used SingleColumnValueFilter, have you seen my comment here ? https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541229&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#commen