Re: Multiple column families vs Multiple tables

chutium Tue, 19 Aug 2014 16:50:15 -0700

ö_ö  You should send this message to the HBase user list, not the Spark user list...


But I can give you some personal advice on this: keep the number of column
families as small as possible!

Using a prefix on the column qualifier instead of separate column families
could also be an idea. But read performance may be worse for a use case like
"search for a row with value X in column family A and with value Y in column
family B".
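
To make the qualifier-prefix idea concrete, here is a minimal sketch in plain
Python. It is only an in-memory model (a dict of rows), not the HBase client
API, and the names `qualifier`, `put`, `rows_matching` and the `_` separator
are assumptions made up for this illustration.

```python
from typing import Dict, List

# In-memory stand-in for one table with a single column family:
# row key -> {qualifier -> value}. This is NOT the HBase client API.
Table = Dict[str, Dict[str, str]]

SEP = "_"  # hypothetical separator between the prefix and the real qualifier


def qualifier(prefix: str, name: str) -> str:
    """Build a prefixed qualifier, e.g. ('a', 'color') -> 'a_color'."""
    return f"{prefix}{SEP}{name}"


def put(table: Table, row: str, prefix: str, name: str, value: str) -> None:
    """Store a value under a prefixed qualifier in the single family."""
    table.setdefault(row, {})[qualifier(prefix, name)] = value


def rows_matching(table: Table, prefix_a: str, x: str,
                  prefix_b: str, y: str) -> List[str]:
    """Return rows holding value x under prefix_a and value y under prefix_b."""
    hits = []
    for row, cells in sorted(table.items()):
        # Every cell of the row is inspected, whichever prefix it carries.
        has_x = any(q.startswith(prefix_a + SEP) and v == x
                    for q, v in cells.items())
        has_y = any(q.startswith(prefix_b + SEP) and v == y
                    for q, v in cells.items())
        if has_x and has_y:
            hits.append(row)
    return hits


table: Table = {}
put(table, "row1", "a", "color", "x")
put(table, "row1", "b", "size", "y")
put(table, "row2", "a", "color", "x")
matches = rows_matching(table, "a", "x", "b", "y")
```

In this model the lookup has to walk all cells of each row because both
logical groups live in the same family, which is one way to picture why such
reads may cost more than with separate families.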

So it depends on which workload matters most to you. If your use case is
very read-heavy and you really want to use multiple column families to keep
read performance good, you should try disabling region splits, tuning the
compaction interval carefully, and so on.
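
As a rough sketch of what that tuning might look like, the Python snippet
below renders two settings as hbase-site.xml property entries. To the best of
my knowledge `hbase.regionserver.region.split.policy` (with
`DisabledRegionSplitPolicy`) and `hbase.hregion.majorcompaction` are the
relevant property names, but check them against the hbase-default.xml of your
HBase version. The interval value is only a placeholder, and the
`render_properties` helper is made up for this illustration.

```python
from typing import Dict
from xml.sax.saxutils import escape

# Assumed settings; the values are illustrative, not recommendations.
SETTINGS: Dict[str, str] = {
    # Split policy class that turns automatic region splits off.
    "hbase.regionserver.region.split.policy":
        "org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy",
    # Major compaction interval in milliseconds (placeholder: 7 days).
    "hbase.hregion.majorcompaction": str(7 * 24 * 60 * 60 * 1000),
}


def render_properties(settings: Dict[str, str]) -> str:
    """Render settings as <property> entries for hbase-site.xml."""
    lines = []
    for name, value in settings.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    return "\n".join(lines)


fragment = render_properties(SETTINGS)
print(fragment)
```

The printed entries would go inside the `<configuration>` element of
hbase-site.xml. With splits disabled, regions are usually pre-split when the
table is created.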

There is a good slide on this:
http://photo.weibo.com/1431095941/wbphotos/large/mid/3735178188435939/pid/554cca85gw1eiloddlqa5j20or0ik77z

More slides about HBase + coprocessors, HBase + Hive and HBase + Spark:
http://www.weibo.com/1431095941/BeL90zozx




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Multiple-column-families-vs-Multiple-tables-tp12425p12439.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

