ö_ö You should send this message to the HBase user list, not the Spark user list...
But I can give you some personal advice about this: keep the number of column families as small as possible! If nothing else, using a prefix on the column qualifier (instead of a separate family) could also be an idea. Read performance may be worse that way for a use case like "search for a row with value x in column family A and with value Y in column family B", so it depends on which workload matters more to you. If your use case is very read-heavy and you really want multiple column families to keep read performance good, you should try disabling region splits, adjusting the compaction interval carefully, and so on.

There is a good slide on this: http://photo.weibo.com/1431095941/wbphotos/large/mid/3735178188435939/pid/554cca85gw1eiloddlqa5j20or0ik77z

More slides about HBase + coprocessors, HBase + Hive, and HBase + Spark: http://www.weibo.com/1431095941/BeL90zozx
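
To make the qualifier-prefix idea a bit more concrete, here is a rough sketch in plain Python. It is only an in-memory model, not HBase client code: a dict stands in for a table, and the family name "d", the prefixes "a:" and "b:", and all helper names are made up for illustration. It shows how two logical groups can live in one column family and how the "value x in A and value Y in B" search then has to filter on the qualifier prefix.

```python
# In-memory model only: a plain dict stands in for an HBase table.
# The family name "d" and the prefixes "a:" / "b:" are hypothetical.

def qualifier(prefix, name):
    """Build a prefixed column qualifier, e.g. 'a:' + 'city' -> 'a:city'."""
    return prefix + name

def put(table, row_key, prefix, name, value):
    """Store a value under the single family 'd' using a prefixed qualifier."""
    row = table.setdefault(row_key, {})
    row[("d", qualifier(prefix, name))] = value

def group_values(row, prefix):
    """Collect the values of all cells whose qualifier starts with the prefix."""
    return {v for (family, q), v in row.items() if q.startswith(prefix)}

def find_rows(table, value_a, value_b):
    """Row keys with value_a in group 'a:' and value_b in group 'b:'."""
    return sorted(
        key for key, row in table.items()
        if value_a in group_values(row, "a:")
        and value_b in group_values(row, "b:")
    )

table = {}
put(table, "row1", "a:", "city", "x")
put(table, "row1", "b:", "tag", "Y")
put(table, "row2", "a:", "city", "x")
put(table, "row2", "b:", "tag", "Z")

print(find_rows(table, "x", "Y"))  # ['row1']
```

Note that in this model every lookup has to walk all cells of the row and check the prefix, which is one way to picture why reads that only need one group can get slower once everything sits in a single family.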