Re: Get region for row key

2016-07-10 Thread Simon Wang
About the use case: We want to run JDBC queries for each row in a Hive partition. Currently, we use Spark to partition the Hive DataFrame, then run batch queries in `foreachPartition`. Since each partition accesses multiple region servers, there is a lot of overhead. So we are thinking about part

Get region for row key

2016-07-10 Thread Simon Wang
Hi all, Happy weekend! I am writing to ask if there is a way to get the region number for any given row key. For the case where salting is applied, I discovered the `SaltingUtils.getSaltedKey` method, but I am not sure how I can serialize the key as `ImmutableBytesWritable`. In genera
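
Note on the question above: HBase places each row key in the region whose [start key, end key) range contains it, so with the sorted region start keys in hand the lookup reduces to a binary search. The sketch below shows only that idea on plain byte strings. The `region_index` helper and the sample split points are hypothetical; it does not call the HBase or Phoenix APIs and applies no salting.

```python
from bisect import bisect_right

def region_index(start_keys, row_key):
    """Return the index of the region whose range contains row_key.

    start_keys: sorted list of region start keys (bytes). The first
    region's start key is the empty byte string, as in HBase.
    """
    if not start_keys or start_keys[0] != b"":
        raise ValueError("first region must start at the empty key")
    # bisect_right counts the start keys <= row_key; the last of those
    # is the start key of the containing region.
    return bisect_right(start_keys, row_key) - 1

# Hypothetical split points for a four-region table.
starts = [b"", b"g", b"n", b"t"]
print(region_index(starts, b"apple"))  # 0
print(region_index(starts, b"n"))      # 2 (start keys are inclusive)
print(region_index(starts, b"zebra"))  # 3
```

Python compares `bytes` lexicographically as unsigned values, which is the same ordering HBase uses for row keys, so the comparison needs no custom logic here.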

Re: Index tables at scale

2016-07-10 Thread Simon Wang
Hi James, Thanks for the response. In our use case, there is a 256-region table, and we want to build ~12 indexes on it. We have 15 region servers. If each index is in its own table, that would be a total of 221 regions per region server for this single table. I think the extra write time cost
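
Note on the arithmetic above: the 221 figure can be reproduced if one assumes each index table is split into the same number of regions as the data table and that regions are spread evenly across servers. Both are assumptions for illustration, and `regions_per_server` is a hypothetical helper, not part of any library.

```python
def regions_per_server(data_regions, num_indexes, num_servers):
    """Average regions per server for one data table plus its index tables.

    Assumes every index table has as many regions as the data table
    and that regions are balanced evenly across the servers.
    """
    total_tables = 1 + num_indexes            # data table + index tables
    total_regions = data_regions * total_tables
    return total_regions / num_servers

# 256 regions, 12 indexes, 15 region servers: 3328 regions in total.
print(regions_per_server(256, 12, 15))  # about 221.87
```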