Hi all,

We have an HBase table with 1 billion rows, and we want to randomly sample 1M 
rows from it. We are currently trying RandomRowFilter, but it is still very 
slow. If I understand correctly, on the server side RandomRowFilter still 
has to read all 1 billion rows and then returns a random 1% of them, and 
reading 1 billion rows is very slow. Is this true?
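
To make clear what I mean, here is a small plain-Java simulation of the 
behavior I am assuming. It does not use the HBase client API at all; the class 
name BernoulliScanSketch and the method sample are made up for illustration, 
and the per-row coin flip is only my understanding of how a random filter 
works, not a confirmed description of RandomRowFilter's internals.

```java
import java.util.Random;

// Plain-Java simulation only; it does not talk to HBase.
// Assumption: the filter decides per row, so the scan still visits every row.
final class BernoulliScanSketch {

    // Returns {rowsRead, rowsKept} for a scan over totalRows rows where each
    // row is kept independently with probability chance.
    static long[] sample(long totalRows, double chance, long seed) {
        Random rng = new Random(seed);
        long rowsRead = 0;
        long rowsKept = 0;
        for (long row = 0; row < totalRows; row++) {
            rowsRead++;                       // every row is visited
            if (rng.nextDouble() < chance) {  // only a random fraction is returned
                rowsKept++;
            }
        }
        return new long[] {rowsRead, rowsKept};
    }

    public static void main(String[] args) {
        // Scaled down to 1 million rows so it runs quickly.
        long[] result = sample(1_000_000L, 0.01, 42L);
        System.out.println("rows read: " + result[0] + ", rows kept: " + result[1]);
    }
}
```

If this model is right, the cost is driven by the number of rows read, not by 
the number of rows returned, which would explain why the filter does not help.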

Is there a better way to randomly get 1% of the rows from a given table? 
Any ideas would be much appreciated.
We don't know the distribution of the 1 billion rows in advance.

Thanks,
Ming
