Hi Jianshi,

Do you mean that you want to sort the row keys? If so, you don't have to
worry about it: HBase sorts the row keys on its own, but in lexicographic
(byte-wise) order rather than numeric order.
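
To illustrate what lexicographic ordering means in practice, here is a minimal Python sketch. It is not HBase code: the row keys and the padded_key helper are made up for illustration, and it only relies on Python comparing bytes objects byte by byte.

```python
# Hypothetical row keys, for illustration only.
row_keys = [b"row2", b"row10", b"row1"]

# Python compares bytes objects byte by byte (as unsigned values),
# which is the lexicographic ordering described above.
lexicographic = sorted(row_keys)
print(lexicographic)  # "row10" lands before "row2"


def padded_key(n, width=5):
    """Hypothetical helper: zero-pad the numeric part of the key so that
    lexicographic order agrees with numeric order."""
    return b"row" + str(n).zfill(width).encode("ascii")


padded = sorted(padded_key(n) for n in (2, 10, 1))
print(padded)
```

So if the keys carry numbers and the numeric order matters, fixed-width (zero-padded) keys are one common way to make the two orders agree.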

Cheers,
Arun

On Jul 30, 2014 9:02 PM, "Jianshi Huang" <jianshi.hu...@gmail.com> wrote:

> I need to generate from a 2TB dataset and explode it into 4 column families.
>
> The resulting dataset is likely to be 20TB or more. I'm currently using Spark,
> so I sorted by (rk, cf, cq) myself. It's huge, and I'm considering how to
> optimize it.
>
> My question is:
> Should I sort and write each column family one by one, or should I put them
> all together and then sort and write?
>
> Does my question make sense?
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
>
