Well, there is no easy way to re-salt the table. The main problem is that
the salt byte is calculated using the number of buckets, so if you want to
change the number of buckets, every rowkey has to be rewritten. You can
still use an MR job for that, but I would recommend writing the data to
HFiles instead of using upserts. You can find an example of how to
implement this in the CSV bulk load tool sources.
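
To make that concrete, here is a minimal sketch in plain Java (not
Phoenix's actual code -- the real logic lives in
org.apache.phoenix.schema.SaltingUtil) of how a salt byte is derived from a
hash of the rowkey modulo the bucket count, which is why every rowkey
changes when the bucket count changes:

// Illustration only: the salt byte comes from a hash of the rowkey modulo
// the bucket count, so changing SALT_BUCKETS changes every physical rowkey.
// The hash below is a stand-in; Phoenix's real hash differs.
public final class SaltSketch {

    static int hash(byte[] rowKey) {
        int h = 0;
        for (byte b : rowKey) {
            h = 31 * h + b;
        }
        return h;
    }

    static byte saltByte(byte[] rowKey, int numBuckets) {
        // The bucket index depends directly on numBuckets.
        return (byte) ((hash(rowKey) & 0x7fffffff) % numBuckets);
    }

    static byte[] saltedKey(byte[] rowKey, int numBuckets) {
        // The salted key is the salt byte prepended to the original key,
        // so a different bucket count yields a different physical rowkey.
        byte[] salted = new byte[rowKey.length + 1];
        salted[0] = saltByte(rowKey, numBuckets);
        System.arraycopy(rowKey, 0, salted, 1, rowKey.length);
        return salted;
    }
}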

Thanks,
Sergey

On Thu, Feb 15, 2018 at 11:25 AM, Marcell Ortutay <mortu...@23andme.com>
wrote:

> I have a phoenix table that is about 7 TB (unreplicated) in size,
> corresponding to about 500B rows. It was set up a couple of years ago, and we
> have determined that the number of salt buckets it has is not optimal for
> the current query pattern we are seeing. I want to change the number of
> salt buckets as I expect it will improve performance.
>
> I have written a MapReduce job that does this using a subclass of
> TableMapper. It scans the entire old table, and writes the re-salted data
> to a new table. The MapReduce job works on small tables, but I'm having
> trouble getting it to run on the larger table.
>
> I have two questions for anyone who has experience with this:
>
> (1) Are there any publicly available MapReduce jobs for re-salting a
> Phoenix table?
>
> (2) Generally, is there a better approach than MapReduce to re-salt a
> Phoenix table?
>
> Thanks,
> Marcell Ortutay
>
>
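
A hedged sketch of the kind of re-salting TableMapper described in the
quoted message above (the class name, target bucket count, and hash are
illustrative assumptions, not Phoenix's actual salting code):

import java.io.IOException;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;

// Hypothetical re-salting mapper: read a row from the old salted table,
// strip the old salt byte, prepend a salt byte computed against the new
// bucket count, and emit a Put destined for the new table.
public class ResaltMapper extends TableMapper<ImmutableBytesWritable, Put> {

    private static final int NEW_BUCKETS = 64; // assumed target bucket count

    @Override
    protected void map(ImmutableBytesWritable key, Result row, Context context)
            throws IOException, InterruptedException {
        byte[] oldKey = key.copyBytes();

        // Drop the leading salt byte written for the old bucket count.
        byte[] unsalted = new byte[oldKey.length - 1];
        System.arraycopy(oldKey, 1, unsalted, 0, unsalted.length);

        // Re-salt for the new bucket count (stand-in hash, not Phoenix's).
        byte[] newKey = new byte[oldKey.length];
        newKey[0] = (byte) ((hash(unsalted) & 0x7fffffff) % NEW_BUCKETS);
        System.arraycopy(unsalted, 0, newKey, 1, unsalted.length);

        // Copy every cell of the row under the new rowkey.
        Put put = new Put(newKey);
        for (Cell cell : row.rawCells()) {
            put.addColumn(CellUtil.cloneFamily(cell), CellUtil.cloneQualifier(cell),
                    cell.getTimestamp(), CellUtil.cloneValue(cell));
        }
        context.write(new ImmutableBytesWritable(newKey), put);
    }

    private static int hash(byte[] bytes) {
        int h = 0;
        for (byte b : bytes) {
            h = 31 * h + b;
        }
        return h;
    }
}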
