No, you will not "lose" data. You will just have mappers that read from more than one Region (and thus, potentially more than one RegionServer). The hope in this approach is that we can launch Mappers on the same node as the RegionServer hosting your Region and avoid reading any data over the network.

This is just an optimization.
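
To make the point concrete, here is a toy model in plain Python. It is not Phoenix or HBase code: the function names, row keys, and region boundaries are all made up for illustration, and it assumes each mapper is handed a row-key range that was fixed when the job was planned. A Region splitting afterwards changes which Regions serve that range, not which rows fall inside it.

```python
from bisect import bisect_right

def plan_splits(region_starts, end_key):
    """One input split (a key range) per region, fixed at job-planning time."""
    bounds = list(region_starts) + [end_key]
    return [(bounds[i], bounds[i + 1]) for i in range(len(region_starts))]

def scan(rows, region_starts, start, stop):
    """Return the keys in [start, stop) and the indices of the regions serving them."""
    keys = [k for k in sorted(rows) if start <= k < stop]
    regions = {bisect_right(region_starts, k) - 1 for k in keys}
    return keys, regions

# Hypothetical table: 20 rows, two regions when the job is planned.
rows = {f"row{i:02d}" for i in range(20)}
regions_at_plan = ["", "row10"]
splits = plan_splits(regions_at_plan, "~")  # "~" sorts after every key used here

# While the job runs, the first region splits at "row05".
regions_now = ["", "row05", "row10"]

seen = []
regions_per_mapper = []
for start, stop in splits:
    keys, regions = scan(rows, regions_now, start, stop)
    seen.extend(keys)
    regions_per_mapper.append(regions)
    print(f"mapper [{start!r}, {stop!r}) read {len(keys)} rows from regions {sorted(regions)}")

# Every row is still read exactly once; the first mapper now spans two regions.
print(sorted(seen) == sorted(rows))
```

In this sketch the first mapper ends up reading from two Regions, which is where the locality benefit can be lost, but the set of rows covered is unchanged.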

On 4/30/19 10:12 AM, Shawn Li wrote:
Hi,

The number of mappers in a Phoenix MapReduce job is determined by the number of regions in the table. My question is: if a region is split by another injection process while the Phoenix MapReduce job is running, do we lose some data because of the split? We now have more regions than mappers, and the mappers only have the region information from before the split.

Thanks,
Shawn
