No, you will not "lose" data. You will just have Mappers that read from
more than one Region (and thus, potentially more than one RegionServer).
The hope in this approach is that we can launch each Mapper on the same
node as the RegionServer hosting your Region and avoid reading any data
over the network.
Locality is just an optimization; it does not change which rows the job reads.
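To make the reasoning concrete, here is a toy model in plain Python (not Phoenix or HBase code). It assumes each input split is a row-key range fixed when the job is submitted, and that a scan returns every row in its range no matter which Region currently hosts it. The names `plan_splits`, `scan`, and `regions_touched` are made up for this illustration.

```python
def plan_splits(region_starts):
    """One (start, stop) key range per region; stop=None means end of table."""
    stops = region_starts[1:] + [None]
    return list(zip(region_starts, stops))


def scan(rows, start, stop):
    """Rows with start <= key < stop, wherever they are currently hosted."""
    return [r for r in rows if r >= start and (stop is None or r < stop)]


def regions_touched(region_starts, start, stop):
    """Start keys of the regions that overlap the key range [start, stop)."""
    touched = []
    for lo, hi in plan_splits(region_starts):
        if (stop is None or lo < stop) and (hi is None or hi > start):
            touched.append(lo)
    return touched


rows = [f"row{i:03d}" for i in range(100)]

# Assumption: splits are computed once, from the regions at job submission.
regions_before = ["", "row050"]
splits = plan_splits(regions_before)

# While the job runs, the first region splits at row025.
regions_after = ["", "row025", "row050"]

# Each mapper still scans its original key range, so every row is read once.
read = [r for start, stop in splits for r in scan(rows, start, stop)]

# The first mapper's range now spans both daughter regions.
first_mapper_regions = regions_touched(regions_after, *splits[0])
```

In this model the first mapper ends up reading from two Regions instead of one, which costs locality but not rows.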
On 4/30/19 10:12 AM, Shawn Li wrote:
Hi,
The number of map tasks in a Phoenix MapReduce job is determined by the
number of table regions. My question is: if a region is split by another
injection process while the Phoenix MapReduce job is running, do we fail
to read some data because of the split? We would then have more regions
than map tasks, and the map tasks only have the region information from
before the split.
Thanks,
Shawn