For case 1, the HFile would be loaded into the region (via a staging directory). Please see: http://hbase.apache.org/book.html#arch.bulk.load
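[Editorial note: to make the staging-directory mechanism concrete, here is a minimal JDK-only sketch, not the actual HBase code. All paths and names below are invented for illustration; the point is that a bulk load completes by atomically moving (renaming) the prepared HFile from a staging directory into the region's column-family directory, rather than streaming its contents through the RegionServer write path.]

```java
import java.nio.file.*;

// Sketch only: simulates the final step of an HBase bulk load, where a
// staged HFile is renamed into the region's column-family directory.
// In a real cluster this happens on HDFS under
// /hbase/data/<namespace>/<table>/<region>/<cf>/; these local temp
// directories are stand-ins.
public class BulkLoadMoveSketch {
    public static Path loadHFile(Path stagedHFile, Path regionCfDir) throws Exception {
        Files.createDirectories(regionCfDir);
        Path target = regionCfDir.resolve(stagedHFile.getFileName());
        // The atomic move is what makes the HFile visible to the region
        // in one step, with no data copied byte-by-byte.
        return Files.move(stagedHFile, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws Exception {
        Path staging = Files.createTempDirectory("staging");
        Path regionCfDir = Files.createTempDirectory("region").resolve("cf");
        Path hfile = Files.write(staging.resolve("hfile-0001"), new byte[]{1, 2, 3});
        Path loaded = loadHFile(hfile, regionCfDir);
        System.out.println("loaded as: " + loaded.getFileName());
    }
}
```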
On Mon, Jan 22, 2018 at 8:52 AM, vignesh <vignesh...@gmail.com> wrote:

> If it is a bulk load, I use the Spark HBase connector provided by
> Hortonworks. For time-series writes I use the normal HBase client APIs.
>
> So does that mean that in case 2 (client API write) the write to the
> memstore will happen over the network? And in case 1 (bulk load), will
> the HFile be moved to the region that is supposed to hold it, or will it
> be written locally, keeping that as one copy, with the second replica
> going to that particular region?
>
> On Jan 22, 2018 22:16, "Ted Yu" <yuzhih...@gmail.com> wrote:
>
> > Which connector do you use to perform the write?
> >
> > bq. Or spark will wisely launch an executor on that machine
> >
> > I don't think that is the case. Multiple writes may be performed, and
> > they would end up on different region servers. Spark won't provide the
> > affinity described above.
> >
> > On Mon, Jan 22, 2018 at 7:19 AM, vignesh <vignesh...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I have a Spark job which reads some time-series data and pushes it
> > > to HBase using the HBase client API. I am executing this Spark job
> > > on a 10-node cluster. Say that when Spark first kicks off, it picks
> > > machine1, machine2, and machine3 as its executors. Now the job
> > > inserts a row into HBase. Below is my understanding of what happens.
> > >
> > > Based on the row key, a particular region (from META) is chosen, and
> > > the row is pushed to that region server's memstore and WAL; once the
> > > memstore is full, it is flushed to disk. Now assume a particular row
> > > is being processed by an executor on machine2, and the region server
> > > hosting the region the put is destined for is on machine6. Will the
> > > data be transferred from machine2 to machine6 over the network and
> > > then stored in the memstore on machine6? Or will Spark wisely launch
> > > an executor on that machine during the write (if dynamic allocation
> > > is turned on) and push to it?
> > >
> > > --
> > > I.VIGNESH
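[Editorial note: the row-key-to-region routing described in the quoted question can be sketched as a floor lookup over the sorted region start keys, which is in effect what the client's hbase:meta lookup gives it. This is an illustration, not the real client code; the region boundaries and row keys below are invented.]

```java
import java.util.Arrays;

// Sketch: an HBase table is split into regions by sorted start keys, and a
// row belongs to the region whose [startKey, nextStartKey) range contains
// it. Arrays.binarySearch gives us the floor index directly.
public class RegionRouting {
    // startKeys must be sorted; by convention the first region's start key
    // is the empty string. Returns the index of the owning region.
    public static int findRegion(String[] startKeys, String row) {
        int i = Arrays.binarySearch(startKeys, row);
        // Exact hit: the row equals a start key. Miss: binarySearch returns
        // -(insertionPoint) - 1, and the floor is insertionPoint - 1.
        return i >= 0 ? i : -(i + 1) - 1;
    }

    public static void main(String[] args) {
        // Three hypothetical regions: [""..g), [g..p), [p..end)
        String[] startKeys = {"", "g", "p"};
        System.out.println(findRegion(startKeys, "device42#1516633200")); // region 0
        System.out.println(findRegion(startKeys, "host9#1516633200"));    // region 1
    }
}
```

Whichever executor ends up holding a given row, it is this lookup (cached client-side) that decides the target region server, so a put from machine2 destined for a region on machine6 does cross the network; Spark's scheduling plays no part in it.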