Thanks Ram. I will look into EndPoints.

On 20 February 2017 at 12:29, ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com> wrote:
> Yes, there is a way.
>
> Have you seen Endpoints? Endpoints are trigger-like points that allow
> your client to invoke them in parallel in one or more regions using the
> start and end key of the region. They execute in parallel, and then you may
> have to sort out the results as per your need.
>
> But these endpoints have to be running on your region servers, so it is not
> a client-only solution.
> https://blogs.apache.org/hbase/entry/coprocessor_introduction
>
> Be careful when you use them. Since these endpoints run on the server, ensure
> that they are not heavy and do not consume a lot of memory, which can have
> adverse effects on the server.
>
> Regards
> Ram
>
> On Mon, Feb 20, 2017 at 12:18 PM, Anil <anilk...@gmail.com> wrote:
>
> > Thanks Ram.
> >
> > So, you mean that there is no harm in using HTable#getRegionsInRange in
> > the application code.
> >
> > HTable#getRegionsInRange returned a single entry for all my region start
> > and end keys. I need to explore more on this.
> >
> > "If you know the table region's start and end keys you could create
> > parallel scans in your application code." - Is there any way to scan a
> > region in the application code other than the one I put in the original
> > email?
> >
> > "One thing to watch out is that if there is a split in the region then
> > this start and end row may change so in that case it is better you try to
> > get the regions every time before you issue a scan"
> > - Agreed. I am dynamically determining the region start key and end key
> > before initiating scan operations for every initial load.
> >
> > Thanks.
> >
> > On 20 February 2017 at 10:59, ramkrishna vasudevan <
> > ramkrishna.s.vasude...@gmail.com> wrote:
> >
> > > Hi Anil,
> > >
> > > HBase does not directly provide parallel scans. If you know the table
> > > region's start and end keys you could create parallel scans in your
> > > application code.
> > >
> > > In the above code snippet, the intent is right - you get the required
> > > regions and can issue parallel scans from your app.
> > >
> > > One thing to watch out for is that if a region splits, its start and
> > > end rows may change, so it is better to fetch the regions every time
> > > before you issue a scan. Does that make sense to you?
> > >
> > > Regards
> > > Ram
> > >
> > > On Sat, Feb 18, 2017 at 1:44 PM, Anil <anilk...@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am building a use case where I have to load HBase data into an
> > > > in-memory database (IMDB). I am scanning each region and loading the
> > > > data into the IMDB.
> > > >
> > > > I am looking at the parallel scanner
> > > > (https://issues.apache.org/jira/browse/HBASE-8504, HBASE-1935) to
> > > > reduce the load time. HTable#getRegionsInRange(byte[] startKey,
> > > > byte[] endKey, boolean reload) is deprecated, and HBASE-1935 is
> > > > still open.
> > > >
> > > > I see that the Connection from ConnectionFactory is
> > > > HConnectionImplementation by default and creates an HTable instance.
> > > >
> > > > Do you see any issues in using HTable from a Table instance?
> > > >
> > > > for each region {
> > > >     int i = 0;
> > > >     List<HRegionLocation> regions =
> > > >         hTable.getRegionsInRange(scans.getStartRow(), scans.getStopRow(), true);
> > > >
> > > >     for (HRegionLocation region : regions) {
> > > >         startRow = i == 0 ? scans.getStartRow() : region.getRegionInfo().getStartKey();
> > > >         i++;
> > > >         endRow = i == regions.size() ? scans.getStopRow() : region.getRegionInfo().getEndKey();
> > > >     }
> > > > }
> > > >
> > > > Are there any alternatives to achieve a parallel scan?
> > > >
> > > > Thanks
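
For reference, the client-side approach discussed in this thread (clip each region's key range to the scan's start and stop row, then run one scan per slice in parallel) can be sketched in plain Java as below. This is only an illustration under stated assumptions, not HBase API: RegionScanRanges, Range, split and runInParallel are hypothetical names, region boundaries are passed in as plain byte arrays, and the scanOne function stands in for whatever code actually opens a scanner for one slice. An empty byte array is treated as "unbounded", which matches how HBase represents the first region's start key and the last region's end key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class RegionScanRanges {

    /** One [startRow, stopRow) slice of the scan. An empty array means "unbounded". */
    public static final class Range {
        public final byte[] startRow;
        public final byte[] stopRow;

        public Range(byte[] startRow, byte[] stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }
    }

    /** Unsigned lexicographic byte comparison, the order in which row keys sort. */
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /**
     * Clips every region to the scan range and returns one slice per
     * overlapping region. Each element of regions is {startKey, endKey}.
     */
    public static List<Range> split(byte[] scanStart, byte[] scanStop, List<byte[][]> regions) {
        List<Range> out = new ArrayList<>();
        for (byte[][] region : regions) {
            byte[] regionStart = region[0];
            byte[] regionEnd = region[1];
            // Skip regions that end at or before the scan start.
            if (regionEnd.length > 0 && compare(regionEnd, scanStart) <= 0) {
                continue;
            }
            // Skip regions that begin at or after the scan stop.
            if (scanStop.length > 0 && compare(regionStart, scanStop) >= 0) {
                continue;
            }
            // Start: the later of the two starts (an empty start sorts first).
            byte[] start = compare(regionStart, scanStart) > 0 ? regionStart : scanStart;
            // Stop: the earlier of the two stops, treating empty as "no limit".
            byte[] stop;
            if (regionEnd.length == 0) {
                stop = scanStop;
            } else if (scanStop.length == 0) {
                stop = regionEnd;
            } else {
                stop = compare(regionEnd, scanStop) < 0 ? regionEnd : scanStop;
            }
            out.add(new Range(start, stop));
        }
        return out;
    }

    /** Runs scanOne for every slice on a fixed thread pool; results keep slice order. */
    public static <T> List<T> runInParallel(List<Range> ranges, Function<Range, T> scanOne, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (Range r : ranges) {
                Callable<T> task = () -> scanOne.apply(r);
                futures.add(pool.submit(task));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a scan", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a region scan failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real client the region boundaries would be fetched from the cluster right before each load, as Ram suggests, because a split changes them. As far as I know, the RegionLocator obtained from Connection#getRegionLocator is the non-deprecated way to do that, but please check this against the HBase version you run.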