Re: client cache for all region server information?

2012-08-28 Thread Lin Ma
Thanks for the detailed reply, Harsh. Some further comments / thoughts: 1. For the Scan function used in a mapper/reducer, supposing we are using a configured size of 500, I am not sure whether the 500 items returned in one batch call must come from one region server, or whether they could come from multiple region servers

Re: client cache for all region server information?

2012-08-27 Thread Lin Ma
Thanks Harsh. Two more comments / thoughts: 1. For the mapper: a mapper normally runs on the same region server that owns the row-key range for the mapper input, for locality reasons (I am not 100% confident that a mapper always runs on the same region server, please feel

Re: client cache for all region server information?

2012-08-27 Thread Harsh J
Not necessarily consecutive, unless the request itself is so. It only returns 500 rows that match the user's request. The user's requested row range and filters are usually embedded in the Scan object, which is sent to the RS. Whatever is accumulated as the result of the Scan operation (server-si
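To illustrate the point about batches not needing to be consecutive, here is a minimal sketch in plain Python. It is not the HBase API: `scan_batches`, `row_filter`, and `caching` are hypothetical names, and a single in-memory dict stands in for the table. It only shows that each batch holds up to `caching` rows that match the requested range and filter, so the returned keys can skip over non-matching rows.

```python
def scan_batches(rows, start, stop, row_filter, caching=500):
    """Yield lists of up to `caching` matching (key, value) pairs.

    rows: dict standing in for a table (hypothetical, not HBase).
    start/stop: requested row range, start inclusive, stop exclusive.
    row_filter: callable(key, value) -> bool, applied before batching.
    """
    batch = []
    for key in sorted(rows):
        if key < start or key >= stop:
            continue  # outside the requested row range
        if not row_filter(key, rows[key]):
            continue  # rejected by the filter, never sent to the caller
        batch.append((key, rows[key]))
        if len(batch) == caching:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller batch


# Usage: keep only even keys, three rows per batch.
table = {i: i * i for i in range(20)}
batches = list(scan_batches(table, 0, 20, lambda k, v: k % 2 == 0, caching=3))
```

With this toy table the first batch contains keys 0, 2 and 4: three matching rows, but not three consecutive row keys.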

Re: client cache for all region server information?

2012-08-27 Thread Lin Ma
Hi Harsh, I read through the document you referred to, and I am confused by the comment below. My major confusion is: does it mean HBase will transfer 500 consecutive rows to the client (supposing the client mapper wants the row with row-key 100, HBase will return row-keys from 100 to 600 at one time to the client, similar

Re: client cache for all region server information?

2012-08-23 Thread Harsh J
Hi Lin,
On Thu, Aug 23, 2012 at 7:56 PM, Lin Ma wrote:
> Harsh, thanks for the detailed information.
>
> Two more comments,
>
> 1. I want to confirm my understanding is correct. At the beginning client
> cache has nothing, when it issue request for a table, if the region server
> location is not

Re: client cache for all region server information?

2012-08-23 Thread Lin Ma
Harsh, thanks for the detailed information. Two more comments: 1. I want to confirm my understanding is correct. At the beginning the client cache holds nothing; when the client issues a request for a table and the region server location is not known, it will query the root META region to get region server inf

Re: client cache for all region server information?

2012-08-23 Thread Harsh J
Hi Lin,
On Thu, Aug 23, 2012 at 4:31 PM, Lin Ma wrote:
> Thank you Abhishek,
>
> Two more comments,
>
> -- "Client only caches information as needed for its queries and not
> necessarily for 'all' region servers." -- how did client know which region
> server information is necessary to be cached

Re: client cache for all region server information?

2012-08-23 Thread Lin Ma
Thank you Abhishek. Two more comments: -- "Client only caches information as needed for its queries and not necessarily for 'all' region servers." -- how does the client know which region server information needs to be cached in the current HBase implementation? -- When the client loads region ser

Re: client cache for all region server information?

2012-08-22 Thread Pamecha, Abhishek
I think for the refresh case, the client first uses the older region server derived from its cache. It then connects to that older region server, which responds with a failure code. The client then talks to ZooKeeper and then the meta node server to find the new region server for that key. The
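The refresh flow described above can be sketched in plain Python. This is a toy model under stated assumptions, not HBase client code: `ToyCluster`, `ToyClient`, `meta_lookup`, and `NotServingRegionError` are hypothetical names, and the ZooKeeper and meta steps are collapsed into a single `meta_lookup` call. The client caches only the regions it has actually asked about, and drops a cached entry when the old server reports a failure.

```python
import bisect


class NotServingRegionError(Exception):
    """Failure returned by a server that no longer hosts the key's region."""


class ToyCluster:
    """Hypothetical stand-in for the meta lookup plus the region servers."""

    def __init__(self, assignments):
        # {start_key: server}; a region spans [start_key, next start_key).
        # Assumes the first region starts at "" so every key has a region.
        self.assignments = dict(assignments)
        self.meta_lookups = 0

    def _region_for(self, key):
        starts = sorted(self.assignments)
        i = bisect.bisect_right(starts, key) - 1
        end = starts[i + 1] if i + 1 < len(starts) else None
        return starts[i], end, self.assignments[starts[i]]

    def meta_lookup(self, key):
        self.meta_lookups += 1  # count how often the client bypasses its cache
        return self._region_for(key)

    def read(self, server, key):
        if self._region_for(key)[2] != server:
            raise NotServingRegionError(key)
        return f"{key}@{server}"


class ToyClient:
    def __init__(self, cluster):
        self.cluster = cluster
        self.cache = []  # (start, end, server) for regions seen so far only

    def _from_cache(self, key):
        for region in self.cache:
            start, end, _ = region
            if start <= key and (end is None or key < end):
                return region
        return None

    def get(self, key):
        region = self._from_cache(key)
        if region is None:  # cache miss: look the region up and remember it
            region = self.cluster.meta_lookup(key)
            self.cache.append(region)
        try:
            return self.cluster.read(region[2], key)
        except NotServingRegionError:
            # Stale entry: drop it, look up the new location, retry once.
            self.cache.remove(region)
            region = self.cluster.meta_lookup(key)
            self.cache.append(region)
            return self.cluster.read(region[2], key)


# Usage: the region holding "apple" moves from rs1 to rs3 between requests.
cluster = ToyCluster({"": "rs1", "m": "rs2"})
client = ToyClient(cluster)
first = client.get("apple")
cluster.assignments[""] = "rs3"
second = client.get("apple")
```

In this sketch the second `get` first tries the cached server, receives the failure, and only then performs another lookup, which mirrors the order of steps in the message above.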

client cache for all region server information?

2012-08-22 Thread Lin Ma
Hello HBase masters, I am wondering whether, in the current implementation, each HBase client caches all region server information, for example where each region server is (the physical machine hosting it), and also the row-key range managed by each region server. If so, two more questions,