Here's what I'm doing.
This is my get function.
It should retrieve entities in parallel by running a separate task
for each get on a thread pool.
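public String[] get(String tableName, String[] entityIDS) {
    ExecutorService threadExecutor = Executors.newFixedThreadPool(50);
    String[] contents = new String[entityIDS.length];
    // Submit one read task per entity ID; each task fills its own slot.
    for (int i = 0; i < entityIDS.length; i++) {
        threadExecutor.execute(new ReadThread(conf, tableName,
                contents, entityIDS[i], i));
    }
    threadExecutor.shutdown();
    try {
        // Block until every read has finished instead of spinning on
        // isTerminated(), which burns a CPU core while waiting.
        threadExecutor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
    }
    return contents;
}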
And here's the thread's run method:
public void run() {
    long ab = System.currentTimeMillis();
    try {
        Cell c = table.get(entityID, "content:");
        // Test the Cell itself: new String(...) can never be null.
        // (Assumes table.get returns null when the cell is missing.)
        if (c == null) {
            j[index] = "NULL";
        } else {
            j[index] = new String(c.getValue());
        }
    } catch (IOException ex) {
        Logger.getLogger(ReadThread.class.getName()).log(Level.SEVERE,
                null, ex);
    }
    System.out.println(System.currentTimeMillis() - ab
            + " time taken to complete for process " + index);
}
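
For comparison, here is a minimal sketch of the same fan-out written with Callable and invokeAll, so that each task returns its value instead of writing into a shared array. ParallelLookup, getAll and the lookup function are hypothetical names that stand in for the HTable read; nothing in it is HBase API, and it assumes Java 8 or later for the lambdas.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper (not part of HBase): runs one lookup per key on a pool.
class ParallelLookup {
    static String[] getAll(String[] keys, Function<String, String> lookup, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String key : keys) {
                tasks.add(() -> lookup.apply(key));
            }
            // invokeAll blocks until every task is done and returns the
            // futures in the same order as the submitted tasks.
            List<Future<String>> futures = pool.invokeAll(tasks);
            String[] results = new String[keys.length];
            for (int i = 0; i < results.length; i++) {
                String value = futures.get(i).get();
                // Mirror the original convention of storing "NULL" for a miss.
                results[i] = (value == null) ? "NULL" : value;
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reads", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a read failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in for the real read: a fake store that has no entry for "k2".
        Function<String, String> fakeGet = k -> k.equals("k2") ? null : "value-of-" + k;
        String[] out = getAll(new String[] {"k1", "k2", "k3"}, fakeGet, 4);
        System.out.println(String.join(",", out));
    }
}
```

With this shape a failed read surfaces as an exception in the caller rather than as a silent empty slot in the array, which may or may not be what you want for a benchmark.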
I'm creating a new HTable instance for each such thread.
Is this approach correct? Would I get better performance from it?
Will my get queries be executed in parallel by HBase?
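
The timing I am hoping for (total time close to the slowest single get rather than the sum of all gets) can be checked in isolation from HBase. The sketch below is only a simulation: fakeGet is a hypothetical stand-in that sleeps to imitate a blocking round trip, and LatencyDemo and its method names are made up for illustration. It says nothing about how a region server actually schedules requests.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class LatencyDemo {
    // Simulated read: sleeps to imitate a blocking round trip. Not an HBase call.
    static void fakeGet(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs n simulated gets one after another; expected cost is about n * latency.
    static long sequentialMillis(int n, long latency) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            fakeGet(latency);
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    // Runs n simulated gets on n threads; expected cost is about one latency
    // plus thread start-up overhead.
    static long parallelMillis(int n, long latency) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            pool.execute(() -> fakeGet(latency));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("sequential ms: " + sequentialMillis(10, 50));
        System.out.println("parallel ms:   " + parallelMillis(10, 50));
    }
}
```

If the real gets behave like this simulation, the parallel figure should stay near a single get's latency; if they are serialized somewhere along the way, it will drift toward the sequential figure.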
On Wed, Feb 18, 2009 at 11:27 AM, shourabh rawat wrote:
> Does the number of regionservers affect this performance?
>
> On Wed, Feb 18, 2009 at 11:23 AM, shourabh rawat wrote:
>> Hey,
>>
>>> What do you mean by the above when you say read sequentially? Are you
>>> scanning? (Getting a scanner and then nexting through your hbase table?)
>>
>> Well, let's say I have 10 keys stored in hbase and I want to retrieve
>> them.
>>
>> If I do the reads one by one, the total time is the sum of the 'get'
>> times of each key. Could I do the same thing in parallel, so that all
>> the gets occur concurrently and the total time is the max of the time
>> taken by any of these keys rather than the sum of the individual times?
>>
>>
>>> You will have to wait for hbase 0.20.0 or do as Erik suggests and put a
>>> cache in front of hbase. What are you trying to do with hbase? Serve a
>>> website?
>>
>> Yes, sort of, but I want to check performance without the use of a cache
>> (random reads). Can I get such performance in the range of 10 ms
>> with hbase?
>>
>>> Yeah, the RPC keeps a single connection per remote server but channel is
>>> shared by request and receive. Testing in past, the more remote servers,
>>> the better, but even if a few only, concurrent HTables got better throughput
>>> than one running requests in series (the single connection is not fully
>>> occupied by requests and responses).
>>>
>>
>> So by a single connection, do you mean all the gets would be handled
>> sequentially (one by one) by hbase even when the requests come in
>> parallel (even when different HTable instances for the same table are
>> used)? Is there any way I can make them parallel?
>> The hbase master has one port that it specifies, and the other is the port
>> for HDFS (hadoop). What can be done to increase the number of
>> connections, as you said?
>>
>>
>> Thanks for your help.
>>
>