[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-04-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016238#comment-13016238
 ] 

Hudson commented on HBASE-3694:
---

Integrated in HBase-TRUNK #1831 (See 
[https://hudson.apache.org/hudson/job/HBase-TRUNK/1831/])


 high multiput latency due to checking global mem store size in a synchronized 
 function
 --

 Key: HBASE-3694
 URL: https://issues.apache.org/jira/browse/HBASE-3694
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.92.0

 Attachments: 3694-cliffs-counter.txt, Hbase-3694[r1085306], 
 Hbase-3694[r1085306]_2.patch, Hbase-3694[r1085306]_3.patch, 
 Hbase-3694[r1085508]_4.patch, Hbase-3694[r1085592]_7.patch, 
 Hbase-3694[r1085593]_5.patch, Hbase-3694[r1085593]_6.patch


 The problem is that we found multiput latency to be very high.
 In our case, we have almost 22 regions on each RS, and no flushes
 happened during these puts.
 After investigation, we believe the root cause is the function
 getGlobalMemStoreSize, which checks the high water mark of the mem store.
 According to metrics we instrumented in the code, this function takes almost
 40% of the total execution time of a multiput.
 The actual percentage may be even higher. The execution time is spent on
 synchronization contention.
 One solution is to keep a static variable in HRegion that holds the global
 MemStore size instead of recalculating it every time.
 Why use a static variable?
 Since all the HRegion objects in the same JVM share the same memory heap,
 they need to share fate as well.
 The static variable, globalMemStroeSize, naturally reflects the total memory
 usage in the shared heap of this JVM.
 If multiple RS need to run in the same JVM, they still need only one
 globalMemStroeSize.
 If multiple RS run in different JVMs, everything is fine.
 After the change, in our case, the average multiput latency decreased from
 60ms to 10ms.
 I will submit a patch based on the current trunk.
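
 The contrast described above can be sketched roughly as follows. This is
 only an illustration under assumptions: RegionSketch and GlobalSizeSketch
 are hypothetical stand-ins, not the real HRegion or HRegionServer, and the
 AtomicLong is the type discussed in the comments on this issue rather than
 a quote from any attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for HRegion; not the real HBase class.
final class RegionSketch {
    // One counter per JVM, shared by every region living on the same heap.
    static final AtomicLong GLOBAL_MEMSTORE_SIZE = new AtomicLong();

    private final AtomicLong memstoreSize = new AtomicLong();

    // Put path: update this region's size and the shared total together.
    void addToMemstore(long delta) {
        memstoreSize.addAndGet(delta);
        GLOBAL_MEMSTORE_SIZE.addAndGet(delta);
    }

    long getMemstoreSize() {
        return memstoreSize.get();
    }
}

// Hypothetical stand-in for the region server side of the check.
final class GlobalSizeSketch {
    private final List<RegionSketch> regions = new ArrayList<>();

    synchronized void addRegion(RegionSketch r) {
        regions.add(r);
    }

    // Contended pattern: every caller takes the same lock and walks all regions.
    synchronized long slowGlobalSize() {
        long total = 0;
        for (RegionSketch r : regions) {
            total += r.getMemstoreSize();
        }
        return total;
    }

    // Proposed pattern: a single lock-free read of the shared counter.
    long fastGlobalSize() {
        return RegionSketch.GLOBAL_MEMSTORE_SIZE.get();
    }
}
```

 With many handler threads calling the check on every put, slowGlobalSize()
 serializes them on one monitor, while fastGlobalSize() does not block.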

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-04-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015236#comment-13015236
 ] 

stack commented on HBASE-3694:
--

@Liyin Then your last posted patch is good to go?



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-04-02 Thread Liyin Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015075#comment-13015075
 ] 

Liyin Tang commented on HBASE-3694:
---

I think using AtomicLong is pretty safe here:) 



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-26 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011589#comment-13011589
 ] 

Ted Yu commented on HBASE-3694:
---

I agree with Todd's comment about the increment call. There are two reasons.
1. The return value is not used - after switching return type to void, the code 
compiles cleanly.
2. It somewhat exposes the implementation detail of the underlying class (in 
this case AtomicLong).

I am attaching a patch that utilizes Cliff Click Counter which Stack mentioned 
at 25/Mar/11 22:44.

Thanks for the great work Liyin.
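
The two reasons above can be illustrated with a small sketch. The class name
MemstoreAccountingSketch is hypothetical, and the method names only follow the
ones floated in this thread; this is not the code from any attached patch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical accounting class; names are illustrative only.
final class MemstoreAccountingSketch {
    // The counter type is a private detail and never appears in a signature.
    private final AtomicLong globalMemstoreSize = new AtomicLong();

    // Returns void: callers only need the side effect of the update.
    void addMemstoreSize(long delta) {
        globalMemstoreSize.addAndGet(delta);
    }

    // Exposes the current total as a plain long, not as the AtomicLong itself.
    long getGlobalMemstoreSize() {
        return globalMemstoreSize.get();
    }
}
```

Because no signature mentions AtomicLong, the counter implementation could
later be swapped without touching callers.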



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-26 Thread Liyin Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011675#comment-13011675
 ] 

Liyin Tang commented on HBASE-3694:
---

+1 on changing the method name to addAndGetMemstoreSize.
But the Cliff Click Counter is not thread safe.
Are you sure we should use it?
We want everything in RegionServerAccounting to be accurate, since it is
crucial to correct operation.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011682#comment-13011682
 ] 

stack commented on HBASE-3694:
--

Use AtomicLong if alternative is not thread safe.  Name should be 
addMemstoreSize and not addAndGetMemstoreSize if not returning a value (as per 
Todd and Ted above).  Thanks for being persistent Liyin.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011686#comment-13011686
 ] 

stack commented on HBASE-3694:
--

bq. But Cliff Click Counter is not thread safe. 

I thought whole point of the CC Counters was that they were (lockless) 
threadsafe



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-26 Thread Liyin Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011694#comment-13011694
 ] 

Liyin Tang commented on HBASE-3694:
---

Thanks stack and Ted,

I thought CC Counters were thread safe for add, since they keep an array of
counters internally to avoid cache contention,
but it looks like they are not thread safe for get.

From their javadoc:
public long get()
Current value of the counter. Since other threads are updating furiously the 
value is only approximate, but it includes all counts made by the current 
thread. Requires a pass over the internally striped counters.
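
The same trade-off can be seen with java.util.concurrent.atomic.LongAdder
(Java 8 and later). Note that this is a different class from the Cliff Click
Counter, swapped in here only because it ships with the JDK and is also a
striped counter whose sum() is documented as not being an atomic snapshot.
The class and method names in the sketch are hypothetical.

```java
import java.util.concurrent.atomic.LongAdder;

final class StripedCounterSketch {
    // Starts `threads` writers that each add 1 `perThread` times,
    // waits for all of them, and returns the final sum.
    static long countWithWriters(int threads, int perThread) {
        final LongAdder counter = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    // Adds are never lost, whichever internal cell they land in.
                    counter.add(1);
                    // A sum() taken at this point could miss adds that other
                    // threads are making concurrently; it is only approximate.
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        // No updates are in flight after join(), so this read is exact.
        return counter.sum();
    }
}
```

In other words, the approximation only concerns reads that race with
writers; no increments are dropped.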



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-26 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011703#comment-13011703
 ] 

Ted Yu commented on HBASE-3694:
---

Here is the javadoc for add_if_mask(), which is called by add():
{code}
  // The sum can overflow or 'x' can contain bits in
  // the mask. Value is CAS'd so no counts are lost.  The CAS is retried until
  // it succeeds or bits are found under the mask.
{code}
add() passes a mask of 0, which means the call cannot fail.
Looking further into the failure case inside add_if_mask(), we can verify the
above assumption:
{code}
  if( (old&mask) != 0 ) return old; // Failed for bit-set under mask
{code}
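
The retry behaviour described in that javadoc can be sketched on a single
AtomicLong. This is a simplified illustration, not the library's code: the
real counter stripes the value over a table of cells, and CasAddSketch and
its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CasAddSketch {
    // Adds x to the cell unless a bit under `mask` is set in the current value.
    // Returns the value seen before the attempt; the add took effect
    // if and only if (returned & mask) == 0.
    static long addIfMask(AtomicLong cell, long x, long mask) {
        while (true) {
            long old = cell.get();
            if ((old & mask) != 0) {
                return old; // failed: a bit is set under the mask
            }
            if (cell.compareAndSet(old, old + x)) {
                return old; // success: the update was applied atomically
            }
            // Another thread changed the cell between get() and the CAS; retry.
        }
    }

    // With a mask of 0 the failure branch can never be taken.
    static void add(AtomicLong cell, long x) {
        addIfMask(cell, x, 0L);
    }
}
```

Since (old & 0) is always 0, add() keeps retrying the CAS until it succeeds,
which is why no counts are lost on the add path.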




[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011717#comment-13011717
 ] 

stack commented on HBASE-3694:
--

@Liyin Approx count on get is fine by me.  If you need it to be 'exact', go w/ 
AtomicLong.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011414#comment-13011414
 ] 

stack commented on HBASE-3694:
--

Patch looks good but I stumble when I come to this:

{code}
+  /**
+   * @return the global mem store size in the region server
+   */
+  public AtomicLong getGlobalMemstoreSize();
{code}

Here we are adding a getter for a single value to the RSS interface.  RSS is
usually about more macro-type services than a single data member value.  A user
of RSS would rarely be interested in this single value.  More useful, I'd
think, would be if the RSS returned a class that allowed the client a
(read-only) view on multiple RS values; e.g. above there is talk of a
MemoryAccountingManager, which I imagine would have this memstore size among
other values.

We could change getRpcMetrics to be a generic getMetrics that would return a
RegionServerMetrics instance that would include an instance of HBaseRpcMetrics
and the current state of the above counter?







[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread Liyin Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011421#comment-13011421
 ] 

Liyin Tang commented on HBASE-3694:
---

Thanks Stack.
I think adding globalMemstoreSize to RegionServerMetrics makes more sense
than adding a new class MemoryAccountingManager?



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011437#comment-13011437
 ] 

Todd Lipcon commented on HBASE-3694:


I don't want to conflate metrics (things that get exported for monitoring 
purposes) with internal accounting (things which are necessarily correct and 
up-to-date for proper functioning of the server).

Some internal accounting may be exposed as metrics, but the two subsystems are 
quite separate in my mind.

Does that make sense?
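
One way to picture the separation, as a sketch only: AccountingSketch and
MetricsSketch are hypothetical names, and the sampling method stands in for
whatever periodic reporting the server does.

```java
import java.util.concurrent.atomic.AtomicLong;

// Internal accounting: exact and always current; server decisions read this.
final class AccountingSketch {
    private final AtomicLong globalMemstoreSize = new AtomicLong();

    void addMemstoreSize(long delta) {
        globalMemstoreSize.addAndGet(delta);
    }

    long getGlobalMemstoreSize() {
        return globalMemstoreSize.get();
    }

    // Example of a decision that needs the up-to-date value.
    boolean isAboveHighWaterMark(long limitBytes) {
        return getGlobalMemstoreSize() >= limitBytes;
    }
}

// Metrics: a sampled copy exported for monitoring; allowed to lag behind.
final class MetricsSketch {
    private volatile long lastSampledMemstoreSize;

    // Meant to be called periodically by a reporting thread.
    void sample(AccountingSketch accounting) {
        lastSampledMemstoreSize = accounting.getGlobalMemstoreSize();
    }

    long getLastSampledMemstoreSize() {
        return lastSampledMemstoreSize;
    }
}
```

The metric is derived from the accounting value, never the other way round,
so a stale sample cannot affect flush decisions.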



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011444#comment-13011444
 ] 

Ted Yu commented on HBASE-3694:
---

memstoreSizeMB is a member of RegionServerMetrics and is set once per
hbase.regionserver.msginterval.
See line 1162 in HRegionServer.java:
{code}
this.metrics.memstoreSizeMB.set((int) (memstoreSize / (1024 * 1024)));
{code}
memstoreSizeMB is of type MetricsIntValue, which is a subclass of MetricsBase 
and stores its value in:
{code}
  private int value;
{code}
We can create a MetricsAtomicLongValue class with the following signature:
{code}
public class MetricsAtomicLongValue extends MetricsBase{
  private AtomicLong value;  
  private boolean changed;
{code}
If we reach agreement on adding this method to RegionServerServices (which is 
available in HRegionServer and being used by MemStoreFlusher):
{code}
  /**
   * @return Region server metrics instance.
   */
  public RegionServerMetrics getMetrics() {
{code}

then we can change memstoreSizeMB to memstoreSize which is of type 
MetricsAtomicLongValue and blend Liyin's changes onto memstoreSize.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread Liyin Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011449#comment-13011449
 ] 

Liyin Tang commented on HBASE-3694:
---

The internal accounting makes sense. I just think MemoryAccountingManager is 
too specific.
We need something more general that we can reuse in the future, such as 
RegionServerAccountingManager.

Thoughts?
Liyin



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011452#comment-13011452
 ] 

Jonathan Gray commented on HBASE-3694:
--

Do we really want to put things like this into RegionServerMetrics?  That class 
is a mess and is currently only used for the publishing of our metrics (not 
used for internal state tracking).  And we should avoid the hadoop Metrics* 
classes like the plague... heavily synchronized and generally confusing.

My vote would be to add a new class, maybe {{RegionServerHeapManager}} or 
something like that... might be a good opportunity to clean up and centralize 
the code related to that.  But it could just hold this one AtomicLong for now.  
Agree that adding a new interface method just for the long is not ideal since 
it buys us nothing down the road.  Better to add something new that we can use 
later.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011457#comment-13011457
 ] 

Todd Lipcon commented on HBASE-3694:


+1 to jgray's suggestion. Please please please let's not conflate metrics and 
something that is crucial to correct operation.

In terms of overall design, I would love to see RegionServerServices evolve 
into something like an IOC container - it's just used to provide wiring 
between the different components that make up a running RS. That makes mocking 
easier and should help with general modularity.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011461#comment-13011461
 ] 

stack commented on HBASE-3694:
--

bq. In terms of overall design, I would love to see RegionServerServices evolve 
into something like an IOC container

Yeah, that's the plan.  Need to keep it macro though.

Args on why this is not 'metrics' are good.  I go along.

Just say no to atomic long counters now that we have Cliff Click counters in 
our CLASSPATH.

bq. The internal accounting makes sense. I just think MemoryAccountingManager 
is too specific.
We need something more general to reuse it in the future, 
RegionServerAccountingManager.

Agreed.  Should be more than just about Memory accounting (and agree w/ Jon 
that it could be a path out of our hairball HRegionServer class).  

For you Liyin and this patch, I think just make a class named 
RegionServerAccounting -- drop Manager I'd say, that might be a little 
megalomaniacal -- and put just this one counter in it (as per Jon).  Add 
getRegionServerAccounting to RSS Interface.
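
A rough sketch of the shape being suggested here, under stated assumptions: only the names RegionServerAccounting and getRegionServerAccounting come from this comment. The method names on the accounting class and the cut-down RegionServerServicesSketch interface are placeholders, not the real RSS interface or the committed patch.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Holds region-server-wide accounting state. For now it carries a single
 * counter: the total memstore size across all regions on this server.
 */
class RegionServerAccounting {
  private final AtomicLong globalMemstoreSize = new AtomicLong(0);

  /** @return the current global memstore size in bytes. */
  long getGlobalMemstoreSize() {
    return globalMemstoreSize.get();
  }

  /** Placeholder name; adjusts the total by delta (may be negative). */
  long addAndGetGlobalMemstoreSize(long delta) {
    return globalMemstoreSize.addAndGet(delta);
  }
}

/** Hypothetical one-method slice of the RSS interface. */
interface RegionServerServicesSketch {
  RegionServerAccounting getRegionServerAccounting();
}
```

Keeping the counter behind its own small class means the RSS interface grows by one getter now, and later accounting state can be added without touching the interface again.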





[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011467#comment-13011467
 ] 

Todd Lipcon commented on HBASE-3694:


Sounds good to me.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011561#comment-13011561
 ] 

stack commented on HBASE-3694:
--

Please do not use HBaseClusterTestCase as the basis for your test.  It's been 
deprecated: ' * @deprecated Use junit4 and {@link HBaseTestingUtility}'.  Sorry 
about that.  We should have made sure you got the memo on that one.  The 
alternative, HBaseTestingUtility, has a cleaner means of creating a 
multiregion table.

Fix the copyright on your test -- also, the javadoc is copy/pasted from 
elsewhere -- and in your accounting class. It's 2011!  RegionServerAccounting 
needs a bit of class javadoc to say what the class is for.

I'd write 'private final AtomicLong atomicGlobalMemstoreSize = new 
AtomicLong(0);' rather than wait to assign in the constructor (no need for a 
constructor then).  I'd rename incGlobalMemstoreSize as 
addAndGetGlobalMemstoreSize as in AtomicLong and I'd return the current value 
as per AtomicLong (why not?).  I'd also call it getAndAddMemstoreSize rather 
than incMemoryUsage.

Otherwise the patch looks great Liyin.  Thanks for doing this.

 high multiput latency due to checking global mem store size in a synchronized 
 function
 --

 Key: HBASE-3694
 URL: https://issues.apache.org/jira/browse/HBASE-3694
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: Hbase-3694[r1085306], Hbase-3694[r1085306]_2.patch, 
 Hbase-3694[r1085306]_3.patch, Hbase-3694[r1085508]_4.patch, 
 Hbase-3694[r1085593]_5.patch




[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-25 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011570#comment-13011570
 ] 

Todd Lipcon commented on HBASE-3694:


I don't think we should return the current value from the increment call unless 
it's necessary. For striped counters and such, a blind increment can often be 
cheaper than an increment-and-get. Isn't this the case with the Cliff Click 
Counters?
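
To illustrate the distinction, a small sketch using the JDK's own striped counter, java.util.concurrent.atomic.LongAdder (Java 8 and later). This is a stand-in chosen only because it ships with the JDK; it is not the Cliff Click counter class mentioned in this thread, and BlindIncrementSketch is a made-up name.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

class BlindIncrementSketch {
  // Single-cell counter: every add contends on one memory location, but
  // the new total comes back from the same call.
  private final AtomicLong single = new AtomicLong(0);

  // Striped counter: adds are spread over several internal cells, so the
  // total is only known by summing them. add() therefore returns nothing.
  private final LongAdder striped = new LongAdder();

  /** Increment-and-get: the caller receives the updated total. */
  long addAndGet(long delta) {
    return single.addAndGet(delta);
  }

  /** Blind increment: no total is computed on the write path. */
  void add(long delta) {
    striped.add(delta);
  }

  /** Separate, more expensive read of the striped total. */
  long sum() {
    return striped.sum();
  }
}
```

An accounting API whose increment method returns void leaves room to switch to a counter of the second kind later; one that promises the new value on every increment forces a sum on each write.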



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010974#comment-13010974
 ] 

stack commented on HBASE-3694:
--

RSS has:

{code}
  /**
   * Returns a reference to the RPC server metrics.
   */
  public HBaseRpcMetrics getRpcMetrics();
{code}

Could you add your counter to the HBaseRpcMetrics class or would that be weird?

 high multiput latency due to checking global mem store size in a synchronized 
 function
 --

 Key: HBASE-3694
 URL: https://issues.apache.org/jira/browse/HBASE-3694
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Liyin Tang



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010980#comment-13010980
 ] 

Ted Yu commented on HBASE-3694:
---

How about piggybacking on HServerInfo:
{code}
  public HServerInfo getServerInfo();
{code}




[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010981#comment-13010981
 ] 

Todd Lipcon commented on HBASE-3694:


RpcMetrics seems like the wrong spot to me.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-24 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010983#comment-13010983
 ] 

Jonathan Gray commented on HBASE-3694:
--

Neither of these seems right.  Is there an issue with adding another method for this?



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010984#comment-13010984
 ] 

stack commented on HBASE-3694:
--

We could add a new method.  Just trying to keep the methods to a minimum 
because mocking the Interface becomes a pain if there are a million methods to 
fill in (looks ugly too in tests).  But go for it. Add a getter for a Counts 
class or something.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010986#comment-13010986
 ] 

Ted Yu commented on HBASE-3694:
---

@Liyin can you run your test after incorporating HBASE-3654?
I just wonder how much influence the synchronization of onlineRegions might 
have on this issue.



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011029#comment-13011029
 ] 

Ted Yu commented on HBASE-3694:
---

HRegion already has this:
{code}
  final RegionServerServices rsServices;
{code}
You can reuse it instead of adding an HRegionServer reference directly.
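
As a rough illustration of that suggestion, the sketch below has a region report size changes through a services interface it already holds instead of through a direct server reference. The interface ServicesSketch, its method addAndGetGlobalMemstoreSize, and the class RegionWithServices are hypothetical names, not the actual RegionServerServices API. The null check assumes the reference may be absent when a region is constructed outside a running region server.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical slice of a services interface; the method name is an assumption.
interface ServicesSketch {
    long addAndGetGlobalMemstoreSize(long delta);
}

// Illustrative stand-in for a region that holds a services reference.
class RegionWithServices {
    // May be null when the region is not hosted by a server (assumption).
    private final ServicesSketch rsServices;
    private final AtomicLong memstoreSize = new AtomicLong();

    RegionWithServices(ServicesSketch rsServices) {
        this.rsServices = rsServices;
    }

    // Update the per-region size and, if a server is attached, its global total.
    long addAndGetMemstoreSize(long delta) {
        if (rsServices != null) {
            rsServices.addAndGetGlobalMemstoreSize(delta);
        }
        return memstoreSize.addAndGet(delta);
    }

    long getMemstoreSize() {
        return memstoreSize.get();
    }
}
```

With this shape the region never needs to know which concrete class implements the interface, and a region created without a server still tracks its own size.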




[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-23 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010376#comment-13010376
 ] 

Todd Lipcon commented on HBASE-3694:


I don't think a static variable is the way to go. In minicluster tests, you 
want to separately count memory for each RS, even though they share the same 
heap.

Instead, I think we should add it to HRegionServer, or a new class like 
'MemoryAccountingManager' which is accessible through HRegionServer. Thoughts?
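
A minimal sketch of what such a per-server accounting object could look like follows. Only the class name MemoryAccountingManager comes from the comment above. Its methods, and the RegionServerSketch class with its getMemoryAccounting accessor, are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Class name from the suggestion above; the body is an assumed design.
class MemoryAccountingManager {
    private final AtomicLong globalMemstoreSize = new AtomicLong();

    // Lock-free read of the current total for one region server.
    long getGlobalMemstoreSize() {
        return globalMemstoreSize.get();
    }

    // Apply a size change (positive on write, negative on flush) and return the new total.
    long addAndGetGlobalMemstoreSize(long delta) {
        return globalMemstoreSize.addAndGet(delta);
    }
}

// Illustrative stand-in for a region server that owns its own accounting instance.
class RegionServerSketch {
    private final MemoryAccountingManager accounting = new MemoryAccountingManager();

    MemoryAccountingManager getMemoryAccounting() {
        return accounting;
    }
}
```

Because each server owns its own instance, two servers started in the same JVM, as in a minicluster test, keep separate totals even though they share one heap.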



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-23 Thread Liyin Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010492#comment-13010492
 ] 

Liyin Tang commented on HBASE-3694:
---

At first, we tried adding this variable to the RS and passing it to the Region 
via its constructor.
However, since the HRegion is not created by the RS, this is hard to implement: 
in most cases NULL would be passed to the HRegion constructor. 
Of course, we could set the RegionServer reference on the Regions every time, 
but that would make the code much more complicated. 

As long as this change conflicts ONLY with some unit tests, we can make it work 
for that case. 
For example, we can write a function in the minicluster tests to get the global 
mem store size for a given Region Server.

Any thoughts? :)

Liyin





[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-23 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010513#comment-13010513
 ] 

Todd Lipcon commented on HBASE-3694:


It just seems to me that static state is the Java equivalent of ugly global 
variables. They always come back to bite us in some way or another later on.

I don't have the code handy at the moment (booted into Windows to work on a ppt 
:( ) but it seems like there has to be some way that the HRegion can get at the 
region server. I thought it had a RegionServerServices instance somewhere 
inside?



[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-23 Thread ryan rawson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010518#comment-13010518
 ] 

ryan rawson commented on HBASE-3694:


Let's avoid the static if at all possible. Ditto Todd, it makes life hard
later.




[jira] [Commented] (HBASE-3694) high multiput latency due to checking global mem store size in a synchronized function

2011-03-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13010521#comment-13010521
 ] 

Ted Yu commented on HBASE-3694:
---

HRegion has a reference to RegionServerServices.
HRegionServer is the only implementer of RegionServerServices.
