[jira] [Commented] (HBASE-7263) Investigate more fine grained locking for checkAndPut/append/increment

2012-12-27 Thread shen guanpu (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539873#comment-13539873 ]

shen guanpu commented on HBASE-7263:


"HBASE-7051 and HBASE-4583 implement option #1. The downside, as mentioned, is 
that you have to wait for updates on other rows, since MVCC is per-row."

Do you mean the MVCC is not per-row today, and that option 2 ("Have an MVCC per-row 
(table configuration): this avoids the unnecessary contention of 1)") is what would make it per-row?

> Investigate more fine grained locking for checkAndPut/append/increment
> --
>
> Key: HBASE-7263
> URL: https://issues.apache.org/jira/browse/HBASE-7263
> Project: HBase
>  Issue Type: Improvement
>  Components: Transactions/MVCC
>Reporter: Gregory Chanan
>Assignee: Gregory Chanan
>Priority: Minor
>
> HBASE-7051 lists 3 options for fixing an ACID-violation wrt checkAndPut:
> {quote}
> 1) Waiting for the MVCC to advance for read/updates: the downside is that you 
> have to wait for updates on other rows.
> 2) Have an MVCC per-row (table configuration): this avoids the unnecessary 
> contention of 1)
> 3) Transform the read/updates to write-only with rollup on read.. E.g. an 
> increment would just have the number of values to increment.
> {quote}
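> To make option 3 concrete, here is a hypothetical rollup-on-read sketch for increments; 
> the class below is purely illustrative and is not how HBase's Increment is actually implemented:
> {code}
> // Illustrative only: option 3 turns an increment into a pure write of a delta,
> // and defers the sum ("rollup") to read time.
> import java.util.concurrent.ConcurrentLinkedQueue;
> 
> class RollupCounter {
>   // Stand-in for the delta cells that would be written under one row/column.
>   private final ConcurrentLinkedQueue<Long> deltas = new ConcurrentLinkedQueue<Long>();
> 
>   // Write path: no read and no wait; just record the amount to increment by.
>   void increment(long amount) {
>     deltas.add(amount);
>   }
> 
>   // Read path: roll the recorded deltas up into the current value.
>   long read() {
>     long sum = 0;
>     for (long delta : deltas) {
>       sum += delta;
>     }
>     return sum;
>   }
> }
> {code}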
> HBASE-7051 and HBASE-4583 implement option #1.  The downside, as mentioned, 
> is that you have to wait for updates on other rows, since MVCC is per-row.
> Another option occurred to me that I think is worth investigating: rely on a 
> row-level read/write lock rather than MVCC.
> Here is pseudo-code for what exists today for read/updates like checkAndPut:
> {code}
> (1)  Acquire RowLock
> (1a) BeginMVCC + Finish MVCC
> (2)  Begin MVCC
> (3)  Do work
> (4)  Release RowLock
> (5)  Append to WAL
> (6)  Finish MVCC
> {code}
> Write-only operations (e.g. puts) are the same, just without step 1a.
> Now, consider the following instead:
> {code}
> (1)  Acquire RowLock
> (1a) Grab+Release RowWriteLock (instead of BeginMVCC + Finish MVCC)
> (1b) Grab RowReadLock (new step!)
> (2)  Begin MVCC
> (3)  Do work
> (4)  Release RowLock
> (5)  Append to WAL
> (6)  Finish MVCC
> (7)  Release RowReadLock (new step!)
> {code}
> As before, write-only operations are the same, just without step 1a.
> The difference here is that writes grab a row-level read lock and hold it 
> until the MVCC is completed.  The nice property that this gives you is that 
> read/updates can tell when the MVCC is done on a per-row basis, because they 
> can just try to acquire the write-lock which will block until the MVCC is 
> completed for that row in step 7.
> There is overhead for acquiring the read lock that I need to measure, but it 
> should be small, since there will never be any blocking on acquiring the 
> row-level read lock.  This is because the read lock can only block if someone 
> else holds the write lock, but both the write and read lock are only acquired 
> under the row lock.
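> To make the proposed steps 1a/1b/7 concrete, here is a minimal Java sketch, assuming a 
> per-row ReentrantReadWriteLock kept in a map keyed by row; the class and method names 
> are illustrative only, not the actual HRegion internals:
> {code}
> // Illustrative only: per-row read/write locks layered beside the existing row lock.
> import java.nio.ByteBuffer;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.locks.ReentrantReadWriteLock;
> 
> class PerRowWriteTracker {
>   private final ConcurrentHashMap<ByteBuffer, ReentrantReadWriteLock> locks =
>       new ConcurrentHashMap<ByteBuffer, ReentrantReadWriteLock>();
> 
>   private ReentrantReadWriteLock lockFor(byte[] row) {
>     ByteBuffer key = ByteBuffer.wrap(row);
>     ReentrantReadWriteLock rwl = locks.get(key);
>     if (rwl == null) {
>       ReentrantReadWriteLock created = new ReentrantReadWriteLock();
>       ReentrantReadWriteLock existing = locks.putIfAbsent(key, created);
>       rwl = (existing == null) ? created : existing;
>     }
>     return rwl;
>   }
> 
>   // Step 1a for read/updates: grab+release the write lock.  This only returns once
>   // every earlier writer has released its read lock in step 7, i.e. once the MVCC
>   // has completed for this row.
>   void waitForPriorWrites(byte[] row) {
>     ReentrantReadWriteLock rwl = lockFor(row);
>     rwl.writeLock().lock();
>     rwl.writeLock().unlock();
>   }
> 
>   // Step 1b: every write grabs the read lock while still holding the row lock...
>   void beginWrite(byte[] row) {
>     lockFor(row).readLock().lock();
>   }
> 
>   // Step 7: ...and releases it only after its MVCC transaction has finished.
>   void finishWrite(byte[] row) {
>     lockFor(row).readLock().unlock();
>   }
> }
> {code}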
> I ran a quick test of this approach over a region (this directly interacts 
> with HRegion, so no client effects):
> - 30 threads
> - 5000 increments per thread
> - 30 columns per increment
> - Each increment uniformly distributed over 500,000 rows
> - 5 trials
> Better-Than-Theoretical-Max (no locking or MVCC on step 1a): 10362.2 ms
> Today: 13950 ms
> The locking approach: 10877 ms
> So it looks like an improvement, at least wrt increment.  As mentioned, I 
> need to measure the overhead of acquiring the read lock for puts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7263) Investigate more fine grained locking for checkAndPut/append/increment

2012-12-27 Thread shen guanpu (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539865#comment-13539865 ]

shen guanpu commented on HBASE-7263:


Hi Gregory Chanan,
Does your test operate on just one rowkey?
How much slower will it be when you put to different rowkeys? As you mentioned, you 
have to wait for the MVCC on other rows.



[jira] [Commented] (HBASE-6465) Load balancer repeatedly close and open region in the same regionserver.

2012-07-26 Thread shen guanpu (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423706#comment-13423706 ]

shen guanpu commented on HBASE-6465:


OK, thanks.
I don't quite catch the rule, sorry!


[jira] [Created] (HBASE-6465) Load balancer repeatedly close and open region in the same regionserver.

2012-07-26 Thread shen guanpu (JIRA)
shen guanpu created HBASE-6465:
--

 Summary: Load balancer repeatedly close and open region in the 
same regionserver.
 Key: HBASE-6465
 URL: https://issues.apache.org/jira/browse/HBASE-6465
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.0
Reporter: shen guanpu


Looking through the master and regionserver logs, I find that the load balancer
repeatedly closes and opens a region on the same regionserver (once per
hbase.balancer.period).
Is this a bug in the load balancer, and how can I dig into it or avoid it?
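For reference, how often the balancer runs is controlled by hbase.balancer.period
(milliseconds). A minimal sketch of reading it with the standard HBase Configuration
API follows; the 300000 ms default shown is an assumption about this cluster's setup:
{code}
// Illustrative only: print the balancer period this cluster is running with.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ShowBalancerPeriod {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // 300000 ms (5 minutes) is the usual default; check hbase-site.xml for overrides.
    int periodMs = conf.getInt("hbase.balancer.period", 300000);
    System.out.println("Load balancer runs every " + periodMs + " ms");
  }
}
{code}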


The HBase and Hadoop versions are:
HBase Version 0.94.0, r1332822
Hadoop Version 0.20.2-cdh3u1, rbdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638
The following is a detailed log for the same region,
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956,
and it repeats again and again:
2012-07-16 00:12:49,843 INFO org.apache.hadoop.hbase.master.HMaster: balance
hri=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.,
src=192.168.1.2,60020,1342017399608, dest=192.168.1.2,60020,1342002082592
2012-07-16 00:12:49,843 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of
region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
(offlining)
2012-07-16 00:12:49,843 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
master:6-0x4384d0a47f40068 Creating unassigned node for
93caf5147d40f5dd4625e160e1b7e956 in a CLOSING state
2012-07-16 00:12:49,845 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to
192.168.1.2,60020,1342017399608 for region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
2012-07-16 00:12:50,555 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_CLOSED, server=192.168.1.2,60020,1342017399608,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,555 DEBUG
org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED
event for 93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,555 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE;
was=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
state=CLOSED, ts=1342368770556, server=192.168.1.2,60020,1342017399608
2012-07-16 00:12:50,555 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
master:6-0x4384d0a47f40068 Creating (or updating) unassigned node for
93caf5147d40f5dd4625e160e1b7e956 with OFFLINE state
2012-07-16 00:12:50,558 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=M_ZK_REGION_OFFLINE, server=10.75.18.34,6,1342017369575,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,558 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan
for
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
destination server is 192.168.1.2,60020,1342002082592
2012-07-16 00:12:50,558 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan
for region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.;
plan=hri=trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.,
src=192.168.1.2,60020,1342017399608, dest=192.168.1.2,60020,1342002082592
2012-07-16 00:12:50,558 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Assigning region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
to 192.168.1.2,60020,1342002082592
2012-07-16 00:12:50,574 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_OPENING, server=192.168.1.2,60020,1342017399608,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,635 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_OPENING, server=192.168.1.2,60020,1342017399608,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,639 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: Handling
transition=RS_ZK_REGION_OPENED, server=192.168.1.2,60020,1342017399608,
region=93caf5147d40f5dd4625e160e1b7e956
2012-07-16 00:12:50,639 DEBUG
org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED
event for
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
from 192.168.1.2,60020,1342017399608; deleting unassigned node
2012-07-16 00:12:50,640 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
master:6-0x4384d0a47f40068 Deleting existing unassigned node for
93caf5147d40f5dd4625e160e1b7e956 that is in expected state
RS_ZK_REGION_OPENED
2012-07-16 00:12:50,641 DEBUG
org.apache.hadoop.hbase.master.AssignmentManager: The znode of region
trackurl_status_list,zO6u4o8,1342291884831.93caf5147d40f5dd4625e160e1b7e956.
has been deleted.
2012-07-16 00:12:50,641 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign:
master:6-0x4384d0a47f40068 Successfully deleted unassi