[jira] [Commented] (HBASE-24095) HBase Bad Substitution ERROR on hadoop-functions.sh
[ https://issues.apache.org/jira/browse/HBASE-24095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388274#comment-17388274 ]

Gaurav Kanade commented on HBASE-24095:
---
Have you checked whether you might be hitting this? https://issues.apache.org/jira/browse/HADOOP-16167

> HBase Bad Substitution ERROR on hadoop-functions.sh
> ---
>
> Key: HBASE-24095
> URL: https://issues.apache.org/jira/browse/HBASE-24095
> Project: HBase
> Issue Type: Bug
> Components: hadoop3
> Affects Versions: 2.2.3
> Environment: hbase 2.2.3 with hadoop 3.2.1:
> Installed both hadoop and hbase according to the Apache "Getting Started"
> guides. For hbase, I have removed the hadoop jar files it downloaded with,
> which do not match my current version of hadoop, as per the documentation.
> Reporter: Alex Swarner
> Priority: Major
>
> Any time I make a call to hbase (e.g. "hbase version" or "hbase-daemon.sh
> start master"), I receive this error message:
> */usr/hdeco/hadoop/bin/../libexec/hadoop-functions.sh: line 2366:
> HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution*
> */usr/hdeco/hadoop/bin/../libexec/hadoop-functions.sh: line 2461:
> HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution*
>
> "hbase version" does provide version information after this error message,
> but I am unable to start the hbase master, so I am unable to use hbase
> further.
>
> I have never posted in any forum before, so let me know if more information
> is needed.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
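The variable names in the error message contain dots, which are not legal in a bash identifier, so expanding a variable with such a name can fail with "bad substitution". The following is only a sketch of that idea in Python: it assumes the name is built as PROGRAM_SUBCOMMAND_SUFFIX in upper case, and both function names are hypothetical (they are not part of Hadoop or HBase).

```python
import re

# A bash identifier: letters, digits and underscores, not starting with a digit.
BASH_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_subcmd_var(program: str, subcommand: str, suffix: str) -> str:
    """Hypothetical model of deriving a per-subcommand variable name."""
    return f"{program}_{subcommand}_{suffix}".upper()


def is_valid_bash_name(name: str) -> bool:
    """True if the whole string is a legal bash variable name."""
    return BASH_NAME.fullmatch(name) is not None


# A Java class name used as the subcommand yields a name containing dots.
class_based = build_subcmd_var(
    "hadoop", "org.apache.hadoop.hbase.util.GetJavaProperty", "USER"
)
print(class_based, is_valid_bash_name(class_based))

# A plain subcommand such as "version" yields a legal name.
plain = build_subcmd_var("hadoop", "version", "USER")
print(plain, is_valid_bash_name(plain))
```

A guard of this kind (skip the lookup when the derived name is not a legal identifier) is one possible way to avoid the error; whether it matches the fix discussed in HADOOP-16167 should be checked against that issue.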
[jira] [Commented] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385011#comment-17385011 ]

Gaurav Kanade commented on HBASE-24984:
---
[~huaxiangsun] added PR for branch-2; [~anoop.hbase] will merge and cherry-pick to branch-2.3 and branch-2.4.

> WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used
> with multi operation
> --
>
> Key: HBASE-24984
> URL: https://issues.apache.org/jira/browse/HBASE-24984
> Project: HBase
> Issue Type: Bug
> Components: rpc, wal
> Affects Versions: 2.1.6
> Reporter: Liu Junhong
> Assignee: Gaurav Kanade
> Priority: Critical
> Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5
>
> Attachments:
> 0001-HBASE-24984-WAL-corruption-due-to-early-DBBs-re-use-.patch
>
>
> After bugfix HBASE-22539, when a client uses BufferedMutator or multiple
> mutations, there will be one RpcCall and multiple FSWALEntry instances. When
> the RpcCall finishes and one FSWALEntry calls release(), the remaining
> FSWALEntries may trigger a RuntimeException or segmentation fault.
> We should use RefCnt instead of AtomicInteger for
> org.apache.hadoop.hbase.ipc.ServerCall.reference?
[jira] [Commented] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383436#comment-17383436 ]

Gaurav Kanade commented on HBASE-24984:
---
Hi [~apurtell], thanks for your feedback. As mentioned in the later comments there, I believe the RPC release and the WAL release could happen at the same time, hence a synchronization mechanism is needed, and AtomicInteger achieves that. Please let me know if you have any further concerns!

> WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used
> with multi operation
> --
>
> Key: HBASE-24984
> URL: https://issues.apache.org/jira/browse/HBASE-24984
> Project: HBase
> Issue Type: Bug
> Components: rpc, wal
> Affects Versions: 2.1.6
> Reporter: Liu Junhong
> Assignee: Gaurav Kanade
> Priority: Critical
> Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5
>
> Attachments:
> 0001-HBASE-24984-WAL-corruption-due-to-early-DBBs-re-use-.patch
>
>
> After bugfix HBASE-22539, when a client uses BufferedMutator or multiple
> mutations, there will be one RpcCall and multiple FSWALEntry instances. When
> the RpcCall finishes and one FSWALEntry calls release(), the remaining
> FSWALEntries may trigger a RuntimeException or segmentation fault.
> We should use RefCnt instead of AtomicInteger for
> org.apache.hadoop.hbase.ipc.ServerCall.reference?
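The point about concurrent releases can be illustrated with a small reference-counting model. This is a sketch only, written in Python rather than Java, and it is not the HBase patch: the class name, the callback and the use of a lock in place of an atomic counter are all assumptions made for illustration. It shows a buffer shared by one RPC call and three WAL entries being handed back exactly once, no matter which holder releases last.

```python
import threading


class RefCountedBuffer:
    """Toy stand-in for a pooled buffer shared by an RPC call and WAL entries."""

    def __init__(self, on_release):
        self._refs = 1                 # the creator (the RPC call here) holds one reference
        self._lock = threading.Lock()  # stands in for an atomic counter
        self._on_release = on_release

    def retain(self):
        """Register one more holder of the buffer."""
        with self._lock:
            if self._refs <= 0:
                raise RuntimeError("buffer already released")
            self._refs += 1

    def release(self):
        """Drop one reference; run the callback only when the last one is gone."""
        with self._lock:
            if self._refs <= 0:
                raise RuntimeError("buffer already released")
            self._refs -= 1
            last = self._refs == 0
        if last:
            self._on_release()
        return last


released = []
buf = RefCountedBuffer(lambda: released.append("returned to pool"))
for _ in range(3):
    buf.retain()  # three WAL entries share the buffer with the RPC call

# The RPC call and the WAL entries release concurrently, in any order.
threads = [threading.Thread(target=buf.release) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(released)
```

Without the counter, whichever holder finished first would return the buffer to the pool while the others were still reading it, which is the early re-use the issue title describes.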
[jira] [Updated] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade updated HBASE-24984: -- Status: Patch Available (was: Open) > WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used > with multi operation > -- > > Key: HBASE-24984 > URL: https://issues.apache.org/jira/browse/HBASE-24984 > Project: HBase > Issue Type: Bug > Components: rpc, wal >Affects Versions: 2.1.6 >Reporter: Liu Junhong >Assignee: Gaurav Kanade >Priority: Critical > Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5 > > Attachments: > 0001-HBASE-24984-WAL-corruption-due-to-early-DBBs-re-use-.patch > > > After bugfix HBASE-22539, When client use BufferedMutator or multiple > mutation , there will be one RpcCall and mutliple FSWALEntry . At the time > RpcCall finish and one FSWALEntry call release() , the remain FSWALEntries > may trigger RuntimeException or segmentation fault . > We should use RefCnt instead of AtomicInteger for > org.apache.hadoop.hbase.ipc.ServerCall.reference? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381610#comment-17381610 ]

Gaurav Kanade commented on HBASE-24984:
---
As [~anoop.hbase] mentions, the inability to repro was a test issue; we now have the repro for 2.3 and even on master. I have put out a new PR with the latest update.

> WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used
> with multi operation
> --
>
> Key: HBASE-24984
> URL: https://issues.apache.org/jira/browse/HBASE-24984
> Project: HBase
> Issue Type: Bug
> Components: rpc, wal
> Affects Versions: 2.1.6
> Reporter: Liu Junhong
> Assignee: Gaurav Kanade
> Priority: Critical
> Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5
>
> Attachments:
> 0001-HBASE-24984-WAL-corruption-due-to-early-DBBs-re-use-.patch
>
>
> After bugfix HBASE-22539, when a client uses BufferedMutator or multiple
> mutations, there will be one RpcCall and multiple FSWALEntry instances. When
> the RpcCall finishes and one FSWALEntry calls release(), the remaining
> FSWALEntries may trigger a RuntimeException or segmentation fault.
> We should use RefCnt instead of AtomicInteger for
> org.apache.hadoop.hbase.ipc.ServerCall.reference?
[jira] [Comment Edited] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381496#comment-17381496 ] Gaurav Kanade edited comment on HBASE-24984 at 7/15/21, 5:50 PM: - Added patch for branch-2.2. Issue was originally reported on branch-2.1.x and should apply to that branch. This particular repro is not working for 2.3.x onwards and we are checking on this. Thx [~anoop.hbase] for the help with figuring out the repro tests. was (Author: gouravk): Added patch for branch-2.2. Issue was originally reported on branch-2.1.x and should apply to that branch. This particular repro is not working for 2.3.x onwards and we are checking on this. > WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used > with multi operation > -- > > Key: HBASE-24984 > URL: https://issues.apache.org/jira/browse/HBASE-24984 > Project: HBase > Issue Type: Bug > Components: rpc, wal >Affects Versions: 2.1.6 >Reporter: Liu Junhong >Assignee: Gaurav Kanade >Priority: Critical > Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5 > > Attachments: > 0001-HBASE-24984-WAL-corruption-due-to-early-DBBs-re-use-.patch > > > After bugfix HBASE-22539, When client use BufferedMutator or multiple > mutation , there will be one RpcCall and mutliple FSWALEntry . At the time > RpcCall finish and one FSWALEntry call release() , the remain FSWALEntries > may trigger RuntimeException or segmentation fault . > We should use RefCnt instead of AtomicInteger for > org.apache.hadoop.hbase.ipc.ServerCall.reference? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17381496#comment-17381496 ] Gaurav Kanade commented on HBASE-24984: --- Added patch for branch-2.2. Issue was originally reported on branch-2.1.x and should apply to that branch. This particular repro is not working for 2.3.x onwards and we are checking on this. > WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used > with multi operation > -- > > Key: HBASE-24984 > URL: https://issues.apache.org/jira/browse/HBASE-24984 > Project: HBase > Issue Type: Bug > Components: rpc, wal >Affects Versions: 2.1.6 >Reporter: Liu Junhong >Assignee: Gaurav Kanade >Priority: Critical > Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5 > > Attachments: > 0001-HBASE-24984-WAL-corruption-due-to-early-DBBs-re-use-.patch > > > After bugfix HBASE-22539, When client use BufferedMutator or multiple > mutation , there will be one RpcCall and mutliple FSWALEntry . At the time > RpcCall finish and one FSWALEntry call release() , the remain FSWALEntries > may trigger RuntimeException or segmentation fault . > We should use RefCnt instead of AtomicInteger for > org.apache.hadoop.hbase.ipc.ServerCall.reference? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade updated HBASE-24984: -- Attachment: 0001-HBASE-24984-WAL-corruption-due-to-early-DBBs-re-use-.patch > WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used > with multi operation > -- > > Key: HBASE-24984 > URL: https://issues.apache.org/jira/browse/HBASE-24984 > Project: HBase > Issue Type: Bug > Components: rpc, wal >Affects Versions: 2.1.6 >Reporter: Liu Junhong >Assignee: Gaurav Kanade >Priority: Critical > Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5 > > Attachments: > 0001-HBASE-24984-WAL-corruption-due-to-early-DBBs-re-use-.patch > > > After bugfix HBASE-22539, When client use BufferedMutator or multiple > mutation , there will be one RpcCall and mutliple FSWALEntry . At the time > RpcCall finish and one FSWALEntry call release() , the remain FSWALEntries > may trigger RuntimeException or segmentation fault . > We should use RefCnt instead of AtomicInteger for > org.apache.hadoop.hbase.ipc.ServerCall.reference? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381032#comment-17381032 ]

Gaurav Kanade commented on HBASE-24984:
---
Incidentally, the original Jira, HBASE-22539, was first reported on a 2.1.x branch and was then said to affect only 2.x-based releases; however, the patch seems to have been applied to master as well. Hence we are still checking whether something is different in the buffer pool release mechanism on master that is causing the difference in repro.

> WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used
> with multi operation
> --
>
> Key: HBASE-24984
> URL: https://issues.apache.org/jira/browse/HBASE-24984
> Project: HBase
> Issue Type: Bug
> Components: rpc, wal
> Affects Versions: 2.1.6
> Reporter: Liu Junhong
> Assignee: Gaurav Kanade
> Priority: Critical
> Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5
>
>
> After bugfix HBASE-22539, when a client uses BufferedMutator or multiple
> mutations, there will be one RpcCall and multiple FSWALEntry instances. When
> the RpcCall finishes and one FSWALEntry calls release(), the remaining
> FSWALEntries may trigger a RuntimeException or segmentation fault.
> We should use RefCnt instead of AtomicInteger for
> org.apache.hadoop.hbase.ipc.ServerCall.reference?
[jira] [Commented] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17381029#comment-17381029 ]

Gaurav Kanade commented on HBASE-24984:
---
[~stack] Yes, we had a repro based on our version (2.1.6) and believed it applied directly to master before realizing it didn't. I will first put out a separate PR for the 2.x branches and then validate what needs to change in the test to repro it on master.

> WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used
> with multi operation
> --
>
> Key: HBASE-24984
> URL: https://issues.apache.org/jira/browse/HBASE-24984
> Project: HBase
> Issue Type: Bug
> Components: rpc, wal
> Affects Versions: 2.1.6
> Reporter: Liu Junhong
> Assignee: Gaurav Kanade
> Priority: Critical
> Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5
>
>
> After bugfix HBASE-22539, when a client uses BufferedMutator or multiple
> mutations, there will be one RpcCall and multiple FSWALEntry instances. When
> the RpcCall finishes and one FSWALEntry calls release(), the remaining
> FSWALEntries may trigger a RuntimeException or segmentation fault.
> We should use RefCnt instead of AtomicInteger for
> org.apache.hadoop.hbase.ipc.ServerCall.reference?
[jira] [Commented] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17378187#comment-17378187 ] Gaurav Kanade commented on HBASE-24984: --- Actively working on this, patch coming soon > WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used > with multi operation > -- > > Key: HBASE-24984 > URL: https://issues.apache.org/jira/browse/HBASE-24984 > Project: HBase > Issue Type: Bug > Components: rpc, wal >Affects Versions: 2.1.6 >Reporter: Liu Junhong >Assignee: Gaurav Kanade >Priority: Critical > Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5 > > > After bugfix HBASE-22539, When client use BufferedMutator or multiple > mutation , there will be one RpcCall and mutliple FSWALEntry . At the time > RpcCall finish and one FSWALEntry call release() , the remain FSWALEntries > may trigger RuntimeException or segmentation fault . > We should use RefCnt instead of AtomicInteger for > org.apache.hadoop.hbase.ipc.ServerCall.reference? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade reassigned HBASE-24984: - Assignee: Gaurav Kanade > WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used > with multi operation > -- > > Key: HBASE-24984 > URL: https://issues.apache.org/jira/browse/HBASE-24984 > Project: HBase > Issue Type: Bug > Components: rpc, wal >Affects Versions: 2.1.6 >Reporter: Liu Junhong >Assignee: Gaurav Kanade >Priority: Critical > Fix For: 2.5.0, 2.3.6, 3.0.0-alpha-2, 2.4.5 > > > After bugfix HBASE-22539, When client use BufferedMutator or multiple > mutation , there will be one RpcCall and mutliple FSWALEntry . At the time > RpcCall finish and one FSWALEntry call release() , the remain FSWALEntries > may trigger RuntimeException or segmentation fault . > We should use RefCnt instead of AtomicInteger for > org.apache.hadoop.hbase.ipc.ServerCall.reference? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24984) WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used with multi operation
[ https://issues.apache.org/jira/browse/HBASE-24984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17361187#comment-17361187 ]

Gaurav Kanade commented on HBASE-24984:
---
[~mopishv0] Do you have a sample call stack or err file log from when this kind of issue happens?

> WAL corruption due to early DBBs re-use when Durability.ASYNC_WAL is used
> with multi operation
> --
>
> Key: HBASE-24984
> URL: https://issues.apache.org/jira/browse/HBASE-24984
> Project: HBase
> Issue Type: Bug
> Components: rpc, wal
> Affects Versions: 2.1.6
> Reporter: Liu Junhong
> Priority: Major
>
> After bugfix HBASE-22539, when a client uses BufferedMutator or multiple
> mutations, there will be one RpcCall and multiple FSWALEntry instances. When
> the RpcCall finishes and one FSWALEntry calls release(), the remaining
> FSWALEntries may trigger a RuntimeException or segmentation fault.
> We should use RefCnt instead of AtomicInteger for
> org.apache.hadoop.hbase.ipc.ServerCall.reference?
[jira] [Updated] (HBASE-25026) Create a metric to track full region scans RPCs
[ https://issues.apache.org/jira/browse/HBASE-25026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gaurav Kanade updated HBASE-25026:
--
Release Note: Adds a new metric that counts the number of full region scan requests at the RPC layer. It is reported under "name" : "Hadoop:service=HBase,name=RegionServer,sub=Server"

> Create a metric to track full region scans RPCs
> ---
>
> Key: HBASE-25026
> URL: https://issues.apache.org/jira/browse/HBASE-25026
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0-alpha-1
> Reporter: ramkrishna.s.vasudevan
> Assignee: Gaurav Kanade
> Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.4.0
>
>
> A metric that indicates how many of the scan requests were without start row
> and/or stop row. Generally such queries may be wrongly written or may require
> better schema design, and some of them may be queries doing a sanity check to
> verify that the actual application logic has done the necessary updates and
> that all the expected rows are processed.
> We do have some logs at the RPC layer to see what queries take time but
> nothing as a metric.
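As a rough illustration of what such a counter tracks, here is a minimal Python sketch. It assumes that a "full region scan" means a scan request with neither a start row nor a stop row; the class and field names are hypothetical and do not correspond to the HBase metrics API.

```python
from dataclasses import dataclass


@dataclass
class ScanRequest:
    """Hypothetical scan request; an empty bytes value means 'not set'."""
    start_row: bytes = b""
    stop_row: bytes = b""


class ScanMetrics:
    """Hypothetical server-side counter, incremented once per scan request."""

    def __init__(self):
        self.total_scans = 0
        self.full_region_scans = 0

    def on_scan(self, scan: ScanRequest) -> None:
        self.total_scans += 1
        # Assumption: no start row and no stop row means the whole region is read.
        if not scan.start_row and not scan.stop_row:
            self.full_region_scans += 1


metrics = ScanMetrics()
metrics.on_scan(ScanRequest())                                  # unbounded scan
metrics.on_scan(ScanRequest(start_row=b"a", stop_row=b"m"))     # bounded scan
metrics.on_scan(ScanRequest(start_row=b"a"))                    # open-ended scan
print(metrics.full_region_scans, "of", metrics.total_scans)
```

Whether an open-ended scan (only one bound set) should also be counted is a design choice; the issue description says "start row and/or stop row", while the release note speaks of full region scans, so this sketch counts only the fully unbounded case.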
[jira] [Assigned] (HBASE-25026) Create a metric to track scans that have no start row and/or stop row
[ https://issues.apache.org/jira/browse/HBASE-25026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade reassigned HBASE-25026: - Assignee: Gaurav Kanade (was: ramkrishna.s.vasudevan) > Create a metric to track scans that have no start row and/or stop row > - > > Key: HBASE-25026 > URL: https://issues.apache.org/jira/browse/HBASE-25026 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0-alpha-1, 2.4.0 >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > > A metric that indicates how many of the scan requests were without start row > and/or stop row. Generally such queries may be wrongly written or may require > better schema design and those may be some queries doing some sanity check to > verify if their actual application logic has done the necessary updates and > the all that expected rows are processed. > We do have some logs at the RPC layer to see what queries take time but > nothing as a metric. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HBASE-20511) Set the scan's read type on the subsequent scan requests
[ https://issues.apache.org/jira/browse/HBASE-20511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gaurav Kanade reassigned HBASE-20511:
-
Assignee: Gaurav Kanade (was: ramkrishna.s.vasudevan)

> Set the scan's read type on the subsequent scan requests
>
>
> Key: HBASE-20511
> URL: https://issues.apache.org/jira/browse/HBASE-20511
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: Gaurav Kanade
> Priority: Major
>
> See discussion in
> https://issues.apache.org/jira/browse/HBASE-20457?focusedCommentId=16445329&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16445329
> It is better that we set the scan's read type on the subsequent scan requests
> in cases where the scan actually switched over from pread to STREAM type.
[jira] [Updated] (HBASE-24312) Backport HBase-24199 to 2.1.x branches
[ https://issues.apache.org/jira/browse/HBASE-24312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade updated HBASE-24312: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Backport HBase-24199 to 2.1.x branches > -- > > Key: HBASE-24312 > URL: https://issues.apache.org/jira/browse/HBASE-24312 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 2.1.6 >Reporter: Gaurav Kanade >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 2.1.10 > > > Noticed that 2.1.x line has this issue also and we need this patch backported > to 2.1.x line -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24312) Backport HBase-24199 to 2.1.x branches
[ https://issues.apache.org/jira/browse/HBASE-24312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade updated HBASE-24312: -- Status: Patch Available (was: Open) > Backport HBase-24199 to 2.1.x branches > -- > > Key: HBASE-24312 > URL: https://issues.apache.org/jira/browse/HBASE-24312 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 2.1.6 >Reporter: Gaurav Kanade >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 2.1.10 > > > Noticed that 2.1.x line has this issue also and we need this patch backported > to 2.1.x line -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-24312) Backport HBase-24199 to 2.x branches
Gaurav Kanade created HBASE-24312: - Summary: Backport HBase-24199 to 2.x branches Key: HBASE-24312 URL: https://issues.apache.org/jira/browse/HBASE-24312 Project: HBase Issue Type: Bug Components: metrics Affects Versions: 2.1.6 Reporter: Gaurav Kanade Assignee: Gaurav Kanade Fix For: 2.1.6 Noticed that 2.1.x line has this issue also and we need this patch backported to 2.1.x line -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24312) Backport HBase-24199 to 2.1.x branches
[ https://issues.apache.org/jira/browse/HBASE-24312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade updated HBASE-24312: -- Summary: Backport HBase-24199 to 2.1.x branches (was: Backport HBase-24199 to 2.x branches) > Backport HBase-24199 to 2.1.x branches > -- > > Key: HBASE-24312 > URL: https://issues.apache.org/jira/browse/HBASE-24312 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 2.1.6 >Reporter: Gaurav Kanade >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 2.1.6 > > > Noticed that 2.1.x line has this issue also and we need this patch backported > to 2.1.x line -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24199) Procedure related metrics is not consumed in the JMX metric
[ https://issues.apache.org/jira/browse/HBASE-24199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade updated HBASE-24199: -- Status: Patch Available (was: In Progress) > Procedure related metrics is not consumed in the JMX metric > --- > > Key: HBASE-24199 > URL: https://issues.apache.org/jira/browse/HBASE-24199 > Project: HBase > Issue Type: Improvement > Components: metrics >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.4.0, 2.1.10, 2.2.5 > > Attachments: screenshot_2.png > > > We have ProcedureMetrics and that is being tracked for every procedure that > we create for all the ops in the system. > But when we check the UI, the UI does not display those information at all. > It may be useful to know atleast in the case of ServerCrashProcedure exactly > to know how much it has taken for the procedure to complete. Similarly other > procedures also can be added to the UI. > Currently checked in master code - but think it will apply to other branches > also. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HBASE-24199) Procedure related metrics is not consumed in the JMX metric
[ https://issues.apache.org/jira/browse/HBASE-24199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-24199 started by Gaurav Kanade. - > Procedure related metrics is not consumed in the JMX metric > --- > > Key: HBASE-24199 > URL: https://issues.apache.org/jira/browse/HBASE-24199 > Project: HBase > Issue Type: Improvement > Components: metrics >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.4.0, 2.1.10, 2.2.5 > > Attachments: screenshot_2.png > > > We have ProcedureMetrics and that is being tracked for every procedure that > we create for all the ops in the system. > But when we check the UI, the UI does not display those information at all. > It may be useful to know atleast in the case of ServerCrashProcedure exactly > to know how much it has taken for the procedure to complete. Similarly other > procedures also can be added to the UI. > Currently checked in master code - but think it will apply to other branches > also. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24199) Procedure related metrics is not consumed in the JMX metric
[ https://issues.apache.org/jira/browse/HBASE-24199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade updated HBASE-24199: -- Attachment: screenshot_2.png > Procedure related metrics is not consumed in the JMX metric > --- > > Key: HBASE-24199 > URL: https://issues.apache.org/jira/browse/HBASE-24199 > Project: HBase > Issue Type: Improvement > Components: metrics >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.4.0, 2.1.10, 2.2.5 > > Attachments: screenshot_2.png > > > We have ProcedureMetrics and that is being tracked for every procedure that > we create for all the ops in the system. > But when we check the UI, the UI does not display those information at all. > It may be useful to know atleast in the case of ServerCrashProcedure exactly > to know how much it has taken for the procedure to complete. Similarly other > procedures also can be added to the UI. > Currently checked in master code - but think it will apply to other branches > also. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24199) Procedure related metrics is not consumed in the JMX metric
[ https://issues.apache.org/jira/browse/HBASE-24199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095033#comment-17095033 ] Gaurav Kanade commented on HBASE-24199: --- Yes working on it - should have a patch in the next few days > Procedure related metrics is not consumed in the JMX metric > --- > > Key: HBASE-24199 > URL: https://issues.apache.org/jira/browse/HBASE-24199 > Project: HBase > Issue Type: Improvement > Components: metrics >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.4.0, 2.1.10, 2.2.5 > > > We have ProcedureMetrics and that is being tracked for every procedure that > we create for all the ops in the system. > But when we check the UI, the UI does not display those information at all. > It may be useful to know atleast in the case of ServerCrashProcedure exactly > to know how much it has taken for the procedure to complete. Similarly other > procedures also can be added to the UI. > Currently checked in master code - but think it will apply to other branches > also. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HBASE-24199) Procedure related metrics is not consumed in the JMX metric
[ https://issues.apache.org/jira/browse/HBASE-24199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095033#comment-17095033 ] Gaurav Kanade edited comment on HBASE-24199 at 4/29/20, 3:45 AM: - [~ndimiduk] Yes working on it - should have a patch in the next few days was (Author: gouravk): Yes working on it - should have a patch in the next few days > Procedure related metrics is not consumed in the JMX metric > --- > > Key: HBASE-24199 > URL: https://issues.apache.org/jira/browse/HBASE-24199 > Project: HBase > Issue Type: Improvement > Components: metrics >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.4.0, 2.1.10, 2.2.5 > > > We have ProcedureMetrics and that is being tracked for every procedure that > we create for all the ops in the system. > But when we check the UI, the UI does not display those information at all. > It may be useful to know atleast in the case of ServerCrashProcedure exactly > to know how much it has taken for the procedure to complete. Similarly other > procedures also can be added to the UI. > Currently checked in master code - but think it will apply to other branches > also. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HBASE-24208) Remove RS entry from zk draining servers node after RS been stopped
[ https://issues.apache.org/jira/browse/HBASE-24208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade reassigned HBASE-24208: - Assignee: Gaurav Kanade (was: Anoop Sam John) > Remove RS entry from zk draining servers node after RS been stopped > --- > > Key: HBASE-24208 > URL: https://issues.apache.org/jira/browse/HBASE-24208 > Project: HBase > Issue Type: Improvement >Reporter: Anoop Sam John >Assignee: Gaurav Kanade >Priority: Major > Fix For: 3.0.0, 2.3.1, 2.2.5 > > > When a RS is been decommissioned, we will add an entry into the zk node. This > will be there unless the same RS instance is recommissioned. > But if we want to scale down a cluster, the best path would be to > decommission the RSs in the scaling down nodes. The regions in these RSs > will get moved to live RSs. In this case these decommissioned RSs will get > stopped later. These will never get recommissioned. The zk nodes will still > be there under draining servers path. > We can remove this zk node when the RS is getting stopped. -- This message was sent by Atlassian Jira (v8.3.4#803005)
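The proposed behaviour can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the draining servers znode is represented by an in-memory set, and the class and method names are hypothetical rather than taken from HBase or ZooKeeper.

```python
class DrainingRegistry:
    """In-memory stand-in for the draining servers node in ZooKeeper."""

    def __init__(self):
        self.draining = set()

    def decommission(self, server: str) -> None:
        # Decommissioning adds an entry for the region server.
        self.draining.add(server)

    def recommission(self, server: str) -> None:
        # Today the entry is only removed when the same instance is recommissioned.
        self.draining.discard(server)

    def on_server_stopped(self, server: str) -> None:
        # Proposed: also remove the entry when the region server stops,
        # so scaled-down servers do not leave stale entries behind.
        self.draining.discard(server)


registry = DrainingRegistry()
registry.decommission("rs1,16020,1587116254987")
registry.decommission("rs2,16020,1587116254988")
registry.on_server_stopped("rs1,16020,1587116254987")
print(sorted(registry.draining))
```

The server names above are made-up examples in the usual host,port,startcode form.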
[jira] [Updated] (HBASE-24177) MetricsTable#updateFlushTime is wrong
[ https://issues.apache.org/jira/browse/HBASE-24177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade updated HBASE-24177: -- Status: Patch Available (was: Open) > MetricsTable#updateFlushTime is wrong > - > > Key: HBASE-24177 > URL: https://issues.apache.org/jira/browse/HBASE-24177 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 1.1.2 >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.1.10, 2.2.5 > > Attachments: after.png, before.png > > > MetricsRegionServer does an update on the MetricsRegionServerSource, > MetricsTable etc. > While doing updateFlushTime, the time taken for flush is rightly updated in > the RegionServerSource but at the MetricsTable level we update the > memstoresize instead of the time. > This applies from 1.1 version onwards. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HBASE-24177) MetricsTable#updateFlushTime is wrong
[ https://issues.apache.org/jira/browse/HBASE-24177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085611#comment-17085611 ] Gaurav Kanade edited comment on HBASE-24177 at 4/17/20, 10:03 AM: -- Without this fix we were outputting the same value for data size of flush and time taken. This fixes it. See log below and before and after screenshots attached 2020-04-17 15:08:28,832 INFO [rs(ramvasu-vm,16020,1587116254987)-flush-proc-pool9-t1] regionserver.HRegion: Finished flush of *dataSize ~27 B*/27, heapSize ~344 B/344, currentSize=0 B/0 for b0745021e7535fe7a984eddded4347c9 in *400ms*, sequenceid=12, compaction requested=false was (Author: gouravk): Without this fix we were outputting the same value for data size of flush and time taken. This fixes it. See log below and before and after screenshots attached 2020-04-17 15:08:28,832 INFO [rs(ramvasu-vm,16020,1587116254987)-flush-proc-pool9-t1] regionserver.HRegion: Finished flush of *dataSize ~27 B*/27, heapSize ~344 B/344, currentSize=0 B/0 for b0745021e7535fe7a984eddded4347c9 in *400m*s, sequenceid=12, compaction requested=false > MetricsTable#updateFlushTime is wrong > - > > Key: HBASE-24177 > URL: https://issues.apache.org/jira/browse/HBASE-24177 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 1.1.2 >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.1.10, 2.2.5 > > Attachments: after.png, before.png > > > MetricsRegionServer does an update on the MetricsRegionServerSource, > MetricsTable etc. > While doing updateFlushTime, the time taken for flush is rightly updated in > the RegionServerSource but at the MetricsTable level we update the > memstoresize instead of the time. > This applies from 1.1 version onwards. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24177) MetricsTable#updateFlushTime is wrong
[ https://issues.apache.org/jira/browse/HBASE-24177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085615#comment-17085615 ] Gaurav Kanade commented on HBASE-24177: --- plz review [~anoop.hbase] [~ram_krish] > MetricsTable#updateFlushTime is wrong > - > > Key: HBASE-24177 > URL: https://issues.apache.org/jira/browse/HBASE-24177 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 1.1.2 >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.1.10, 2.2.5 > > Attachments: after.png, before.png > > > MetricsRegionServer does an update on the MetricsRegionServerSource, > MetricsTable etc. > While doing updateFlushTime, the time taken for flush is rightly updated in > the RegionServerSource but at the MetricsTable level we update the > memstoresize instead of the time. > This applies from 1.1 version onwards. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24177) MetricsTable#updateFlushTime is wrong
[ https://issues.apache.org/jira/browse/HBASE-24177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gaurav Kanade updated HBASE-24177: -- Attachment: after.png before.png > MetricsTable#updateFlushTime is wrong > - > > Key: HBASE-24177 > URL: https://issues.apache.org/jira/browse/HBASE-24177 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 1.1.2 >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.1.10, 2.2.5 > > Attachments: after.png, before.png > > > MetricsRegionServer does an update on the MetricsRegionServerSource, > MetricsTable etc. > While doing updateFlushTime, the time taken for flush is rightly updated in > the RegionServerSource but at the MetricsTable level we update the > memstoresize instead of the time. > This applies from 1.1 version onwards. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24177) MetricsTable#updateFlushTime is wrong
[ https://issues.apache.org/jira/browse/HBASE-24177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085611#comment-17085611 ] Gaurav Kanade commented on HBASE-24177: --- Without this fix we were outputting the same value for data size of flush and time taken. This fixes it. See log below and before and after screenshots attached 2020-04-17 15:08:28,832 INFO [rs(ramvasu-vm,16020,1587116254987)-flush-proc-pool9-t1] regionserver.HRegion: Finished flush of *dataSize ~27 B*/27, heapSize ~344 B/344, currentSize=0 B/0 for b0745021e7535fe7a984eddded4347c9 in *400m*s, sequenceid=12, compaction requested=false > MetricsTable#updateFlushTime is wrong > - > > Key: HBASE-24177 > URL: https://issues.apache.org/jira/browse/HBASE-24177 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 1.1.2 >Reporter: ramkrishna.s.vasudevan >Assignee: Gaurav Kanade >Priority: Minor > Fix For: 3.0.0, 2.3.0, 2.1.10, 2.2.5 > > > MetricsRegionServer does an update on the MetricsRegionServerSource, > MetricsTable etc. > While doing updateFlushTime, the time taken for flush is rightly updated in > the RegionServerSource but at the MetricsTable level we update the > memstoresize instead of the time. > This applies from 1.1 version onwards. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-17460) enable_table_replication can not perform cyclic replication of a table
[ https://issues.apache.org/jira/browse/HBASE-17460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822394#comment-15822394 ] Gaurav Kanade commented on HBASE-17460: --- +1 > enable_table_replication can not perform cyclic replication of a table > -- > > Key: HBASE-17460 > URL: https://issues.apache.org/jira/browse/HBASE-17460 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: NITIN VERMA >Assignee: NITIN VERMA > Labels: replication > Attachments: HBASE-17460.patch > > Original Estimate: 96h > Remaining Estimate: 96h > > The enable_table_replication operation is broken for cyclic replication of > an HBase table because we compare all the properties of column families (including > REPLICATION_SCOPE). > Below is exactly what happens: > 1. Running the "enable_table_replication 'table1'" operation on the first cluster > will set the REPLICATION_SCOPE of all column families to peer id '1'. This > will also create a table on the second cluster where REPLICATION_SCOPE is still > set to peer id '0'. > 2. Now when we run "enable_table_replication 'table1'" on the second cluster, we > compare all the properties of the table (including REPLICATION_SCOPE), which > obviously is different now. > I am proposing a fix for this issue where we should avoid comparing > REPLICATION_SCOPE inside the HColumnDescriptor::compareTo() method, especially > when replication is not already enabled on the desired table. > I have made that change and it is working. I will submit the patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
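The comparison proposed in this issue can be sketched as follows, assuming column family properties are modeled as plain string maps. The class `ColumnFamilyCompareSketch` and the helper `sameIgnoringReplicationScope` are hypothetical and are not part of the attached patch or of `HColumnDescriptor`; only the property name REPLICATION_SCOPE is taken from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: compare two column family property maps
// while ignoring the REPLICATION_SCOPE property.
final class ColumnFamilyCompareSketch {
    static final String REPLICATION_SCOPE = "REPLICATION_SCOPE";

    static boolean sameIgnoringReplicationScope(Map<String, String> a, Map<String, String> b) {
        // Copy so the caller's maps are left untouched.
        Map<String, String> left = new HashMap<>(a);
        Map<String, String> right = new HashMap<>(b);
        left.remove(REPLICATION_SCOPE);
        right.remove(REPLICATION_SCOPE);
        return left.equals(right);
    }

    public static void main(String[] args) {
        // Made-up property values for the same column family on two clusters.
        Map<String, String> firstCluster = new HashMap<>();
        firstCluster.put("VERSIONS", "1");
        firstCluster.put(REPLICATION_SCOPE, "1");

        Map<String, String> secondCluster = new HashMap<>();
        secondCluster.put("VERSIONS", "1");
        secondCluster.put(REPLICATION_SCOPE, "0");

        System.out.println("strict equals: " + firstCluster.equals(secondCluster));
        System.out.println("ignoring scope: "
            + sameIgnoringReplicationScope(firstCluster, secondCluster));
    }
}
```

With a comparison of this shape, the second cluster's check would no longer fail merely because the first cluster already changed the scope, while any other property mismatch is still detected.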