[jira] [Resolved] (HBASE-25627) HBase replication should have a metric to represent if the source is stuck getting initialized

2021-03-22 Thread Bharath Vissapragada (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharath Vissapragada resolved HBASE-25627.
--
Resolution: Fixed

Thanks [~sandeep.pal]

> HBase replication should have a metric to represent if the source is stuck 
> getting initialized
> --
>
> Key: HBASE-25627
> URL: https://issues.apache.org/jira/browse/HBASE-25627
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.3.5, 2.4.3
>Reporter: Sandeep Pal
>Assignee: Sandeep Pal
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.5.0, 2.4.3
>
>
> There can be situations where the cluster is not able to talk to the peer
> cluster's ZK. In that case the logQueue will keep accumulating, but without
> digging into the logs we cannot tell why the logQueue is accumulating on the
> source.
> Since the replication source doesn't even start the shipper in this case, it
> would be good to have a dedicated metric for when the RS cannot talk to the
> peer's ZK at all.
>  
> {code:java}
> 2021-03-03 04:02:10,704 DEBUG [peerId] zookeeper.RecoverableZooKeeper - Possibly transient ZooKeeper, quorum=zookeeper-0.zookeeper-a.fakeAddress:2181,zookeeper-1.zookeeper-a.fakeAddress:2181,zookeeper-2.zookeeper-a.fakeAddress:2181,zookeeper-3.zookeeper-a.fakeAddress:2181,zookeeper-4.zookeeper-a.fakeAddress:2181, exception=org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/hbaseid
> 2021-03-03 04:02:10,704 DEBUG [peerId] zookeeper.RecoverableZooKeeper - Possibly transient ZooKeeper, quorum=zookeeper-0.zookeeper-a.fakeAddress:2181,zookeeper-1.zookeeper-a.fakeAddress:2181,zookeeper-2.zookeeper-a.fakeAddress:2181,zookeeper-3.zookeeper-a.fakeAddress:2181,zookeeper-4.zookeeper-a.fakeAddress:2181, exception=org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/hbaseid
> org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/hbaseid
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:126)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1119)
>     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:284)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:469)
>     at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>     at org.apache.hadoop.hbase.zookeeper.ZKClusterId.getUUIDForCluster(ZKClusterId.java:96)
>     at org.apache.hadoop.hbase.replication.HBaseReplicationEndpoint.getPeerUUID(HBaseReplicationEndpoint.java:104)
>     at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:306)
> {code}
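
For readers skimming the archive, the idea in the description above can be sketched roughly as follows. This is only an illustration, not the patch that resolved the issue: the class name, the PeerUuidLookup interface, the sourceInitializing gauge and the bounded retry loop are all hypothetical stand-ins, and the real replication source may behave differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SourceInitializingGaugeSketch {
    /** Hypothetical stand-in for the call that reads the peer cluster id from the peer's ZK. */
    interface PeerUuidLookup {
        String getPeerUUID() throws Exception;
    }

    // Hypothetical gauge: number of replication sources that have started
    // initializing but have not yet reached the peer.
    static final AtomicInteger sourceInitializing = new AtomicInteger();

    /**
     * Tries to resolve the peer UUID up to maxAttempts times. The gauge is raised
     * before the first attempt and lowered only on success, so it stays non-zero
     * for as long as the source is stuck.
     */
    static String initialize(PeerUuidLookup lookup, int maxAttempts) {
        sourceInitializing.incrementAndGet();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String uuid = lookup.getPeerUUID();
                if (uuid != null) {
                    sourceInitializing.decrementAndGet();
                    return uuid;
                }
            } catch (Exception e) {
                // For example an AuthFailed error from the peer's ZK; retry.
            }
        }
        return null; // still stuck: the gauge stays raised
    }

    public static void main(String[] args) {
        // Simulate a peer ZK that always rejects the connection.
        initialize(() -> { throw new Exception("AuthFailed"); }, 3);
        System.out.println("stuck sources: " + sourceInitializing.get());
    }
}
```

An alert on a gauge like this staying above zero would point at the peer connection directly, without anyone having to read DEBUG logs such as the one quoted above.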



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-25685) asyncprofiler2.0 no longer supports svg; wants html

2021-03-22 Thread Michael Stack (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack resolved HBASE-25685.
---
Fix Version/s: 2.4.3
   2.3.5
   2.5.0
   3.0.0-alpha-1
 Hadoop Flags: Reviewed
 Release Note: 
With asyncprofiler 1.x on hbase-2.3.x or hbase-2.4.x, everything works as
before. With asyncprofiler 2.x on hbase-2.3.x or hbase-2.4.x, add
'?output=html' to the query to get flamegraphs from the profiler.

On hbase-2.5+ with asyncprofiler 2.x, everything works by default. With
asyncprofiler 1.x on hbase-2.5+, you may have to add '?output=svg' to the query.
   Resolution: Fixed

Thanks for the review [~weichiu]. Pushed #3079 on branch-2.3+branch-2.4. Pushed 
#3078 on branch-2 and master.

> asyncprofiler2.0 no longer supports svg; wants html
> ---
>
> Key: HBASE-25685
> URL: https://issues.apache.org/jira/browse/HBASE-25685
> Project: HBase
>  Issue Type: Bug
>Reporter: Michael Stack
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.5, 2.4.3
>
>
> asyncprofiler2.0 is out. It's a nice tool. Unfortunately, it dropped the svg
> formatting option that we use in our servlet. Now it wants you to pass html.
> Let's fix.
> Old -o on asyncprofiler 1.x:
> -o fmt  output format: summary|traces|flat|collapsed|svg|tree|jfr
> New -o on asyncprofiler 2.x:
> -o fmt  output format: flat|traces|collapsed|flamegraph|tree|jfr
> If you pass svg to 2.0, it does nothing ... If you run the command that hbase
> is running, you see:
> {code}
> /tmp/prof-output$ sudo -u hbase /usr/lib/async-profiler/profiler.sh -e cpu -d 10 -o svg -f /tmp/prof-output/async-prof-pid-8346-cpu-1x.svg 8346
> [ERROR] SVG format is obsolete, use .html for FlameGraph
> {code}
> At a minimum we can make it so the OUTPUT param supports HTML. Here is the
> current enum state:
> {code}
>   enum Output {
>     SUMMARY,
>     TRACES,
>     FLAT,
>     COLLAPSED,
>     SVG,
>     TREE,
>     JFR
>   }
> {code}
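
As a rough illustration of the "at a minimum" suggestion above, the enum could gain an HTML constant and the query parameter could be parsed leniently with a caller-supplied default. This is a sketch under assumptions, not the change that was pushed: the class name ProfilerOutputSketch and the helper parseOutput are hypothetical, and the choice of default per branch is left to the caller.

```java
import java.util.Locale;

class ProfilerOutputSketch {
    // The enum from the issue with HTML appended; SVG is kept for asyncprofiler 1.x.
    enum Output { SUMMARY, TRACES, FLAT, COLLAPSED, SVG, TREE, JFR, HTML }

    /**
     * Maps the value of an "output" query parameter to an Output constant,
     * ignoring case and surrounding whitespace. Missing or unknown values
     * fall back to the supplied default.
     */
    static Output parseOutput(String param, Output fallback) {
        if (param == null || param.trim().isEmpty()) {
            return fallback;
        }
        try {
            return Output.valueOf(param.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // With a default of SVG, a request carrying ?output=html selects HTML.
        System.out.println(parseOutput("html", Output.SVG));
        // A request with no output parameter keeps the default.
        System.out.println(parseOutput(null, Output.SVG));
    }
}
```

Keeping SVG in the enum and only changing the fallback would let one code path serve both profiler generations, which is the behaviour the release note describes in terms of '?output=html' and '?output=svg'.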





[jira] [Resolved] (HBASE-25688) Use CustomRequestLog instead of Slf4jRequestLog for jetty

2021-03-22 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-25688.
---
Fix Version/s: 2.4.3
   2.5.0
   3.0.0-alpha-1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to branch-2.4+.

Thanks [~stack] for reviewing.

> Use CustomRequestLog instead of Slf4jRequestLog for jetty
> -
>
> Key: HBASE-25688
> URL: https://issues.apache.org/jira/browse/HBASE-25688
> Project: HBase
>  Issue Type: Improvement
>  Components: logging, UI
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.3
>
>
> Slf4jRequestLog has been deprecated and replaced by Slf4jRequestLogWriter.





Re: Split Meta Design Reset Status

2021-03-22 Thread Stack
Now that the requirements are in [1], we're going to move to the next stage --
actual design for split-meta -- and have set up a chat for this Thursday
afternoon (4PM California time/8AM Beijing time) to get the ball rolling.
Please come if interested. Zoom details are below.

Yours,
S
1.
https://docs.google.com/document/d/11ChsSb2LGrSzrSJz8pDCAw5IewmaMV0ZDN1LrMkAj4s/edit#heading=h.hdf0rnyevxz2


Topic: hbase split-meta design warmup chat
Time: Mar 25, 2021 04:00 PM Pacific Time (US and Canada)

Join Zoom Meeting
https://us04web.zoom.us/j/75988003798?pwd=Wi9mU0w0T2ZjTFNBaE9lUmtTbHRpQT09

Meeting ID: 759 8800 3798
Passcode: hbase


On Tue, Jan 5, 2021 at 9:13 AM Stack  wrote:

> FYI, a few of us have been working on the redo/reset of the split meta
> design (HBASE-25382). We (think we've) finished the requirements. Are there
> any others to consider?
>
> Feedback and contribs welcome. Otherwise, on to the next phase -- design.
>
> Thanks,
> S
>


Re: [DISCUSS] Updating the 'stable' pointer to 2.4.2

2021-03-22 Thread Andrew Purtell
Cool, this is my point of view as well. I filed HBASE-25690 for specifying and
documenting the criteria (whatever they are) for moving the 'stable' pointer.


On Mon, Mar 22, 2021 at 9:27 AM Stack  wrote:

> On Thu, Mar 18, 2021 at 1:07 PM Andrew Purtell 
> wrote:
>
> > Would they do that before or after we designate it stable? Asking, not
> > trying to be difficult. Kind of a chicken and egg problem?
> >
> >
> Before the release was designated stable.
>
>
> > It would be fine I think to consider reported experience when and if it
> > happens but can't be primary criteria because it has nothing directly to do
> > with our PMC or project. We need a criteria we as project and PMC can
> > achieve and implement effectively, and IMHO "one of our project devs has it
> > running" does not meet that requirement, because this depends on third
> > party organizations (a dev's employer, and such) and idiosyncratic
> > criteria.
> >
> >
> That's fair.
>
> It would be better if we spec'd what a 'stable release' is and then ran
> candidates through the hoops.
>
> S
>
>
> >
> > > On Mar 18, 2021, at 12:47 PM, Stack  wrote:
> > >
> > > On Thu, Mar 18, 2021 at 11:55 AM Andrew Purtell <
> > andrew.purt...@gmail.com>
> > > wrote:
> > >
> > >> And how would we know we have one? We don't track usage telemetry.
> > >>
> > >>
> > > Someone of us w/ standing volunteers that they have made the move (was
> > what
> > > I was thinking).
> > > S
> > >
> > >
> > >
> > >
> > >>
> >  On Mar 18, 2021, at 11:29 AM, Stack  wrote:
> > >>>
> > >>> On Wed, Mar 17, 2021 at 1:49 PM Andrew Purtell  >
> > >> wrote:
> > >>>
> >  I would like to propose we update the 'stable' release pointer,
> > >> currently
> >  pointing at 2.3.4, to 2.4.2.
> > 
> >  In my testing with aggressive chaos and ITBLL (but in,
> unfortunately,
> > >> due
> >  to resource constraints, in small cluster settings of approximately
> 10
> >  nodes) 2.4.2 is very stable.
> > 
> >  Our sister project Phoenix has updated their build system to support
> >  building against 2.4.1 and later, and the stability of their unit
> and
> >  integration test suite is not impacted by any known HBase issue.
> > 
> >  If there is other criteria that should be considered, I'd like for
> us
> > to
> >  discuss it. Does there need to be public acknowledgement of a
> > production
> >  user? At scale? (How would we know?) Would you like me to attempt an
> >  at-scale test? On the order of 100 nodes might be possible? If so,
> > what
> >  should be the test scenario and criteria for success? What
> > distinguishes
> >  2.3.x (2.3.4) from 2.4.x (2.4.2) at this point? What would be the
> > >> area(s)
> >  of concern with respect to moving the stable pointer forward?
> > 
> > 
> > >>> I suggest a happy production deploy as a prerequisite to moving the
> > >> pointer.
> > >>> S
> > >>>
> > >>>
> > >>>
> >  --
> >  Best regards,
> >  Andrew
> > 
> >  Words like orphans lost among the crosstalk, meaning torn from
> truth's
> >  decrepit hands
> >   - A23, Crosstalk
> > 
> > >>
> >
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk


[jira] [Created] (HBASE-25690) Specify and document the criteria for moving the 'stable' pointer

2021-03-22 Thread Andrew Kyle Purtell (Jira)
Andrew Kyle Purtell created HBASE-25690:
---

 Summary: Specify and document the criteria for moving the 'stable' 
pointer
 Key: HBASE-25690
 URL: https://issues.apache.org/jira/browse/HBASE-25690
 Project: HBase
  Issue Type: Task
  Components: documentation
Reporter: Andrew Kyle Purtell


We can consider reported experience with new code lines released ahead of where
the stable pointer currently points, when and if it happens, but this should not
be a formal criterion because it has nothing directly to do with our PMC or
project. We need criteria for deciding when new code lines are sufficiently
stable to warrant moving the 'stable' pointer, and they must be criteria that we
as project and PMC can achieve and implement effectively.

IMHO "one of our project devs has it running" does not meet that requirement, 
because this depends on third party organizations (a dev's employer, and such) 
and idiosyncratic criteria.

Collect and document requirements, test scenarios, and success (and failure) 
criteria.





Re: Time to 2.3.5 (was: Delaying 2.3.5 another month)

2021-03-22 Thread Huaxiang Sun
Thanks Nick. I am starting the release process today.

Huaxiang

On Wed, Mar 17, 2021 at 10:28 AM Nick Dimiduk  wrote:

> Hi Everyone,
>
> Looks like we have a nice 40-ish commits on branch-2.3, so I think it's
> time for another release. Huaxiang has again volunteered to run this
> release, so I will defer 2.3.5 to him. As for timing, I think it's best if
> we let the current 2.4 release complete (looks like it's close). Please
> speak up if you have any nice patches you're ready to land, we'll see about
> their inclusion.
>
> Thank you, Huaxiang!
>
> Thanks,
> Nick
>
> On Fri, Feb 26, 2021 at 10:31 AM Nick Dimiduk  wrote:
>
> > Heya team,
> >
> > There are fewer than 20 issues resolved against the head of branch-2.3
> and
> > none of them are marked as Critical. Thus I think we can postpone the
> next
> > 2.3 release by another month. If you have concerns or disagree, please
> > reply here to let me know.
> >
> > Thanks,
> > Nick
> >
>


Re: [DISCUSS] Updating the 'stable' pointer to 2.4.2

2021-03-22 Thread Stack
On Thu, Mar 18, 2021 at 1:07 PM Andrew Purtell 
wrote:

> Would they do that before or after we designate it stable? Asking, not
> trying to be difficult. Kind of a chicken and egg problem?
>
>
Before the release was designated stable.


> It would be fine I think to consider reported experience when and if it
> happens but can't be primary criteria because it has nothing directly to do
> with our PMC or project. We need a criteria we as project and PMC can
> achieve and implement effectively, and IMHO "one of our project devs has it
> running" does not meet that requirement, because this depends on third
> party organizations (a dev's employer, and such) and idiosyncratic
> criteria.
>
>
That's fair.

It would be better if we spec'd what a 'stable release' is and then ran
candidates through the hoops.

S


>
> > On Mar 18, 2021, at 12:47 PM, Stack  wrote:
> >
> > On Thu, Mar 18, 2021 at 11:55 AM Andrew Purtell <
> andrew.purt...@gmail.com>
> > wrote:
> >
> >> And how would we know we have one? We don't track usage telemetry.
> >>
> >>
> > Someone of us w/ standing volunteers that they have made the move (was
> what
> > I was thinking).
> > S
> >
> >
> >
> >
> >>
>  On Mar 18, 2021, at 11:29 AM, Stack  wrote:
> >>>
> >>> On Wed, Mar 17, 2021 at 1:49 PM Andrew Purtell 
> >> wrote:
> >>>
>  I would like to propose we update the 'stable' release pointer,
> >> currently
>  pointing at 2.3.4, to 2.4.2.
> 
>  In my testing with aggressive chaos and ITBLL (but in, unfortunately,
> >> due
>  to resource constraints, in small cluster settings of approximately 10
>  nodes) 2.4.2 is very stable.
> 
>  Our sister project Phoenix has updated their build system to support
>  building against 2.4.1 and later, and the stability of their unit and
>  integration test suite is not impacted by any known HBase issue.
> 
>  If there is other criteria that should be considered, I'd like for us
> to
>  discuss it. Does there need to be public acknowledgement of a
> production
>  user? At scale? (How would we know?) Would you like me to attempt an
>  at-scale test? On the order of 100 nodes might be possible? If so,
> what
>  should be the test scenario and criteria for success? What
> distinguishes
>  2.3.x (2.3.4) from 2.4.x (2.4.2) at this point? What would be the
> >> area(s)
>  of concern with respect to moving the stable pointer forward?
> 
> 
> >>> I suggest a happy production deploy as a prerequisite to moving the
> >> pointer.
> >>> S
> >>>
> >>>
> >>>
>  --
>  Best regards,
>  Andrew
> 
>  Words like orphans lost among the crosstalk, meaning torn from truth's
>  decrepit hands
>   - A23, Crosstalk
> 
> >>
>


[jira] [Resolved] (HBASE-25672) Backport HBASE-25608 to branch-1

2021-03-22 Thread Michael Stack (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack resolved HBASE-25672.
---
Fix Version/s: 1.7.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged to branch-1. Thanks for the PR [~lineyshinya]

> Backport HBASE-25608 to branch-1
> 
>
> Key: HBASE-25672
> URL: https://issues.apache.org/jira/browse/HBASE-25672
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Shinya Yoshida
>Assignee: Shinya Yoshida
>Priority: Major
> Fix For: 1.7.0
>
>






[jira] [Resolved] (HBASE-25683) Simplify UTs using DummyServer

2021-03-22 Thread Michael Stack (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Stack resolved HBASE-25683.
---
Fix Version/s: 3.0.0-alpha-1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged. Nice cleanup. Thanks for the PR [~Ddupg]

> Simplify UTs using DummyServer
> --
>
> Key: HBASE-25683
> URL: https://issues.apache.org/jira/browse/HBASE-25683
> Project: HBase
>  Issue Type: Test
>  Components: test
>Affects Versions: 3.0.0-alpha-1
>Reporter: Sun Xin
>Assignee: Sun Xin
>Priority: Trivial
> Fix For: 3.0.0-alpha-1
>
>



