Re: [ANNOUNCE] PhoenixCon 2016 on Wed, May 25th 9am-1pm

2016-04-26 Thread Anil Gupta
Hi James,
I spoke to my manager and he is fine with the idea of giving the talk. Now he
is going to ask higher management for final approval. I am assuming there is
still a slot for my talk in the use case section, so I should go ahead with my
approval process. Correct?

Thanks,
Anil Gupta 
Sent from my iPhone

> On Apr 26, 2016, at 5:56 PM, James Taylor  wrote:
> 
> We invite you to attend the inaugural PhoenixCon on Wed, May 25th 9am-1pm
> (the day after HBaseCon) hosted by Salesforce.com in San Francisco. There
> will be two tracks: one for use cases and one for internals. Drop me a note
> if you're interested in giving a talk. To RSVP and for more details, see
> here[1].
> 
> Thanks,
> James
> 
> [1] http://www.meetup.com/SF-Bay-Area-Apache-Phoenix-Meetup/events/230545182


[ANNOUNCE] PhoenixCon 2016 on Wed, May 25th 9am-1pm

2016-04-26 Thread James Taylor
We invite you to attend the inaugural PhoenixCon on Wed, May 25th 9am-1pm
(the day after HBaseCon) hosted by Salesforce.com in San Francisco. There
will be two tracks: one for use cases and one for internals. Drop me a note
if you're interested in giving a talk. To RSVP and for more details, see
here[1].

Thanks,
James

[1] http://www.meetup.com/SF-Bay-Area-Apache-Phoenix-Meetup/events/230545182


Re: Slow sync cost

2016-04-26 Thread Saad Mufti
That is interesting. Would it be possible for you to share what GC settings
you ended up on that gave you the most predictable performance?

Thanks.


Saad


On Tue, Apr 26, 2016 at 11:56 AM, Bryan Beaudreault <
bbeaudrea...@hubspot.com> wrote:

> We were seeing this for a while with our CDH5 HBase clusters too. We
> eventually correlated it very closely to GC pauses. Through heavily tuning
> our GC we were able to drastically reduce the logs, by keeping most GC's
> under 100ms.
>
> On Tue, Apr 26, 2016 at 6:25 AM Saad Mufti  wrote:
>
> > From what I can see in the source code, the default is actually even
> lower
> > at 100 ms (can be overridden with hbase.regionserver.hlog.slowsync.ms).
> >
> > 
> > Saad
> >
> >
> > On Tue, Apr 26, 2016 at 3:13 AM, Kevin Bowling  >
> > wrote:
> >
> > > I see similar log spam while the system has reasonable performance.  Was
> the
> > > 250ms default chosen with SSDs and 10GbE in mind or something?  I guess
> > I'm
> > > surprised a sync write several times through JVMs to 2 remote datanodes
> > > would be expected to consistently happen that fast.
> > >
> > > Regards,
> > >
> > > On Mon, Apr 25, 2016 at 12:18 PM, Saad Mufti 
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > In our large HBase cluster based on CDH 5.5 in AWS, we're constantly
> > > seeing
> > > > the following messages in the region server logs:
> > > >
> > > > 2016-04-25 14:02:55,178 INFO
> > > > org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 258
> > ms,
> > > > current pipeline:
> > > > [DatanodeInfoWithStorage[10.99.182.165:50010
> > > > ,DS-281d4c4f-23bd-4541-bedb-946e57a0f0fd,DISK],
> > > > DatanodeInfoWithStorage[10.99.182.236:50010
> > > > ,DS-f8e7e8c9-6fa0-446d-a6e5-122ab35b6f7c,DISK],
> > > > DatanodeInfoWithStorage[10.99.182.195:50010
> > > > ,DS-3beae344-5a4a-4759-ad79-a61beabcc09d,DISK]]
> > > >
> > > >
> > > > These happen regularly while HBase appears to be operating normally
> with
> > > > decent read and write performance. We do have occasional performance
> > > > problems when regions are auto-splitting, and at first I thought this
> > was
> > > related but now I see it happens all the time.
> > > >
> > > >
> > > > Can someone explain what this really means and should we be
> concerned?
> > I
> > > > tracked down the source code that outputs it in
> > > >
> > > >
> > > >
> > >
> >
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
> > > >
> > > > but after going through the code I think I'd need to know much more
> > about
> > > > the code to glean anything from it or the associated JIRA ticket
> > > > https://issues.apache.org/jira/browse/HBASE-11240.
> > > >
> > > > Also, what is this "pipeline" the ticket and code talks about?
> > > >
> > > > Thanks in advance for any information and/or clarification anyone can
> > > > provide.
> > > >
> > > > 
> > > >
> > > > Saad
> > > >
> > >
> >
>


Re: Retiring empty regions

2016-04-26 Thread Nick Dimiduk
I'm looking forward to your talk, Vlad.

In the meantime, I filed HBASE-15712. We'll get our implementation posted
up there. We have these deployed on one of the masters, running daily with
cron.
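For illustration only (this is not the implementation that will go to
HBASE-15712), merging two adjacent regions that have already been identified
as empty looks roughly like this with the 1.x Admin API; the encoded region
names below are placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class MergeEmptyRegionsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Placeholder encoded region names; a real tool would find regions with
      // zero store files / empty memstore by walking ClusterStatus or hbase:meta.
      byte[] emptyRegionA = Bytes.toBytes("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
      byte[] emptyRegionB = Bytes.toBytes("bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb");
      // forcible=false: only merge regions that are adjacent in the key space.
      admin.mergeRegions(emptyRegionA, emptyRegionB, false);
    }
  }
}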

@Mikhail, to get this feature into the normalizer, how about this: let's
add a minimum-region-count property to user tables. This can be set when
someone creates a table with split points, or maintained manually. The
normalizer can use that as a constraint to guide its convergence.
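Sketching the idea in code (the attribute name below is made up purely to
illustrate the proposal; nothing like it exists today), given an Admin handle
and the usual 1.x client imports:

// Hypothetical attribute name -- illustration of the proposal only.
HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("events"));
desc.addFamily(new HColumnDescriptor("d"));
desc.setValue("NORMALIZER_MIN_REGION_COUNT", "16");
// Pre-split into 16 regions; the normalizer would then never merge below 16.
admin.createTable(desc, Bytes.toBytes("00000000"), Bytes.toBytes("ffffffff"), 16);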

On Wed, Apr 20, 2016 at 5:18 PM, Vladimir Rodionov 
wrote:

> >I'd love to hear your thoughts on this design, Vlad. Maybe you'd like to
> >write up a post for the blog? Meanwhile, I'm sure a couple of us on
> here
> >on the list would appreciate your Cliff's Notes version. I can take this
> >into account for my v2 schema design.
>
> Nick, there will be a presentation on time-series HBase (hbasecon.com)
> Come
> join us :)
>
>
> On Mon, Apr 4, 2016 at 8:34 AM, Nick Dimiduk  wrote:
>
> > > Crazy idea, but you might be able to take a stripped-down version of the
> region
> > > normalizer code and make a Tool to run? Requesting split or merge is
> done
> > > through the client API, and the only weighing information you need is
> > > whether a region is empty or not, and that you could find out too?
> >
> > Yeah, that's the direction I'm headed.
> >
> > > A bit off topic, but I think unfortunately the region normalizer now
> ignores
> > > empty regions to avoid undoing pre-split on the table.
> >
> > Unfortunate indeed. Maybe we should be keeping around the initial splits
> > list as a metadata attribute on the table?
> >
> > > With the right row-key design you will never have empty regions due to
> TTL.
> >
> > I'd love to hear your thoughts on this design, Vlad. Maybe you'd like to
> > write up a post for the blog? Meanwhile, I'm sure a couple of us on
> here
> > on the list would appreciate your Cliff's Notes version. I can take this
> > into account for my v2 schema design.
> >
> > > So Nick, merge on 1.1 is not recommended??? It was working very well on
> > > previous versions. Does ProcV2 really impact it that badly??
> >
> > How to answer here carefully... I have no reason to believe merge is not
> > working on 1.1. I've been on the wrong end of enough "regions stuck in
> > transition" support tickets that I'm not keen to put undue stress on my
> > master. ProcV2 insures against many scenarios that cause master trauma,
> > hence my interest in the implementation details and my preference for
> > cluster administration tasks that use it as their source of authority.
> >
> > Thanks for the thoughts folks.
> > -n
> >
> > On Fri, Apr 1, 2016 at 10:52 AM, Jean-Marc Spaggiari <
> > jean-m...@spaggiari.org> wrote:
> >
> > > ;) That was not the question ;)
> > >
> > > So Nick, merge on 1.1 is not recommended??? It was working very well on
> > > previous versions. Does ProcV2 really impact it that badly??
> > >
> > > JMS
> > >
> > > 2016-04-01 13:49 GMT-04:00 Vladimir Rodionov :
> > >
> > > > >> This is something
> > > > >> which makes it far less useful for time-series databases with
> short
> > > TTL
> > > > on
> > > > >> the tables.
> > > >
> > > > With the right row-key design you will never have empty regions due to
> > TTL.
> > > >
> > > > -Vlad
> > > >
> > > > On Thu, Mar 31, 2016 at 10:31 PM, Mikhail Antonov <
> > olorinb...@gmail.com>
> > > > wrote:
> > > >
> > > > > Crazy idea, but you might be able to take a stripped-down version of the
> > > region
> > > > > normalizer code and make a Tool to run? Requesting split or merge
> is
> > > done
> > > > > through the client API, and the only weighing information you need
> is
> > > > > whether a region is empty or not, and that you could find out too?
> > > > >
> > > > >
> > > > > "Short of upgrading to 1.2 for the region normalizer,"
> > > > >
> > > > > A bit off topic, but I think unfortunately the region normalizer now
> > > ignores
> > > > > empty regions to avoid undoing pre-split on the table. This is
> > > something
> > > > > which makes it far less useful for time-series databases with short
> > TTL
> > > > on
> > > > > the tables. We'll need to address that.
> > > > >
> > > > > -Mikhail
> > > > >
> > > > > On Thu, Mar 31, 2016 at 9:56 PM, Nick Dimiduk 
> > > > wrote:
> > > > >
> > > > > > Hi folks,
> > > > > >
> > > > > > I have a table with TTL enabled. It's been receiving data for a
> > while
> > > > > > beyond the TTL and I now have a number of empty regions. I'd like
> > to
> > > > drop
> > > > > > those empty regions to free up heap space on the region servers
> and
> > > > > reduce
> > > > > > master load. I'm running a 1.1 derivative.
> > > > > >
> > > > > > The only threads I found on this topic are from circa 0.92
> > timeframe.
> > > > > >
> > > > > > Short of upgrading to 1.2 for the region normalizer, what's the
> > > > > recommended
> > > > > > method of cleaning up this cruft? Should I be merging empty
> regions
> > > > into

Re: Slow sync cost

2016-04-26 Thread Bryan Beaudreault
We were seeing this for a while with our CDH5 HBase clusters too. We
eventually correlated it very closely to GC pauses. Through heavy tuning of
our GC we were able to drastically reduce these log messages, by keeping most
GCs under 100ms.
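(Purely as an illustration of the kind of knobs involved -- these are not the
settings referenced above, and the values are placeholders that need tuning
per heap size and workload -- G1GC tuning for region servers is usually done
in hbase-env.sh along these lines:)

# Illustrative placeholders only; tune per cluster.
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=100 \
  -XX:+ParallelRefProcEnabled \
  -XX:G1HeapRegionSize=32m"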

On Tue, Apr 26, 2016 at 6:25 AM Saad Mufti  wrote:

> From what I can see in the source code, the default is actually even lower
> at 100 ms (can be overridden with hbase.regionserver.hlog.slowsync.ms).
>
> 
> Saad
>
>
> On Tue, Apr 26, 2016 at 3:13 AM, Kevin Bowling 
> wrote:
>
> > I see similar log spam while the system has reasonable performance.  Was the
> > 250ms default chosen with SSDs and 10GbE in mind or something?  I guess
> I'm
> > surprised a sync write several times through JVMs to 2 remote datanodes
> > would be expected to consistently happen that fast.
> >
> > Regards,
> >
> > On Mon, Apr 25, 2016 at 12:18 PM, Saad Mufti 
> wrote:
> >
> > > Hi,
> > >
> > > In our large HBase cluster based on CDH 5.5 in AWS, we're constantly
> > seeing
> > > the following messages in the region server logs:
> > >
> > > 2016-04-25 14:02:55,178 INFO
> > > org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 258
> ms,
> > > current pipeline:
> > > [DatanodeInfoWithStorage[10.99.182.165:50010
> > > ,DS-281d4c4f-23bd-4541-bedb-946e57a0f0fd,DISK],
> > > DatanodeInfoWithStorage[10.99.182.236:50010
> > > ,DS-f8e7e8c9-6fa0-446d-a6e5-122ab35b6f7c,DISK],
> > > DatanodeInfoWithStorage[10.99.182.195:50010
> > > ,DS-3beae344-5a4a-4759-ad79-a61beabcc09d,DISK]]
> > >
> > >
> > > These happen regularly while HBase appears to be operating normally with
> > > decent read and write performance. We do have occasional performance
> > > problems when regions are auto-splitting, and at first I thought this
> was
> > > related but now I see it happens all the time.
> > >
> > >
> > > Can someone explain what this really means and should we be concerned?
> I
> > > tracked down the source code that outputs it in
> > >
> > >
> > >
> >
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
> > >
> > > but after going through the code I think I'd need to know much more
> about
> > > the code to glean anything from it or the associated JIRA ticket
> > > https://issues.apache.org/jira/browse/HBASE-11240.
> > >
> > > Also, what is this "pipeline" the ticket and code talks about?
> > >
> > > Thanks in advance for any information and/or clarification anyone can
> > > provide.
> > >
> > > 
> > >
> > > Saad
> > >
> >
>


Re: Re: Re: question on "drain region servers"

2016-04-26 Thread Ted Yu
Please see HBASE-4298 where this feature was introduced.
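For the mechanics: servers are added to the drain list manually (nothing adds
them automatically), typically by running the script against the cluster.
Roughly, assuming the usual invocation (exact syntax may vary by release, so
check the usage text in the script itself; the hostname is a placeholder):

# Add a region server to the draining list, list current entries, then remove it.
bin/hbase org.jruby.Main bin/draining_servers.rb add rs1.example.com
bin/hbase org.jruby.Main bin/draining_servers.rb list
bin/hbase org.jruby.Main bin/draining_servers.rb remove rs1.example.com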

On Tue, Apr 26, 2016 at 5:12 AM, WangYQ  wrote:

> yes,  there is a tool graceful_stop.sh to gracefully stop a regionserver, and
> it can move the regions back to the rs after the rs comes back.
> but I cannot find the relation with drain region servers...
>
>
> I think the drain region servers function is good, but I cannot think of a
> practical use case
>
>
>
>
>
>
>
>
> At 2016-04-26 16:01:55, "Dejan Menges"  wrote:
> >One of the use cases we use it for is graceful stop of a regionserver - you unload
> >regions from the server before you restart it. Of course, after restart
> you
> >expect HBase to move regions back.
> >
> >Now I'm not really remembering correctly, but I kinda remember that one of
> >the features was at least that it will move back regions which were
> already
> >there, hence not destroy too much block locality.
> >
> >On Tue, Apr 26, 2016 at 8:15 AM WangYQ  wrote:
> >
> >> thanks
> >> in hbase 0.99.0,  I find the rb file: draining_servers.rb
> >>
> >>
> >> i have some suggestions on this tool:
> >> 1. if I add rs hs1 to draining_servers, when hs1 restart, the zk node
> >> still exists in zk, but hmaster will not treat hs1 as draining_servers
> >> i think when we add a hs to draining_servers, we do not need to
> store
> >> the start code in zk, just store the hostName and port
> >> 2.  we add hs1 to draining_servers, but if hs1 always restart, we will
> >> need to add hs1 several times
> >>   when we need to delete the draining_servers info of hs1, we  will
> need
> >> to delete hs1 several times
> >>
> >>
> >>
> >> finally, what is the original motivation of this tool, some scenario
> >> descriptions are good.
> >>
> >>
> >>
> >>
> >>
> >>
> >> At 2016-04-26 11:33:10, "Ted Yu"  wrote:
> >> >Please take a look at:
> >> >bin/draining_servers.rb
> >> >
> >> >On Mon, Apr 25, 2016 at 8:12 PM, WangYQ 
> >> wrote:
> >> >
> >> >> in hbase,  I find there is a "drain regionServer" feature
> >> >>
> >> >>
> >> >> if a rs is added to drain regionServer in ZK, then regions will not
> be
> >> >> moved to these regionServers
> >> >>
> >> >>
> >> >> but, how can an rs be added to drain regionServer, do we add it manually
> or
> >> will the rs
> >> >> add itself automatically
> >>
>


Re:Re: Re: question on "drain region servers"

2016-04-26 Thread WangYQ
Yes, there is a tool, graceful_stop.sh, to gracefully stop a regionserver, and it can
move the regions back to the RS after the RS comes back.
But I cannot find the relation with drain region servers...


I think the drain region servers function is good, but I cannot think of a
practical use case.








At 2016-04-26 16:01:55, "Dejan Menges"  wrote:
>One of the use cases we use it for is graceful stop of a regionserver - you unload
>regions from the server before you restart it. Of course, after restart you
>expect HBase to move regions back.
>
>Now I'm not really remembering correctly, but I kinda remember that one of
>the features was at least that it will move back regions which were already
>there, hence not destroy too much block locality.
>
>On Tue, Apr 26, 2016 at 8:15 AM WangYQ  wrote:
>
>> thanks
>> in hbase 0.99.0,  I find the rb file: draining_servers.rb
>>
>>
>> i have some suggestions on this tool:
>> 1. if I add rs hs1 to draining_servers, when hs1 restart, the zk node
>> still exists in zk, but hmaster will not treat hs1 as draining_servers
>> i think when we add a hs to draining_servers, we do not need to store
>> the start code in zk, just store the hostName and port
>> 2.  we add hs1 to draining_servers, but if hs1 always restart, we will
>> need to add hs1 several times
>>   when we need to delete the draining_servers info of hs1, we  will  need
>> to delete hs1 several times
>>
>>
>>
>> finally, what is the original motivation of this tool, some scenario
>> descriptions are good.
>>
>>
>>
>>
>>
>>
>> At 2016-04-26 11:33:10, "Ted Yu"  wrote:
>> >Please take a look at:
>> >bin/draining_servers.rb
>> >
>> >On Mon, Apr 25, 2016 at 8:12 PM, WangYQ 
>> wrote:
>> >
>> >> in hbase,  I find there is a "drain regionServer" feature
>> >>
>> >>
>> >> if a rs is added to drain regionServer in ZK, then regions will not be
>> >> moved to these regionServers
>> >>
>> >>
>> >> but, how can an rs be added to drain regionServer, do we add it manually or
>> will the rs
>> >> add itself automatically
>>


Re: Slow sync cost

2016-04-26 Thread Saad Mufti
From what I can see in the source code, the default is actually even lower
at 100 ms (can be overridden with hbase.regionserver.hlog.slowsync.ms).
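If the warnings are considered noise, the threshold can be raised cluster-wide;
an illustrative hbase-site.xml override (the 500 ms value is an arbitrary
example) would be:

<property>
  <name>hbase.regionserver.hlog.slowsync.ms</name>
  <value>500</value>
</property>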


Saad


On Tue, Apr 26, 2016 at 3:13 AM, Kevin Bowling 
wrote:

> I see similar log spam while the system has reasonable performance.  Was the
> 250ms default chosen with SSDs and 10GbE in mind or something?  I guess I'm
> surprised a sync write several times through JVMs to 2 remote datanodes
> would be expected to consistently happen that fast.
>
> Regards,
>
> On Mon, Apr 25, 2016 at 12:18 PM, Saad Mufti  wrote:
>
> > Hi,
> >
> > In our large HBase cluster based on CDH 5.5 in AWS, we're constantly
> seeing
> > the following messages in the region server logs:
> >
> > 2016-04-25 14:02:55,178 INFO
> > org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 258 ms,
> > current pipeline:
> > [DatanodeInfoWithStorage[10.99.182.165:50010
> > ,DS-281d4c4f-23bd-4541-bedb-946e57a0f0fd,DISK],
> > DatanodeInfoWithStorage[10.99.182.236:50010
> > ,DS-f8e7e8c9-6fa0-446d-a6e5-122ab35b6f7c,DISK],
> > DatanodeInfoWithStorage[10.99.182.195:50010
> > ,DS-3beae344-5a4a-4759-ad79-a61beabcc09d,DISK]]
> >
> >
> > These happen regularly while HBase appears to be operating normally with
> > decent read and write performance. We do have occasional performance
> > problems when regions are auto-splitting, and at first I thought this was
> > related but now I see it happens all the time.
> >
> >
> > Can someone explain what this really means and should we be concerned? I
> > tracked down the source code that outputs it in
> >
> >
> >
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
> >
> > but after going through the code I think I'd need to know much more about
> > the code to glean anything from it or the associated JIRA ticket
> > https://issues.apache.org/jira/browse/HBASE-11240.
> >
> > Also, what is this "pipeline" the ticket and code talks about?
> >
> > Thanks in advance for any information and/or clarification anyone can
> > provide.
> >
> > 
> >
> > Saad
> >
>


Re: Re: question on "drain region servers"

2016-04-26 Thread Dejan Menges
One of the use cases we use it for is graceful stop of a regionserver - you unload
regions from the server before you restart it. Of course, after the restart you
expect HBase to move the regions back.

I don't remember exactly, but I recall that one of the features was that it will
move back the regions which were already there, and hence not destroy too much
block locality.

On Tue, Apr 26, 2016 at 8:15 AM WangYQ  wrote:

> thanks
> in hbase 0.99.0,  I find the rb file: draining_servers.rb
>
>
> i have some suggestions on this tool:
> 1. if I add rs hs1 to draining_servers, when hs1 restart, the zk node
> still exists in zk, but hmaster will not treat hs1 as draining_servers
> i think when we add a hs to draining_servers, we do not need to store
> the start code in zk, just store the hostName and port
> 2.  we add hs1 to draining_servers, but if hs1 always restart, we will
> need to add hs1 several times
>   when we need to delete the draining_servers info of hs1, we  will  need
> to delete hs1 several times
>
>
>
> finally, what is the original motivation of this tool, some scenario
> descriptions are good.
>
>
>
>
>
>
> At 2016-04-26 11:33:10, "Ted Yu"  wrote:
> >Please take a look at:
> >bin/draining_servers.rb
> >
> >On Mon, Apr 25, 2016 at 8:12 PM, WangYQ 
> wrote:
> >
> >> in hbase,  I find there is a "drain regionServer" feature
> >>
> >>
> >> if a rs is added to drain regionServer in ZK, then regions will not be
> >> moved to these regionServers
> >>
> >>
> >> but, how can an rs be added to drain regionServer, do we add it manually or
> will the rs
> >> add itself automatically
>


Re: Slow sync cost

2016-04-26 Thread Kevin Bowling
I see similar log spam while the system has reasonable performance. Was the
250ms default chosen with SSDs and 10GbE in mind or something? I guess I'm
surprised that a sync write passing through several JVMs to 2 remote datanodes
would be expected to consistently happen that fast.

Regards,

On Mon, Apr 25, 2016 at 12:18 PM, Saad Mufti  wrote:

> Hi,
>
> In our large HBase cluster based on CDH 5.5 in AWS, we're constantly seeing
> the following messages in the region server logs:
>
> 2016-04-25 14:02:55,178 INFO
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Slow sync cost: 258 ms,
> current pipeline:
> [DatanodeInfoWithStorage[10.99.182.165:50010
> ,DS-281d4c4f-23bd-4541-bedb-946e57a0f0fd,DISK],
> DatanodeInfoWithStorage[10.99.182.236:50010
> ,DS-f8e7e8c9-6fa0-446d-a6e5-122ab35b6f7c,DISK],
> DatanodeInfoWithStorage[10.99.182.195:50010
> ,DS-3beae344-5a4a-4759-ad79-a61beabcc09d,DISK]]
>
>
> These happen regularly while HBase appears to be operating normally with
> decent read and write performance. We do have occasional performance
> problems when regions are auto-splitting, and at first I thought this was
> related but now I see it happens all the time.
>
>
> Can someone explain what this really means and should we be concerned? I
> tracked down the source code that outputs it in
>
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
>
> but after going through the code I think I'd need to know much more about
> the code to glean anything from it or the associated JIRA ticket
> https://issues.apache.org/jira/browse/HBASE-11240.
>
> Also, what is this "pipeline" the ticket and code talks about?
>
> Thanks in advance for any information and/or clarification anyone can
> provide.
>
> 
>
> Saad
>


Re:Re: question on "drain region servers"

2016-04-26 Thread WangYQ
Thanks.
In HBase 0.99.0, I found the rb file draining_servers.rb.


I have some suggestions on this tool:
1. If I add RS hs1 to draining_servers and hs1 restarts, the ZK node still
exists in ZK, but the HMaster will no longer treat hs1 as a draining server.
I think when we add an RS to draining_servers, we do not need to store the
start code in ZK, just the hostName and port.
2. If we add hs1 to draining_servers but hs1 keeps restarting, we will need to
add hs1 several times;
and when we need to delete the draining_servers info for hs1, we will need to
delete hs1 several times.



Finally, what is the original motivation for this tool? Some scenario
descriptions would be good.






At 2016-04-26 11:33:10, "Ted Yu"  wrote:
>Please take a look at:
>bin/draining_servers.rb
>
>On Mon, Apr 25, 2016 at 8:12 PM, WangYQ  wrote:
>
>> in hbase,  I find there is a "drain regionServer" feature
>>
>>
>> if a rs is added to drain regionServer in ZK, then regions will not be
>> moved to these regionServers
>>
>>
>> but, how can an rs be added to drain regionServer, do we add it manually or
>> will the rs add itself automatically