Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

2017-01-14 Thread Jerry He
Hi, Andrew

Stack was talking to me about this area when I met him in the HBase Meetup
last December.
Let me take a shot at HBASE-14375.

Thanks,

Jerry

On Sat, Jan 14, 2017 at 9:22 PM, Andrew Purtell 
wrote:

>
>
> > On Jan 14, 2017, at 9:07 PM, Jerry He  wrote:
> >
> > I think it will be a big disappointment for the community if the
> > hbase-spark module is not going into 2.0.
> > I understand there are still a few blockers, including HBASE-16179.
>
> Patches welcome. :-)
>
>
> > We have it in our distribution, probably in other vendors' as well.  It is
> > a little easier for us because we can be flexible on the supported
> > Spark/Scala version combinations and the APIs.
> > But a major release still without a good Spark story for the HBase open
> > source community does not look good.
> >
> > Jerry
> >
> >> On Sat, Jan 14, 2017 at 4:52 PM, Ted Yu  wrote:
> >>
> >> I agree with Devaraj's assessment w.r.t. hbase-spark module in master
> >> (which is becoming branch-2).
> >>
> >> Cheers
> >>
> >>
> >>
> >> On Mon, Nov 21, 2016 at 11:46 AM, Devaraj Das 
> >> wrote:
> >>
> >>> Hi Sean, I did a quick check with someone from the Spark team here and his
> >>> opinion was that the hbase-spark module as it currently stands can be used
> >>> by downstream users to do basic stuff and to try some simple things out,
> >>> etc. The integration is improving.
> >>> I think we should get what we have in 2.0 (which is the default action
> >>> anyways).
> >>> Thanks
> >>> Devaraj
> >>> 
> >>> From: Sean Busbey 
> >>> Sent: Wednesday, November 16, 2016 9:49 AM
> >>> To: dev
> >>> Subject: [DISCUSS] hbase-spark module in branch-1 and branch-2
> >>>
> >>> Hi folks!
> >>>
> >>> With 2.0 releases coming up, I'd like to revive our prior discussion
> >>> on the readiness of the hbase-spark module for downstream users.
> >>>
> >>> We've had a ticket for tracking the milestones set up for inclusion in
> >>> branch-1 releases for about 1.5 years:
> >>>
> >>> https://issues.apache.org/jira/browse/HBASE-14160
> >>>
> >>> We still haven't gotten all of the blocker issues completed, AFAIK.
> >>>
> >>> Is anyone interested in volunteering to knock the rest of these out?
> >>>
> >>> If they aren't, shall we plan to leave hbase-spark in master and
> >>> revert it from branch-2 once it forks for the HBase 2.0 release line?
> >>>
> >>> This feature isn't a blocker for 2.0; just as we've been planning to
> >>> add the hbase-spark module to some 1.y release we can also include it
> >>> in a 2.1+ release.
> >>>
> >>> This does appear to be a feature our downstream users could benefit
> >>> from, so I'd hate to continue the current situation where no official
> >>> releases include it. This is especially true now that we're looking at
> >>> ways to handle changes between Spark 1.6 and Spark 2.0 in HBASE-16179.
> >>>
> >>> -
> >>> busbey
> >>>
> >>>
> >>
>


Re: Merge and HMerge

2017-01-14 Thread Lars George
I think that makes sense. The tool with its custom code dates back to when we
had no built-in version. I am all for removing all of the tools and leaving
only the API call. For an admin, that is then the same as calling flush or
split.

No?

Lars

Sent from my iPhone

On 15 Jan 2017, at 04:25, Stephen Jiang  wrote:

>> If you remove the util.Merge tool, how then does an operator ask for a merge
> in its absence?
> 
> We have a shell command to merge regions.  In the past, it called the same
> RS-side code.  I don't think there is a need to keep util.Merge (even if we
> really want it, we can have this utility call HBaseAdmin.mergeRegions, which
> is the same path the merge command takes through 'hbase shell').
> 
> Thanks
> Stephen
> 
>> On Fri, Jan 13, 2017 at 11:29 PM, Stack  wrote:
>> 
>> On Fri, Jan 13, 2017 at 7:16 PM, Stephen Jiang 
>> wrote:
>> 
>>> Revive this thread
>>> 
>>> I am in the process of removing Region Server side merge (and split)
>>> transaction code in master branch; as now we have merge (and split)
>>> procedure(s) from master doing the same thing.
>>> 
>>> 
>> Good (Issue?)
>> 
>> 
>>> The Merge tool depends on RS-side merge code.  I'd like to use this chance
>>> to remove the util.Merge tool.  This is for 2.0 and up releases only.
>>> Deprecation does not work here; as keeping the RS-side merge code would
>>> have duplicate logic in source code and make the new Assignment manager
>>> code more complicated.
>>> 
>>> 
>> Could util.Merge be changed to ask the Master run the merge (via AMv2)?
>> 
>> If you remove the util.Merge tool, how then does an operator ask for a
>> merge in its absence?
>> 
>> Thanks Stephen
>> 
>> S
>> 
>> 
>>> Please let me know whether you have objection.
>>> 
>>> Thanks
>>> Stephen
>>> 
>>> PS.  I could deprecate the HMerge code if anyone is really using it.  It has
>>> its own standalone logic (it is supposed to work offline, dangerously, and
>>> to merge more than 2 regions; util.Merge and the shell do not support this
>>> functionality for now).
>>> 
>>> On Wed, Nov 16, 2016 at 11:04 AM, Enis Söztutar 
>>> wrote:
>>> 
>>>> @Appy what is not clear from above?
>>>>
>>>> I think we should get rid of both Merge and HMerge.
>>>>
>>>> We should not have any tool which will work in offline mode by going over
>>>> the HDFS data. Seems very brittle to be broken when things get changed.
>>>> Only use case I can think of is that somehow you end up with a lot of
>>>> regions and you cannot bring the cluster back up because of OOMs, etc and
>>>> you have to reduce the number of regions in offline mode. However, we did
>>>> not see this kind of thing in any of our customers for the last couple of
>>>> years so far.
>>>>
>>>> I think we should seriously look into improving normalizer and enabling
>>>> that by default for all the tables. Ideally, normalizer should be running
>>>> much more frequently, and should be configured with higher-level goals and
>>>> heuristics. Like on average how many regions per node, etc and should be
>>>> looking at the global state (like the balancer) to decide on split / merge
>>>> points.
>>>>
>>>> Enis
>>>>
>>>> On Wed, Nov 16, 2016 at 1:17 AM, Apekshit Sharma
>>>> wrote:
>>>>
>>>>> bq. HMerge can merge multiple regions by going over the list of
>>>>> regions and checking
>>>>> their sizes.
>>>>> bq. But both of these tools (Merge and HMerge) are very dangerous
>>>>>
>>>>> I came across HMerge and it looks like dead code. Isn't referenced from
>>>>> anywhere except one test. (This is what lars also pointed out in the first
>>>>> email too).
>>>>> It would make perfect sense if it was a tool or was being referenced from
>>>>> somewhere, but with lack of either of that, am a bit confused here.
>>>>> @Enis, you seem to know everything about them, please educate me.
>>>>> Thanks
>>>>> - Appy
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Sep 29, 2016 at 12:43 AM, Enis Söztutar
>>>>> wrote:
>>>>>
>>>>>> Merge has very limited usability since it can do a single merge and can
>>>>>> only run when HBase is offline.
>>>>>> HMerge can merge multiple regions by going over the list of regions and
>>>>>> checking their sizes.
>>>>>> And of course we have the "supported" online merge which is the shell
>>>>>> command.
>>>>>>
>>>>>> But both of these tools (Merge and HMerge) are very dangerous I think. I
>>>>>> would say we should deprecate both to be replaced by the online merger
>>>>>> tool. We should not allow offline merge at all. I fail to see the usecase
>>>>>> that you have to use an offline merge.
>>>>>>
>>>>>> Enis
>>>>>>
>>>>>> On Wed, Sep 28, 2016 at 7:32 AM, Lars George <lars.geo...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> Sorry to resurrect this old thread, but working on the book update, I
>>>>>>> came across the 

Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

2017-01-14 Thread Andrew Purtell


> On Jan 14, 2017, at 9:07 PM, Jerry He  wrote:
> 
> I think it will be a big disappointment for the community if the
> hbase-spark module is not going into 2.0.
> I understand there are still a few blockers, including HBASE-16179.

Patches welcome. :-) 


> We have it in our distribution, probably in other vendors' as well.  It is
> a little easier for us because we can be flexible on the supported
> Spark/Scala version combinations and the APIs.
> But a major release still without a good Spark story for the HBase open
> source community does not look good.
> 
> Jerry
> 
>> On Sat, Jan 14, 2017 at 4:52 PM, Ted Yu  wrote:
>> 
>> I agree with Devaraj's assessment w.r.t. hbase-spark module in master
>> (which is becoming branch-2).
>> 
>> Cheers
>> 
>> 
>> 
>> On Mon, Nov 21, 2016 at 11:46 AM, Devaraj Das 
>> wrote:
>> 
>>> Hi Sean, I did a quick check with someone from the Spark team here and his
>>> opinion was that the hbase-spark module as it currently stands can be used
>>> by downstream users to do basic stuff and to try some simple things out,
>>> etc. The integration is improving.
>>> I think we should get what we have in 2.0 (which is the default action
>>> anyways).
>>> Thanks
>>> Devaraj
>>> 
>>> From: Sean Busbey 
>>> Sent: Wednesday, November 16, 2016 9:49 AM
>>> To: dev
>>> Subject: [DISCUSS] hbase-spark module in branch-1 and branch-2
>>> 
>>> Hi folks!
>>> 
>>> With 2.0 releases coming up, I'd like to revive our prior discussion
>>> on the readiness of the hbase-spark module for downstream users.
>>> 
>>> We've had a ticket for tracking the milestones set up for inclusion in
>>> branch-1 releases for about 1.5 years:
>>> 
>>> https://issues.apache.org/jira/browse/HBASE-14160
>>> 
>>> We still haven't gotten all of the blocker issues completed, AFAIK.
>>> 
>>> Is anyone interested in volunteering to knock the rest of these out?
>>> 
>>> If they aren't, shall we plan to leave hbase-spark in master and
>>> revert it from branch-2 once it forks for the HBase 2.0 release line?
>>> 
>>> This feature isn't a blocker for 2.0; just as we've been planning to
>>> add the hbase-spark module to some 1.y release we can also include it
>>> in a 2.1+ release.
>>> 
>>> This does appear to be a feature our downstream users could benefit
>>> from, so I'd hate to continue the current situation where no official
>>> releases include it. This is especially true now that we're looking at
>>> ways to handle changes between Spark 1.6 and Spark 2.0 in HBASE-16179.
>>> 
>>> -
>>> busbey
>>> 
>>> 
>> 


Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

2017-01-14 Thread Jerry He
I think it will be a big disappointment for the community if the
hbase-spark module is not going into 2.0.
I understand there are still a few blockers, including HBASE-16179.
We have it in our distribution, probably in other vendors' as well.  It is
a little easier for us because we can be flexible on the supported
Spark/Scala version combinations and the APIs.
But a major release still without a good Spark story for the HBase open
source community does not look good.

Jerry

On Sat, Jan 14, 2017 at 4:52 PM, Ted Yu  wrote:

> I agree with Devaraj's assessment w.r.t. hbase-spark module in master
> (which is becoming branch-2).
>
> Cheers
>
>
>
> On Mon, Nov 21, 2016 at 11:46 AM, Devaraj Das 
> wrote:
>
> > Hi Sean, I did a quick check with someone from the Spark team here and his
> > opinion was that the hbase-spark module as it currently stands can be used
> > by downstream users to do basic stuff and to try some simple things out,
> > etc. The integration is improving.
> > I think we should get what we have in 2.0 (which is the default action
> > anyways).
> > Thanks
> > Devaraj
> > 
> > From: Sean Busbey 
> > Sent: Wednesday, November 16, 2016 9:49 AM
> > To: dev
> > Subject: [DISCUSS] hbase-spark module in branch-1 and branch-2
> >
> > Hi folks!
> >
> > With 2.0 releases coming up, I'd like to revive our prior discussion
> > on the readiness of the hbase-spark module for downstream users.
> >
> > We've had a ticket for tracking the milestones set up for inclusion in
> > branch-1 releases for about 1.5 years:
> >
> > https://issues.apache.org/jira/browse/HBASE-14160
> >
> > We still haven't gotten all of the blocker issues completed, AFAIK.
> >
> > Is anyone interested in volunteering to knock the rest of these out?
> >
> > If they aren't, shall we plan to leave hbase-spark in master and
> > revert it from branch-2 once it forks for the HBase 2.0 release line?
> >
> > This feature isn't a blocker for 2.0; just as we've been planning to
> > add the hbase-spark module to some 1.y release we can also include it
> > in a 2.1+ release.
> >
> > This does appear to be a feature our downstream users could benefit
> > from, so I'd hate to continue the current situation where no official
> > releases include it. This is especially true now that we're looking at
> > ways to handle changes between Spark 1.6 and Spark 2.0 in HBASE-16179.
> >
> > -
> > busbey
> >
> >
>


Re: Merge and HMerge

2017-01-14 Thread Stephen Jiang
>If you remove the util.Merge tool, how then does an operator ask for a merge
in its absence?

We have a shell command to merge regions.  In the past, it called the same
RS-side code.  I don't think there is a need to keep util.Merge (even if we
really want it, we can have this utility call HBaseAdmin.mergeRegions, which
is the same path the merge command takes through 'hbase shell').

Thanks
Stephen

On Fri, Jan 13, 2017 at 11:29 PM, Stack  wrote:

> On Fri, Jan 13, 2017 at 7:16 PM, Stephen Jiang 
> wrote:
>
> > Revive this thread
> >
> > I am in the process of removing Region Server side merge (and split)
> > transaction code in master branch; as now we have merge (and split)
> > procedure(s) from master doing the same thing.
> >
> >
> Good (Issue?)
>
>
> > The Merge tool depends on RS-side merge code.  I'd like to use this chance
> > to remove the util.Merge tool.  This is for 2.0 and up releases only.
> > Deprecation does not work here; as keeping the RS-side merge code would
> > have duplicate logic in source code and make the new Assignment manager
> > code more complicated.
> >
> >
> Could util.Merge be changed to ask the Master run the merge (via AMv2)?
>
> If you remove the util.Merge tool, how then does an operator ask for a
> merge in its absence?
>
> Thanks Stephen
>
> S
>
>
> > Please let me know whether you have objection.
> >
> > Thanks
> > Stephen
> >
> > PS.  I could deprecate the HMerge code if anyone is really using it.  It has
> > its own standalone logic (it is supposed to work offline, dangerously, and
> > to merge more than 2 regions; util.Merge and the shell do not support this
> > functionality for now).
> >
> > On Wed, Nov 16, 2016 at 11:04 AM, Enis Söztutar 
> > wrote:
> >
> > > @Appy what is not clear from above?
> > >
> > > I think we should get rid of both Merge and HMerge.
> > >
> > > > We should not have any tool which will work in offline mode by going over
> > > > the HDFS data. Seems very brittle to be broken when things get changed.
> > > > Only use case I can think of is that somehow you end up with a lot of
> > > > regions and you cannot bring the cluster back up because of OOMs, etc and
> > > > you have to reduce the number of regions in offline mode. However, we did
> > > > not see this kind of thing in any of our customers for the last couple of
> > > > years so far.
> > > >
> > > > I think we should seriously look into improving normalizer and enabling
> > > > that by default for all the tables. Ideally, normalizer should be running
> > > > much more frequently, and should be configured with higher-level goals and
> > > > heuristics. Like on average how many regions per node, etc and should be
> > > > looking at the global state (like the balancer) to decide on split / merge
> > > > points.
> > >
> > > Enis
> > >
> > > On Wed, Nov 16, 2016 at 1:17 AM, Apekshit Sharma 
> > > wrote:
> > >
> > > > bq. HMerge can merge multiple regions by going over the list of
> > > > regions and checking
> > > > their sizes.
> > > > bq. But both of these tools (Merge and HMerge) are very dangerous
> > > >
> > > > > I came across HMerge and it looks like dead code. Isn't referenced from
> > > > > anywhere except one test. (This is what lars also pointed out in the first
> > > > > email too).
> > > > > It would make perfect sense if it was a tool or was being referenced from
> > > > > somewhere, but with lack of either of that, am a bit confused here.
> > > > @Enis, you seem to know everything about them, please educate me.
> > > > Thanks
> > > > - Appy
> > > >
> > > >
> > > >
> > > > On Thu, Sep 29, 2016 at 12:43 AM, Enis Söztutar 
> > > > wrote:
> > > >
> > > > > > Merge has very limited usability since it can do a single merge and can
> > > > > > only run when HBase is offline.
> > > > > > HMerge can merge multiple regions by going over the list of regions and
> > > > > > checking their sizes.
> > > > > > And of course we have the "supported" online merge which is the shell
> > > > > > command.
> > > > > >
> > > > > > But both of these tools (Merge and HMerge) are very dangerous I think. I
> > > > > > would say we should deprecate both to be replaced by the online merger
> > > > > > tool. We should not allow offline merge at all. I fail to see the usecase
> > > > > > that you have to use an offline merge.
> > > > > >
> > > > > > Enis
> > > > > >
> > > > > > On Wed, Sep 28, 2016 at 7:32 AM, Lars George <lars.geo...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hey,
> > > > > > >
> > > > > > > Sorry to resurrect this old thread, but working on the book update, I
> > > > > > > came across the same today, i.e. we have Merge and HMerge. I tried and
> > > > > > > Merge works fine now. It is also the only one of the two flagged as
> > > > > > > being a tool. Should HMerge be removed? At least deprecated?
> > > > > >
> > > > > > Cheers,
> > > > > > Lars
> > > > > >
> > > > > >
> > > > > > On Thu, 

[jira] [Created] (HBASE-17470) Remove merge region code from region server

2017-01-14 Thread Stephen Yuan Jiang (JIRA)
Stephen Yuan Jiang created HBASE-17470:
--

 Summary: Remove merge region code from region server
 Key: HBASE-17470
 URL: https://issues.apache.org/jira/browse/HBASE-17470
 Project: HBase
  Issue Type: Sub-task
  Components: regionserver
Affects Versions: 2.0.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang


HBASE-16119 moves region merge to the master side.  Removing the
region-server-side merge region code eliminates the duplicated logic, so there
is no need to keep it.

The util.Merge and HMerge tools depend on the RS-side merge region logic.
However, we can now merge regions using the shell command, and offline merges
are dangerous.  2.0 is a good time to remove these out-of-date tools.







Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

2017-01-14 Thread Ted Yu
I agree with Devaraj's assessment w.r.t. hbase-spark module in master
(which is becoming branch-2).

Cheers



On Mon, Nov 21, 2016 at 11:46 AM, Devaraj Das  wrote:

> Hi Sean, I did a quick check with someone from the Spark team here and his
> opinion was that the hbase-spark module as it currently stands can be used
> by downstream users to do basic stuff and to try some simple things out,
> etc. The integration is improving.
> I think we should get what we have in 2.0 (which is the default action
> anyways).
> Thanks
> Devaraj
> 
> From: Sean Busbey 
> Sent: Wednesday, November 16, 2016 9:49 AM
> To: dev
> Subject: [DISCUSS] hbase-spark module in branch-1 and branch-2
>
> Hi folks!
>
> With 2.0 releases coming up, I'd like to revive our prior discussion
> on the readiness of the hbase-spark module for downstream users.
>
> We've had a ticket for tracking the milestones set up for inclusion in
> branch-1 releases for about 1.5 years:
>
> https://issues.apache.org/jira/browse/HBASE-14160
>
> We still haven't gotten all of the blocker issues completed, AFAIK.
>
> Is anyone interested in volunteering to knock the rest of these out?
>
> If they aren't, shall we plan to leave hbase-spark in master and
> revert it from branch-2 once it forks for the HBase 2.0 release line?
>
> This feature isn't a blocker for 2.0; just as we've been planning to
> add the hbase-spark module to some 1.y release we can also include it
> in a 2.1+ release.
>
> This does appear to be a feature our downstream users could benefit
> from, so I'd hate to continue the current situation where no official
> releases include it. This is especially true now that we're looking at
> ways to handle changes between Spark 1.6 and Spark 2.0 in HBASE-16179.
>
> -
> busbey
>
>


Re: Moving 2.0 forward

2017-01-14 Thread Stack
Hey Eric!

See this thread where it is suggested we remove the module for want of a
sponsor [1]. The hbase-spark story is also muddled by there being different
options available neither of which is complete instead of just 'the one'
(Do a google search).

St.Ack
1.
http://apache-hbase.679495.n3.nabble.com/DISCUSS-hbase-spark-module-in-branch-1-and-branch-2-td4084343.html



On Sat, Jan 14, 2017 at 12:30 PM, Eric Charles  wrote:

> I read "3.3 hbase-spark STATUS: Needs work. No one on it at mo. Doc. is
> just wrong. What is there is dodgy. Could get punted."
>
> Unit tests are working and base functionality is there. Besides the doc
> and compilation against spark-2 (and scala-2.11), what else do you want to
> see?
>
>
>
> On 14/01/17 10:29, Ted Yu wrote:
>
>> For 3.3, hbase-spark module, there is HBASE-16179 which enables support
>> for Spark 2.0
>> It needs some review.
>>
>> Cheers
>>
>> On Jan 13, 2017, at 11:25 PM, Stack  wrote:
>>>
>>> On Sat, Dec 31, 2016 at 12:16 PM, Stephen Jiang
>>> wrote:
>>>
>>>> Hello, Andrew, I was a helper on Matteo so that we can help each other
>>>> while we are focusing on the new Assignment Manager work.  Now he is not
>>>> available (at least in the next few months).  I have to be more focused on
>>>> the new AM work; plus other work in my company; it would be too much for me
>>>> to 2.0 RM alone.  I am happy someone would help to take primary 2.0 RM role
>>>> while I am still help to make this 2.0 release smooth.
>>>>
>>> (I could help out Stephen. We could co-RM?)
>>>
>>>
>>>> For branch-2, I think it is too early to cut it, as we still have a lot of
>>>> moving parts and on-going project that needs to be part of 2.0.  For
>>>> example, the mentioned new AM (and other projects, such as HBASE-14414,
>>>> HBASE-15179, HBASE-14070, HBASE-14850, HBASE-16833, HBASE-15531, just name
>>>> a few).  Cutting branch now would add burden to complete those projects.
>>>>
>>> Agree with Stephen. A bunch of stuff is half-baked so a '2.0.0' now would
>>> be all loose ends and it'd make for a messy narrative.
>>>
>>> I started a doc listing state of 2.0.0:
>>> https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit?usp=sharing
>>>
>>> In the doc I made an estimate of what the community considers core 2.0.0
>>> items based in part off old lists and after survey of current state of
>>> JIRA. The doc is open for comment. Please chime in if I am off or if I am
>>> missing something that should be included. I also make a rough estimate on
>>> state of each core item.
>>>
>>> I intend to keep up this macro-view doc as we progress on 2.0.0 with
>>> reflection where pertinent in JIRA. Suggest we branch only when code
>>> complete on the core set most of which are complete or near-so.
>>> End-of-February should be time enough (First 2.0.0 RC in at the start of
>>> May?).
>>>
>>> Thanks,
>>> St.Ack
>>>
>>>
>>>
>>>> thanks
>>>> Stephen
>>>>
>>>> On Sat, Dec 31, 2016 at 10:54 AM, Andrew Purtell <andrew.purt...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I've heard a rumor the co-RM situation with 2.0 may have changed. Can we
>>>>> get an update from co-RMs Matteo and Steven on their availability and
>>>>> interest in continuing in this role?
>>>>>
>>>>> To assist in moving 2.0 forward I intend to branch branch-2 from master
>>>>> next week. Unless there is an objection I will take this action under
>>>>> assumption of lazy consensus. Master branch will be renumbered to
>>>>> 3.0.0-SNAPSHOT. Once we have a branch-2 I will immediately begin scale
>>>>> tests and stabilization (via bug fixes or reverts of unfinished work) and
>>>>> invite interested collaborators to do the same.
>>>>>




Re: Moving 2.0 forward

2017-01-14 Thread Stack
On Sat, Jan 14, 2017 at 11:10 AM, Andrew Purtell 
wrote:

> Thanks for putting that document together Stack, that was really helpful.
>
> > 1.1 New Assignment Manager, AMv2
>
> Can we get a virtual show of hands who is working on this and plans to
> finish it? It was Stephen and Matteo originally, right? Matteo seems
> temporarily sidelined, is that correct?
>
>
Stephen and I have this as our priority. It is about 90% code complete with
some big patches just about to land. Will need a mountain of testing
thereafter.

Yeah, our mighty Matteo, the mastermind, is no longer actively working on
the project (I think!).



> > 1.3 Offheaping of Write Path
> > 1.4 HBASE-11425 Offheaping of Read Path
> > 1.6 HBASE-15265 AsyncWAL/HBase DFSClient
>
> Maybe we can organize some efforts to test small deploys of 2.0.0-SNAPSHOT
> with these features enabled since they are code complete but need testing
> and more doc, which can be generated from notes from testers on setup and
> experience. I can stand up a few clusterdock-based virtual clusters on EC2
> D2-class instances running integration tests, PE, and YCSB etc; surface
> issues up into JIRA; and provide SSH access on demand. Let me see ... not
> sure if 2.0.0-SNAPSHOT is stable enough to get that far. If so hopefully
> the developers behind these features will be willing to jump on them and
> lead debugging/fix if issues are found.
>
>
This sounds good. For offheaping, we're talking about it being the default
for 2.0.0 so could do with a bit of testing first before we actually commit
(smile).

2.0.0-SNAPSHOT seems pretty stable to me in my limited testing so far. I
could publish a SNAPSHOT on a weekly basis going forward if that'd help.



> > 2.3 HBASE-6721 RegionServer Group-based Assignment
>
> Same as above, although in this case I suspect interested users are on our
> own to debug/fix.
>
>

Agree. This would be true of all features in the Ancillary (non-core)
section I'd say (though as it happens, I'm interested in the above in
particular).

I added a new section to the doc on "Decisions" we need to make for 2.0.0.
Primary is a conclusion to the long-running "[DISCUSS] No regions on
Master node in 2.0" thread.

St.Ack



>
> On Fri, Jan 13, 2017 at 11:49 PM, Andrew Purtell
> wrote:
>
> > While I don't disagree that half finished features are undesirable, I'm
> > not suggesting that as a strategy so much as we kick out stuff that just
> > doesn't seem to be getting done. Pushing 2.0 out another three months is
> > fine if there's a good chance this is realistic and we won't be having this
> > discussion again then. Let me have a look at the doc and return with
> > specific points for further discussion (if any).
> >
> >
> > On Jan 13, 2017, at 11:25 PM, Stack  wrote:
> >
> > On Sat, Dec 31, 2016 at 12:16 PM, Stephen Jiang
> > wrote:
> >
> >> Hello, Andrew, I was a helper on Matteo so that we can help each other
> >> while we are focusing on the new Assignment Manager work.  Now he is not
> >> available (at least in the next few months).  I have to be more focused on
> >> the new AM work; plus other work in my company; it would be too much for me
> >> to 2.0 RM alone.  I am happy someone would help to take primary 2.0 RM role
> >> while I am still help to make this 2.0 release smooth.
> >>
> >>
> > (I could help out Stephen. We could co-RM?)
> >
> >
> >> For branch-2, I think it is too early to cut it, as we still have a lot of
> >> moving parts and on-going project that needs to be part of 2.0.  For
> >> example, the mentioned new AM (and other projects, such as HBASE-14414,
> >> HBASE-15179, HBASE-14070, HBASE-14850, HBASE-16833, HBASE-15531, just name
> >> a few).  Cutting branch now would add burden to complete those projects.
> >>
> >>
> > Agree with Stephen. A bunch of stuff is half-baked so a '2.0.0' now would
> > be all loose ends and it'd make for a messy narrative.
> >
> > I started a doc listing state of 2.0.0:
> > https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit?usp=sharing
> >
> > In the doc I made an estimate of what the community considers core 2.0.0
> > items based in part off old lists and after survey of current state of
> > JIRA. The doc is open for comment. Please chime in if I am off or if I am
> > missing something that should be included. I also make a rough estimate on
> > state of each core item.
> >
> > I intend to keep up this macro-view doc as we progress on 2.0.0 with
> > reflection where pertinent in JIRA. Suggest we branch only when code
> > complete on the core set most of which are complete or near-so.
> > End-of-February should be time enough (First 2.0.0 RC in at the start of
> > May?).
> >
> > Thanks,
> > St.Ack
> >
> >
> >
> >> thanks
> 

Re: Moving 2.0 forward

2017-01-14 Thread Ted Yu
After HBASE-16179 gets in, we can get wider feedback from interested users in 
using hbase-spark module. 

We would then be able to find missing pieces. 

> On Jan 14, 2017, at 12:30 PM, Eric Charles  wrote:
> 
> I read "3.3 hbase-spark STATUS: Needs work. No one on it at mo. Doc. is just 
> wrong. What is there is dodgy. Could get punted."
> 
> Unit tests are working and base functionality is there. Besides the doc and 
> compilation against spark-2 (and scala-2.11), what else do you want to see?
> 
> 
>> On 14/01/17 10:29, Ted Yu wrote:
>> For 3.3, hbase-spark module, there is HBASE-16179 which enables support for 
>> Spark 2.0
>> It needs some review.
>> 
>> Cheers
>> 
>>> On Jan 13, 2017, at 11:25 PM, Stack  wrote:
>>> 
>>> On Sat, Dec 31, 2016 at 12:16 PM, Stephen Jiang 
>>> wrote:
>>> 
>>>> Hello, Andrew, I was a helper on Matteo so that we can help each other
>>>> while we are focusing on the new Assignment Manager work.  Now he is not
>>>> available (at least in the next few months).  I have to be more focused on
>>>> the new AM work; plus other work in my company; it would be too much for me
>>>> to 2.0 RM alone.  I am happy someone would help to take primary 2.0 RM role
>>>> while I am still help to make this 2.0 release smooth.
>>> (I could help out Stephen. We could co-RM?)
>>>
>>>
>>>> For branch-2, I think it is too early to cut it, as we still have a lot of
>>>> moving parts and on-going project that needs to be part of 2.0.  For
>>>> example, the mentioned new AM (and other projects, such as HBASE-14414,
>>>> HBASE-15179, HBASE-14070, HBASE-14850, HBASE-16833, HBASE-15531, just name
>>>> a few).  Cutting branch now would add burden to complete those projects.
>>> Agree with Stephen. A bunch of stuff is half-baked so a '2.0.0' now would
>>> be all loose ends and it'd make for a messy narrative.
>>> 
>>> I started a doc listing state of 2.0.0:
>>> https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit?usp=sharing
>>> 
>>> In the doc I made an estimate of what the community considers core 2.0.0
>>> items based in part off old lists and after survey of current state of
>>> JIRA. The doc is open for comment. Please chime in if I am off or if I am
>>> missing something that should be included. I also make a rough estimate on
>>> state of each core item.
>>> 
>>> I intend to keep up this macro-view doc as we progress on 2.0.0 with
>>> reflection where pertinent in JIRA. Suggest we branch only when code
>>> complete on the core set most of which are complete or near-so.
>>> End-of-February should be time enough (First 2.0.0 RC in at the start of
>>> May?).
>>> 
>>> Thanks,
>>> St.Ack
>>> 
>>> 
>>> 
>>>> thanks
>>>> Stephen
>>>>
>>>> On Sat, Dec 31, 2016 at 10:54 AM, Andrew Purtell  wrote:
>>>>> Hi all,
>>>>>
>>>>> I've heard a rumor the co-RM situation with 2.0 may have changed. Can we
>>>>> get an update from co-RMs Matteo and Steven on their availability and
>>>>> interest in continuing in this role?
>>>>>
>>>>> To assist in moving 2.0 forward I intend to branch branch-2 from master
>>>>> next week. Unless there is an objection I will take this action under
>>>>> assumption of lazy consensus. Master branch will be renumbered to
>>>>> 3.0.0-SNAPSHOT. Once we have a branch-2 I will immediately begin scale
>>>>> tests and stabilization (via bug fixes or reverts of unfinished work) and
>>>>> invite interested collaborators to do the same.
>>>>


Re: Moving 2.0 forward

2017-01-14 Thread Eric Charles
I read "3.3 hbase-spark STATUS: Needs work. No one on it at mo. Doc. is 
just wrong. What is there is dodgy. Could get punted."


Unit tests are working and base functionality is there. Besides the doc 
and compilation against spark-2 (and scala-2.11), what else do you want 
to see?



On 14/01/17 10:29, Ted Yu wrote:
> For 3.3, hbase-spark module, there is HBASE-16179 which enables support for
> Spark 2.0
> It needs some review.
>
> Cheers
>
>> On Jan 13, 2017, at 11:25 PM, Stack wrote:
>>
>> On Sat, Dec 31, 2016 at 12:16 PM, Stephen Jiang wrote:
>>
>>> Hello, Andrew, I was a helper on Matteo so that we can help each other
>>> while we are focusing on the new Assignment Manager work.  Now he is not
>>> available (at least in the next few months).  I have to be more focused on
>>> the new AM work; plus other work in my company; it would be too much for me
>>> to 2.0 RM alone.  I am happy someone would help to take primary 2.0 RM role
>>> while I am still help to make this 2.0 release smooth.
>>
>> (I could help out Stephen. We could co-RM?)
>>
>>> For branch-2, I think it is too early to cut it, as we still have a lot of
>>> moving parts and on-going project that needs to be part of 2.0.  For
>>> example, the mentioned new AM (and other projects, such as HBASE-14414,
>>> HBASE-15179, HBASE-14070, HBASE-14850, HBASE-16833, HBASE-15531, just name
>>> a few).  Cutting branch now would add burden to complete those projects.
>>
>> Agree with Stephen. A bunch of stuff is half-baked so a '2.0.0' now would
>> be all loose ends and it'd make for a messy narrative.
>>
>> I started a doc listing state of 2.0.0:
>> https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit?usp=sharing
>>
>> In the doc I made an estimate of what the community considers core 2.0.0
>> items based in part off old lists and after survey of current state of
>> JIRA. The doc is open for comment. Please chime in if I am off or if I am
>> missing something that should be included. I also make a rough estimate on
>> state of each core item.
>>
>> I intend to keep up this macro-view doc as we progress on 2.0.0 with
>> reflection where pertinent in JIRA . Suggest we branch only when code
>> compete on the core set most of which are complete or near-so.
>> End-of-February should be time enough (First 2.0.0 RC in at the start of
>> May?).
>>
>> Thanks,
>> St.Ack
>>
>>> thanks
>>> Stephen
>>>
>>> On Sat, Dec 31, 2016 at 10:54 AM, Andrew Purtell

Re: Proposal: Create "branch RM" roles for non releasing branches branch-1, branch-2 (when it exists), and master

2017-01-14 Thread Andrew Purtell
Thanks everyone for the discussion. I'll informally watch branch-1. Planning on 
monthly passes for cherry picks and release testing as I did for 0.98, but in 
this case the output will be nonrelease snapshots. Release RMs will not see 
pick backs from me. The changes you can expect are reverts if we find something 
problematic and it has not gone out in a release yet. Looking forward to a 
productive LTS of branch-1. 


> On Jan 11, 2017, at 1:43 PM, Enis Söztutar  wrote:
> 
> I was thinking along similar lines that the RM for 1.X which is the next one
> would be managing branch-1, but I am also concerned about the large gap in
> terms of timing. For example, unless we are close to 1.4, a 1.4 RM will
> not materialize.
> 
> So, I am in favor of having an informal branch-1 RM that will work with the
> 1.x RMs. A +1 for Andrew for that role.
> 
> Enis
> 
> On Wed, Jan 11, 2017 at 1:17 PM, Andrew Purtell 
> wrote:
> 
>> We could do it that way but there would be nobody promising to watch
>> branch-1 for any length of time. I'd like to do that. We could do this
>> alternative for branch-2. And it makes sense once we have this sorted to
>> write down what we'd like to do.
>> 
>> 
>>> On Jan 9, 2017, at 3:27 PM, Nick Dimiduk  wrote:
>>> 
>>> Somewhat late to the reply --
>>> 
>>> Does it make sense, for branch-1, to have the person planning to RM the
>>> next minor release act as the RM for the major-level branch? That person
>>> would hand responsibility to the next minor RM upon cutting the
>>> stabilization branch.
>>> 
>>> This could be applied to master/branch-2 as well, but the further away we
>>> get from a target release date, the more nebulous the RM role becomes.
>>> 
>>>> On Fri, Jan 6, 2017 at 5:07 PM Andrew Purtell wrote:
>>>>
>>>> HBasers,
>>>>
>>>> I would like to propose extending our informal "branch RM" concept just a
>>>> bit to include the nonreleasing branches like branch-1, branch-2 (when it
>>>> exists), and master. These branches are where all commits are made passing
>>>> through down to the releasing branches targeted for the change (like,
>>>> branch-1.1, branch-1.2, branch-1.3, etc.)
>>>>
>>>> The releasing branches all have their own RM. I assume that RM is
>>>> diligently monitoring its state, by way of review of commit history,
>>>> occasional execution of the unit test suite, occasional execution of the
>>>> integration tests, and has perhaps some automation in place to help with
>>>> that on a nightly or weekly basis. No matter, let's assume there is a
>>>> nonzero level of scrutiny applied to them, which leads to feedback to
>>>> committers about inappropriate commits via compat guidelines, commits which
>>>> have broken unit tests, or other indications of quality or functional
>>>> concerns.  I think it would improve our overall velocity as a project if we
>>>> could also have volunteers tending the development branches upstream from
>>>> the releasing branches. Less work would fall to the RMs tending the release
>>>> branches if a common troublesome commit can be caught upstream first. In
>>>> particular I am thinking about branch-1.
>>>>
>>>> I would like to volunteer to become the new RM for branch-1, to test and
>>>> refine my above proposal in practice. Unless I hear objections I will
>>>> assume by lazy consensus everyone is ok with this experiment.
>>>>
>>>> What this would mean:
>>>>
>>>>  - JIRAs like "TestFooBar is broken on branch-1" will show up sooner, and
>>>>    more likely with fix patches
>>>>  - Semiregular performance reports on branch-1 code as of date X/Y/Z, can
>>>>    compare with earlier reports for trending
>>>>  - Occasional sweep through master history looking for appropriate
>>>>    candidates for backport to branch-1, execution of said backport
>>>>  - Occasional 1B row ITBLL torture tests, probably if failure with bisect
>>>>    back to commit that introduced instability
>>>>
>>>> What this does not mean:
>>>>
>>>>  - The branch-1 RM will not attempt to tell other branch RMs what or what
>>>>    not to include in their release branches
>>>>  - The branch-1 RM won't commit anything backported from master to any of
>>>>    the release branches; it will continue to be up to the release branch RMs
>>>>    what they would or would not like to be included
>>>>
>>>> Also, I don't see why I couldn't spend some time looking at master now and
>>>> then.
>>>>
>>>> I am going to assume our current co-RM team for branch-2 would maybe do
>>>> something similar for branch-2, once it materializes.
>>>>
>>>> Thoughts? Comments? Concerns?

Re: [VOTE] The 1st HBase 1.3.0 release candidate (RC0) is available

2017-01-14 Thread Jerry He
+1

- Downloaded the src tarball.
  - Checked the md5 and signature.  All good.
  - Built from source.   OpenJDK 1.8.0_101-b13
  - Unit test suite.  Passed.
  - mvn apache-rat:check   Passed

- Downloaded the bin tarball.
  - Checked the md5 and signature.  All good.
  - Untarred it.  Layout and files look good.

- Put the binaries on a distributed Kerberos cluster
 - With all the security co-processors enabled.
 - Ran hbase shell. Table put, scan, split, major-compact, merge-region.
Permission checks good.
 - SecureBulkLoad good.
 - Security audit trace looks good. Master and region server logs look good.

 - Check master and region server UI.  Clicked on the tabs, looked at the
tasks running, Debug dump, Configuration dump, zk dump. Looks good.

 - Got this exception when running PerformanceEvaluation or other MR job:
  org.codehaus.jackson.map.exc.UnrecognizedPropertyException: Unrecognized
field "Token" (Class
org.apache.hadoop.yarn.api.records.timeline.TimelineDelegationTokenResponse),
not marked as ignorable
 at [Source: N/A; line: -1, column: -1] (through reference chain:
org.apache.hadoop.yarn.api.records.timeline.TimelineDelegationTokenResponse["Token"])

   My Hadoop/Yarn cluster is 2.7.2.  HBase 1.3.0 bundles 2.5.1. That may be
the problem.
   After copying the Hadoop 2.7.2 jars into HBase, the jobs run fine.
   Loaded data.
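
The checksum part of the checks above can be reproduced with a few lines of Python. This is only a sketch: `digest_stream` and `matches_published` are hypothetical helper names, the published value is assumed to be a bare hex digest (possibly upper case or grouped with spaces), and signature verification against the KEYS file is a separate step that is not shown.

```python
import hashlib
import io


def digest_stream(stream, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a binary stream, reading it in chunks."""
    hasher = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(stream, published, algorithm="md5"):
    """Compare a stream's digest with a published hex digest string."""
    # Assumption: `published` holds only hex digits, optionally grouped
    # with whitespace and written in upper case.
    normalized = "".join(published.split()).lower()
    return digest_stream(stream, algorithm) == normalized


# In-memory stand-in for a tarball; for a real file, pass
# open(path, "rb") instead of io.BytesIO.
sample = io.BytesIO(b"hello")
print(matches_published(sample, "5D41402A BC4B2A76 B9719D91 1017C592"))
```

Reading in chunks keeps memory use flat for large tarballs, and passing a different `algorithm` name covers other digest types if a release publishes them.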

On Sat, Jan 14, 2017 at 12:51 AM, Elliott Clark  wrote:

> +1 Checked sigs.
> Downloaded everything looks good.
> License and Notice files look good.
>
> On Mon, Jan 9, 2017 at 7:11 PM, Andrew Purtell 
> wrote:
>
> > +1
> >
> > Checked sums and signatures: ok
> > Spot check LICENSE and NOTICE files: ok
> > Built from source (7u80): ok
> > RAT audit (7u80): ok
> > Unit test suite passes (8u102):
> > Loaded 1 million rows with LTT (8u102, 10 readers, 10 writers, 10 updaters
> > @ 20%): all keys verified, no unexpected or unusual messages in the logs,
> > latencies in the ballpark
> > 1 billion row ITBLL (8u102): failed, but known issue HBASE-17069
> >
> >
> >
> > On Fri, Jan 6, 2017 at 12:42 PM, Mikhail Antonov 
> > wrote:
> >
> > > Hello everyone,
> > >
> > > I'm pleased to announce the first release candidate for HBase 1.3.0 is
> > > available to download and testing.
> > >
> > > Artifacts are available here:
> > >
> > > https://dist.apache.org/repos/dist/dev/hbase/1.3.0RC0/
> > >
> > > Maven artifacts are available in the staging repository:
> > >
> > > https://repository.apache.org/content/repositories/orgapachehbase-1162/
> > >
> > > All artifacts are signed with my code signing key 35A4ABE2, which is also
> > > in the project KEYS file at
> > >
> > > http://www.apache.org/dist/hbase/KEYS
> > >
> > > these artifacts correspond to commit hash
> > >
> > > e359c76e8d9fd0d67396456f92bcbad9ecd7a710 tagged as 1.3.0RC0.
> > >
> > > HBase 1.3.0 is the third minor release in the HBase 1.x line and includes
> > > approximately 1700 resolved issues.
> > >
> > > Notable new features include:
> > >
> > >  - Date-based tiered compactions (HBASE-15181, HBASE-15339)
> > >  - Maven archetypes for HBase client applications (HBASE-14877)
> > >  - Throughput controller for flushes (HBASE-14969)
> > >  - Controlled delay (CoDel) based RPC scheduler (HBASE-15136)
> > >  - Bulk loaded HFile replication (HBASE-13153)
> > >  - More improvements to Procedure V2
> > >  - Improvements to Multi WAL (HBASE-14457)
> > >  - Many improvements and optimizations in metrics subsystem
> > >  - Reduced memory allocation in RPC layer (HBASE-15177)
> > >  - Region location lookups optimizations in HBase client
> > >
> > > and numerous bugfixes and performance improvements.
> > >
> > > The full list of issues addressed is available at
> > >
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> > > projectId=12310753=12332794
> > >
> > > and also in the CHANGES.txt file in the root directory of the source
> > > tarball.
> > >
> > > Please take a few minutes to verify the release and vote on
> > > releasing it:
> > >
> > > [ ] +1 Release this package as Apache HBase 1.3.0
> > > [ ] +0 no opinion
> > > [ ] -1 Do not release this package because...
> > >
> > > This Vote will run for one week and close Fri Jan 13, 2017 22:00 PST.
> > >
> > > Thank you!
> > > Mikhail
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> >- Andy
> >
> > If you are given a choice, you believe you have acted freely. - Raymond
> > Teller (via Peter Watts)
> >
>


Re: Moving 2.0 forward

2017-01-14 Thread Andrew Purtell
Thanks for putting that document together Stack, that was really helpful.

> 1.1 New Assignment Manager, AMv2

Can we get a virtual show of hands who is working on this and plans to
finish it? It was Stephen and Matteo originally, right? Matteo seems
temporarily sidelined, is that correct?

> 1.3 Offheaping of Write Path
> 1.4 HBASE-11425 Offheaping of Read Path
> 1.6 HBASE-15265 AsyncWAL/HBase DFSClient

Maybe we can organize some efforts to test small deploys of 2.0.0-SNAPSHOT
with these features enabled since they are code complete but need testing
and more doc, which can be generated from notes from testers on setup and
experience. I can stand up a few clusterdock-based virtual clusters on EC2
D2-class instances running integration tests, PE, and YCSB etc; surface
issues up into JIRA; and provide SSH access on demand. Let me see ... not
sure if 2.0.0-SNAPSHOT is stable enough to get that far. If so hopefully
the developers behind these features will be willing to jump on them and
lead debugging/fix if issues are found.

> 2.3 HBASE-6721 RegionServer Group-based Assignment

Same as above, although in this case I suspect interested users are on our
own to debug/fix.

On Fri, Jan 13, 2017 at 11:49 PM, Andrew Purtell 
wrote:

> While I don't disagree that half finished features are undesirable, I'm
> not suggesting that as a strategy so much as we kick out stuff that just
> doesn't seem to be getting done. Pushing 2.0 out another three months is
> fine if there's a good chance this is realistic and we won't be having this
> discussion again then. Let me have a look at the doc and return with
> specific points for further discussion (if any).
>
>
> On Jan 13, 2017, at 11:25 PM, Stack  wrote:
>
> On Sat, Dec 31, 2016 at 12:16 PM, Stephen Jiang 
> wrote:
>
>> Hello, Andrew, I was a helper on Matteo so that we can help each other
>> while we are focusing on the new Assignment Manager work.  Now he is not
>> available (at least in the next few months).  I have to be more focused on
>> the new AM work; plus other work in my company; it would be too much for me
>> to 2.0 RM alone.  I am happy someone would help to take primary 2.0 RM role
>> while I am still help to make this 2.0 release smooth.
>>
>>
> (I could help out Stephen. We could co-RM?)
>
>
>> For branch-2, I think it is too early to cut it, as we still have a lot of
>> moving parts and on-going project that needs to be part of 2.0.  For
>> example, the mentioned new AM (and other projects, such as HBASE-14414,
>> HBASE-15179, HBASE-14070, HBASE-14850, HBASE-16833, HBASE-15531, just name
>> a few).  Cutting branch now would add burden to complete those projects.
>>
>>
> Agree with Stephen. A bunch of stuff is half-baked so a '2.0.0' now would
> be all loose ends and it'd make for a messy narrative.
>
> I started a doc listing state of 2.0.0:
> https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit?usp=sharing
>
> In the doc I made an estimate of what the community considers core 2.0.0
> items based in part off old lists and after survey of current state of
> JIRA. The doc is open for comment. Please chime in if I am off or if I am
> missing something that should be included. I also make a rough estimate on
> state of each core item.
>
> I intend to keep up this macro-view doc as we progress on 2.0.0 with
> reflection where pertinent in JIRA . Suggest we branch only when code
> compete on the core set most of which are complete or near-so.
> End-of-February should be time enough (First 2.0.0 RC in at the start of
> May?).
>
> Thanks,
> St.Ack
>
>
>
>> thanks
>> Stephen
>>
>> On Sat, Dec 31, 2016 at 10:54 AM, Andrew Purtell <
>> andrew.purt...@gmail.com>
>> wrote:
>>
>> > Hi all,
>> >
>> > I've heard a rumor the co-RM situation with 2.0 may have changed. Can we
>> > get an update from co-RMs Matteo and Steven on their availability and
>> > interest in continuing in this role?
>> >
>> > To assist in moving 2.0 forward I intend to branch branch-2 from master
>> > next week. Unless there is an objection I will take this action under
>> > assumption of lazy consensus. Master branch will be renumbered to
>> > 3.0.0-SNAPSHOT. Once we have a branch-2 I will immediately begin scale
>> > tests and stabilization (via bug fixes or reverts of unfinished work) and
>> > invite interested collaborators to do the same.
>> >
>> >
>> >
>>
>
>


-- 
Best regards,

   - Andy

If you are given a choice, you believe you have acted freely. - Raymond
Teller (via Peter Watts)


Successful: HBase Generate Website

2017-01-14 Thread Apache Jenkins Server
Build status: Successful

If successful, the website and docs have been generated. To update the live 
site, follow the instructions below. If failed, skip to the bottom of this 
email.

Use the following commands to download the patch and apply it to a clean branch 
based on origin/asf-site. If you prefer to keep the hbase-site repo around 
permanently, you can skip the clone step.

  git clone https://git-wip-us.apache.org/repos/asf/hbase-site.git

  cd hbase-site
  wget -O- https://builds.apache.org/job/hbase_generate_website/461/artifact/website.patch.zip | funzip > 4cb09a494c4148de2b4e8c6cd011bacdf7f33b1a.patch
  git fetch
  git checkout -b asf-site-4cb09a494c4148de2b4e8c6cd011bacdf7f33b1a origin/asf-site
  git am --whitespace=fix 4cb09a494c4148de2b4e8c6cd011bacdf7f33b1a.patch

At this point, you can preview the changes by opening index.html or any of the 
other HTML pages in your local 
asf-site-4cb09a494c4148de2b4e8c6cd011bacdf7f33b1a branch.

There are lots of spurious changes, such as timestamps and CSS styles in 
tables, so a generic git diff is not very useful. To see a list of files that 
have been added, deleted, renamed, changed type, or are otherwise interesting, 
use the following command:

  git diff --name-status --diff-filter=ADCRTXUB origin/asf-site

To see only files that had 100 or more lines changed:

  git diff --stat origin/asf-site | grep -E '[1-9][0-9]{2,}'
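
The grep above matches any run of three or more digits anywhere on a line, so it can also pick up the summary line or file names that happen to contain such a number. A minimal Python sketch of a stricter filter follows, assuming the usual `path | count +++---` layout of `git diff --stat` output; the function name `files_with_many_changes` and the sample text are illustrative and not part of the site tooling.

```python
import re

# One stat line looks like: " path/to/file.html | 250 ++++----"
# Binary files ("| Bin 0 -> 2048 bytes") and the summary line do not match.
STAT_LINE = re.compile(r"^\s*(?P<path>.+?)\s+\|\s+(?P<count>\d+)\b")


def files_with_many_changes(stat_output, threshold=100):
    """Return (path, changed_lines) pairs with at least `threshold` changes."""
    selected = []
    for line in stat_output.splitlines():
        match = STAT_LINE.match(line)
        if match and int(match.group("count")) >= threshold:
            selected.append((match.group("path"), int(match.group("count"))))
    return selected


# Made-up sample output, used only to exercise the filter.
sample = (
    " apidocs/index-all.html | 250 ++++----\n"
    " book.html              |  12 +-\n"
    " images/logo404.png     | Bin 0 -> 2048 bytes\n"
    " 3 files changed, 140 insertions(+), 122 deletions(-)\n"
)
for path, count in files_with_many_changes(sample):
    print(path, count)
```

In practice the text would come from running `git diff --stat origin/asf-site` and capturing its output; only the per-file count column is compared against the threshold.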

When you are satisfied, publish your changes to origin/asf-site using these 
commands:

  git commit --allow-empty -m "Empty commit" # to work around a current ASF INFRA bug
  git push origin asf-site-4cb09a494c4148de2b4e8c6cd011bacdf7f33b1a:asf-site
  git checkout asf-site
  git branch -D asf-site-4cb09a494c4148de2b4e8c6cd011bacdf7f33b1a

Changes take a couple of minutes to be propagated. You can verify whether they 
have been propagated by looking at the Last Published date at the bottom of 
http://hbase.apache.org/. It should match the date in the index.html on the 
asf-site branch in Git.

As a courtesy, reply-all to this email to let other committers know you pushed 
the site.



If failed, see https://builds.apache.org/job/hbase_generate_website/461/console

Re: Moving 2.0 forward

2017-01-14 Thread Ted Yu
For 3.3, hbase-spark module, there is HBASE-16179 which enables support for 
Spark 2.0  
It needs some review. 

Cheers

> On Jan 13, 2017, at 11:25 PM, Stack  wrote:
> 
> On Sat, Dec 31, 2016 at 12:16 PM, Stephen Jiang 
> wrote:
> 
>> Hello, Andrew, I was a helper on Matteo so that we can help each other
>> while we are focusing on the new Assignment Manager work.  Now he is not
>> available (at least in the next few months).  I have to be more focused on
>> the new AM work; plus other work in my company; it would be too much for me
>> to 2.0 RM alone.  I am happy someone would help to take primary 2.0 RM role
>> while I am still help to make this 2.0 release smooth.
> (I could help out Stephen. We could co-RM?)
> 
> 
>> For branch-2, I think it is too early to cut it, as we still have a lot of
>> moving parts and on-going project that needs to be part of 2.0.  For
>> example, the mentioned new AM (and other projects, such as HBASE-14414,
>> HBASE-15179, HBASE-14070, HBASE-14850, HBASE-16833, HBASE-15531, just name
>> a few).  Cutting branch now would add burden to complete those projects.
> Agree with Stephen. A bunch of stuff is half-baked so a '2.0.0' now would
> be all loose ends and it'd make for a messy narrative.
> 
> I started a doc listing state of 2.0.0:
> https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit?usp=sharing
> 
> In the doc I made an estimate of what the community considers core 2.0.0
> items based in part off old lists and after survey of current state of
> JIRA. The doc is open for comment. Please chime in if I am off or if I am
> missing something that should be included. I also make a rough estimate on
> state of each core item.
> 
> I intend to keep up this macro-view doc as we progress on 2.0.0 with
> reflection where pertinent in JIRA . Suggest we branch only when code
> compete on the core set most of which are complete or near-so.
> End-of-February should be time enough (First 2.0.0 RC in at the start of
> May?).
> 
> Thanks,
> St.Ack
> 
> 
> 
>> thanks
>> Stephen
>> 
>> On Sat, Dec 31, 2016 at 10:54 AM, Andrew Purtell wrote:
>> 
>>> Hi all,
>>> 
>>> I've heard a rumor the co-RM situation with 2.0 may have changed. Can we
>>> get an update from co-RMs Matteo and Steven on their availability and
>>> interest in continuing in this role?
>>> 
>>> To assist in moving 2.0 forward I intend to branch branch-2 from master
>>> next week. Unless there is an objection I will take this action under
>>> assumption of lazy consensus. Master branch will be renumbered to
>>> 3.0.0-SNAPSHOT. Once we have a branch-2 I will immediately begin scale
>>> tests and stabilization (via bug fixes or reverts of unfinished work) and
>>> invite interested collaborators to do the same.
>> 


Re: [VOTE] The 1st HBase 1.3.0 release candidate (RC0) is available

2017-01-14 Thread Elliott Clark
+1 Checked sigs.
Downloaded everything looks good.
License and Notice files look good.

On Mon, Jan 9, 2017 at 7:11 PM, Andrew Purtell  wrote:

> +1
>
> Checked sums and signatures: ok
> Spot check LICENSE and NOTICE files: ok
> Built from source (7u80): ok
> RAT audit (7u80): ok
> Unit test suite passes (8u102):
> Loaded 1 million rows with LTT (8u102, 10 readers, 10 writers, 10 updaters
> @ 20%): all keys verified, no unexpected or unusual messages in the logs,
> latencies in the ballpark
> 1 billion row ITBLL (8u102): failed, but known issue HBASE-17069
>
>
>
> On Fri, Jan 6, 2017 at 12:42 PM, Mikhail Antonov 
> wrote:
>
> > Hello everyone,
> >
> > I'm pleased to announce the first release candidate for HBase 1.3.0 is
> > available to download and testing.
> >
> > Artifacts are available here:
> >
> > https://dist.apache.org/repos/dist/dev/hbase/1.3.0RC0/
> >
> > Maven artifacts are available in the staging repository:
> >
> > https://repository.apache.org/content/repositories/orgapachehbase-1162/
> >
> > All artifacts are signed with my code signing key 35A4ABE2, which is also
> > in the project KEYS file at
> >
> > http://www.apache.org/dist/hbase/KEYS
> >
> > these artifacts correspond to commit hash
> >
> > e359c76e8d9fd0d67396456f92bcbad9ecd7a710 tagged as 1.3.0RC0.
> >
> > HBase 1.3.0 is the third minor release in the HBase 1.x line and includes
> > approximately 1700 resolved issues.
> >
> > Notable new features include:
> >
> >  - Date-based tiered compactions (HBASE-15181, HBASE-15339)
> >  - Maven archetypes for HBase client applications (HBASE-14877)
> >  - Throughput controller for flushes (HBASE-14969)
> >  - Controlled delay (CoDel) based RPC scheduler (HBASE-15136)
> >  - Bulk loaded HFile replication (HBASE-13153)
> >  - More improvements to Procedure V2
> >  - Improvements to Multi WAL (HBASE-14457)
> >  - Many improvements and optimizations in metrics subsystem
> >  - Reduced memory allocation in RPC layer (HBASE-15177)
> >  - Region location lookups optimizations in HBase client
> >
> > and numerous bugfixes and performance improvements.
> >
> > The full list of issues addressed is available at
> >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> > projectId=12310753=12332794
> >
> > and also in the CHANGES.txt file in the root directory of the source
> > tarball.
> >
> > Please take a few minutes to verify the release and vote on
> > releasing it:
> >
> > [ ] +1 Release this package as Apache HBase 1.3.0
> > [ ] +0 no opinion
> > [ ] -1 Do not release this package because...
> >
> > This Vote will run for one week and close Fri Jan 13, 2017 22:00 PST.
> >
> > Thank you!
> > Mikhail
> >
>
>
>
> --
> Best regards,
>
>- Andy
>
> If you are given a choice, you believe you have acted freely. - Raymond
> Teller (via Peter Watts)
>