[jira] [Created] (HBASE-17260) Procedure v2 - Add setOwner() overload taking a User instance

2016-12-05 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-17260:
---

 Summary: Procedure v2 - Add setOwner() overload taking a User 
instance
 Key: HBASE-17260
 URL: https://issues.apache.org/jira/browse/HBASE-17260
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


since we should have a User instance in most cases, we should be able to
pass it directly, rather than converting it with getShortName() every time.
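
A minimal sketch of what such an overload could look like. UserSketch and
ProcedureOwnerSketch are hypothetical stand-ins, not the real HBase classes;
the only detail taken from the issue is that the owner is currently derived
through getShortName().

```java
// Hypothetical stand-in for a user object; only getShortName() is modeled.
interface UserSketch {
  String getShortName();
}

// Hypothetical stand-in for a procedure that stores its owner as a short name.
class ProcedureOwnerSketch {
  private String owner;

  // Existing style: the caller has already converted the user to a string.
  void setOwner(String owner) {
    this.owner = owner;
  }

  // Proposed overload: take the user and do the conversion in one place.
  void setOwner(UserSketch user) {
    setOwner(user.getShortName());
  }

  String getOwner() {
    return owner;
  }
}
```

With the overload, call sites can pass the user object and the conversion
lives in a single method instead of being repeated by every caller.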



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17228) precommit grep -c ERROR may grab non errors

2016-12-01 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-17228:
---

 Summary: precommit grep -c ERROR may grab non errors
 Key: HBASE-17228
 URL: https://issues.apache.org/jira/browse/HBASE-17228
 Project: HBase
  Issue Type: Bug
  Components: scripts
Reporter: Matteo Bertozzi
Priority: Minor


it looks like we do a simple "grep -c ERROR" to count the errors that we
have from the build.
https://github.com/apache/hbase/blob/master/dev-support/hbase-personality.sh#L305

but this way we end up with count=1 just because we have an enum called
ERROR_CODE in hbase, and the enum shows up in a debug message
{noformat}
$ grep ERROR patch-hbaseprotoc-hbase-server.txt 
[DEBUG] adding entry org/apache/hadoop/hbase/util/HBaseFsck$ErrorReporter$ERROR_CODE.class
{noformat}
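
To illustrate the false positive, here is a small sketch comparing a
substring match with a match anchored on the log-level prefix. It assumes
that real build errors are logged on lines starting with "[ERROR]"; the
class and method names are made up for the example, and this is not
necessarily how the precommit script should be fixed. In the shell the same
idea would be an anchored pattern such as grep -c '^\[ERROR\]'.

```java
import java.util.List;
import java.util.regex.Pattern;

class ErrorLineCounter {
  // Matches only lines whose log-level prefix is [ERROR].
  private static final Pattern ERROR_PREFIX = Pattern.compile("^\\[ERROR\\]");

  // Equivalent of "grep -c ERROR": counts every line containing the substring.
  static long naiveCount(List<String> lines) {
    return lines.stream().filter(line -> line.contains("ERROR")).count();
  }

  // Counts only lines that start with the [ERROR] prefix.
  static long anchoredCount(List<String> lines) {
    return lines.stream().filter(line -> ERROR_PREFIX.matcher(line).find()).count();
  }
}
```

On the debug line quoted above, the naive count is 1 while the anchored
count is 0.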





[jira] [Created] (HBASE-17149) Procedure v2 - Fix nonce submission

2016-11-21 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-17149:
---

 Summary: Procedure v2 - Fix nonce submission
 Key: HBASE-17149
 URL: https://issues.apache.org/jira/browse/HBASE-17149
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 1.2.4, 1.1.7, 2.0.0, 1.3.0, 1.4.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi


instead of having all the logic in submitProcedure(), split it into
registerNonce() + submitProcedure().
This way we can avoid calling the coprocessor twice and have a clean
submit logic, knowing that there will be only one submission.
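
A rough sketch of the two-step flow, under stated assumptions: the class,
the map of nonces, the createTable() caller and the preHookCalls counter
(standing in for the coprocessor call) are all hypothetical and only show
why registering the nonce first lets the hook run once.

```java
import java.util.HashMap;
import java.util.Map;

class NonceSubmitSketch {
  private static final long PENDING = -1L;
  private final Map<Long, Long> procIdByNonce = new HashMap<>();
  private long nextProcId = 1;
  int preHookCalls = 0; // stands in for coprocessor invocations

  // Step 1: reserve the nonce. Returns false if it was already registered.
  synchronized boolean registerNonce(long nonce) {
    return procIdByNonce.putIfAbsent(nonce, PENDING) == null;
  }

  // Step 2: plain submission; it runs at most once per registered nonce.
  synchronized long submitProcedure(long nonce) {
    long procId = nextProcId++;
    procIdByNonce.put(nonce, procId);
    return procId;
  }

  // Caller side: the hook and the submission happen only for a new nonce.
  synchronized long createTable(long nonce) {
    if (!registerNonce(nonce)) {
      return procIdByNonce.get(nonce); // retry: reuse the earlier procId
    }
    preHookCalls++;
    return submitProcedure(nonce);
  }
}
```

A retried request with the same nonce returns the original procId without
touching the hook again. The sketch ignores the window where a nonce is
registered but not yet submitted, which a real implementation would have
to handle.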






[jira] [Created] (HBASE-17148) Procedure v2 - add bulk proc submit

2016-11-21 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-17148:
---

 Summary: Procedure v2 - add bulk proc submit
 Key: HBASE-17148
 URL: https://issues.apache.org/jira/browse/HBASE-17148
 Project: HBase
  Issue Type: Sub-task
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0


Add the ability to submit multiple procedures as a single operation. Useful
for the AM to reduce some lock/unlock/wait times.
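
The saving can be sketched as follows. The queue, the lock and the
acquisition counter are hypothetical and much simpler than the real
executor; the point is only that a batch takes the lock once instead of
once per procedure.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

class BulkSubmitSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Queue<String> queue = new ArrayDeque<>();
  private int lockAcquisitions = 0;

  // Single submit: one lock/unlock cycle per procedure.
  void submit(String proc) {
    lock.lock();
    try {
      lockAcquisitions++;
      queue.add(proc);
    } finally {
      lock.unlock();
    }
  }

  // Bulk submit: one lock/unlock cycle for the whole batch.
  void submitAll(List<String> procs) {
    lock.lock();
    try {
      lockAcquisitions++;
      queue.addAll(procs);
    } finally {
      lock.unlock();
    }
  }

  int getLockAcquisitions() {
    return lockAcquisitions;
  }

  int size() {
    return queue.size();
  }
}
```

Submitting three procedures one by one costs three lock cycles in this
model, while submitAll() with the same three costs one.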





Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2016-11-18 Thread Matteo Bertozzi
I did one last pass over the mega patch. I don't see anything major that
should block the merge.

- most of the code is isolated in the backup package
- all the backup code is client side
- there are a few changes to the server side, mainly for cleaners, wal
rolling and similar (which is ok)
- there is a good number of tests, and an integration test

the code still seems to have some leftovers from the old implementation,
and some stuff needs a cleanup. but I don't think this should be used as an
argument to block the merge. I think the guys will keep working on this and
they may also get help from others once the patch is in master.

I still have my concerns about the current limitations, but these are
things already planned for phase 3, so some of this stuff may even be in
the final 2.0.
but as long as we have a "current limitations" section in the user guide
mentioning important stuff like the ones below, I'm ok with it.
 - if you write to the table with Durability.SKIP_WAL your data will not
be in the incremental backup
 - if you bulkload files that data will not be in the incremental backup
(HBASE-14417)
 - the incremental backup will contain not only the data of the table you
specified but also the regions from other tables that are on the same set
of RSs (HBASE-14141) ...maybe a note about security around this topic
 - the incremental backup will not contain just the "latest row" between
backup A and B, but also all the updates that occurred in
between. but the restore does not allow you to restore up to a certain
point in time, the restore will always be up to the "latest backup point".
 - you should limit the number of "incremental" backups to N (or maybe SIZE),
to avoid replay time becoming the bottleneck. (HBASE-14135)

I'll be ok even with the above not being in the final 2.0,
but I'd like to see these as blockers for the final 2.0 (not the merge)
 - the backup code moved into an hbase-backup module
 - and some more work around tools, especially to try to unify and
simplify the backup experience (simple example: in some cases there is a
backup_id argument, in others a backupId argument. or things like: it is
not clear whether restore, given an incremental id, will do the full restore
from the full backup up to that point or if I need to apply everything
manually).

in conclusion, I think we can open a merge vote. I'll be +1 on it, and I
think we should try to reject -1 with just a "code cleanup" motivation,
since there will still be work going on on the code after the merge.

Matteo


On Sun, Nov 6, 2016 at 10:54 PM, Devaraj Das  wrote:

> Stack and others, anything else on the patch? Merge to master now?
>


[jira] [Created] (HBASE-17104) Improve cryptic error message "Memstore size is" on region close

2016-11-15 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-17104:
---

 Summary: Improve cryptic error message "Memstore size is" on 
region close
 Key: HBASE-17104
 URL: https://issues.apache.org/jira/browse/HBASE-17104
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


while grepping my RS log for ERROR I found a cryptic
{noformat}
ERROR [RS_CLOSE_REGION-u1604vm:35021-1] regionserver.HRegion(1601): Memstore size is 33744
{noformat}

from the code it looks like we want to notify the user that on close the RS
was not able to flush and there was still data left in the memstore.
https://github.com/apache/hbase/blob/c3685760f004450667920144f926383eb307de53/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L1601
{code}
if (!canFlush) {
  this.decrMemstoreSize(new MemstoreSize(memstoreDataSize.get(),
      getMemstoreHeapOverhead()));
} else if (memstoreDataSize.get() != 0) {
  LOG.error("Memstore size is " + memstoreDataSize.get());
}
{code}
this should probably not even be an error but a warn or even info: unless we
have puts that specifically asked not to be written to the wal, the data in
the memstore should be safe in the wals.
In any case it would be nice to have a message describing what is going on
and why we are notifying about the memstore size.





[jira] [Created] (HBASE-17090) Procedure v2 - fast wake if nothing else is running

2016-11-14 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-17090:
---

 Summary: Procedure v2 - fast wake if nothing else is running
 Key: HBASE-17090
 URL: https://issues.apache.org/jira/browse/HBASE-17090
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


We wait N msec to see if we can batch more procedures, but the pattern that
we have allows us to wait only for what we know is running and avoid waiting
for something that will never get there.





[jira] [Created] (HBASE-17068) Procedure v2 - inherit region locks

2016-11-10 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-17068:
---

 Summary: Procedure v2 - inherit region locks 
 Key: HBASE-17068
 URL: https://issues.apache.org/jira/browse/HBASE-17068
 Project: HBase
  Issue Type: Sub-task
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi


Add support for inherited region locks. 
e.g. Split will have Assign/Unassign as children, which will take the lock
on the same region the split is running on





[jira] [Created] (HBASE-17067) Procedure v2 - remove zklock/tryLock and use wait/wake

2016-11-10 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-17067:
---

 Summary: Procedure v2 - remove zklock/tryLock and use wait/wake
 Key: HBASE-17067
 URL: https://issues.apache.org/jira/browse/HBASE-17067
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


Once we have HBASE-16744, HBASE-16786 and HBASE-16831, we can remove the
tryLock() methods and replace them with the wait/wake methods that use the
framework events instead of spinning until we can start the proc.
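
The difference between spinning on tryLock() and wait/wake can be sketched
like this. Everything here (class name, queues, method names) is
hypothetical and far simpler than the real scheduler: a procedure that
cannot get the lock is parked, and the release moves the next waiter to the
run queue instead of letting it retry in a loop.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class LockEventSketch {
  private boolean held = false;
  private final Queue<String> waiting = new ArrayDeque<>();
  private final Queue<String> runnable = new ArrayDeque<>();

  // Returns true if proc got the lock; otherwise parks it until a wake.
  synchronized boolean acquireOrWait(String proc) {
    if (!held) {
      held = true;
      return true;
    }
    waiting.add(proc);
    return false;
  }

  // Releases the lock; if someone is waiting, hands the lock over and
  // makes that procedure runnable.
  synchronized void releaseAndWake() {
    String next = waiting.poll();
    if (next == null) {
      held = false;
      return;
    }
    runnable.add(next); // the lock stays held on behalf of next
  }

  // Returns the next procedure that was woken up, or null if none.
  synchronized String pollRunnable() {
    return runnable.poll();
  }
}
```

A parked procedure costs nothing while it waits; it shows up in the run
queue only after the holder releases the lock.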





[jira] [Created] (HBASE-16937) Replace SnapshotType protobuf conversion when we can directly use the pojo object

2016-10-24 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16937:
---

 Summary: Replace SnapshotType protobuf conversion when we can 
directly use the pojo object
 Key: HBASE-16937
 URL: https://issues.apache.org/jira/browse/HBASE-16937
 Project: HBase
  Issue Type: Sub-task
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0
 Attachments: HBASE-16937-v0.patch

mostly find & replace work:
replace the back and forth protobuf conversion when we can just use the client 
SnapshotType enum.





Re: [DISCUSS] Drop legacy hadoop support at the 2.0 release

2016-10-24 Thread Matteo Bertozzi
cool, thanks all!

to sum up the discussion:
HBase 2.0 will have 2.6.1+ and 2.7.1+ as supported hadoop versions.
(hadoop 3.x support will be decided later, since work is still going on)

HBASE-16884 - updates the supported version map in the docs,  with the
2.6.1+ and 2.7.1+ support for 2.0
HBASE-16887 - fixes pre-commit to run with 2.6.1+, 2.7.1+, 3.0 on master (hbase
2.0) and 2.4.0+, 2.5.0+, 2.6.1+, 2.7.1+ on other branches.

On Thu, Oct 20, 2016 at 7:53 AM, 张铎  wrote:

> See HBASE-16887
>
> 2016-10-20 22:50 GMT+08:00 Sean Busbey :
>
> > Unfortunately, I think this means we'll need to update the hadoopcheck
> > versions to vary by branch.
> >
> > On Wed, Oct 19, 2016 at 6:21 PM, 张铎  wrote:
> > > OK, seems no objection. I will file a issue to modify the hadoop version
> > > support matrix.
> > >
> > > And I think we also need to change the hadoopcheck versions for our
> > > precommit check?
> > >
> > > Thanks all.
> > >
> > > 2016-10-20 1:14 GMT+08:00 Andrew Purtell :
> > >
> > >> FWIW, we are running 2.7.x in production and it's stable.
> > >>
> > >>
> > >> On Tue, Oct 18, 2016 at 10:18 PM, Sean Busbey wrote:
> > >>
> > >> > we had not decided yet AFAIK. a big concern was the lack of
> > >> > maintenance releases on more recent Hadoop versions and the perception
> > >> > that 2.4 and 2.5 were the last big stable release lines.
> > >> >
> > >> > 2.6 and 2.7 have both gotten a few maintenance releases now, so maybe
> > >> > this isn't a concern any more?
> > >> >
> > >> > On Tue, Oct 18, 2016 at 7:00 PM, Enis Söztutar wrote:
> > >> > > I thought we already decided to do that, no?
> > >> > >
> > >> > > Enis
> > >> > >
> > >> > > On Tue, Oct 18, 2016 at 6:56 PM, Ted Yu wrote:
> > >> > >
> > >> > >> Looking at http://hadoop.apache.org/releases.html , 2.5.x hasn't got new
> > >> > >> release for almost two years.
> > >> > >>
> > >> > >> Seems fine to drop support for 2.4 and 2.5
> > >> > >>
> > >> > >> On Tue, Oct 18, 2016 at 6:42 PM, Duo Zhang wrote:
> > >> > >>
> > >> > >> > This is the current hadoop version support matrix
> > >> > >> >
> > >> > >> > https://hbase.apache.org/book.html#hadoop
> > >> > >> >
> > >> > >> > 2016-10-19 9:40 GMT+08:00 Duo Zhang :
> > >> > >> >
> > >> > >> > > To be specific, hadoop-2.4.x and hadoop-2.5.x.
> > >> > >> > >
> > >> > >> > > The latest releases for these two lines are about two years ago, so I
> > >> > >> > > think it is the time to drop the support of them when 2.0 out. Then we
> > >> > >> > > could drop some code in our hadoop-compat module as we may need to add
> > >> > >> > > some code for the incoming hadoop-3.0...
> > >> > >> > >
> > >> > >> > > Thanks.
> > >> > >> > >
> > >> > >> >
> > >> > >>
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >>
> > >>- Andy
> > >>
> > >> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >> (via Tom White)
> > >>
> >
> >
> >
> > --
> > busbey
> >
>


[jira] [Created] (HBASE-16871) Procedure v2 - add waiting procs back to the queue after restart

2016-10-18 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16871:
---

 Summary: Procedure v2 - add waiting procs back to the queue after 
restart
 Key: HBASE-16871
 URL: https://issues.apache.org/jira/browse/HBASE-16871
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


Procs in WAITING_TIMEOUT state don't get re-added to the queue after restart. 





[jira] [Created] (HBASE-16865) Procedure v2 - Inherit lock from root proc

2016-10-17 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16865:
---

 Summary: Procedure v2 - Inherit lock from root proc
 Key: HBASE-16865
 URL: https://issues.apache.org/jira/browse/HBASE-16865
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0
 Attachments: HBASE-16865-v0.patch

At the moment we support inheriting locks from the parent procedure for
2-level procedures, but in the case of reopening table regions we have a
3-level procedure (ModifyTable -> ReOpen -> [Unassign/Assign]) and ReOpen
does not have any locks of its own.





[jira] [Created] (HBASE-16864) Procedure v2 - Fix StateMachineProcedure support for child procs at last step

2016-10-17 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16864:
---

 Summary: Procedure v2 - Fix StateMachineProcedure support for 
child procs at last step
 Key: HBASE-16864
 URL: https://issues.apache.org/jira/browse/HBASE-16864
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


There is a bug in the StateMachineProcedure when we add child procs in the
last step. On recovery we end up spinning on the last step without ever
completing. The fix is to introduce an eof step so recovery knows that we
are already done once all the children are terminated.
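
The role of the eof step can be sketched with a toy state machine. The
class, the EOF_STATE sentinel value and the counters are hypothetical and
not the actual StateMachineProcedure code; the sketch only shows that
persisting an explicit end marker lets a re-execution recognize that the
last step has already run.

```java
class StateMachineSketch {
  // Hypothetical sentinel meaning "all real steps have been executed".
  static final int EOF_STATE = -1;

  private final int lastStep;
  int persistedState = 0; // what would be written to the procedure store
  int lastStepRuns = 0;   // how many times the last step was executed

  StateMachineSketch(int lastStep) {
    this.lastStep = lastStep;
  }

  // Runs one step; returns true when the procedure is complete.
  boolean executeOnce() {
    if (persistedState == EOF_STATE) {
      return true; // past the last step: nothing to re-run on recovery
    }
    if (persistedState == lastStep) {
      lastStepRuns++; // e.g. the step that adds child procedures
      persistedState = EOF_STATE;
      return false; // wait for the children, then get executed again
    }
    persistedState++;
    return false;
  }
}
```

Without the marker, the persisted state after the last step would still be
the last step itself, so a re-execution after the children finish (or after
a restart) would run it again instead of completing.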





[jira] [Created] (HBASE-16846) Procedure v2 - executor cleanup

2016-10-14 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16846:
---

 Summary: Procedure v2 - executor cleanup
 Key: HBASE-16846
 URL: https://issues.apache.org/jira/browse/HBASE-16846
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0


Trying to reorganize the executor: move some code around to group common
code, remove some synchronization from Procedure, and add comments on
various sections.

the execution logic remains unchanged





[jira] [Created] (HBASE-16839) Procedure v2 - Move all protobuf handling to ProcedureUtil

2016-10-13 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16839:
---

 Summary: Procedure v2 - Move all protobuf handling to ProcedureUtil
 Key: HBASE-16839
 URL: https://issues.apache.org/jira/browse/HBASE-16839
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0
 Attachments: HBASE-16839-v0.patch

At the moment we have some of the protobuf conversion in Procedure and some in 
ProcedureUtil, let's move all the serialization to ProcedureUtil and try to 
keep the Procedure object "clean".





[jira] [Created] (HBASE-16813) Procedure v2 - Move ProcedureEvent to hbase-procedure module

2016-10-11 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16813:
---

 Summary: Procedure v2 - Move ProcedureEvent to hbase-procedure 
module
 Key: HBASE-16813
 URL: https://issues.apache.org/jira/browse/HBASE-16813
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


ProcedureEvent was added in MasterProcedureScheduler, but it is generic enough 
to move to hbase-procedure module.





[jira] [Created] (HBASE-16802) Procedure v2 - group procedure cleaning

2016-10-10 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16802:
---

 Summary: Procedure v2 - group procedure cleaning
 Key: HBASE-16802
 URL: https://issues.apache.org/jira/browse/HBASE-16802
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


group the cleaning of the evicted procedures





[jira] [Created] (HBASE-16791) Fix TestDispatchMergingRegionsProcedure#testRollbackAndDoubleExecution

2016-10-07 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16791:
---

 Summary: Fix 
TestDispatchMergingRegionsProcedure#testRollbackAndDoubleExecution
 Key: HBASE-16791
 URL: https://issues.apache.org/jira/browse/HBASE-16791
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0
 Attachments: HBASE-16791-v0.patch

Fix TestDispatchMergingRegionsProcedure#testRollbackAndDoubleExecution. The 
test is going one step over the PONR





[jira] [Created] (HBASE-16781) Fix flaky TestMasterProcedureWalLease

2016-10-05 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16781:
---

 Summary: Fix flaky TestMasterProcedureWalLease
 Key: HBASE-16781
 URL: https://issues.apache.org/jira/browse/HBASE-16781
 Project: HBase
  Issue Type: Test
  Components: proc-v2, test
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0


Attempt to fix the flaky TestMasterProcedureWalLease.






[jira] [Created] (HBASE-16779) Avoid create/disable/delete for each verification in testIllegalTableDescriptor

2016-10-05 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16779:
---

 Summary: Avoid create/disable/delete for each verification in 
testIllegalTableDescriptor
 Key: HBASE-16779
 URL: https://issues.apache.org/jira/browse/HBASE-16779
 Project: HBase
  Issue Type: Sub-task
  Components: test
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0


testIllegalTableDescriptor() calls create/disable/delete for each option that 
it wants to verify. let's try to avoid all these operations and reduce the test 
runtime





[jira] [Created] (HBASE-16777) Fix flaky TestMasterProcedureEvents

2016-10-05 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16777:
---

 Summary: Fix flaky TestMasterProcedureEvents 
 Key: HBASE-16777
 URL: https://issues.apache.org/jira/browse/HBASE-16777
 Project: HBase
  Issue Type: Test
  Components: proc-v2, test
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0
 Attachments: HBASE-16777-v0.patch

Fix the flaky TestMasterProcedureEvents test





[jira] [Created] (HBASE-16735) Procedure v2 - Fix yield while holding locks

2016-09-29 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16735:
---

 Summary: Procedure v2 - Fix yield while holding locks
 Key: HBASE-16735
 URL: https://issues.apache.org/jira/browse/HBASE-16735
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0
 Attachments: HBASE-16735-v0.patch

Sched fix for proc holding locks. the proc was added back to the queue, but the 
queue was not re-added to the runq





Re: Snapshot getting under archive folder

2016-09-28 Thread Matteo Bertozzi
exporting to archive is normal. that's how snapshot works.
hbase knows how to deal with it.

just use the restore_snapshot or clone_snapshot and you'll have your table
available.
for hbase it doesn't really matter where the files are

Matteo


On Wed, Sep 28, 2016 at 3:13 AM, ssharavanan  wrote:

> We are performing a snapshot copy from one cluster 'A' to another cluster
> 'B'.
>
> During the export snapshot process all the data falls under
> /hbasedata/archive/data/Namespace. We were expecting the data to be under
> /hbasedata/data/. Is this normal?
>
> How to have all our data inside /hbasedata/data by default during our
> export
> or restore snapshot process?
>
>
>
>
>
>
>
> --
> View this message in context: http://apache-hbase.679495.n3.
> nabble.com/Snapshot-getting-under-archive-folder-tp4082961.html
> Sent from the HBase Developer mailing list archive at Nabble.com.
>


Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-27 Thread Matteo Bertozzi
4, 2016, at 8:17 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > >
> > > > bq. procedure gives you a retry mechanism on failure
> > > >
> > > > We do need this mechanism. Take a look at the multi-step
> > > > in FullTableBackupProcedure, etc.
> > > >
> > > > bq. let the user export it later when he wants
> > > >
> > > > This would make supporting security more complex (user A shouldn't be
> > > > exporting user B's backup). And it is not user friendly - at the time
> > > > backup request is issued, the following is specified:
> > > >
> > > > +  + " BACKUP_ROOT The full root path to store the backup
> > > > image,\n"
> > > > +  + " the prefix can be hdfs, webhdfs or
> > gpfs\n"
> > > >
> > > > Backup root is an integral part of backup manifest.
> > > >
> > > > Cheers
> > > >
> > > >
> > > > On Sat, Sep 24, 2016 at 7:59 AM, Matteo Bertozzi <
> > > theo.berto...@gmail.com>
> > > > wrote:
> > > >
> > > >>> On Sat, Sep 24, 2016 at 7:19 AM, Ted Yu <yuzhih...@gmail.com>
> wrote:
> > > >>>
> > > >>> Ideally the export should have one job running which does the retry
> > (on
> > > >>> failed partition) itself.
> > > >>>
> > > >>
> > > >> procedure gives you a retry mechanism on failure. if you don't use
> > that,
> > > >> than you don't need procedure.
> > > >> if you want you can start a procedure executor in a non master
> process
> > > (the
> > > >> hbase-procedure is a separate package and does not depend on
> master).
> > > but
> > > >> again, export seems a case where you don't need procedure.
> > > >>
> > > >> like snapshot, the logic may just be: ask the master to take a
> backup.
> > > and
> > > >> let the user export it later when he wants. so you avoid having a MR
> > job
> > > >> started by the master since people does not seems to like it.
> > > >>
> > > >> for restore (I think that is where you use the MR splitter) you can
> > > >> probably just have a backup ready (already splitted). there is
> > already a
> > > >> jira that should do that HBASE-14135. instead of doing the operation
> > of
> > > >> split/merge on restore. you consolidate the backup "offline" (mr job
> > > >> started by the user) and then ask to restore the backup.
> > > >>
> > > >>
> > > >>>
> > > >>> On Sat, Sep 24, 2016 at 7:04 AM, Matteo Bertozzi <
> > > >> theo.berto...@gmail.com>
> > > >>> wrote:
> > > >>>
> > > >>>> as far as I understand the code, you don't need procedure for the
> > > >> export
> > > >>>> itself.
> > > >>>> the export operation is already idempotent, since you are just
> > copying
> > > >>>> files.
> > > >>>> if the file exist and is complete (check length, checksum, ...)
> you
> > > can
> > > >>>> skip it,
> > > >>>> otherwise you'll send it over again.
> > > >>>>
> > > >>>> you need the proc for taking the backup and restoring,
> > > >>>> because you want to complete the operation and end up with a
> > > consistent
> > > >>>> state
> > > >>>> across the multiple components you are updating (meta, fs, ...)
> > > >>>> but again, for export you can just run the tool over and over
> until
> > > the
> > > >>>> operation succeed, and that should be ok.
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> Matteo
> > > >>>>
> > > >>>>
> > > >>>>> On Sat, Sep 24, 2016 at 6:54 AM, Ted Yu <yuzhih...@gmail.com>
> > wrote:
> > > >>>>>
> > > >>>>> Master is involved in this discussion because currently only
> Master
> > > >>>>> instantiates ProcedureExecutor which runs the 3 Procedures for
> > > >> backup /
> > > >>>>> restore.
> 

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Matteo Bertozzi
On Sat, Sep 24, 2016 at 7:19 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Ideally the export should have one job running which does the retry (on
> failed partition) itself.
>

procedure gives you a retry mechanism on failure. if you don't use that,
then you don't need procedure.
if you want you can start a procedure executor in a non master process (the
hbase-procedure is a separate package and does not depend on master). but
again, export seems a case where you don't need procedure.

like snapshot, the logic may just be: ask the master to take a backup. and
let the user export it later when he wants. so you avoid having a MR job
started by the master since people do not seem to like it.

for restore (I think that is where you use the MR splitter) you can
probably just have a backup ready (already split). there is already a
jira that should do that, HBASE-14135. instead of doing the operation of
split/merge on restore, you consolidate the backup "offline" (mr job
started by the user) and then ask to restore the backup.


>
> On Sat, Sep 24, 2016 at 7:04 AM, Matteo Bertozzi <theo.berto...@gmail.com>
> wrote:
>
> > as far as I understand the code, you don't need procedure for the export
> > itself.
> > the export operation is already idempotent, since you are just copying
> > files.
> > if the file exist and is complete (check length, checksum, ...) you can
> > skip it,
> > otherwise you'll send it over again.
> >
> > you need the proc for taking the backup and restoring,
> > because you want to complete the operation and end up with a consistent
> > state
> > across the multiple components you are updating (meta, fs, ...)
> > but again, for export you can just run the tool over and over until the
> > operation succeed, and that should be ok.
> >
> >
> >
> > Matteo
> >
> >
> > On Sat, Sep 24, 2016 at 6:54 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > Master is involved in this discussion because currently only Master
> > > instantiates ProcedureExecutor which runs the 3 Procedures for backup /
> > > restore.
> > >
> > > What if an optional standalone service which hosts ProcedureExecutor is
> > > used for this purpose ?
> > > Would that have better chance of giving us middle ground so that we can
> > > move this forward ?
> > >
> > > Cheers
> > >
> > > On Fri, Sep 23, 2016 at 5:15 PM, Stack <st...@duboce.net> wrote:
> > >
> > > > (Moved out of the Master doing MR DISCUSSION)
> > > >
> > > > On Fri, Sep 23, 2016 at 12:24 PM, Vladimir Rodionov <
> > > > vladrodio...@gmail.com>
> > > > wrote:
> > > >
> > > > > >>  -1 on that backup be in core hbase
> > > > >
> > > > > Not sure I understand what it means.
> > > > >
> > > > > Sorry for the imprecision.
> > > >
> > > > The -1 is NOT against backup/restore. I am -1 on MR as a dependency
> and
> > > so
> > > > -1 on the Master running backup/restore MR jobs, even if optional.
> > > >
> > > > Master should not depend on MR. We've gone out of our way to avoid
> > taking
> > > > MR on as dependency in the past. Seems late in the game for us to
> > change
> > > > our opinion on this. If we didn't do it for distributed log
> splitting,
> > or
> > > > MOB, why would we do it to support an optional backup/restore?
> > > >
> > > > I have opinions on the questions below -- i.e. that Master running
> > > > backup/restore is outside of the Master's charge -- but they are not
> > > worth
> > > > much since I've not done much by way of review or contrib to
> > > backup/restore
> > > > other than to try it as a 'user' so I'll keep them to myself until I
> > do.
> > > I
> > > > only came out from under my shell to participate on the MR as
> > dependency
> > > > chat.
> > > >
> > > > Thanks,
> > > > M
> > > >
> > > >
> > > > 1. We are not allowed to use Master to orchestrate the whole process?
> > > >
> > > >
> > > > We
> > > > > have already brought up all advantages of using
> > > > >Master and distributed procedures for backup and restore.
> > > > >
> > > > >
> > > > > Downside of moving this to client tool is lack of fault tolerance:
> > > > >  1.1 Client won't be 

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Matteo Bertozzi
as far as I understand the code, you don't need procedure for the export
itself.
the export operation is already idempotent, since you are just copying
files.
if the file exists and is complete (check length, checksum, ...) you can
skip it,
otherwise you'll send it over again.

you need the proc for taking the backup and restoring,
because you want to complete the operation and end up with a consistent
state
across the multiple components you are updating (meta, fs, ...)
but again, for export you can just run the tool over and over until the
operation succeeds, and that should be ok.
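
a minimal sketch of that "skip if complete, otherwise send again" logic,
using local files instead of HDFS. the class and method names are made up,
and a byte-by-byte comparison stands in for the checksum check; this is not
the export tool's actual code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

class IdempotentCopySketch {
  // Copies src to dst unless dst already exists with the same length and
  // content. Returns true if a copy was performed, false if it was skipped.
  static boolean copyIfNeeded(Path src, Path dst) throws IOException {
    if (Files.exists(dst)
        && Files.size(dst) == Files.size(src)
        && Arrays.equals(Files.readAllBytes(src), Files.readAllBytes(dst))) {
      return false; // already complete: safe to skip
    }
    // Missing or partial file: send it over again.
    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
    return true;
  }
}
```

running it twice copies the file once; a truncated destination is detected
by the length check and overwritten, which is why re-running the whole
export until it succeeds is safe.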



Matteo


On Sat, Sep 24, 2016 at 6:54 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Master is involved in this discussion because currently only Master
> instantiates ProcedureExecutor which runs the 3 Procedures for backup /
> restore.
>
> What if an optional standalone service which hosts ProcedureExecutor is
> used for this purpose ?
> Would that have better chance of giving us middle ground so that we can
> move this forward ?
>
> Cheers
>
> On Fri, Sep 23, 2016 at 5:15 PM, Stack <st...@duboce.net> wrote:
>
> > (Moved out of the Master doing MR DISCUSSION)
> >
> > On Fri, Sep 23, 2016 at 12:24 PM, Vladimir Rodionov <
> > vladrodio...@gmail.com>
> > wrote:
> >
> > > >>  -1 on that backup be in core hbase
> > >
> > > Not sure I understand what it means.
> > >
> > > Sorry for the imprecision.
> >
> > The -1 is NOT against backup/restore. I am -1 on MR as a dependency and
> so
> > -1 on the Master running backup/restore MR jobs, even if optional.
> >
> > Master should not depend on MR. We've gone out of our way to avoid taking
> > MR on as dependency in the past. Seems late in the game for us to change
> > our opinion on this. If we didn't do it for distributed log splitting, or
> > MOB, why would we do it to support an optional backup/restore?
> >
> > I have opinions on the questions below -- i.e. that Master running
> > backup/restore is outside of the Master's charge -- but they are not
> worth
> > much since I've not done much by way of review or contrib to
> backup/restore
> > other than to try it as a 'user' so I'll keep them to myself until I do.
> I
> > only came out from under my shell to participate on the MR as dependency
> > chat.
> >
> > Thanks,
> > M
> >
> >
> > 1. We are not allowed to use Master to orchestrate the whole process?
> >
> >
> > We
> > > have already brought up all advantages of using
> > >Master and distributed procedures for backup and restore.
> > >
> > >
> > > Downside of moving this to client tool is lack of fault tolerance:
> > >  1.1 Client won't be allowed to do any operations, that can,
> potentially
> > > affect
> > > cluster, such as disabling splits/merges, balancer.
> > >  1.2 In case of client failure who will be doing the whole rollback
> > stuff?
> > > We are trying to make it atomic.
> > >
> > > Security is not clear.
> >
> >
> >
> > 2. We are not allowed to modify code of existing HBase core classes (what
> > > does core mean anyway)?
> > >
> > >
> >
> >
> > > 3. We are not allowed to create backup system table (hbase:backup) in a
> > > system space? Only in user space? The table is global.
> > >
> >
> >
> > > 2. is critical. Despite the fact, that 95% of code is new, we have
> > touched,
> > > of course some existing HBase code.
> > > 3. is not that critical, of course we can move backup system into user
> > > space.
> > >
> > > And finally, will moving backup into external tool give us +1 from
> stack?
> > >
> > > -Vlad
> > >
> > >
> > >
> > >
> > >
> > > On Fri, Sep 23, 2016 at 11:26 AM, Stack <st...@duboce.net> wrote:
> > >
> > > > On Fri, Sep 23, 2016 at 11:22 AM, Vladimir Rodionov <
> > > > vladrodio...@gmail.com>
> > > > wrote:
> > > >
> > > > > >> + MR is dead
> > > > >
> > > > > Does MR know that? :)
> > > > >
> > > > > Again. With all due respect, stack - still no suggestions what
> should
> > > we
> > > > > use for "bulk data move and transformation" instead of MR?
> > > > >
> > > >
> > > > Use whatever distributed engine suits your fancy -- MR, Spark,
> > > distributed
> > > >

[jira] [Created] (HBASE-16697) bump TestRegionServerMetrics to LargeTests

2016-09-23 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16697:
---

 Summary: bump TestRegionServerMetrics to LargeTests
 Key: HBASE-16697
 URL: https://issues.apache.org/jira/browse/HBASE-16697
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


TestRegionServerMetrics keeps failing because it exceeds the MediumTests time 
limit. Bump it to LargeTests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16695) Procedure v2 - Support for parent holding locks

2016-09-23 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16695:
---

 Summary: Procedure v2 - Support for parent holding locks
 Key: HBASE-16695
 URL: https://issues.apache.org/jira/browse/HBASE-16695
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


Add the logic to allow child procs to be executed when the parent is holding 
the xlock. 
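
A rough sketch of the intended rule, in Python for brevity and under loud assumptions: ExclusiveLock, try_acquire and parent_id are hypothetical names, not the proc-v2 scheduler API. It only shows the rule itself: when the xlock is taken, the holder and its direct children may run, everyone else may not.

```python
class ExclusiveLock:
    """Hypothetical exclusive lock that the holder's children may pass through."""

    def __init__(self):
        self.owner = None  # id of the procedure holding the xlock, if any

    def try_acquire(self, proc_id, parent_id=None):
        """Return True if proc_id may run under this lock."""
        if self.owner is None:
            self.owner = proc_id
            return True
        # The lock is taken: only the holder itself or one of its
        # direct children is allowed to proceed.
        return self.owner == proc_id or self.owner == parent_id

    def release(self, proc_id):
        """Release the lock; only the holder can release it."""
        if self.owner == proc_id:
            self.owner = None
            return True
        return False
```

In this sketch a child never becomes the owner, so the lock goes away only when the parent releases it.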



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] MR jobs started by Master or RS

2016-09-23 Thread Matteo Bertozzi
Tool?
> > > > > > > >
> > > > > > > > On Sep 22, 2016, at 7:57 PM, Vladimir Rodionov <
> > > > > vladrodio...@gmail.com
> > > > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >>>> In our production cluster,  it is a common case we just
> have
> > > > HDFS
> > > > > > and
> > > > > > > >>>> HBase deployed.
> > > > > > > >>>> If our Master/RS depend on MR framework (especially some
> > > > features
> > > > > we
> > > > > > > >>>> have not used at all),  it introduced another cost for
> > > maintain.
> > > > > I
> > > > > > > >>>> don't think it is a good idea.
> > > > > > > >>
> > > > > > > >> So , you are not backup users in this case. Many our
> customers
> > > > have
> > > > > > full
> > > > > > > >> stack deployed and
> > > > > > > >> want see backup to be a standard feature. Besides this,
> > nothing
> > > > will
> > > > > > > happen
> > > > > > > >> in your cluster
> > > > > > > >> if you won't be doing backups.
> > > > > > > >>
> > > > > > > >> This discussion (we do not want see M/R dependency) goes to
> > > > nowhere.
> > > > > > We
> > > > > > > >> asked already, at least twice, to suggest another framework
> > > (other
> > > > > > than
> > > > > > > M/R)
> > > > > > > >> for bulk data copy with *conversion*. Still waiting for
> > > > suggestions.
> > > > > > > >>
> > > > > > > >> -Vlad
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>> On Thu, Sep 22, 2016 at 7:49 PM, Ted Yu <
> yuzhih...@gmail.com
> > >
> > > > > wrote:
> > > > > > > >>>
> > > > > > > >>> If MR framework is not deployed in the cluster, hbase still
> > > > > functions
> > > > > > > >>> normally (post merge).
> > > > > > > >>>
> > > > > > > >>> In terms of build time dependency, we have long been
> > depending
> > > on
> > > > > > > >>> mapreduce. Take a look at ExportSnapshot.
> > > > > > > >>>
> > > > > > > >>> Cheers
> > > > > > > >>>
> > > > > > > >>> On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen <
> > > > > heng.chen.1...@gmail.com
> > > > > > >
> > > > > > > >>> wrote:
> > > > > > > >>>
> > > > > > > >>>> In our production cluster,  it is a common case we just
> have
> > > > HDFS
> > > > > > and
> > > > > > > >>>> HBase deployed.
> > > > > > > >>>> If our Master/RS depend on MR framework (especially some
> > > > features
> > > > > we
> > > > > > > >>>> have not used at all),  it introduced another cost for
> > > maintain.
> > > > > I
> > > > > > > >>>> don't think it is a good idea.
> > > > > > > >>>>
> > > > > > > >>>> 2016-09-23 10:28 GMT+08:00 张铎 <palomino...@gmail.com>:
> > > > > > > >>>>> To be specific, for example, our nice Backup/Restore
> > feature,
> > > > if
> > > > > we
> > > > > > > >>> think
> > > > > > > >>>>> this is not a core feature of HBase, then we could make
> it
> > > > depend
> > > > > > on
> > > > > > > >>> MR,
> > > > > > > >>>>> and start a standalone BackupManager instance that
> submits
> > MR
> > > > > jobs
> > > > >

Re: [DISCUSSION] MR jobs started by Master or RS

2016-09-22 Thread Matteo Bertozzi
just a remark. my query was not about tools using MR (I think everyone is
ok with those).
the topic was: "are we ok with running MR jobs from Master and RS
code?", since this would be the first time we do this.

Matteo


On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <d...@hortonworks.com> wrote:

> Very much agree; for tools like ExportSnapshot / Backup / Restore, it's
> fine to be dependent on MR. MR is the right framework for such. We should
> also do compactions using MR (just saying :) )
> 
> From: Ted Yu <yuzhih...@gmail.com>
> Sent: Thursday, September 22, 2016 2:00 PM
> To: dev@hbase.apache.org
> Subject: Re: [DISCUSSION] MR jobs started by Master or RS
>
> I agree - backup / restore is in the same category as import / export.
>
> On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <andrew.purt...@gmail.com>
> wrote:
>
> > Backup is extra tooling around core in my opinion. Like import or export.
> > Or the optional MOB tool. It's fine.
> >
> > > On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <mberto...@apache.org>
> > wrote:
> > >
> > > What's the latest opinion around running MR jobs from hbase (Master or
> > RS)?
> > >
> > > I remember in the past that there was discussion about not having MR
> has
> > > direct dependency of hbase.
> > >
> > > I think some of discussion where around MOB that had a MR job to
> compact,
> > > that later was transformed in a non-MR job to be merged, I think we
> had a
> > > similar discussion for log split/replay.
> > >
> > > the latest is the new Backup feature (HBASE-7912), that runs a MR job
> > from
> > > the master to copy data or restore data.
> > > (backup is also "not really core" as in.. if you don't use backup
> you'll
> > > not end up running MR jobs, but this was probably true for MOB as in
> "if
> > > you don't enable MOB you don't need MR")
> > >
> > > any thoughts? do we a rule that says "we don't want to have hbase run
> MR
> > > jobs, only tool started manually by the user can do that". or can we
> > start
> > > adding MR calls around without problems?
> >
>


[DISCUSSION] MR jobs started by Master or RS

2016-09-22 Thread Matteo Bertozzi
What's the latest opinion around running MR jobs from hbase (Master or RS)?

I remember that in the past there was discussion about not having MR as a
direct dependency of hbase.

I think some of the discussion was around MOB, which had an MR job to compact
that was later transformed into a non-MR job in order to be merged. I think we
had a similar discussion for log split/replay.

the latest is the new Backup feature (HBASE-7912), that runs a MR job from
the master to copy data or restore data.
(backup is also "not really core" as in.. if you don't use backup you'll
not end up running MR jobs, but this was probably true for MOB as in "if
you don't enable MOB you don't need MR")

any thoughts? do we have a rule that says "we don't want to have hbase run MR
jobs, only tools started manually by the user can do that"? or can we start
adding MR calls around without problems?


[jira] [Created] (HBASE-16688) Split TestMasterFailoverWithProcedures

2016-09-22 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16688:
---

 Summary: Split TestMasterFailoverWithProcedures
 Key: HBASE-16688
 URL: https://issues.apache.org/jira/browse/HBASE-16688
 Project: HBase
  Issue Type: Bug
  Components: proc-v2, test
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


Extract the WAL lease tests from TestMasterFailoverWithProcedures, leaving 
TestMasterFailoverWithProcedures with only the proc tests on master failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16671) Split TestExportSnapshot

2016-09-21 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16671:
---

 Summary: Split TestExportSnapshot
 Key: HBASE-16671
 URL: https://issues.apache.org/jira/browse/HBASE-16671
 Project: HBase
  Issue Type: Test
  Components: snapshots, test
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0
 Attachments: HBASE-16671-v0.patch

TestExportSnapshot contains 3 types of tests:
 - MiniCluster creating a table, taking a snapshot and running export
 - Mocked snapshot running export
 - tool helpers tests

Since everything is currently packed into a single test class, types 2 and 3 
end up with a before and after that create a table and take a snapshot which 
are never used. Move those tests out to cut some time from the test run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2016-09-21 Thread Matteo Bertozzi
let me do a "mega patch" review pass.
Is this the latest? https://reviews.apache.org/r/51823/

Matteo


On Wed, Sep 21, 2016 at 7:43 AM, Ted Yu  wrote:

> Are there more (review) comments ?
>
> Thanks
>
> On Tue, Sep 20, 2016 at 10:02 AM, Devaraj Das 
> wrote:
>
> > Just reviving this thread. Thanks Sean, Stack, Dima, and others for the
> > thorough reviews and testing. Thanks Ted and Vlad for taking care of the
> > feedback. Are we all good to do the merge now? Rather do sooner than
> later.
> > 
> > From: saint@gmail.com  on behalf of Stack <
> > st...@duboce.net>
> > Sent: Monday, September 12, 2016 1:18 PM
> > To: HBase Dev List
> > Subject: Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912
> >
> > On Mon, Sep 12, 2016 at 12:19 PM, Ted Yu  wrote:
> >
> > > Mega patch (rev 18) is on HBASE-14123.
> > >
> > > Please comment on HBASE-14123 on how you want to review.
> > >
> >
> >
> > Yeah. That was my lost tab. Last rb was 6 months ago. Suggest updating
> it.
> > RB is pretty good for review. Patch is only 1.5M so should be fine.
> >
> > St.Ack
> >
> >
> > >
> > > Thanks
> > >
> > > On Mon, Sep 12, 2016 at 12:15 PM, Stack  wrote:
> > >
> > > > On review of the 'patch', do I just compare the branch to master or
> is
> > > > there a megapatch posted somewhere (I think I saw one but it seemed
> > stale
> > > > and then I 'lost' the tab). Sorry for dumb question.
> > > > St.Ack
> > > >
> > > > On Mon, Sep 12, 2016 at 12:01 PM, Stack  wrote:
> > > >
> > > > > Late to the game. A few comments after rereading this thread as a
> > > 'user'.
> > > > >
> > > > > + Before merge, a user-facing feature like this should work (If
> this
> > is
> > > > "higher-bar
> > > > > for new features", bring it on -- smile).
> > > > > + As a user, I tried the branch with tools after reviewing the
> > > > just-posted
> > > > > doc. I had an 'interesting' experience (left comments up on
> issue). I
> > > > think
> > > > > the tooling/doc. important to get right. If it breaks easily or is
> > > > > inconsistent (or lacks 'polish'), operators will judge the whole
> > > > > backup/restore tooling chain as not trustworthy and abandon it.
> Lets
> > > not
> > > > > have this happen to this feature.
> > > > > + Matteo's suggestion (with a helpful starter list) that there
> needs
> > to
> > > > be
> > > > > explicit qualification on what is actually being delivered --
> > > including a
> > > > > listing of limitations (some look serious such as data bleed from
> > other
> > > > > regions in WALs, but maybe I don't care for my use case...) --
> needs
> > to
> > > > > accompany the merge. Lets fold them into the user doc. in the
> > technical
> > > > > overview area as suggested so user expectations are properly
> managed
> > > > > (otherwise, they expect the world and will just give up when we
> fall
> > > > > short). Vladimir did a list of what is in each of the phases above
> > > which
> > > > > would serve as a good start.
> > > > > + Is this feature 'experimental' (Matteo asks above). I'd prefer it
> > is
> > > > > not. If it is, it should be labelled all over that it is so. I see
> > > > current
> > > > > state called out as a '... technical preview feature'. Does this
> mean
> > > > > not-for-users?
> > > > >
> > > > > St.Ack
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Sep 12, 2016 at 8:03 AM, Ted Yu 
> wrote:
> > > > >
> > > > >> Sean:
> > > > >> Do you have more comments ?
> > > > >>
> > > > >> Cheers
> > > > >>
> > > > >> On Fri, Sep 9, 2016 at 1:42 PM, Vladimir Rodionov <
> > > > vladrodio...@gmail.com
> > > > >> >
> > > > >> wrote:
> > > > >>
> > > > >> > Sean,
> > > > >> >
> > > > >> > Backup/Restore can fail due to various reasons: network outage
> > > > (cluster
> > > > >> > wide), various time-outs in HBase and HDFS layer, M/R failure
> due
> > to
> > > > >> "HDFS
> > > > >> > exceeded quota", user error (manual deletion of data) and so on
> so
> > > on.
> > > > >> That
> > > > >> > is impossible to enumerate all possible types of failures in a
> > > > >> distributed
> > > > >> > system - that is not our goal/task.
> > > > >> >
> > > > >> > We focus completely on backup system table consistency in a
> > presence
> > > > of
> > > > >> any
> > > > >> > type of failure. That is what I call "tolerance to failures".
> > > > >> >
> > > > >> > On a failure:
> > > > >> >
> > > > >> > BACKUP. All backup system information (prior to backup) will be
> > > > restored
> > > > >> > and all temporary data, related to a failed session, in HDFS
> will
> > be
> > > > >> > deleted
> > > > >> > RESTORE. We do not care about system data, because restore does
> > not
> > > > >> change
> > > > >> > it. Temporary data in HDFS will be cleaned up and table will be
> > in a
> > 

[jira] [Created] (HBASE-16634) Speedup TestExportSnapshot

2016-09-14 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16634:
---

 Summary: Speedup TestExportSnapshot
 Key: HBASE-16634
 URL: https://issues.apache.org/jira/browse/HBASE-16634
 Project: HBase
  Issue Type: Test
  Components: snapshots, test
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.4.0
 Attachments: HBASE-16634-v0.patch

TestExportSnapshot is a long and heavy test since it has to take snapshots, 
export them (via MR), restore them and so on. In general this makes the test 
flaky due to slowness.

Let's try to speed it up a bit by reducing the number of regions of the table 
we are testing on. It has 9 regions at the moment; we can reduce it at least to 2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-16619) Following the quick start guide fails to launch master due to SecureBulkLoadEndpoint

2016-09-12 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi resolved HBASE-16619.
-
Resolution: Duplicate

This is the same as HBASE-16292 and HBASE-16427, and it will be fixed by HBASE-16257.

> Following the quick start guide fails to launch master due to 
> SecureBulkLoadEndpoint
> 
>
> Key: HBASE-16619
> URL: https://issues.apache.org/jira/browse/HBASE-16619
> Project: HBase
>  Issue Type: Bug
>  Components: documentation, master
>Affects Versions: 2.0.0
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Critical
>
> Followed the quick start guide for standalone hbase instance. HMaster 
> instance doesn't start. Log file has following error:
> {code}
> 2016-09-12 11:53:43,580 ERROR [main] regionserver.SecureBulkLoadManager: 
> Failed to create or set permission on staging directory 
> /user/user1/hbase-staging
> ExitCodeException exitCode=1: chmod: /user/user1/hbase-staging: No such file 
> or directory
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
> at org.apache.hadoop.util.Shell.run(Shell.java:456)
> at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
> at org.apache.hadoop.util.Shell.execCommand(Shell.java:815)
> at org.apache.hadoop.util.Shell.execCommand(Shell.java:798)
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:728)
> at 
> org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
> at 
> org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager.start(SecureBulkLoadManager.java:124)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:626)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:409)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.<init>(HMasterCommandLine.java:307)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:140)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:221)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:156)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:226)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2469)
> 2016-09-12 11:53:43,581 ERROR [main] master.HMasterCommandLine: Master exiting
> {code}
> Workaround for this is to add following config point to conf/hbase-site.xml:
> {code}
>   <property>
>     <name>hbase.bulkload.staging.dir</name>
>     <value>file:///Users/user1/dev/apache/hbase-standalone/staging</value>
>   </property>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16618) Procedure v2 - Add base class for table and ns procedures

2016-09-12 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16618:
---

 Summary: Procedure v2 - Add base class for table and ns procedures
 Key: HBASE-16618
 URL: https://issues.apache.org/jira/browse/HBASE-16618
 Project: HBase
  Issue Type: Sub-task
  Components: master, proc-v2
Affects Versions: 2.0.0, 1.4.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.4.0


Now that we have a bunch of procedures implemented, we can add a base class for 
the Table and Namespace procedures with a couple of the common patterns used 
(e.g. basic locking, toString, ...).
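
To illustrate the kind of sharing meant here, a small sketch in Python. It is an assumption-heavy illustration, not the classes added by this issue: BaseProcedure, TableProcedure, NamespaceProcedure and lock_key are hypothetical names, and a plain dict stands in for the scheduler's locks.

```python
from abc import ABC, abstractmethod


class BaseProcedure(ABC):
    """Hypothetical base class holding what table and namespace procs share."""

    def __init__(self, proc_id, locks):
        self.proc_id = proc_id
        self.locks = locks  # shared dict: lock key -> id of the holder

    @abstractmethod
    def lock_key(self):
        """Name of the resource this procedure must hold exclusively."""

    def acquire_lock(self):
        """Basic locking: take the key if it is free or already ours."""
        holder = self.locks.setdefault(self.lock_key(), self.proc_id)
        return holder == self.proc_id

    def release_lock(self):
        """Drop the key, but only if this procedure is the holder."""
        if self.locks.get(self.lock_key()) == self.proc_id:
            del self.locks[self.lock_key()]

    def __str__(self):
        # Common toString: class name, proc id and target resource.
        return f"{type(self).__name__}(procId={self.proc_id}, {self.lock_key()})"


class TableProcedure(BaseProcedure):
    def __init__(self, proc_id, locks, table):
        super().__init__(proc_id, locks)
        self.table = table

    def lock_key(self):
        return f"table={self.table}"


class NamespaceProcedure(BaseProcedure):
    def __init__(self, proc_id, locks, namespace):
        super().__init__(proc_id, locks)
        self.namespace = namespace

    def lock_key(self):
        return f"namespace={self.namespace}"
```

A concrete procedure then only has to say which resource it targets; locking and the string form come from the base class.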



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16617) Procedure v2 - Improvements

2016-09-12 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16617:
---

 Summary: Procedure v2 - Improvements
 Key: HBASE-16617
 URL: https://issues.apache.org/jira/browse/HBASE-16617
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor


Umbrella jira for all the jiras that add helpers, utilities and 
improvements to the proc framework or to the master procs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2016-09-09 Thread Matteo Bertozzi
we should probably have a "current limitations" section in the user guide
(maybe near the technical details).
some of these limitations may still be there in the final 2.0, since some
tasks are marked as phase3,
but I think it is important to mention stuff like:
 - if you write to the table with Durability.SKIP_WAL your data will not
be in the incremental backup
 - if you bulkload files, that data will not be in the incremental backup
(HBASE-14417)
 - the incremental backup will contain not only the data of the table you
specified but also the regions from other tables that are on the same set
of RSs (HBASE-14141) ...maybe a note about security around this topic
 - the incremental backup will not contain just the "latest row" between
backup A and B, it will also contain all the updates that occurred in
between. but the restore does not allow you to restore up to a certain
point in time, the restore will always be up to the "latest backup point".
 - you should limit the number of "incremental" backups to N (or maybe SIZE),
to avoid replay time becoming the bottleneck. (HBASE-14135)


On Fri, Sep 9, 2016 at 12:25 PM, Vladimir Rodionov 
wrote:

> User Guide, prepared by our tech writer Frank Welsh, was attached to
> HBASE-7912.
>
> -Vlad
>
> On Fri, Sep 9, 2016 at 12:16 PM, Vladimir Rodionov  >
> wrote:
>
> > Do not worry Sean, doc is coming today as a preview and our writer Frank
> > will be working on a putting  it into Apache repo. Timeline depends on
> > Franks schedule but I hope we will get it rather sooner than later.
> >
> > As for failure testing, we are focusing only on a consistent state of
> > backup system data in a presence of any type of failures, We are not
> going
> > to implement  anything more "fancy", than that. We allow both: backup and
> > restore to fail. What we do not allow is to have system data corrupted.
> > Will it suffice for you? Do you have any other concerns, you want us to
> > address?
> >
> > -Vlad
> >
> >
> > On Fri, Sep 9, 2016 at 10:56 AM, Sean Busbey  wrote:
> >
> >> "docs will come to Apache soon" does not address my concern around docs
> at
> >> all, unless said docs have already made it into the project repo. I
> don't
> >> want third party resources for using a major and important feature of
> the
> >> project, I want us to provide end users with what they need to get the
> job
> >> done.
> >>
> >> I see some calls for patience on the failure testing, but the appeal to
> us
> >> having done a bad job of requiring proper tests of previous features
> just
> >> makes me more concerned about not getting them here. I don't want to set
> >> yet another bad example that will then be pointed to in the future.
> >>
> >> On Sep 8, 2016 10:50, "Ted Yu"  wrote:
> >>
> >> > Is there any concern which is not addressed ?
> >> >
> >> > Do we need another Vote thread ?
> >> >
> >> > Thanks
> >> >
> >> > On Thu, Sep 8, 2016 at 9:21 AM, Andrew Purtell 
> >> > wrote:
> >> >
> >> > > Vlad,
> >> > >
> >> > > I apologize for using the term 'half-baked' in a way that could
> seem a
> >> > > description of HBASE-7912. I meant that as a general hypothetical.
> >> > >
> >> > > On Wed, Sep 7, 2016 at 9:36 AM, Vladimir Rodionov <
> >> > vladrodio...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > >> I'm not sure that "There is already lots of half-baked code in
> >> the
> >> > > > branch,
> >> > > > so what's the harm in adding more?"
> >> > > >
> >> > > > I meant - not production - ready yet. This is 2.0 development
> branch
> >> > and,
> >> > > > hence many features are in works,
> >> > > > not being tested well etc. I do not consider backup as half baked
> >> > > feature -
> >> > > > it has passed our internal QA and has very good doc, which we will
> >> > > provide
> >> > > > to Apache shortly.
> >> > > >
> >> > > > -Vlad
> >> > > >
> >> > > > On Wed, Sep 7, 2016 at 9:13 AM, Andrew Purtell <
> apurt...@apache.org
> >> >
> >> > > > wrote:
> >> > > >
> >> > > > > We shouldn't admit half baked changes that won't be finished.
> >> However
> >> > > in
> >> > > > > this case the crew working on this feature are long timers and
> >> less
> >> > > > likely
> >> > > > > than just about anyone to leave something in a half baked state.
> >> Of
> >> > > > course
> >> > > > > there is no guarantee how anything will turn out, but I am
> >> willing to
> >> > > > take
> >> > > > > a little on faith if they feel their best path forward now is to
> >> > merge
> >> > > to
> >> > > > > trunk. I only wish I had bandwidth to have done some real
> kicking
> >> of
> >> > > the
> >> > > > > tires by now. Maybe this week.
> >> > > > >
> >> > > > > (Yes, I'm using some of that time for this email :-) but I type
> >> > fast.)
> >> > > > >
> >> > > > > That said, I would like to agitate for making 2.0 more real and
> >> spend
> >> > > > some
> >> > > > > time on it now that I'm winding down with 0.98. I think that
> means
> >> > > > > 

[jira] [Created] (HBASE-16587) Procedure v2 - Cleanup suspended proc execution

2016-09-08 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16587:
---

 Summary: Procedure v2 - Cleanup suspended proc execution
 Key: HBASE-16587
 URL: https://issues.apache.org/jira/browse/HBASE-16587
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


For procedures like the assignment or the lock one we need to be able to hold 
on to locks while suspended. At the moment the way to do that is up to the proc 
implementation. This patch moves the logic to the base Procedure and 
ProcedureExecutor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16586) Procedure v2 - Cleanup sched wait/lock semantic

2016-09-08 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16586:
---

 Summary: Procedure v2 - Cleanup sched wait/lock semantic
 Key: HBASE-16586
 URL: https://issues.apache.org/jira/browse/HBASE-16586
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


For some reason waitEvent() and waitRegion() had mismatching return values. 
Unify the wait semantics: return true if we wait, false if we don't wait.
Procedures using hasLock = waitRegion() should change to hasLock = 
!waitRegion(). At the moment only DispatchMergingRegionsProcedure uses 
it (in master).
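
The unified convention can be sketched as follows, in Python for brevity. RegionLockQueue, wait_region and wake_region are hypothetical names that only mirror the semantics described above (True means "the caller has to wait"); they are not the actual scheduler code.

```python
class RegionLockQueue:
    """Hypothetical scheduler fragment tracking who holds each region."""

    def __init__(self):
        self._owner = {}    # region name -> id of the procedure holding it
        self._waiting = {}  # region name -> ids of procedures parked on it

    def wait_region(self, proc_id, region):
        """Return True if the caller has to wait, False if it got the lock."""
        owner = self._owner.get(region)
        if owner is None or owner == proc_id:
            self._owner[region] = proc_id
            return False
        self._waiting.setdefault(region, []).append(proc_id)
        return True

    def wake_region(self, proc_id, region):
        """Release the region and return the ids that were waiting on it."""
        if self._owner.get(region) == proc_id:
            del self._owner[region]
        return self._waiting.pop(region, [])
```

With this convention a caller writes has_lock = not queue.wait_region(proc_id, region), which is the hasLock = !waitRegion() change mentioned above.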




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: branch-2: Judgement Day

2016-09-07 Thread Matteo Bertozzi
the idea of end of September is to cut the branch, to be able to work on
fixing stuff.
AM, backup, offheap/protobuf/... will probably not be fully finished, but
the idea is to have testing running, plus fixes and finishing work on those
features.

the fs API is probably ok to backport even at a late stage, and it can
probably be done on a branch-1.x since it's all hidden behind
server/private interfaces.
but major stuff that has no code yet is probably not going to make 2.0


Matteo


On Wed, Sep 7, 2016 at 9:56 AM, Sean Busbey <bus...@apache.org> wrote:

> New filesystem layout def not ready by end of sep. API changes maybe.
>
> On Sep 7, 2016 11:47, "Dima Spivak" <dimaspi...@apache.org> wrote:
>
> > New filesystem layout?
> >
> > On Wednesday, September 7, 2016, Matteo Bertozzi <
> theo.berto...@gmail.com>
> > wrote:
> >
> > > my idea was to cut branch-2 when we have the new AM.
> > > my guess is that a fully working AM that we can consider for inclusion
> > will
> > > bring us to end of september.
> > >
> > > other stuff in 2.0 that are in-progress are
> > >  - the offheap work/memstore-compaction/protobuf3
> > >  - the backup work
> > >  - HLC support
> > >
> > > anything else big am I missing?
> > >
> > > Matteo
> > >
> > >
> > > > On Wed, Sep 7, 2016 at 9:02 AM, Andrew Purtell <apurt...@apache.org>
> > > > wrote:
> > >
> > > > Yes, let's kill it.
> > > >
> > > > We should talk about 2.0. I'm winding down RM of 0.98 branch and the
> > time
> > > > I've had available for that can go into something, IMHO, more useful
> > for
> > > > project momentum like making 2.0 real. I'd like to help. Can we /
> > should
> > > we
> > > > branch for 2.0 relatively soon?
> > > >
> > > >
> > > > > On Wed, Sep 7, 2016 at 8:58 AM, Dima Spivak <dimaspi...@apache.org>
> > > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I remember that Nick pointed this out a while back, but I see a
> stray
> > > > > branch-2 on GitHub again [1]. Can we kill it?
> > > > >
> > > > > 1. https://github.com/apache/hbase/commits/branch-2
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > >
> > > >- Andy
> > > >
> > > > Problems worthy of attack prove their worth by hitting back. - Piet
> > Hein
> > > > (via Tom White)
> > > >
> > >
> >
> >
> > --
> > -Dima
> >
>


Re: branch-2: Judgement Day

2016-09-07 Thread Matteo Bertozzi
my idea was to cut branch-2 when we have the new AM.
my guess is that a fully working AM that we can consider for inclusion will
bring us to the end of September.

other stuff in 2.0 that is in progress:
 - the offheap work/memstore-compaction/protobuf3
 - the backup work
 - HLC support

am I missing anything else big?

Matteo


On Wed, Sep 7, 2016 at 9:02 AM, Andrew Purtell  wrote:

> Yes, let's kill it.
>
> We should talk about 2.0. I'm winding down RM of 0.98 branch and the time
> I've had available for that can go into something, IMHO, more useful for
> project momentum like making 2.0 real. I'd like to help. Can we / should we
> branch for 2.0 relatively soon?
>
>
> On Wed, Sep 7, 2016 at 8:58 AM, Dima Spivak  wrote:
>
> > Hi all,
> >
> > I remember that Nick pointed this out a while back, but I see a stray
> > branch-2 on GitHub again [1]. Can we kill it?
> >
> > 1. https://github.com/apache/hbase/commits/branch-2
> >
>
>
>
> --
> Best regards,
>
>- Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>


[jira] [Resolved] (HBASE-16519) Procedure v2 - Avoid sync wait on DDLs operation

2016-09-02 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi resolved HBASE-16519.
-
Resolution: Fixed

> Procedure v2 - Avoid sync wait on DDLs operation
> 
>
> Key: HBASE-16519
> URL: https://issues.apache.org/jira/browse/HBASE-16519
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2
>Affects Versions: 2.0.0
>    Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
> Attachments: HBASE-16519-addendum.patch, HBASE-16519-v0.patch
>
>
> Some operations (ModifyColumnFamily, AddColumnFamily, DeleteColumnFamily, 
> ModifyTable, TruncateTable) are still synchronous on the master side: they 
> wait until the operation completes before returning. 
> This was done to keep the sync behavior for old clients, but instead of using 
> the procLatch, which recognizes the client version and decides whether the 
> operation should be sync or not, the code always waits, making the 
> client-side procedure fault tolerance ineffective.
> Also, the add/delete/modifyColumnFamily operations do not seem to follow the 
> Async() naming in master, and the comments claim they are async but everyone 
> uses them as sync. (This is something from HBASE-13538.)





[jira] [Created] (HBASE-16552) MiniHBaseCluster#getServerWith() does not ignore stopped RSs

2016-09-02 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16552:
---

 Summary: MiniHBaseCluster#getServerWith() does not ignore stopped 
RSs
 Key: HBASE-16552
 URL: https://issues.apache.org/jira/browse/HBASE-16552
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.98.21, 1.2.2, 1.1.5, 2.0.0, 1.3.0, 1.4.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


MiniHBaseCluster#getServerWith() does not ignore stopped RSs 





[jira] [Created] (HBASE-16551) Cleanup SplitLogManager and CatalogManager

2016-09-01 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16551:
---

 Summary: Cleanup SplitLogManager and CatalogManager
 Key: HBASE-16551
 URL: https://issues.apache.org/jira/browse/HBASE-16551
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


A couple of cleanups around SplitLogManager and CatalogManager:
 - replace all copy-pasted casts in SplitLogManager with one call to a helper 
   method
 - remove Server, MasterServices, and Stoppable, since the class is called in 
   only one place, where it is called as master, master, master
 - reuse MockNoopMasterServices instead of creating the Server/MasterServices 
   classes





[jira] [Created] (HBASE-16550) Procedure v2 - Add AM compatibility for 2.x Master and 1.x RSs

2016-09-01 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16550:
---

 Summary: Procedure v2 - Add AM compatibility for 2.x Master and 
1.x RSs
 Key: HBASE-16550
 URL: https://issues.apache.org/jira/browse/HBASE-16550
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2, Region Assignment
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


Core AM (HBASE-14614) relies on the RS using zkless assignment. Add support 
for the old, plain non-zkless AM.





[jira] [Created] (HBASE-16549) Procedure v2 - Add new AM metrics

2016-09-01 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16549:
---

 Summary: Procedure v2 - Add new AM metrics
 Key: HBASE-16549
 URL: https://issues.apache.org/jira/browse/HBASE-16549
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2, Region Assignment
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


With the new AM we can add a bunch of metrics
 - assign/unassign time
 - server crash time
 - grouping-related metrics? (how many batches we do, and similar?)





[jira] [Created] (HBASE-16548) Procedure v2 - Add handling of split/merge region transition to the new AM

2016-09-01 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16548:
---

 Summary: Procedure v2 - Add handling of split/merge region 
transition to the new AM
 Key: HBASE-16548
 URL: https://issues.apache.org/jira/browse/HBASE-16548
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2, Region Assignment
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


Core Assignment HBASE-14614 does not handle split and merge in 
reportRegionStateTransition(). Handle the transition request!





[jira] [Resolved] (HBASE-14617) Procedure V2: Update ServerCrashProcedure to interact with assignment procedures

2016-09-01 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi resolved HBASE-14617.
-
Resolution: Invalid

Part of HBASE-14616

> Procedure V2: Update ServerCrashProcedure to interact with assignment 
> procedures
> 
>
> Key: HBASE-14617
> URL: https://issues.apache.org/jira/browse/HBASE-14617
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0
>Reporter: Stephen Yuan Jiang
>    Assignee: Matteo Bertozzi
> Fix For: 2.0.0
>
>
> this JIRA tracks the update of ServerCrashProcedure to interact with 
> assignment procedures.  This is very critical (and most tricky part of) work 
> that deals with dead region server when assignment is happening.
> - remove region server queue when the region server is dead
> - notify assignment procedures that are doing assignment operation in the 
> dead server
> - assign regions in dead server to other RS (we have to deal with RIT, deal 
> with in-progress table DDL)





[jira] [Created] (HBASE-16543) Separate Create/Modify Table operations from open/reopen regions

2016-09-01 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16543:
---

 Summary: Separate Create/Modify Table operations from open/reopen 
regions
 Key: HBASE-16543
 URL: https://issues.apache.org/jira/browse/HBASE-16543
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
 Fix For: 2.0.0


At the moment, create table and modify table operations trigger an 
open/reopen of the regions inside the DDL operation. 
We should split each operation into two parts:
 - create table, enable table regions
 - modify table, reopen table regions





[jira] [Created] (HBASE-16537) Add tests to verify create/modify table region reopen with missing coprocessor

2016-08-31 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16537:
---

 Summary: Add tests to verify create/modify table region reopen 
with missing coprocessor 
 Key: HBASE-16537
 URL: https://issues.apache.org/jira/browse/HBASE-16537
 Project: HBase
  Issue Type: Sub-task
  Components: master, proc-v2, Region Assignment
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


Add tests to cover the case where a table is created or modified and a 
nonexistent coprocessor is set. 
This should result in a clear exception to the user, reporting which 
coprocessor was not found.





[jira] [Created] (HBASE-16533) Procedure v2 - Extract chore from the executor

2016-08-30 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16533:
---

 Summary: Procedure v2 - Extract chore from the executor
 Key: HBASE-16533
 URL: https://issues.apache.org/jira/browse/HBASE-16533
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.4.0


At the moment the CompletedProcedureCleaner chore is a special case in the 
executor. Let's extract it and allow the executor to run other chores. (I want 
to use this for the AM.)





[jira] [Created] (HBASE-16522) Procedure v2 - Cache system user and avoid IOException

2016-08-29 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16522:
---

 Summary: Procedure v2 - Cache system user and avoid IOException
 Key: HBASE-16522
 URL: https://issues.apache.org/jira/browse/HBASE-16522
 Project: HBase
  Issue Type: Sub-task
  Components: master, proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0
 Attachments: HBASE-16522-v0.patch

We can cache the system user and avoid the IOException that we have to carry 
around when we create procedures
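
A minimal sketch of the caching idea, not the actual patch: the lookup that can 
fail with IOException runs once, and later callers get the cached value without 
a checked exception. SystemUserCacheSketch and UserLoader are hypothetical names 
for illustration, and a plain String stands in for the User object.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch, not HBase code: caches a value whose lookup may throw. */
final class SystemUserCacheSketch {
  /** Stand-in for the expensive lookup that can fail with IOException. */
  interface UserLoader {
    String load() throws IOException;
  }

  private final UserLoader loader;
  private volatile String cached;

  SystemUserCacheSketch(UserLoader loader) {
    this.loader = loader;
  }

  /** Loads the user on first use; later calls return the cached value. */
  String get() {
    String user = cached;
    if (user == null) {
      synchronized (this) {
        user = cached;
        if (user == null) {
          try {
            user = loader.load();
          } catch (IOException e) {
            // Surface the failure once, unchecked, instead of on every call site.
            throw new UncheckedIOException(e);
          }
          cached = user;
        }
      }
    }
    return user;
  }

  public static void main(String[] args) {
    SystemUserCacheSketch cache = new SystemUserCacheSketch(() -> "hbase");
    System.out.println(cache.get());
  }
}
```

With this shape, code that creates procedures no longer needs a throws clause 
just to obtain the owner.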





[jira] [Created] (HBASE-16519) Procedure v2 - Avoid sync wait on DDLs operation

2016-08-29 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16519:
---

 Summary: Procedure v2 - Avoid sync wait on DDLs operation
 Key: HBASE-16519
 URL: https://issues.apache.org/jira/browse/HBASE-16519
 Project: HBase
  Issue Type: Sub-task
  Components: master, proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


Some operations (ModifyColumnFamily, AddColumnFamily, DeleteColumnFamily, 
ModifyTable, TruncateTable) are still synchronous on the master side: they 
wait until the operation completes before returning. 
This was done to keep the sync behavior for old clients, but instead of using 
the procLatch, which recognizes the client version and decides whether the 
operation should be sync or not, the code always waits, making the 
client-side procedure fault tolerance ineffective.

Also, the add/delete/modifyColumnFamily operations do not seem to follow the 
Async() naming in master, and the comments claim they are async but everyone 
uses them as sync. (This is something from HBASE-13538.)
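
To illustrate the latch idea described above, here is a rough sketch under 
stated assumptions; it is not the master code. ProcLatchSketch, Latch and the 
clientCanPollForResult flag are made-up names: a client that can poll for the 
procedure result gets a no-op latch, while an old client gets a latch that 
blocks until the procedure releases it.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical sketch, not HBase code: pick a blocking or no-op latch per client. */
final class ProcLatchSketch {
  interface Latch {
    void release();
    void await() throws InterruptedException;
  }

  /** Returns a no-op latch for clients that poll, a blocking one for old clients. */
  static Latch create(boolean clientCanPollForResult) {
    if (clientCanPollForResult) {
      return new Latch() {
        @Override public void release() { }
        @Override public void await() { }
      };
    }
    final CountDownLatch latch = new CountDownLatch(1);
    return new Latch() {
      @Override public void release() { latch.countDown(); }
      @Override public void await() throws InterruptedException { latch.await(); }
    };
  }

  public static void main(String[] args) throws InterruptedException {
    Latch newClient = create(true);
    newClient.await(); // returns at once, the RPC does not wait

    Latch oldClient = create(false);
    Thread worker = new Thread(oldClient::release); // simulated procedure step
    worker.start();
    oldClient.await(); // blocks until the worker releases the latch
    worker.join();
    System.out.println("done");
  }
}
```

The point of the design is that the always-wait behavior becomes a per-client 
decision made in one place.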





[jira] [Created] (HBASE-16508) Move UnexpectedStateException to common

2016-08-26 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16508:
---

 Summary: Move UnexpectedStateException to common
 Key: HBASE-16508
 URL: https://issues.apache.org/jira/browse/HBASE-16508
 Project: HBase
  Issue Type: Improvement
Affects Versions: 2.0.0, 1.4.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0, 1.4.0


UnexpectedStateException seems useful enough (or at least I want to use it) 
to be moved to common. At the moment it is used only by the Memstore classes, 
and it lives in the regionserver package.





[jira] [Created] (HBASE-16507) Procedure v2 - Force DDL operation to always roll forward

2016-08-26 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16507:
---

 Summary: Procedure v2 - Force DDL operation to always roll forward
 Key: HBASE-16507
 URL: https://issues.apache.org/jira/browse/HBASE-16507
 Project: HBase
  Issue Type: Sub-task
  Components: master, proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0


Having rollback for DDLs was a bad idea, 
and it turns out to be unexpected behavior for the user. 

DDLs only have transient errors (e.g. zk, hdfs, meta down).
If we abort/rollback on a transient failure, the user gets a failure,
and it is not clear why the user should have to retry the command when the 
system can do that itself.
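
A minimal sketch of "let the system retry" under stated assumptions; it is not 
the procedure framework. RollForwardSketch is a hypothetical helper, IOException 
stands in for any transient failure, and the attempt count is bounded only to 
keep the example finite (a real roll-forward would keep retrying).

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical sketch, not HBase code: retry a step instead of rolling back. */
final class RollForwardSketch {
  /** Runs the step, retrying with a growing pause whenever it throws IOException. */
  static <T> T runUntilSuccess(Callable<T> step, int maxAttempts, long backoffMillis)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return step.call();
      } catch (IOException e) {
        // Treated as transient (e.g. zk, hdfs, meta down): wait, then try again.
        last = e;
        Thread.sleep(backoffMillis * attempt);
      }
    }
    throw last;
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    String result = runUntilSuccess(() -> {
      if (++calls[0] < 3) {
        throw new IOException("meta down (simulated)");
      }
      return "table created";
    }, 5, 1L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```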






[jira] [Created] (HBASE-16487) Remove Class.fromName("..PrefixTreeCodec") from TableMapReduceUtil addHBaseDependencyJars

2016-08-23 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16487:
---

 Summary: Remove Class.fromName("..PrefixTreeCodec") from 
TableMapReduceUtil addHBaseDependencyJars
 Key: HBASE-16487
 URL: https://issues.apache.org/jira/browse/HBASE-16487
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 1.2.2, 2.0.0, 1.3.0, 1.4.0
Reporter: Matteo Bertozzi


HBASE-15152 added the prefix tree module as a dependency of 
TableMapReduceUtil, but the hardcoded string of the class name was wrong. 
HBASE-16360 fixed the hardcoded string. 

However, looking at the comment above the call, I can't figure out where the 
circular dependency is.
{code}
// PrefixTreeCodec is part of the hbase-prefix-tree module. If not included in MR jobs jar
// dependencies, MR jobs that write encoded hfiles will fail.
// We used reflection here so to prevent a circular module dependency.
// TODO - if we extract the MR into a module, make it depend on hbase-prefix-tree
{code}
From the pom.xml of the prefix-tree module I don't see hbase-server, but I can 
see the prefix-tree module in hbase-server/pom.xml. TableMapReduceUtil is in 
hbase-server, so in theory we don't have any circular dependency.
We can probably just drop the whole try/catch block with the Class.forName() 
and simply use org.apache.hadoop.hbase.codec.prefixtree.PrefixTreeCodec as we 
do for the others. 

(Or at least we should end up with a test that covers that Class.forName(), in 
case we rename PrefixTreeCodec or the namespace in the future and forget to 
update this reference.)
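
A rough sketch of what such a check could look like; it is not an existing 
HBase test. ClassNameCheckSketch and isLoadable are hypothetical names, and the 
corrected class name only resolves when hbase-prefix-tree is on the classpath.

```java
/** Hypothetical sketch, not HBase code: verify a hardcoded class name resolves. */
final class ClassNameCheckSketch {
  /** Returns true if the named class can be found on the current classpath. */
  static boolean isLoadable(String className) {
    try {
      // initialize=false: only resolve the name, do not run static initializers.
      Class.forName(className, false, ClassNameCheckSketch.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // The misspelled ".code." package reported in HBASE-16360.
    System.out.println(isLoadable("org.apache.hadoop.hbase.code.prefixtree.PrefixTreeCodec"));
    // The corrected name; the result depends on the classpath this runs with.
    System.out.println(isLoadable("org.apache.hadoop.hbase.codec.prefixtree.PrefixTreeCodec"));
  }
}
```

A unit test in the module that owns the string could assert isLoadable() on the 
exact constant, so a rename fails the build instead of failing MR jobs later.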





[jira] [Created] (HBASE-16485) Procedure v2 - Add support to addChildProcedure() as last "step" in StateMachineProcedure

2016-08-23 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16485:
---

 Summary: Procedure v2 - Add support to addChildProcedure() as last 
"step" in StateMachineProcedure
 Key: HBASE-16485
 URL: https://issues.apache.org/jira/browse/HBASE-16485
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.3.0, 1.4.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.3.0, 1.4.0
 Attachments: HBASE-16485-v0.patch

HBASE-15371 added support for adding children to the StateMachineProcedure, 
but there is one limitation: a child cannot be added in the last "step" 
of the execution. The current code silently ignores the added child.





[jira] [Resolved] (HBASE-16472) TableNotDisabledException

2016-08-23 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi resolved HBASE-16472.
-
Resolution: Not A Problem

> TableNotDisabledException
> -
>
> Key: HBASE-16472
> URL: https://issues.apache.org/jira/browse/HBASE-16472
> Project: HBase
>  Issue Type: Bug
>Reporter: Dhruv Singhal
>
> When I created a table in HBase and then tried running create statement in 
> phoenix. Phoenix returned me the following error. 
> procedure.ModifyTableProcedure: Error trying to modify table=t21sample 
> state=MODIFY_TABLE_PREPARE
> org.apache.hadoop.hbase.TableNotDisabledException: t21sample
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.prepareModify(ModifyTableProcedure.java:298)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.executeFromState(ModifyTableProcedure.java:98)
> at 
> org.apache.hadoop.hbase.master.procedure.ModifyTableProcedure.executeFromState(ModifyTableProcedure.java:54)
> at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:107)
> at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:400)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:869)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:673)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:626)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:70)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.run(ProcedureExecutor.java:413)
> The same error occurred when altering an existing table. 
> I somehow managed to create the table in Phoenix with the same sequence if I 
> disabled the table after creating it: Phoenix returned a table-disabled 
> error, but after enabling the table in HBase, the table was created in 
> Phoenix. The same worked for the alter statement as well.





[jira] [Created] (HBASE-16452) Procedure v2 - Make ProcedureWALPrettyPrinter extend Tool

2016-08-18 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16452:
---

 Summary: Procedure v2 - Make ProcedureWALPrettyPrinter extend Tool
 Key: HBASE-16452
 URL: https://issues.apache.org/jira/browse/HBASE-16452
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0


Make the ProcedureWALPrettyPrinter tool extend Tool, and fix a couple of 
prints that are missing a newline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16451) Procedure v2 - Test WAL protobuf entry size limit

2016-08-18 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16451:
---

 Summary: Procedure v2 - Test WAL protobuf entry size limit
 Key: HBASE-16451
 URL: https://issues.apache.org/jira/browse/HBASE-16451
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0


Add a test to make sure that we are able to read/write procedures with a big 
"data" size. 





[jira] [Resolved] (HBASE-16427) After HBASE-13701, hbase standalone mode start failed due to mkdir failed

2016-08-16 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi resolved HBASE-16427.
-
Resolution: Duplicate

This is the same as HBASE-16292 and the fix will be in HBASE-16257

> After HBASE-13701,  hbase standalone mode start failed due to mkdir failed
> --
>
> Key: HBASE-16427
> URL: https://issues.apache.org/jira/browse/HBASE-16427
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>
> {code}
> 2016-08-17 10:26:43,305 ERROR [main] regionserver.SecureBulkLoadManager: 
> Failed to create or set permission on staging directory 
> /user/chenheng/hbase-staging
> ExitCodeException exitCode=1: chmod: /user/chenheng/hbase-staging: No such 
> file or directory
>   at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
>   at org.apache.hadoop.util.Shell.run(Shell.java:456)
>   at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
>   at org.apache.hadoop.util.Shell.execCommand(Shell.java:815)
>   at org.apache.hadoop.util.Shell.execCommand(Shell.java:798)
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:728)
>   at 
> org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:502)
>   at 
> org.apache.hadoop.hbase.regionserver.SecureBulkLoadManager.start(SecureBulkLoadManager.java:124)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:626)
>   at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:406)
>   at 
> org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.<init>(HMasterCommandLine.java:307)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>   at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:140)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:221)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:156)
>   at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:226)
>   at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)
>   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2421)
> 2016-08-17 10:26:43,306 ERROR [main] master.HMasterCommandLine: Master exiting
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2016-08-10 Thread Matteo Bertozzi
There are a bunch of builds that have most of the tests failing.

Example:
https://builds.apache.org/job/HBase-Trunk_matrix/1392/jdk=JDK%201.7%20(latest),label=yahoo-not-h2/testReport/junit/org.apache.hadoop.hbase/TestLocalHBaseCluster/testLocalHBaseCluster/

From the stack trace, it looks like the problem is the JDK name, which has
spaces:
the hadoop FsVolumeImpl calls setNameFormat(... + fileName.toString() + ...)
and the file name does not seem to be escaped,
so we end up with JDK%25201.7%2520(latest) in the format string and we get
an IllegalFormatPrecisionException: 7

2016-08-10 22:07:46,108 WARN  [DataNode:
[[[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data1/,
[DISK]file:/home/jenkins/jenkins-slave/workspace/HBase-Trunk_matrix/jdk/JDK%25201.7%2520(latest)/label/yahoo-not-h2/hbase-server/target/test-data/e7099624-ecfa-4674-87de-a8733d13b582/dfscluster_10fdcfc3-cd1b-45be-9b5a-9c88f385e6f1/dfs/data/data2/]]
 heartbeating to localhost/127.0.0.1:34629]
datanode.BPServiceActor(831): Unexpected exception in block pool Block
pool  (Datanode Uuid unassigned) service to
localhost/127.0.0.1:34629
java.util.IllegalFormatPrecisionException: 7
at java.util.Formatter$FormatSpecifier.checkText(Formatter.java:2984)
at java.util.Formatter$FormatSpecifier.<init>(Formatter.java:2688)
at java.util.Formatter.parse(Formatter.java:2528)
at java.util.Formatter.format(Formatter.java:2469)
at java.util.Formatter.format(Formatter.java:2423)
at java.lang.String.format(String.java:2792)
at 
com.google.common.util.concurrent.ThreadFactoryBuilder.setNameFormat(ThreadFactoryBuilder.java:68)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.initializeCacheExecutor(FsVolumeImpl.java:140)
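
The failure mode can be reproduced outside Hadoop with a small sketch. This is 
not the FsVolumeImpl code: FormatEscapeSketch, escapeForFormat and the "Worker-" 
prefix are made up for illustration. The idea is that any literal text spliced 
into a format string needs its '%' characters doubled first.

```java
import java.util.IllegalFormatException;

/** Hypothetical sketch, not Hadoop code: escape literal text used in a format string. */
final class FormatEscapeSketch {
  /** Doubles every '%' so String.format treats the text literally. */
  static String escapeForFormat(String literal) {
    return literal.replace("%", "%%");
  }

  public static void main(String[] args) {
    String dir = "jdk/JDK%25201.7%2520(latest)/label";
    try {
      // The '%' characters in the path are parsed as format specifiers.
      String.format("Worker-" + dir + "-%d", 1);
    } catch (IllegalFormatException e) {
      System.out.println("unescaped: " + e.getClass().getSimpleName());
    }
    // With the path escaped, only the trailing %d is a real specifier.
    System.out.println(String.format("Worker-" + escapeForFormat(dir) + "-%d", 1));
  }
}
```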



Matteo


On Tue, Aug 9, 2016 at 9:55 AM, Stack  wrote:

> Good on you Sean.
> S
>
> On Mon, Aug 8, 2016 at 9:43 PM, Sean Busbey  wrote:
>
> > I updated all of our jobs to use the updated JDK versions from infra.
> > These have spaces in the names, and those names end up in our
> > workspace path, so try to keep an eye out.
> >
> >
> >
> > On Mon, Aug 8, 2016 at 10:42 AM, Sean Busbey 
> wrote:
> > > running in docker is the default now. relying on the default docker
> > > image that comes with Yetus means that our protoc checks are
> > > failing[1].
> > >
> > >
> > > [1]: https://issues.apache.org/jira/browse/HBASE-16373
> > >
> > > On Sat, Aug 6, 2016 at 5:03 PM, Sean Busbey  wrote:
> > >> Hi folks!
> > >>
> > >> this morning I merged the patch that updates us to Yetus 0.3.0[1] and
> > updated the precommit job appropriately. I also changed it to use one of
> > the Java versions post the puppet changes to asf build.
> > >>
> > >> The last three builds look normal (#2975 - #2977). I'm gonna try
> > running things in docker next. I'll email again when I make it the
> default.
> > >>
> > >> [1]: https://issues.apache.org/jira/browse/HBASE-15882
> > >>
> > >> On 2016-06-16 10:43 (-0500), Sean Busbey  wrote:
> > >>> FYI, today our precommit jobs started failing because our chosen jdk
> > >>> (1.7.0.79) disappeared (mentioned on HBASE-16032).
> > >>>
> > >>> Initially we were doing something wrong, namely directly referencing
> > >>> the jenkins build tools area without telling jenkins to give us an
> env
> > >>> variable that stated where the jdk is located. However, after
> > >>> attempting to switch to the appropriate tooling variable for jdk
> > >>> 1.7.0.79, I found that it didn't point to a place that worked.
> > >>>
> > >>> I've now updated the job to rely on the latest 1.7 jdk, which is
> > >>> currently 1.7.0.80. I don't know how often "latest" updates.
> > >>>
> > >>> Personally, I think this is a sign that we need to prioritize
> > >>> HBASE-15882 so that we can switch back to using Docker. I won't have
> > >>> time this week, so if anyone else does please pick up the ticket.
> > >>>
> > >>> On Thu, Mar 17, 2016 at 5:19 PM, Stack  wrote:
> > >>> > Thanks Sean.
> > >>> > St.Ack
> > >>> >
> > >>> > On Wed, Mar 16, 2016 at 12:04 PM, Sean Busbey  >
> > wrote:
> > >>> >
> > >>> >> FYI, I updated the precommit job today to specify that only
> compile
> > time
> > >>> >> checks should be done against jdks other than the primary jdk7
> > instance.
> > >>> >>
> > >>> >> On Mon, Mar 7, 2016 at 8:43 PM, Sean Busbey 
> > wrote:
> > >>> >>
> > >>> >> > I tested things out, and while YETUS-297[1] is present the
> > default runs
> > >>> >> > all plugins that can do multiple jdks against those available
> > (jdk7 and
> > >>> >> > jdk8 in our case).
> > >>> >> >
> > >>> >> > We can configure things to only do a single run of unit tests.
> > They'll be

[jira] [Created] (HBASE-16378) Procedure v2 - Make ProcedureException extend HBaseException

2016-08-08 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16378:
---

 Summary: Procedure v2 - Make ProcedureException extend 
HBaseException
 Key: HBASE-16378
 URL: https://issues.apache.org/jira/browse/HBASE-16378
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 1.2.2, 1.1.5, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0, 1.3.0, 1.1.6, 1.2.3


Make ProcedureException extend HBaseException, so we can avoid stuff like 
HBASE-15591 and avoid try/catch of ProcedureException followed by a direct 
rethrow.





[jira] [Created] (HBASE-16360) TableMapReduceUtil addHBaseDependencyJars has the wrong class name for PrefixTreeCodec

2016-08-04 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16360:
---

 Summary: TableMapReduceUtil addHBaseDependencyJars has the wrong 
class name for PrefixTreeCodec
 Key: HBASE-16360
 URL: https://issues.apache.org/jira/browse/HBASE-16360
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 1.2.2, 1.0.3, 2.0.0, 1.3.0, 1.4.0
Reporter: Matteo Bertozzi


HBASE-15152 added the prefix tree module as a dependency of 
TableMapReduceUtil, but the hardcoded string of the class name is wrong. 

{code}
Class.forName("org.apache.hadoop.hbase.code.prefixtree.PrefixTreeCodec");
{code}

should be ".codec." instead of ".code."
{code}
Class.forName("org.apache.hadoop.hbase.codec.prefixtree.PrefixTreeCodec");
{code}





[jira] [Created] (HBASE-16294) hbck reporting "No HDFS region dir found" found replicas

2016-07-27 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16294:
---

 Summary: hbck reporting "No HDFS region dir found" found replicas
 Key: HBASE-16294
 URL: https://issues.apache.org/jira/browse/HBASE-16294
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 1.2.2, 1.1.5, 1.0.3, 2.0.0, 1.3.0, 1.4.0
Reporter: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 1.2.3


Simple test: create a table with replicas and then run hbck. 
We don't filter out the replicas in loadHdfsRegioninfo().
{noformat}
$ hbase shell
hbase(main):001:0> create 'myTable', 'myCF', {REGION_REPLICATION => '3'}

$ hbase hbck
2016-07-27 13:47:38,090 WARN  [hbasefsck-pool1-t2] util.HBaseFsck: No HDFS 
region dir found: { meta => 
myTable,,1469652448440_0002.9dea3506e09e00910158dc91fa21e550., hdfs => null, 
deployed => 
u1604srv,42895,1469652420413;myTable,,1469652448440_0002.9dea3506e09e00910158dc91fa21e550.,
 replicaId => 2 } meta={ENCODED => 9dea3506e09e00910158dc91fa21e550, NAME => 
'myTable,,1469652448440_0002.9dea3506e09e00910158dc91fa21e550.', STARTKEY => 
'', ENDKEY => '', REPLICA_ID => 2}
2016-07-27 13:47:38,092 WARN  [hbasefsck-pool1-t1] util.HBaseFsck: No HDFS 
region dir found: { meta => 
myTable,,1469652448440_0001.a03250bca30781ff7002a91c281b4e92., hdfs => null, 
deployed => 
u1604srv,42895,1469652420413;myTable,,1469652448440_0001.a03250bca30781ff7002a91c281b4e92.,
 replicaId => 1 } meta={ENCODED => a03250bca30781ff7002a91c281b4e92, NAME => 
'myTable,,1469652448440_0001.a03250bca30781ff7002a91c281b4e92.', STARTKEY => 
'', ENDKEY => '', REPLICA_ID => 1}
{noformat}
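
A minimal sketch of the missing filter, not the HBaseFsck code. It assumes that 
replica id 0 marks the primary region and that only the primary has its own 
region dir on HDFS; ReplicaFilterSketch, Region and the "primary" entry are 
hypothetical, while the two encoded names are taken from the log above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not HBase code: skip replicas before checking HDFS dirs. */
final class ReplicaFilterSketch {
  /** Assumption: replica id 0 marks the primary region. */
  static final int PRIMARY_REPLICA_ID = 0;

  /** Minimal stand-in for a region entry read from meta. */
  static final class Region {
    final String encodedName;
    final int replicaId;

    Region(String encodedName, int replicaId) {
      this.encodedName = encodedName;
      this.replicaId = replicaId;
    }
  }

  /** Keeps only the regions that are expected to own a directory on HDFS. */
  static List<Region> withHdfsDir(List<Region> regions) {
    List<Region> result = new ArrayList<>();
    for (Region r : regions) {
      if (r.replicaId == PRIMARY_REPLICA_ID) {
        result.add(r);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Region> fromMeta = new ArrayList<>();
    fromMeta.add(new Region("primary", 0)); // placeholder name for the primary
    fromMeta.add(new Region("a03250bca30781ff7002a91c281b4e92", 1));
    fromMeta.add(new Region("9dea3506e09e00910158dc91fa21e550", 2));
    System.out.println(withHdfsDir(fromMeta).size());
  }
}
```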





Re: [DISCUSSION] Merge Backup / Restore - Branch HBASE-7912

2016-07-19 Thread Matteo Bertozzi
I did some review at the very beginning, but then lost track of the
changes.
But I'd like to give the full code a quick review once people here are
ok with getting this feature into master (2.0).
(Let's say we put a deadline on reviews, like 1 week for reviewing the full
thing after everyone agrees to get this in, just to avoid holding this up for
too long while still leaving enough time for interested people to look
at it. We did the same thing for MOB with a mega patch:
https://reviews.apache.org/r/36391/)

most of the code seemed isolated from the beginning, few changes here and
there in the core.
so, this side of things seems ok to me.

maybe some work to add IT tests as mentioned above, but that should not
take long.

I don't know if there are already docs, but that is another thing we may
want to get in with the merge.
a minimal coverage at least on how to use the feature, and maybe calling it
out as experimental?

My main concerns were around incremental backups.
I'm still not convinced about the fact that, because the WALs contain
regions of multiple tables,
the incremental backup will keep around WALs with some data that we don't
really want in the backup (for space or maybe security reasons).

Then there was the question of how long I should keep taking incrementals
before deciding that a fresh full backup is less costly in terms of space.
But I think this incremental merge/compaction was a feature on the roadmap
as Phase 3,
which I think is ok to get later on;
maybe just call out a lifecycle example in the docs under "best practices".


Has anyone interested in using backups looked at the doc in HBASE-7912?
Is the current design of incremental backup acceptable to everyone wanting
to use this feature?
(Maybe this should be a question for the @user list and not dev.)

Is anyone already using this feature, or are just devs testing it?
To me it would be interesting to have a use-case/workflow example,
to see whether my concerns about incremental backups show up in the real
world.

On Tue, Jul 19, 2016 at 1:35 PM, Ted Yu  wrote:

> Gentle ping on this subject.
>
> The changes are mostly non-intrusive.
>
> More comments are welcome.
>
> On Mon, Jul 11, 2016 at 9:29 PM, Vladimir Rodionov  >
> wrote:
>
> > Not that hard, Andrew. I will open JIRA.
> >
> > -Vlad
> >
> > On Mon, Jul 11, 2016 at 8:46 PM, Andrew Purtell <
> andrew.purt...@gmail.com>
> > wrote:
> >
> > > How hard would it be to convert what you've been using to test end to
> end
> > > during dev into an IT?
> > >
> > >
> > > On Jul 11, 2016, at 5:31 PM, Vladimir Rodionov  >
> > > wrote:
> > >
> > > >>> Is there an integration test in hbase-it yet? If not, any tips on a
> > > >>> semi-automateable way to take backups and restore them?
> > > >
> > > > We do not have yet, but we have a lot of unit tests. We provide 2 API
> > for
> > > > backup:
> > > >
> > > > 1. Admin.getBackupAdmin
> > > >
> > > > 2. Command - line via hbase command.
> > > >
> > > > Everything is straightforward.
> > > >
> > > > -Vlad
> > > >
> > > >
> > > >
> > > >
> > > >> On Mon, Jul 11, 2016 at 5:23 PM, Dima Spivak 
> > > wrote:
> > > >>
> > > >> Is there an integration test in hbase-it yet? If not, any tips on a
> > > >> semi-automateable way to take backups and restore them?
> > > >>
> > > >> -Dima
> > > >>
> > > >> On Mon, Jul 11, 2016 at 6:42 PM, Vladimir Rodionov <
> > > vladrodio...@gmail.com
> > > >> wrote:
> > > >>
> > > >>> Sorry, wrong links:
> > > >>> These are the phases:
> > > >>>
> > > > >>> Phase 1:
> > > > >>> https://issues.apache.org/jira/browse/HBASE-14030
> > > > >>> Phase 2:
> > > > >>> https://issues.apache.org/jira/browse/HBASE-14123
> > > > >>> Phase 3:
> > > > >>> https://issues.apache.org/jira/browse/HBASE-14414
> > > >>>
> > > >>> -Vlad
> > > >>>
> > > >>> On Mon, Jul 11, 2016 at 4:41 PM, Vladimir Rodionov <
> > > >> vladrodio...@gmail.com
> > > >>> wrote:
> > > >>>
> > >  These are the phases:
> > > 
> > >  Phase 1:
> > >  https://issues.apache.org/jira/browse/HBASE-14030
> > >  Phase 2:
> > >  https://issues.apache.org/jira/browse/HBASE-14123
> > >  Phase 3:
> > >  https://issues.apache.org/jira/browse/HBASE-14414
> > > 
> > >  -Vlad
> > > 
> > > 
> > >  On Mon, Jul 11, 2016 at 12:21 PM, Enis Söztutar 
> > > >> wrote:
> > > 
> > > > As you guys may already be familiar, Vladimir, Ted, Jerry and
> > others
> > > >>> have
> > > > been developing the backup / restore functionality in a series of
> > > >> issues
> > > > 

[jira] [Created] (HBASE-16219) Move meta bootstrap out of HMaster

2016-07-12 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16219:
---

 Summary: Move meta bootstrap out of HMaster
 Key: HBASE-16219
 URL: https://issues.apache.org/jira/browse/HBASE-16219
 Project: HBase
  Issue Type: Sub-task
  Components: master, Region Assignment
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


another cleanup to have a smaller integration patch for the new AM.

Trying to isolate the Assignment code from the HMaster.
Move all the bootstrap code to split meta logs and assign meta regions from 
HMaster to a MasterMetaBootstrap class to also reduce the long 
finishActiveMasterInitialization() method





[jira] [Created] (HBASE-16207) can't restore snapshot without "Admin" permission

2016-07-11 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16207:
---

 Summary: can't restore snapshot without "Admin" permission
 Key: HBASE-16207
 URL: https://issues.apache.org/jira/browse/HBASE-16207
 Project: HBase
  Issue Type: Bug
  Components: master, snapshots
Affects Versions: 1.1.5, 1.2.1, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 2.0.0, 1.2.2, 1.1.6, 1.3.1
 Attachments: HBASE-16207-v0.patch, HBASE-16207-v0_branch-1.patch

MasterRpcServices.restoreSnapshot() tries to verify if the NS exists before 
starting the restore, but instead of calling ensureNamespaceExists() it calls 
master.getNamespace() which requires ADMIN permission to get the NS descriptor. 
{code}
public RestoreSnapshotResponse restoreSnapshot(RpcController controller,
...
  // Ensure namespace exists. Will throw exception if non-known NS.
  master.getNamespace(dstTable.getNamespaceAsString());
{code}

unfortunately I'm not aware of any unit test that covers this kind of 
situation. we cover single ACLs in TestAccessController, but we don't 
exercise the rpc calls and verify whether more than one ACL check is involved, 
as in this case
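
To make the double check concrete, here is a minimal sketch of the situation. It is not HBase code: RestoreAclSketch, its Permission enum and both restore methods are hypothetical stand-ins, and only the names getNamespace() and ensureNamespaceExists() are borrowed from the description above.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the situation; none of these types are HBase classes.
class RestoreAclSketch {
  enum Permission { BASIC, ADMIN }

  static class AccessDeniedException extends RuntimeException {
    AccessDeniedException(String msg) { super(msg); }
  }

  private final Set<String> namespaces = new HashSet<>();

  RestoreAclSketch(String... existingNamespaces) {
    for (String ns : existingNamespaces) {
      namespaces.add(ns);
    }
  }

  // Internal existence check: throws for an unknown namespace, no ACL involved.
  void ensureNamespaceExists(String ns) {
    if (!namespaces.contains(ns)) {
      throw new IllegalArgumentException("unknown namespace: " + ns);
    }
  }

  // Descriptor lookup: also verifies existence, but only after an ADMIN check.
  String getNamespace(Set<Permission> caller, String ns) {
    if (!caller.contains(Permission.ADMIN)) {
      throw new AccessDeniedException("ADMIN required to read namespace descriptor");
    }
    ensureNamespaceExists(ns);
    return ns;
  }

  // Shape of the reported bug: the existence check drags in a second ACL check.
  void restoreWithDescriptorLookup(Set<Permission> caller, String ns) {
    getNamespace(caller, ns);
  }

  // Shape of the suggested fix: only verify that the namespace exists.
  void restoreWithExistenceCheck(Set<Permission> caller, String ns) {
    ensureNamespaceExists(ns);
  }
}
```

A test that drives both paths with a caller lacking ADMIN would catch this class of problem: the first path is rejected, the second goes through.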





[jira] [Created] (HBASE-16121) Require only MasterServices to the ServerManager constructor

2016-06-27 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16121:
---

 Summary: Require only MasterServices to the ServerManager 
constructor
 Key: HBASE-16121
 URL: https://issues.apache.org/jira/browse/HBASE-16121
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0
 Attachments: HBASE-16121-v0.patch

currently we pass both Server and MasterServices to the ServerManager. 
MasterServices is a Server, and in tests where we try to pass only one of the 
two we end up with NPEs if the code changes.
remove the Server arg and just pass the MasterServices





[jira] [Created] (HBASE-16119) Procedure v2 - Reimplement merge

2016-06-26 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16119:
---

 Summary: Procedure v2 - Reimplement merge
 Key: HBASE-16119
 URL: https://issues.apache.org/jira/browse/HBASE-16119
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2, Region Assignment
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi


use the proc-v2 state machine for merge. also update the logic to have a single 
meta-writer.





[jira] [Reopened] (HBASE-16092) Procedure v2 - complete child procedure support

2016-06-24 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi reopened HBASE-16092:
-

> Procedure v2 - complete child procedure support
> ---
>
> Key: HBASE-16092
> URL: https://issues.apache.org/jira/browse/HBASE-16092
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0, 1.4.0
>    Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-16092-v0.patch
>
>
> There was a missing part in the child procedure tracking.
> child procedures were never deleted from the wal on parent completion,
> leading to failures on startup. (no code uses child procedures yet)





[jira] [Created] (HBASE-16103) Procedure v2 - TestCloneSnaphotProcedure relies on execution order

2016-06-24 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16103:
---

 Summary: Procedure v2 - TestCloneSnaphotProcedure relies on 
execution order
 Key: HBASE-16103
 URL: https://issues.apache.org/jira/browse/HBASE-16103
 Project: HBase
  Issue Type: Bug
  Components: proc-v2, test
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0


https://builds.apache.org/view/All/job/HBase-Trunk_matrix/jdk=latest1.8,label=yahoo-not-h2/1100/
 (executor died in this one)





[jira] [Created] (HBASE-16092) Procedure v2 - complete child procedure support

2016-06-23 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16092:
---

 Summary: Procedure v2 - complete child procedure support
 Key: HBASE-16092
 URL: https://issues.apache.org/jira/browse/HBASE-16092
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 2.0.0, 1.4.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.4.0


There was a missing part in the child procedure tracking.
child procedures were never deleted from the wal on parent completion,
leading to failures on startup. (no code uses child procedures yet)





[jira] [Created] (HBASE-16082) Procedure v2 - Move out helpers from MasterProcedureScheduler

2016-06-22 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16082:
---

 Summary: Procedure v2 - Move out helpers from 
MasterProcedureScheduler
 Key: HBASE-16082
 URL: https://issues.apache.org/jira/browse/HBASE-16082
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 1.2.1, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0, 1.2.2, 1.3.1
 Attachments: HBASE-16068-v0.patch

Move out the helper classes from MasterProcedureScheduler. I plan to use them 
in other places. 





Re: [DISCUSS] HBase-2.0 SHOULD be rolling upgradable and wire-compatible with 1.x

2016-06-21 Thread Matteo Bertozzi
I think everyone wants rolling upgrade. the discussion should probably be
around how much compatibility code we want to keep around.

using HBASE-16060 as an example, we need to decide how rolling-upgradable we
are, and from which versions.
I'm not too convinced that we should have extra code in master to "simulate
the old states";
I'd rather have cleaner code in 2.0 and force the users to move to one of
the latest 1.x.y first.
there are not many changes in the 1.x releases, so we should be able to say:
if you are on 1.1 move to the latest 1.1.x, if you are on 1.2 move to the
latest 1.2.x and so on.

also there are some operations that may not be needed during rolling
upgrades,
and we can cut compatibility there to have some code removed.
an example here is HBASE-15521, where we are no longer able to clone/restore
snapshots during a 1.x -> 2.x rolling upgrade until both masters are on
2.x. for some future change this may be extended to: you can't perform some
operations until all the machines are on 2.x.

I think we should aim for something like:
 - data path: HTable put/get/scan/... must work during a rolling upgrade
 - replication: must? work during a rolling upgrade
 - admin: some operations may not work during a rolling upgrade
 - upgrade to the latest 1.x.y before the 2.x upgrade (we can add to the 2.x
master and rs the ability to check the client version)
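
To illustrate the last point, a version gate could look roughly like the sketch below. Everything here is an assumption for illustration: VersionGateSketch, its methods and the plain "x.y.z" format are hypothetical, and the minimum versions passed in are made up rather than a proposal for the real cut-off.

```java
// Hypothetical helper; not an existing HBase class.
class VersionGateSketch {
  // Parses "major.minor.patch" into three ints. Suffixes such as "-SNAPSHOT"
  // are deliberately not handled in this sketch.
  static int[] parse(String version) {
    String[] parts = version.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("expected x.y.z but got: " + version);
    }
    int[] out = new int[3];
    for (int i = 0; i < 3; i++) {
      out[i] = Integer.parseInt(parts[i]);
    }
    return out;
  }

  // Numeric comparison, component by component (so 1.10.0 > 1.9.0).
  static int compare(String a, String b) {
    int[] x = parse(a);
    int[] y = parse(b);
    for (int i = 0; i < 3; i++) {
      if (x[i] != y[i]) {
        return Integer.compare(x[i], y[i]);
      }
    }
    return 0;
  }

  // True when the peer is at least at the required minimum version.
  static boolean isAllowed(String peerVersion, String minimumVersion) {
    return compare(peerVersion, minimumVersion) >= 0;
  }
}
```

With something like this, a 2.x master or rs could reject (or just warn about) a peer for which isAllowed(peerVersion, minimum) is false, where the minimum would be the "latest 1.x.y" of the peer's line.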


Matteo


On Tue, Jun 21, 2016 at 12:05 AM, Dima Spivak  wrote:

> If there’s no technical limitation, we should definitely do it. As you
> note, customers running in production hate when they have to shut down
> clusters and with some of the testing infrastructure being rolled out, this
> is definitely something we can set up automated testing for. +1
>
> -Dima
>
> On Mon, Jun 20, 2016 at 2:58 PM, Enis Söztutar  wrote:
>
> > Time to formalize 2.0 rolling upgrade scenario?
> >
> > 0.94 -> 0.96 singularity was a real pain for operators and for our users.
> > If possible we should not have the users suffer through the same thing
> > unless there is a very compelling reason. For the current stuff in
> master,
> > there is nothing that will prevent us to not have rolling upgrade support
> > for 2.0. So I say, we should decide on the rolling upgrade requirement
> now,
> > and start to evaluate incoming patches accordingly. Otherwise, we risk
> the
> > option to go deeper down the hole.
> >
> > What do you guys think. Previous threads [1] and [2] seems to be in
> favor.
> > Should we vote?
> >
> > Ref:
> > [1]
> >
> >
> http://search-hadoop.com/m/YGbbsd4An1aso5E1=HBase+1+x+to+2+0+upgrade+goals+
> >
> > [2]
> >
> >
> http://search-hadoop.com/m/YGbb1CBXTL8BTI=thinking+about+supporting+upgrades+to+HBase+1+x+and+2+x
> >
>


[jira] [Created] (HBASE-16068) Procedure v2 - use consts for conf properties in tests

2016-06-20 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16068:
---

 Summary: Procedure v2 - use consts for conf properties in tests 
 Key: HBASE-16068
 URL: https://issues.apache.org/jira/browse/HBASE-16068
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2, test
Affects Versions: 1.1.5, 1.2.1, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6


replace the hardcoded properties string conf.set("foo.key", v) in the tests 
with the use of the configuration property constants that we already have
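
A small sketch of why the constants help. The Conf class, the key name and the default below are all hypothetical, chosen only to show the failure mode of a hardcoded string that drifts from the key the code actually reads.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example; the key and the Conf class are not HBase's.
class ConfKeySketch {
  // Defined once, next to the code that reads it.
  static final String SYNC_WAIT_MSEC_CONF_KEY = "sketch.procedure.store.sync.wait.msec";
  static final int DEFAULT_SYNC_WAIT_MSEC = 100;

  // Minimal stand-in for a configuration object.
  static final class Conf {
    private final Map<String, String> values = new HashMap<>();

    void setInt(String key, int value) {
      values.put(key, Integer.toString(value));
    }

    int getInt(String key, int defaultValue) {
      String raw = values.get(key);
      return raw == null ? defaultValue : Integer.parseInt(raw);
    }
  }

  // Production-side reader: always goes through the constant.
  static int syncWaitMsec(Conf conf) {
    return conf.getInt(SYNC_WAIT_MSEC_CONF_KEY, DEFAULT_SYNC_WAIT_MSEC);
  }
}
```

A test that sets a misspelled string literal silently keeps the default, while a test that sets ConfKeySketch.SYNC_WAIT_MSEC_CONF_KEY cannot drift from the reader and fails at compile time if the constant is renamed.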





[jira] [Created] (HBASE-16056) Procedure v2 - fix master crash for FileNotFound

2016-06-17 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16056:
---

 Summary: Procedure v2 - fix master crash for FileNotFound
 Key: HBASE-16056
 URL: https://issues.apache.org/jira/browse/HBASE-16056
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 1.1.5, 1.2.1, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.3.0, 1.2.2, 1.1.6


[~syuanjiang] and [~tedyu] reported a backup master not able to start with 
FileNotFound during proc-v2 lease recovery. (another restart should have solved 
the problem)
{noformat}
FileNotFoundException: File does not exist: /hbase/MasterProcWALs/state-01.log
namenode.INodeFile.valueOf(INodeFile.java:61)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2877)
 at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:753)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:671)
{noformat}
this may happen when one master is still active (e.g. after a GC pause) and 
tries to remove files while the other master tries to become active. This 
operation is retryable, so the code should be able to handle that.
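
A minimal sketch of what "handle that" could mean: retry the operation when the file disappears underneath it, instead of letting the FileNotFoundException abort the master. RetryOnMissingFileSketch and IoAction are hypothetical names, and this is only one possible shape (the real code may prefer to re-list the log directory between attempts).

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical helper; not part of the procedure store.
class RetryOnMissingFileSketch {
  interface IoAction<T> {
    T run() throws IOException;
  }

  // Runs the action, retrying only when it fails with FileNotFoundException.
  // Any other IOException propagates immediately.
  static <T> T retryOnFileNotFound(IoAction<T> action, int maxAttempts) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    FileNotFoundException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.run();
      } catch (FileNotFoundException e) {
        // The other master may have removed the file; try again.
        last = e;
      }
    }
    // All attempts failed with FileNotFoundException: give up with the last one.
    throw last;
  }
}
```

The action passed in would wrap the lease recovery step, so that a log removed by the other master between listing and recovery only costs one more attempt.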





[jira] [Created] (HBASE-16034) Fix ProcedureTestingUtility#LoadCounter.setMaxProcId()

2016-06-15 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-16034:
---

 Summary: Fix ProcedureTestingUtility#LoadCounter.setMaxProcId() 
 Key: HBASE-16034
 URL: https://issues.apache.org/jira/browse/HBASE-16034
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2, test
Affects Versions: 1.2.1, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.3.0, 1.2.2
 Attachments: HBASE-16034-v0.patch

ProcedureTestingUtility LoadCounter.setMaxProcId() implementation is wrong, and 
it ends up not setting the max value
{code}
 public void setMaxProcId(long maxProcId) {
-  maxProcId = maxProcId;
+  this.maxProcId = maxProcId;
 }
{code}
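
The effect of the missing "this." can be reproduced in isolation. In the sketch below LoadCounterSketch is a hypothetical stand-in for the test helper, with the two setter variants side by side; only the assignment lines mirror the diff above.

```java
// Hypothetical stand-in for the test helper; only the setter bodies mirror the patch.
class LoadCounterSketch {
  private long maxProcId = 0;

  // Before the patch: the parameter shadows the field, so this assigns the
  // parameter to itself and the field never changes.
  void setMaxProcIdBuggy(long maxProcId) {
    maxProcId = maxProcId;
  }

  // After the patch: "this." makes the assignment target the field.
  void setMaxProcIdFixed(long maxProcId) {
    this.maxProcId = maxProcId;
  }

  long getMaxProcId() {
    return maxProcId;
  }
}
```

Calling the buggy setter and then reading the field back still returns the initial value, which is why the max value was never set.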





Re: RegionCoprocessorHost.preClose() execute multiple times

2016-06-14 Thread Matteo Bertozzi
HBASE-8075

Matteo


On Tue, Jun 14, 2016 at 7:40 AM, Stephen Jiang 
wrote:

> In RSRpcServices#closeRegion(), it calls RegionCoprocessorHost#preClose()
> first, then calls HRegionServer#closeRegion().
>
> In HRegionServer#closeRegion(), the RegionCoprocessorHost#preClose() is
> called again.
>
> I just wonder whether the RegionCoprocessorHost#preClose() call
> in RSRpcServices#closeRegion() is unnecessary.  Anyone has idea?  The code
> seems there forever.
>
> I think we should be able to remove the call in RSRpcServices#closeRegion()
> and rely on HRegionServer#closeRegion() call for this CP.
>
> {code}
>   @Override
>   @QosPriority(priority=HConstants.ADMIN_QOS)
>   public CloseRegionResponse closeRegion(final RpcController controller,
>       final CloseRegionRequest request) throws ServiceException {
>     ...
>     // Can be null if we're calling close on a region that's not online
>     final Region region = regionServer.getFromOnlineRegions(encodedRegionName);
>     if ((region != null) && (region.getCoprocessorHost() != null)) {
>       region.getCoprocessorHost().preClose(false);
>     }
>     ...
>     boolean closed = regionServer.closeRegion(encodedRegionName, false, sn);
> {code}
>


[jira] [Created] (HBASE-15927) Remove HMaster.assignRegion()

2016-05-31 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15927:
---

 Summary: Remove HMaster.assignRegion()
 Key: HBASE-15927
 URL: https://issues.apache.org/jira/browse/HBASE-15927
 Project: HBase
  Issue Type: Sub-task
  Components: test
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


another cleanup to have a smaller integration patch for the new AM.

get rid of HMaster.assignRegion(), which was used only by a few tests, 
and replace that assignRegion()+wait() with an HTU call





Re: Smart Flaky Handler

2016-05-20 Thread Matteo Bertozzi
any suggestions on how to make people aware of the tests being flaky?

for example, I would never have noticed the procedure tests being flaky if it
were not for stack posting the list here.
so, maybe a weekly digest on the dev-list with the list of flaky tests will
get more audience than having people go into the job.

also, I was thinking about how I would notice if I broke something when I
post a patch.
since we exclude the flaky tests from the run, there is no way I can notice
from QA that I broke something.
maybe we can add a section in QA that runs the flaky ones and tells you
"these failed but may be flaky",
so that I can at least check whether the failures are related to the patch
or are just flaky.

On Fri, May 20, 2016 at 11:03 AM, Nick Dimiduk  wrote:

> Nice work Appy! What do I need to do to get it wired up for branch-1.1?
>
> On Fri, May 20, 2016 at 9:25 AM, Stack  wrote:
>
> > The system seems to be working nicely Appy. We are getting green
> precommit
> > builds for the first time in ages.
> >
> > Should we change the includes and excludes lists so they have a file type
> > ending? .txt? Then I could open them easily in the browser. Currently I
> > have to download them.
> >
> > Includes are tests that are currently considered 'flakey'?
> >
> >
> >
> TestGenerateDelegationToken,TestMobCompactor,TestRegionServerMetrics,TestAcidGuarantees,TestMasterReplication,TestRowProcessorEndpoint,TestAsyncLogRolling,DynamicLogicExpressionSuite,TestMasterFailoverWithProcedures,TestChoreService,TestScannerHeartbeatMessages,TestWALProcedureStore,TestRegionMergeTransactionOnCluster,TestSaslFanOutOneBlockAsyncDFSOutput,TestReplicationEndpointWithMultipleWAL
> >
> > We have a nice list.
> >
> > Excludes are:
> >
> >
> >
> **/TestGenerateDelegationToken.java,**/TestMobCompactor.java,**/TestRegionServerMetrics.java,**/TestAcidGuarantees.java,**/TestMasterReplication.java,**/TestRowProcessorEndpoint.java,**/TestAsyncLogRolling.java,**/DynamicLogicExpressionSuite.java,**/TestMasterFailoverWithProcedures.java,**/TestChoreService.java,**/TestScannerHeartbeatMessages.java,**/TestWALProcedureStore.java,**/TestRegionMergeTransactionOnCluster.java,**/TestSaslFanOutOneBlockAsyncDFSOutput.java,**/TestReplicationEndpointWithMultipleWAL.java,
> >
> > Whats the '**/' about? Is it supposed to have opening/closing versions?
> >
> > Thanks boss,
> > St.
> >
> >
> >
> > On Mon, May 16, 2016 at 4:45 PM, Stack  wrote:
> >
> > > Sweet!
> > >
> > > On Mon, May 16, 2016 at 4:38 PM, Apekshit Sharma 
> > > wrote:
> > >
> > >> This mail is to introduce the work to tackle the flaky tests in our
> > build.
> > >>
> > >> *Why is it important?*
> > >> - Our build history sucks, last 175 post-commit runs failed. We need
> to
> > >> make it useful.
> > >> - To better understand our code’s testing status, more importantly
> it’s
> > >> weak points.
> > >> - We know those 2-3 tests which keep failing every now and then, but
> not
> > >> those ~10 nasty ones which fail like 1 out of 50 times, and screw our
> > build.
> > >> - This isn’t something that can be done manually on a daily basis. We
> > >> need automation.
> > >>
> > >> *Changes made so far:*
> > >> Code changes: HBASE-15839
> > >>   (Umbrella issue)
> > >>
> > >> *Jenkins changes:*
> > >>
> > >>
> > >> [Diagram link:
> > >>
> >
> https://issues.apache.org/jira/secure/attachment/12804292/Screen%20Shot%202016-05-16%20at%204.02.46%20PM.png
> > >> ]
> > >>
> > >> *(new job) HBase-Find-Flaky-Tests*: Gets test reports of recent builds
> > >> of post-commit job (TRUNK_matrix) and HBase-Flaky-Tests job (see
> below)
> > to
> > >> find flaky tests. Frequency of run determines how fast we catch test
> > >> regressions. So if we run it every 4 hours, any test which started
> > failing
> > >> in post-commit job (TRUNK_matrix) in last 4 hour will be blacklisted.
> > >>
> > >> *(new job) HBase-Flaky-Tests*: This job runs only the flaky tests. The
> > >> aim is to run this job back-to-back to collect as many runs as we can.
> > >> Higher the run rate, the better will be our system at catching the
> flaky
> > >> tests. We currently run it hourly. so we’ll be able to keep track of
> > flaky
> > >> tests with ~5% failure rate or more.
> > >>
> > >> *Post-commit (TRUNK_matrix) and pre-commit jobs*: Exclude these flaky
> > >> tests.
> > >>
> > >>
> > >> *So what if a bad commit makes a good test bad?*
> > >> Since the test is not bad, it’ll run in next post-commit and will
> fail.
> > >> Next run of HBase-Find-Flaky-Tests will  pick it up and blacklist it.
> > >> Blacklisting will help keep the post-commit job and more importantly
> > >> pre-commit job clean, a problem we face quite often.
> > >>
> > >> *Are we just tucking away our shit?*
> > >> Nope, this will help us:
> > >> - first, Maintain a list of bad test (we lack that today).
> > >> - second, make our build greener to the point that a failed/red build
> 

[jira] [Created] (HBASE-15872) Split TestWALProcedureStore

2016-05-20 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15872:
---

 Summary: Split TestWALProcedureStore
 Key: HBASE-15872
 URL: https://issues.apache.org/jira/browse/HBASE-15872
 Project: HBase
  Issue Type: Sub-task
  Components: proc-v2
Affects Versions: 1.2.1, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Priority: Trivial








[jira] [Created] (HBASE-15864) Reuse the testing helper to wait regions in transition

2016-05-19 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15864:
---

 Summary: Reuse the testing helper to wait regions in transition
 Key: HBASE-15864
 URL: https://issues.apache.org/jira/browse/HBASE-15864
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 1.2.1, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0, 1.3.0
 Attachments: HBASE-15864-v0.patch

There are a bunch of unit tests that do the same loop to wait for regions in 
transition. get rid of the duplicate code and call the helpers.





[jira] [Created] (HBASE-15843) Replace RegionState.getRegionInTransition() Map with a Set

2016-05-16 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15843:
---

 Summary: Replace RegionState.getRegionInTransition() Map with a Set
 Key: HBASE-15843
 URL: https://issues.apache.org/jira/browse/HBASE-15843
 Project: HBase
  Issue Type: Improvement
  Components: master, Region Assignment
Affects Versions: 1.2.1, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


RegionState.getRegionInTransition() is always used as a Set.
replace the Map with a Set, avoid some allocation and extra code.

also ClusterStatus.RegionInTransition has duplicated information.
The spec field contains the regionName (not encoded), 
but we have the same info as part of the region_state, with the HRegionInfo 
serialized.
unfortunately I don't think we can get rid of 'spec', since it is a required 
field.
{noformat}
message RegionInTransition {
  required RegionSpecifier spec = 1;
  required RegionState region_state = 2;
}
{noformat}





[jira] [Resolved] (HBASE-15818) Shell create ‘t1’, ‘f1’, ‘f2’, ‘f3’ wrong number of arguments

2016-05-11 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi resolved HBASE-15818.
-
Resolution: Invalid

closing, I think it is just the wrong ' character (typographic quotes instead of ASCII ones)

> Shell create ‘t1’, ‘f1’, ‘f2’, ‘f3’ wrong number of arguments 
> --
>
> Key: HBASE-15818
> URL: https://issues.apache.org/jira/browse/HBASE-15818
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0, 1.2.1
>    Reporter: Matteo Bertozzi
>Priority: Critical
> Fix For: 2.0.0, 1.3.0, 1.2.2
>
>
> Unable to create a table with multiple families (as suggested by the Examples)
> also the shell is exiting. (I only tested 1.2 and 2.0 and have the problem, 
> 1.1 seems to be ok)
> {noformat}
> hbase(main):001:0> create ‘t1’, ‘f1’, ‘f2’, ‘f3’
> ERROR: wrong number of arguments (0 for 1)
> Examples:
>   hbase> create 't1', 'f1', 'f2', 'f3'
> {noformat}





[jira] [Created] (HBASE-15818) Shell create ‘t1’, ‘f1’, ‘f2’, ‘f3’ wrong number of arguments

2016-05-11 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15818:
---

 Summary: Shell create ‘t1’, ‘f1’, ‘f2’, ‘f3’ wrong number of 
arguments 
 Key: HBASE-15818
 URL: https://issues.apache.org/jira/browse/HBASE-15818
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Priority: Critical
 Fix For: 2.0.0, 1.3.0, 1.2.2


Unable to create a table with multiple families (as suggested by the Examples)
also the shell is exiting. (I only tested 1.2 and 2.0 but it may be in every 
version)
{noformat}
hbase(main):001:0> create ‘t1’, ‘f1’, ‘f2’, ‘f3’

ERROR: wrong number of arguments (0 for 1)

Examples:
  hbase> create 't1', 'f1', 'f2', 'f3'
{noformat}
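
Note that the failing command above uses typographic quotes (‘ ’) while the shell's own example uses ASCII ones ('), which is what the resolution of this issue points at. A minimal sketch of a check that would flag such a pasted command; QuoteCheckSketch and its methods are hypothetical helpers, not part of the HBase shell.

```java
// Hypothetical helper; not part of the HBase shell.
class QuoteCheckSketch {
  // True if the text contains left/right single or double typographic quotes.
  static boolean hasTypographicQuotes(String s) {
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '\u2018' || c == '\u2019' || c == '\u201C' || c == '\u201D') {
        return true;
      }
    }
    return false;
  }

  // Replaces typographic quotes with their ASCII counterparts.
  static String toAsciiQuotes(String s) {
    return s.replace('\u2018', '\'')
        .replace('\u2019', '\'')
        .replace('\u201C', '"')
        .replace('\u201D', '"');
  }
}
```

Commands copied from a word processor, a chat client or a rendered web page are the usual source of such characters.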





Re: [VOTE] First release candidate for HBase 1.1.5 (RC0) is available

2016-05-11 Thread Matteo Bertozzi
+1

- compiled from source and ran a few unit tests:
TestAdmin*,Test*Master*,Test*Region*
- inspected the binary
- started hbase from both source and binary
- ran a few commands from the shell: create/disable/enable/drop/split,
put/get/scan, snapshot/clone_snapshot
- ran PerformanceEvaluation with random write/read with autosplit
- ran a simple bulkload and checked the data
- clicked around the webui
- checked the logs for anything strange

On Wed, May 11, 2016 at 8:26 AM, Nick Dimiduk  wrote:

> A reminder, everyone, that this vote is scheduled to conclude in ~40 hours.
>
> On Sun, May 8, 2016 at 9:23 PM, Nick Dimiduk  wrote:
>
> > *** Please note that my key expired since the previous release. I have
> > updated its expiration, pushed to pgp.mit.edu, updated the KEYS file
> > linked below, and attempted to force an update on id.apache.org. I don't
> > know how long it will take for people.apache.org to refresh. ***
> >
> > *** Please note that this voting window is slightly shorter than the
> > customary one week so that we have time for an RC1 before HBaseCon, if
> > necessary. ***
> >
> > I'm happy to announce the first release candidate of HBase 1.1.5 (HBase-
> > 1.1.5RC0) is available for download at
> > https://dist.apache.org/repos/dist/dev/hbase/hbase-1.1.5RC0/
> >
> > Maven artifacts are also available in the staging repository
> > https://repository.apache.org/content/repositories/orgapachehbase-1136/
> >
> > Artifacts are signed with my code signing subkey 0xAD9039071C3489BD,
> > available in the Apache keys directory
> > https://people.apache.org/keys/committer/ndimiduk.asc and in our KEYS
> > file http://www-us.apache.org/dist/hbase/KEYS.
> >
> > There's also a signed tag for this release at
> >
> https://git-wip-us.apache.org/repos/asf?p=hbase.git;a=tag;h=92323e8e630e46d277ab2e8ebd34b91ab5d597d5
> >
> > The detailed source and binary compatibility report vs 1.1.4 has been
> > published for your review, at
> > http://home.apache.org/~ndimiduk/1.1.4_1.1.5RC0_compat_report.html
> >
> > HBase 1.1.5 is the fifth patch release in the HBase 1.1 line, continuing
> > on the theme of bringing a stable, reliable database to the Hadoop and
> > NoSQL communities. This release includes over 20 bug fixes since the
> 1.1.4
> > release. Notable correctness fixes
> > include HBASE-15234, HBASE-15295, HBASE-15325, HBASE-15622, and
> HBASE-15645.
> >
> > The full list of fixes included in this release is available at
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753=12335058
> > and in the CHANGES.txt file included in the distribution.
> >
> > Please try out this candidate and vote +/-1 by 23:59 Pacific time on
> > Thursday, 2016-05-12 as to whether we should release these artifacts as
> > HBase 1.1.5.
> >
> > Thanks,
> > Nick
> >
>


[jira] [Created] (HBASE-15809) Basic Replication WebUI

2016-05-09 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15809:
---

 Summary: Basic Replication WebUI
 Key: HBASE-15809
 URL: https://issues.apache.org/jira/browse/HBASE-15809
 Project: HBase
  Issue Type: New Feature
  Components: Replication
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 2.0.0, 1.3.0


At the moment the only way to get some insight into replication from the webui 
is looking at zkdump and metrics.

the basic pieces of information useful to start debugging are: peer information 
and a view of the WAL offsets for each peer.





Re: What's up with branch-2?

2016-05-08 Thread Matteo Bertozzi
I can't see it, are you sure you don't have a local copy? (or maybe someone
deleted it after your mail)
https://github.com/apache/hbase/commits/branch-2

Matteo


On Sun, May 8, 2016 at 6:08 PM, Nick Dimiduk  wrote:

> What's up with this branch-2? Seems like it's back after Sean's
> HBASE-15006.
>


[jira] [Resolved] (HBASE-15746) RegionCoprocessor preClose() called 3 times

2016-05-08 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi resolved HBASE-15746.
-
Resolution: Duplicate

> RegionCoprocessor preClose() called 3 times
> ---
>
> Key: HBASE-15746
> URL: https://issues.apache.org/jira/browse/HBASE-15746
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors, regionserver
>Affects Versions: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 0.98.19
>    Reporter: Matteo Bertozzi
>Priority: Minor
>
> The preClose() region coprocessor call gets called 3 times via rpc.
> The first one is when we receive the RPC
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1329
> The second time is when we ask the RS to close the region
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L2852
> The third time is when the doClose() on the region is executed.
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L1419
> I'm pretty sure the first one can be removed since, there is no code between 
> that and the second call. and they are a copy-paste.
> The second one explicitly says that is to enforce ACLs before starting the 
> operation, which leads me to the fact that the 3rd one in the region gets 
> executed too late in the process. but the region.close() may be called by 
> someone other than the RS, so we should probably leave the preClose() in 
> there (e.g. OpenRegionHandler on failure cleanup). 
> any idea?





[jira] [Created] (HBASE-15790) Force "hbase" ownership on bulkload

2016-05-06 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15790:
---

 Summary: Force "hbase" ownership on bulkload
 Key: HBASE-15790
 URL: https://issues.apache.org/jira/browse/HBASE-15790
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.19, 1.1.4, 1.2.1, 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Attachments: HBASE-15790-v0.patch

When a user other than "hbase" bulkloads files, we generally end up with 
files owned by that user instead of hbase. sometimes this causes problems, with 
hbase not being able to move files around for archiving/deleting.

A simple solution is probably to change the ownership of the files to "hbase" 
during bulkload.





[jira] [Created] (HBASE-15781) Remove unused TableEventHandler and TotesHRegionInfo

2016-05-05 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15781:
---

 Summary: Remove unused TableEventHandler and TotesHRegionInfo
 Key: HBASE-15781
 URL: https://issues.apache.org/jira/browse/HBASE-15781
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0
 Attachments: HBASE-15781-v0.patch

Remove unused classes TableEventHandler and TotesHRegionInfo





[jira] [Created] (HBASE-15778) replace master.am and master.sm direct access with getter calls

2016-05-05 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15778:
---

 Summary: replace master.am and master.sm direct access with getter 
calls
 Key: HBASE-15778
 URL: https://issues.apache.org/jira/browse/HBASE-15778
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


MasterRpcServices seems to directly access a bunch of members of HMaster. I 
think this is because when we split HMaster and MasterRpcServices we just did a 
find & replace.

I was trying to mock things and plug in a different AssignmentManager and 
ServerManager, but that is impossible with those direct accesses.
So, here is a patch to avoid the direct access to AM and SM.

There are many more members that can be made private, but I'm lazy, so this pass 
covers only the AM and SM related ones.
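
A small sketch of why the getter matters for mocking. All classes here 
(SketchMaster, SketchServerManager, SketchRpcServices, GetterAccessDemo) are 
hypothetical stand-ins, not the real HBase classes; only the access pattern 
is the point.

```java
// Hypothetical stand-in for ServerManager.
class SketchServerManager {
  int countOnlineServers() { return 3; }
}

// Hypothetical stand-in for HMaster: the field is private, callers use the getter.
class SketchMaster {
  private final SketchServerManager serverManager = new SketchServerManager();

  SketchServerManager getServerManager() { return serverManager; }
}

// Hypothetical stand-in for MasterRpcServices.
class SketchRpcServices {
  private final SketchMaster master;

  SketchRpcServices(SketchMaster master) { this.master = master; }

  int onlineServers() {
    // Before the change this would have read the field: master.serverManager
    return master.getServerManager().countOnlineServers();
  }
}

class GetterAccessDemo {
  static int withRealMaster() {
    return new SketchRpcServices(new SketchMaster()).onlineServers();
  }

  static int withMockedManager() {
    // Overriding the getter is enough to swap in a different manager.
    SketchMaster mocked = new SketchMaster() {
      @Override
      SketchServerManager getServerManager() {
        return new SketchServerManager() {
          @Override
          int countOnlineServers() { return 42; }
        };
      }
    };
    return new SketchRpcServices(mocked).onlineServers();
  }
}
```

With direct field access the override in withMockedManager() would never be 
consulted, which is the problem described above.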





[jira] [Created] (HBASE-15776) Replace master.am.getTableStateManager() with the direct master.getTableStateManager()

2016-05-05 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15776:
---

 Summary: Replace master.am.getTableStateManager() with the direct 
master.getTableStateManager()
 Key: HBASE-15776
 URL: https://issues.apache.org/jira/browse/HBASE-15776
 Project: HBase
  Issue Type: Sub-task
  Components: master
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0
 Attachments: HBASE-15776-v0.patch

Replace the double lookup master.getAssignmentManager().getTableStateManager() 
with the direct master.getTableStateManager().

This is also because I'd like to remove the TableStateManager instance from the 
new AM.
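
Roughly, the end state looks like the sketch below. The Sketch* classes and 
the isEnabled() method are hypothetical stand-ins, assuming the master owns 
the TableStateManager and the AM no longer has to hand it out.

```java
// Hypothetical stand-in for TableStateManager.
class SketchTableStateManager {
  boolean isEnabled(String table) { return !"disabled_table".equals(table); }
}

// Hypothetical stand-in for the new AM: it no longer exposes a TableStateManager.
class SketchAssignmentManager {
}

// Hypothetical stand-in for the master, which owns both instances.
class SketchMaster {
  private final SketchTableStateManager tableStateManager = new SketchTableStateManager();
  private final SketchAssignmentManager assignmentManager = new SketchAssignmentManager();

  SketchAssignmentManager getAssignmentManager() { return assignmentManager; }

  SketchTableStateManager getTableStateManager() { return tableStateManager; }
}

class DirectLookupDemo {
  static boolean isEnabled(SketchMaster master, String table) {
    // Before: master.getAssignmentManager().getTableStateManager().isEnabled(table)
    return master.getTableStateManager().isEnabled(table);
  }
}
```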





Re: A checkstyle question: indentation in switch-case statements

2016-05-04 Thread Matteo Bertozzi
I think it is complaining because the "case" is aligned with the "switch"
and not indented
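
For comparison, a minimal sketch of the layout that would presumably pass, 
assuming the project's checkstyle Indentation check is configured with a 
non-zero caseIndent (the exact HBase setting is not shown in this thread). 
SwitchIndentExample and its State enum are made up for illustration; only 
the position of "case" relative to "switch" matters.

```java
// Hypothetical example class, not taken from AddColumnFamilyProcedure.
class SwitchIndentExample {
  enum State { PRE_OPERATION, POST_OPERATION }

  static String describe(State state) {
    switch (state) {
      // "case" is indented one level under "switch", the body one level further.
      case PRE_OPERATION:
        return "pre";
      case POST_OPERATION:
        return "post";
      default:
        throw new UnsupportedOperationException("unhandled state=" + state);
    }
  }
}
```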

Matteo


On Wed, May 4, 2016 at 7:13 AM, Stephen Jiang 
wrote:

> I got two checkstyle warnings for the style of switch-case statements:
>
>
> ./hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AddColumnFamilyProcedure.java:386:
>cpHost.preAddColumnFamilyAction(tableName, cfDescriptor);:
> error: 'block' child have incorrect indentation level 12, expected
> level should be one of the following: 14, 16.
>
> ./hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AddColumnFamilyProcedure.java:386:
>cpHost.preAddColumnFamilyAction(tableName, cfDescriptor);:
> error: 'method call' child have incorrect indentation level 12,
> expected level should be one of the following: 14, 16.
>
> To me, the 2-space indentation looks correct.  What is wrong with this style
> (switch and case are at the same level; is this wrong?  I set up the HBASE
> coding style in eclipse and it automatically did it this way)?
>
>   switch (state) {
>   case ADD_COLUMN_FAMILY_PRE_OPERATION:
>     cpHost.preAddColumnFamilyAction(tableName, cfDescriptor);
>     break;
>   case ADD_COLUMN_FAMILY_POST_OPERATION:
>     cpHost.postCompletedAddColumnFamilyAction(tableName, cfDescriptor);
>     break;
>   default:
>     throw new UnsupportedOperationException(this + " unhandled state=" + state);
>   }
>
>
> Any insight would be helpful.
>
> Thanks
>
> Stephen
>


[jira] [Created] (HBASE-15763) Isolate Wal related stuff from MasterFileSystem

2016-05-03 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15763:
---

 Summary: Isolate Wal related stuff from MasterFileSystem
 Key: HBASE-15763
 URL: https://issues.apache.org/jira/browse/HBASE-15763
 Project: HBase
  Issue Type: Sub-task
  Components: master, wal
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0


To help the work on the redofs (HBASE-14090) we need some rework on the 
"filesystem" interfaces.

This task just moves the WAL related things out of MasterFileSystem.
It is not meant to create a good interface for the WAL, just to move things 
out of MasterFileSystem so that we can start working on that. The fixup 
of the WAL interface will be done later in another jira.





[jira] [Created] (HBASE-15746) RegionCoprocessor preClose() called 3 times

2016-05-02 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-15746:
---

 Summary: RegionCoprocessor preClose() called 3 times
 Key: HBASE-15746
 URL: https://issues.apache.org/jira/browse/HBASE-15746
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors, regionserver
Affects Versions: 0.98.19, 1.1.4, 1.2.1, 2.0.0, 1.3.0
Reporter: Matteo Bertozzi
Priority: Minor


The preClose() region coprocessor call gets called 3 times via rpc.

The first one is when we receive the RPC
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L1329

The second time is when we ask the RS to close the region
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java#L2852

The third time is when the doClose() on the region is executed.
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L1419

I'm pretty sure the first one can be removed, since there is no code between 
that and the second call, and they are a copy-paste of each other.

The second one explicitly says that it is there to enforce ACLs before starting 
the operation, which leads me to think that the 3rd one in the region gets 
executed too late in the process. But region.close() may be called by 
someone other than the RS, so we should probably leave the preClose() in there 
(e.g. OpenRegionHandler on failure cleanup). 

any idea?
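
To make the call chain concrete, here is a toy model of the three layers. It 
is a simplification and not HBase code: PreCloseTrace and its methods are 
hypothetical, and the real close path is not a plain synchronous chain like 
this. The flag models dropping the hook from the RPC layer.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not HBase code: each method stands in for one of the three
// layers named in the report and records when the hook fires.
class PreCloseTrace {
  final List<String> calls = new ArrayList<>();
  private final boolean hookInRpcLayer;

  PreCloseTrace(boolean hookInRpcLayer) {
    this.hookInRpcLayer = hookInRpcLayer;
  }

  private void preClose(String layer) {
    calls.add(layer);
  }

  // Stand-in for the RPC entry point.
  void rpcCloseRegion() {
    if (hookInRpcLayer) {
      preClose("rpc");
    }
    serverCloseRegion();
  }

  // Stand-in for the region server asking the region to close.
  void serverCloseRegion() {
    preClose("regionserver");
    regionClose();
  }

  // Stand-in for region.close(), which other callers may invoke directly.
  void regionClose() {
    preClose("region");
  }
}
```

In this model, a close that comes in through the RPC path fires the hook 
three times with the RPC-layer hook and twice without it, while a caller 
that goes straight to regionClose() only ever sees the hook inside the 
region, which is the argument for leaving that one in place.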




