[jira] [Created] (HBASE-17718) Difference between RS's servername and its ephemeral node cause SSH stop working

2017-03-01 Thread Allan Yang (JIRA)
Allan Yang created HBASE-17718:
--

 Summary: Difference between RS's servername and its ephemeral node 
cause SSH stop working
 Key: HBASE-17718
 URL: https://issues.apache.org/jira/browse/HBASE-17718
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.1.8, 1.2.4, 2.0.0
Reporter: Allan Yang
Assignee: Allan Yang



After HBASE-9593, the RS puts up an ephemeral node in ZK before reporting for duty. 
But if the hosts config (/etc/hosts) differs between the master and the RS, the RS's 
serverName can differ from the one stored in the ephemeral ZK node. The 
email mentioned in HBASE-13753 
(http://mail-archives.apache.org/mod_mbox/hbase-user/201505.mbox/%3CCANZDn9ueFEEuZMx=pZdmtLsdGLyZz=rrm1N6EQvLswYc1z-H=g...@mail.gmail.com%3E)
 is exactly what happened in our production env. 

But what the email didn't point out is that the difference between the serverName 
in the RS and the one in the ZK node can cause SSH to stop working, as we can see from the code in 
{{RegionServerTracker}}:
{code}
  @Override
  public void nodeDeleted(String path) {
    if (path.startsWith(watcher.rsZNode)) {
      String serverName = ZKUtil.getNodeName(path);
      LOG.info("RegionServer ephemeral node deleted, processing expiration [" +
          serverName + "]");
      ServerName sn = ServerName.parseServerName(serverName);
      if (!serverManager.isServerOnline(sn)) {
        LOG.warn(serverName.toString() + " is not online or isn't known to the master."+
            "The latter could be caused by a DNS misconfiguration.");
        return;
      }
      remove(sn);
      this.serverManager.expireServer(sn);
    }
  }
{code}
The server will not be processed by SSH/ServerCrashProcedure. The regions on 
this server will not be assigned again until the master restarts or fails over.
I know HBASE-9593 was meant to fix the case where an RS reports for duty and crashes before 
it can put up a ZK node. That is a very rare case. But the issue I mentioned can 
happen more often (due to DNS, config, etc.) and has more severe consequences.
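
To make the failure mode concrete, here is a minimal, self-contained model of it. This is not HBase code: the class, its methods, and the host names are hypothetical, and the only thing taken from HBase is the "host,port,startcode" layout of a server name and the early return shown in nodeDeleted() above.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model, not HBase code: the master's online-server set is keyed by the
// name the master resolved, while the ZK node carries the name the RS chose itself.
class ServerNameMismatchDemo {
    private final Set<String> onlineServers = new HashSet<>();

    // Called when an RS reports for duty; the master records its own view of the name.
    void regionServerReport(String nameAsSeenByMaster) {
        onlineServers.add(nameAsSeenByMaster);
    }

    // Mirrors the early return in nodeDeleted(): returns true only if the
    // deleted znode name is known, i.e. only if expiration would be triggered.
    boolean nodeDeleted(String znodeName) {
        if (!onlineServers.contains(znodeName)) {
            return false; // unknown to the master: no SSH/ServerCrashProcedure runs
        }
        onlineServers.remove(znodeName);
        return true;
    }

    boolean isOnline(String name) {
        return onlineServers.contains(name);
    }

    public static void main(String[] args) {
        ServerNameMismatchDemo master = new ServerNameMismatchDemo();
        // Hypothetical names: master and RS resolve the same machine differently.
        String masterView = "rs1.internal.example.com,16020,1488326400000";
        String znodeName = "rs1,16020,1488326400000";
        master.regionServerReport(masterView);
        System.out.println("expired on znode delete: " + master.nodeDeleted(znodeName));
        System.out.println("still listed online: " + master.isOnline(masterView));
    }
}
```

In this model the dead server stays in the online set indefinitely, which corresponds to the regions never being reassigned.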

So here I offer some solutions to discuss:
1. Revert HBASE-9593 from all branches; Andrew Purtell has already reverted it in 
branch-0.98.
2. Abort the RS if the master returns a different name; otherwise SSH can't work properly.
3. Have the master accept whatever servername the RS reports and not change it.
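
As a rough illustration of option 2, the check could look something like the sketch below. ServerNameCheck, hostOf, and shouldAbort are hypothetical names, not existing HBase methods, and comparing hosts case-insensitively is an assumption of this sketch.

```java
// Hypothetical helper illustrating option 2; not an existing HBase class.
class ServerNameCheck {
    // Extracts the host part from a "host,port,startcode" server name.
    static String hostOf(String serverName) {
        int comma = serverName.indexOf(',');
        if (comma <= 0) {
            throw new IllegalArgumentException("not a server name: " + serverName);
        }
        return serverName.substring(0, comma);
    }

    // True if the host in the RS's znode name differs from the host the master
    // handed back, in which case the RS would abort rather than keep running.
    static boolean shouldAbort(String znodeServerName, String hostReturnedByMaster) {
        return !hostOf(znodeServerName).equalsIgnoreCase(hostReturnedByMaster);
    }

    public static void main(String[] args) {
        String znodeName = "rs1,16020,1488326400000"; // hypothetical znode name
        System.out.println(shouldAbort(znodeName, "rs1"));
        System.out.println(shouldAbort(znodeName, "rs1.internal.example.com"));
    }
}
```

Failing fast this way trades availability of one RS for the guarantee that the znode name and the master's name agree.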

 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17717) Incorrect ZK ACL set for HBase superuser

2017-03-01 Thread Josh Elser (JIRA)
Josh Elser created HBASE-17717:
--

 Summary: Incorrect ZK ACL set for HBase superuser
 Key: HBASE-17717
 URL: https://issues.apache.org/jira/browse/HBASE-17717
 Project: HBase
  Issue Type: Bug
  Components: security, Zookeeper
Reporter: Shreya Bhat
Assignee: Josh Elser
 Fix For: 2.0.0, 1.3.1, 1.1.10, 1.2.6


Shreya was doing some testing of a deploy of HBase, verifying that the ZK ACLs 
were actually set as we expect (yay, security).

She noticed that, in some cases, we were seeing multiple ACLs for the same user.

{noformat}
'world,'anyone
: r
'sasl,'hbase
: cdrwa
'sasl,'hbase
: cdrwa
{noformat}

After digging into this (and some insight from the mighty [~enis]), we realized 
that this was happening because of an overridden value for {{hbase.superuser}}. 
However, the ACL value doesn't match what we'd expect to see (as 
hbase.superuser was set to {{cstm-hbase}}).

After digging into this code, it seems like the {{auth}} ACL scheme in 
ZooKeeper does not work as we expect.

{code}
  if (superUser != null) {
    acls.add(new ACL(Perms.ALL, new Id("auth", superUser)));
  }
{code}

In the above, the {{"auth"}} scheme ignores any provided "subject" in the 
{{Id}} object. It *only* considers the authentication of the current 
connection. As such, our usage of this never actually sets the ACL for the 
superuser correctly.
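
The effect can be illustrated with a small model. This is a simplification written for this note, not ZooKeeper's code: ids are plain "scheme:id" strings, permissions are left out, and the requested ACL list and connection identity are assumptions chosen to match the listing above.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of how an "auth" entry is resolved; not ZooKeeper's implementation.
class AuthSchemeModel {
    // Replaces each "auth:<anything>" entry with the ids the current connection
    // authenticated as. The id text after "auth:" is ignored entirely.
    static List<String> resolve(List<String> requested, List<String> connectionAuth) {
        List<String> out = new ArrayList<>();
        for (String id : requested) {
            if (id.startsWith("auth:")) {
                out.addAll(connectionAuth);
            } else {
                out.add(id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed inputs: the connection is authenticated as sasl:hbase and the
        // configured superuser is cstm-hbase.
        List<String> requested = List.of("world:anyone", "sasl:hbase", "auth:cstm-hbase");
        List<String> connection = List.of("sasl:hbase");
        System.out.println(resolve(requested, connection));
    }
}
```

Under these assumptions the result contains sasl:hbase twice and cstm-hbase not at all, which matches the duplicate entries seen above. Naming the superuser through a scheme that does honor the id (for example "sasl") would be one way to avoid this, though that is a suggestion and not something confirmed here.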





[jira] [Created] (HBASE-17716) Formalize Scan Metric names

2017-03-01 Thread Karan Mehta (JIRA)
Karan Mehta created HBASE-17716:
---

 Summary: Formalize Scan Metric names
 Key: HBASE-17716
 URL: https://issues.apache.org/jira/browse/HBASE-17716
 Project: HBase
  Issue Type: Bug
  Components: metrics
Reporter: Karan Mehta
Assignee: Karan Mehta
Priority: Minor


HBase provides various metrics through the APIs exposed by the ScanMetrics class. 
The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix Metrics 
API. Currently these metrics are referred to by hard-coded strings, which are not 
formalized and can break the Phoenix API if they change. Hence we need to refactor the code to 
assign enums for these metrics.
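
A minimal sketch of what such an enum could look like follows. ScanMetricName, its constants, and the string keys are illustrative assumptions, not the actual ScanMetrics API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical enum: constant names and string keys are illustrative only.
enum ScanMetricName {
    RPC_CALLS("RPC_CALLS"),
    RPC_RETRIES("RPC_RETRIES"),
    MILLIS_BETWEEN_NEXTS("MILLIS_BETWEEN_NEXTS");

    // Reverse lookup table from the string key to the enum constant.
    private static final Map<String, ScanMetricName> BY_KEY = new HashMap<>();
    static {
        for (ScanMetricName m : values()) {
            BY_KEY.put(m.key, m);
        }
    }

    private final String key;

    ScanMetricName(String key) {
        this.key = key;
    }

    // The string under which the metric is published.
    String key() {
        return key;
    }

    // Fails loudly on an unknown key instead of silently returning nothing.
    static ScanMetricName fromKey(String key) {
        ScanMetricName m = BY_KEY.get(key);
        if (m == null) {
            throw new IllegalArgumentException("unknown scan metric: " + key);
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(fromKey("RPC_CALLS") == RPC_CALLS);
    }
}
```

With a type like this, a downstream caller such as Phoenix would reference a constant, so a renamed metric becomes a compile error rather than a silently missing value.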





[jira] [Created] (HBASE-17715) expose a sane API to package a standalone client jar

2017-03-01 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-17715:


 Summary: expose a sane API to package a standalone client jar
 Key: HBASE-17715
 URL: https://issues.apache.org/jira/browse/HBASE-17715
 Project: HBase
  Issue Type: Task
Reporter: Sergey Shelukhin
Assignee: Enis Soztutar


TableMapReduceUtil currently exposes a method that takes some info from job 
object iirc, and then makes a standalone jar and adds it to classpath.
It would be nice to have an API that one can call with minimum necessary 
arguments (not dependent on job stuff, "tmpjars" and all that) that would make 
a standalone client jar at a given path and let the caller manage it after that.





Preparing HBase 1.3.1 release

2017-03-01 Thread Mikhail Antonov
Hey guys,

It's been 2 months since 1.3 release came out; time to start preparing
1.3.1 release (and then time to assess when 1.3 should be labeled "latest
stable" in the book).

Looking at the Jira there's about 40 committed patches to branch-1.3, about
a dozen jiras in review and 2 dozens or a bit more of open jiras (including
several backporting jiras for changes requested for 1.3 too late in the RC
cycle).

At this point I think we should wind down the commits in the branch; please
ping me if you want to commit something in branch-1.3.

I'm planning to start going over jiras and have a look at "patch available"
ones; the ones that are open or reopened we need to either put effort to
get them in, or kick them out of this RC cycle and/or to 1.3.* line.

Please review the changes that you'd want to have in 1.3.1, ping me on
tasks and speak up if you see any blockers.

Thanks!
-Mikhail


Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-03-01 Thread Mikhail Antonov
Ouch. Thanks Sean!

I'm pretty sure at some point I was debugging 1.3-IT job and saw branch-1.3
getting checked out in the logs. Not sure how/when it went sideways though.

Yeah, let's see how it goes.

-Mikhail

On Wed, Mar 1, 2017 at 5:50 AM, Sean Busbey  wrote:

> Fun times.
>
> 1) Turns out our 1.3-IT jobs have been running against branch-1.2.
> Don't know how long, but as long as we have history.
>
> 2) I deleted the failing-since-august 1.2-IT job.
>
> 3) I renamed the passing 1.3-IT job that runs against branch-1.2 to be
> the 1.2-IT job
>
> 4) I copied the now renamed 1.2-IT job and made a 1.3-IT job that runs
> against branch-1.3
>
> I kicked off jobs after all this shuffling. We'll see how it goes.

[jira] [Created] (HBASE-17714) Client heartbeats seems to be broken

2017-03-01 Thread Samarth Jain (JIRA)
Samarth Jain created HBASE-17714:


 Summary: Client heartbeats seems to be broken
 Key: HBASE-17714
 URL: https://issues.apache.org/jira/browse/HBASE-17714
 Project: HBase
  Issue Type: Bug
Reporter: Samarth Jain


We have a test in Phoenix where we introduce an artificial sleep of 2 times the 
RPC timeout in the preScannerNext() hook of a coprocessor. 

{code}
  public static class SleepingRegionObserver extends SimpleRegionObserver {
    public SleepingRegionObserver() {}

    @Override
    public boolean preScannerNext(final ObserverContext<RegionCoprocessorEnvironment> c,
        final InternalScanner s, final List<Result> results,
        final int limit, final boolean hasMore) throws IOException {
      try {
        // Sleep past the RPC timeout, but only for scans on the test table.
        if (SLEEP_NOW && c.getEnvironment().getRegion().getRegionInfo().getTable()
            .getNameAsString().equals(TABLE_NAME)) {
          Thread.sleep(RPC_TIMEOUT * 2);
        }
      } catch (InterruptedException e) {
        throw new IOException(e);
      }
      return super.preScannerNext(c, s, results, limit, hasMore);
    }
  }
{code}

This test passed up to 1.1.3 but started failing sometime before 1.1.9 
with an OutOfOrderScannerNextException. See PHOENIX-3702. [~lhofhansl] mentioned 
that we have client heartbeats enabled and that this should prevent us from running 
into issues like this. FYI, this test fails with HBase 1.2.3 too.

CC [~apurtell], [~jamestaylor]







Successful: HBase Generate Website

2017-03-01 Thread Apache Jenkins Server
Build status: Successful

If successful, the website and docs have been generated. To update the live 
site, follow the instructions below. If failed, skip to the bottom of this 
email.

Use the following commands to download the patch and apply it to a clean branch 
based on origin/asf-site. If you prefer to keep the hbase-site repo around 
permanently, you can skip the clone step.

  git clone https://git-wip-us.apache.org/repos/asf/hbase-site.git

  cd hbase-site
  wget -O- https://builds.apache.org/job/hbase_generate_website/503/artifact/website.patch.zip | funzip > 4a5eba5e59591a8be06f6a3bf1b7f83a63bbec67.patch
  git fetch
  git checkout -b asf-site-4a5eba5e59591a8be06f6a3bf1b7f83a63bbec67 origin/asf-site
  git am --whitespace=fix 4a5eba5e59591a8be06f6a3bf1b7f83a63bbec67.patch

At this point, you can preview the changes by opening index.html or any of the 
other HTML pages in your local 
asf-site-4a5eba5e59591a8be06f6a3bf1b7f83a63bbec67 branch.

There are lots of spurious changes, such as timestamps and CSS styles in 
tables, so a generic git diff is not very useful. To see a list of files that 
have been added, deleted, renamed, changed type, or are otherwise interesting, 
use the following command:

  git diff --name-status --diff-filter=ADCRTXUB origin/asf-site

To see only files that had 100 or more lines changed:

  git diff --stat origin/asf-site | grep -E '[1-9][0-9]{2,}'
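
The grep filter above keeps any line containing a number of three or more digits. A small sketch of the same test in Java, using made-up sample lines, shows what it does and does not catch; DiffStatFilter and the sample file names are hypothetical.

```java
import java.util.regex.Pattern;

// Mirrors the grep -E pattern above: a non-zero digit followed by two or more digits.
class DiffStatFilter {
    private static final Pattern THREE_PLUS_DIGITS = Pattern.compile("[1-9][0-9]{2,}");

    // Like grep, this matches anywhere in the line, not only in the change count.
    static boolean matches(String line) {
        return THREE_PLUS_DIGITS.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            " book.html             | 250 ++++----",  // 250 changed lines: kept
            " index.html            |  12 +-",        // 12 changed lines: dropped
            " apidocs/v100/Foo.html |   3 +-"         // kept only because of "100" in the path
        };
        for (String line : samples) {
            System.out.println(matches(line) + "  " + line);
        }
    }
}
```

The third sample is worth noting: because the pattern is not anchored to the change-count column, a file whose path contains a large number is also listed.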

When you are satisfied, publish your changes to origin/asf-site using these 
commands:

  git commit --allow-empty -m "Empty commit" # to work around a current ASF INFRA bug
  git push origin asf-site-4a5eba5e59591a8be06f6a3bf1b7f83a63bbec67:asf-site
  git checkout asf-site
  git branch -D asf-site-4a5eba5e59591a8be06f6a3bf1b7f83a63bbec67

Changes take a couple of minutes to be propagated. You can verify whether they 
have been propagated by looking at the Last Published date at the bottom of 
http://hbase.apache.org/. It should match the date in the index.html on the 
asf-site branch in Git.

As a courtesy, reply-all to this email to let other committers know you pushed 
the site.



If failed, see https://builds.apache.org/job/hbase_generate_website/503/console

Re: Testing and CI -- Apache Jenkins Builds (WAS -> Re: Testing)

2017-03-01 Thread Sean Busbey
Fun times.

1) Turns out our 1.3-IT jobs have been running against branch-1.2.
Don't know how long, but as long as we have history.

2) I deleted the failing-since-august 1.2-IT job.

3) I renamed the passing 1.3-IT job that runs against branch-1.2 to be
the 1.2-IT job

4) I copied the now renamed 1.2-IT job and made a 1.3-IT job that runs
against branch-1.3

I kicked off jobs after all this shuffling. We'll see how it goes.

On Tue, Feb 21, 2017 at 5:49 PM, Sean Busbey  wrote:
> FYI, I updated the 1.2-IT and 1.3-IT jobs today to use Appy's
> suggested "custom child workspace" of "${SHORT_COMBINATION}", since
> spaces in paths had caused them to fail for a v long time.
>
> On Fri, Oct 14, 2016 at 4:44 PM, Andrew Purtell  wrote:
>> Thanks Ted, that would be a nice contribution, thank you.
>>
>>
>> On Fri, Oct 14, 2016 at 12:07 PM, Apekshit Sharma  wrote:
>>
>>> @Ted, here's the old jira, HBASE-14167. Use that.
>>>
>>> On Fri, Oct 14, 2016 at 12:02 PM, Ted Yu  wrote:
>>>
>>> > I just ran the tests in hbase-spark module using 'mvn verify'.
>>> >
>>> > All passed.
>>> >
>>> > I am testing a patch locally where hbase-spark tests are run in test
>>> phase.
>>> >
>>> > If the tests pass, I will log a JIRA.
>>> >
>>> > Thanks
>>> >
>>> > > On Oct 14, 2016, at 11:41 AM, Andrew Purtell 
>>> > wrote:
>>> > >
>>> > > The hbase-spark integration tests run (and fail) for me locally
>>> whenever
>>> > I
>>> > > build master with 'mvn clean install -DskipITs' .
>>> > >
>>> > > HBaseConnectionCacheSuite:
>>> > > - all test cases *** FAILED ***
>>> > >  2 did not equal 1 (HBaseConnectionCacheSuite.scala:92)
>>> > >
>>> > > Saw it but had to ignore/triage to get something else done.
>>> > >
>>> > > We have a weird situation where integration tests run when they
>>> shouldn't
>>> > > locally yet no tests run at all for patch process?
>>> > >
>>> > > I would like to see Spark behave like the other modules. I remember
>>> > filing
>>> > > a JIRA asking that hbase-spark honor -DskipITs. It still doesn't.
>>> > > Meanwhile, it does its own thing with '-DskipSparkTests', which is not
>>> > > appropriate given that none of the other modules have their own
>>> distinct
>>> > > control parameters. There also doesn't seem to be a distinction between
>>> > > unit tests and integration tests. The 'test' target does nothing.
>>> > > Everything happens during the 'integration-test' phase. Is this a Spark
>>> > > limitation?
>>> > >
>>> > >
>>> > >> On Fri, Oct 14, 2016 at 11:27 AM, Sean Busbey 
>>> > wrote:
>>> > >>
>>> > >> Do the HBase Spark tests only run during the maven verify command?
>>> > >> We'll need to update our personality to say that that command should
>>> > >> be used for unit tests when in the hbase spark module. ugh.
>>> > >>
>>> > >> On Thu, Oct 13, 2016 at 7:42 PM, Apekshit Sharma 
>>> > >> wrote:
>>> > >>> Our patch process isn't running hbase-spark tests. See this for
>>> > example:
>>> > >>>
>>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/
>>> > >>> https://builds.apache.org/job/PreCommit-HBASE-Build/3842/artifact/patchprocess/patch-unit-hbase-spark.txt/*view*/
>>> > >>>
>>> > >>> Found it when trying to debug cause of trunk failures. Part of the
>>> > cause
>>> > >> is
>>> > >>> hbase-spark's HBaseConnectionCacheSuite test failure (
>>> > >>> https://builds.apache.org/view/All/job/HBase-Trunk_matrix/jdk=JDK%201.8%20(latest),label=yahoo-not-h2/1776/consoleFull
>>> > >>> )
>>> > >>> which was added in HBASE-16638. However, to be fair, QA was green and
>>> > >>> reported passing hbase-spark tests for that jira.
>>> > >>>
>>> >  On Mon, Sep 19, 2016 at 12:57 PM, Stack  wrote:
>>> > 
>>> >  childCustomWorkspace seems to be just the ticket. Nice find Appy.
>>> >  St.Ack
>>> > 
>>> >  On Mon, Sep 19, 2016 at 10:03 AM, Sean Busbey 
>>> > >> wrote:
>>> > 
>>> > > Option 2c looks to be working really well. Thanks for tackling this
>>> > >> Appy!
>>> > >
>>> > > We still have some failures on the master build, but it looks like
>>> > > actual problems (or perhaps a flakey). There are several passing
>>> > > builds.
>>> > >
>>> > > This should be pretty easy to replicate on the other jobs. I don't
>>> > see
>>> > > a downside. Anyone else have concerns?
>>> > >
>>> > >
>>> > > On Fri, Sep 16, 2016 at 6:15 PM, Apekshit Sharma <
>>> a...@cloudera.com>
>>> > > wrote:
>>> > >> So this all started with spaces-in-path issue, right?  I think it
>>> > >> has
>>> > >> gobbled up a lot of time of a lot of people.
>>> > >> Let's discuss our options and try to fix it for good. Here are
>>> what
>>> 

[jira] [Created] (HBASE-17713) the interface '/version/cluster' with header 'Accept: application/json' return is not JSON but plain text

2017-03-01 Thread Feng Ce (JIRA)
Feng Ce created HBASE-17713:
---

 Summary: the interface '/version/cluster' with header 'Accept: 
application/json' return is not JSON but plain text
 Key: HBASE-17713
 URL: https://issues.apache.org/jira/browse/HBASE-17713
 Project: HBase
  Issue Type: Bug
  Components: REST
Affects Versions: 1.2.2
 Environment: HBase 1.2.2
Reporter: Feng Ce
Priority: Minor


In the HBase REST API, when I call the `GET /version/cluster` interface with the header 
`Accept: application/json`, the response is not JSON but plain text.

curl -X GET \
  -H "Accept: application/json" \
  "http://localhost:/version/cluster"
# "1.2.2"

But when I use `Accept: text/xml`, the response is correct XML.

curl -X GET \
  -H "Accept: text/xml" \
  "http://localhost:/version/cluster"
# 1.2.2
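
For anyone reproducing this from Java instead of curl, the request could be built roughly as follows. This is only a sketch: VersionClusterRequest is a hypothetical class, and port 8080 is an assumption, since the report does not give the port. The example only builds the request and does not send it, so it runs without a REST server.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that builds the same GET as the curl commands above.
class VersionClusterRequest {
    // baseUrl is e.g. "http://localhost:8080" (the port is an assumption);
    // accept is the media type to put in the Accept header.
    static HttpRequest build(String baseUrl, String accept) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/version/cluster"))
                .header("Accept", accept)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest json = build("http://localhost:8080", "application/json");
        HttpRequest xml = build("http://localhost:8080", "text/xml");
        // To actually send a request, pass it to
        // java.net.http.HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
        // and compare the body and the Content-Type response header for the two media types.
        System.out.println(json.method() + " " + json.uri() + " " + json.headers().map());
        System.out.println(xml.method() + " " + xml.uri() + " " + xml.headers().map());
    }
}
```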


