[ANNOUNCE] HBaseConAsia 2018 CFP now open!

2018-05-14 Thread Yu Li
All,

I'm pleased to announce HBaseConAsia 2018, which will be held in Beijing,
China on Aug. 17th.

The call for proposals is now open[1], and we encourage all HBase users
and developers to submit a talk[2] and plan to attend the event (note
that event registration is not yet available).

We will post more details about the event at [3] (not available yet, but it
will be soon). Please watch that page, and feel free to ask on the
dev@hbase.apache.org mailing list or contact me directly with any questions.

Thanks and please start planning those talks!

- Yu (on behalf of the HBase PMC)

[1] https://easychair.org/cfp/hbaseconasia-2018
[2] https://easychair.org/conferences/?conf=hbaseconasia2018
[3] https://hbase.apache.org/hbaseconasia-2018/


Re: WAL storage policies and interactions with Hadoop admin tools.

2018-05-14 Thread Yu Li
Thanks for pointing this out, Sean. IMHO, after re-checking the code,
HBASE-18118 needs an addendum (at least). The proposal was to set the
storage policy of the WAL directory to HOT by default, but the current
implementation cannot achieve this: it follows the old "NONE" logic and
skips calling the API when the configured policy matches the default,
whereas for "HOT" we need an explicit call to HDFS.
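
To make that concrete, the addendum I have in mind would compare against the
invalid placeholder rather than against whatever the default happens to be (a
minimal sketch, not a committed patch; names are illustrative):
{code:java}
import java.io.IOException;
import java.util.Locale;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class WalPolicySketch {
  static void setWalStoragePolicy(FileSystem fs, Configuration conf,
      Path walDir, String defaultPolicy) throws IOException {
    String policy = conf.get("hbase.wal.storage.policy", defaultPolicy)
        .trim().toUpperCase(Locale.ROOT);
    if ("NONE".equals(policy)) {
      // "NONE" is not a real HDFS policy: skip the call and let the WAL
      // directory inherit whatever policy hbase.rootdir carries.
      return;
    }
    // Any real policy (HOT, ALL_SSD, ...) gets an explicit call so it can
    // override an inherited parent policy; the real FSUtils code does this
    // reflectively on Hadoop versions lacking FileSystem#setStoragePolicy.
    fs.setStoragePolicy(walDir, policy);
  }
}
{code}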

Furthermore, I think the old logic of leaving the default as "NONE" is
actually better: if the admin sets hbase.rootdir to some policy like
ALL_SSD, the WAL will simply follow it, and if not, the policy is HOT by
default anyway. So maybe reverting HBASE-18118 is the better choice,
although I can see my own +1 on HBASE-18118 there... @Andrew, what's your
opinion here?

And btw, I have opened HBASE-20479 to document the whole HSM solution in
HBase, including HFile/WAL/bulk load etc. (though I still haven't had
enough time to complete it). JFYI.


Best Regards,
Yu

On 15 May 2018 at 05:14, Sean Busbey wrote:

> Hi folks!
>
> I'm trying to reason through our "set a storage policy for WALs"
> feature and having some difficulty. I want to get some feedback before
> I fix our docs or submit a patch to change behavior.
>
> Here's the history of the feature as I understand it:
>
> 1) Starting in HBase 1.1 you can change the setting
> "hbase.wal.storage.policy" and if the underlying Hadoop installation
> supports storage policies[1] then we'll call the needed APIs to set
> policies as we create WALs.
>
> The main use case is to tell HDFS that you want the HBase WAL on SSDs
> in a mixed hardware deployment.
>
> 2) In HBase 1.1 - 1.4, the above setting defaulted to the value
> "NONE". Our utility code for setting storage policies expressly checks
> any config value against the default and when it matches opts to log a
> message rather than call the actual Hadoop API[2]. This is important
> since "NONE" isn't actually a valid storage policy, so if we pass it
> to the Hadoop API we'll get a bunch of log noise.
>
> 3) In HBase 2 and 1.5+, the setting defaults to "HOT" as of
> HBASE-18118. Now if we were to pass the value to the Hadoop API we
> won't get log noise. The utility code does the same check against our
> default. The Hadoop default storage policy is "HOT" so presumably we
> save an RPC call by not setting it again.
>
> 
>
> If the above is correct, how do I specify that I want WALs to have a
> storage policy of HOT in the event that HDFS already has some other
> policy in place for a parent directory?
>
> e.g. In HBase 1.1 - 1.4, I can set the storage policy (via Hadoop
> admin tools) for "/hbase" to be COLD and I can change
> "hbase.wal.storage.policy" to HOT. In HBase 2 and 1.5+, AFAICT my WALs
> will still have the COLD policy.
>
> Related, but different problem: I can use Hadoop admin tools to set
> the storage policy for "/hbase" to be "ALL_SSD" and if I leave HBase
> configs on defaults then I end up with WALs having "ALL_SSD" as their
> policy in all versions. But in HBase 2 and 1.5+ the HBase configs
> claim the policy is HOT.
>
> Should we always set the policy if the API is available? To avoid
> having to double-configure in something like the second case, do we
> still need a way to say "please do not expressly set a storage
> policy"? (as an alternative we could just call out "be sure to update
> your WAL config" in docs)
>
>
>
> [1]: "Storage Policy" gets called several things in Hadoop, like
> Archival Storage, Heterogeneous Storage, HSM, and "Hierarchical
> Storage". In all cases I'm talking about the feature documented here:
>
> http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
> http://hadoop.apache.org/docs/r3.0.2/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
>
> I think it's available in Hadoop 2.6.0+, 3.0.0+.
>
> [2]:
>
> In rel/1.2.0 you can see the default check by tracing starting at FSHLog:
>
> https://s.apache.org/BqAk
>
> The constants referred to in that code are in HConstants:
>
> https://s.apache.org/OJyR
>
> And in FSUtils we exit the function early when the default matches
> what we pull out of configs:
>
>  https://s.apache.org/A4GA
>
> In rel/2.0.0 the code works essentially the same but has moved around.
> The starting point is now AbstractFSWAL:
>
> https://s.apache.org/pp6T
>
> The constants now use HOT instead of NONE as a default:
>
> https://s.apache.org/7K2J
>
> and in CommonFSUtils we do the same early return:
>
> https://s.apache.org/fYKr
>


[jira] [Created] (HBASE-20583) SplitLogWorker should handle FileNotFoundException when split a wal

2018-05-14 Thread Guanghao Zhang (JIRA)
Guanghao Zhang created HBASE-20583:
--

 Summary: SplitLogWorker should handle FileNotFoundException when 
split a wal
 Key: HBASE-20583
 URL: https://issues.apache.org/jira/browse/HBASE-20583
 Project: HBase
  Issue Type: Bug
Reporter: Guanghao Zhang
Assignee: Guanghao Zhang


When a split task is finished, the master deletes the WAL first, then removes 
the task's zk node. So if the master crashes after deleting the WAL, the task's 
zk node may be left behind on zk. When the master resubmits this task, the task 
will fail with FileNotFoundException.

We already handle FileNotFoundException in WALSplitter, but we don't handle it 
in SplitLogWorker.

 
{code:java}
try { // outer try; earlier statements elided in this excerpt
  try {
    in = getReader(path, reporter);
  } catch (EOFException e) {
    if (length <= 0) {
      // TODO should we ignore an empty, not-last log file if skip.errors
      // is false? Either way, the caller should decide what to do. E.g.
      // ignore if this is the last log in sequence.
      // TODO is this scenario still possible if the log has been
      // recovered (i.e. closed)
      LOG.warn("Could not open {} for reading. File is empty", path, e);
    }
    // EOFException being ignored
    return null;
  }
} catch (IOException e) {
  if (e instanceof FileNotFoundException) {
    // A wal file may not exist anymore. Nothing can be recovered so move on
    LOG.warn("File {} does not exist anymore", path, e);
    return null;
  }
}
{code}
{code:java}
// Here fs.getFileStatus may throw FileNotFoundException too. We should handle
// that exception the same way WALSplitter.getReader does.
try {
  if (!WALSplitter.splitLogFile(walDir,
      fs.getFileStatus(new Path(walDir, filename)),
      fs, conf, p, sequenceIdChecker,
      server.getCoordinatedStateManager().getSplitLogWorkerCoordination(),
      factory)) {
    return Status.PREEMPTED;
  }
} // catch blocks elided in this excerpt
{code}
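
A possible shape of the fix (a hedged sketch only, not the final patch; the 
surrounding names come from the snippet above, and returning DONE for an 
already-deleted WAL is an assumption):
{code:java}
FileStatus walStatus;
try {
  walStatus = fs.getFileStatus(new Path(walDir, filename));
} catch (FileNotFoundException fnfe) {
  // The WAL is already gone, e.g. the master crashed after deleting it but
  // before removing the zk task node, so there is nothing left to split.
  LOG.warn("WAL {} does not exist anymore", filename, fnfe);
  return Status.DONE; // assumption: treat the resubmitted task as finished
}
if (!WALSplitter.splitLogFile(walDir, walStatus, fs, conf, p, sequenceIdChecker,
    server.getCoordinatedStateManager().getSplitLogWorkerCoordination(),
    factory)) {
  return Status.PREEMPTED;
}
{code}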
 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20582) Bump up the Jackson and Jruby version because of some reported vulnerabilities

2018-05-14 Thread Ankit Singhal (JIRA)
Ankit Singhal created HBASE-20582:
-

 Summary: Bump up the Jackson and Jruby version because of some 
reported vulnerabilities
 Key: HBASE-20582
 URL: https://issues.apache.org/jira/browse/HBASE-20582
 Project: HBase
  Issue Type: Bug
Reporter: Ankit Singhal
Assignee: Ankit Singhal
 Fix For: 2.1.0


There are some vulnerabilities reported with two of the libraries used in HBase.

{code}
Jackson(version:2.9.2):
CVE-2017-17485
CVE-2018-5968
CVE-2018-7489

Jruby(version:9.1.10.0):
CVE-2009-5147
CVE-2013-4363
CVE-2014-4975
CVE-2014-8080
CVE-2014-8090
CVE-2015-3900
CVE-2015-7551
CVE-2015-9096
CVE-2017-0899
CVE-2017-0900
CVE-2017-0901
CVE-2017-0902
CVE-2017-0903
CVE-2017-10784
CVE-2017-14064
CVE-2017-9224
CVE-2017-9225
CVE-2017-9226
CVE-2017-9227
CVE-2017-9228
{code}

The tool was somehow able to relate Ruby vulnerabilities to JRuby (the Java 
implementation).

Not all of them directly affect HBase, but it is better to be on updated 
versions to avoid issues during an audit in a security-sensitive organization.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


WAL storage policies and interactions with Hadoop admin tools.

2018-05-14 Thread Sean Busbey
Hi folks!

I'm trying to reason through our "set a storage policy for WALs"
feature and having some difficulty. I want to get some feedback before
I fix our docs or submit a patch to change behavior.

Here's the history of the feature as I understand it:

1) Starting in HBase 1.1 you can change the setting
"hbase.wal.storage.policy" and if the underlying Hadoop installation
supports storage policies[1] then we'll call the needed APIs to set
policies as we create WALs.

The main use case is to tell HDFS that you want the HBase WAL on SSDs
in a mixed hardware deployment.
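
For example, something like the following (a sketch; ONE_SSD and ALL_SSD
are standard HDFS policies, shown only as illustrative values):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class WalOnSsdExample {
  public static void main(String[] args) {
    // Equivalent to setting hbase.wal.storage.policy in hbase-site.xml:
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.wal.storage.policy", "ONE_SSD"); // or ALL_SSD
  }
}
{code}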

2) In HBase 1.1 - 1.4, the above setting defaulted to the value
"NONE". Our utility code for setting storage policies expressly checks
any config value against the default and when it matches opts to log a
message rather than call the actual Hadoop API[2]. This is important
since "NONE" isn't actually a valid storage policy, so if we pass it
to the Hadoop API we'll get a bunch of log noise.

3) In HBase 2 and 1.5+, the setting defaults to "HOT" as of
HBASE-18118. Now if we were to pass the value to the Hadoop API we
won't get log noise. The utility code does the same check against our
default. The Hadoop default storage policy is "HOT" so presumably we
save an RPC call by not setting it again.
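
In other words, the guard looks roughly like this (a paraphrase of the
code at [2], not the literal implementation):
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class PolicyGuardSketch {
  static void maybeSetStoragePolicy(FileSystem fs, Configuration conf,
      Path path, String defaultPolicy) throws IOException {
    String policy = conf.get("hbase.wal.storage.policy", defaultPolicy);
    if (defaultPolicy.equals(policy)) {
      // 1.1-1.4: default "NONE" is not a real policy, so skipping is right.
      // 2.0/1.5+: default "HOT" short-circuits here too, so an explicitly
      // configured HOT never reaches HDFS.
      return;
    }
    fs.setStoragePolicy(path, policy); // reflective on older Hadoop releases
  }
}
{code}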



If the above is correct, how do I specify that I want WALs to have a
storage policy of HOT in the event that HDFS already has some other
policy in place for a parent directory?

e.g. In HBase 1.1 - 1.4, I can set the storage policy (via Hadoop
admin tools) for "/hbase" to be COLD and I can change
"hbase.wal.storage.policy" to HOT. In HBase 2 and 1.5+, AFAICT my WALs
will still have the COLD policy.

Related, but different problem: I can use Hadoop admin tools to set
the storage policy for "/hbase" to be "ALL_SSD" and if I leave HBase
configs on defaults then I end up with WALs having "ALL_SSD" as their
policy in all versions. But in HBase 2 and 1.5+ the HBase configs
claim the policy is HOT.

Should we always set the policy if the API is available? To avoid
having to double-configure in something like the second case, do we
still need a way to say "please do not expressly set a storage
policy"? (as an alternative we could just call out "be sure to update
your WAL config" in docs)



[1]: "Storage Policy" gets called several things in Hadoop, like
Archival Storage, Heterogeneous Storage, HSM, and "Hierarchical
Storage". In all cases I'm talking about the feature documented here:

http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
http://hadoop.apache.org/docs/r3.0.2/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html

I think it's available in Hadoop 2.6.0+, 3.0.0+.

[2]:

In rel/1.2.0 you can see the default check by tracing starting at FSHLog:

https://s.apache.org/BqAk

The constants referred to in that code are in HConstants:

https://s.apache.org/OJyR

And in FSUtils we exit the function early when the default matches
what we pull out of configs:

 https://s.apache.org/A4GA

In rel/2.0.0 the code works essentially the same but has moved around.
The starting point is now AbstractFSWAL:

https://s.apache.org/pp6T

The constants now use HOT instead of NONE as a default:

https://s.apache.org/7K2J

and in CommonFSUtils we do the same early return:

https://s.apache.org/fYKr


[jira] [Resolved] (HBASE-20580) Fixed the flaky TestEnableRSGroup which was caused by HBASE-20544

2018-05-14 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey resolved HBASE-20580.
-
Resolution: Duplicate
  Assignee: (was: Zheng Hu)

resolving as duplicate now that the addendum to HBASE-20544 has landed.

> Fixed the flaky TestEnableRSGroup which was caused by HBASE-20544
> -
>
> Key: HBASE-20580
> URL: https://issues.apache.org/jira/browse/HBASE-20580
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20581) HBase book documentation wrong for REST operations on schema endpoints

2018-05-14 Thread Josh Elser (JIRA)
Josh Elser created HBASE-20581:
--

 Summary: HBase book documentation wrong for REST operations on 
schema endpoints
 Key: HBASE-20581
 URL: https://issues.apache.org/jira/browse/HBASE-20581
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Josh Elser
Assignee: Josh Elser


On [https://hbase.apache.org/book.html#_using_rest_endpoints]

The documentation states that to update a table schema (the configuration for a 
column family), the {{PUT}} HTTP verb will update the current configuration 
with the "fragment" of configuration provided, while the {{POST}} HTTP verb 
will replace the current configuration with whatever is provided.

In reality, the opposite is true: {{POST}} updates the configuration and {{PUT}} 
replaces it. The old javadoc for the o.a.h.h.rest package got it right, but the 
entry in the HBase book transposed the two.
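
A hedged illustration of the corrected semantics (the gateway address, table
name, and schema fragment below are made up for the example):
{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class SchemaVerbExample {
  static int send(String verb, String xml) throws Exception {
    URL url = new URL("http://localhost:8080/t1/schema");
    HttpURLConnection c = (HttpURLConnection) url.openConnection();
    c.setRequestMethod(verb);
    c.setDoOutput(true);
    c.setRequestProperty("Content-Type", "text/xml");
    try (OutputStream out = c.getOutputStream()) {
      out.write(xml.getBytes(StandardCharsets.UTF_8));
    }
    return c.getResponseCode();
  }

  public static void main(String[] args) throws Exception {
    String fragment = "<TableSchema name=\"t1\"><ColumnSchema name=\"cf1\" "
        + "VERSIONS=\"3\"/></TableSchema>";
    send("POST", fragment); // POST: merge the fragment into the current schema
    send("PUT", fragment);  // PUT: replace the schema with exactly this document
  }
}
{code}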



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20580) Fixed the flaky TestEnableRSGroup which was caused by HBASE-20544

2018-05-14 Thread Zheng Hu (JIRA)
Zheng Hu created HBASE-20580:


 Summary: Fixed the flaky TestEnableRSGroup which was caused by 
HBASE-20544
 Key: HBASE-20580
 URL: https://issues.apache.org/jira/browse/HBASE-20580
 Project: HBase
  Issue Type: Bug
Reporter: Zheng Hu
Assignee: Zheng Hu






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)