Thanks for pointing this out, Sean. IMHO, after re-checking the code,
HBASE-18118 needs an addendum (at least). The proposal was to set the
storage policy of the WAL directory to HOT by default, but the current
implementation cannot achieve this: it follows the old "NONE" logic and
skips the API call when the configured policy matches the default,
whereas for "HOT" we need an explicit call to HDFS.
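
To make the problem concrete, here is a minimal sketch of that check. It
is not the actual CommonFSUtils code: the class name, the hdfsCalls list
(a stand-in for the real HDFS call) and the method signature are all
hypothetical, and only the "skip when the value equals the default"
behaviour is taken from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WalPolicySketch {
    /** Hypothetical stand-in for HDFS: records each request instead of issuing an RPC. */
    static final List<String> hdfsCalls = new ArrayList<>();

    /** Default value of hbase.wal.storage.policy after HBASE-18118. */
    static final String DEFAULT_WAL_STORAGE_POLICY = "HOT";

    /**
     * Models the early return: when the configured policy equals the
     * default, nothing is sent to HDFS. Returns true only if a call was made.
     */
    static boolean setStoragePolicy(String walDir, String configuredPolicy) {
        if (configuredPolicy.equalsIgnoreCase(DEFAULT_WAL_STORAGE_POLICY)) {
            return false; // skipped: the directory keeps whatever it inherits
        }
        hdfsCalls.add(walDir + "=" + configuredPolicy.toUpperCase(Locale.ROOT));
        return true;
    }

    public static void main(String[] args) {
        System.out.println(setStoragePolicy("/hbase/WALs", "HOT"));     // false
        System.out.println(setStoragePolicy("/hbase/WALs", "ALL_SSD")); // true
        System.out.println(hdfsCalls);                                  // [/hbase/WALs=ALL_SSD]
    }
}
```

With "HOT" as the default, asking for HOT never reaches HDFS, so a COLD
policy set on a parent directory would still apply to the WALs.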

Furthermore, I think the old logic of leaving the default as "NONE" is
even better: if an admin sets the storage policy of the HBase root
directory (hbase.rootdir) to something like ALL_SSD, the WAL will simply
follow it, and if not, the policy is HOT by default.
So maybe reverting HBASE-18118 is the better choice, although I can see
my own +1 on HBASE-18118 there... @Andrew, what's your opinion here?
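
As a rough illustration of why a "NONE" default behaves better, the
sketch below resolves the policy a WAL directory would end up with. The
class and method names are hypothetical and no HDFS API is involved; it
only assumes that an unset directory inherits its parent's policy and
that the HDFS default is HOT.

```java
import java.util.Locale;
import java.util.Optional;

public class NoneDefaultSketch {
    /** Sentinel meaning "do not set a policy explicitly". */
    static final String NONE = "NONE";
    /** HDFS default storage policy when nothing is set on any parent. */
    static final String HDFS_DEFAULT = "HOT";

    /**
     * Resolves the policy the WAL directory ends up with.
     * configured:   value of hbase.wal.storage.policy
     * parentPolicy: policy set on a parent directory by an admin, if any
     */
    static String effectivePolicy(String configured, Optional<String> parentPolicy) {
        if (!NONE.equalsIgnoreCase(configured)) {
            // Any real policy, including HOT, is applied explicitly.
            return configured.toUpperCase(Locale.ROOT);
        }
        // NONE: make no call, so the WAL follows the parent or the HDFS default.
        return parentPolicy.orElse(HDFS_DEFAULT);
    }

    public static void main(String[] args) {
        System.out.println(effectivePolicy("NONE", Optional.of("ALL_SSD"))); // ALL_SSD
        System.out.println(effectivePolicy("NONE", Optional.empty()));       // HOT
        System.out.println(effectivePolicy("HOT", Optional.of("COLD")));     // HOT
    }
}
```

The last case is the one Sean asked about: with "NONE" as the sentinel,
an explicit HOT can override a COLD parent.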

And btw, I have opened HBASE-20479 to document the whole HSM solution
in HBase, including HFile/WAL/Bulkload etc. (but I still haven't had
enough time to complete it), JFYI.


Best Regards,
Yu

On 15 May 2018 at 05:14, Sean Busbey <bus...@apache.org> wrote:

> Hi folks!
>
> I'm trying to reason through our "set a storage policy for WALs"
> feature and having some difficulty. I want to get some feedback before
> I fix our docs or submit a patch to change behavior.
>
> Here's the history of the feature as I understand it:
>
> 1) Starting in HBase 1.1 you can change the setting
> "hbase.wal.storage.policy" and if the underlying Hadoop installation
> supports storage policies[1] then we'll call the needed APIs to set
> policies as we create WALs.
>
> The main use case is to tell HDFS that you want the HBase WAL on SSDs
> in a mixed hardware deployment.
>
> 2) In HBase 1.1 - 1.4, the above setting defaulted to the value
> "NONE". Our utility code for setting storage policies expressly checks
> any config value against the default and when it matches opts to log a
> message rather than call the actual Hadoop API[2]. This is important
> since "NONE" isn't actually a valid storage policy, so if we pass it
> to the Hadoop API we'll get a bunch of log noise.
>
> 3) In HBase 2 and 1.5+, the setting defaults to "HOT" as of
> HBASE-18118. Now if we were to pass the value to the Hadoop API we
> won't get log noise. The utility code does the same check against our
> default. The Hadoop default storage policy is "HOT" so presumably we
> save an RPC call by not setting it again.
>
> ----
>
> If the above is correct, how do I specify that I want WALs to have a
> storage policy of HOT in the event that HDFS already has some other
> policy in place for a parent directory?
>
> e.g. In HBase 1.1 - 1.4, I can set the storage policy (via Hadoop
> admin tools) for "/hbase" to be COLD and I can change
> "hbase.wal.storage.policy" to HOT. In HBase 2 and 1.5+, AFAICT my WALs
> will still have the COLD policy.
>
> Related, but different problem: I can use Hadoop admin tools to set
> the storage policy for "/hbase" to be "ALL_SSD" and if I leave HBase
> configs on defaults then I end up with WALs having "ALL_SSD" as their
> policy in all versions. But in HBase 2 and 1.5+ the HBase configs
> claim the policy is HOT.
>
> Should we always set the policy if the api is available? To avoid
> having to double-configure in something like the second case, do we
> still need a way to say "please do not expressly set a storage
> policy"? (as an alternative we could just call out "be sure to update
> your WAL config" in docs)
>
>
>
> [1]: "Storage Policy" gets called several things in Hadoop, like
> Archival Storage, Heterogeneous Storage, HSM, and "Hierarchical
> Storage". In all cases I'm talking about the feature documented here:
>
> http://hadoop.apache.org/docs/r2.7.5/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
> http://hadoop.apache.org/docs/r3.0.2/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html
>
> I think it's available in Hadoop 2.6.0+, 3.0.0+.
>
> [2]:
>
> In rel/1.2.0 you can see the default check by tracing starting at FSHLog:
>
> https://s.apache.org/BqAk
>
> The constants referred to in that code are in HConstants:
>
> https://s.apache.org/OJyR
>
> And in FSUtils we exit the function early when the default matches
> what we pull out of configs:
>
> https://s.apache.org/A4GA
>
> In rel/2.0.0 the code works essentially the same but has moved around.
> The starting point is now AbstractFSWAL:
>
> https://s.apache.org/pp6T
>
> The constants now use HOT instead of NONE as a default:
>
> https://s.apache.org/7K2J
>
> and in CommonFSUtils we do the same early return:
>
> https://s.apache.org/fYKr
>
