I got the HDFS block report for my cluster. We run with 7-8 RS groups. The
data (primary + replicas) is assigned within an RS group, since we use the
RS Group Load Balancer. For the WAL files, only the primary replica is on
the same server as the HBase region server; the secondary and tertiary
replicas are spread across the cluster, irrespective of RS groups.

In this case, multi-tenancy is still not complete without WAL isolation to
the RS group/tenant.

--
Nikhil Bafna | 8095234263


On Mon, Dec 10, 2018 at 5:25 PM Nikhil Bafna <[email protected]>
wrote:

>
>> one thing that comes to my mind is that most of the time,
>> within a normally functioning HDFS as the file system, WAL file blocks
>> would already be located on nodes of the given RS group due to data
>> locality
>
>
> The primary node hosting the WAL blocks would be the same as the region
> server. But would the secondary and tertiary replicas of the WAL blocks
> also be within the same RS group by default? From the code, I don't see
> any hints passed to HDFS during WAL output stream creation to indicate
> this.
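>
> For reference, the kind of hint I mean is the DistributedFileSystem
> create() overload that accepts favored node addresses. A minimal sketch
> (the helper name, buffer size, and replication factor are illustrative
> values I made up, not HBase code):
>
>   import java.io.IOException;
>   import java.net.InetSocketAddress;
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.fs.FSDataOutputStream;
>   import org.apache.hadoop.fs.Path;
>   import org.apache.hadoop.fs.permission.FsPermission;
>   import org.apache.hadoop.hdfs.DistributedFileSystem;
>
>   public class FavoredWalCreate {
>     // Create a WAL file while hinting HDFS to place all block replicas
>     // on the given datanodes. Favored nodes are best effort: if a node
>     // is unavailable, HDFS falls back to its default placement policy.
>     static FSDataOutputStream createWal(Configuration conf, Path walPath,
>         InetSocketAddress[] favoredNodes) throws IOException {
>       DistributedFileSystem dfs =
>           (DistributedFileSystem) walPath.getFileSystem(conf);
>       return dfs.create(walPath, FsPermission.getFileDefault(),
>           true /* overwrite */, 4096 /* bufferSize */, (short) 3,
>           dfs.getDefaultBlockSize(), null /* no progress callback */,
>           favoredNodes);
>     }
>   }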
>
> --
> Nikhil Bafna | 8095234263
>
>
> On Mon, Dec 10, 2018 at 4:59 PM Wellington Chevreuil <
> [email protected]> wrote:
>
>> Hi Nikhil, yeah, a jira would be more suitable for discussions involving
>> code proposals, with patch reviews.
>>
>> Thinking about the tradeoff between the benefits and the impacts/risks
>> of these changes, one thing that comes to my mind is that most of the
>> time, within a normally functioning HDFS as the file system, WAL file
>> blocks would already be located on nodes of the given RS group due to
>> data locality. So do you feel the given refactoring is still relevant?
>>
>> As a side note, you might want to focus on the branch-2 code base for
>> new features such as this, since there's been discussion about targeting
>> only bug fixes for branch-1 as version 1 approaches EOL.
>>
>> On Mon, Dec 10, 2018 at 10:51 AM Nikhil Bafna <[email protected]>
>> wrote:
>> >
>> > I'm looking at extending HBASE-6721 to apply it to WALs, such that
>> > WALs are created & replicated within an RS group. This extends
>> > multi-tenancy to WALs as well, rather than just covering HBase data.
>> > I was working out of the 1.2.x code.
>> >
>> > The approach I'm using is:
>> > - A Strategy interface for WAL placement on the filesystem. The
>> > default delegates placement to the respective filesystem (the old
>> > behavior); a FavoredNode strategy computes the favored nodes from the
>> > RS group memberships. (A rough sketch follows below.)
>> > - The FavoredNode strategy requires an instance of hbase.Server, to
>> > get the current server name, and a ZooKeeper watch to listen for
>> > changes to RS group memberships.
>> > - The strategy is initialised in HRegionServer init and set in a
>> > static field in DefaultWALProvider.
>> > - DefaultWALProvider.Writer takes the strategy in its init, invokes
>> > it before output stream creation, and passes the favored nodes
>> > information to DistributedFileSystem.create().
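>> >
>> > A rough sketch of the strategy interface and the favored-node variant
>> > (WALPlacementStrategy and the other names here are placeholders for
>> > illustration, not existing HBase APIs; the RS group membership lookup
>> > is stubbed out):
>> >
>> >   import java.net.InetSocketAddress;
>> >   import java.util.List;
>> >
>> >   // Placeholder strategy interface for WAL block placement.
>> >   public interface WALPlacementStrategy {
>> >     // Datanodes to favor for a new WAL file, or null to let HDFS
>> >     // apply its default block placement (the old behavior).
>> >     InetSocketAddress[] favoredNodes();
>> >   }
>> >
>> >   // Default: no hint; placement stays delegated to the filesystem.
>> >   class DefaultPlacementStrategy implements WALPlacementStrategy {
>> >     public InetSocketAddress[] favoredNodes() {
>> >       return null;
>> >     }
>> >   }
>> >
>> >   // Favors the datanodes of the region server's own RS group. The
>> >   // membership list would come from a ZooKeeper watch on the RS
>> >   // group configuration; here it is simply injected.
>> >   class FavoredNodePlacementStrategy implements WALPlacementStrategy {
>> >     private final List<InetSocketAddress> groupMembers;
>> >
>> >     FavoredNodePlacementStrategy(List<InetSocketAddress> members) {
>> >       this.groupMembers = members;
>> >     }
>> >
>> >     public InetSocketAddress[] favoredNodes() {
>> >       return groupMembers.toArray(new InetSocketAddress[0]);
>> >     }
>> >   }
>> >
>> > DefaultWALProvider.Writer would then call favoredNodes() on the
>> > configured strategy just before creating the WAL output stream and
>> > pass the result to DistributedFileSystem.create().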
>> >
>> > A few questions:
>> > - Any glaring miss in the approach?
>> > - I have some hesitation about setting the strategy in a static field
>> > in DefaultWALProvider. I would have preferred it to be passed in
>> > "init" itself, but that change seems too expansive.
>> > - Also, this introduces a dependency on a server/ZooKeeper instance
>> > inside the WAL code path, which doesn't seem to exist so far. Is that
>> > an explicit choice to keep them separate?
>> > - If it seems like a useful change, should I open a JIRA, add patches,
>> > and seek feedback there?
>> >
>> > --
>> > Nikhil Bafna | 8095234263
>>
>
