On 06/27/2014 12:46 AM, Shyamsundar Ranganathan wrote:
Wanted to add to the thought process a different angle towards thinking about 
the data classified volumes.

One of the reasons for classifying data (be it tiering or other schemes, like 
mapping high-profile users to high-profile storage backends) is to handle the 
protection of that data differently.

With the current model, where we discuss presenting the entire volume to file 
system clients, we should also think about clients like backup, where the 
backup policy for one sub-volume could differ from the policy for another (or, 
say, geo-replication instead of backup).

I would think other such use cases/clients would need to view parts of the 
volume rather than the whole when performing their function. For example, in 
the backup case the fast tier could be backed up daily and the slow tier 
weekly, in which case one would need volume graphs that split this view for 
the client in question.
Agreed, the proposal sent by Joseph Fernandes a couple of days back suggests
something similar. You might want to look at the presentation he sent, with
the subject line "Proposal for Gluster Compliance Feature".

Regards,
Vivek
Just a thought.

Shyam

----- Original Message -----
From: "Dan Lambright" <dlamb...@redhat.com>
To: "Jeff Darcy" <jda...@redhat.com>
Cc: "Gluster Devel" <gluster-devel@gluster.org>
Sent: Monday, June 23, 2014 4:48:13 PM
Subject: Re: [Gluster-devel] Data classification proposal

A frustrating aspect of Linux is the complexity of /etc configuration file
formats (rsyslog.conf, logrotate, cron, yum repo files, etc.). In that spirit
I would simplify the "select" in the data classification proposal (copied
below) to only accept a list of bricks/sub-tiers with wild-cards '*', rather
than full-blown regular expressions or key/value pairs. I would drop the
"unclaimed" keyword, and not have the "media-type" and "rack" keywords. It does
not seem necessary to introduce new keys for the underlying block device
type (SSD vs disk) any more than we need to express the filesystem (XFS vs
ext4). In other words, I think tiering can be fully expressed in the
configuration file while still abstracting the underlying storage. That
said, the configuration file could be built up by a CLI or GUI, and richer
expressibility could exist at that level.

example:

brick host1:/brick ssd-group0-1

brick host2:/brick ssd-group0-2

brick host3:/brick disk-group0-1

rule tier-1
        select ssd-group0*

rule tier-2
        select disk-group0*

rule all
        select tier-1
        # use repeated "select" to establish order
        select tier-2
        type features/tiering

The filtering option's regular expressions seem hard to avoid. Even if the
file name alone satisfies most use cases (that we know of?), I do not see any
way to avoid regular expressions in the filter options. (Down the road, if we
were to allow complete flexibility in how files can be distributed across
subvolumes, the filtering problem may start to look similar to 90s-era packet
classification, with a solution along the lines of the Berkeley packet
filter.)
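
For example, borrowing the filter syntax from Jeff's proposal below, but with
a regular expression rather than a simple glob on the file name (the exact
condition syntax here is illustrative, not part of either proposal):

rule all
        select tier-1
        select tier-2
        type features/filter
        # route log and temp files to the slow tier by regex on the name
        option filter-condition-1 name:.*\.(log|tmp)$
        option filter-target-1 tier-2
        option default-subvol tier-1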

There may be different rules by which data is distributed at the "tiering"
level. For example, under one tiering policy the fast tier (listed first)
would act as a "cache" for the slow tier (listed second). I think the "option"
keyword could handle that.

rule all
        select tier-1
        # use repeated "select" to establish order
        select tier-2
        type features/tiering
        option tier-cache, mode=writeback, dirty-watermark=80

Another example tiering policy could be based on compliance; when a file
needs to become read-only, it moves from the first listed tier to the
second.

rule all
        select tier-1
        # use repeated "select" to establish order
        select tier-2
        type features/tiering
        option tier-retention

----- Original Message -----
From: "Jeff Darcy" <jda...@redhat.com>
To: "Gluster Devel" <gluster-devel@gluster.org>
Sent: Friday, May 23, 2014 3:30:39 PM
Subject: [Gluster-devel] Data classification proposal

One of the things holding up our data classification efforts (which include
tiering but also other things) has been the extension of the same
conceptual model from the I/O path to the configuration subsystem and
ultimately to the user experience.  How does an administrator define a
tiering policy without tearing their hair out?  How does s/he define a mixed
replication/erasure-coding setup without wanting to rip *our* hair out?  The
included Markdown document attempts to remedy this by proposing one out of
many possible models and user interfaces.  It includes examples for some of
the most common use cases, including the "replica 2.5" case we've been
discussing recently.  Constructive feedback would be greatly appreciated.



# Data Classification Interface

The data classification feature is extremely flexible, to cover use cases from
SSD/disk tiering to rack-aware placement to security or other policies.  With
this flexibility comes complexity.  While this complexity does not affect the
I/O path much, it does affect both the volume-configuration subsystem and the
user interface to set placement policies.  This document describes one possible
model and user interface.

The model we used is based on two kinds of information: brick descriptions and
aggregation rules.  Both are contained in a configuration file (format TBD)
which can be associated with a volume using a volume option.
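
If that option were exposed through the normal CLI, associating a config file
with a volume might look something like the following sketch.  The option name
*data-classification.config-file* is purely illustrative; the document leaves
both the file format and the option itself TBD.

        gluster volume set myvol data-classification.config-file \
                /etc/glusterfs/myvol-classes.conf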

## Brick Descriptions

A brick is described by a series of simple key/value pairs.  Predefined keys
include:

  * **media-type**
    The underlying media type for the brick.  In its simplest form this might
    just be *ssd* or *disk*.  More sophisticated users might use something like
    *15krpm* to represent a faster disk, or *perc-raid5* to represent a brick
    backed by a RAID controller.

  * **rack** (and/or **row**)
    The physical location of the brick.  Some policy rules might be set up to
    spread data across more than one rack.

User-defined keys are also allowed.  For example, some users might use a
*tenant* or *security-level* tag as the basis for their placement policy.
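
For instance, a brick stanza combining predefined and user-defined keys might
look like the following, using the config format shown in the examples later in
this document (the specific values are illustrative):

        brick host4:/brick
                media-type = 15krpm
                rack = r3
                tenant = finance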

## Aggregation Rules

Aggregation rules are used to define how bricks should be combined into
subvolumes, and those potentially combined into higher-level subvolumes, and so
on until all of the bricks are accounted for.  Each aggregation rule consists
of the following parts:

  * **id**
    The base name of the subvolumes the rule will create.  If a rule is applied
    multiple times this will yield *id-0*, *id-1*, and so on.

  * **selector**
    A "filter" for which bricks or lower-level subvolumes the rule will
    aggregate.  This is an expression similar to a *WHERE* clause in SQL, using
    brick/subvolume names and properties in lieu of columns.  These values are
    then matched against literal values or regular expressions, using the usual
    set of boolean operators to arrive at a *yes* or *no* answer to the question
    of whether this brick/subvolume is affected by this rule (see the combined
    rule sketch after this list).

  * **group-size** (optional)
    The number of original bricks/subvolumes to be combined into each produced
    subvolume.  The special default value zero means to collect all original
    bricks or subvolumes into one final subvolume.  In this case, *id* is used
    directly instead of having a numeric suffix appended.

  * **type** (optional)
    The type of the generated translator definition(s).  Examples might include
    "AFR" to do replication, "EC" to do erasure coding, and so on.  The more
    general data classification task includes the definition of new translators
    to do tiering and other kinds of filtering, but those are beyond the scope
    of this document.  If no type is specified, cluster/dht will be used to do
    random placement among its constituents.

  * **tag** and **option** (optional, repeatable)
    Additional tags and/or options to be applied to each newly created
    subvolume.  See the "replica 2.5" example below for how this can be used.
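
Putting these parts together, a single rule might look like the sketch below.
The selector expression is illustrative only, since the file format is still
TBD; the property names follow the brick examples in this document.

        rule fast-replicated
                # match SSD bricks outside rack r3
                select media-type = ssd and rack != r3
                # combine matching bricks into two-way AFR (replica) pairs
                group-size 2
                type cluster/afr
                tag speed = fast

With eight matching bricks, this would produce four two-way replica sets named
*fast-replicated-0* through *fast-replicated-3*.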

Since each type might have unique requirements, such as ensuring that
replication is done across machines or racks whenever possible, it is assumed
that there will be corresponding type-specific scripts or functions to do the
actual aggregation.  This might even be made pluggable some day (TBD).  Once
all rule-based aggregation has been done, volume options are applied similarly
to how they are now.

Astute readers might have noticed that it's possible for a brick to be
aggregated more than once.  This is intentional.  If a brick is part of
multiple aggregates, it will be automatically split into multiple bricks
internally but this will be invisible to the user.

## Examples

Let's start with a simple tiering example.  Here's what the data-classification
config file might look like.

        brick host1:/brick
                media-type = ssd

        brick host2:/brick
                media-type = disk

        brick host3:/brick
                media-type = disk

        rule tier-1
                select media-type = ssd

        rule tier-2
                select media-type = disk

        rule all
                select tier-1
                # use repeated "select" to establish order
                select tier-2
                type features/tiering

This would create a DHT subvolume named *tier-2* for the bricks on *host2* and
*host3*, then add a features/tiering translator to treat *tier-1* as its upper
tier and *tier-2* as its lower.
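
For concreteness, here is a rough sketch of the volume-graph fragment such a
config might expand to.  The volume/type/subvolumes/end-volume volfile syntax
is standard, but the translator and subvolume names simply follow the example
above, and the per-brick protocol translators are omitted.

        volume tier-2
                type cluster/dht
                # the two bricks that matched "media-type = disk"
                subvolumes host2-brick host3-brick
        end-volume

        # tier-1 (the single SSD brick on host1) is defined similarly, omitted
        volume all
                type features/tiering
                # first select = upper (fast) tier, second = lower (slow) tier
                subvolumes tier-1 tier-2
        end-volume

Here's a more complex example that adds replication and erasure coding to the
mix.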

        # Assume 20 hosts, four fast and sixteen slow (named appropriately).

        rule tier-1
                select *fast*
                group-size 2
                type cluster/afr

        rule tier-2
                # special pattern matching otherwise-unused bricks
                select %{unclaimed}
                group-size 8
                type cluster/ec parity=2
                # i.e. two groups, each six data plus two parity

        rule all
                select tier-1
                select tier-2
                type features/tiering
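
Given the group-size rules above, *tier-1* becomes two two-way AFR sets over
the four fast hosts (*tier-1-0* and *tier-1-1*), and *tier-2* becomes two
erasure-coded sets of eight bricks each (six data plus two parity) over the
sixteen slow hosts; the tiering translator then layers the fast pool over the
slow one.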

Lastly, here's an example of "replica 2.5" to do three-way replication for some
files but two-way replication for the rest.

        rule two-way-parts
                select *
                group-size 2
                type cluster/afr

        rule two-way-pool
                select two-way-parts*
                tag special=no

        rule three-way-parts
                # use overlapping selections to demonstrate splitting
                select *
                group-size 3
                type cluster/afr

        rule three-way-pool
                select three-way-parts*
                tag special=yes

        rule sanlock
                select two-way*
                select three-way*
                type features/filter
                # files named *.lock go in the replica-3 pool
                option filter-condition-1 name:*.lock
                option filter-target-1 three-way-pool
                # everything else goes in the replica-2 pool
                option default-subvol two-way-pool
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel