+1 I think this is the right step. My hunch is that, in the common data access patterns we see with Accumulo (compared to HBase), per-colfam encryption isn't quite as common a design pattern as it is for HBase (please tell me I'm wrong if anyone disagrees -- this is mostly a gut reaction). I think our users would likely benefit more from a per-namespace/table encryption control like you suggest.

Implementing RFile encryption at the HDFS level (e.g. tying a specific zone/key to a table) is probably straightforward. Changing the TServer's WAL use would likely be trickier to get right (a tserver would need multiple WALs, one for each unique zone/key among the Tablets it happens to host). Maybe worrying about that is getting ahead of things -- just thought about it and figured I'd mention it :)
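
To make the WAL concern a bit more concrete, here is a minimal sketch in plain Java (no Accumulo or HDFS APIs involved). It assumes a hypothetical table-to-key mapping and just collects the distinct keys among the tables a tserver hosts, which would be the number of WALs it needs open under this scheme. The class, method, table, and key names are all made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class WalPerZoneSketch {

    // Hypothetical helper: given the tables whose tablets a tserver hosts and
    // an assumed table -> encryption key mapping, return the distinct keys.
    // Under a one-WAL-per-zone/key scheme, the tserver needs one WAL per entry.
    static Set<String> walKeysNeeded(List<String> hostedTables,
                                     Map<String, String> tableToKey) {
        Set<String> keys = new LinkedHashSet<>();
        for (String table : hostedTables) {
            String key = tableToKey.get(table);
            if (key == null) {
                throw new IllegalArgumentException("no key configured for table " + table);
            }
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        // Assumed configuration: two tables share keyA, one uses keyB.
        Map<String, String> tableToKey = new LinkedHashMap<>();
        tableToKey.put("tenantA.docs", "keyA");
        tableToKey.put("tenantA.index", "keyA");
        tableToKey.put("tenantB.docs", "keyB");

        List<String> hosted = Arrays.asList("tenantA.docs", "tenantA.index", "tenantB.docs");
        System.out.println(walKeysNeeded(hosted, tableToKey)); // prints [keyA, keyB]
    }
}
```

With per-namespace keys the set stays small; with a key per table it grows with however many tables the tserver happens to host, which is the part that seems tricky.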

William Slacum wrote:
Yup, #2. I also don't know if it's worth the effort for that specific
feature. It might be easier to add something like per-namespace and/or
per-table encryption, then define common access patterns for applications
that want to use multiple keys for encryption.



On Wed, Nov 4, 2015 at 8:10 PM, Adam Fuchs <[email protected]> wrote:

Bill,

Do you envision one of the following as the driver behind finer-grained
encryption?:

1. We would only encrypt certain columns in order to get better
performance;

2. We would use different keys on different columns in order to revoke
access to a column via the key store;

3. We would only give a tablet server access to a subset of columns at any
given time in order to protect something, and figure out what to do for
compactions, etc.;

4. Something entirely different...

Seems like thing #2 might have merit, but I'm not sure it's worth the
effort.

Adam
On Nov 4, 2015 7:38 PM, "William Slacum" <[email protected]> wrote:

@Adam, column family level encryption can be useful for multi-tenant
environments, and I think it maps pretty well to the document
partitioning/sharding/wikisearch style tables. Things are trickier in
Accumulo than in HBase since there isn't a 1:1 mapping between column
families and files. The built in RFile encryption scheme seems better
suited to this.

@Christopher & Keith, it's something we can evaluate. Is there a good
test harness for just writing an RFile, opening a reader to it, and just
poking around? I was looking at the constructors and they didn't seem
straightforward enough for me to comprehend them within a few seconds.



On Tue, Nov 3, 2015 at 9:56 PM, Keith Turner <[email protected]> wrote:

On Mon, Nov 2, 2015 at 1:37 PM, Keith Turner <[email protected]> wrote:


On Mon, Nov 2, 2015 at 12:27 PM, William Slacum <[email protected]> wrote:
Is "the code being 'at rest'" you making a funny about active
development? Making sure I haven't lost my ability to get jokes :)

I see two reasons why the code would be inactive: the feature is good
enough as is or it's not interesting enough to attract attention.
Considering it's not public API, there are no discussions to bring it
into the public API, and there's no effort to document how to use it, my
intuition tells me that there isn't enough interest in it from a project
perspective.

From a user perspective, I've been getting asked about it when I work
with Accumulo users. My recommendation, exclusively, is to use HDFS
encryption because I can go to Hadoop's website and find documentation
on it. When I go to find documentation on Accumulo's offerings, any
usability information comes from vendor SlideShares. Most mentions of
the feature on official Apache Accumulo channels echo Christopher's
sentiments on the feature being experimental and not being officially
recommended for use.

I wouldn't want to rip out the feature first and then figure things out
later. Sean already alluded to it, but a roadmap should contain
something (tool or documentation) to help users migrate if we go down
that route.
What I'm trying to figure out is, when the question of "How do I do
encryption at rest in Accumulo?" comes up, what is our community's
answer?
If we went down the route of using HDFS encryption zones, can we offer
the same features? At the very least, we'd be offering the same
database-level

Where does the decryption happen with DFS, is it in the DFS client? If
so, using HDFS level encryption seems to offer the same functionality???

Has anyone written a tool that takes an
Accumulo-encrypted-HDFS-unencrypted-RFile and rewrites it as an
Accumulo-unencrypted-HDFS-encrypted-RFile?  Wondering if there are any
unexpected gotchas w/ this.

I was discussing my questions w/ Christopher today and he mentioned an
experiment that I thought was interesting.   What is the random seek
performance of Accumulo-encrypted-HDFS-unencrypted-RFile vs
Accumulo-unencrypted-HDFS-encrypted-RFile?




encryption scheme. I don't know the details of "more advanced key
stores", but it seems like we could potentially take any custom
implementation and map it to a KeyProvider [1]. I could also envision
table level encryption being implementable via zones, but probably not
down to the column family level.

[1] https://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/crypto/key/KeyProvider.html

On Sun, Nov 1, 2015 at 10:19 AM, Adam Fuchs <[email protected]> wrote:
Responses inline.

Adam

On Nov 1, 2015 9:58 AM, "Christopher" <[email protected]> wrote:
1. I'm not sure I'd call an incomplete solution 'great'. What it does is
provide partial encryption-at-rest protection (unless you're running
without walogs, and have good integration with some external secure key
management facility, and then it's probably fine).

The only thing that doesn't get encrypted is a temporary WAL recovery
file. That is a project we should take on, but it does not imply that
the existing features are not valuable. With HDFS encryption options
this would now be a much easier project to take on. Also, the users I
know that use encryption at rest do so with a more secure key store than
the default.

2. I'm concerned that anybody using Accumulo's E-A-R doesn't necessarily
realize its current shortcomings, or its lack of upstream maintenance
support (which it has not been receiving). It may be the case that these
users have support from an intermediary, and do understand the
shortcomings... I don't know, but it's a concern.

Anybody that creates a secure system has to analyze the security of the
system as a whole. Accumulo's encryption at rest is one part of the
solution. Taking away the tool without providing an alternative does
nothing to improve the security of systems built on Accumulo.

3. Correction: it has been an explicitly experimental feature and an
incomplete one, which hasn't really been touched in two years, and has
been explicitly excluded by the community from being public API because
of its incompleteness. Age doesn't determine public API status. The
community does.

People are using it, so we have to consider the implications of whatever
changes we make and weigh them against the benefits. I believe the last
bug fix was done this year, so I would argue it is being maintained.
Changes to our encryption at rest implementation will have consequences
for those users. There had better be a clear benefit if we break their
systems.

4. Has Accumulo's been evaluated for security and performance? By whom?
Is it published?

Yes, there have been several talks at meetups and conferences that
discuss the security and performance of the current solution.

On Sun, Nov 1, 2015, 08:55 Adam Fuchs <[email protected]> wrote:
There's another way to look at the state of Accumulo's encryption at
rest:
1. Encryption at rest works great for what it does, and the code being
"at rest" isn't necessarily a problem
2. Several organizations are using Accumulo's encryption at rest
effectively in operations
3. Encryption at rest has been a supported configuration option for over
two years with established plugin interfaces, and therefore it should be
considered part of the public API
4. Upstream alternatives (to my knowledge) have not been analyzed for
performance or security

The given option #2 would at least require an analysis of alternatives,
and we would have to decide what to do about backwards compatibility for
users using custom key stores and encryption strategies that may or may
not be supported by upstream alternatives.

As far as option #1 goes, I can get behind encouraging people to take up
projects to improve Accumulo's encryption. I think we're already going
down this path, but without having identified resources to do the
improvements. Any volunteers?

Adam


On Fri, Oct 30, 2015 at 4:22 PM, William Slacum <[email protected]> wrote:
So I've been looking into options for providing encryption at rest, and
it seems like what Accumulo has is abandonware from a project
perspective. There is no official documentation on how to perform
encryption at rest, and the best information on its status comes from
year-old (or older) ticket comments about how the feature is still
experimental. Recently there was a talk that described using HDFS
encryption zones as an alternative.

From my perspective, this is what I see as the current situation:
1- Encryption at rest in Accumulo isn't actively being worked on
2- Encryption at rest in Accumulo isn't part of the public API or
marketed capabilities
3- Documentation for what does exist is scattered throughout Jira
comments or presentations
4- A viable alternative exists that appears to have feature parity in
HDFS encryption
5- HBase has finer grained encryption capabilities that extend beyond
what HDFS provides

Moving forward, what's the consensus for supporting this feature?
Personally, I see two options:

1- Start going down a path to bring the feature into the forefront and
start providing feature parity with HBase

or

2- Remove the feature and place emphasis on upstream encryption
offerings

Any input is welcome & appreciated!


