Dne 3.4.2015 v 05:37 Endi Sukma Dewata napsal(a):
Hi,

Attached are new patches replacing all old ones. Please take a look at
them. They should applied in this order: 365, 353-8, 355-6, 357-3,
359-2, 360-1, 364-1, 361-1.

Thanks for squashing patches 362-364 into the original patches, it's much more digestible this way.


I'm planning to merge the vault and vault container object and use the
vault type attribute to distinguish between the two. See more discussion
about that below.

OK.

3) The container_vault config option should be renamed to
container_vaultcontainer, as it is used in the vaultcontainer plugin,
not the vault plugin.

It was named container_vault because it defines the DN for of the
subtree that contains all vault-related entries. I moved the base_dn
variable from vaultcontainer object to the vault object for clarity.

That does not make much sense to me. Vault objects are contained in
their respective vaultcontainer objects, not directly in cn=vaults.

The cn=vaults itself is actually a vault container (i.e.
ipaVaultContainer). Theoretically you could store a vault in any
container including cn=vaults, but we just don't want people to use it
that way.

I think this is consistent with other plugins. For example, the
container_user points to cn=users, which is an nsContainer. There is no
concept of 'user container' other than the cn=users itself. But even if
there is one, the container_user will still be stored in the user class.

In fact it is not consistent with other plugins. All entries managed by the user plugin are stored *directly* under cn=users. Entries managed by the vault plugin are not stored directly under cn=vaults, but rather anywhere in the cn=vaults subtree and their DN is derived from the DN of the parent vault container. For such objects, we don't set <plugin>.container_dn and don't have container_<plugin> constant, but rather define them as child objects of their container objects.


When the vault & vaultcontainer is merged, this will no longer be an issue.

OK.


4) The vault object should be child of the vaultcontainer object.

Not only is this correct from the object model perspective, but it
would
also make all the container_id hacks go away.

It's a bit difficult because it will affect how the container & vault
ID's are represented on the CLI.

Yes, but the API should be done right (without hacks) first. You can
tune the CLI after that if you want.

I think the current framework is rather limiting. It's kind of hard to
build an interface that looks exactly what you want then add the
implementation later to match the interface because many things are
interrelated. In this particular case the object hierarchy on the server
side would affect how the vault ID will be represented on the client side.

It indeed is limiting and that's a good thing. We don't want people to be able to create any crazy interfaces they can imagine, inconsistent with everything else in IPA.


In the design the container ID would be a single value like this:

   $ ipa vault-add /services/server.example.com/HTTP

And if the vault ID is relative (without initial slash), it will be
appended to the user's private container (i.e. /users/<username>/):

   $ ipa vault-add PrivateVault

The implementation is not complete yet. Currently it accepts this
format:

   $ ipa vault-add <vault name> [--container <container ID>]

and I'm still planning to add this:

   $ ipa vault-add <vault ID>

This is actually now done in the latest patch. Internally the ID is
still split into name & parent ID.

If the vault must be a child of vaultcontainer, and the vaultcontainer
must be a child of a vaultcontainer, does it mean the vault ID would
have to be split into separate arguments like this?

   $ ipa vaultcontainer-add services server.example.com HTTP

If that's the case we'd lose the ability to specify a relative vault ID.

Yes, that's the case.

But I don't think relative IDs should be a problem, we can do this:

     $ ipa vaultcontainer-add a b c  # absolute
     $ ipa vaultcontainer-add . c    # relative

I think a "." will be confusing because there's no concept of "current
vaultcontainer" like "current directory".

or this:

     $ ipa vaultcontainer-add '' a b c  # absolute
     $ ipa vaultcontainer-add c         # relative

An empty string is also confusing and can be problematic to distinguish
with missing argument.

I didn't mean empty string specifically, it could have been any special value.


or this:

     $ ipa vaultcontainer-add a b c         # absolute
     $ ipa vaultcontainer-add c --relative  # relative

or this:

     $ ipa vaultcontainer-add a b c --absolute  # absolute
     $ ipa vaultcontainer-add c                 # relative

Per discussion in the IPA-CS meeting, we'd rather keep the "/" for vault
ID delimiters because the spaces will be confusing to users, but we'll
not use absolute ID anymore.

I'm sorry if I gave you the impression that this is up for discussion, but it is not. You either follow the convention without doing ugly hacks or your patch will not be accepted.

It won't be confusing to users, because they are used to the convention.


It's not implemented yet, but here is the plan. By default the vault
will be created in the user's private container:

   $ ipa vault-add PrivateVault

For shared vaults, instead of specifying an absolute ID we can specify a
--shared option:

   $ ipa vault-add --shared projects/IPA

Same thing with service vaults:

   $ ipa vault-add --service server.example.com/LDAP

To access a vault in another user's private container:

   $ ipa vault-show --user testuser PrivateVault

Fine by me, as long as you follow the convention.


16) You do way too much stuff in vault_add.forward(). Only code that
must be done on the client needs to be there, i.e. handling of the
"data", "text" and "in" options.

The vault_archive call must be in vault_add.execute(), otherwise a) we
will be making 2 RPC calls from the client and b) it won't be called at
all when api.env.in_server is True.

This is done by design. The vault_add.forward() generates the salt and
the keys. The vault_archive.forward() will encrypt the data. These
operations have to be done on the client side to secure the transport of
the data from the client through the server and finally to KRA. This
mechanism prevents the server from looking at the unencrypted data.

OK, but that does not justify that it's broken in server-side API. It
can and should be done so that it works the same way on both client and
server.

I think the best solution would be to split the command into two
commands, server-side vault_archive_raw to archive already encrypted
data, and client-side vault_archive to encrypt data and archive them
with vault_archive_raw in its .execute(). Same thing for vault_retrieve.

Actually I think it's better to just merge the add and archive, reducing
the number of RPC calls. The vault_archive now will support two types of
operations:

(a) Archive data into a new vault (it will create the vault just before
archiving the data):

   $ ipa vault-archive testvault --create --in data ...

(b) Archive data into an existing vault:

   $ ipa vault-archive testvault --in data ...

The vault_add is now just a wrapper for the vault_archive(a).

If that's just an implementation detail, OK.

If it's possible to modify existing vault objects using vault-add or create new objects using vault-archive, then NACK.


BTW, I also think it would be better if there were 2 separate sets of
commands for binary and textual data
(vault_{archive,retrieve}_{data,text}) rather than trying to handle
everything in vault_{archive,retrieve}.

I don't think we want to provide a separate command of every possible
data type & operation combination. Users would get confused. The archive
& retrieve commands should be able to handle all current & future data
types with options.

A command with two sets of mutually exclusive options is really two commands in disguise, which is a good sign it should be divided into two actual commands.

Who are you to say users would get confused? I say they would be more confused by a command with a plethora of mutually exclusive "options".

What other possible data types are there?


The add & archive combination was added for convenience, not for
optimization. This way you would be able to archive data into a new
vault using a single command. Without this, you'd have to execute two
separate commands: add & archive, which will result in 2 RPC calls
anyway.

I think I would prefer if it was separate, as that would be consistent
with other plugins (e.g. for objects with members, we don't allow adding
members directly in -add, you have to use -add-member after -add).

The vault data is similar to group description, not group members. When
creating a group we can supply the description. If not specified it will
be blank. Archiving vault data is similar to updating the group
description.

It's similar to group members because there are separate commands to manipulate them.

You have to choose one of:

a) vault data is settable using vault-add and vault-mod and gettable using vault-mod, vault-show and vault-find

b) vault data is settable using vault-archive and gettable using vault-retrieve

Anything in between is not permitted.


Vault secrets on the other hand is similar to group members. You will
see that in the other patch.


18) Why are vaultcontainer objects automatically created in vault_find?

This is just plain wrong and has to be removed, now.

The code was supposed to create the user's private container like in
#17, but the behavior has been changed. If the container being searched
is the user's private container, it will ignore the container not found
error and return zero results as if the private container already
exists. For other containers the container must already exist. For this
to work I had to add a handle_not_found() into LDAPSearch so the plugins
can customize the proper search response for the missing private
container.

No ad-hoc refactoring please. If you want to refactor anything, it
should be first designed properly and put in a separate patch.

Anyway, what should actually happen here is that if parent object is not
found, its object plugin's handle_not_found is called, i.e. something
like this:

     parent = self.obj.parent_object
     if parent:
         self.api.Object[parent].handle_not_found(*args[:-1])
     else:
         raise errors.NotFound(
             reason=self.obj.container_not_found_msg % {
                 'container': self.obj.container_dn,
             }
         )

It will not work because vault doesn't have a parent object. I'm adding
handle_not_found() into LDAPCreate and LDAPSearch in the first patch.

NACK, this change exists for the sole reason of supporting your hacks. Follow IPA convetions and this change won't be necessary.


21) vault_archive is not a retrieve operation, it should be based on
LDAPUpdate instead of LDAPRetrieve. Or Command actually, since it does
not do anything with LDAP. The same applies to vault_retrieve.

The vault_archive does not actually modify the LDAP entry because it
stores the data in KRA. It is actually an LDAPRetrieve operation because
it needs to get the vault info before it can perform the archival
operation. Same thing with vault_retrieve.

It is not a LDAPRetrieve operation, because it has different semantics.
Please use Command as base class and either use ldap2 for direct LDAP or
call vault_show instead of hacking around LDAPRetrieve.

It's been changed to inherit from LDAPQuery instead.

NACK, it's not a LDAPQuery operation, because it has different semantics. There is more to a command than executing code, so you should use a correct base class.


22) vault_archive will break with binary data that is not UTF-8 encoded
text.

This is where it occurs:

+        vault_data[u'data'] = unicode(data)

Generally, don't use unicode() on str values and str() on unicode
values
directly, always use .decode() and .encode().

The unicode(s, encoding) is actually equivalent to s.decode(encoding),
so the following code will not solve the problem:

   vault_data[u'data'] = data.decode()

As you said, decode() will only work if the data being decoded actually
follows the encoding rules (i.e. already UTF-8 encoded).

It needs to be a Unicode because json.dumps() doesn't work with binary
data. Fixed by adding base-64 encoding.

The base-64 encoding is necessary to convert random binaries into ASCII
so it can be decoded into Unicode. Here is the current code:

   vault_data[u'data'] = unicode(base64.b64encode(data))

which is equivalent to:

   vault_data[u'data'] = base64.b64encode(data).decode()

If you read a little bit further, you would get to the point, which is certainly not calling .decode() without arguments, but *always explicitly specify the encoding*.


If something str needs to be unicode, you should use .decode() to
explicitly specify the encoding, instead of relying on unicode() to pick
the correct one.

Since we know this is ASCII data we can now specify UTF-8 encoding.

   vault_data[u'data'] = base64.b64encode(data).decode('utf-8')

But for anything that comes from user input (e.g. filenames, passwords),
it's better to use the default encoding because that can be configured
by the user.

You are confusing user's configured encoding with Python's default encoding. Default encoding in Python isn't derived from user's localization settings.

Anyway, anything that comes from user input is already decoded using user's configured encoding when it enters the framework so I don't know why are you even bringing it up here.


Anyway, I think a better solution than base64 would be to use the
"raw_unicode_escape" encoding:

As explained above, base-64 encoding is necessary because random
binaries don't follow any encoding rules. I'd rather not use
raw_unicode_escape because it's not really a text data.

The result of decoding binary data using raw_unicode_escape is perfectly valid unicode data which doesn't eat 33% more space like base64 encoded binary does, hence my suggestion.

Anyway, it turns out when encoded in JSON, raw_unicode_escape string generally takes more space than base64 encoded string because of JSON escaping, so base64 is indeed better.

Here's how it's
now implemented:

     if data:
         data = data.decode('raw_unicode_escape')

Input data is already in binaries, no conversion needed.

     elif text:
         data = text

Input text will be converted to binaries with default encoding:

   data = text.encode()

See what the default encoding actually is and why you shouldn't rely on it above.


     elif input_file:
         with open(input_file, 'rb') as f:
             data = f.read()
         data = data.decode('raw_unicode_escape')

Input contains binary data, no conversion needed.

     else:
         data = u''

If not specified, the data will be empty string:

   data = ''

The data needs to be converted into binaries so it can be encrypted
before transport (depending on the vault type):

   data = self.obj.encrypt(data, ...)

     vault_data[u'data'] = data

Then for transport the data is base-64 encoded first, then converted
into Unicode:

   vault_data[u'data'] = base64.b64encode(data).decode('utf-8')


26) Instead of the delete_entry refactoring in baseldap and
vaultcontainer_add, you can put this in vaultcontainer_add's
pre_callback:

     try:
         ldap.get_entries(dn, scope=ldap.SCOPE_ONELEVEL, attrs_list=[])
     except errors.NotFound:
         pass
     else:
         if not options.get('force', False):
             raise errors.NotAllowedOnNonLeaf()

I suppose you meant vaultcontainer_del. Fixed, but this will generate an
additional search for each delete.

I'm leaving the changes baseldap because it may be useful later and it
doesn't change the behavior of the current code.

Again, no ad-hoc refactoring please.

The refactoring has also been moved into a separate patch. Just a note,
I still don't think a plugin should do a search and maybe generate a
NotAllowedOnLeaf exception on each delete operation. The exception
should have been generated automatically by the DS. But we can discuss
that separately.

NACK, turns out there is a better (and preferable) solution I didn't remember before, you can use exception callback in vaultcontainer_del:

def exc_callback(self, keys, options, exc, call_func, *call_args, **call_kwargs):
        if call_func.func_name == 'delete_entry':
            if isinstance(exc, errors.NotAllowedOnLeaf):
                if not options.get('force', False):
                    raise errors.DatabaseError(...)
        raise exc


28) The vault and vaultcontainer plugins seem to be pretty similar, I
think it would make sense to put common stuff in a base class and
inherit vault and vaultcontainer from that.

I plan to refactor the common code later. Right now the focus is to get
the functionality working correctly first.

Please do it now, "later" usually means "never". It shouldn't be too
hard and I can give you a hand with it if you want.

As mentioned above, I'm considering merging the vault & vault container
classes, so no need to refactor the common code out of these classes.
This will be delivered as a separate patch later.

OK.


Thanks.



--
Jan Cholasta

--
Manage your subscription for the Freeipa-devel mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-devel
Contribute to FreeIPA: http://www.freeipa.org/page/Contribute/Code

Reply via email to