On 28.2.2012 18:58, Rob Crittenden wrote:
Jan Cholasta wrote:
On 28.2.2012 18:02, Petr Viktorin wrote:
On 02/28/2012 04:45 PM, Rob Crittenden wrote:
Petr Viktorin wrote:
On 02/28/2012 04:02 AM, Rob Crittenden wrote:
Petr Viktorin wrote:
On 02/27/2012 05:10 PM, Rob Crittenden wrote:
Rob Crittenden wrote:
Simo Sorce wrote:
On Mon, 2012-02-27 at 09:44 -0500, Rob Crittenden wrote:
We are pretty trusting that the data coming out of LDAP matches its schema, but it is possible to stuff non-printable characters into most attributes.

I've added a sanity checker to keep a value as a python str type (treated as binary internally). This will result in a base64-encoded blob being returned to the client.

I don't like the idea of having arbitrary binary data where unicode
strings are expected. It might cause some unexpected errors (I have a
feeling that --addattr and/or --delattr and possibly some plugins might
not handle this very well). Wouldn't it be better to just throw away the
value if it's invalid and warn the user?

This isn't for user input, it is for data stored in LDAP. Users are going to have no way to provide binary data to us unless they use the API themselves, in which case they have to follow our rules.

Well, my point was that --addattr and --delattr cause an LDAP search for the given attribute, and plugins might get the result of an LDAP search in their post_callback; I'm not sure they can cope with binary data.




Shouldn't you try to parse it as a unicode string and catch TypeError to know when to return it as binary?

Simo.


What we do now is the equivalent of unicode(chr(0)), which returns u'\x00', and is why we are failing now.

I believe there is a unicode category module; we might be able to use that if there is a category that defines non-printable characters.

rob

Like this:

import unicodedata

def contains_non_printable(val):
    for c in val:
        if unicodedata.category(unicode(c)) == 'Cc':
            return True
    return False

This wouldn't have the exclusion of tab, CR and LF like using ord() does, but is probably more correct.
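For what it's worth, here is an illustrative Python 3 sketch of that variant (the code in this thread is Python 2; the names ALLOWED_CONTROLS and contains_disallowed_control are hypothetical, not from the FreeIPA tree). It uses the same 'Cc' category test but explicitly lets tab, CR and LF through, the way an ord()-based check would:

```python
import unicodedata

# Control characters we still consider printable for our purposes.
# This whitelist is an assumption based on the discussion above.
ALLOWED_CONTROLS = {'\t', '\r', '\n'}

def contains_disallowed_control(val):
    # True if any character is a Unicode control ('Cc') that is not
    # one of the whitelisted controls.
    return any(
        unicodedata.category(c) == 'Cc' and c not in ALLOWED_CONTROLS
        for c in val
    )
```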

rob

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

If you're protecting the XML-RPC calls, it'd probably be better to look at the XML spec directly: http://www.w3.org/TR/xml/#charsets

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF]

I'd say this is a good set for CLI as well.
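A direct Python translation of that Char production might look like the sketch below (function names are illustrative, not from the FreeIPA code base):

```python
def is_xml_char(c):
    # XML 1.0 Char production:
    # Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
    #          | [#x10000-#x10FFFF]
    cp = ord(c)
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def is_valid_xml_text(val):
    # True if every character in val is allowed in an XML document.
    return all(is_xml_char(c) for c in val)
```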

And you can trap invalid UTF-8 sequences by catching the UnicodeDecodeError from decode().
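That is, something along these lines (a minimal Python 3 sketch; the function name is hypothetical):

```python
def decode_or_keep_bytes(raw):
    # Try to decode bytes from LDAP as UTF-8; if the sequence is not
    # valid UTF-8, fall back to returning the raw bytes so the value
    # is treated as binary rather than guessed at.
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        return raw
```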


I don't think we should care about XML-RPC in LDAP-specific code at all.
If you want to do some additional checks, do them in XML-RPC-specific
code.

We are trusting that the data in LDAP matches its schema. This is just belt and suspenders, verifying that that is the case.

Sure, but I still think we should allow any valid unicode data to come from LDAP, not just what is valid in XML-RPC.


rob

Honza

--
Jan Cholasta

