On 28.2.2012 18:58, Rob Crittenden wrote:
Jan Cholasta wrote:
On 28.2.2012 18:02, Petr Viktorin wrote:
On 02/28/2012 04:45 PM, Rob Crittenden wrote:
Petr Viktorin wrote:
On 02/28/2012 04:02 AM, Rob Crittenden wrote:
Petr Viktorin wrote:
On 02/27/2012 05:10 PM, Rob Crittenden wrote:
Rob Crittenden wrote:
Simo Sorce wrote:
On Mon, 2012-02-27 at 09:44 -0500, Rob Crittenden wrote:
We are pretty trusting that the data coming out of LDAP matches its schema, but it is possible to stuff non-printable characters into most attributes.

I've added a sanity checker to keep a value as a python str type (treated as binary internally). This will result in a base64-encoded blob being returned to the client.

I don't like the idea of having arbitrary binary data where unicode
strings are expected. It might cause some unexpected errors (I have a
feeling that --addattr and/or --delattr and possibly some plugins might
not handle this very well). Wouldn't it be better to just throw away the
value if it's invalid and warn the user?

This isn't for user input, it is for data stored in LDAP. Users are going to have no way to provide binary data to us unless they use the API themselves, in which case they have to follow our rules.

Well, my point was that --addattr and --delattr cause an LDAP search for the given attribute, and plugins might get the result of an LDAP search in their post_callback; I'm not sure they can cope with binary data.




Shouldn't you try to parse it as a unicode string and catch TypeError to know when to return it as binary?

Simo.


What we do now is the equivalent of unicode(chr(0)), which returns u'\x00', and is why we are failing now.

I believe there is a unicode category module; we might be able to use that if there is a category that defines non-printable characters.

rob

Like this:

import unicodedata

def contains_non_printable(val):
    for c in val:
        if unicodedata.category(unicode(c)) == 'Cc':
            return True
    return False

This wouldn't have the exclusion of tab, CR and LF like using ord() does, but is probably more correct.
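For what it's worth, here is an illustrative Python 3 sketch of that variant (the code in this thread is Python 2; the names ALLOWED_CONTROLS and contains_disallowed_control are hypothetical, not from the FreeIPA tree). It uses the same 'Cc' category test but explicitly lets tab, CR and LF through, the way an ord()-based check would:

```python
import unicodedata

# Control characters we still consider printable for our purposes.
# This whitelist is an assumption based on the discussion above.
ALLOWED_CONTROLS = {'\t', '\r', '\n'}

def contains_disallowed_control(val):
    # True if any character is a Unicode control ('Cc') that is not
    # one of the whitelisted controls.
    return any(
        unicodedata.category(c) == 'Cc' and c not in ALLOWED_CONTROLS
        for c in val
    )
```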

rob

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

If you're protecting the XML-RPC calls, it'd probably be better to look at the XML spec directly: http://www.w3.org/TR/xml/#charsets

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF]

I'd say this is a good set for CLI as well.
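A direct Python translation of that Char production might look like the sketch below (function names are illustrative, not from the FreeIPA code base):

```python
def is_xml_char(c):
    # XML 1.0 Char production:
    # Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
    #          | [#x10000-#x10FFFF]
    cp = ord(c)
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def is_valid_xml_text(val):
    # True if every character in val is allowed in an XML document.
    return all(is_xml_char(c) for c in val)
```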

And you can trap invalid UTF-8 sequences by catching the UnicodeDecodeError from decode().
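That is, something along these lines (a minimal Python 3 sketch; the function name is hypothetical):

```python
def decode_or_keep_bytes(raw):
    # Try to decode bytes from LDAP as UTF-8; if the sequence is not
    # valid UTF-8, fall back to returning the raw bytes so the value
    # is treated as binary rather than guessed at.
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        return raw
```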


I don't think we should care about XML-RPC in LDAP-specific code at all.
If you want to do some additional checks, do them in XML-RPC-specific
code.

We are trusting that the data in LDAP matches its schema. This is just belt and suspenders, verifying that that is the case.

Sure, but I still think we should allow any valid unicode data to come from LDAP, not just what is valid in XML-RPC.


rob

Honza

--
Jan Cholasta

