On 08/10/2015 03:10 PM, Emmanuel Lécharny wrote:
On 10/08/15 13:33, Radovan Semancik wrote:
On 08/10/2015 12:42 PM, Emmanuel Lécharny wrote:
There is no flag that says an Attribute is H-R or not. The
information is provided in RFC 2252, section 4.3.2
<https://tools.ietf.org/html/rfc2252#section-4.3.2>
Hmm, I saw code for parsing "X-NOT-HUMAN-READABLE", so I thought
that it might be caused by this. Thanks for the clarification. Anyway,
the strange thing is that the syntax 1.3.6.1.4.1.1466.115.121.1.28
appears to be human readable.
Which it is not:

version: 1
dn: m-oid=1.3.6.1.4.1.1466.115.121.1.28,ou=syntaxes,cn=system,ou=schema
objectclass: top
objectclass: metaTop
objectclass: metaSyntax
m-oid: 1.3.6.1.4.1.1466.115.121.1.28
m-description: JPEG
m-obsolete: FALSE
x-not-human-readable: TRUE
entrycsn: 20100111202214.878000Z#000000#000#000000
creatorsname: uid=admin,ou=system
createtimestamp: 20100111145217Z

Depends on the server. OpenLDAP defines the syntax like this:

ldapSyntaxes: ( 1.3.6.1.4.1.1466.115.121.1.28 DESC 'JPEG' X-NOT-HUMAN-READABLE
  'TRUE' )

But OpenDJ like this:

ldapSyntaxes: ( 1.3.6.1.4.1.1466.115.121.1.28 DESC 'JPEG' )

This is probably the difference. (And thanks for pointing that out. I completely forgot that the syntax declaration is also part of the schema.)

I believe that the API works with ApacheDS :-) ... but my goal is to make it work with other LDAP servers as well. And the detection of H/R is clearly wrong with OpenDJ, so I'm trying to figure out what's going on. It now looks like the OpenDJ declaration of the syntax is correct. I would expect that if no X-NOT-HUMAN-READABLE clause is present, then the H/R flag will be set according to the RFC. But it is not. The API seems to assume "true" as the default for the H/R flag. Is this a bug in the API?

One more data point. This is the same test program run against eDirectory. Same problem:

jpegPhoto AttributeType = attributetype ( 0.9.2342.19200300.100.1.60 NAME 'jpegPhoto'
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.40
    USAGE userApplications )
jpegPhoto syntax = ldapsyntax ( 1.3.6.1.4.1.1466.115.121.1.40
    X-NOT-HUMAN-READABLE 'false' )
jpegPhoto syntax H/R = true

eDirectory syntax definition:

ldapSyntaxes: ( 1.3.6.1.4.1.1466.115.121.1.28 X-NDS_SYNTAX '9' )

X-NOT-HUMAN-READABLE 'false', which means it's human readable. But I
guess OpenDJ does *not* set the X-NOT-HUMAN-READABLE flag, while
OpenLDAP does.

Yes, that really seems to be the case.

So you expect the server or the client to *know* magically that this
attribute is H/R when connected to OpenDJ, right? (irony)

No magic needed here (although some magic might come in very handy with some LDAP servers :-) ) .... I just expect that when no X-NOT-HUMAN-READABLE is present, the default from the RFC is used. Isn't that a reasonable expectation?
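That expectation could be sketched like this (illustrative code, not the actual API; the class and method names are made up, and the OID list is a small excerpt of the RFC 2252, section 4.3.2 table): when the schema entry carries no X-NOT-HUMAN-READABLE extension, fall back to the RFC defaults instead of assuming human readable.

```java
import java.util.Set;

public class HumanReadableDefaults {
    // A few of the syntaxes that RFC 2252 marks as not human readable
    // (Audio, Certificate, JPEG); the full table is in section 4.3.2.
    private static final Set<String> NOT_HUMAN_READABLE_OIDS = Set.of(
        "1.3.6.1.4.1.1466.115.121.1.4",  // Audio
        "1.3.6.1.4.1.1466.115.121.1.8",  // Certificate
        "1.3.6.1.4.1.1466.115.121.1.28"  // JPEG
    );

    // xNotHumanReadable is the raw extension value, or null when absent
    public static boolean isHumanReadable(String syntaxOid, String xNotHumanReadable) {
        if (xNotHumanReadable != null) {
            // When the extension is present, it wins
            return !"TRUE".equalsIgnoreCase(xNotHumanReadable);
        }
        // No extension: consult the RFC table instead of defaulting to true
        return !NOT_HUMAN_READABLE_OIDS.contains(syntaxOid);
    }
}
```

With this fallback, the OpenDJ-style declaration of the JPEG syntax (no extension at all) would still come out as not human readable.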

Yes, that's true. The rationale is that we make a best effort to inject
values correctly, converting them on the fly.

Note that this H-R flag itself is stupid. It was added 12 years ago as a
way to follow the RFC, but as a matter of fact, the Syntax itself
already drives the type of data we can store in an Attribute. I made it
even more complex by trying to use generics. Now we have those
StringValue and BinaryValue all over the code.

Ideally, we should not have to care about what we store, and should
always consider the stored values as byte[]. OTOH, it's not convenient
when we want to manipulate values as Strings, as converting them over
and over from byte[] to String is costly (especially in the server). But
I do think we went way too far here. This conversion should be done
internally once, and that's it. It would save us a hell of a lot of
time, and would make the API more comfortable to use.
I tend to agree. Always storing the value as binary seems to be a good
idea.
Depends. From the performance POV, this is killing the server. Most of
the ATs are H/R, and require some checks (comparison, normalization,
etc.) during the processing of every request. Having only the binary
value forces the server to do the conversion back and forth multiple
times. We faced this issue, and when we switched to StringValue and
BinaryValue, the performance boost was huge (100%).

Ideally, we should have 2 methods :
- getBinaryValue()
- getStringValue()

because we always know which type we are dealing with. But that's the
point: in the server, for operations involving many attributes, that
would require a check on the Syntax every time we want to manipulate a
value, which is a bit of a PITA, especially when we don't care about
the type. Having a Value<?> wrapper helps a lot here...

I understand. And storing converted string values is not really a problem, as long as the binary value is the primary one. The current StringValue implementation has it the other way around, and this causes problems. E.g. I have the binary value 2e254d883270c44cd7ae2e254d883270. The bytes 0x88 and 0xC4 are not valid UTF-8 sequences, so if the value is converted to a string, it will contain those strange replacement characters. And when converted back to binary it becomes 2e254defbfbd3270efbfbd4cd7ae2e254defbfbd3270 ... both 0x88 and 0xC4 are translated to 'efbfbd' and the data is ruined.
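The corruption described above is easy to reproduce with nothing but the JDK (this is a standalone illustration, not API code): Java's String decoder replaces each malformed UTF-8 sequence with U+FFFD, which re-encodes as the three bytes EF BF BD.

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    // Decode bytes as UTF-8 and re-encode them: any invalid sequence
    // comes back as the replacement character U+FFFD (EF BF BD).
    public static byte[] roundTrip(byte[] original) {
        String asString = new String(original, StandardCharsets.UTF_8);
        return asString.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0x88 is a bare continuation byte: not valid UTF-8 on its own
        byte[] data = { 0x2e, 0x25, 0x4d, (byte) 0x88, 0x32, 0x70 };
        byte[] back = roundTrip(data);
        // 0x88 came back as three replacement bytes: the data is ruined
        System.out.println(data.length + " bytes in, " + back.length + " bytes out");
    }
}
```

Running this prints "6 bytes in, 8 bytes out": the single invalid byte grew into EF BF BD, exactly the 'efbfbd' runs seen in the corrupted GUID value.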

If StringValue were implemented the other way around, it would be less harmful. I.e. storing the binary value as the primary one and converting that to a string. Caching that string in the StringValue object is OK (as long as it is properly invalidated when the bytes change, but that should not be a problem). As far as I understand, StringValue stores both values even now. So it is only a matter of changing the implementation to always store the binary value as the primary one - both in BinaryValue and StringValue.
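The "binary as primary" idea could look roughly like this (a minimal sketch under the assumptions above, not the actual StringValue implementation; the class name is made up): the byte[] is authoritative and immutable, and the String is only a cached view derived from it, so no data can be lost even when the bytes are not valid UTF-8.

```java
import java.nio.charset.StandardCharsets;

public final class BinaryFirstValue {
    private final byte[] bytes;   // primary representation, never lossy
    private String cachedString;  // derived view, computed on demand

    public BinaryFirstValue(byte[] bytes) {
        // Defensive copy: the stored bytes cannot change, so the
        // cached string never needs invalidation.
        this.bytes = bytes.clone();
    }

    public byte[] getBytes() {
        // Always safe to call, even for non-UTF-8 data
        return bytes.clone();
    }

    public String getString() {
        if (cachedString == null) {
            // The conversion happens once; invalid sequences degrade
            // only the String view, never the stored bytes.
            cachedString = new String(bytes, StandardCharsets.UTF_8);
        }
        return cachedString;
    }
}
```

A caller can then use getBytes() unconditionally, exactly as the paragraph above suggests, regardless of whether the attribute was detected as H/R.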

I'm really willing to find a better solution. I have worked a full
quarter on this issue (bin/string values) and I haven't been able to
come up with something that hides the inconsistency and complexity of
LDAP in this area, sadly... Maybe it's time for another try...

I think the code is not so bad that it requires a complete redesign. The interfaces seem to be OK as far as I can tell now. So maybe only some internal refactoring is needed, and that can be done in an evolutionary fashion. What about just starting with storing the binary value as the primary one? Then even if there is a problem with the correct detection of the attribute type, no data is really lost and the client can still safely use value.getBytes() regardless of whether it is a BinaryValue or a StringValue.

BTW, I have now been able to work around the jpegPhoto problem by not setting the attributeType in the Modification. And I have been able to work around the wrong detection of GUID and the ruined data by using a custom BinaryAttributeDetector. So I'm OK now. But anyway, I believe that the root causes of these issues should be fixed in the API.

--
Radovan Semancik
Software Architect
evolveum.com
