On 08/10/2015 03:10 PM, Emmanuel Lécharny wrote:
On 10/08/15 13:33, Radovan Semancik wrote:
On 08/10/2015 12:42 PM, Emmanuel Lécharny wrote:
There is no flag that says an Attribute is H-R or not. The
information is provided in RFC 2252, section 4.3.2
<https://tools.ietf.org/html/rfc2252#section-4.3.2>
Hmm, I saw code for parsing "X-NOT-HUMAN-READABLE", so I thought
that it might be caused by this. Thanks for the clarification. Anyway,
the strange thing is that the syntax 1.3.6.1.4.1.1466.115.121.1.28
appears to be human readable.
Which it is not:

version: 1
dn: m-oid=1.3.6.1.4.1.1466.115.121.1.28,ou=syntaxes,cn=system,ou=schema
objectclass: top
objectclass: metaTop
objectclass: metaSyntax
m-oid: 1.3.6.1.4.1.1466.115.121.1.28
m-description: JPEG
m-obsolete: FALSE
x-not-human-readable: TRUE
entrycsn: 20100111202214.878000Z#000000#000#000000
creatorsname: uid=admin,ou=system
createtimestamp: 20100111145217Z

Depends on the server. OpenLDAP defines the syntax like this:

ldapSyntaxes: ( 1.3.6.1.4.1.1466.115.121.1.28 DESC 'JPEG' X-NOT-HUMAN-READABLE
  'TRUE' )

But OpenDJ like this:

ldapSyntaxes: ( 1.3.6.1.4.1.1466.115.121.1.28 DESC 'JPEG' )

This is probably the difference. (And thanks for pointing that out. I completely forgot that the syntax declaration is also part of the schema.)

I believe that the API works with ApacheDS :-) ... but my goal is to make it work with other LDAP servers as well. And the detection of H/R is clearly wrong with OpenDJ, so I'm trying to figure out what's going on. It now looks like the OpenDJ declaration of the syntax is correct. I would expect that if no X-NOT-HUMAN-READABLE clause is present, then the H/R flag will be set according to the RFC. But it is not. The API seems to assume "true" as the default for the H/R flag. Is this a bug in the API?

One more data point. This is the same test program run against eDirectory. Same problem:

jpegPhoto AttributeType = attributetype ( 0.9.2342.19200300.100.1.60 NAME 'jpegPhoto'
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.40
    USAGE userApplications )
jpegPhoto syntax = ldapsyntax ( 1.3.6.1.4.1.1466.115.121.1.40
    X-NOT-HUMAN-READABLE 'false' )
jpegPhoto syntax H/R = true

eDirectory syntax definition:

ldapSyntaxes: ( 1.3.6.1.4.1.1466.115.121.1.28 X-NDS_SYNTAX '9' )

X-NOT-HUMAN-READABLE 'false', which means it's human readable. But I
guess OpenDJ does *not* set the X-NOT-HUMAN-READABLE flag, while
OpenLDAP does.

Yes, that really seems to be the case.

So you expect the server or the client to *know* magically that this
attribute is H/R when connected to OpenDJ, right? (irony)

No magic needed here (although some magic might come in very handy with some LDAP servers :-) ) .... I just expect that when no X-NOT-HUMAN-READABLE is present, the default from the RFC is used. Isn't that a reasonable expectation?
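That expectation could be sketched like this (illustrative code, not the actual API; the class and method names are made up, and the OID list is a small excerpt of the RFC 2252, section 4.3.2 table): when the schema entry carries no X-NOT-HUMAN-READABLE extension, fall back to the RFC defaults instead of assuming human readable.

```java
import java.util.Set;

public class HumanReadableDefaults {
    // A few of the syntaxes that RFC 2252 marks as not human readable
    // (Audio, Certificate, JPEG); the full table is in section 4.3.2.
    private static final Set<String> NOT_HUMAN_READABLE_OIDS = Set.of(
        "1.3.6.1.4.1.1466.115.121.1.4",  // Audio
        "1.3.6.1.4.1.1466.115.121.1.8",  // Certificate
        "1.3.6.1.4.1.1466.115.121.1.28"  // JPEG
    );

    // xNotHumanReadable is the raw extension value, or null when absent
    public static boolean isHumanReadable(String syntaxOid, String xNotHumanReadable) {
        if (xNotHumanReadable != null) {
            // When the extension is present, it wins
            return !"TRUE".equalsIgnoreCase(xNotHumanReadable);
        }
        // No extension: consult the RFC table instead of defaulting to true
        return !NOT_HUMAN_READABLE_OIDS.contains(syntaxOid);
    }
}
```

With this fallback, the OpenDJ-style declaration of the JPEG syntax (no extension at all) would still come out as not human readable.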

Yes, that's true. The rationale is that we make a best effort to inject
values correctly, converting them on the fly.

Note that this H-R flag itself is stupid. It was added 12 years ago as a
way to follow the RFC, but as a matter of fact, the Syntax itself
already drives the type of data we can store in an Attribute. I made it
even more complex by trying to use generics. Now we have those
StringValue and BinaryValue all over the code.

Ideally, we should not have to care about what we store, and should
always consider the stored values as byte[]. OTOH, it's not convenient
when we want to manipulate values as Strings, as converting them over
and over from byte[] to String is costly (especially in the server). But
I do think we went way too far here. This conversion should be done
internally once, and that's it. It would save us a hell of a lot of
time, and would make the API more comfortable to use.
I tend to agree. Always storing the value as binary seems to be a good
idea.
Depends. From the performance POV, this is killing the server. Most of
the ATs are H/R, and require some checks (comparison, normalization,
etc.) during the processing of every request. Having only the binary
value forces the server to do the conversion back and forth multiple
times. We faced this issue, and when we switched to StringValue and
BinaryValue, the performance boost was huge (100%).

Ideally, we should have 2 methods :
- getBinaryValue()
- getStringValue()

because we always know which type we are dealing with. But that's the
point: in the server, for operations involving many attributes, that
would require a check on the Syntax every time we want to manipulate a
value, which is a bit of a PITA, especially when we don't care about
the type. Having a Value<?> wrapper helps a lot here...

I understand. And storing converted string values is not really a problem, as long as the binary value is the primary one. The current StringValue implementation has it the other way around, and this causes problems. E.g. I have the binary value 2e254d883270c44cd7ae2e254d883270. The bytes 0x88 and 0xC4 are not valid UTF-8 sequences, so if the value is converted to a string, it will contain those strange replacement characters. And when converted back to binary it becomes 2e254defbfbd3270efbfbd4cd7ae2e254defbfbd3270 ... both 0x88 and 0xC4 are translated to 'efbfbd' and the data is ruined.
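The corruption described above is easy to reproduce with nothing but the JDK (this is a standalone illustration, not API code): Java's String decoder replaces each malformed UTF-8 sequence with U+FFFD, which re-encodes as the three bytes EF BF BD.

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    // Decode bytes as UTF-8 and re-encode them: any invalid sequence
    // comes back as the replacement character U+FFFD (EF BF BD).
    public static byte[] roundTrip(byte[] original) {
        String asString = new String(original, StandardCharsets.UTF_8);
        return asString.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0x88 is a bare continuation byte: not valid UTF-8 on its own
        byte[] data = { 0x2e, 0x25, 0x4d, (byte) 0x88, 0x32, 0x70 };
        byte[] back = roundTrip(data);
        // 0x88 came back as three replacement bytes: the data is ruined
        System.out.println(data.length + " bytes in, " + back.length + " bytes out");
    }
}
```

Running this prints "6 bytes in, 8 bytes out": the single invalid byte grew into EF BF BD, exactly the 'efbfbd' runs seen in the corrupted GUID value.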

If StringValue were implemented the other way around, it would be less harmful. I.e. storing the binary value as the primary one and converting that to a string. Caching that string in the StringValue object is OK (as long as it is properly invalidated when the bytes change, but that should not be a problem). As far as I understand, StringValue stores both values even now. So it is only a matter of changing the implementation to always store the binary value as the primary one - both in BinaryValue and StringValue.
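The "binary as primary" idea could look roughly like this (a minimal sketch under the assumptions above, not the actual StringValue implementation; the class name is made up): the byte[] is authoritative and immutable, and the String is only a cached view derived from it, so no data can be lost even when the bytes are not valid UTF-8.

```java
import java.nio.charset.StandardCharsets;

public final class BinaryFirstValue {
    private final byte[] bytes;   // primary representation, never lossy
    private String cachedString;  // derived view, computed on demand

    public BinaryFirstValue(byte[] bytes) {
        // Defensive copy: the stored bytes cannot change, so the
        // cached string never needs invalidation.
        this.bytes = bytes.clone();
    }

    public byte[] getBytes() {
        // Always safe to call, even for non-UTF-8 data
        return bytes.clone();
    }

    public String getString() {
        if (cachedString == null) {
            // The conversion happens once; invalid sequences degrade
            // only the String view, never the stored bytes.
            cachedString = new String(bytes, StandardCharsets.UTF_8);
        }
        return cachedString;
    }
}
```

A caller can then use getBytes() unconditionally, exactly as the paragraph above suggests, regardless of whether the attribute was detected as H/R.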

I'm really willing to find a better solution. I have worked a full
quarter on this issue (bin/string values) and I haven't been able to
come up with something that hides the inconsistency and complexity of
LDAP in this area, sadly... Maybe it's time for another try...

I think the code is not so bad that it requires a complete redesign. The interfaces seem to be OK as far as I can tell now. So maybe only some internal refactoring is needed, and that can be done in an evolutionary fashion. What about just starting with storing the binary value as the primary one? Then even if there is a problem with the correct detection of the attribute type, no data is really lost and the client can still safely use value.getBytes() regardless of whether it is a BinaryValue or a StringValue.

BTW, I have now been able to work around the jpegPhoto problem by not setting the attributeType in the Modification. And I have been able to work around the wrong detection of GUID and the ruined data by using a custom BinaryAttributeDetector. So I'm OK now. But anyway, I believe that the root causes of these issues should be fixed in the API.

--
Radovan Semancik
Software Architect
evolveum.com
