Re: ldap_get_values() called on binary data - does this return an error, or garbage data?

2024-04-03 Thread Jordan Brown
On 4/3/2024 10:22 AM, Howard Chu wrote:
> Jordan Brown wrote:
>> Is there even a straightforward way in the protocol to get type information? 
>>  If the protocol won't tell you, a client library can't tell you.
> Any client can retrieve the schema definition of any schema element using an 
> LDAP Search request.

I had thought that all of those schema definitions were
server-specific.  But I see that RFC 4512 standardizes them.  Thanks.

So that would seem to be the answer to the question:  if you want to
know how to handle a particular data item, you need to query its schema.
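
As a rough sketch of what "query its schema" could look like once the client has the attributeTypes value in hand: the helper names below (extract_syntax_oid, syntax_is_binary) are hypothetical, the parsing is deliberately naive, and the list of binary syntaxes is illustrative rather than complete. It also assumes the definition carries its own SYNTAX rather than inheriting one via SUP.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the OID that follows the SYNTAX keyword of an RFC 4512
 * AttributeTypeDescription into out. Returns 1 on success, 0 if there is
 * no SYNTAX keyword (e.g. the syntax is inherited via SUP) or out is too
 * small. Naive: does not skip quoted DESC strings. */
static int extract_syntax_oid(const char *def, char *out, size_t outlen)
{
    static const char keyword[] = " SYNTAX ";
    const char *p;
    size_t n = 0;

    assert(def != NULL && out != NULL);
    p = strstr(def, keyword);
    if (p == NULL)
        return 0;
    p += sizeof(keyword) - 1;
    while (*p == ' ')
        p++;
    /* An OID is digits and dots; an optional {len} bound ends the scan. */
    while (isdigit((unsigned char)p[n]) || p[n] == '.')
        n++;
    if (n == 0 || n >= outlen)
        return 0;
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}

/* Returns 1 if oid is one of a few syntaxes whose values are arbitrary
 * octets. Illustrative list only; a real client would need more. */
static int syntax_is_binary(const char *oid)
{
    static const char *const binary[] = {
        "1.3.6.1.4.1.1466.115.121.1.5",  /* Binary */
        "1.3.6.1.4.1.1466.115.121.1.8",  /* Certificate */
        "1.3.6.1.4.1.1466.115.121.1.28", /* JPEG */
        "1.3.6.1.4.1.1466.115.121.1.40", /* Octet String */
    };
    size_t i;

    for (i = 0; i < sizeof(binary) / sizeof(binary[0]); i++)
        if (strcmp(oid, binary[i]) == 0)
            return 1;
    return 0;
}
```

Fed the jpegPhoto definition, this would report the JPEG syntax and classify it as binary; a definition with only SUP would need a second lookup on the superior type.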

-- 
Jordan Brown, Oracle ZFS Storage Appliance, Oracle Solaris


Re: ldap_get_values() called on binary data - does this return an error, or garbage data?

2024-04-03 Thread Howard Chu
Jordan Brown wrote:
> Is there even a straightforward way in the protocol to get type information?  
> If the protocol won't tell you, a client library can't tell you.

Any client can retrieve the schema definition of any schema element using an 
LDAP Search request.

-- 
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: ldap_get_values() called on binary data - does this return an error, or garbage data?

2024-04-03 Thread Jordan Brown
Is there even a straightforward way in the protocol to get type
information?  If the protocol won't tell you, a client library can't
tell you.

-- 
Jordan Brown, Oracle ZFS Storage Appliance, Oracle Solaris


Re: ldap_get_values() called on binary data - does this return an error, or garbage data?

2024-04-03 Thread Ondřej Kuzník
On Wed, Apr 03, 2024 at 02:08:15PM +0100, Graham Leggett wrote:
> On 03 Apr 2024, at 13:03, Ondřej Kuzník  wrote:
> 
>>> This has been historically vague - first off, what happens if an
>>> attempt is made to call ldap_get_values() on binary data, do you get
>>> an error, or garbage data? The source isn't giving me a clear answer.
>> 
>> Hi Graham,
>> in this case binary data means embedded NULs (\0) can be found: given
>> that what you get back is a naive char * for each value, you stand to
>> lose information about whether that NUL is part of the value or a
>> string terminator.
> 
> So that means garbage data is returned.

Not completely, you just ignore the rest of the value if you use
anything that's strlen()-based.

> So am I right in understanding there is no way to ask the server "what
> type is this attribute you just gave me, is this arbitrary octets or a
> NUL terminated string"?

There's server schema, otherwise no.

Are you processing the values or just treating them as opaque data? If
the latter, why do you care? If you're processing it, you should know
whether you expect a (UTF-8, IA5, ...) string or something else.

Regardless, the ldap_get_values API is legacy, from a time when values
were expected to be just strings, and I guess it makes some tasks easier
for lazy(?) programmers. ldap_get_values_len gives you explicit
information about the length of the data, enabling safer processing. It
certainly doesn't stop you from handling strings - the .bv_val pointers
are exactly what you'd obtain from ldap_get_values.
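
To make the difference concrete, here is a minimal sketch of what a strlen()-based consumer sees versus what the length field reports. struct demo_berval is a stand-in with the same two fields as liblber's struct berval so that the sketch compiles without the OpenLDAP headers; visible_length and lost_octets are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for liblber's struct berval; real code would include <lber.h>. */
struct demo_berval {
    size_t bv_len;  /* number of octets in the value */
    char  *bv_val;  /* the octets; may contain embedded NULs */
};

/* Length a strlen()-based consumer would see: it stops at the first NUL.
 * Assumes the buffer has a NUL at or before index bv_len. */
static size_t visible_length(const struct demo_berval *bv)
{
    assert(bv != NULL && bv->bv_val != NULL);
    return strlen(bv->bv_val);
}

/* Number of octets such a consumer silently ignores. */
static size_t lost_octets(const struct demo_berval *bv)
{
    return bv->bv_len - visible_length(bv);
}
```

For a five-octet value with a NUL in the third position, visible_length is 2 and lost_octets is 3; for a plain string both views agree.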

>> You're welcome to propose better wording if you can make it clearer to a
>> reasonably competent C developer (I'm sure we can assume that they
>> understand how strings are laid out etc.)
> 
> The reason this matters has nothing to do with reasonably competent C
> developers, but rather options given to end users.
> 
> If the end user is allowed to provide an attribute in a configuration
> file, do I force the end user to know about binary values (as is
> common now), or is there a way I can be nice to the end user and have
> the system behave sensibly based on whether the return data is a
> string or binary?

It is up to the developer how they intend to handle the returned data,
see above. The user has no influence over this whatsoever. If the
application lets the user specify an arbitrary attribute, then it has
to be written accordingly (even if it's only to check that
strlen(.bv_val) == .bv_len), anything else is asking for trouble.
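
A minimal sketch of such a check follows. It is a variation on the strlen() comparison: it uses memchr() so that it never reads past the first bv_len octets, which avoids depending on a terminator being present. struct demo_berval is a stand-in for liblber's struct berval and value_is_c_string is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for liblber's struct berval; real code would include <lber.h>. */
struct demo_berval {
    size_t bv_len;  /* number of octets in the value */
    char  *bv_val;  /* the octets; may contain embedded NULs */
};

/* Returns 1 if none of the first bv_len octets is NUL, 0 otherwise.
 * An empty value counts as a (zero-length) string. */
static int value_is_c_string(const struct demo_berval *bv)
{
    assert(bv != NULL);
    if (bv->bv_len == 0)
        return 1;
    assert(bv->bv_val != NULL);
    return memchr(bv->bv_val, '\0', bv->bv_len) == NULL;
}
```

An application that accepts user-configured attributes could run each returned value through a check like this and fall back to binary handling (or reject the value) when it returns 0.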

Regards,

-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation   http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP


Re: ldap_get_values() called on binary data - does this return an error, or garbage data?

2024-04-03 Thread Graham Leggett
On 03 Apr 2024, at 13:03, Ondřej Kuzník  wrote:

>> This has been historically vague - first off, what happens if an
>> attempt is made to call ldap_get_values() on binary data, do you get
>> an error, or garbage data? The source isn't giving me a clear answer.
> 
> Hi Graham,
> in this case binary data means embedded NULs (\0) can be found: given
> that what you get back is a naive char * for each value, you stand to
> lose information about whether that NUL is part of the value or a
> string terminator.

So that means garbage data is returned.

>> Second question is how do you know which of ldap_get_values() or
>> ldap_get_values_len() to call? Obviously you can manually know this,
>> but I'm interested in automated behaviour. What is the canonical way
>> to discover that if you queried a jpegPhoto (for example) that the
>> result would be binary?
> 
> You either expect the data to be a string of some sort (no embedded
> NULs), then you're free to use whichever or you are prepared to accept
> arbitrary bytestreams and you need to use the one that returns bervals.
> That's all there is.

So am I right in understanding there is no way to ask the server "what type is 
this attribute you just gave me, is this arbitrary octets or a NUL terminated 
string"?

> You're welcome to propose better wording if you can make it clearer to a
> reasonably competent C developer (I'm sure we can assume that they
> understand how strings are laid out etc.)

The reason this matters has nothing to do with reasonably competent C 
developers, but rather options given to end users.

If the end user is allowed to provide an attribute in a configuration file, do 
I force the end user to know about binary values (as is common now), or is 
there a way I can be nice to the end user and have the system behave sensibly 
based on whether the return data is a string or binary?

Regards,
Graham
--



Re: ldap_get_values() called on binary data - does this return an error, or garbage data?

2024-04-03 Thread Ondřej Kuzník
On Wed, Apr 03, 2024 at 10:55:26AM +0100, Graham Leggett wrote:
> Hi all,
> 
> Looking back in time to the definitions of ldap_get_values() and
> ldap_get_values_len(), we are told that "If the attribute values are
> binary in nature, and thus not suitable to be returned as an array of
> char *'s, the ldap_get_values_len() routine can be used instead."
> 
> This has been historically vague - first off, what happens if an
> attempt is made to call ldap_get_values() on binary data, do you get
> an error, or garbage data? The source isn't giving me a clear answer.

Hi Graham,
in this case binary data means embedded NULs (\0) can be found: given
that what you get back is a naive char * for each value, you stand to
lose information about whether that NUL is part of the value or a
string terminator.

> Second question is how do you know which of ldap_get_values() or
> ldap_get_values_len() to call? Obviously you can manually know this,
> but I'm interested in automated behaviour. What is the canonical way
> to discover that if you queried a jpegPhoto (for example) that the
> result would be binary?

You either expect the data to be a string of some sort (no embedded
NULs), then you're free to use whichever or you are prepared to accept
arbitrary bytestreams and you need to use the one that returns bervals.
That's all there is.

You're welcome to propose better wording if you can make it clearer to a
reasonably competent C developer (I'm sure we can assume that they
understand how strings are laid out etc.)

Regards,

-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation   http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP


ldap_get_values() called on binary data - does this return an error, or garbage data?

2024-04-03 Thread Graham Leggett
Hi all,

Looking back in time to the definitions of ldap_get_values() and 
ldap_get_values_len(), we are told that "If the attribute values are binary in 
nature, and thus not suitable to be returned as an array of char *'s, the 
ldap_get_values_len() routine can be used instead."

This has been historically vague - first off, what happens if an attempt is 
made to call ldap_get_values() on binary data, do you get an error, or garbage 
data? The source isn't giving me a clear answer.

Second question is how do you know which of ldap_get_values() or 
ldap_get_values_len() to call? Obviously you can manually know this, but I'm 
interested in automated behaviour. What is the canonical way to discover that 
if you queried a jpegPhoto (for example) that the result would be binary?

To be clear, the end goal is to patch the following documentation to make this 
clear:

https://www.openldap.org/software/man.cgi?query=ldap_get_values&apropos=0&sektion=3&manpath=OpenLDAP+2.4-Release&arch=default&format=html

Regards,
Graham
--