Re: Reading Binary Attributes

2011-05-06 Thread Alex Karasulu
On Fri, May 6, 2011 at 5:34 PM, Stefan Seelmann seelm...@apache.org wrote:

 On Fri, May 6, 2011 at 3:30 PM, Daniel Fisher dfis...@vt.edu wrote:
  Specifying BinaryAttributeDetectors might also be interesting in the
 case where the server does not advertise the location of its schema in
 the RootDSE. But it would leave the connection being halfway schema aware,
 which might be complicated to handle at first sight.
  Something we can discuss, though.
 
  yeah, the problem here is to link such a mechanism into the schema
  manager, but honestly I don't think it is a good
  idea to let users define some behavior to handle the attribute type
  apart from what is already defined in the schema through syntax.
 
  OTOH this conversation makes me think that we should also make
  connection schema aware by default, instead of the current choice
  of letting users call loadSchema() to make it schema aware.
 
  I think you want to support both behaviors. The vast majority of LDAP
  clients do not need to be schema aware. They just need to read strings
  (and sometimes bytes) from the server. Forcing a client to synchronize
  schema updates with their server would place an undue burden on
  application deployers that depend on LDAP.

 I agree with Daniel. LDAP is often used to authenticate users, so only
 a connect and bind is required. In that case loading the schema would
 be too expensive.


+1 Absolutely expensive!


 Another thing: the schema may be big, say up to a megabyte. So you
 don't want to load the schema on each connect/bind. In Studio we cache
 the schema based on the createTimestamp/modifyTimestamp of the
 subschemaSubentry entry, but that's quite tricky and doesn't always
 work.


Yes a nice optimization if available.
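
The caching approach described above could be sketched roughly as follows. This is a minimal illustration, not how Studio actually implements it: the class SchemaCache, its get() method, and the use of a plain String for the schema are all assumptions made for the example. The timestamp argument stands for the createTimestamp/modifyTimestamp value read from the subschemaSubentry entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/**
 * Hypothetical client-side schema cache. The schema is kept as an opaque
 * String here; a real client would store parsed schema objects.
 */
final class SchemaCache {
    /** Cached schema plus the timestamp it was loaded under. */
    private static final class Cached {
        final String timestamp;
        final String schema;

        Cached(String timestamp, String schema) {
            this.timestamp = timestamp;
            this.schema = schema;
        }
    }

    private final Map<String, Cached> byServer = new HashMap<>();

    /**
     * Returns the cached schema if the server-reported timestamp is unchanged,
     * otherwise invokes the loader and caches its result. A null timestamp
     * (server does not expose one) always forces a reload.
     */
    String get(String serverUrl, String timestamp, Supplier<String> loader) {
        Cached cached = byServer.get(serverUrl);
        if (cached != null && timestamp != null && timestamp.equals(cached.timestamp)) {
            return cached.schema;
        }
        String schema = loader.get();
        byServer.put(serverUrl, new Cached(timestamp, schema));
        return schema;
    }
}
```

The null-timestamp branch reflects the "doesn't always work" caveat: when the server gives no usable timestamp, the sketch simply falls back to loading every time.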


 Loading the schema also depends on the access control rules.
 Some servers don't allow reading the schema for non-admin users. So in
 that case the connection can't be made schema-aware.


Yep.


 And using the built-in schema doesn't make sense at all, IMHO. If we
 don't know what the other server is it doesn't make sense to assume
 the schema would match.


+1



  If you don't like exposing the BinaryAttributeDetector, you could
  simply store the raw byte[] in the Attribute along with the UTF-8
  encoded string for every attribute. If I remember correctly this is
  what JLDAP does. Clients would then have the flexibility to use either
  data type regardless of the schema.

 That's a great idea, +1


+1

Alex
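
The idea endorsed above, keeping the raw byte[] for every attribute value and offering a UTF-8 string view on top of it, could look roughly like this. It is only a sketch: RawValue and its methods are hypothetical names, not the shared API or JLDAP's actual classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/**
 * Hypothetical value holder: keeps the raw bytes received from the server
 * and decodes them as UTF-8 only when a caller asks for a string.
 */
final class RawValue {
    private final byte[] bytes;

    RawValue(byte[] bytes) {
        // Defensive copy so callers cannot mutate the stored value.
        this.bytes = Arrays.copyOf(bytes, bytes.length);
    }

    /** Returns a copy of the raw bytes, regardless of the attribute's syntax. */
    byte[] getBytes() {
        return Arrays.copyOf(bytes, bytes.length);
    }

    /** Decodes the bytes as UTF-8; malformed input is replaced, not rejected. */
    String getString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

With this shape a non-schema-aware client never has to decide up front whether an attribute is binary; it picks getBytes() or getString() at the point of use.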


Re: [Shared] API Design Questionnaire #1

2011-01-30 Thread Alex Karasulu
On Sun, Jan 30, 2011 at 7:29 PM, Stefan Seelmann seelm...@apache.org wrote:

 On Sun, Jan 30, 2011 at 5:11 PM, Alex Karasulu akaras...@apache.org wrote:
  On Sun, Jan 30, 2011 at 3:17 AM, Emmanuel Lecharny elecha...@gmail.com wrote:
 
  On 1/29/11 10:38 PM, Stefan Seelmann wrote:
 
 
   [X] - (c)
 interface = AddRequest
 simple API exposed implementation =
 AddRequest*Impl*
 not so simple internal use implementation =
  AddRequest*Decoder*
  We're applying option 'C' right now. I'm torn but think A might suit us
  better for the long term, and for any situation. You also know what's an
  interface and what's not, although the IDE automatically shows you this
  stuff in the package/class browser.
 
  This is my opinion for a low-level API, which maps LDAP terminology 1:1
  to the Java API. I think we should additionally have a simplified API
  where users don't need to deal with request and response objects at all.
 
  BTW: We have this discussion again and again ;-) We really need to
  decide on a consistent naming.
 
 
  I think we already discussed it more than once, and we all agreed on
  this convention.
 
  I'm not sure we want to rehash this again every 2 years :/
 
 
  When there's a push to release a 1.0 of an API, we need to make the API
  consistent. I can do this myself but the community way is to have a
  discussion. If you do not want to discuss this, feel free not to
  participate, or say you don't care.

 I don't see that anyone said that the API development should not be
 community driven.


I did not suggest anyone said that. If you read above I am saying I have no
choice but to post and share with the community rather than do it myself.

-- 
Alex Karasulu
My Blog :: http://www.jroller.com/akarasulu/
Apache Directory Server :: http://directory.apache.org
Apache MINA :: http://mina.apache.org
To set up a meeting with me: http://tungle.me/AlexKarasulu


[Shared] API Design Questionnaire #1

2011-01-28 Thread Alex Karasulu
Hi community,

Now that we're coming close to finishing up the shared refactoring we have
to make some choices. Not all these choices have major impacts but some
might. In the past we could do what we liked and change our minds etc. Now
with a 1.0 of the shared libraries as the future mother of all Java LDAP
APIs we're going to have to live with our choices.

To opine, just place an 'X' in an option [  ] box.


(1) ModifyRequest has a bunch of methods that were recently added to perform
the same operations that you use the Modification interface for. This is
redundant in my opinion and adds more unnecessary surface area. We don't
need it and don't need an optional path to do the same thing confusing our
users.  I suggest removing them.

[  ] Yes - get rid of extra optional methods
[  ] No  - keep the extra optional methods
[  ] --- - I don't care about this stuff



(2) Interfaces versus simple/basic classes implementing them have been
something I've swayed back and forth on. Here are the options but note I am
just using AddRequest as an example.

[  ] - (a)
interface = *I*AddRequest
simple API exposed implementation = AddRequest
not so simple internal use implementation = AddRequest*Decoder*
[  ] - (b)
interface = AddRequest
simple API exposed implementation = *Simple*AddRequest
not so simple internal use implementation = AddRequest*Decoder*
[  ] - (c)
interface = AddRequest
simple API exposed implementation = AddRequest*Impl*
not so simple internal use implementation = AddRequest*Decoder*
[  ] - (d)
interface = AddRequest
simple API exposed implementation = *Basic*AddRequest
not so simple internal use implementation = AddRequest*Decoder*

[  ] - (e) I pick the fat lady with the pink tutu 

We're applying option 'C' right now. I'm torn but think A might suit us
better for the long term, and for any situation. You also know what's an
interface and what's not although the IDE automatically shows you this stuff
on the package/class browser.
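
To make the naming options concrete, here is a minimal Java sketch of option (c), the convention currently applied. The method getEntryDn() and the internals of AddRequestDecoder (a wrapper carrying decoder bookkeeping) are assumptions made for the example; only the three type names come from the questionnaire.

```java
/** Interface named after the LDAP operation, as in option (c). */
interface AddRequest {
    String getEntryDn();
}

/** Simple implementation exposed to API users; the "Impl" suffix marks it. */
class AddRequestImpl implements AddRequest {
    private final String entryDn;

    AddRequestImpl(String entryDn) {
        this.entryDn = entryDn;
    }

    @Override
    public String getEntryDn() {
        return entryDn;
    }
}

/**
 * Internal, codec-facing variant. Modeled here as a wrapper that adds
 * decoding state; that structure is an assumption of this sketch.
 */
class AddRequestDecoder implements AddRequest {
    private final AddRequest decorated;
    private int bytesConsumed; // hypothetical decoder bookkeeping

    AddRequestDecoder(AddRequest decorated) {
        this.decorated = decorated;
    }

    @Override
    public String getEntryDn() {
        return decorated.getEntryDn();
    }

    void advance(int count) {
        bytesConsumed += count;
    }

    int getBytesConsumed() {
        return bytesConsumed;
    }
}
```

Under option (a) the same three types would be named IAddRequest, AddRequest, and AddRequestDecoder; under (b) and (d) only the middle class changes, to SimpleAddRequest or BasicAddRequest.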


(3) JNDI remnants are somewhat still present even if we've gotten rid of
most of them. In the model interfaces for Control, ExtendedRequest, and
ExtendedResponse (IntermediateResponse as well but this has nothing to do
with JNDI) we have exposed access to ASN.1 encoded data. I think this is a
big mistake to do in the public API.

Controls and extended operation interfaces should simply expose
parameters/properties leaving the rest up to the CODEC to handle. There
should be no need to get or set the entire ASN.1 blob for the control or
extended operation's request response pair. What good does it do anyway?
It's just opening the door for users to incorrectly alter properly encoded
ASN.1 data causing problems. I think the getValue() setValue() methods
remained after we ran screaming away from JNDI. But it seems these
interfaces remained and now they're a liability. Where manipulation of the
binary ASN.1 data is needed we can leave this up to the CODEC under a
decorator to do.

I recommend removing these, what do you think?

[  ] Yes - Remove them, they are more bad than good
[  ] No  - Don't remove them, I like using em
[  ] --- - I don't give a rat's a**
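
As an illustration of what question (3) proposes, the sketch below shows a control interface that exposes only typed properties, with the ASN.1 encoding kept in a codec-side helper. Everything in it is hypothetical: SizeHintControl is not a real LDAP control, its OID is a placeholder, and its value is assumed to be a single BER INTEGER purely to keep the example short.

```java
import java.math.BigInteger;

/** Hypothetical control: callers see typed properties only, never ASN.1 bytes. */
interface SizeHintControl {
    String getOid();
    boolean isCritical();
    int getSize();
}

/** Plain value object implementing the public interface. */
final class SimpleSizeHintControl implements SizeHintControl {
    private final boolean critical;
    private final int size;

    SimpleSizeHintControl(boolean critical, int size) {
        this.critical = critical;
        this.size = size;
    }

    @Override
    public String getOid() {
        return "1.3.6.1.4.1.99999.1"; // placeholder OID, made up for this sketch
    }

    @Override
    public boolean isCritical() {
        return critical;
    }

    @Override
    public int getSize() {
        return size;
    }
}

/** Codec-side helper: the only place that knows the wire format. */
final class SizeHintControlCodec {
    private SizeHintControlCodec() {
    }

    /** Encodes the control value as a single BER INTEGER (tag 0x02). */
    static byte[] encodeValue(SizeHintControl control) {
        // BigInteger.toByteArray() yields the minimal big-endian two's-complement
        // form, which is the content encoding BER uses for INTEGER.
        byte[] content = BigInteger.valueOf(control.getSize()).toByteArray();
        byte[] encoded = new byte[2 + content.length];
        encoded[0] = 0x02;                  // universal tag: INTEGER
        encoded[1] = (byte) content.length; // short-form length (at most 4 for an int)
        System.arraycopy(content, 0, encoded, 2, content.length);
        return encoded;
    }
}
```

Because the interface has no getValue()/setValue() pair, a caller has no way to hand the codec a malformed blob; the bytes only ever come out of encodeValue().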




Re: [DISCUSSION] General API SPI Concerns

2011-01-06 Thread Alex Karasulu
On Thu, Jan 6, 2011 at 4:49 AM, Emmanuel Lecharny elecha...@gmail.com wrote:

 On 1/6/11 2:36 AM, Alex Karasulu wrote:

 Hi all,

 Excuse the cross post but this also has significance to the API list.

 Problem
 

 For our benefit and the benefit of our users we need to be uber careful
 with changes after a major GA release. We have another thread where it
 seems people agree with the Eclipse scheme of versioning and this sounds
 really flexible for our needs. We can do a 2.0.0-M1 release at any time
 without clamping down on APIs. Only when we do an RC do we have to freeze
 changes to interfaces.

 The debate still remains as to what constitutes an interface. Emmanuel
 seems to disagree with configuration, schema, and partition db formats as
 being interfaces of concern but for the time being we can just discuss
 those we do agree on. There's no doubt about APIs and SPIs.


 I don't disagree with Schema, but Schema are clearly defined by RFCs, there
 is no possible interpretation about their syntax and definition.


Absolutely, I do agree with you. I was thinking of the case where we find a bug
or mistake in our core published schema, or we change the ApacheDS meta schema.
In this case we'd need to bump up to a major version because then just a
software update will not solve problems with already created entries on disk
using the older schema. The situation will be undefined - very hard to
predict.


 However, the schema manipulation API is in the scope of this discussion.


This is part of the LDAP API and is as critical as Dn, or Entry since it's
tied together.


 Partition and configuration are not part of the Ldap API, thus are
 irrelevant in this discussion about shared refactoring.


Right, this has nothing to do with shared APIs but is relevant to the server.
The same policies in API maintenance in shared will have to be applied to
the server.

To me anything exposed is something to consider for backwards compatibility,
not talking just API here. Whether it's LDAP extended operations, web
services, or database formats, these things impact backwards compatibility.



  Solution
 

 So how do we make this as painless as possible for us and our users?
 The best way is to keep the surface area of the SPI or API small, create
 solid boundaries, and avoid exposing implementation details and
 implementation classes.

 By reducing the surface area with implementation hiding we can effectively
 limit exposure and reduce the probability of needing to make a change that
 breaks with our user contract. You might be asking what's a real world
 example of this for us in shared?

 And incidentally this is one of the things I've been working on in my
 branch.


 Real World Example in Shared
 

 Let's take the o.a.d.s.ldap.message package as an example. This package
 contains classes and interfaces modeling LDAP requests and responses: i.e.
 AddRequest, DeleteResponse etc. It's in the shared-ldap module.

 In this package, in addition to request/response interfaces, we're
 exposing implementation classes for them. The implementation classes, in
 turn, have dependencies on o.a.d.s.ldap.codec.* packages.

 Not any more, I hope. We did a big refactoring last September in order to
 remove this coupling. Of course, we may have some remaining dependencies,
 but this is more or less not intentional.


Right, not intentional, and this is just one example among many. Look, we've all
used shared as a dumping ground. While our primary focus was a tough problem
in Studio or in ApacheDS we put minimal energy into shared as we deposited
some classes and interfaces into it. This is because the main focus was
something else.

This is not to blame anyone. I am pointing out the problem, and pointing out
a solution to it so we're not screwed by it. The web of dependencies in
shared will f**k us down the line if we don't nix em now.



   This is because some
 implementation classes depend on codec functionality which is an
 implementation detail.


 Not true anymore (or is it?).


Yeah, there are some residual dependencies but they're not a big deal to fix. Trivial
stuff.

The work needed here is a joke really. The big issue with it is the impact
the changes in shared have all over the place in Studio and ApacheDS and the
fact that we're better off waiting for AP work to complete to merge.


  This might be due to eager reuse or the addition of
 utility methods into codec classes for convenience. Some of these
 dependencies can be removed by breaking out non-implementation specific
 methods and constants in codec classes into utility methods outside of the
 package or the module altogether. Furthermore the codec implementation
 that handles [de]marshaling has to access package friendly (non-API)
 methods on implementation classes while encoding.

 Not sure that I get what you mean here. Can you be a bit more explicit ?


LdapEncoder accesses package friendly methods inside most

Re: [DISCUSSION] General API SPI Concerns

2011-01-06 Thread Alex Karasulu
You're 100% right, it's the OSGi environment that enforces the exported
packages and hides the rest. So I concur with your 3 points at the end.




On Jan 6, 2011, at 5:21 PM, Stefan Seelmann m...@stefan-seelmann.de wrote:


On Thu, Jan 6, 2011 at 2:58 PM, Alex Karasulu akaras...@apache.org wrote:
 In the end, dependency upon further transitive dependencies is making us
expose almost all implementation classes in shared, and most can easily be
decoupled and hidden. It's effectively making everything in shared come
together in one big heap exposing way more than we want to.

It's quite impossible in Java to 'hide' all the classes that a user should
not manipulate. Unless you use package protected classes, and it quickly has
a limit, I would rather think in terms of 'exposed' (i.e. documented) API.



OSGi bundles really help in this respect. They fill in where Java left off.


OSGi makes it so the (bundle) packaging coincides with module boundaries. In
Java this is loose and there's leakage all over; as you say, it's very hard
to hide all implementation classes.


That this documented API is gathered in one separate module for convenience
is another aspect, but the user will still have to depend on all the other
modules.


Certainly, you're right, dependencies will still exist. A codec will be
depended upon for its functionality even if we do hide the implementation
details under the hood.

The value add here is not from avoiding a dependency. It's from not exposing
more than we have to and being able to hide the implementation. This way we
can change the implementation at will across point releases without having
to bump up to a major revision.


So all in all, should we define a module (a maven module) containing the
public API and the associated implementation? Probably (but this is not an
absolute necessity). I guess this is what you have in mind, so let's see
what the proposal is...


We have multiple options for chopping this up. With bundles we have a nice
tool to carve out physical, not just logical, boundaries to our APIs and only
expose those packages we need to show API users.


LDAP Client API


Everyone agrees that this API is very important to get right with a 1.0.

Right now this API pulls in several public interfaces directly from shared.
Those interfaces also pull in some implementation classes. The logical API
extends into shared this way. Effectively the majority of shared is exposed
by the client API. The client API does not end at its jar boundary.


All this exposure increases the chances of API change when all
implementation details are wide open and part of the client API. And this
is what I'm trying to limit. There are ways we can decouple these
dependencies very nicely with a mixed bag of refactoring techniques while
breaking up shared-ldap into smaller, more coherent modules. The idea is to
expose the bare minimum of only what we need to expose. Yes, the shared code
has become very stable over time but the most stability is in the interfaces
and if we only expose these instead of implementation classes then we'll
have an awesome API that may remain 1.X for a while and not require
deprecations as new functionality is introduced.



How will you limit the visibility of the modules you don't want the user to
be exposed to?


A combination of refactoring techniques will be used to be able to better
use standard Java protection mechanisms to hide implementation details
combined with using OSGi bundles instead of Jars to only export those
packages that we do want users to see.


Alex, I agree with you that separating interfaces and implementation
details is a good thing. Also creating OSGi bundles with a minimal set of
exported packages is a good thing. But it only helps if the bundles
are used in an OSGi environment.

I think we'll continue to deploy those OSGi bundles (which are just
Jars with a good META-INF/MANIFEST.MF) to maven central. And each user
using those Jars will see and can use all classes, when not using an
OSGi environment.

So I think we need additional techniques for non-OSGi users to let
them know which packages to use, for example:
- Use a naming convention for internal packages; the name "internal"
is used in Eclipse and Apache Felix, not sure if it is specified in OSGi.
- Create separate Jars for API and implementation (e.g. xyz-api.jar
and xyz-impl.jar)

Kind Regards,
Stefan
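
For non-OSGi users, the "standard Java protection mechanisms" mentioned in this thread can still hide an implementation behind an interface. The sketch below is one way to picture that, under stated assumptions: MessageCodec and MessageCodecs are hypothetical names, and a private nested class stands in for what would be a package-private class inside an "internal" package in a real module layout.

```java
import java.nio.charset.StandardCharsets;

/** Public API type: the only thing callers are meant to program against. */
interface MessageCodec {
    byte[] encode(String message);
}

/** Entry point that hands out implementations without naming them. */
final class MessageCodecs {
    private MessageCodecs() {
    }

    static MessageCodec newCodec() {
        return new DefaultMessageCodec();
    }

    /** Private nested class: invisible to callers, free to change between releases. */
    private static final class DefaultMessageCodec implements MessageCodec {
        @Override
        public byte[] encode(String message) {
            // Placeholder encoding for the sketch; a real LDAP codec would emit BER.
            return message.getBytes(StandardCharsets.UTF_8);
        }
    }
}
```

Callers can only write MessageCodec codec = MessageCodecs.newCodec(); so the implementation class can be replaced in a point release without touching any user code, whether or not the jar is loaded as an OSGi bundle.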


Re: [DISCUSSION] General API SPI Concerns

2011-01-06 Thread Alex Karasulu
On Thu, Jan 6, 2011 at 4:43 PM, Emmanuel Lecharny elecha...@gmail.com wrote:

SNIP ...

On 1/6/11 2:58 PM, Alex Karasulu wrote:

 This is not to blame anyone. I am pointing out the problem, and pointing
 out a solution to it so we're not screwed by it. The web of dependencies in
 shared will f**k us down the line if we don't nix em now.


 I'm wondering what would be the best way to get rid of that coupling...
 Maybe creating many maven modules (one per package), then we would
 immediately see the invalid coupling? Or is there any tool we can use to
 detect the bad coupling?


Yeah we could do a module per package but that might be too eager.

For now, let the dependencies that we cannot remove with some simple tricks,
plus some common sense about what coherently goes together, loosely guide our
path. Let's be relaxed about it and not freak out about module explosion,
but let's not explode needlessly either.

I wish one simple equation solved these things but unfortunately none does
:(.


 I must admit I have not investigated this area yet...


No worries. I got your back here and will give y'all an update about what
was done, how and why so we're on the same page at some point and can think
together on it to finally tidy up.

Just get this AP thing worked out without worrying yourself about these
details too much. What you're doing in AP land is much more important.

 The work needed here is a joke really. The big issue with it is the impact
 the changes in shared have all over the place in Studio and ApacheDS and
 the fact that we're better off waiting for AP work to complete to merge.

 Absolutely. I know that I'm a bottleneck here, but OTOH, there is little I
 can do to move faster :/


Please don't feel rushed. Again, what you're working on is one of the nastiest
areas of the server and not trivial. It would be easier and simpler writing
a web server than this region of code. So just focus on doing it right so it
does not steal any more of your time.

I'll work on this stuff and update the list. I've got some stupid things to
take care of today so I will not be as aggressive until the weekend. Just a
heads up.



   This might be due to eager reuse or the addition of
 utility methods into codec classes for convenience. Some of these
 dependencies can be removed by breaking out non-implementation specific
 methods and constants in codec classes into utility methods outside of
 the package or the module altogether. Furthermore the codec implementation
 that handles [de]marshaling has to access package friendly (non-API)
 methods on implementation classes while encoding.

  Not sure that I get what you mean here. Can you be a bit more explicit?


 LdapEncoder accesses package friendly methods inside most message Impl
 classes to encode them. This also pulls dependencies from codec into
 message, which can be hidden. But these are really easy to fix. We just
 need to know that the situation is there and get rid of it.

 Get it now.

 Btw, I still have some issues with the codec classes
 (LdapEncoder/LdapDecoder). They could be simplified, as we still live with
 some mechanisms used years ago. The Client-API codec is way simpler.

 We can discuss this point in a separate thread.


Sure no problem. I have some ideas here too (nothing big) just to make it so
we can hide implementation better with the code making it more pluggable.
While you're working let me test the ideas out and post something about it.



   In the end, dependency upon further transitive dependencies is making us
 expose almost all implementation classes in shared, and most can easily
 be decoupled and hidden. It's effectively making everything in shared come
 together in one big heap exposing way more than we want to.

  It's quite impossible in Java to 'hide' all the classes that a user
 should not manipulate. Unless you use package protected classes, and it
 quickly has a limit, I would rather think in terms of 'exposed' (i.e.
 documented) API.


 OSGi bundles really help in this respect. They fill in where Java left
 off.

 OSGi makes it so the (bundle) packaging coincides with module boundaries.
 In Java this is loose and there's leakage all over; as you say, it's very
 hard to hide all implementation classes.


 True. I ruled out OSGi, but that may help a lot.


Yeah but as Seelmann pointed out you only get that benefit in the OSGi
environment. We can do more, like break things up better and use this
"internal" package name component.



  That this documented API is gathered in one separate module for
 convenience is another aspect, but the user will still have to depend on
 all the other modules.


  Certainly, you're right, dependencies will still exist. A codec will be
 depended upon for its functionality even if we do hide the implementation
 details under the hood.

 The value add here is not from avoiding a dependency. It's from not
 exposing more than we have to and being able to hide the implementation.
 This way we can

Re: Entry API

2010-02-04 Thread Alex Karasulu
On Thu, Feb 4, 2010 at 8:38 PM, Emmanuel Lecharny elecha...@gmail.com wrote:

 Hi,

 here are some preliminary thoughts about the Entry class.

 The Entry class
 ---

 It's the base data structure representing an element stored in an LDAP
 server. It has a name (a DN) and some attributes.

 All the existing APIs use the same name, Entry (or LDAPEntry), except JNDI
 which has no such class (it uses Attributes but without a DN), and this class
 contains at least those two inner elements:
 - a DN
 - a set of Attribute

 There is some difference though as this element is either an Interface
 (ADS, ODS) or a Class (UID, jLdap)

 ODS defines an implementing class named SortedEntry, which does not make a
 lot of sense, IMO. The ADS class hierarchy is even more complex, as there are
 two lower interfaces (ClientEntry and ServerEntry) with two implementing
 classes (DefaultClientEntry and DefaultServerEntry). Overkill ... (and will
 be rewritten soon)


I'd be careful about removing interfaces. As an API you have to allow the
broadest range of implementation possibilities. Interfaces are good to have.
 When in doubt keep the interface.

 All in all, I'm wondering if it's a good idea to have an interface instead
 of a class, as it does not make a lot of sense to implement some different
 version of such an object.


It's hard to foresee the future here.  You've got to watch out when you're
trying to predict it while designing an API.

Just my two cents.

Alex
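
To picture the trade-off discussed in this thread, here is a minimal sketch of the "keep the interface" position: an Entry interface holding a DN and a set of attributes, with one default implementation. The method names, the String-only values, and the case-insensitive attribute lookup are assumptions made for the example, not the ADS, ODS, UID, or jLdap design.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical minimal Entry contract: a DN plus named, multi-valued attributes. */
interface Entry {
    String getDn();

    List<String> get(String attributeId);
}

/** One possible implementation; others (server-side, sorted, ...) could coexist. */
final class DefaultEntry implements Entry {
    private final String dn;
    private final Map<String, List<String>> attributes = new LinkedHashMap<>();

    DefaultEntry(String dn) {
        this.dn = dn;
    }

    /** Adds a value; attribute ids are matched case-insensitively in this sketch. */
    DefaultEntry add(String attributeId, String value) {
        String key = attributeId.toLowerCase(Locale.ROOT);
        attributes.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        return this;
    }

    @Override
    public String getDn() {
        return dn;
    }

    @Override
    public List<String> get(String attributeId) {
        List<String> values = attributes.get(attributeId.toLowerCase(Locale.ROOT));
        if (values == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(values);
    }
}
```

Code written against Entry keeps working if a second implementation appears later, which is the flexibility the reply argues for; collapsing this into a single concrete class would be simpler today but would close that door.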