Re: Reading Binary Attributes
On Fri, May 6, 2011 at 5:34 PM, Stefan Seelmann seelm...@apache.org wrote:
On Fri, May 6, 2011 at 3:30 PM, Daniel Fisher dfis...@vt.edu wrote:

Specifying BinaryAttributeDetectors might also be interesting in the case where the server does not advertise the location of its schema in the RootDSE. But it would leave the connection halfway schema aware, which might be complicated to handle at first sight. Something we can discuss, though.

Yeah, the problem here is to link such a mechanism into the schema manager, but honestly I don't think it is a good idea to let users define behavior for handling an attribute type apart from what is already defined in the schema through its syntax. OTOH this conversation makes me think that we should also make the connection schema aware by default, instead of the current choice of letting users call loadSchema() to make it schema aware.

I think you want to support both behaviors. The vast majority of LDAP clients do not need to be schema aware. They just need to read strings (and sometimes bytes) from the server. Forcing a client to synchronize schema updates with their server would place an undue burden on application deployers that depend on LDAP.

I agree with Daniel. LDAP is often used to authenticate users, so only a connect and bind is required. In that case loading the schema would be too expensive.

+1 Absolutely expensive!

Another thing: the schema may be big, say up to a megabyte. So you don't want to load the schema on each connect/bind. In Studio we cache the schema based on the createTimestamp/modifyTimestamp of the subschemaSubentry entry, but that's quite tricky and doesn't always work.

Yes, a nice optimization if available.

Loading the schema also depends on the access control rules. Some servers don't allow reading the schema for non-admin users. So in that case the connection can't be made schema-aware.

Yep.

And using the built-in schema doesn't make sense at all, IMHO.
If we don't know what the other server is, it doesn't make sense to assume the schema would match.

+1

If you don't like exposing the BinaryAttributeDetector, you could simply store the raw byte[] in the Attribute along with the UTF-8 encoded string for every attribute. If I remember correctly this is what JLDAP does. Clients would then have the flexibility to use either data type regardless of the schema.

That's a great idea, +1

+1

Alex
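
The "store both representations" suggestion above can be illustrated with a minimal sketch. DualValue is a hypothetical name invented for this example; it is not a class from the shared LDAP API or from JLDAP. It assumes the value arrives as raw bytes and that the UTF-8 decoding is computed once and kept alongside them.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical holder that keeps the raw bytes and a UTF-8 decoded view side by side. */
final class DualValue {
    private final byte[] raw;
    private final String text;

    DualValue(byte[] raw) {
        this.raw = raw.clone(); // defensive copy so callers cannot alter the stored bytes
        // Malformed UTF-8 is replaced with U+FFFD rather than rejected.
        this.text = new String(this.raw, StandardCharsets.UTF_8);
    }

    /** Raw view, for attributes the caller knows to be binary. */
    byte[] getBytes() {
        return raw.clone();
    }

    /** String view, for attributes the caller knows to be human readable. */
    String getString() {
        return text;
    }

    public static void main(String[] args) {
        DualValue cn = new DualValue("Alex".getBytes(StandardCharsets.UTF_8));
        System.out.println(cn.getString() + " / " + cn.getBytes().length + " bytes");
    }
}
```

With this shape the client decides per attribute which view to read, so no schema lookup is needed. The cost is that every value is held twice in memory.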
Re: [Shared] API Design Questionnaire #1
On Sun, Jan 30, 2011 at 7:29 PM, Stefan Seelmann seelm...@apache.org wrote:
On Sun, Jan 30, 2011 at 5:11 PM, Alex Karasulu akaras...@apache.org wrote:
On Sun, Jan 30, 2011 at 3:17 AM, Emmanuel Lecharny elecha...@gmail.com wrote:
On 1/29/11 10:38 PM, Stefan Seelmann wrote:

[X] - (c) interface = AddRequest
          simple API exposed implementation = AddRequest*Impl*
          not so simple internal use implementation = AddRequest*Decoder*

We're applying option 'C' right now. I'm torn but think A might suit us better for the long term, and for any situation. You also know what's an interface and what's not although the IDE automatically shows you this stuff on the package/class browser.

This is my opinion for a low-level API, which maps LDAP terminology 1:1 to the Java API. I think we should additionally have a simplified API where users don't need to deal with request and response objects at all.

BTW: We have this discussion again and again ;-) We really need to decide on a consistent naming.

I think we already discussed it more than once, and we all agreed on this convention. I'm not sure we want to rehash this again every 2 years :/

When there's a push to release a 1.0 of an API, we need to make the API consistent. I can do this myself but the community way is to have a discussion. If you do not want to discuss this feel free not to participate, or say you don't care.

I don't see that anyone said that the API development should not be community driven.

I did not suggest anyone said that. If you read above, I am saying I have no choice but to post and share with the community rather than do it myself.

--
Alex Karasulu
My Blog :: http://www.jroller.com/akarasulu/
Apache Directory Server :: http://directory.apache.org
Apache MINA :: http://mina.apache.org
To set up a meeting with me: http://tungle.me/AlexKarasulu
[Shared] API Design Questionnaire #1
Hi community,

Now that we're coming close to finishing up the shared refactoring we have to make some choices. Not all these choices have major impacts but some might. In the past we could do what we liked and change our minds etc. Now with a 1.0 of the shared libraries as the future mother of all Java LDAP APIs we're going to have to live with our choices.

To opine, just place an 'X' in an option [ ] box.

(1) ModifyRequest has a bunch of methods that were recently added to perform the same operations that you use the Modification interface for. This is redundant in my opinion and adds more unnecessary surface area. We don't need it, and we don't need an optional path to do the same thing, confusing our users. I suggest removing them.

[ ] Yes - get rid of extra optional methods
[ ] No - keep the extra optional methods
[ ] --- - I don't care about this stuff

(2) Interfaces versus simple/basic classes implementing them have been something I've swayed back and forth on. Here are the options but note I am just using AddRequest as an example.

[ ] - (a) interface = *I*AddRequest
          simple API exposed implementation = AddRequest
          not so simple internal use implementation = AddRequest*Decoder*

[ ] - (b) interface = AddRequest
          simple API exposed implementation = *Simple*AddRequest
          not so simple internal use implementation = AddRequest*Decoder*

[ ] - (c) interface = AddRequest
          simple API exposed implementation = AddRequest*Impl*
          not so simple internal use implementation = AddRequest*Decoder*

[ ] - (d) interface = AddRequest
          simple API exposed implementation = *Basic*AddRequest
          not so simple internal use implementation = AddRequest*Decoder*

[ ] - (e) I pick the fat lady with the pink tutu

We're applying option 'C' right now. I'm torn but think A might suit us better for the long term, and for any situation. You also know what's an interface and what's not although the IDE automatically shows you this stuff on the package/class browser.
(3) JNDI remnants are still somewhat present even if we've gotten rid of most of them. In the model interfaces for Control, ExtendedRequest, and ExtendedResponse (IntermediateResponse as well but this has nothing to do with JNDI) we have exposed access to ASN.1 encoded data. I think this is a big mistake to make in the public API. Controls and extended operation interfaces should simply expose parameters/properties, leaving the rest up to the CODEC to handle. There should be no need to get or set the entire ASN.1 blob for the control or the extended operation's request/response pair. What good does it do anyway? It's just opening the door for users to incorrectly alter properly encoded ASN.1 data, causing problems.

I think the getValue() and setValue() methods remained after we ran screaming away from JNDI. But it seems these interfaces remained and now they're a liability. Where manipulation of the binary ASN.1 data is needed we can leave this up to the CODEC under a decorator to do. I recommend removing these, what do you think?

[ ] Yes - Remove them, they are more bad than good
[ ] No - Don't remove them, I like using em
[ ] --- - I don't give a rat's a**

--
Alex Karasulu
My Blog :: http://www.jroller.com/akarasulu/
Apache Directory Server :: http://directory.apache.org
Apache MINA :: http://mina.apache.org
To set up a meeting with me: http://tungle.me/AlexKarasulu
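
To make point (3) concrete, here is a rough sketch of a control that exposes only its parameters, with the encoding kept in a separate codec-side helper. All three type names (SketchControl, SketchPagedResults, SketchPagedResultsCodec) are hypothetical and exist only for this illustration; they are not the shared API's types. The paged results control (OID 1.2.840.113556.1.4.319, RFC 2696) is used as the example, and the encoder is deliberately limited to values that fit BER short-form lengths.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical public-API view of a control: properties only, no ASN.1 accessors. */
interface SketchControl {
    String getOid();
    boolean isCritical();
}

/** Hypothetical paged-results control exposing its parameters as plain properties. */
final class SketchPagedResults implements SketchControl {
    static final String OID = "1.2.840.113556.1.4.319"; // RFC 2696

    private final int size;
    private final byte[] cookie;

    SketchPagedResults(int size, byte[] cookie) {
        this.size = size;
        this.cookie = cookie.clone();
    }

    public String getOid() { return OID; }
    public boolean isCritical() { return false; }
    int getSize() { return size; }
    byte[] getCookie() { return cookie.clone(); }
}

/** Hypothetical codec-side helper: the only place that touches encoded bytes. */
final class SketchPagedResultsCodec {
    /** Encodes SEQUENCE { INTEGER size, OCTET STRING cookie }, short-form lengths only. */
    static byte[] encode(SketchPagedResults control) {
        int size = control.getSize();
        byte[] cookie = control.getCookie();
        if (size < 0 || size > 127 || cookie.length > 120) {
            throw new IllegalArgumentException("sketch handles small values only");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0x30);                  // SEQUENCE tag
        out.write(3 + 2 + cookie.length); // length of INTEGER TLV plus OCTET STRING TLV
        out.write(0x02);                  // INTEGER tag
        out.write(1);                     // INTEGER length: one byte
        out.write(size);                  // INTEGER value
        out.write(0x04);                  // OCTET STRING tag
        out.write(cookie.length);         // OCTET STRING length
        out.write(cookie, 0, cookie.length);
        return out.toByteArray();
    }
}
```

A caller would build the control from its properties, for example new SketchPagedResults(10, new byte[0]), and never see the bytes. Only the codec calls encode(), so the encoding can change without touching the public interface.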
Re: [DISCUSSION] General API SPI Concerns
On Thu, Jan 6, 2011 at 4:49 AM, Emmanuel Lecharny elecha...@gmail.com wrote:
On 1/6/11 2:36 AM, Alex Karasulu wrote:

Hi all,

Excuse the cross post but this also has significance to the API list.

Problem

For our benefit and the benefit of our users we need to be uber careful with changes after a major GA release. We have another thread where it seems people agree with the Eclipse scheme of versioning and this sounds really flexible for our needs. We can do a 2.0.0-M1 release at any time without clamping down on APIs. Only when we do an RC do we have to freeze changes to interfaces.

The debate still remains as to what constitutes an interface. Emmanuel seems to disagree with configuration, schema, and partition db formats as being interfaces of concern but for the time being we can just discuss those we do agree on. There's no doubt about APIs and SPIs.

I don't disagree with Schema, but Schema are clearly defined by RFCs, there is no possible interpretation about their syntax and definition.

Absolutely, I do agree with you. I was thinking of the case where there's a bug or mistake we find in our core published schema, or we change the ApacheDS meta schema. In this case we'd need to bump up to a major version because just a software update will not solve problems with entries already created on disk using the older schema. The situation will be undefined - very hard to predict.

However, the schema manipulation API is in the scope of this discussion.

This is part of the LDAP API and is as critical as Dn or Entry since it's all tied together.

Partition and configuration are not part of the Ldap API, thus are irrelevant in this discussion about shared refactoring.

Right, this has nothing to do with shared APIs but is relevant to the server. The same policies for API maintenance in shared will have to be applied to the server.

To me anything exposed is something to consider for backwards compatibility, not talking just API here.
Whether it's LDAP extended operations, web services, or database formats, these things impact backwards compatibility.

Solution

So how do we make this as painless as possible for us and our users? The best way is to keep the surface area of the SPI or API small, create solid boundaries, and avoid exposing implementation details and implementation classes. By reducing the surface area with implementation hiding we can effectively limit exposure and reduce the probability of needing to make a change that breaks our user contract.

You might be asking what's a real world example of this for us in shared? And incidentally this is one of the things I've been working on in my branch.

Real World Example in Shared

Let's take the o.a.d.s.ldap.message package as an example. This package contains classes and interfaces modeling LDAP requests and responses: i.e. AddRequest, DeleteResponse etc. It's in the shared-ldap module. In this package, in addition to request/response interfaces, we're exposing implementation classes for them. The implementation classes, in turn, have dependencies on o.a.d.s.ldap.codec.* packages.

Not any more, I hope. We did a big refactoring last September in order to remove this coupling. Of course, we may have some remaining dependencies, but this is more or less not intentional.

Right, not intentional, and this is just one example among many. Look, we've all used shared as a dumping ground. While our primary focus was a tough problem in Studio or in ApacheDS we put minimal energy into shared as we deposited some classes and interfaces into it. This is because the main focus was something else.

This is not to blame anyone. I am pointing out the problem, and pointing out a solution to it so we're not screwed by it. The web of dependencies in shared will f**k us down the line if we don't nix em now.

This is because some implementation classes depend on codec functionality which is an implementation detail.

Not true anymore (or is it?).
Yeah, there are some residual dependencies but they're not a big deal to fix. Trivial stuff. The work needed here is a joke really. The big issue with it is the impact the changes in shared have all over the place in Studio and ApacheDS and the fact that we're better off waiting for AP work to complete to merge.

This might be due to eager reuse or the addition of utility methods into codec classes for convenience. Some of these dependencies can be removed by breaking out non-implementation specific methods and constants in codec classes into utility methods outside of the package or the module altogether. Furthermore the codec implementation that handles [de]marshaling has to access package friendly (non-API) methods on implementation classes while encoding.

Not sure that I get what you mean here. Can you be a bit more explicit?

LdapEncoder accesses package friendly methods inside most
Re: [DISCUSSION] General API SPI Concerns
You're 100% right, it's the OSGi environment that enforces the exported packages and hides the rest. So I concur with your 3 points at the end.

On Jan 6, 2011, at 5:21 PM, Stefan Seelmann m...@stefan-seelmann.de wrote:
On Thu, Jan 6, 2011 at 2:58 PM, Alex Karasulu akaras...@apache.org wrote:

In the end, dependencies upon further transitive dependencies are making us expose almost all implementation classes in shared, and most can easily be decoupled and hidden. It's effectively making everything in shared come together in one big heap, exposing way more than we want to.

It's quite impossible in Java to 'hide' all the classes that a user should not manipulate. Unless you use package protected classes, and that quickly hits a limit, I would rather think in terms of an 'exposed' (i.e. documented) API.

OSGi bundles really help in this respect. OSGi fills in where Java left off. It makes it so the (bundle) packaging coincides with module boundaries. In Java this is loose and there's leakage all over; as you say, it's very hard to hide all implementation classes.

That this documented API is gathered in one separate module for convenience is another aspect, but the user will still have to depend on all the other modules.

Certainly, you're right, dependencies will still exist. A codec will be depended upon for its functionality even if we do hide the implementation details under the hood. The value add here is not from avoiding a dependency. It's from not exposing more than we have to and being able to hide the implementation. This way we can change the implementation at will across point releases without having to bump up to a major revision.

So all in all, should we define a module (a maven module) containing the public API and the associated implementation? Probably (but this is not an absolute necessity). I guess this is what you have in mind, so let's see what the proposal is...

We have multiple options for chopping this up.
With bundles we have a nice tool to carve out physical, not just logical, boundaries for our APIs and only expose those packages we need to show API users.

LDAP Client API

Everyone agrees that this API is very important to get right with a 1.0. Right now this API pulls in several public interfaces directly from shared. Those interfaces also pull in some implementation classes. The logical API extends into shared this way. Effectively the majority of shared is exposed by the client API. The client API does not end at its jar boundary.

All this exposure increases the chances of API change when all implementation details are wide open and part of the client API. And this is what I'm trying to limit. There are ways we can decouple these dependencies very nicely with a mixed bag of refactoring techniques while breaking up shared-ldap into smaller, more coherent modules. The idea is to expose the bare minimum of only what we need to expose.

Yes, the shared code has become very stable over time, but most of the stability is in the interfaces, and if we only expose these instead of implementation classes then we'll have an awesome API that may remain 1.X for a while and not require deprecations as new functionality is introduced.

How will you limit the visibility of the modules you don't want the user to be exposed to?

A combination of refactoring techniques will be used to make better use of standard Java protection mechanisms to hide implementation details, combined with using OSGi bundles instead of Jars to only export those packages that we do want users to see.

Alex, I agree with you that separating interfaces and implementation details is a good thing. Also creating OSGi bundles with a minimal set of exported packages is a good thing. But it only helps if the bundles are used in an OSGi environment. I think we'll continue to deploy those OSGi bundles (which are just Jars with a good META-INF/MANIFEST.MF) to maven central.
And each user of those Jars will see and can use all classes when not using an OSGi environment.

So I think we need additional techniques for non-OSGi users to let them know which packages to use, for example:
- Use a naming convention for internal packages; the name internal is used in Eclipse and Apache Felix, not sure if it is specified in OSGi.
- Create separate Jars for API and implementation (e.g. xyz-api.jar and xyz-impl.jar)

Kind Regards,
Stefan
Re: [DISCUSSION] General API SPI Concerns
On Thu, Jan 6, 2011 at 4:43 PM, Emmanuel Lecharny elecha...@gmail.com wrote:

SNIP ...

On 1/6/11 2:58 PM, Alex Karasulu wrote:

This is not to blame anyone. I am pointing out the problem, and pointing out a solution to it so we're not screwed by it. The web of dependencies in shared will f**k us down the line if we don't nix em now.

I'm wondering what would be the best way to get rid of that coupling... Maybe by creating many maven modules (one per package), so that we immediately see the invalid coupling? Or is there any tool we can use to detect the bad coupling?

Yeah, we could do a module per package but that might be too eager. For now let's let the dependencies that we cannot remove with some simple tricks, plus some common sense about what coherently goes together, loosely guide our path. Let's be relaxed about it and not freak out about module explosion, but let's not explode needlessly either. I wish one simple equation solved these things but unfortunately it doesn't :(.

I must admit I have not investigated this area yet...

No worries. I got your back here and will give y'all an update about what was done, how and why, so we're on the same page at some point and can think together on it to finally tidy up. Just get this AP thing worked out without worrying yourself about these details too much. What you're doing in AP land is much more important.

The work needed here is a joke really. The big issue with it is the impact the changes in shared have all over the place in Studio and ApacheDS and the fact that we're better off waiting for AP work to complete to merge.

Absolutely. I know that I'm a bottleneck here, but OTOH, there is little I can do to move faster :/

Please don't feel rushed. Again, what you're doing is in one of the nastiest areas of the server and not trivial. It would be easier and simpler to write a web server than this region of code. So just focus on doing it right so it does not steal any more of your time. I'll work on this stuff and update the list.
I've got some stupid things to take care of today so I will not be as aggressive until the weekend. Just a heads up.

This might be due to eager reuse or the addition of utility methods into codec classes for convenience. Some of these dependencies can be removed by breaking out non-implementation specific methods and constants in codec classes into utility methods outside of the package or the module altogether. Furthermore the codec implementation that handles [de]marshaling has to access package friendly (non-API) methods on implementation classes while encoding.

Not sure that I get what you mean here. Can you be a bit more explicit?

LdapEncoder accesses package friendly methods inside most message Impl classes to encode them. This also pulls dependencies from codec into message, which can be hidden. But these are really easy to fix. We just need to know that the situation is there and get rid of it.

Get it now. Btw, I still have some issues with the codec classes (LdapEncoder/LdapDecoder). They could be simplified, as we still live with some mechanisms used years ago. The Client-API codec is way simpler. We can discuss this point in a separate thread.

Sure, no problem. I have some ideas here too (nothing big) just to make it so we can hide implementation better, with the code made more pluggable. While you're working let me test the ideas out and post something about it.

In the end, dependencies upon further transitive dependencies are making us expose almost all implementation classes in shared, and most can easily be decoupled and hidden. It's effectively making everything in shared come together in one big heap, exposing way more than we want to.

It's quite impossible in Java to 'hide' all the classes that a user should not manipulate. Unless you use package protected classes, and that quickly hits a limit, I would rather think in terms of an 'exposed' (i.e. documented) API.

OSGi bundles really help in this respect. OSGi fills in where Java left off.
OSGi makes it so the (bundle) packaging coincides with module boundaries. In Java this is loose and there's leakage all over; as you say, it's very hard to hide all implementation classes.

True. I ruled out OSGi, but that may help a lot.

Yeah, but as Seelmann pointed out you only get that benefit in the OSGi environment. We can do more, like breaking things up better and using this internal package name component.

That this documented API is gathered in one separate module for convenience is another aspect, but the user will still have to depend on all the other modules.

Certainly, you're right, dependencies will still exist. A codec will be depended upon for its functionality even if we do hide the implementation details under the hood. The value add here is not from avoiding a dependency. It's from not exposing more than we have to and being able to hide the implementation. This way we can
Re: Entry API
On Thu, Feb 4, 2010 at 8:38 PM, Emmanuel Lecharny elecha...@gmail.com wrote:

Hi, here are some preliminary thoughts about the Entry class.

The Entry class
---

It's the base data structure representing an element stored in an LDAP server. It has a name (a DN) and some attributes.

All the existing APIs use the same name, Entry (or LDAPEntry), except JNDI which has no such class (it uses Attributes but without a DN), and this class contains at least these two inner elements:
- a DN
- a set of Attribute

There is some difference though, as this element is either an Interface (ADS, ODS) or a Class (UID, jLdap).

ODS defines an implementing class named SortedEntry, which does not make a lot of sense, IMO. The ADS class hierarchy is even more complex, as there are two lower interfaces (ClientEntry and ServerEntry) with two implementing classes (DefaultClientEntry and DefaultServerEntry). Overkill … (and will be rewritten soon)

I'd be careful about removing interfaces. As an API you have to allow the broadest range of implementation possibilities. Interfaces are good to have. When in doubt keep the interface.

All in all, I'm wondering if it's a good idea to have an interface instead of a class, as it does not make a lot of sense to implement some different version of such an object.

It's hard to foresee the future here. You've got to watch out when you're trying to prognosticate the future while designing an API.

Just my two cents.

Alex
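
For readers weighing the interface-versus-class question, a minimal sketch of the "keep the interface" option might look like the following. SketchEntry and DefaultSketchEntry are hypothetical names made up for this illustration, not the ADS types. To stay short, the sketch assumes the DN is a plain String and attribute values are Strings; attribute ids are matched case-insensitively, as LDAP attribute names are.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical minimal Entry contract: a DN plus a set of attributes. */
interface SketchEntry {
    String getDn();

    /** Returns the values of an attribute, or an empty list if it is absent. */
    List<String> get(String attributeId);

    /** Adds values to an attribute, creating it if needed; returns this entry for chaining. */
    SketchEntry add(String attributeId, String... values);
}

/** Hypothetical default implementation; other implementations remain possible. */
final class DefaultSketchEntry implements SketchEntry {
    private final String dn;
    // Case-insensitive keys, so "cn" and "CN" name the same attribute.
    private final Map<String, List<String>> attributes =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    DefaultSketchEntry(String dn) {
        this.dn = dn;
    }

    public String getDn() {
        return dn;
    }

    public List<String> get(String attributeId) {
        List<String> values = attributes.get(attributeId);
        return values == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(values);
    }

    public SketchEntry add(String attributeId, String... values) {
        List<String> current = attributes.computeIfAbsent(attributeId, k -> new ArrayList<>());
        Collections.addAll(current, values);
        return this;
    }
}
```

Callers that code against SketchEntry would not notice if a server-side or sorted variant were later swapped in, which is the flexibility argued for above. If only one implementation ever appears, the interface costs one extra type.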