Again, it's not a separate project. It's a new module within the existing project. CLI is not part of the library. CLI uses the library. CLI is part of the project. This should be reflected properly.
Conceptually they are separate, and in terms of dependencies they are separate. To me that's enough to separate them. After all, packages and namespaces serve a similar purpose, and nobody argues that CLI or MaxEnt should be in a separate package. The rest (like classpath scanning) are consequences.

Aliaksandr

On Wed, Nov 23, 2011 at 2:32 PM, Jörn Kottmann <[email protected]> wrote:

> On 11/23/11 1:53 PM, Aliaksandr Autayeu wrote:
>
>> My proposal is different. The release should remain a single one. Let me
>> make the proposal more concrete in terms of steps:
>>
>> 1) Create an opennlp-cli module with its own pom. This will make the tools
>> package smaller and will improve the experience of library-only users:
>> less stuff to drag around.
>> 2) Keep the same single release of OpenNLP.
>>
>>> Currently most of our code lives in opennlp-tools and is separated by
>>> java packages, which I believe works really great, and there is no need
>>> to cut this down further.
>>
>> Exactly! And CLI is already a separate package.
>
> I am not convinced that the additional sub-project opennlp-cli is worth
> the non-noticeable advantage of having fewer classes on the classpath.
> The java packages already give us good separation.
>
>>> Maybe we should even do the opposite and also move the maxent code in
>>> there? I am +1 on that, actually.
>>
>> It does make sense given the current tight integration. However,
>> conceptually, this will break modularization. Even MaxEnt is not a pure
>> maxent anymore - there is perceptron inside as well. Nicolas Hernandez
>> mentioned in a recent thread "for may be considering alternatives to the
>> MaxEnt algorithm". Rolling everything into one bundle will make these
>> possible plans more difficult. If these plans advance, this might lead
>> to some abstraction into interfaces and (several) implementations, which
>> might become optional dependencies. So I would keep the current level of
>> modularization with respect to maxent.
>
> We are planning a refactoring which will rename it to ml, but that is a
> different story.
> Adding a new algorithm like we did with perceptron works well in this
> setup.
> What is not possible is the addition of a new algorithm in user code.
> There are various things which need to be solved for this, e.g. how to
> pass down training options from the tools package? How to load the new
> model with out zip package? How to load the classes which implement the
> algorithm?
>
> Anyway, those are issues which are orthogonal to our project structure
> decision.
>
>>> On the other side you could argue to cut things down, then you might
>>> end up with a couple of different sub-projects. Another prime candidate
>>> for moving is the coreference package because it introduces an extra
>>> dependency, which no other component needs.
>>
>> Nice point. Illustrates a similar situation. This is actually a good
>> argument in favor of per-component modules, but for now that would
>> complicate things too much. So, here I would refactor it to make the
>> dependency on JWNL optional, continuing along the lines of the existing
>> Dictionary interface and providing means to register your own
>> implementation. This will get rid of the dependency. After all, there
>> are alternatives.
>>
>>> When you think this further you could end up with a sub-project per
>>> component.
>>
>> Although there are scenarios where this makes sense, I would avoid this
>> for the moment.
>
> Maybe I am mistaken, but I really cannot see the advantage of maintaining
> the cli package or other code in separate projects. Fewer classes (we are
> speaking about a few tens of classes) on the classpath is not a good
> enough reason to do this in my opinion. Java does lazy class loading,
> only classes which are needed are loaded into memory. The only advantage
> which might be there is that classpath scanning is faster, but I doubt
> that this will be noticeable, or affect many users.
>
> Jörn
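
For reference, the lazy class loading point in the quoted message can be illustrated with a minimal Java sketch. The class names LibraryComponent and CliTool are hypothetical stand-ins and are not OpenNLP classes. Strictly speaking, the sketch shows lazy class initialization, which the Java language guarantees; whether an unreferenced class is also left unloaded depends on the JVM, although common JVMs such as HotSpot usually load classes on first use.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyLoadingDemo {

    // Records the names of nested classes whose static initializers have run.
    static final List<String> INITIALIZED = new ArrayList<>();

    // Hypothetical stand-in for a library class the application actually uses.
    static class LibraryComponent {
        static { INITIALIZED.add("LibraryComponent"); }
        static String describe() { return "library component"; }
    }

    // Hypothetical stand-in for a CLI class that is on the classpath
    // but is never referenced at run time.
    static class CliTool {
        static { INITIALIZED.add("CliTool"); }
        static String describe() { return "cli tool"; }
    }

    // Touches only LibraryComponent and returns a snapshot of the
    // classes initialized so far.
    static List<String> run() {
        LibraryComponent.describe();
        return new ArrayList<>(INITIALIZED);
    }

    public static void main(String[] args) {
        // CliTool never appears here: its static initializer does not run.
        System.out.println(run());
    }
}
```

Only the class that is actually used shows up in the list, which matches the argument that unused CLI classes cost little at run time. It says nothing about classpath scanning or about the size of the artifact that library-only users have to ship.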
