Re: Nuclear Fission, Splitting the core: The SPI Effect [was: Improving the accessibility of the Jackrabbit core]

2006-09-13 Thread David Nuescheler

Hi,


However, I'm a bit concerned about the revolutionary approach of the
SPI effort. Rather than refactoring the Jackrabbit core to better
separate the session-local parts, the SPI comes up with a brand new
interface contract. This is probably the best thing to do given the
SPI goals, but it does leave the big question of how and when are we
going to integrate it with Jackrabbit core unanswered.


I think you are right. Just to be clear, I do not expect the
architecture suggested or hinted at by the SPI to be implemented
in a clean way in the very near future.

My original intention of this thread really was to stimulate some
discussions around a possible Jackrabbit 2.x architecture.


As of now the easiest way I see to integrate the SPI effort with the
Jackrabbit core is through a generic spi2jcr adapter, but that doesn't
really affect the core design or increase code re-use.


I agree.

As a side note: I even see value in an spi2jcr adapter
beyond the mid-term goal of better remoting for the current
Jackrabbit core. I think that an spi2jcr adapter (in conjunction with the
jcr2spi client and the protocol bindings of the SPI) could serve as a
general-purpose remoting layer for any JCR-compliant repository.

Generally, I think we could also look at a phased approach, that
allows us to test, evolve and mature the components individually.

I think we could also do something like:
(Step1) Isolate the session-local parts into a standalone client (JCR2SPI)
(Step2) Build the SPI2JCR layer that exposes the current Jackrabbit
impl to the SPI
(Step3) Refactor the Jackrabbit core to natively implement the SPI
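
To make the three steps a bit more concrete, here is a minimal sketch in Java. All names in it (RepositoryService, ExistingRepository, AdapterService, ClientSession) are made up for illustration and are not taken from the SPI or from Jackrabbit. It only shows the shape of the split: a session-local client that holds transient changes (Step 1) and an adapter that exposes an existing implementation through a flat service interface (Step 2).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class PhasedSpiSketch {

    /** Hypothetical SPI: a flat contract between session-local code and the persistent side. */
    interface RepositoryService {
        String getProperty(String path);
        void submit(Map<String, String> changes);
    }

    /** Stand-in for an existing repository implementation (not real Jackrabbit code). */
    static class ExistingRepository {
        private final Map<String, String> store = new TreeMap<>();
        String read(String path) { return store.get(path); }
        void write(String path, String value) { store.put(path, value); }
    }

    /** Step 2: an adapter that exposes the existing implementation through the SPI. */
    static class AdapterService implements RepositoryService {
        private final ExistingRepository repository;

        AdapterService(ExistingRepository repository) { this.repository = repository; }

        @Override
        public String getProperty(String path) { return repository.read(path); }

        @Override
        public void submit(Map<String, String> changes) {
            changes.forEach(repository::write);
        }
    }

    /** Step 1: a session-local client that keeps transient changes until save(). */
    static class ClientSession {
        private final RepositoryService service;
        private final Map<String, String> pending = new LinkedHashMap<>();

        ClientSession(RepositoryService service) { this.service = service; }

        void setProperty(String path, String value) { pending.put(path, value); }

        String getProperty(String path) {
            // Transient changes shadow persisted values within this session.
            return pending.containsKey(path) ? pending.get(path) : service.getProperty(path);
        }

        void save() {
            // All pending changes go to the service in a single call.
            service.submit(new LinkedHashMap<>(pending));
            pending.clear();
        }
    }

    public static void main(String[] args) {
        ExistingRepository repository = new ExistingRepository();
        ClientSession session = new ClientSession(new AdapterService(repository));

        session.setProperty("/contacts/alice/phone", "555-0100");
        System.out.println(repository.read("/contacts/alice/phone")); // null: not saved yet
        session.save();
        System.out.println(repository.read("/contacts/alice/phone")); // 555-0100
    }
}
```

In this picture Step 3 would mean replacing AdapterService with a class that implements RepositoryService directly, while ClientSession stays unchanged.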

Thoughts?



What would a
more deeply integrated spi2jackrabbit component look like, and how
would we implement it in the core?

I am not sure what that would look like, and I guess that this would be
subject to some investigation. It may well be that some portions
would benefit from being refactored to work efficiently.


And on the other hand, how can the SPI effort better reuse the
experience built into the session-local parts of Jackrabbit core?
For example looking at the SessionImpl implementations from both jcr2spi
and the core, I see quite a lot of duplicate functionality. How does
the SPI make sure that the lessons learned developing the core are
included in the new codebase?

I agree. I think the lessons learned should be transported through the
Jackrabbit Community and its experience with JCR and Jackrabbit
over the past years. Of course I would also prefer to re-use as much
existing and well tested code as possible. But personally I think
we should not make architectural sacrifices at this point.

I believe that the overlap and the redundancy of the code between the
session-local parts and the core are rooted in the original compact
and intertwined design.

Do you think we would see the same overlap if we basically
had a straight-up SPI implementation (on the server side) more or
less from scratch and strictly separated the session-local parts?

regards,
david


Re: Nuclear Fission, Splitting the core: The SPI Effect [was: Improving the accessibility of the Jackrabbit core]

2006-09-11 Thread Jukka Zitting

Hi,

On 9/7/06, David Nuescheler [EMAIL PROTECTED] wrote:

In my mind the introduction of the SPI would lead to a clean
split of the Jackrabbit core architecture that allows for
much better re-use and better transparency. Essentially
the core could be reduced to the server which should
significantly reduce the complexity.


I very much agree with the benefits of the SPI approach, especially
for remoting and re-use.

However, I'm a bit concerned about the revolutionary approach of the
SPI effort. Rather than refactoring the Jackrabbit core to better
separate the session-local parts, the SPI comes up with a brand new
interface contract. This is probably the best thing to do given the
SPI goals, but it does leave the big question of how and when are we
going to integrate it with Jackrabbit core unanswered.

As of now the easiest way I see to integrate the SPI effort with the
Jackrabbit core is through a generic spi2jcr adapter, but that doesn't
really affect the core design or increase code re-use. What would a
more deeply integrated spi2jackrabbit component look like, and how
would we implement it in the core?

And on the other hand, how can the SPI effort better reuse the
experience built into the session-local parts of Jackrabbit core? For
example looking at the SessionImpl implementations from both jcr2spi
and the core, I see quite a lot of duplicate functionality. How does
the SPI make sure that the lessons learned developing the core are
included in the new codebase?

BR,

Jukka Zitting

--
Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED]
Software craftsmanship, JCR consulting, and Java development


Re: Nuclear Fission, Splitting the core: The SPI Effect [was: Improving the accessibility of the Jackrabbit core]

2006-09-07 Thread Thomas Mueller

Hi,

I think it's a good thing to do. Some random ideas:

I don't understand why it needs to be stateless (about my understanding of
stateless, see http://en.wikipedia.org/wiki/Stateless_server). As far as I
see stateless means it's slower, and I really don't like slow ;-) Even HTTP
is becoming more and more stateful to improve performance I guess. Maybe Roy
could give his view about stateless versus stateful. I know there are some
other advantages / disadvantages.

Maybe we actually need two 'standards': a stateful binary protocol, and a
stateless SOAP-style protocol. TCP/IP is the most common 'fast' protocol. I
suggest standardizing the binary wire-level protocol as well (details
could be worked out after/independently of the SPI).

Requirements of clustering should be kept in mind as well.

And I like the graphic! Nice 3D effect!

Thomas


RE: Nuclear Fission, Splitting the core: The SPI Effect [was: Improving the accessibility of the Jackrabbit core]

2006-09-07 Thread Miro Walker
David,

Agreed regarding use of client and server as terminology - I think
this leads to some fuzzy thinking as the client and the application
that uses the client get confused.

Could we borrow from JDBC and call them the JCR Driver and JCR
Server? That to me gives the right sort of thinking in terms of having
a Driver that is used by a client application. The JCR Driver can handle
calls to the JCR Server either locally or remotely (via whatever actual
transport is required). The JCR Server exposes the SPI as a low-level
interface and the JCR Driver translates this into a
standards-compliant API.

Miro

-Original Message-
From: David Nuescheler [mailto:[EMAIL PROTECTED] 
Sent: 07 September 2006 10:00
To: dev@jackrabbit.apache.org
Subject: Nuclear Fission, Splitting the core: The SPI Effect [was:
Improving the accessibility of the Jackrabbit core]

Hi All,

I would like to use Jukka's initiative as a starting point to discuss
a couple of high level architecture topics around the SPI initiative
and its potential effect on the overall Jackrabbit architecture.

Please consider all of the following comments as my
personal views which I would like to put up for discussion.
Nothing should at any point be considered set in
stone. I would like to trigger a design discussion on a
potential revision of the Jackrabbit architecture.

Looking at the current core architecture of Jackrabbit [1] we
have a relatively tightly coupled, heavily interdependent,
monolithic and compact core architecture, which really has
no additional public interface between the JCR API and the
Persistence Manager Interface.

Over the course of the last months (even years) we encountered
new requirements that people wanted to re-use portions of the
Jackrabbit core code or that we wanted to support deployment
Model 3 [2] in a reasonable way from a network perspective.
This led to the SPI initiative [3], which was first intended to be
developed as part of JSR-283 but was decided to be out of
scope for that spec.

Generally, the idea of the SPI (Service Provider Interface) is
to create an interface that separates the transient
space (I will call it the client for the lack of a better name)
from the persistent portion of a repository (which I will call
the server [better naming would be very welcome]) of
Jackrabbit. Not only should this allow us to better componentize
the Jackrabbit core and re-use the client and the server
independently, but it should also allow for meaningful remoting.

I think a resulting architecture could look something like this [4],
showing a clean split into a client and a server portion.

As mentioned above, the introduction of the SPI should allow for
meaningful remoting. This involves a somewhat stateless and
flat (service-oriented) interface that lends itself to remoting.
The SPI should also provide an abstraction for people to plug
in any remoting layer, be it RMI, WebDAV, SOAP or a
more efficient binary protocol that is specifically designed
for that layer. The SPI should also work without a remoting
layer altogether to still support the deployment models
1 and 2 [5] in an efficient way.
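
As a rough illustration of a flat interface that works both with and without a remoting layer, here is a small Java sketch. The interface, the class names and the toy "GET workspace path" text protocol are all invented for this example. They are not the SPI and not any real Jackrabbit binding. The point is only that the caller sees the same interface in both cases, and that every request carries all the information the server needs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class RemotingSketch {

    /** Hypothetical flat SPI call: every argument travels with the request. */
    interface RepositoryService {
        String getProperty(String workspace, String path);
    }

    /** Stand-in for the server side of the split. */
    static class InMemoryService implements RepositoryService {
        private final Map<String, String> values = new HashMap<>();

        void put(String workspace, String path, String value) {
            values.put(workspace + ":" + path, value);
        }

        @Override
        public String getProperty(String workspace, String path) {
            return values.get(workspace + ":" + path);
        }
    }

    /** Server half of a toy text protocol: decodes a request and calls the service. */
    static class WireEndpoint {
        private final RepositoryService target;

        WireEndpoint(RepositoryService target) { this.target = target; }

        String handle(String request) {
            // Limit of 3 keeps spaces inside the path intact.
            String[] parts = request.split(" ", 3);
            if (parts.length != 3 || !"GET".equals(parts[0])) {
                throw new IllegalArgumentException("Unsupported request: " + request);
            }
            return target.getProperty(parts[1], parts[2]);
        }
    }

    /** Client half: implements the same interface by encoding calls for the wire. */
    static class WireClient implements RepositoryService {
        private final Function<String, String> wire;

        WireClient(Function<String, String> wire) { this.wire = wire; }

        @Override
        public String getProperty(String workspace, String path) {
            return wire.apply("GET " + workspace + " " + path);
        }
    }

    public static void main(String[] args) {
        InMemoryService service = new InMemoryService();
        service.put("default", "/config/title", "Jackrabbit");

        // No remoting layer: the caller talks to the service directly.
        RepositoryService local = service;

        // With a remoting layer: same interface, calls pass through the wire format.
        WireEndpoint endpoint = new WireEndpoint(service);
        RepositoryService remote = new WireClient(endpoint::handle);

        System.out.println(local.getProperty("default", "/config/title"));
        System.out.println(remote.getProperty("default", "/config/title"));
    }
}
```

A real binding would replace the in-process Function with an actual transport, but the client code written against RepositoryService would not need to change.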

Providing such a more suitable remoting layer will also
allow people to write clients in non-Java environments more
easily [6][7] (I am also aware of .NET and JavaScript
clients that are at early stages of development).

The SPI also has triggered the interest of (commercial) developers
who want to implement a JCR layer on top of their existing legacy
repository, without having to re-implement all the client portions
needed in the transient space. Those developers would look at
implementing the SPI (the server) and leverage the common
Jackrabbit client to reach JCR compliance quicker. I think this
could lead to a more widely used Jackrabbit client and therefore
to a very well tested and scalable implementation.

In my mind the introduction of the SPI would lead to a clean
split of the Jackrabbit core architecture that allows for
much better re-use and better transparency. Essentially
the core could be reduced to the server which should
significantly reduce the complexity.

Please let me know what you think, as mentioned before
this should be the starting point of an architecture discussion.

regards,
david

[1] http://jackrabbit.apache.org/images/arch/jackrabbit-ism.jpg
[2] http://jackrabbit.apache.org/doc/deploy/howto-model3.html
[3] http://www.mail-archive.com/dev@jackrabbit.apache.org/msg01496.html
[4] http://www.day.com/o.file/spi-arch.jpg?get=0a1a63a2a86ab7041a6bce2e0f55b4a0
[5] http://jackrabbit.apache.org/doc/deploy.html
[6] http://search.cpan.org/~hanenkamp/Java-JCR-0.07/
[7] http://svn.apache.org/repos/asf/jackrabbit/trunk/contrib/phpcr/


Re: Nuclear Fission, Splitting the core: The SPI Effect [was: Improving the accessibility of the Jackrabbit core]

2006-09-07 Thread David Nuescheler

Hi Thomas,

Thanks for your thoughtful comment.


I don't understand why it needs to be stateless (about my understanding of
stateless, see http://en.wikipedia.org/wiki/Stateless_server). As far as I
see stateless means it's slower, and I really don't like slow ;-) Even HTTP
is becoming more and more stateful to improve performance I guess. Maybe Roy
could give his view about stateless versus stateful. I know there are some
other advantages / disadvantages.


Hmm... I am not sure if I would agree with the generally slower
statement, but I completely agree that this is a touchy topic.

I remember a somewhat lengthy verbal discussion
revolving around a similar topic.

Some of the legacy repositories that may want to implement the SPI are
stateful, which makes it less intuitive for them to implement a completely
stateless SPI. I still have not found a completely satisfying solution
for that, but somehow it would be great if a well-behaved client
could issue something like login() and possibly logout() to indicate to
the server that some heavy-weight resources can be disposed.
I think I understand that something like that could possibly break
the stateless contract, but it could solve a very practical need.

I could envision something along the lines of passing something like a token
(or a cookie, to borrow an HTTP analogy) on the login() call which would
be passed back to the server on subsequent calls to help identify the
server-session. Of course the server should also be able to work
without this token, but with it the server would be
capable of optimizing the use of some of its resources.
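
A minimal Java sketch of the token idea could look like the following. It assumes that login() hands out the token and that the server keeps a small per-token cache as a stand-in for the heavy-weight resources. The class and method names are hypothetical and not part of any SPI draft.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class SessionTokenSketch {

    /** Hypothetical server: the token is only a hint, never a requirement. */
    static class Server {
        private final Map<String, String> content = new HashMap<>();
        // Per-token cache standing in for heavy-weight server-side resources.
        private final Map<String, Map<String, String>> sessionCaches = new HashMap<>();
        int backendReads = 0;

        Server() { content.put("/greeting", "hello"); }

        /** Hands out a token and allocates the resources tied to it. */
        String login() {
            String token = UUID.randomUUID().toString();
            sessionCaches.put(token, new HashMap<>());
            return token;
        }

        /** Lets a well-behaved client signal that the resources can be disposed. */
        void logout(String token) { sessionCaches.remove(token); }

        /** Works with a null or unknown token; a known token only enables caching. */
        String read(String token, String path) {
            Map<String, String> cache = token == null ? null : sessionCaches.get(token);
            if (cache != null && cache.containsKey(path)) {
                return cache.get(path);
            }
            backendReads++;
            String value = content.get(path);
            if (cache != null) {
                cache.put(path, value);
            }
            return value;
        }

        int openSessions() { return sessionCaches.size(); }
    }

    public static void main(String[] args) {
        Server server = new Server();

        String token = server.login();
        server.read(token, "/greeting");
        server.read(token, "/greeting");          // served from the session cache
        System.out.println(server.backendReads);  // 1

        server.read(null, "/greeting");           // no token: answered, but never cached
        server.read(null, "/greeting");
        System.out.println(server.backendReads);  // 3

        server.logout(token);
        System.out.println(server.openSessions()); // 0
    }
}
```

Because a stale or missing token simply falls back to the uncached path, each request still carries everything the server needs, and the token acts purely as an optimization hint.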

What do you think?

regards,
david


Re: Improving the accessibility of the Jackrabbit core

2006-09-06 Thread David Nuescheler

Hi All,

Dave, thanks a lot for your input.


. Screenshots or easily downloadable sample app which
actually does something with custom node types. the base war
download is good, but how far could you go with it. Most open
source applications have a contacts application or a phone book,
or something similar. something that has a face, like a jsp to
view what's in the repository would be great
. the wiki has not been updated regularly, either the information is old
or not many people go to it
. the deployment models - creating a complete tomcat dist, which has the
various deployment options running right out of the box would be nice.
. a java example to add node types, for example for a phone book, which
CRUDs the node types would be nice
. maybe a page, which lists the possibilities of applications that could be
built with JR will be useful for newbies.


I completely agree with you that all of the above are excellent measures
that we should be looking at to ease adoption by new
content application developers. I think it is very important that people
get things up and running very quickly and are equipped with very good
user documentation.

Personally, I think we have to separate the concerns though, I think
Jukka's initial post was going into the direction of making the internals
of the core more accessible to more developers.

I think that there are a number of steps that we can take in that
direction and I also think that, for example, the separation eventually
provided by the SPI will bring some more architectural clarity.

While I agree that we need to have a modular design where people can
plug-in their extensions at certain defined interfaces and extension points,
I would discourage the idea that every user needs to be able to submit
patches to the core.

In my mind the core should be very compact and very controlled since
it has to be extremely stable and scalable, meaning that there is not
really a need to have dozens of developers working on a more
smallish core.

regards,
david


Re: Improving the accessibility of the Jackrabbit core

2006-09-06 Thread Nicolas

On 9/6/06, David Nuescheler [EMAIL PROTECTED] wrote:


While I agree that we need to have a modular design where people can
plug-in their extensions at certain defined interfaces and extension
points,
I would discourage the idea that every user needs to be able to submit
patches to the core.

In my mind the core should be very compact and very controlled since
it has to be extremely stable and scalable, meaning that there is not
really a need to have dozens of developers working on a more
smallish core.



Hi,

My two cents on the subject drawing from my experience on the backup tool.

At first Jukka and I wanted to avoid impact on the core for the reasons you
mentioned. It turned out we eventually had to update some parts of the
core: some functionality was simply not there. We minimized the changes
(only a few lines)... But they were quite bad (I exposed something that
shouldn't be exposed). After some rethinking and a few tryouts, I am back to
my initial plan with a few classes added to the core.

This example shows the core is not finished, in the sense that it lacks some
functionality (for instance in my case a way to import the versions). I
think we need to remember JR is still a fairly new project and some use
cases have still not been detected.

Some functionality has not been needed yet by the core contributors but
might emerge from other companies/individuals (for instance my company would
need to extend JR to support our needs). I think discouraging those
contributions can be a bad idea: we should encourage them, keep the code and
refactor it if necessary. This way both the contributor and the community
benefit from it: new functionality with cleaner code.

I agree with you though that we should encourage contributions and not updates
to the core. But we should document the core. In my case, it took me a lot
of time to understand the part I needed (I wrote a new UpdatableStateManager
since I couldn't figure out how the EventFactory was working).

BR
Nicolas
my blog! http://www.deviant-abstraction.net !!


Re: Improving the accessibility of the Jackrabbit core

2006-09-06 Thread Jukka Zitting

Hi,

On 9/6/06, David Nuescheler [EMAIL PROTECTED] wrote:

Personally, I think we have to separate the concerns though, I think
Jukka's initial post was going into the direction of making the internals
of the core more accessible to more developers.


Correct. In any case, Dave's points are a valuable addition to the
feedback I gathered a while ago before the 1.0 release with the issue
of streamlining the end-user experience.


While I agree that we need to have a modular design where people can
plug-in their extensions at certain defined interfaces and extension points,
I would discourage the idea that every user needs to be able to submit
patches to the core.


I'm most concerned about the overhead for people going in trying to
trace why Jackrabbit is behaving the way it does in some specific
issue. This is often the first step of becoming a contributor, and in
my opinion it's currently quite a high step to overcome.




In my mind the core should be very compact and very controlled since
it has to be extremely stable and scalable, meaning that there is not
really a need to have dozens of developers working on a more
smallish core.




BR,

Jukka Zitting

--
Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED]
Software craftsmanship, JCR consulting, and Java development


Re: Improving the accessibility of the Jackrabbit core

2006-09-06 Thread Stefan Guggisberg

On 9/6/06, Nicolas [EMAIL PROTECTED] wrote:

On 9/6/06, David Nuescheler [EMAIL PROTECTED] wrote:

 While I agree that we need to have a modular design where people can
 plug-in their extensions at certain defined interfaces and extension
 points,
 I would discourage the idea that every user needs to be able to submit
 patches to the core.

 In my mind the core should be very compact and very controlled since
 it has to be extremely stable and scalable, meaning that there is not
 really a need to have dozens of developers working on a more
 smallish core.


Hi,

My two cents on the subject drawing from my experience on the backup tool.

At first Jukka and I wanted to avoid impact on the core for the reasons you
mentioned. It turned out we eventually had to update some parts of the
core: some functionality was simply not there. We minimized the changes
(only a few lines)... But they were quite bad (I exposed something that
shouldn't be exposed). After some rethinking and a few tryouts, I am back to
my initial plan with a few classes added to the core.

This example shows the core is not finished, in the sense that it lacks some
functionality (for instance in my case a way to import the versions). I
think we need to remember JR is still a fairly new project and some use
cases have still not been detected.

Some functionality has not been needed yet by the core contributors but
might emerge from other companies/individuals (for instance my company would
need to extend JR to support our needs). I think discouraging those
contributions can be a bad idea: we should encourage them, keep the code and
refactor it if necessary. This way both the contributor and the community
benefit from it: new functionality with cleaner code.


i don't follow your argumentation. why would this lead to cleaner code?

cheers
stefan



I agree with you though that we should encourage contributions and not updates
to the core. But we should document the core. In my case, it took me a lot
of time to understand the part I needed (I wrote a new UpdatableStateManager
since I couldn't figure out how the EventFactory was working).

BR
Nicolas
my blog! http://www.deviant-abstraction.net !!




Re: Improving the accessibility of the Jackrabbit core

2006-09-06 Thread David Nuescheler

Hi Nico,

Thanks for your mail.


I will work on the documentation directly on the wiki (when I can start this
task). I will ask a lot of questions *though*.

Looking forward to it ;)


One clarification on the backup tool: it is working (and I am polishing the code
that needs to fit in Core). And with my new JR understanding, I plan to
start implementing a version 2 in my spare time with hot backup.

Excellent, thanks for all your efforts.

I did not mean to imply that the backup tool was not working. If I should
have said anything like that, I would like to apologize.

regards,
david


Re: Improving the accessibility of the Jackrabbit core

2006-09-06 Thread Jukka Zitting

Hi,

On 9/6/06, David Nuescheler [EMAIL PROTECTED] wrote:

Got it. Generally, I am more of a "given the right eyeballs, all bugs
are shallow" type of person to begin with.


Perhaps we can find common ground at "enough right eyeballs". ;-)


If I currently take a look at the shallowness of actual core bugs ;)
in Jackrabbit I see that the Jackrabbit community has an
outstanding bug resolution time. To me this is probably one of the
biggest strengths of Jackrabbit and its community.
Do you see this as a weakness that needs improvement?


Definitely not. :-) What I do see as a weakness is that we rely on a
handful of core developers to keep up this level of support when we
could better tap the great potential within the community. In fact I'd
rather see the core developers spending more time proactively
designing new features and improvements (like improving performance,
scalability, etc.) than reactively analyzing user issues when large
parts of that work could be distributed.


I think in the end it all boils down to a matter of priorities and
I would be very interested in having a discussion around what
we think drives and hinders the Jackrabbit adoption and community
today and tomorrow, and therefore what we should focus on.


+1 There's already quite a lot of feedback on the adoption part, but
that would need to be summarized and analyzed to better focus the
efforts.

BR,

Jukka Zitting

--
Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED]
Software craftsmanship, JCR consulting, and Java development


Re: Improving the accessibility of the Jackrabbit core

2006-09-06 Thread Roy T. Fielding

On Sep 6, 2006, at 4:14 AM, David Nuescheler wrote:


Personally, I believe that for example a restore facility has to be
buried deep down in the core and therefore the code has to comply
with the high quality requirements that we have for code in the core
and for the seasoned Jackrabbit experience of a developer.


That is why each of the core developers has veto power over the code.
If we want to ensure that every line is adequately reviewed, then ask
for the core code to be governed by the RTC (review-then-commit) rule.
Note, however, that such a requirement will extend to all commits on
that part of the code.


In my mind your experience with developing very close
to the heart of Jackrabbit should not lead us to opening
up the core so inexperienced Jackrabbit developers can
contribute, but it should help us realize that we have
very high requirements for Jackrabbit developers that make
modifications to the core.


I don't think you understand.  This is an Apache project and anyone can
contribute to any part of it.  The degree of review we require of those
contributions is decided by the PMC (our committers).  We can increase
the requirements on review of the core code and we can separate
compatible and incompatible changes into versioned branches, but we
cannot ask of others what we do not accept of ourselves.

In my opinion, the core code continues to evolve as people try to do
larger and more expressive things with Jackrabbit and apply JCR to
real problem sets.  We need to welcome that and change things based
on their technical merits, not any preconceived notions of how much
a person knows about the current (highly opaque) core architecture.
Most likely, this will mean simplifying the core by removing or
refactoring some of the spaghetti dependencies.

One of those things that will change is the degree of extensibility,
since that is the heart of any successful open source project and
Jackrabbit isn't even halfway there yet.  I am sure that others with
fresh energy will see new ways to solve the same problem that will
not be burdened with the legacy decisions that we made for one reason
or another.  When those ideas are presented, they will be subject to
intense scrutiny and adopted based only on their proven benefits.
They will not be judged based on who wrote them or how much time they
spent writing the initial core code.

Roy


Re: Improving the accessibility of the Jackrabbit core

2006-09-05 Thread Dave Bobby
I will put in my 2c since I did not see many replies to this post and I think
addressing this question is very important for any open source project. I have
not had much time to play with JR due to other work, so some of this might
already be there.

. Screenshots or easily downloadable sample app which actually does something
with custom node types. the base war download is good, but how far could you go
with it. Most open source applications have a contacts application or a phone
book, or something similar. something that has a face, like a jsp to view what's
in the repository would be great
. the wiki has not been updated regularly, either the information is old or not
many people go to it
. the deployment models - creating a complete tomcat dist, which has the
various deployment options running right out of the box would be nice.
. a java example to add node types, for example for a phone book, which CRUDs
the node types would be nice
. maybe a page, which lists the possibilities of applications that could be
built with JR will be useful for newbies.

just my 2c.

Thanks

Dave

Nicolas  [EMAIL PROTECTED] wrote: Hi,

I have become familiar with the JR codebase in the last few months, and what
follows is based on my experience with the backup tool.
The community is really helpful when you need some help, but in order to
understand the basic concepts you need to dig into the code and into the JCR
spec.

General documentation might be a good idea: user documentation where key
concepts are explained (versioning, nodetypes, and so on). We can, I think,
mostly copy/paste from the JCR spec to the Wiki. We also need, I believe, some
documentation about JR's internals: how a node is updated, what an
ItemState is.

BR
Nico
my blog! http://www.deviant-abstraction.net !!




Improving the accessibility of the Jackrabbit core

2006-08-30 Thread Jukka Zitting

Hi,

Based on private discussions I'd like to raise the issue of the
accessibility of the Jackrabbit core codebase. We have a small number
of people who are intimately familiar with the core codebase (see the
numbers below), but others find the core hard to navigate, and
this drives up the barrier to entry for contributing to Jackrabbit.

Please share any good ideas on how we could best lower the barrier.
I'm open to all sorts of ideas, like more documentation (javadocs, UML
diagrams, architectural descriptions, etc.), scheduled Q&A sessions on
IRC, an informal Jackrabbit workshop during the Hackathon at
ApacheCon, etc. I'm also interested in the priorities, i.e. what would
give us the most bang for the buck in terms of making it easier for
people to get familiar with the Jackrabbit core and start
contributing.

$ svn log src/main/java/org/apache/jackrabbit/core | \
 perl -lne '/^r[0-9]+ \| (.*?) \|/ and print $1' | sort | uniq -c | sort -n -r
   371 stefan
   199 tripod
   185 mreutegg
   127 jukka
    27 dpfister
    13 fielding
    10 angela
     4 fmeschbe
     3 edgarpoce
     2 sylvain

BR,

Jukka Zitting

--
Yukatan - http://yukatan.fi/ - [EMAIL PROTECTED]
Software craftsmanship, JCR consulting, and Java development