Re: SVN organization

2006-11-10 Thread Marshall Schor
+1 to Adam's careful reasoning. 
It looks like only the website location / names would be changing.

As part of this, it would be good to clean up the initial upload
(site-publish), which won't be used anymore.

I'd like to see something get done here, so we can start incrementally
improving our web-site soon  :-)
-Marshall

Adam Lally wrote:

On 11/10/06, Marshall Schor <[EMAIL PROTECTED]> wrote:

/common            <<< the shared top node
   /trunk
      /uima-website
      /uima-docbook



I think that things should be grouped under a trunk only if they are
likely to be branched/tagged together.  I don't think the website and
documentation would qualify.

Also I am thinking the documentation may want to stay under uimaj.  It
will be included as part of the uimaj build, and it makes sense to be
branched/tagged along with the Java code.  The C++ might have its own
separate documentation, at least to start.  We can always reorganize
the documentation later if we run into trouble.

For now I just want to get our website straightened out.  So I propose
that we restructure as follows:

site/
  trunk/
    uima-website/
      ** xml sources and generated html files go here

uimaj/
  trunk/
    uima-docbooks/
      ** documentation is here
    uimaj-/
      ** code is here


I can go ahead and do this myself if there are no objections.  Unless
I hear some +1's, I'll wait until Monday morning before touching
anything.

Also, if anyone has any changes to uima-website, now would be a good
time to commit them. :)

-Adam






Re: SVN organization

2006-11-10 Thread Adam Lally

On 11/10/06, Marshall Schor <[EMAIL PROTECTED]> wrote:

/common            <<< the shared top node
   /trunk
      /uima-website
      /uima-docbook



I think that things should be grouped under a trunk only if they are
likely to be branched/tagged together.  I don't think the website and
documentation would qualify.

Also I am thinking the documentation may want to stay under uimaj.  It
will be included as part of the uimaj build, and it makes sense to be
branched/tagged along with the Java code.  The C++ might have its own
separate documentation, at least to start.  We can always reorganize
the documentation later if we run into trouble.

For now I just want to get our website straightened out.  So I propose
that we restructure as follows:

site/
  trunk/
    uima-website/
      ** xml sources and generated html files go here

uimaj/
  trunk/
    uima-docbooks/
      ** documentation is here
    uimaj-/
      ** code is here


I can go ahead and do this myself if there are no objections.  Unless
I hear some +1's, I'll wait until Monday morning before touching
anything.

Also, if anyone has any changes to uima-website, now would be a good
time to commit them. :)

-Adam
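
For illustration only, the move proposed above could be expressed as a short list of Subversion commands. The sketch below just builds the command lines; it does not run them. The repository root URL and the commit messages are assumptions, since the thread does not give them. The source paths come from the current layout described elsewhere in this thread (website sources under uimaj/trunk/uima-website, published html under site/site_publish).

```python
import shlex

# Hypothetical repository root; the real URL is not given in the thread.
REPO_ROOT = "https://svn.example.org/repos/incubator/uima"


def restructure_commands(root):
    """Return svn command lines (as argument lists) for the proposed layout."""
    return [
        # Create the new trunk under the existing site/ directory.
        ["svn", "mkdir", f"{root}/site/trunk",
         "-m", "Create trunk for the website"],
        # Move the website project out of the Java trunk.
        ["svn", "move",
         f"{root}/uimaj/trunk/uima-website",
         f"{root}/site/trunk/uima-website",
         "-m", "Move uima-website under site/trunk"],
        # Remove the initial upload that will no longer be used.
        ["svn", "delete", f"{root}/site/site_publish",
         "-m", "Remove unused initial site upload"],
    ]


if __name__ == "__main__":
    for cmd in restructure_commands(REPO_ROOT):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Because these are URL-to-URL operations, each one would be committed immediately, which is one reason to ask for pending uima-website changes to be committed first.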


Re: SVN organization

2006-11-10 Thread Thilo Goetz

Marshall Schor wrote:


/uimaj
   /trunk
      /uimaj-x     <<< the Java framework. Java examples, build scripts, Eclipse tooling etc.


So if we organize all uimaj stuff like this, can we drop the uimaj- 
prefix for the project names this time?  I don't have a very strong 
preference, I just think it's redundant.




 ** future **
/uimacpp           <<< not uimac because that is a funny word :-)
   /trunk
      /uimacpp-x   <<< various c modules

/sandbox           <<< maybe no branches/tags and therefore no need for trunk
   /project1


I guess we may have projects in a variety of programming languages in 
the sandbox.  Is that something we want to think about now? 
Alternatively, we can assume for now that all contributions will be in 
Java and worry about it once somebody wants to contribute stuff in other 
languages.


--Thilo



Re: SVN organization

2006-11-10 Thread Marshall Schor

Michael Baessler wrote:

> But if I'm honest, after this long discussion I do not remember all the
> details of each of the suggestions. I would like to have a summary of
> the two suggestions and then start a vote on it.

Many projects save votes for those times where there is not a clear
consensus.  I hope we don't need a vote here :-)


Here's another slight variation on the proposal: under incubator/uima,
the following structure:


/common            <<< the shared top node
   /trunk
      /uima-website
      /uima-docbook

/uimaj
   /trunk
      /uimaj-x     <<< the Java framework. Java examples, build scripts, Eclipse tooling etc.


 ** future **
/uimacpp           <<< not uimac because that is a funny word :-)
   /trunk
      /uimacpp-x   <<< various c modules

/sandbox           <<< maybe no branches/tags and therefore no need for trunk
   /project1

/test-corpora
   /project1 etc.

Feel free to improve :-)

-Marshall


Re: SVN organization

2006-11-10 Thread Michael Baessler

Marshall Schor wrote:

> I think the bottom line is that both approaches can be argued for -
> I don't have strong preference, except I do like to (in general)
> split up things into more manageable-sized chunks.  So, I think I would
> come down on the side of having separate top-level projects for the
> uima-java and uima-cpp things - they're pretty big pieces of code.

I also think that having separate top-level projects for uima-java and
uima-cpp is better.

Especially when thinking about branching code...
But if I'm honest, after this long discussion I do not remember all the
details of each of the suggestions. I would like to have a summary of
the two suggestions and then start a vote on it.

> Maybe this means we might want a project for "smaller" things that
> don't belong in either one (like the docbook).  We could collect these
> things into a general top-level thing called uima-shared - and if it
> got too big, we could consider splitting it.

Sounds good to me, to separate shared uima stuff into its own top-level
project.


-- Michael



Re: SVN organization

2006-11-09 Thread Marshall Schor

Adam Lally wrote:


Even if there's no technical barrier (seeing as svn realizes branches
and tags as copies anyway), to me it's a more natural mental model if
branches/tags actually correspond to a single artifact that will be /
has been released together.

I think the bottom line is that both approaches can be argued for -
I don't have strong preference, except I do like to (in general)
split up things into more manageable-sized chunks.  So, I think I would
come down on the side of having separate top-level projects for the
uima-java and uima-cpp things - they're pretty big pieces of code.

Maybe this means we might want a project for "smaller" things that don't
belong in either one (like the docbook).  We could collect these things
into a general top-level thing called uima-shared - and if it got too big,
we could consider splitting it.

-Marshall


Re: SVN organization

2006-11-09 Thread Adam Lally

On 11/9/06, Marshall Schor <[EMAIL PROTECTED]> wrote:


>> So branching and tagging aren't very straightforward in that scenario.
>
> I'm not sure this is true.  Yes, creating a branch or tag is copying the
> whole project, but it is a lazy copy.  So, yes, it would "collect a bunch of
> things that didn't belong" - but does that cause a problem?


It doesn't make it impossible, just less straightforward.  By
straightforward I mean I can create a tag "version 1.2.3" that
contains only things that are actually in version 1.2.3, instead of
dragging along things that have nothing to do with that version.

Even if there's no technical barrier (seeing as svn realizes branches
and tags as copies anyway), to me it's a more natural mental model if
branches/tags actually correspond to a single artifact that will be /
has been released together.

-Adam
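
To make the difference concrete, here is a small sketch that models a tag as a copy of everything under a trunk and lists what each layout would pull in. The project name uimacpp-core is hypothetical, used only to stand in for a future C++ module; uimaj-core and uimaj-tools are the names mentioned elsewhere in this thread.

```python
def tag_contents(paths, trunk):
    """Return the project paths (relative to trunk) that a copy of trunk would contain."""
    prefix = trunk.rstrip("/") + "/"
    return sorted(p[len(prefix):] for p in paths if p.startswith(prefix))


# Layout 1: one shared trunk for all frameworks.
shared_layout = [
    "uima-sdk/trunk/uimaj-core",
    "uima-sdk/trunk/uimaj-tools",
    "uima-sdk/trunk/uimacpp-core",  # hypothetical C++ module
]

# Layout 2: a separate trunk per framework.
separate_layout = [
    "uimaj/trunk/uimaj-core",
    "uimaj/trunk/uimaj-tools",
    "uimacpp/trunk/uimacpp-core",  # hypothetical C++ module
]

if __name__ == "__main__":
    # Tagging a Java-only release from the shared trunk drags the C++ module along.
    print(tag_contents(shared_layout, "uima-sdk/trunk"))
    # Tagging from the Java trunk contains only Java projects.
    print(tag_contents(separate_layout, "uimaj/trunk"))
```

With the shared trunk, a Java-only tag lists the C++ module as well; with separate trunks it lists only the two Java projects.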


Re: SVN organization

2006-11-09 Thread Marshall Schor

Adam Lally wrote:

On 11/9/06, Marshall Schor <[EMAIL PROTECTED]> wrote:


>> I like having more synchronization (as an option, not as a
>> requirement:-)
>>
>> If we have, say, the C# framework under uima-sdk, and it has an
>> unsynchronized release schedule - would that be a problem or not?
>
> It might be.  If we wanted to create an SVN tag for the unsynchronized
> release that could be trouble.  If we just created a tag for the whole
> trunk we'd collect a bunch of things that didn't really belong in the
> release.  The same thing for branches, if we wanted e.g. a bug fix
> branch for just the Java version that was released while the C++ went
> on with development.  So branching and tagging aren't very
> straightforward in that scenario.

I'm not sure this is true.  Yes, creating a branch or tag is copying the
whole project, but it is a lazy copy.  So, yes, it would "collect a bunch of
things that didn't belong" - but does that cause a problem?

There is even a way - if you want to create a tag with a mixture of
files at different revision levels - to do this.

Can anyone come up with a scenario where there would be trouble
with this approach?

-Marshall



Re: SVN organization

2006-11-09 Thread Adam Lally

On 11/9/06, Marshall Schor <[EMAIL PROTECTED]> wrote:


> I like having more synchronization (as an option, not as a requirement:-)
> If we have, say, the C# framework under uima-sdk, and it has an
> unsynchronized release schedule - would that be a problem or not?


It might be.  If we wanted to create an SVN tag for the unsynchronized
release that could be trouble.  If we just created a tag for the whole
trunk we'd collect a bunch of things that didn't really belong in the
release.  The same thing for branches, if we wanted e.g. a bug fix
branch for just the Java version that was released while the C++ went
on with development.  So branching and tagging aren't very
straightforward in that scenario.

Given that, it may be best to avoid this structure if we imagine
unsynchronized releases.  But do we?  Also, do we know if the Incubator
(since they approve releases) has any issue with us saying we want to
release PART of our project but not some other part?

Of course, we can still choose to synchronize releases even if we have
different top-level directories (different trunks) for the different
frameworks -- if we want to do a tag we just have to apply the same
SVN operation separately to each of the top-level directories.

-Adam
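
As a rough sketch of that last point, a synchronized release across separate trunks amounts to one copy per top-level directory. The code below only builds the command lines. The repository root URL, the tags/ directory next to each trunk, and the tag name are assumptions; the thread does not specify them.

```python
# Hypothetical repository root; the real URL is not given in the thread.
REPO_ROOT = "https://svn.example.org/repos/incubator/uima"


def tag_commands(root, top_levels, tag):
    """Build one 'svn copy' command per top-level directory for the same tag name."""
    return [
        ["svn", "copy",
         f"{root}/{top}/trunk",
         f"{root}/{top}/tags/{tag}",  # assumes a tags/ directory beside trunk/
         "-m", f"Tag {top} as {tag}"]
        for top in top_levels
    ]


if __name__ == "__main__":
    for cmd in tag_commands(REPO_ROOT, ["uimaj", "uimacpp"], "1.2.3"):
        print(cmd)
```

An unsynchronized release would simply pass a single top-level directory instead of the full list.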


Re: SVN organization

2006-11-09 Thread Marshall Schor

Adam Lally wrote:

> I agree with that meaning of SDK, but seem to have come to the
> opposite conclusion. :)  What I was proposing as uima-sdk does break
> down into uimaj-core, uimaj-tools and uimaj-examples, among other
> things.  It really does capture the whole SDK.

OK, I see your point.  The uima-sdk would be quite different from
"corpora" or "sandbox".

> I do agree that the deciding factor should be whether these code bases
> are released as one.  A related question would be whether they have
> the same version number.  In the past our C++ version numbers have not
> been in sync with our Java version numbers, but I don't think we were
> ever entirely happy with that.  Maybe a tighter synchronization of
> releases would be a good thing.

I like having more synchronization (as an option, not as a requirement:-)
If we have, say, the C# framework under uima-sdk, and it has an
unsynchronized release schedule - would that be a problem or not?

-Marshall


Re: SVN organization

2006-11-09 Thread Adam Lally

> To me, the "SDK" has a meaning of adding tools, examples, etc. to a core
> thing.  Using it as a top-level collection name for the various
> framework implementations seems a bit "off".


I agree with that meaning of SDK, but seem to have come to the
opposite conclusion. :)  What I was proposing as uima-sdk does break
down into uimaj-core, uimaj-tools and uimaj-examples, among other
things.  It really does capture the whole SDK.



> I think our big code bases (Java, C++, maybe others in the future - e.g.
> C#, JavaScript) could go into their own top-level things.  One criterion
> to balance here is independence of releases.  Each top level thing might
> reasonably be assumed to be release independent from other top level
> things. This isn't quite true with C++ and Java - they often have some
> (weak) dependencies due to naming issues, usually.


I'm not quite getting why Java-C++ have a tighter coupling than
Java-C# might have.

I do agree that the deciding factor should be whether these code bases
are released as one.  A related question would be whether they have
the same version number.  In the past our C++ version numbers have not
been in sync with our Java version numbers, but I don't think we were
ever entirely happy with that.  Maybe a tighter synchronization of
releases would be a good thing.

-Adam


Re: SVN organization

2006-11-09 Thread Marshall Schor
To me, the "SDK" has a meaning of adding tools, examples, etc. to a core 
thing.  Using it as a top-level collection name for the various 
framework implementations seems a bit "off".


I think our big code bases (Java, C++, maybe others in the future - e.g.
C#, JavaScript) could go into their own top-level things.  One criterion
to balance here is independence of releases.  Each top level thing might
reasonably be assumed to be release independent from other top level
things. This isn't quite true with C++ and Java - they often have some
(weak) dependencies due to naming issues, usually.


Other top-level things might be "test-corpora" if we get into that business.

-Marshall

Adam Lally wrote:

Our SVN structure is currently something like this:
site/
  site_publish/
    ** the html files for our website are currently here

uimaj/
  trunk/
    uima-website/
      ** website sources are here
    uima-docbooks/
      ** documentation is here
    uimaj-/
      ** code is here



I like the idea of the top-level separation between the website and
the code.  I think we should move the uima-website project to
underneath site.  (We have to do something to consolidate these,
soon.)

Another decision to make is about "trunk" directories.  I think we
should have a trunk directory under each top-level directory.  This
allows us to create branches and tags separately for the different
top-level things (site and uimaj).

I also wonder whether "uimaj" is the right name for the top level
directory.  When we add C++ support, do we want that as a separate top
level "uimacpp" directory?  That would mean that it would have
separate branches/tags from uimaj.  I'm not sure if that's good.  When
we make a release wouldn't we want to tag all of the code for Java and
C++ together?  Maybe "uima-sdk" is a better name, and we'd expect to
add the C++ projects there later?

So to recap I'm thinking of a structure like this:

site/
  trunk/
    uima-website/
      ** xml sources and generated html files go here

uima-sdk/
  trunk/
    uima-docbooks/
      ** documentation is here
    uimaj-/
      ** code is here
    uimacpp-/ [someday]

uima-sandbox/ [someday, possibly]



How does that sound?
 -Adam