I agree there should be as little duplication as possible. But our ultimate goal in doing this is to have more control over the Apache content in the Maven repository.

I'm currently working with the Avalon group to resolve duplicates there. This will be an ongoing process: we will need to work through the list of duplicates and decide how each one will be managed.

If you look, for instance, at the Avalon project, they maintain a pseudo Maven repository structure for jars, mixed in with the nightly build/distribution structure under dist:

avalon
  -> binaries
  -> sources
  -> jars

So in many cases the mapping is:

/java-repository/avalon-subproject/jars --> dist/avalon/subproject/jars


In terms of the Maven repository, the real objective is primarily to provide a "canonical Maven repository" for Apache projects that's maintained at Apache. If you're publishing jar files in dist under any other location, one of the following needs to happen:


1.) Your project-id's jars directory in java-repository should be symlinked to the jars directory you already publish under dist.

or

2.) You should migrate your publishing of jar distributions to /java-repository/your-project-id.
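
As a rough illustration of option 1, here is a minimal Python sketch that creates the directory-level symlink described above. The function name `link_jars_dir` and its parameters are hypothetical, it assumes a POSIX filesystem, and it is not the script mentioned elsewhere in this thread.

```python
import os
from pathlib import Path


def link_jars_dir(dist_jars, repo_root, project_id):
    """Create <repo_root>/<project_id>/jars as a symlink to dist_jars.

    All names here are illustrative; adjust the layout to your project.
    """
    dist_jars = Path(dist_jars).resolve()
    if not dist_jars.is_dir():
        raise NotADirectoryError(f"{dist_jars} is not a directory")

    project_dir = Path(repo_root) / project_id
    project_dir.mkdir(parents=True, exist_ok=True)

    link = project_dir / "jars"
    # Refuse to overwrite anything: an existing jars directory probably
    # holds duplicated files that should be checked by hand first.
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists")

    os.symlink(dist_jars, link, target_is_directory=True)
    return link
```

For example, `link_jars_dir("dist/avalon/subproject/jars", "java-repository", "avalon-subproject")` would correspond to the mapping shown earlier. The sketch deliberately fails when the target already exists rather than replacing it.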

This has been going on via ibiblio for the last year or so in a very unscalable fashion. Trying to release a version, or even a snapshot, onto ibiblio is a painful process to get started with, especially if you're very new at it.



Breakdown of the duplicated files in java-repository that previously existed on dist:

Avalon: ~86 (43 + md5 for each file)
Cactus: 3 (without md5 sums)
Jakarta Commons: 3 (without md5 sums)
Lucene: 2 (without md5 sums)
Turbine: 2 (without md5 sums)

Breakdown of duplicate files within the repository itself:

~35 files (with duplicate md5 sums)

So the primary projects I'm working on are:

A.) Work with Avalon to clean up duplication issues.

B.) Work in java-repository to deal with duplication by making sure symlinks are used between snapshot distributions and existing builds.

It's also important to note that a snapshot release of a jar in the repository may have the same size and content as another snapshot of that jar. The md5 sums will be equal and the contents are essentially the same, but they were built on different dates.
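
To make the md5 comparison concrete, here is a small Python sketch of how duplicates like these could be found by grouping files on their md5 digest. It is only an illustration, not the tooling used to produce the counts in this message; `md5_of` and `find_duplicates` are hypothetical names.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of(path, chunk_size=65536):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Map md5 digest -> list of regular files under root sharing it.

    Symlinks are skipped, so files that have already been replaced by
    links are not reported again.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and not path.is_symlink():
            by_digest[md5_of(path)].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Two snapshots built on different dates but with identical contents would land in the same group, which is what makes them candidates for symlinking.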

On top of this, there are the following duplicates that have nothing to do with our move:

perl: 20 duplicate files
velocity: 5 duplicate files
log4j: 25 duplicate files
http: 4 duplicate files
xalan: 1 file

Finally, I suspect we will eventually work out permissions so that only members of a specific group can actually publish to that maven repository directory for their project.

Hope this clears up what's going on to resolve this issue.
-Mark Diggory

Noel J. Bergman wrote:

 -- numerous doubles: the repository contains a lot of stuff that
   is already somewhere else in 'dist/'.


The script I provided could be modified to make the appropriate symlinks.


If the repository could use symlinks, that might be good.  Much less traffic
to the mirrors, and the code stays in one place with the project(s).

--- Noel


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]


--
Mark Diggory
Software Developer
Harvard MIT Data Center
http://osprey.hmdc.harvard.edu
