Christian Weisgerber wrote:

Some preliminary discussion at the last hackathon produced the
opinion that even Java ports should be built from source by all
means.  However, that discussion didn't include any of our porters
who are interested in Java...

[My apologies if this appears twice; I posted it earlier today but it didn't show up.]

I have four comments below. My perspective is that of an OpenBSD porter of alternative tool chains (JamVM, Cacao, and classpath) and of a user of J2EE frameworks for web applications.

These comments are supported by the following notions:
 1) .jar files are a kind of distfile;
 2) source availability is an important debugging aid;
 3) when we trust the jar's creator we don't have to recompile;
 4) a pre-built OpenBSD package should be worthy of being trusted;
 5) source includes javadoc;
 6) reducing or merging .jars raises upstream support problems;
 7) recompile or not, packages with jars and libs should be usable by all tool chains;

SUMMARY
=======

 -- OpenBSD should create a "verified repository" for trusted builds of
pure Java frameworks.  (They work on all platforms with identical
bytecodes.);
 -- a port produces .jars, native libs, and javadoc, from source, usable
by all tool chains; *BUT* only for the specific framework/application,
not for the pure Java dependencies, which are obtained from the verified
repository;
 -- an OpenBSD package bundles .jars and libs for all tool chains;
 -- a Java specific tool such as Maven may be a candidate for
repository support.

Fred

[The discussion is intended to be self-contained for those that are not familiar with common practices in the Java community. These are my views.]


PURE JAVA PORTS
===============

A port of a pure Java framework is trivial.  I just copy the .jar or
unzip the .zip or .tar.gz.  If I need the source for debugging, I get
the source and the Javadocs.  If I can't do this, I don't choose that
framework technology and I move on to something else.  This is the way
the Java community works.  There are often several alternatives.  I get
to decide this because I'm the system architect (for apps in my org).

Copy deployment works for all pure Java frameworks on all platforms
supporting Java.  The jar is a distfile that I don't have to compile if
I am willing to trust its creator.  But it sure would be nice to know
that it matches a checksum.
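
To make the checksum idea concrete, here is a minimal sketch of such a check using only the standard JDK MessageDigest API.  The class name JarChecksum and its methods are made up for this illustration, and the choice of SHA-256 is an assumption; this is not how the ports tree does it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of any ports tool.
public class JarChecksum {

    // Hex-encode a digest, two lowercase hex digits per byte.
    static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Every JDK is required to provide SHA-256, so a failure here is a bug.
    private static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // SHA-256 of an in-memory byte array, as lowercase hex.
    static String sha256Hex(byte[] data) {
        return toHex(sha256().digest(data));
    }

    // SHA-256 of a file, streamed so large jars are not loaded into memory.
    static String sha256Hex(Path file) {
        MessageDigest md = sha256();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return toHex(md.digest());
    }

    // True if the jar's digest equals the recorded hex digest.
    static boolean matches(Path jar, String expectedHex) {
        return sha256Hex(jar).equalsIgnoreCase(expectedHex.trim());
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: JarChecksum <file.jar> <expected-sha256-hex>");
            System.exit(2);
        }
        boolean ok = matches(Paths.get(args[0]), args[1]);
        System.out.println(ok ? "OK" : "MISMATCH");
        System.exit(ok ? 0 : 1);
    }
}
```

The expected digest has to come from somewhere I already trust; the code only tells me that the jar I hold is the jar that was recorded.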

The interesting frameworks have many dependencies.  Binary distributions
of those frameworks (e.g. Spring and Hibernate) may include *dozens* of
other .jars that they depend on, each of which may have much more source
code than the original framework.  So those embedded jars are their own
distfile, in addition to the source.

Where did those embedded .jars come from?  Clearly we are trusting the
upstream developer's released package.  I think a port of Spring should
include checking of the distfiles of the component jars.

Recompiling source is one way to accomplish this, but another is the ports "makesum" target.  The difference: do I want to trust the porter, or am I going to insist on seeing it built with my own eyes?

JAR FILE DUPLICATION & MULTIPLE VERSIONS
========================================

Second, there is a lot of jar-file duplication in real projects.
Xerces can occur several times in a real web app (just to pick one
favorite) depending on other frameworks being used. (You'll get three
copies from Struts, Spring, and Hibernate alone.)  They will all be
incarnated in the same JVM.  And they would not necessarily be the same
version level.  Individual projects are unsynchronized and older
versions are likely to be around for a long time.  This is reality.
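
As a rough illustration of how such duplicates could be spotted, the sketch below groups jar paths by artifact name and reports those present in more than one version.  It assumes jars are named <artifact>-<version>.jar with the version starting with a digit; jars that do not follow that convention are silently skipped.  The class name JarVersions is hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
public class JarVersions {

    // Assumed naming convention: <artifact>-<version>.jar,
    // where <version> starts with a digit (e.g. foo-bar-1.2.3.jar).
    private static final Pattern NAME =
            Pattern.compile("^(.+?)-(\\d[\\w.-]*)\\.jar$");

    // Map each artifact name to the sorted set of versions seen for it.
    static Map<String, Set<String>> versionsByArtifact(List<String> jarPaths) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (String path : jarPaths) {
            // Strip any leading directories; only the file name matters.
            String file = path.substring(path.lastIndexOf('/') + 1);
            Matcher m = NAME.matcher(file);
            if (!m.matches()) {
                continue; // no recognizable version in the name
            }
            result.computeIfAbsent(m.group(1), k -> new TreeSet<>())
                  .add(m.group(2));
        }
        return result;
    }

    // Keep only artifacts that appear in two or more distinct versions.
    static Map<String, Set<String>> conflicts(List<String> jarPaths) {
        Map<String, Set<String>> all = versionsByArtifact(jarPaths);
        all.values().removeIf(versions -> versions.size() < 2);
        return all;
    }

    public static void main(String[] args) {
        // Pass jar paths on the command line, e.g. the output of find(1).
        System.out.println(conflicts(List.of(args)));
    }
}
```

Identical versions found in several places collapse into one entry, so the report lists only genuine version conflicts, not plain duplicate copies.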

Now in OpenBSD we like to rationalize the libraries in our ports and
eliminate duplicates, update versions where it makes sense, etc.,
because this improves security and fits the doctrine that OpenBSD is a
complete system, not a kernel+gnu.

Rationalizing the .jars across several Java frameworks with dozens of
jars is a lot of work. Developers often just don't do it and instead burn the resources at execution time. The upstream developer of any particular framework just does not care. Maybe they should.

Furthermore, rationalizing .jars may not change the security profile of
a Java application one bit.  The bigger problem is that if you try to
merge versions you are likely to introduce bugs, or certainly have
created a combination of jars that would be declared unsupportable by
the upstream developer.

How an application relates to the jars it uses by version seems to me
to be much less controlled in the Java world than in OpenBSD.  You
can call this sloppy engineering or whatever.  But I don't see what
value will be added by OpenBSD ports recompiling everything or merging
libraries.

Validating the checksums on .jars is valuable.

WHERE DO THE JARS GO?
=====================

So you compiled all the jars.  Where do you put them? It is not always
clear.

Do they go in
        /usr/local/lib/java ?

Or perhaps
        /usr/local/share/java/ ?

When I build apps for Tomcat, they need to go in
        webapps/<myapp>/WEB-INF/classes.

Jars end up in a lot of different places.  Defining the configuration of
an application, meaning placement of jars, is a *major* part of
deploying web apps.  There is no equivalent to /usr/local/lib in the
Java environment, and I mean not the JDK, which is well defined, but the
J(2)EE environments that are actually used and are always application
specific.

IMO the best thing we can do is provide a way of building up a
repository of trusted jars to be used in a deployment.  Somebody
compiles them up once, they get checked in and that's that.

Compiling opens the Pandora's box of where to put them.  I say compile
them once and put them all in a repository to be used as needed.  The
repository jars are for all platforms, and we can trust where we got them. We know where they came from.

This problem is so acute that the Apache Maven project attempts to
address it.  I am not a Maven maven.  Maybe we need to figure out how to
make an OpenBSD version of Maven that will support trusted .jars that
are built once.

PORTS WITH NATIVE DEPENDENCIES
==============================

Ports that require native platform support are a different matter.

Right now we have explicit ports dependencies on the Sun JDK tool chain
in the ports that are built from source.

This creates a problem for *alternative* tool chains.  For example,
after some fumbling around I can fake out our Eclipse package, which has
explicit dependencies on the Sun JDK, and run it on classpath, jamvm,
and cacao.

To the best of my knowledge there is no way to tell the ports
system "make sure a 1.4.2 JVM is installed."  (This would be a
categorical dependency, rather than a dependency on a specific version
or series of versions.)
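
To show what I mean by a categorical check, here is a small sketch that asks whichever JVM is running for its version, regardless of vendor, and compares it to a required level.  Reading "a 1.4.2 JVM" as "1.4.2 or later" is an assumption on my part, and the class name JvmRequirement is made up; nothing like this exists in the ports system as far as I know.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
public class JvmRequirement {

    // Leading dotted-numeric part of a version string, e.g. "1.4.2" in "1.4.2_05".
    private static final Pattern NUMERIC = Pattern.compile("^\\d+(\\.\\d+)*");

    // Parse "1.4.2_05" into {1, 4, 2}; any suffix after the numbers is ignored.
    static int[] parse(String version) {
        Matcher m = NUMERIC.matcher(version);
        if (!m.find()) {
            throw new IllegalArgumentException("not a version: " + version);
        }
        String[] parts = m.group().split("\\.");
        int[] nums = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            nums[i] = Integer.parseInt(parts[i]);
        }
        return nums;
    }

    // True if 'installed' is the same as or newer than 'required'.
    // Missing components count as zero, so "1.4" is older than "1.4.2".
    static boolean atLeast(String installed, String required) {
        int[] a = parse(installed);
        int[] b = parse(required);
        for (int i = 0; i < Math.max(a.length, b.length); i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return x > y;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // java.version is a standard system property on any JVM.
        String installed = System.getProperty("java.version");
        boolean ok = atLeast(installed, "1.4.2");
        System.out.println(installed + (ok ? " satisfies" : " does not satisfy")
                + " the 1.4.2 requirement");
    }
}
```

The point is that the test names a level of Java, not a particular vendor's package, so JamVM or Cacao would pass it just as the Sun JDK does if they report a suitable version.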

The good news is that our JDK built binaries are perfectly compatible
with these alternative tools.  (The FUD about compatibility issues is
just that.)  There is no reason why I should have to hack a package to
use its native binaries and jars.

Tool chain dependencies are the most significant issue in my opinion.

Fred
