[ 
https://issues.apache.org/jira/browse/AVRO-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982587#action_12982587
 ] 

Holger Hoffstätte commented on AVRO-735:
----------------------------------------


> The initial move did not try and solve this problem because it is
> trickier than it looks and would complicate the initial split ticket
> significantly.

That's fine - probably would have done the same thing initially. What I don't 
understand is: if the separate artifacts are tied to each other anyway, why 
split them up in the first place?
If the dependencies of ipc, mapred and tools were the motivation, then maybe
avro, compiler and ipc should have stayed one project, with ipc's
transport-specific dependencies separated out.
Sorry, not blaming anyone; I came too late to the party to suggest otherwise :/

> There are some cases that would be simple moves, but there are others
> where a simple move would require changing classes or methods with
> package scope visibility to public, which is not acceptable.  To
> avoid this there is more refactoring required and some API breakages.

I understand, but why does this preclude doing the easy/nonbreaking things now? 
I mean, I really cannot see any reason why avro:..ipc has to contain generic 
IOStream classes just because they are also used in ipc. Move them to avro:..io 
and done. Same for the AvroRemoteException - it can stay in avro alright, but
there is no harm in moving it into avro..io or wherever. Bang: one down.

> This is a change that I agree we should attempt, but I'm not
> convinced that we should do so before 1.5.0 or that it is even
> possible.  If it is we could introduce the resulting API breakages in
> a later release.  1.5.0 may be very soon.

..which is why I wanted to fix the easy (really trivial) things now; I am not
at all suggesting the full surgery at the last minute. My understanding is that
1.5 is already another break compared to previous versions (see the 
Hadoop-related drama). Selling more breaks later will just get harder and 
harder.
Can't really put the horseshoes under the horse when it's out of the barn..

> Conceptually, the requirement that we can't share packages across
> jars means that avro-ipc can only use public API's to work with avro
> -- and that may never be desirable.

So have public and private APIs?! No need to rely on package overlaps for that.

> It's not possible to build avro-ipc and avro using Maven in the same
> project -- avro-ipc requires compiling schema files into Java
> classes.  In order to compile those schema files, the build needs to
> have already created the avro-compiler artifact which depends on
> avro.

I think that just shows that those three really belong together, and that 
dependency problems come from ipc, mapred and the tools. avro, compiler and ipc 
together are still pretty small.

> Would it be possible for OSGi to simply not support a smaller bundle
> than avro + avro-ipc?  I think all other components can separate
> cleanly by package. Alternatively, we could build a variation
> avro-ipc.jar that shades in avro.jar that could be the smallest unit
> for OSGi.  This however would mean that all Avro users have to pull
> in jetty and netty even if they aren't using those features.

I'm not convinced that trying to build "special" artifacts is going to fix
anything in the short, medium or long run. As an example, it's fairly
easy to embed the avro/compiler/ipc trifecta and just block the imports that a
bundle doesn't need (assuming the bundle has service-like standalone
functionality). But that workaround would be needed for no good reason
whatsoever, would increase bloat, and would cost everyone time over and over
again.
I fully agree that not every jar has to be bundleized by itself (as some people 
try and complain about..), but if the jar is useless on its own without a set 
of add-ons - why are they separate in the first place?

Maybe I should have explained my initial motivation for all this earlier :)
I intend to use avro-ipc as a transport layer for OSGi RemoteServices, and 
probably would have been fine with split packages etc. since I can just embed 
the jars into the transport bundle and block stuff I don't need, as described 
above. But since 1.5.0 is already a breaking release I figured we can fix the 
easy things now, so that I can go spelunking on the not-so-easy things 
afterwards, for 1.6/2.0.

> Another approach would be to trim the dependencies from avro-ipc down
> by removing implementations like netty and jetty.  Then we could have
> a separate jar with those implementations, which could be in a
> different package.

This would have been my step 3 or 5 :)
Definitely a good way forward and also very useful for non-OSGi (plain maven 
etc.) users.

Not sure if that helped? I don't want to hold up the release.


> Split packages across artifacts
> -------------------------------
>
>                 Key: AVRO-735
>                 URL: https://issues.apache.org/jira/browse/AVRO-735
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.5.0
>            Reporter: Holger Hoffstätte
>             Fix For: 1.5.0
>
>
> I was glad to see the ongoing work for a more modular build (thanks Scott 
> Carey!). Whilst looking into the cross-platform IPC facilities for use in 
> OSGi I noticed something that makes OSGi compatibility (and maintenance) more 
> difficult than necessary, for no good reason. I plan to submit OSGi bundle 
> patches later (though not necessarily for the 1.5.0 release) so this is a 
> necessary prelude.
> The term "split packages" refers to the situation that two artifacts carry 
> the same packages, which means that the classes in both packages are more or 
> less randomly munged together at runtime. This unfortunate situation is 
> "mostly" without consequence in "normal" flat-classpath Java (assuming there 
> are no overlaps!), but bad for OSGi since class visibility & wiring is based 
> on package visibility. Split packages generally make any form of automatic 
> package resolution (for deployment) almost impossible.
> As far as I can see there are several classes in packages across artifacts 
> that can easily be moved a bit without really disturbing anything. Some 
> examples:
> org.apache.avro.specific is defined by avro, compiler AND ipc
> org.apache.avro.ipc (!) is defined in avro and contains classes that could go 
> into avro:avro.io (the buffers) or avro-ipc:org.apache.avro.ipc
> It seems that the previously unmodular package membership of classes has been 
> carried over during the artifact separation. I'd like to see this cleaned up 
> as well before the 1.5.0 release, as this is a breaking change. However, most 
> of the overlaps can be fixed easily with IDE refactorings like package 
> renaming or by moving classes.
> Please let me know if this is an acceptable change and if you want me to 
> provide help/patches etc.
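
For anyone who wants to check their own set of jars for the split packages
described in the issue, here is a minimal sketch, not part of any patch on
this ticket. The class name SplitPackageFinder and its two helper methods are
made up for illustration; only the JDK's java.util.jar API is used. It lists
every package that contains .class files in more than one of the jars given
on the command line.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Avro: reports packages defined by more than one jar.
public class SplitPackageFinder {

    /** Collects the package names of all .class entries in one jar. */
    public static Set<String> packagesOf(Path jar) throws IOException {
        Set<String> packages = new TreeSet<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                int slash = name.lastIndexOf('/');
                // Classes in the default package (no slash) are ignored.
                if (name.endsWith(".class") && slash > 0) {
                    packages.add(name.substring(0, slash).replace('/', '.'));
                }
            }
        }
        return packages;
    }

    /** Maps each package owned by two or more artifacts to the artifacts that define it. */
    public static Map<String, Set<String>> findSplitPackages(
            Map<String, Set<String>> packagesByArtifact) {
        Map<String, Set<String>> owners = new TreeMap<>();
        packagesByArtifact.forEach((artifact, packages) -> {
            for (String pkg : packages) {
                owners.computeIfAbsent(pkg, k -> new TreeSet<>()).add(artifact);
            }
        });
        // Keep only the packages that really are split.
        owners.values().removeIf(artifacts -> artifacts.size() < 2);
        return owners;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Set<String>> packagesByArtifact = new TreeMap<>();
        for (String arg : args) {
            Path jar = Paths.get(arg);
            packagesByArtifact.put(jar.getFileName().toString(), packagesOf(jar));
        }
        findSplitPackages(packagesByArtifact).forEach((pkg, artifacts) ->
                System.out.println(pkg + " is split across " + artifacts));
    }
}
```

Run it with the jar paths as arguments (for example the avro, compiler and
ipc jars from a local build). Any line it prints names a package that OSGi
could not wire to a single exporting bundle.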

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
