Back to origin issue (before breaking down 'storm-core'), turned out
IntelliJ doesn't recognize relocated classes within project. That's why
build (via Maven) for master branch succeeds but IDEA compile doesn't.

There're some issues filed but no action has been made.
https://youtrack.jetbrains.com/issue/IDEA-93855
https://youtrack.jetbrains.com/issue/IDEA-126596

So suppose we have two modules A and B within project, and A relocates L to
Lr.
B relies on A's method which returns a class of Lr or has parameters for a
class of LR, B needs to use Lr rather than L, and Lr is not recognized from
B.

Moving 'storm-drpc-server' to 'storm-core' may help but it's not a nice
solution though. (think about why we add new module 'storm-drpc-server')
To minimize dependency for worker (which actually affects end users) we
should break down 'storm-core' and it will remain to be headache.

There seemed to be little workarounds.

1. Guide IDEA users to take hacky workaround.

Quoting https://youtrack.jetbrains.com/issue/IDEA-93855#comment=27-1838157:
"A hacky workaround is to make the module in intellij with the dependency
depend on the jar explicitly in target/. This at least allows things to
compile and tests to run."

That is really bad and annoying, but we might have no choice when we don't
want to take other workarounds.

2. Maintaining separate project for relocated dependencies.

This avoids contributors to take hacky workaround so good to go, but
maintaining relocated artifacts might be another headache, and I'm not sure
ASF (or LICENSE of relocated targets) allows to do that.

3. Minimize (or remove) relocate targets and/or don't relocate troublesome
targets.

For 'storm-drpc-server', there seems to be three troublesome targets:

- 'thrift'
- 'codahale metrics'
- 'jetty server' (We may be able to move this to 'storm-drpc-server' when
another webapp port is done.)

If we are OK to give up relocating those things we might be OK for now. We
may want to extend the list when we break down more modules from
'storm-core'.

Btw, IMHO relocating is not a good option. Elastic gives up shading
anything for 2.0. (https://www.elastic.co/blog/to-shade-or-not-to-shade)
Someone might feel that it's a regression, but we need to decide to do it
when it can provide better shape.

Please add ideas if you have any, and give your opinions about above
options.

Thanks,
Jungtaek Lim (HeartSaVioR)

2017년 3월 28일 (화) 오후 10:23, Bobby Evans <[email protected]>님이 작성:

Sure I am happy to help out how I can.  I really would like to spend more
time on storm, but sadly work has shifted and my team got 2 new projects
recently, but we have not increased the head count to cover it yet, so I am
swamped.  But if you do need help with some of these let me know and I'll
see what I can do in my spare time.


- Bobby

On Tuesday, March 28, 2017, 2:10:46 AM CDT, Jungtaek Lim <[email protected]>
wrote:Bobby,

I just tried to follow your suggestion and found it's less error-prone
compared to my approach, and has lots of benefits. (I am seeing the great
chance to minimize dependencies for 'storm-client', say, Worker.)

Thanks for the suggestion. I'm working on this now. I'll mention you when I
finish working this, or need your help.

Thanks,
Jungtaek Lim (HeartSaVioR)

2017년 3월 28일 (화) 오전 8:15, Jungtaek Lim <[email protected]>님이 작성:

I think we could also fix this issue for separating 'storm-core' and
'storm-webapp' (rename from 'storm-drpc-server'), since local cluster
doesn't need to have 'storm-webapp', DRPC server (local DRPC will still be
in 'storm-core'), UI, Logviewer. That's what I'm working on, which seems to
require heavy efforts.

Your plan looks really promising, but in other perspective this plan is
even much harder to address.
Do you have time frame for working on this? If you can finish the work in
time frame so that it can be included in 2.0.0, I'll just discard my work
and move forward to port other things (logviewer, ui) first.

Regarding local mode, exposing local mode provides easy debug functionality
with IDE, and hiding it takes away such functionality. We have
ConfigurableTopology for 2.0.0 which helps to remove ceremony code, so
exposing is not that bad.

Thanks,
Jungtaek Lim (HeartSaVioR)

2017년 3월 27일 (월) 오후 11:10, Bobby Evans <[email protected]>님이 작성:

+1 for fixing dependency/IDE issues, but I am not sure it is as simple as
what you describe.

The issue is that there is no clean way to get local mode without pulling
in almost all of the daemons too.  If we are going to go through the pain
of separating them out, I would prefer to do it once and do it right.  I am
happy to help out with this, as it is something I have been thinking about
for a while, but just haven't found the time to tackle on my own.
First we need a good way to give a control to our users about the base
classpath of the worker, ideally the JVM version too.  We have been doing a
really good job with rolling upgrades and I think it would be great if we
could have multiple versions of storm/JVM installed on the worker nodes and
the end user can pick what JVM and what version of storm they want their
worker to run with.  We can argue over details of how that would work
later. The point is that it lets us make changes to the classpath in very
drastic ways without breaking end users.

Second we need a better way to hide local mode.  Every example we have
supports local mode which means we will ship a copy of the storm daemons in
each topology jar if we pull them out of the default classpath.  We need to
be able to run existing topologies that do not have "local mode support" in
local mode.  We should be able to make storm-submitter work, there are
already stubs for this kind of thing, but we may need to play around with
DRPC and a few other APIs to make it transparent.

We then create new jars from the existing storm-core and storm-drpc-server.

storm-client - Just what the client and worker needs.  The only external
dependencies are logging and possibly metrics.
storm-local - This would pull in local mode dependencies (almost everything
in storm core).  We might even make it a test jar.

storm-daemon - all of our daemon processes (most if not all shading
removed).  We can subdivide this more if we want to.

storm-core would go away or just pull in storm-client.
The storm jar command would by default only pull in storm-client and its
dependencies.  If you wanted local mode you could add in a flag that would
adjust the classpath, boot up a local mode cluster, change the client to
transparently interact with that instead of a regular cluster, and jump to
the end users main.  There could also be an option to just include
everything on the classpath without the local mode cluster.  Ideally if we
include everything on the classpath with storm jar, that would also add a
flag that would make the supervisor include everything on the classpath
when launching the worker.


- Bobby

On Monday, March 27, 2017, 12:11:44 AM CDT, Jungtaek Lim <[email protected]>
wrote:Hi devs,

I took a first step of finalizing port work via resolving dependency issue
with DRPC.

Here's what I'm giving a try:
- rename 'storm-drpc-server' to 'storm-webapp'
- remove 'storm-core' from 'storm-drpc-server'
-- 'storm-drpc-server' will have its own library directory or shaded jar
- create 'storm-common' and extract all the things used for both
'storm-core' and 'storm-webapp'

It requires numerous files to be moved to, and huge code block should be
moved / modified. A bit painful to work on.

Other approach would be separating 'storm-worker' (or 'storm-client') and
'storm-daemon', and link to different libraries directory.
(Maybe we could make uber jar for 'storm-daemon'.)
This also requires similar work and maybe introduce more big effect to
users.

Other than above ideas I don't have any other ideas. We're shading
libraries which are both needed from 'storm-core' and 'storm-drpc-server'
which in turn makes known issue - able to build with maven but IDE can't
compile 'storm-drpc-server' project.

Please share other ideas if you have one.

Thanks,
Jungtaek Lim (HeartSaVioR)

Reply via email to