Overall I support the proposal; this is what I was in favor of during the last discussion. I'd also propose changing the name of the artifact id and git repo from malhar to apex-library as part of the major version change. It will make the name more consistent with the rest of the project and will also allow using a 1.0 (1.0-SNAPSHOT) version that more closely reflects the maturity of the library.

Thank you,

Vlad

On 8/18/17 17:04, Thomas Weise wrote:
The proposed change does NOT require changes to existing apps. It follows
what you see for example in the JDK where older classes are retired without
making breaking major release changes.

Even though not required technically, it looks like the preference is to
make this change with a 4.0 release. I will therefore propose to change the
master branch to 4.0-SNAPSHOT and make this and other changes deemed
worthwhile.

To me it is more important that to newcomers the codebase does not look
like an outdated legacy situation and therefore if I have to make a choice
then pretty much any version number will do for that. So far I have seen
very little interest and initiative in addressing maintenance work like the
packages, checkstyle, CI stability, documentation etc. So it would be great
if people interested would step up and contribute.

Here is the next proposal for how to proceed:

    - create release-3.8 branch for last 3.8 release right away
    - update master to 4.0-SNAPSHOT as part of the open PR #664
    - identify further cleanup work for master
    - release 4.0

Thanks,
Thomas









On Thu, Aug 17, 2017 at 4:48 PM, Amol Kekre <a...@datatorrent.com> wrote:

The following pull requests should be taken up in 4.0.0. See my comments in
https://github.com/apache/apex-malhar/pull/664
https://github.com/apache/apex-malhar/pull/662

This merge should not be done without a consensus. This will require code
changes to existing apps.

Thks,
Amol





On Mon, Aug 14, 2017 at 7:48 PM, Thomas Weise <t...@apache.org> wrote:

Hi,

I opened the following PRs for the package change:

https://github.com/apache/apex-malhar/pull/662

Moves all classes with history retained (hence 2 commits). Also contains
checkstyle and other mechanical changes.

https://github.com/apache/apex-malhar/pull/664

Adds backward compatibility jar.

Once the above PRs are merged, the new artifact can be deployed and introduced
as a dependency in malhar-library.

Please review.

Thanks,
Thomas



On Sun, Jul 16, 2017 at 7:04 AM, Thomas Weise <t...@apache.org> wrote:

My original list of work items contained the b/w compatibility aspect; I
don't think there should be any confusion about whether it will be covered
here or not.

The proposed shading will provide the old classes and they will be frozen
as of release 3.7. That's the same as making a copy of the code and never
again making changes to the original classes. This cannot be accomplished
by using the older 3.7 release in your project because you cannot use 2
different versions of Malhar in parallel unless you apply shading.

The shaded artifact will only expose the com.datatorrent classes, and they
will be self-contained as the rest of the classes that they may depend on
are shaded. The shaded artifact does not evolve; there are no more changes
to com.datatorrent classes after they are relocated in master.

Thanks,
Thomas


On Sun, Jul 16, 2017 at 2:00 AM, Pramod Immaneni <pra...@datatorrent.com>
wrote:

I don't think we can limit the topic strictly to relocation without having
a good b/w compatibility story or at least one that goes far enough.

The shading idea sounds interesting. Why not let the shaded version move
forward with each release till we hit a major release? If it is going to
remain pegged at 3.7.0, why shade in the first place, as the regular 3.7.0
release would do the same job and it would be the same as the loss of b/w
compatibility with newer releases?

Thanks

On Sat, Jul 15, 2017 at 7:57 AM, Thomas Weise <t...@apache.org> wrote:

Discussing what in the future might become stable needs to be a separate
thread, it will be a much bigger discussion.

The topic here is to relocate the packages. With a few exceptions
relocation won't affect the semantic versioning. Semantic versioning is
essentially not effective for Malhar because almost everything is @Evolving
(and there are reasons for that.. -> separate topic)

I don't really like the idea of creating bw compatibility stubs for the
follow up PR. It creates even more clutter in the source tree than there
already is and so here is an alternative suggestion:

https://github.com/tweise/apex-malhar/blob/malhar37-compat/shaded-malhar37/pom.xml

Create a shaded artifact that provides the old com.datatorrent.* classes as
of release 3.7. Users can include that artifact if they don't want to
change import statements. At the same time they have an incentive to switch
to the relocated classes to take advantage of bug fixes and new
functionality.

I will work on the first PR that does the relocate. In the meantime, we can
finalize what backward compatibility support we want to provide and how.

Thanks,
Thomas




On Fri, Jul 14, 2017 at 11:33 AM, Pramod Immaneni <pra...@datatorrent.com>
wrote:

How about coming up with a list of @Evolving operators that we would like
to see in the eventual stable list and moving those, along with the
non-@Evolving ones, to org.apache.apex with b/w stubs, leaving the rest as
they are? Then have a follow up JIRA for the rest to be moved over to
contrib and be deprecated.

Thanks

On Fri, Jul 14, 2017 at 10:37 AM, Thomas Weise <thomas.we...@gmail.com>
wrote:

We need to keep the discussion here on topic, if other things are piled on
top then nothing gets done.

Most operators today are not stable, they are @Evolving. So perhaps it is
necessary to have a separate discussion about that, outside of this thread.
My guess is that there are only a few operators that we could declare
stable. Specifically, under contrib the closest one might have been Kafka,
but that is already superseded by the newer versions.

Thomas


On Fri, Jul 14, 2017 at 10:21 AM, Pramod Immaneni <pra...@datatorrent.com>
wrote:

We had a discussion a while back, agreed to relegate non-stable and
experimental operators to contrib and also added this to the contribution
guidelines. We executed on this and cleaned up the repo by moving
operators deemed non-suitable for production use at that time to contrib.
So I wouldn't say the operators that are at the top level today or the ones
in library are 0.x.x quality. Granted, we may need to do one more pass as
some of the operators may no longer be considered the right implementations
with the advent of the windowed operator.

Since contrib used to be the place that housed operators that required
third party libraries, there are some production quality operators in there
that need to be pulled out into top level modules.

I think we are in agreement that for operators that we consider stable, we
should provide a b/w stub. I would add that we strongly consider creating
the org.apache.apex counterparts of any stable operators that are in
contrib out in top level modules and have the com.datatorrent stubs in
contrib.

For the operators not considered stable, I would prefer we either leave the
current package path as is or, if we change the package path, create the
b/w stub, as I am not sure how widely they are in use (so, in essence,
preserve semantic versioning). It would be good if there is a followup JIRA
that takes a look at what other operators can be moved to contrib with the
advent of the newer frameworks and understanding.

Thanks

On Fri, Jul 14, 2017 at 9:44 AM, Thomas Weise <t...@apache.org> wrote:

Most of the operators evolve, as is quite clearly indicated by @Evolving
annotations. So while perhaps 0.x.x would be a more appropriate version
number, I don't think you can change that.

Thomas

On Fri, Jul 14, 2017 at 9:35 AM, Vlad Rozov <v.ro...@datatorrent.com>
wrote:

If the entire library is not stable, its version should be 0.x.x and there
should not be any semantic versioning enabled or implied. It evolves. If
some operators are stable, as the 3.8.x version suggests, the library should
follow semantic versioning and it is not OK to make a major change that
breaks semantic versioning across the entire library. This is not about
finding an excuse not to make a change. For me, semantic versioning and
backward compatibility are more important than a proper package name.

Thank you,

Vlad


On 7/14/17 09:11, Thomas Weise wrote:

Semantic versioning makes sense when you have a stable baseline. As long as
frequent fixes need to be made to address basic issues, it makes no sense
to declare operators stable, which is why they are marked evolving.

Instead of finding reasons to avoid changes that should have been made a
long time ago, why not discuss what a "stable operator" is and which ones
deserve to be in that category? Those are the ones that will get the
backward compatibility stub.

Thanks,
Thomas


On Fri, Jul 14, 2017 at 8:34 AM, Vlad Rozov <v.ro...@datatorrent.com>
wrote:

My preference is to agree that the next Malhar release is 4.0.0 and make
the proposed changes when 3.8.0 goes out. Otherwise, why keep semantic
versioning checks on Malhar if there is no version compatibility?

Thank you,

Vlad


On 7/14/17 08:11, Thomas Weise wrote:

next release is 3.8.0

On Fri, Jul 14, 2017 at 7:55 AM, Vlad Rozov <v.ro...@datatorrent.com>
wrote:

next release - 3.9.0 or 4.0.0?

Thank you,

Vlad

On 7/13/17 22:25, Thomas Weise wrote:

It is time to resurrect this thread and get going with the work.

For the next release, I will sign up to do the package move in Malhar:

https://issues.apache.org/jira/browse/APEXMALHAR-2517

In general this will be straightforward; most classes in Malhar are marked
evolving and it is trivial for users to change import statements. However,
I would suggest discussing whether there are selected operators for which
it is worth keeping a backward compatibility stub in the old location.

Here is my plan:

1. Relocate all classes in *lib* and *contrib* within one PR - this is all
   IDE automated work
2. Add backward compatibility classes, if any, in a separate PR
3. Create PR for Megh library to reflect moved classes

Thanks,
Thomas



On Wed, Mar 1, 2017 at 2:24 PM, Pramod Immaneni <pra...@datatorrent.com>
wrote:

Inline

On Mon, Feb 27, 2017 at 11:03 PM, Thomas Weise <t...@apache.org> wrote:

-->

On Mon, Feb 27, 2017 at 1:27 PM, Pramod Immaneni <pra...@datatorrent.com>
wrote:

For malhar, for existing operators, I prefer we do this as part of the
planned refactoring for breaking the monolith modules into baby packages
and would also prefer deprecating the existing operators in place.

Refactor into smaller modules was discussed for malhar-contrib and given
the overall state of that module I think it is OK to defer package renaming
there. I do however prefer to see the package rename addressed for other
modules, especially for the main library module.

Should we consider breaking the library into smaller modules as well? The
file/block operators for example probably can be in their own module from
just an organizational perspective.



This will help us achieve two things. First, the user will see all the new
changes at once as opposed to dealing with it twice (with package rename
and dependency changes) and second it will allow for a smoother transition
as the existing code will still work in a deprecated state. It will also
give a more consistent structure to malhar. For new operators, we can go
with the new package path but we need to ensure they will get moved into
the baby packages as well.

I think existing operators should be renamed so that git history remains. A
possible solution for backward compatibility could be to subsequently add
empty subclasses in the previous location (for existing concrete operators
that we know are actually in use) to simplify migration for users.

Yes we can do that.
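
The empty-subclass idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names RelocatedLineReader and LegacyLineReader are hypothetical, and the two nested classes stand in for classes that would really live in separate org.apache.apex and com.datatorrent packages.

```java
public class CompatStubSketch {

    // Hypothetical operator after relocation; stands in for a class
    // that would live under an org.apache.apex.* package.
    public static class RelocatedLineReader {
        public String process(String line) {
            return line.trim();
        }
    }

    // Hypothetical backward compatibility stub; stands in for a class
    // left at the old com.datatorrent.* location. It adds no behavior
    // and is deprecated to nudge users toward the relocated class.
    @Deprecated
    public static class LegacyLineReader extends RelocatedLineReader {
    }

    public static void main(String[] args) {
        // Existing user code that still refers to the old class keeps working
        // because the stub inherits everything from the relocated class.
        RelocatedLineReader op = new LegacyLineReader();
        System.out.println(op.process("  hello  "));
    }
}
```

With a stub like this, user code compiled against the old name keeps working, while all fixes land in the relocated class only.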

For demos, we can modify the paths as the apps are typically used wholesale
and the interface is typically manual interaction.

For core, if we are adding new api subsystems, like the launcher api we
added recently for example, we can go with new package path but if we are
making incremental additions to existing functionality, I feel it is better
to keep it in the same package. I also prefer we keep the package of the
implementation classes consistent with api, for understandability and
readability of the code. So, for example, we don't change package path of
LogicalPlan as it is an implementation of DAG. It is subjective, but it
will be good if we can also do the same with classes closely related to
the implementation classes as well. Maybe we can move these on a package
by package basis, like everything in com.datatorrent.stram.engine could be
moved. For completely internal components like buffer server, we can move
them wholesale. We can consider moving all api and classes, when we go to
next major release but would like to see if we can find a way to support
existing api for one more major release in deprecated mode.

The point of the major release is to enable backward incompatible changes
and I don't think it is realistic to support the existing API for another
major release. IMO it is also not necessary as most existing application
code refers to operators, attributes and the application interface. Perhaps
it is possible to keep those around as interface extensions to help
migration. Custom operators may need to be migrated to reflect API changes,
and I would consider that a reasonable task for operator developers as part
of a major upgrade.

It would be good if we can keep them as deprecated interface extensions for
one release to provide a smoother transition.


API and implementation in engine are kept separate intentionally. They
reside in different packages today, so I don't see a problem renaming
com.datatorrent.stram.engine as you say, even when the API cannot be
touched right away.

They are different packages but sharing a common prefix with api will be
helpful to someone new to codebase in terms of readability. Not a big deal
and can be changed.


Thanks

On Mon, Feb 27, 2017 at 7:39 AM, Thomas Weise <t...@apache.org> wrote:

Hi,

This topic has come up on several PRs and I think it warrants a broader
discussion.

At the time of incubation, the decision was to defer change of Java
packages from com.datatorrent to org.apache.apex till next major release to
ensure backward compatibility for users.

Unfortunately that has led to some confusion, as contributors continue to
add new code under legacy packages.

It is also a wider issue that examples for using Apex continue to refer to
com.datatorrent packages, nearly one year after graduation. More and more
user code is being built on top of something that needs to change, the can
is being kicked down the road and users will face more changes later.

I would like to propose the following:

1. All new code has to be submitted under org.apache.apex packages
2. Not all code is under backward compatibility restriction and in those
   cases we can rename the packages right away. Examples: buffer server,
   engine, demos/examples, benchmarks
3. Discuss when the core API and operators can be changed. For operators we
   have a bit more freedom to do changes before a major release as most of
   them are marked @Evolving and users have the ability to continue using
   prior version of Malhar with newer engine due to engine backward
   compatibility guarantee.

Thanks,
Thomas







Thank you,

Vlad
