Hey,

I have completely reworked the way we manage tuple serialization for
streaming. Users can now call .setMutability(true) on an operator to
enable object reuse during tuple deserialization.

What do you think the default mutability setting for operators should
be? We currently default to immutable.
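
To make the trade-off behind the default concrete, here is a minimal
sketch of what object reuse means for user code. It is not the streaming
API: Record, Deserializer and bufferThenRead are hypothetical names made
up for this illustration, and the boolean flag only mimics what
.setMutability(true) is described as enabling.

```java
import java.util.ArrayList;
import java.util.List;

public class MutabilitySketch {

    /** Hypothetical stand-in for a streaming tuple; not a Flink class. */
    static final class Record {
        int value;
    }

    /** Hypothetical deserializer: reuses one Record or allocates one per call. */
    static final class Deserializer {
        private final boolean mutable;
        private final Record reused = new Record();

        Deserializer(boolean mutable) {
            this.mutable = mutable;
        }

        Record read(int wireValue) {
            // With reuse enabled, every call overwrites and returns the same instance.
            Record target = mutable ? reused : new Record();
            target.value = wireValue;
            return target;
        }
    }

    /** Simulates a user function that buffers incoming records and reads them later. */
    public static List<Integer> bufferThenRead(boolean mutable) {
        Deserializer deserializer = new Deserializer(mutable);
        List<Record> buffered = new ArrayList<Record>();
        for (int wireValue = 1; wireValue <= 3; wireValue++) {
            buffered.add(deserializer.read(wireValue));
        }
        List<Integer> seen = new ArrayList<Integer>();
        for (Record record : buffered) {
            seen.add(record.value);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("immutable: " + bufferThenRead(false)); // [1, 2, 3]
        System.out.println("mutable:   " + bufferThenRead(true));  // [3, 3, 3]
    }
}
```

With reuse enabled, the buffered records all alias one object, so the
result is [3, 3, 3] rather than [1, 2, 3]. An operator that holds on to
its inputs would have to copy them first, which may be an argument for
keeping immutable as the default and letting users opt in.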

Cheers,
Gyula


On Wed, Jul 16, 2014 at 3:19 PM, Márton Balassi <[email protected]>
wrote:

> Thanks, Robert:
>
>    - ZeroMQ - thanks, we have it in another repo
>    - Spark & LGPL - Sean Owen was kind enough to clarify the situation
>    - BTree: The whole org.apache.flink.streaming.index package is somewhat
>    legacy code and currently unused. It was meant for state management,
>    but the API has been refactored since then and we decided to leave some
>    parts there that we have to readdress. It is quite likely that we are
>    not using it any more. I'm removing it.
>    - The hadoop-2 profile is indeed copypasta, good call.
>    - The hadoop-1 profile was interesting for me, because it builds on "my
>    machine" :)
>
>
>
>
> On Wed, Jul 16, 2014 at 3:02 PM, Robert Metzger <[email protected]>
> wrote:
>
> > Cool. Thanks for the update.
> > I think you can host the ZeroMQ connectors on a private repository or so.
> >
> > "by the way Spark has LGPL licensed packages in its NOTICE" --> did you
> > find the discussion in their mailing list / JIRA regarding this? Maybe
> > they contacted the authors of the code or got a special permission to do
> > that?
> >
> > What is the license of this file?
> >
> >
> https://github.com/mbalassi/incubator-flink/blob/streaming-ready/flink-addons/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/index/BTreeIndex.java
> >
> >
> >
> > The build errors for hadoop1 are:
> >
> > [INFO]
> > ------------------------------------------------------------------------
> > [ERROR] Failed to execute goal
> > org.apache.maven.plugins:maven-compiler-plugin:3.1:compile
> > (default-compile) on project flink-streaming-core: Compilation
> > failure: Compilation failure:
> > [ERROR]
> >
> /home/travis/build/mbalassi/incubator-flink/flink-addons/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/streamcomponent/AbstractStreamComponent.java:[190,89]
> > incompatible types:
> > java.lang.Class<org.apache.flink.streaming.api.streamrecord.StreamRecord>
> > cannot be converted to java.lang.Class<? extends
> > org.apache.flink.streaming.api.streamrecord.StreamRecord<IN>>
> > [ERROR]
> >
> /home/travis/build/mbalassi/incubator-flink/flink-addons/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/streamcomponent/AbstractStreamComponent.java:[208,42]
> > incompatible types:
> >
> java.lang.Class<org.apache.flink.streaming.partitioner.DefaultPartitioner>
> > cannot be converted to java.lang.Class<? extends
> >
> >
> org.apache.flink.runtime.io.network.api.ChannelSelector<org.apache.flink.streaming.api.streamrecord.StreamRecord<OUT>>>
> >
> > I guess that's easy to fix?
> >
> > The build errors for hadoop2 are:
> > I think it's copypasta. You've copied the build-profiles stuff from
> > another project. Your pom is including the "flink-hbase" and "flink-yarn"
> > submodules (of flink-streaming).
> >
> >
> https://github.com/mbalassi/incubator-flink/blob/streaming-ready/flink-addons/flink-streaming/pom.xml#L109
> >
> > Just remove the whole <profiles> .. </profiles> block in the
> > flink-streaming/pom.xml.
> >
> >
> >
> >
> > On Wed, Jul 16, 2014 at 1:56 PM, Márton Balassi <
> [email protected]>
> > wrote:
> >
> > > Hi all,
> > >
> > > We've decided to do our preparations on a fork of the main repo:
> > > *https://github.com/mbalassi/incubator-flink/tree/streaming-ready
> > > <https://github.com/mbalassi/incubator-flink/tree/streaming-ready>*
> > >
> > > We've fixed the code to match the coding style and added the modules to
> > the
> > > maven build.
> > >
> > >
> >
> https://github.com/mbalassi/incubator-flink/commit/f8a6b0ecf7f453cad13ed6752051f29783ec0469
> > >
> > > As for the licensing:
> > >
> > >
> >
> https://github.com/mbalassi/incubator-flink/commit/5ddcebc6f0cedfcb3ed67a4f53ee1b415dd1d82f
> > >
> > >    - Removed JBlas as it is no longer needed
> > >    - Included the information for RabbitMQ
> > >    - Deleted the ZeroMQ package and its dependency as a whole - by the
> > >    way Spark has LGPL licensed packages in its NOTICE
> > >    - Did not include additional information for Apache Kafka
> > >
> > > On my machine the project builds with the default hadoop profile and
> > > Java 6&7 and the tests are passing, however the latest Travis CI
> > > build is way less rosy:
> > > *https://travis-ci.org/mbalassi/incubator-flink/builds/30055789
> > > <https://travis-ci.org/mbalassi/incubator-flink/builds/30055789>*
> > >
> > >    - The ones with the hadoop-2 profile fail with not finding one of
> > >    the poms (?)
> > >    - The ones with the hadoop-1 profile either fail in flink-tests with
> > >    an error in DataSink (maybe the Travis slot ran out of disk...) or
> > >    with an exception in the streaming code that occurred neither when I
> > >    built the project locally with maven nor in Eclipse
> > >
> > > Do you have any suggestion for additional requirements or fixing the CI
> > > build?
> > >
> > > Cheers,
> > >
> > > Marton
> > >
> > >
> > > On Mon, Jul 14, 2014 at 6:28 PM, Márton Balassi <
> > [email protected]>
> > > wrote:
> > >
> > > > Hi guys,
> > > >
> > > > @Stephan: Thanks for the script, we've gone through the commits with
> > > > Gabor, Gyula is reviewing it right now.
> > > > https://github.com/mbalassi/incubator-flink/commits/streamrebase3
> > > >
> > > > @Robert: We've gone through the coding style, the update commit is
> > > > already pushed to our old repo, I'm merging it to my flink fork soon.
> > > >
> > > > @Henry: Ok, I'm pinging all the contributors with the subject, the
> > > > three of us already signed the form.
> > > >
> > > > I'm dealing with the Licensing tomorrow.
> > > >
> > > >
> > > > On Mon, Jul 14, 2014 at 4:58 PM, Stephan Ewen <[email protected]>
> > wrote:
> > > >
> > > >> Before adding this contribution to the project, there are some legal
> > > >> things
> > > >> to do:
> > > >>
> > > > >>  - Obtain ICLAs from all major contributors. There are 7 in the
> > > > >> streaming code, out of which three did the largest portion of the
> > > > >> work: Márton Balassi, Gyula Fóra, Hermann Gábor
> > > >>  - @mentors: Should the other 4 also sign and send ICLAs?
> > > >>
> > > > >>  - Licenses: Walk through the code, collect all dependencies and
> > > > >> make sure they are ASL compatible. Here are some links with
> > > > >> information:
> > > > >>     - http://www.apache.org/legal/resolved.html
> > > > >>     - http://www.apache.org/foundation/license-faq.html#WhatDoesItMEAN
> > > >>
> > > >>  - All used licenses must be mentioned in the LICENSE files
> > > >>    - under ./LICENSE
> > > >>    - under ./flink-dist/src/main/flink-bin/LICENSE
> > > >>
> > > >>  - Check headers for ASF compliance.
> > > >>
> > > >>
> > > >> This looks manageable. Anything I forgot?
> > > >>
> > > >> Greetings,
> > > >> Stephan
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Mon, Jul 14, 2014 at 4:43 PM, Stephan Ewen <[email protected]>
> > wrote:
> > > >>
> > > > >> > Hi guys!
> > > >> >
> > > > >> > I made a scripted manual rebase of each commit (basically add the
> > > > >> > commit not via its diff, but such that it reflects the code base
> > > > >> > after the commit)
> > > > >> >
> > > > >> > https://github.com/StephanEwen/incubator-flink/commits/streamrebase
> > > >> >
> > > > >> > No more merge commits that mess things up. You should be able to
> > > > >> > squash things easily via
> > > > >> > "git rebase -i 3002258f8a22a8adbdb230e57c972ad17910debf"
> > > >> >
> > > > >> > The commit diffs may be a bit different than before (not too much
> > > > >> > if I did things correctly), but can you have a quick look at the
> > > > >> > commits to see whether they make sense?
> > > >> >
> > > >> > Stephan
> > > >> >
> > > >> >
> > > >> > BTW: I used this way to do it:
> > > >> >
> > > >> > Have two repositories (clones)
> > > >> >   - /data/repositories/flink
> > > >> >   - /data/repositories/flinkbak
> > > >> >
> > > > >> > Then do the following for every non-merge commit:
> > > > >> >  - Check out the state after a commit in the backup (detached head)
> > > > >> >  - Remove current streaming directory (physically and from the index)
> > > > >> >  - Add it again (files and index), with the state of the cloned repo
> > > > >> >  - Commit (git recreates the diffs in a way that they reflect the
> > > > >> >    original commit plus any merges)
> > > >> >
> > > >> > ---------------------
> > > >> >
> > > >> > #!/bin/bash
> > > >> >
> > > >> > for line in $(cat commits)
> > > >> > do
> > > >> >   cd /data/repositories/flinkbak
> > > >> >   author=`git --no-pager show -s --format='%an <%ae>' $line`
> > > >> >   message=`git --no-pager show -s --format='%s%n' $line`
> > > >> >
> > > >> >   echo "picking commit $line from author $author"
> > > >> >
> > > >> >   git checkout $line
> > > >> >   cd /data/repositories/flink
> > > >> >   rm -rf "/data/repositories/flink/flink-addons/flink-streaming"
> > > > >> >   git rm -r "/data/repositories/flink/flink-addons/flink-streaming"
> > > > >> >   cp -r "/data/repositories/flinkbak/flink-addons/flink-streaming" "/data/repositories/flink/flink-addons/flink-streaming"
> > > > >> >   git add /data/repositories/flink/flink-addons/flink-streaming
> > > > >> >   git commit --author "$author" -m "$message"
> > > >> >
> > > >> > #  read -rsp $'Press any key to continue...\n' -n1 key
> > > >> > done
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Mon, Jul 14, 2014 at 1:10 PM, Gyula Fóra <[email protected]
> >
> > > >> wrote:
> > > >> >
> > > > >> >> By the way, I forked your repo, switched to the streaming branch and
> > > > >> >> then executed the commands (I think this is how it should have been
> > > > >> >> done)
> > > >> >>
> > > >> >>
> > > >> >> On Mon, Jul 14, 2014 at 1:09 PM, Gyula Fóra <
> [email protected]>
> > > >> wrote:
> > > >> >>
> > > >> >>> This is what I get with "rebase -i -p master":
> > > >> >>>
> > > > >> >>> pick 9456624 Merge branch 'master' of file:///data/repositories/streamin into streaming
> > > >> >>> pick 89299b8 [streaming] Post-merge cleanups
> > > >> >>>
> > > >> >>> #Rebase 1fd457d..89299b8 onto 1fd457d
> > > >> >>> #......
> > > >> >>>
> > > >> >>>
> > > >> >>> On Mon, Jul 14, 2014 at 12:47 PM, Stephan Ewen <
> [email protected]>
> > > >> wrote:
> > > >> >>>
> > > > >> >>>> Can you do "rebase -i -p master"? That should include all commits
> > > > >> >>>> and might save you the meeting hell.
> > > >> >>>>
> > > >> >>>
> > > >> >>>
> > > >> >>
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
