I think this is easier done in a straw poll than in an email conversation.
I created one at: http://www.strawpoll.me/12535073
(Note that you have multiple choices.)
Though I prefer Java 8, most of the time I have to work on Java 7. A lot of the
infrastructure I work on still runs Java 7, one of the
I agree with Vasia that for data scientists it's likely easier to learn the
high-level API. I like the material from
http://dataartisans.github.io/flink-training/ but all of it focuses on the
high-level API.
Maybe we could have a guide (blog post, lecture, whatever) on how to get
into Flink as a
I think the focus of this discussion should be how we proceed, not what to
do. The what comes from the committers anyway.
There are several people who would like to commit, including people from the
Streamline project. Having pull requests that are older than 6 months is not
good for any project.
The
During this year's FOSDEM Martin Junghans and I sat together and gathered
some feedback for the Flink project. It is based on our personal experience
as well as the feedback and questions from people we taught the system.
This is going to be a longer email, therefore I have split things into
ugin to enforce a
> minimum version of netty?
> We recently downgraded (a minor version) of netty because of an issue.
> Maybe that's the issue.
>
> Can you check the enforcer rules of your project?
>
> On Wed, Jan 20, 2016 at 1:48 PM, Martin Neumann <mneum...@sics.se> wrote
Hi,
I have a weird problem. Yesterday I had to clean my local Maven cache for a
different application.
Since then, one of my Flink streaming jobs does not compile anymore. I
didn't change any code, I just made Maven pull all dependencies again.
I'm totally stumped by this, please help me!
org> wrote:
> Hi Martin.
>
> can you try to exclude the netty dependency from your Flink dependencies?
> Another approach would be to disable the check, or add an exception to it
> ;)
>
> Why did you add the check in the first place?
>
>
> On Wed, Jan 20, 20
Hej,
What is the correct way of initializing a stateful operator that is using
a HashMap? modelMapInit.getClass() does not work, and neither does
HashMap.class. Do I have to implement my own TypeInformation class or is
there a simpler way?
cheers Martin
private
> > Yes, what you wrote should work. You can alternatively use
> > TypeExtractor.getForObject(modelMapInit) to extract the type information.
> >
> > I also like to implement my custom type info for Hashmaps and the other
> > types and use that.
> >
> >
Hej,
I'm working with some stateful streaming operators at the moment and I
noticed that the documentation is out of date.
The documentation says:
@Override
public void open(Configuration config) {
    counter = getRuntimeContext().getOperatorState("counter", 0L, false);
}
I tried out Spargel during my work with Spotify and have implemented
several algorithms using it. In all implementations I ended up storing
additional data and flags on the vertex to carry them over from one UDF to
the next one. It definitely makes the code harder to write and maintain.
I wonder
The problem with having many different graph models in Gelly is that it
might get quite confusing for a user.
Maybe this can be fixed with good documentation so that it's clear how each
model works and what its benefits are (and maybe when it's better to use it
over a different model).
On Tue, Nov
@gmail.com
> > >> > wrote:
> > >>
> > >>> Hey,
> > >>>
> > >>> Thanks for reporting the problem, Martin. I have not merged the PR
> > >>> Stephan
> > >>> is referring to yet. [1] There I am cleaning
test
> please?
>
> [1] https://github.com/apache/flink/pull/1155
>
> On Fri, Oct 2, 2015 at 8:26 PM, Martin Neumann <mneum...@sics.se> wrote:
>
> > One of my colleagues found it today when we were hunting bugs. We
> > were using the latest 0.10 version
It seems like I'm one of the few people that run into the mutable elements
trap on the Batch API from time to time. At the moment I always clone when
I'm not 100% sure to avoid hunting the bugs later. So far I was happy to
learn that this is not a problem in Streaming, but that's just me.
When
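The mutable-elements trap described above can be shown without Flink. What follows is a minimal plain-Java sketch, not Flink code: the names `ReuseTrapSketch`, `Point`, `collectWithoutCopy`, and `collectWithCopy` are hypothetical, and the loop only imitates a runtime that hands the same object instance to user code again and again.

```java
import java.util.ArrayList;
import java.util.List;

final class ReuseTrapSketch {
    /** Mutable record standing in for an element that the runtime reuses. */
    static final class Point {
        int value;

        Point(int value) {
            this.value = value;
        }

        Point copy() {
            return new Point(value);
        }
    }

    /** Buffers the reused instance itself, so every entry aliases one object. */
    static List<Point> collectWithoutCopy(int[] input) {
        Point reused = new Point(0);
        List<Point> buffer = new ArrayList<>();
        for (int v : input) {
            reused.value = v;    // the same object is overwritten for each element
            buffer.add(reused);  // stores a reference, not a snapshot
        }
        return buffer;
    }

    /** Buffers a copy of each element, so later overwrites do not leak in. */
    static List<Point> collectWithCopy(int[] input) {
        Point reused = new Point(0);
        List<Point> buffer = new ArrayList<>();
        for (int v : input) {
            reused.value = v;
            buffer.add(reused.copy());
        }
        return buffer;
    }
}
```

In this sketch every entry returned by `collectWithoutCopy` ends up holding the last value written, which is the kind of bug that cloning defensively avoids at the cost of extra allocations.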
Hej,
In one of my programs I run a fold on a GroupedDataStream. The aim is to
aggregate the values in each group.
It seems the aggregator in the fold function is shared at the operator level,
so all groups that end up on the same operator get mashed together.
Is this the intended behavior? If so, what
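The difference between the two behaviors in question can be sketched in plain Java. This is only an illustration under assumptions, not Flink's implementation: `KeyedFoldSketch`, `foldShared`, and `foldPerKey` are hypothetical names, the fold is a simple integer sum, and one method call stands in for one operator instance that receives several groups.

```java
import java.util.HashMap;
import java.util.Map;

final class KeyedFoldSketch {
    /** One accumulator per operator instance: all groups are summed together. */
    static int foldShared(String[] keys, int[] values) {
        int accumulator = 0;
        for (int i = 0; i < keys.length; i++) {
            accumulator += values[i];  // the key is ignored
        }
        return accumulator;
    }

    /** One accumulator per key: each group is summed on its own. */
    static Map<String, Integer> foldPerKey(String[] keys, int[] values) {
        Map<String, Integer> accumulators = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            accumulators.merge(keys[i], values[i], Integer::sum);
        }
        return accumulators;
    }
}
```

For keys `a, b, a` with values `1, 10, 2`, the shared variant yields a single total of 13, while the per-key variant yields 3 for `a` and 10 for `b`.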
, Stephan Ewen <se...@apache.org> wrote:
> I think these operations were recently moved to the internal state
> interface. Did the behavior change then?
>
> @Marton or Gyula, can you comment? Is it per chance not mapped to the
> partitioned state?
>
> On Fri, Oct 2, 2015 at 6:3
After some work experience with the current solution, I want to give some
feedback and maybe start a discussion about event time in streaming. This
is not about watermarks or any of the upcoming improvements, just some
observations from the current code.
*Start time for EventTime:*
In the current
Hej,
Up to what sizes are broadcast sets a good idea?
I have a large dataset (~5 GB) and I'm only interested in lines with a
certain ID that I have in a file. The file has ~10k entries.
I could either join the dataset with the ID list or I could broadcast the ID
list and do the filtering in a
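The broadcast-and-filter option boils down to holding the ID list in an in-memory set and testing each line against it. Below is a minimal plain-Java sketch of that idea, not Flink code: `BroadcastFilterSketch` and `filterByIds` are hypothetical names, and the assumption that the ID is the text before the first comma of each line is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class BroadcastFilterSketch {
    /**
     * Keeps only the lines whose ID is contained in the given set.
     * Assumption: the ID is everything before the first comma.
     */
    static List<String> filterByIds(List<String> lines, Set<String> ids) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            int comma = line.indexOf(',');
            String id = comma < 0 ? line : line.substring(0, comma);
            if (ids.contains(id)) {  // constant-time lookup with a hash-based set
                kept.add(line);
            }
        }
        return kept;
    }
}
```

With a hash-based set the per-line cost is a single lookup, so the large dataset is scanned once and never shuffled; whether that beats a join in practice depends on how large the ID list gets and on the available memory per task.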
Hej,
I was busy with other stuff for a while but I hope I will have more time to
work on Flink and Graphs again now.
I need to do some basic analytics on a large graph set (stuff like degree
distribution, triangle count, component size distribution etc.)
Is there anything implemented in Gelli
Hej,
Very interesting discussion.
I hadn't heard of the SSP model before, looks like something I want to look
into.
I wonder if any of the algorithms that would work in that model would not
work in an asynchronous model, since asynchronous is basically an SSP model
with infinite slack. Iterative