@Robert is this about "hot code replace" in the JVM?

On Mon, Jan 23, 2017 at 1:11 PM, Robert Metzger <rmetz...@apache.org> wrote:
> I think I saw similar segfaults when rebuilding Flink while Flink daemons
> were running out of the build-target repository. After the flink-dist jar
> had been rebuilt and I tried stopping the daemons, they were running into
> these segfaults.
>
> On Mon, Jan 23, 2017 at 12:02 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hi,
> >
> > I tried it with and without RocksDB, no difference. I also tried with the
> > newest JDK version; it still fails for that specific topology.
> >
> > On the other hand, after I made some other changes in the topology
> > (somewhere else) I can't reproduce it anymore. So it seems to be a very
> > specific thing. To me it looks like a JVM problem; if it happens again I
> > will let you know.
> >
> > Gyula
> >
> > Stephan Ewen <se...@apache.org> wrote (on Sun, Jan 22, 2017, 21:16):
> >
> > > Hey!
> > >
> > > I am actually a bit puzzled how these segfaults could occur, unless via
> > > a native library or a JVM bug.
> > >
> > > Can you try how it behaves when not using RocksDB, or when using a
> > > newer JVM version?
> > >
> > > Stephan
> > >
> > > On Sun, Jan 22, 2017 at 7:51 PM, Gyula Fóra <gyf...@apache.org> wrote:
> > >
> > > > Hey All,
> > > >
> > > > I am trying to port some of my streaming jobs from 1.1.4 to Flink
> > > > 1.2, and I have noticed some very strange segfaults. (I am running
> > > > in test environments, with the minicluster.)
> > > > It is a fairly complex job, so I won't go into details, but the
> > > > interesting part is that adding/removing a simple filter in the
> > > > wrong place in the topology (such as (e -> true), or anything
> > > > actually) seems to cause frequent segfaults during execution.
> > > >
> > > > Basically the key part looks something like:
> > > >
> > > > ...
> > > > DataStream stream = source.map().setParallelism(1)
> > > >     .uid("AssignFieldIds").name("AssignFieldIds").startNewChain();
> > > > DataStream filtered = input1.filter(t -> true).setParallelism(1);
> > > > IterativeStream itStream = filtered.iterate(...);
> > > > ...
> > > >
> > > > Some notes before the actual error: replacing the filter with a map
> > > > or other chained transforms also leads to this problem. If the
> > > > filter is not chained, there is no error (nor if I remove the
> > > > filter).
> > > >
> > > > The error I get looks like this:
> > > > https://gist.github.com/gyfora/da713b99b1e85a681ff1a4182f001242
> > > >
> > > > I wonder if anyone has seen something like this before, or has some
> > > > ideas how to debug it. The simple workaround is to not chain the
> > > > filter, but it's still very strange.
> > > >
> > > > Regards,
> > > > Gyula
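For anyone following the thread: the workaround Gyula mentions (keeping the filter out of the operator chain) can be expressed explicitly with `disableChaining()`. The sketch below is hypothetical and only illustrates where that call goes; the source and job name are placeholders, not the job from the report, and it needs the Flink streaming dependencies on the classpath to compile and run:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FilterChainWorkaroundSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder source standing in for the real job's source/map/uid chain.
        DataStream<String> input = env.fromElements("a", "b", "c");

        DataStream<String> filtered = input
                .filter(e -> true)          // the trivial filter from the report
                .setParallelism(1)
                .disableChaining();         // force the filter into its own task,
                                            // i.e. the "do not chain it" workaround

        filtered.print();
        env.execute("filter-chain-workaround-sketch");
    }
}
```

`startNewChain()` (used in the original snippet) begins a fresh chain at that operator, while `disableChaining()` isolates the operator entirely; either breaks the chaining configuration that was triggering the segfaults here.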