[jira] [Created] (FLINK-5603) Use Flink's futures in QueryableStateClient

2017-01-22 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5603:
--

 Summary: Use Flink's futures in QueryableStateClient
 Key: FLINK-5603
 URL: https://issues.apache.org/jira/browse/FLINK-5603
 Project: Flink
  Issue Type: Improvement
  Components: Queryable State
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi
Priority: Minor


The current {{QueryableStateClient}} exposes Scala's Futures as the return type 
for queries. Since we are trying to get away from hard Scala dependencies in 
the current master, we should proactively replace this with Flink's Future 
interface.
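
For illustration, a minimal sketch of the caller-side friction this removes (assuming the current {{getKvState(JobID, String, int, byte[])}} signature; {{client}}, {{jobId}}, {{key}} and {{serializedKeyAndNamespace}} are placeholders):

// How a Java caller typically consumes the returned Scala Future today;
// all variables here are hypothetical placeholders.
scala.concurrent.Future<byte[]> serializedResult =
    client.getKvState(jobId, "query-name", key.hashCode(), serializedKeyAndNamespace);

// Blocking on the Scala Future drags Scala's Await and Duration types into user code.
byte[] value = scala.concurrent.Await.result(
    serializedResult,
    new scala.concurrent.duration.FiniteDuration(10, java.util.concurrent.TimeUnit.SECONDS));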





[jira] [Created] (FLINK-5604) Replace QueryableStateClient constructor

2017-01-22 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5604:
--

 Summary: Replace QueryableStateClient constructor
 Key: FLINK-5604
 URL: https://issues.apache.org/jira/browse/FLINK-5604
 Project: Flink
  Issue Type: Improvement
  Components: Queryable State
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi


The {{QueryableStateClient}} constructor expects a configuration object, which 
makes it hard for users to see which settings are expected and which are not.

I propose to split this constructor up and add static helpers for the common 
cases.
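
A rough sketch of the shape this could take (illustrative only, with placeholder types so it stands alone; the names and defaults are assumptions, not a committed API):

import java.net.InetSocketAddress;

public final class QueryableStateClientSketch {

    private final InetSocketAddress jobManagerAddress;
    private final int lookupRetries;

    private QueryableStateClientSketch(InetSocketAddress jobManagerAddress, int lookupRetries) {
        this.jobManagerAddress = jobManagerAddress;
        this.lookupRetries = lookupRetries;
    }

    /** Common case: just a JobManager address, defaults for everything else. */
    public static QueryableStateClientSketch forJobManager(String host, int port) {
        return new QueryableStateClientSketch(new InetSocketAddress(host, port), 3);
    }

    /** Expert case: every knob explicit, nothing hidden in a Configuration object. */
    public static QueryableStateClientSketch forJobManager(String host, int port, int lookupRetries) {
        return new QueryableStateClientSketch(new InetSocketAddress(host, port), lookupRetries);
    }
}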





[jira] [Created] (FLINK-5605) Make KvStateRequestSerializer package private

2017-01-22 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5605:
--

 Summary: Make KvStateRequestSerializer package private 
 Key: FLINK-5605
 URL: https://issues.apache.org/jira/browse/FLINK-5605
 Project: Flink
  Issue Type: Improvement
  Components: Queryable State
Reporter: Ufuk Celebi
Priority: Minor


From early users I've seen that many people use the {{KvStateRequestSerializer}} 
in their programs. It was actually meant as an internal class, used by the 
client and server for internal message serialization.

I vote to make this class package-private and to create an explicit 
{{QueryableStateClientUtil}} for user-facing serialization.
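
A hedged sketch of what such a util could expose. The delegation assumes {{serializeKeyAndNamespace}} keeps its current signature, and the util would need to live in the same package once the serializer becomes package-private:

import java.io.IOException;

import org.apache.flink.api.common.typeutils.TypeSerializer;

// Proposed user-facing entry point; this class does not exist yet, it is what
// the issue suggests adding.
public final class QueryableStateClientUtil {

    private QueryableStateClientUtil() {}

    /** Stable wrapper around the internal serializer for user programs. */
    public static <K, N> byte[] serializeKeyAndNamespace(
            K key, TypeSerializer<K> keySerializer,
            N namespace, TypeSerializer<N> namespaceSerializer) throws IOException {
        // Delegates to the (then package-private) KvStateRequestSerializer.
        return KvStateRequestSerializer.serializeKeyAndNamespace(
                key, keySerializer, namespace, namespaceSerializer);
    }
}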





[jira] [Created] (FLINK-5606) Remove magic number in key and namespace serialization

2017-01-22 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5606:
--

 Summary: Remove magic number in key and namespace serialization
 Key: FLINK-5606
 URL: https://issues.apache.org/jira/browse/FLINK-5606
 Project: Flink
  Issue Type: Improvement
  Components: Queryable State
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi
Priority: Minor


The serialized key and namespace for state queries contain a magic number 
between the key and the namespace: {{key|42|namespace}}. This exists for 
historical reasons: it allowed skipping deserialization of the key and 
namespace in our old {{RocksDBStateBackend}}, which used the same format. That 
backend has since been superseded by the key-group-aware state backends, so 
there is no longer any point in keeping the magic number.
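
A minimal, self-contained sketch of the current layout, using plain JDK streams rather than Flink's internal serializer classes; the byte-level framing is inferred from the description above:

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class KeyNamespaceLayout {

    /** Builds key|42|namespace from already-serialized key and namespace bytes. */
    static byte[] withMagicNumber(byte[] serializedKey, byte[] serializedNamespace) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.write(serializedKey);        // key bytes
        out.writeByte(42);               // the magic number this issue removes
        out.write(serializedNamespace);  // namespace bytes
        return bytes.toByteArray();
    }
}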





[jira] [Created] (FLINK-5607) Move location lookup retry out of KvStateLocationLookupService

2017-01-22 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5607:
--

 Summary: Move location lookup retry out of 
KvStateLocationLookupService
 Key: FLINK-5607
 URL: https://issues.apache.org/jira/browse/FLINK-5607
 Project: Flink
  Issue Type: Improvement
  Components: Queryable State
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi


If a state location lookup fails because of an {{UnknownJobManager}}, the 
lookup service automagically retries it. I think it's better to move this out 
of the lookup service and have the retry handled outside by the caller.
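
For example, the retry at the call site could look roughly like this; {{lookup}}, {{UnknownJobManagerException}}, {{maxRetries}} and {{retryDelayMillis}} are stand-ins, not actual names:

// Hypothetical caller-side retry loop; all names here are placeholders.
KvStateLocation location = null;
for (int attempt = 1; location == null; attempt++) {
    try {
        location = lookup(jobId, queryName);     // the plain, non-retrying lookup
    } catch (UnknownJobManagerException e) {     // placeholder exception type
        if (attempt >= maxRetries) {
            throw e;                             // give up: the policy belongs to the caller
        }
        Thread.sleep(retryDelayMillis);          // back off before the next attempt
    }
}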





Strange segfaults in Flink 1.2 (probably not RocksDB related)

2017-01-22 Thread Gyula Fóra
Hey All,

I am trying to port some of my streaming jobs from 1.1.4 to Flink 1.2 and I
have noticed some very strange segfaults. (I am running in test
environments, with the minicluster.)
It is a fairly complex job, so I won't go into details, but the interesting
part is that adding/removing a simple filter in the wrong place in the
topology (such as (e -> true), or anything else actually) seems to cause
frequent segfaults during execution.

Basically the key part looks something like:

...
DataStream stream = source.map(...).setParallelism(1)
        .uid("AssignFieldIds").name("AssignFieldIds").startNewChain();
DataStream filtered = input1.filter(t -> true).setParallelism(1);
IterativeStream itStream = filtered.iterate(...);
...

Some notes before the actual error: replacing the filter with a map or
other chained transforms also leads to this problem. If the filter is not
chained (or if I remove it), there is no error.

The error I get looks like this:
https://gist.github.com/gyfora/da713b99b1e85a681ff1a4182f001242

I wonder if anyone has seen something like this before, or has some ideas
on how to debug it. The simple workaround is to not chain the filter, but it's
still very strange.
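
For reference, the workaround is just breaking the chain at the filter, e.g. (a sketch, not the full job):

// Prevent the filter from being chained to its predecessor.
DataStream filtered = input1.filter(t -> true)
        .setParallelism(1)
        .disableChaining();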

Regards,
Gyula


Re: Strange segfaults in Flink 1.2 (probably not RocksDB related)

2017-01-22 Thread Stephan Ewen
Hey!

I am actually a bit puzzled as to how these segfaults could occur, unless via a
native library or a JVM bug.

Can you try how it behaves when not using RocksDB, or when using a newer JVM
version?
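
For example (just a sketch; the backend choice and checkpoint path are placeholders), switching the test to the filesystem backend would take RocksDB's native code out of the picture:

// Run the same job without RocksDB to isolate the native library.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStateBackend(new FsStateBackend("file:///tmp/flink-checkpoints")); // placeholder path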

Stephan




[jira] [Created] (FLINK-5608) Cancel button not always visible

2017-01-22 Thread Shannon Carey (JIRA)
Shannon Carey created FLINK-5608:


 Summary: Cancel button not always visible
 Key: FLINK-5608
 URL: https://issues.apache.org/jira/browse/FLINK-5608
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Affects Versions: 1.1.4
Reporter: Shannon Carey
Assignee: Shannon Carey
Priority: Minor


When the window is not wide enough, or when the job name is too long, the 
"Cancel" button in the Job view of the web UI is not visible: it is the first 
element that wraps down and gets covered by the secondary navbar (the tabs). As 
a result, we often need to resize the browser wider than our monitor in order 
to use the Cancel button.

In general, the use of Bootstrap's ".navbar-fixed-top" is problematic if the 
content may wrap, especially if the content's horizontal width is not known and 
fixed. The ".navbar-fixed-top" class uses fixed positioning, so any unexpected 
change in height results in overlap with the rest of the normal-flow content on 
the page. The Bootstrap docs explain this in their "Overflowing content" 
callout.

I am submitting a PR which does not attempt to resolve all issues with the 
fixed-navbar approach, but improves the situation by using less horizontal 
space and by altering the layout of the Cancel button.





GlobFilePathFilter NotSerializableException

2017-01-22 Thread Andrew Psaltis
Hi,
I am trying to use the GlobFilePathFilter with Flink 1.2-SNAPSHOT, and have
also tried the latest 1.3-SNAPSHOT code with the same error. Basically, when
using the GlobFilePathFilter there is a serialization exception because an
inner class of sun.nio.fs.UnixFileSystem is not serializable. I have tried
various Kryo registrations, but I must be missing something. I am happy to
work on fixing this, but may need some direction. The code below (lifted from
testReadMultiplePatterns() in the FileInputFormatTest class) reproduces the
error; the exception and stack trace follow. FWIW, I am testing this on OS X.

import java.util.Arrays;
import java.util.Collections;

import org.apache.flink.api.common.io.GlobFilePathFilter;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;

public static void main(String[] args) throws Exception {

    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    final TextInputFormat format = new TextInputFormat(new Path("/tmp"));

    format.setFilesFilter(new GlobFilePathFilter(
            Collections.singletonList("**"),
            Arrays.asList("**/another_file.bin", "**/dataFile1.txt")
    ));

    DataSet<String> result = env.readFile(format, "/tmp");
    result.writeAsText("/tmp/out");
    env.execute("GlobFilePathFilter-Test");
}


Exception in thread "main" org.apache.flink.optimizer.CompilerException:
Error translating node 'Data Source "at
readFile(ExecutionEnvironment.java:520)
(org.apache.flink.api.java.io.TextInputFormat)" : NONE [[ GlobalProperties
[partitioning=RANDOM_PARTITIONED] ]] [[ LocalProperties [ordering=null,
grouped=null, unique=null] ]]': Could not write the user code wrapper class
org.apache.flink.api.common.operators.util.UserCodeObjectWrapper :
java.io.NotSerializableException: sun.nio.fs.UnixFileSystem$3
at
org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:381)
at
org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:106)
at
org.apache.flink.optimizer.plan.SourcePlanNode.accept(SourcePlanNode.java:86)
at
org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
at
org.apache.flink.optimizer.plan.OptimizedPlan.accept(OptimizedPlan.java:128)
at
org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:192)
at org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:188)
at
org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:91)
at com.apsaltis.EventDetectionJob.main(EventDetectionJob.java:75)
Caused by:
org.apache.flink.runtime.operators.util.CorruptConfigurationException:
Could not write the user code wrapper class
org.apache.flink.api.common.operators.util.UserCodeObjectWrapper :
java.io.NotSerializableException: sun.nio.fs.UnixFileSystem$3
at
org.apache.flink.runtime.operators.util.TaskConfig.setStubWrapper(TaskConfig.java:281)
at
org.apache.flink.optimizer.plantranslate.JobGraphGenerator.createDataSourceVertex(JobGraphGenerator.java:888)
at
org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:281)
... 8 more
Caused by: java.io.NotSerializableException: sun.nio.fs.UnixFileSystem$3
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at java.util.ArrayList.writeObject(ArrayList.java:747)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at
java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at
java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at
org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:317)
at
org.apache.flink.util.Instantiat
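
In case it helps the discussion: the pattern that usually avoids this class of error is to serialize only the glob strings and build the non-serializable PathMatcher objects lazily on the worker. A self-contained sketch of that pattern (it mirrors the general idea, not Flink's actual code):

import java.io.Serializable;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.ArrayList;
import java.util.List;

public class SerializableGlobFilter implements Serializable {

    private final List<String> includeGlobs;      // plain strings serialize fine
    private transient List<PathMatcher> matchers; // rebuilt lazily after deserialization

    public SerializableGlobFilter(List<String> includeGlobs) {
        this.includeGlobs = includeGlobs;
    }

    public boolean matches(Path path) {
        if (matchers == null) {
            matchers = new ArrayList<>();
            for (String glob : includeGlobs) {
                // PathMatcher instances come from sun.nio.fs.* and are not Serializable.
                matchers.add(FileSystems.getDefault().getPathMatcher("glob:" + glob));
            }
        }
        for (PathMatcher matcher : matchers) {
            if (matcher.matches(path)) {
                return true;
            }
        }
        return false;
    }
}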

[Dev] Flink 'InputFormat' Interface execution related problem

2017-01-22 Thread Pawan Manishka Gunarathna
Hi,

When implementing Flink's *InputFormat* interface: if the *input split
creation* part is already handled by our data analytics server's APIs, can we
go directly to the second phase of the InputFormat execution?

Basically, I need to know whether we can read those InputSplits directly,
without generating the InputSplits inside the InputFormat implementation. It
would be great if you could provide any kind of help.
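
To make the question concrete, here is a hedged sketch of the pattern I mean: createInputSplits() only wraps splits that are already known externally and does no split computation of its own. The split count, record, and class names are placeholders:

import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
import org.apache.flink.api.common.io.InputFormat;
import org.apache.flink.api.common.io.statistics.BaseStatistics;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.io.GenericInputSplit;
import org.apache.flink.core.io.InputSplitAssigner;

public class PrecomputedSplitInputFormat implements InputFormat<String, GenericInputSplit> {

    private final int externallyKnownSplitCount; // supplied by the external system
    private boolean consumed;

    public PrecomputedSplitInputFormat(int externallyKnownSplitCount) {
        this.externallyKnownSplitCount = externallyKnownSplitCount;
    }

    @Override
    public void configure(Configuration parameters) {}

    @Override
    public BaseStatistics getStatistics(BaseStatistics cachedStatistics) {
        return cachedStatistics;
    }

    @Override
    public GenericInputSplit[] createInputSplits(int minNumSplits) {
        // No split generation here: just wrap the externally known split count.
        GenericInputSplit[] splits = new GenericInputSplit[externallyKnownSplitCount];
        for (int i = 0; i < splits.length; i++) {
            splits[i] = new GenericInputSplit(i, splits.length);
        }
        return splits;
    }

    @Override
    public InputSplitAssigner getInputSplitAssigner(GenericInputSplit[] splits) {
        return new DefaultInputSplitAssigner(splits);
    }

    @Override
    public void open(GenericInputSplit split) {
        consumed = false; // in a real format: open the external split here
    }

    @Override
    public boolean reachedEnd() {
        return consumed;
    }

    @Override
    public String nextRecord(String reuse) {
        consumed = true;
        return "record-from-external-split"; // placeholder record
    }

    @Override
    public void close() {}
}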

Thanks,
Pawan

-- 

*Pawan Gunaratne*
*Mob: +94 770373556*