[GitHub] flink issue #2244: [FLINK-3874] Add a Kafka TableSink with JSON serializatio...

2016-08-08 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2244
  
@mushketyk I will review it this week.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3874) Add a Kafka TableSink with JSON serialization

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411454#comment-15411454
 ] 

ASF GitHub Bot commented on FLINK-3874:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2244
  
@mushketyk I will review it this week.


> Add a Kafka TableSink with JSON serialization
> -
>
> Key: FLINK-3874
> URL: https://issues.apache.org/jira/browse/FLINK-3874
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Fabian Hueske
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> Add a TableSink that writes JSON serialized data to Kafka.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4323) Checkpoint Coordinator Removes HA Checkpoints in Shutdown

2016-08-08 Thread Ufuk Celebi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411460#comment-15411460
 ] 

Ufuk Celebi commented on FLINK-4323:


Should we still remove the shutdown hook? The shutdown hook would only hide 
problems with the {{ExecutionGraph}} cleanup. The {{CheckpointCoordinator}} 
should stay independent of the {{RecoveryMode}}.

> Checkpoint Coordinator Removes HA Checkpoints in Shutdown
> -
>
> Key: FLINK-4323
> URL: https://issues.apache.org/jira/browse/FLINK-4323
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Priority: Blocker
> Fix For: 1.2.0, 1.1.1
>
>
> The {{CheckpointCoordinator}} has a shutdown hook that "shuts down" the 
> savepoint store, rather than suspending it.
> As a consequence, HA checkpoints may be lost when the JobManager process 
> fails but allows the shutdown hook to run.
> I would suggest removing the shutdown hook from the CheckpointCoordinator 
> altogether. The JobManager process is responsible for cleanups and can better 
> decide what should be cleaned up and what not.
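
The shutdown-hook pattern at issue can be sketched in plain Java (hypothetical names; this is not Flink's actual code). A JVM shutdown hook runs on normal exit and on SIGTERM alike, so cleanup registered this way also fires when a failing process is torn down, which is exactly how HA checkpoints could be lost:

```java
// Hypothetical sketch (not Flink code): why a shutdown hook that deletes
// checkpoint state is dangerous. The hook runs on normal exit *and* on
// SIGTERM, so a failing JobManager process would still wipe HA state.
public class ShutdownHookSketch {
    static final Thread CLEANUP_HOOK = new Thread(() -> {
        // In the problematic pattern, this would call something like
        // savepointStore.shutdown(), discarding checkpoints needed for recovery.
        System.out.println("cleanup ran");
    });

    // Registering the hook couples cleanup to process death.
    public static void register() {
        Runtime.getRuntime().addShutdownHook(CLEANUP_HOOK);
    }

    // The fix suggested in the issue: drop the hook entirely and let the
    // owning process decide what to clean up. removeShutdownHook returns
    // true if the hook was previously registered.
    public static boolean unregister() {
        return Runtime.getRuntime().removeShutdownHook(CLEANUP_HOOK);
    }

    public static void main(String[] args) {
        register();
        System.out.println("unregistered: " + unregister());
    }
}
```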





[GitHub] flink issue #2327: [FLINK-4304] [runtime-web] Jar names that contain whitesp...

2016-08-08 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2327
  
Thanks for reviewing. Merging...




[jira] [Commented] (FLINK-4304) Jar names that contain whitespace cause problems in web client

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411463#comment-15411463
 ] 

ASF GitHub Bot commented on FLINK-4304:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2327
  
Thanks for reviewing. Merging...


> Jar names that contain whitespace cause problems in web client
> --
>
> Key: FLINK-4304
> URL: https://issues.apache.org/jira/browse/FLINK-4304
> Project: Flink
>  Issue Type: Bug
>  Components: Web Client
>Affects Versions: 1.1.0
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> If the Jar file name contains whitespace, the web client cannot start or 
> delete a job:
> {code}
> org.apache.flink.client.program.ProgramInvocationException: JAR file does not 
> exist 
> '/var/folders/w8/k702f8s1017dfbfx_qlv2p24gn/T/flink-web-upload-4c52b922-8307-4098-b196-58b971864c51/980cad63-304c-48bb-a403-4756aea26ab4_Word%20Count.jar'
>   at 
> org.apache.flink.client.program.PackagedProgram.checkJarFile(PackagedProgram.java:755)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:181)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:147)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:91)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleRequest(JarRunHandler.java:50)
> {code}





[jira] [Created] (FLINK-4328) Harden JobManagerHA*ITCase

2016-08-08 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-4328:
--

 Summary: Harden JobManagerHA*ITCase
 Key: FLINK-4328
 URL: https://issues.apache.org/jira/browse/FLINK-4328
 Project: Flink
  Issue Type: Improvement
  Components: Tests
Reporter: Ufuk Celebi


The {{JobManagerHACheckpointRecoveryITCase}} and 
{{JobManagerHAJobGraphRecoveryITCase}} tests currently run the JobManager and 
TaskManager components as full processes. This is unnecessary and often 
problematic on Travis CI (we have seen multiple issues with these tests). I 
would like to rewrite these tests along the lines of the 
{{JobManagerHARecoveryTest}}, which is more lightweight and allows for better 
control of the test setup.





[GitHub] flink pull request #2327: [FLINK-4304] [runtime-web] Jar names that contain ...

2016-08-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2327




[jira] [Commented] (FLINK-4304) Jar names that contain whitespace cause problems in web client

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411470#comment-15411470
 ] 

ASF GitHub Bot commented on FLINK-4304:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2327


> Jar names that contain whitespace cause problems in web client
> --
>
> Key: FLINK-4304
> URL: https://issues.apache.org/jira/browse/FLINK-4304
> Project: Flink
>  Issue Type: Bug
>  Components: Web Client
>Affects Versions: 1.1.0
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> If the Jar file name contains whitespace, the web client cannot start or 
> delete a job:
> {code}
> org.apache.flink.client.program.ProgramInvocationException: JAR file does not 
> exist 
> '/var/folders/w8/k702f8s1017dfbfx_qlv2p24gn/T/flink-web-upload-4c52b922-8307-4098-b196-58b971864c51/980cad63-304c-48bb-a403-4756aea26ab4_Word%20Count.jar'
>   at 
> org.apache.flink.client.program.PackagedProgram.checkJarFile(PackagedProgram.java:755)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:181)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:147)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:91)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleRequest(JarRunHandler.java:50)
> {code}





[jira] [Resolved] (FLINK-4304) Jar names that contain whitespace cause problems in web client

2016-08-08 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-4304.
-
   Resolution: Fixed
Fix Version/s: 1.1.0

Fixed in 8d25c64146ef0a2d0e9ee4c3e8ac3229e72a1416.

> Jar names that contain whitespace cause problems in web client
> --
>
> Key: FLINK-4304
> URL: https://issues.apache.org/jira/browse/FLINK-4304
> Project: Flink
>  Issue Type: Bug
>  Components: Web Client
>Affects Versions: 1.1.0
>Reporter: Timo Walther
>Assignee: Timo Walther
> Fix For: 1.1.0
>
>
> If the Jar file name contains whitespace, the web client cannot start or 
> delete a job:
> {code}
> org.apache.flink.client.program.ProgramInvocationException: JAR file does not 
> exist 
> '/var/folders/w8/k702f8s1017dfbfx_qlv2p24gn/T/flink-web-upload-4c52b922-8307-4098-b196-58b971864c51/980cad63-304c-48bb-a403-4756aea26ab4_Word%20Count.jar'
>   at 
> org.apache.flink.client.program.PackagedProgram.checkJarFile(PackagedProgram.java:755)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:181)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:147)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:91)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleRequest(JarRunHandler.java:50)
> {code}





[GitHub] flink issue #2329: [FLINK-3138] [types] Method References are not supported ...

2016-08-08 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2329
  
Thanks Stephan. Merging...




[jira] [Commented] (FLINK-3138) Method References are not supported as lambda expressions

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411549#comment-15411549
 ] 

ASF GitHub Bot commented on FLINK-3138:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2329
  
Thanks Stephan. Merging...


> Method References are not supported as lambda expressions
> -
>
> Key: FLINK-3138
> URL: https://issues.apache.org/jira/browse/FLINK-3138
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.10.2
>Reporter: Stephan Ewen
>Assignee: Timo Walther
> Fix For: 1.0.0
>
>
> For many functions (here for example KeySelectors), one can use lambda 
> expressions:
> {code}
> DataStream stream = ...;
> stream.keyBy( v -> v.getId() )
> {code}
> Java's other syntax for this, method references, does not work:
> {code}
> DataStream stream = ...;
> stream.keyBy( MyType::getId )
> {code}





[GitHub] flink pull request #2329: [FLINK-3138] [types] Method References are not sup...

2016-08-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2329




[jira] [Commented] (FLINK-3138) Method References are not supported as lambda expressions

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411561#comment-15411561
 ] 

ASF GitHub Bot commented on FLINK-3138:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2329


> Method References are not supported as lambda expressions
> -
>
> Key: FLINK-3138
> URL: https://issues.apache.org/jira/browse/FLINK-3138
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.10.2
>Reporter: Stephan Ewen
>Assignee: Timo Walther
> Fix For: 1.0.0
>
>
> For many functions (here for example KeySelectors), one can use lambda 
> expressions:
> {code}
> DataStream stream = ...;
> stream.keyBy( v -> v.getId() )
> {code}
> Java's other syntax for this, method references, does not work:
> {code}
> DataStream stream = ...;
> stream.keyBy( MyType::getId )
> {code}





[jira] [Updated] (FLINK-4304) Jar names that contain whitespace cause problems in web client

2016-08-08 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-4304:

Fix Version/s: (was: 1.1.0)
   1.2.0

> Jar names that contain whitespace cause problems in web client
> --
>
> Key: FLINK-4304
> URL: https://issues.apache.org/jira/browse/FLINK-4304
> Project: Flink
>  Issue Type: Bug
>  Components: Web Client
>Affects Versions: 1.1.0
>Reporter: Timo Walther
>Assignee: Timo Walther
> Fix For: 1.2.0
>
>
> If the Jar file name contains whitespace, the web client cannot start or 
> delete a job:
> {code}
> org.apache.flink.client.program.ProgramInvocationException: JAR file does not 
> exist 
> '/var/folders/w8/k702f8s1017dfbfx_qlv2p24gn/T/flink-web-upload-4c52b922-8307-4098-b196-58b971864c51/980cad63-304c-48bb-a403-4756aea26ab4_Word%20Count.jar'
>   at 
> org.apache.flink.client.program.PackagedProgram.checkJarFile(PackagedProgram.java:755)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:181)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:147)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:91)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleRequest(JarRunHandler.java:50)
> {code}





[jira] [Updated] (FLINK-4304) Jar names that contain whitespace cause problems in web client

2016-08-08 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-4304:

Fix Version/s: (was: 1.2.0)
   1.2

> Jar names that contain whitespace cause problems in web client
> --
>
> Key: FLINK-4304
> URL: https://issues.apache.org/jira/browse/FLINK-4304
> Project: Flink
>  Issue Type: Bug
>  Components: Web Client
>Affects Versions: 1.1.0
>Reporter: Timo Walther
>Assignee: Timo Walther
> Fix For: 1.2.0
>
>
> If the Jar file name contains whitespace, the web client cannot start or 
> delete a job:
> {code}
> org.apache.flink.client.program.ProgramInvocationException: JAR file does not 
> exist 
> '/var/folders/w8/k702f8s1017dfbfx_qlv2p24gn/T/flink-web-upload-4c52b922-8307-4098-b196-58b971864c51/980cad63-304c-48bb-a403-4756aea26ab4_Word%20Count.jar'
>   at 
> org.apache.flink.client.program.PackagedProgram.checkJarFile(PackagedProgram.java:755)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:181)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:147)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:91)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleRequest(JarRunHandler.java:50)
> {code}





[jira] [Updated] (FLINK-4304) Jar names that contain whitespace cause problems in web client

2016-08-08 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-4304:

Fix Version/s: (was: 1.2)
   1.2.0

> Jar names that contain whitespace cause problems in web client
> --
>
> Key: FLINK-4304
> URL: https://issues.apache.org/jira/browse/FLINK-4304
> Project: Flink
>  Issue Type: Bug
>  Components: Web Client
>Affects Versions: 1.1.0
>Reporter: Timo Walther
>Assignee: Timo Walther
> Fix For: 1.2.0
>
>
> If the Jar file name contains whitespace, the web client cannot start or 
> delete a job:
> {code}
> org.apache.flink.client.program.ProgramInvocationException: JAR file does not 
> exist 
> '/var/folders/w8/k702f8s1017dfbfx_qlv2p24gn/T/flink-web-upload-4c52b922-8307-4098-b196-58b971864c51/980cad63-304c-48bb-a403-4756aea26ab4_Word%20Count.jar'
>   at 
> org.apache.flink.client.program.PackagedProgram.checkJarFile(PackagedProgram.java:755)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:181)
>   at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:147)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:91)
>   at 
> org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleRequest(JarRunHandler.java:50)
> {code}





[jira] [Resolved] (FLINK-3138) Method References are not supported as lambda expressions

2016-08-08 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther resolved FLINK-3138.
-
   Resolution: Fixed
Fix Version/s: 1.2.0

Fixed in ff777084ba2e1f1070ce5ecbc5afc122756ba851.

> Method References are not supported as lambda expressions
> -
>
> Key: FLINK-3138
> URL: https://issues.apache.org/jira/browse/FLINK-3138
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.10.2
>Reporter: Stephan Ewen
>Assignee: Timo Walther
> Fix For: 1.2.0, 1.0.0
>
>
> For many functions (here for example KeySelectors), one can use lambda 
> expressions:
> {code}
> DataStream stream = ...;
> stream.keyBy( v -> v.getId() )
> {code}
> Java's other syntax for this, method references, does not work:
> {code}
> DataStream stream = ...;
> stream.keyBy( MyType::getId )
> {code}
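
The equivalence the issue relies on can be illustrated in plain Java (not Flink's API; the class and method names below are invented). A lambda and the corresponding method reference implement the same functional interface, which is why type extraction for e.g. KeySelectors should accept both forms:

```java
import java.util.function.Function;

// Plain-Java illustration (not Flink's API): a lambda and the equivalent
// method reference implement the same functional interface, which is why
// Flink's type extraction should accept both forms for e.g. KeySelectors.
public class MethodRefDemo {
    static class MyType {
        private final long id;
        MyType(long id) { this.id = id; }
        long getId() { return id; }
    }

    // Stand-in for an API that takes a key selector, like keyBy.
    static long applyKey(Function<MyType, Long> keySelector, MyType value) {
        return keySelector.apply(value);
    }

    public static void main(String[] args) {
        MyType v = new MyType(42L);
        long fromLambda = applyKey(x -> x.getId(), v);    // lambda form
        long fromMethodRef = applyKey(MyType::getId, v);  // method reference form
        System.out.println(fromLambda == fromMethodRef);  // both select the same key
    }
}
```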





[jira] [Commented] (FLINK-4282) Add Offset Parameter to WindowAssigners

2016-08-08 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411582#comment-15411582
 ] 

Aljoscha Krettek commented on FLINK-4282:
-

Hi,
the offset just has to be taken into account when defining the start/end 
timestamps of the assigned windows. This way, the rest of the system does not 
need to be aware of the offset.

> Add Offset Parameter to WindowAssigners
> ---
>
> Key: FLINK-4282
> URL: https://issues.apache.org/jira/browse/FLINK-4282
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>
> Currently, windows are always aligned to EPOCH, which basically means days 
> are aligned with GMT. This is somewhat problematic for people living in 
> different timezones.
> An offset parameter would allow adapting the window assigner to the timezone.
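
Aljoscha's suggestion, folding the offset into the assigned window's start and end timestamps, can be sketched as follows (illustrative only; hypothetical names, not Flink's actual assigner code):

```java
// Illustrative sketch (not Flink's TumblingProcessingTimeWindows): fold the
// offset directly into the window start so the rest of the system never
// needs to know about it. Windows are aligned to (EPOCH + offset) instead
// of EPOCH. All values in milliseconds.
public class WindowOffsetSketch {
    // Start of the tumbling window that contains `timestamp`. The extra
    // `+ windowSize` keeps the remainder non-negative for timestamps that
    // fall shortly before the offset.
    static long windowStart(long timestamp, long offset, long windowSize) {
        return timestamp - ((timestamp - offset + windowSize) % windowSize);
    }

    public static void main(String[] args) {
        // With size 100 and offset 10, windows are ..., [-90, 10), [10, 110), [110, 210), ...
        System.out.println(windowStart(5, 10, 100));    // -90: window [-90, 10) contains timestamp 5
        System.out.println(windowStart(115, 10, 100));  // 110: window [110, 210) contains timestamp 115
    }
}
```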





[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground

2016-08-08 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411588#comment-15411588
 ] 

Aljoscha Krettek commented on FLINK-4326:
-

[~iemejia] I think you would also be interested in this.

> Flink start-up scripts should optionally start services on the foreground
> -
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been 
> addressed.  Flink start-up scripts start the job and task managers in the 
> background.  This makes it difficult to integrate Flink with most process 
> supervisory tools and init systems, including Docker.  One can get around 
> this via hacking the scripts or manually starting the right classes via Java, 
> but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start up scripts 
> should use exec instead of running the commands, so as to avoid forks.  Many 
> supervisory tools assume the PID of the process to be monitored is that of 
> the process it first executes, and fork chains make it difficult for the 
> supervisor to figure out what process to monitor.  Specifically, 
> jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and 
> flink-daemon.sh should exec java.





[jira] [Commented] (FLINK-3899) Document window processing with Reduce/FoldFunction + WindowFunction

2016-08-08 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411590#comment-15411590
 ] 

Aljoscha Krettek commented on FLINK-3899:
-

Yes, I think that would be enough.

> Document window processing with Reduce/FoldFunction + WindowFunction
> 
>
> Key: FLINK-3899
> URL: https://issues.apache.org/jira/browse/FLINK-3899
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Streaming
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>
> The streaming documentation does not describe how windows can be processed 
> with FoldFunction or ReduceFunction and a subsequent WindowFunction. This 
> combination allows for eager window aggregation (only a single element is 
> kept in the window) and access to the Window object, e.g., to obtain the 
> window's start and end time.





[GitHub] flink issue #2337: [FLINK-3042] [FLINK-3060] [types] Define a way to let typ...

2016-08-08 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2337
  
There are tests failing. I will update the PR later today.




[jira] [Commented] (FLINK-3042) Define a way to let types create their own TypeInformation

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411592#comment-15411592
 ] 

ASF GitHub Bot commented on FLINK-3042:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2337
  
There are tests failing. I will update the PR later today.


> Define a way to let types create their own TypeInformation
> --
>
> Key: FLINK-3042
> URL: https://issues.apache.org/jira/browse/FLINK-3042
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Timo Walther
> Fix For: 1.0.0
>
>
> Currently, introducing new Types that should have specific TypeInformation 
> requires
>   - Either integration with the TypeExtractor
>   - Or manually constructing the TypeInformation (potentially at every place) 
> and using type hints everywhere.
> I propose to add a way to allow classes to create their own TypeInformation 
> (like a static method "createTypeInfo()").
> To support generic nested types (like Optional / Either), the type extractor 
> would provide a Map of what generic variables map to what types (deduced from 
> the input). The class can use that to create the correct nested 
> TypeInformation (possibly by calling the TypeExtractor again, passing the Map 
> of generic bindings).
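
The proposal can be sketched in plain Java (all names below are invented for illustration; this is not Flink's actual TypeInformation API). A type exposes a static factory that builds its own type descriptor, resolving generic variables from a binding map supplied by the extractor:

```java
import java.util.Map;

// Hypothetical sketch of the proposed pattern (invented names, not Flink's
// actual API): a class creates its own type descriptor via a static factory,
// resolving generic variables from a map of bindings deduced from the input.
public class CreateTypeInfoSketch {
    // Stand-in for Flink's TypeInformation.
    interface TypeInfo { String describe(); }

    static final class SimpleTypeInfo implements TypeInfo {
        private final String name;
        SimpleTypeInfo(String name) { this.name = name; }
        public String describe() { return name; }
    }

    // A generic container similar to Optional/Either. The static factory
    // receives the generic bindings ("T" -> concrete type) from the extractor.
    static final class MyOption<T> {
        // Mirrors the static "createTypeInfo()" method proposed in the issue.
        static TypeInfo createTypeInfo(Map<String, TypeInfo> genericBindings) {
            TypeInfo element = genericBindings.get("T");
            return new SimpleTypeInfo("MyOption<" + element.describe() + ">");
        }
    }

    public static void main(String[] args) {
        TypeInfo info = MyOption.createTypeInfo(Map.of("T", new SimpleTypeInfo("String")));
        System.out.println(info.describe());  // MyOption<String>
    }
}
```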





[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411602#comment-15411602
 ] 

ASF GitHub Bot commented on FLINK-4081:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2297
  
The length is not sufficient to differentiate between an empty column and an 
empty string, e.g. with quoted strings: `,,` vs. `,"",`.
With the changes in this PR we don't modify the return values, but record 
the information about an empty column in the error state.


> FieldParsers should support empty strings
> -
>
> Key: FLINK-4081
> URL: https://issues.apache.org/jira/browse/FLINK-4081
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Reporter: Flavio Pompermaier
>Assignee: Timo Walther
>  Labels: csvparser, table-api
>
> In order to parse CSV files using the new Table API that converts rows to Row 
> objects (that support null values), FieldParser implementations should 
> support empty strings by setting the parser state to 
> ParseErrorState.EMPTY_STRING (for example, FloatParser and DoubleParser 
> do not respect this constraint).





[GitHub] flink issue #2297: [FLINK-4081] [core] [table] FieldParsers should support e...

2016-08-08 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2297
  
The length is not sufficient to differentiate between an empty column and an 
empty string, e.g. with quoted strings: `,,` vs. `,"",`.
With the changes in this PR we don't modify the return values, but record 
the information about an empty column in the error state.
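
The distinction can be illustrated with a minimal sketch (not Flink's FieldParser; names invented here). Both `,,` and `,"",` yield a zero-length field value, so only the quoting, reported as a distinct parser state, can tell a missing value from an explicitly empty string:

```java
// Illustrative sketch (not Flink's FieldParser): why field length alone
// cannot distinguish a missing value from an explicitly empty string.
// Both ,, and ,"", yield a zero-length value; only the quoting differs,
// so the parser must report "empty column" as a distinct state.
public class EmptyFieldSketch {
    enum FieldState { VALUE, EMPTY_STRING, EMPTY_COLUMN }

    // Classify one raw CSV field (already split on the delimiter,
    // quotes not yet stripped).
    static FieldState classify(String rawField) {
        if (rawField.isEmpty()) {
            return FieldState.EMPTY_COLUMN;   // ,, case: nothing between delimiters
        }
        if (rawField.equals("\"\"")) {
            return FieldState.EMPTY_STRING;   // ,"", case: quoted empty string
        }
        return FieldState.VALUE;
    }

    public static void main(String[] args) {
        System.out.println(classify(""));      // EMPTY_COLUMN
        System.out.println(classify("\"\""));  // EMPTY_STRING
        System.out.println(classify("abc"));   // VALUE
    }
}
```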




[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground

2016-08-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411605#comment-15411605
 ] 

Ismaël Mejía commented on FLINK-4326:
-

For reference, this is the commit with my proposed solution:
https://github.com/iemejia/flink/commit/af9ac6cdb3f6601d6248abe82df4fd44de4453e5

Notice that I finally closed the pull request, since we could work around this 
via `&& wait` in the Docker script, which was my goal, but I still agree that 
having foreground processes has its value. For reference, the JIRA where we 
discussed this early on:
https://issues.apache.org/jira/browse/FLINK-4208

Is there an alternative PR for this?

> Flink start-up scripts should optionally start services on the foreground
> -
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been 
> addressed.  Flink start-up scripts start the job and task managers in the 
> background.  This makes it difficult to integrate Flink with most process 
> supervisory tools and init systems, including Docker.  One can get around 
> this via hacking the scripts or manually starting the right classes via Java, 
> but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start up scripts 
> should use exec instead of running the commands, so as to avoid forks.  Many 
> supervisory tools assume the PID of the process to be monitored is that of 
> the process it first executes, and fork chains make it difficult for the 
> supervisor to figure out what process to monitor.  Specifically, 
> jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and 
> flink-daemon.sh should exec java.





[jira] [Commented] (FLINK-4282) Add Offset Parameter to WindowAssigners

2016-08-08 Thread Aditi Viswanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411608#comment-15411608
 ] 

Aditi Viswanathan commented on FLINK-4282:
--

Hi Aljoscha,

I've written down an example scenario to make sure that my understanding of
the problem is correct:

Scenario:
Current processing time: 5
Offset: 10
Window size: 100

1st window: start = 10, end = 110; but elements are assigned to this window
from processing time 0 to 100, so the element that comes at processing time
= 5 is assigned to this window

2nd window: start = 110, end = 210; but elements are assigned to this
window from processing time 100 to 200

My implementation will work for the scenario above. I've passed the offset
to the TimeWindow so that the window triggers at 100 (window end - offset)
rather than at 110 (window end) as is the current implementation:

(From ProcessingTimeTrigger)

public TriggerResult onElement(Object element, long timestamp,
TimeWindow window, TriggerContext ctx) {
   ctx.registerProcessingTimeTimer(window.maxTimestamp() - window.getOffset());
   return TriggerResult.CONTINUE;
}

(From TimeWindow)

this.end = end + offset;

@Override
public long maxTimestamp() {
   return end - 1;
}

Please let me know if this makes sense to you, or if there's something
I'm missing.

Thanks,
Aditi








> Add Offset Parameter to WindowAssigners
> ---
>
> Key: FLINK-4282
> URL: https://issues.apache.org/jira/browse/FLINK-4282
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>
> Currently, windows are always aligned to EPOCH, which basically means days 
> are aligned with GMT. This is somewhat problematic for people living in 
> different timezones.
> And offset parameter would allow to adapt the window assigner to the timezone.





[jira] [Commented] (FLINK-3703) Add sequence matching semantics to discard matched events

2016-08-08 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411622#comment-15411622
 ] 

Till Rohrmann commented on FLINK-3703:
--

Hi [~ivan.mushketyk],

Yes, you're right with your description of the matching behaviours. The list 
need not be exhaustive. If you can think of other interesting 
matching behaviours, then you can add them as well.

The matching semantics should be assigned to the pattern definition and then 
applied to each sequence individually. For "after first" this would mean that 
you remove all elements of the matched sequence from the shared buffer and all 
other elements which have the first element as a predecessor (also 
transitively). I hope this makes things a bit clearer :-)

> Add sequence matching semantics to discard matched events
> -
>
> Key: FLINK-3703
> URL: https://issues.apache.org/jira/browse/FLINK-3703
> Project: Flink
>  Issue Type: Improvement
>  Components: CEP
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Till Rohrmann
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> There is no easy way to decide whether events can be part of multiple 
> matching sequences or not. Currently, the default is that an event can 
> participate in multiple matching sequences. E.g. if you have the pattern 
> {{Pattern.begin("a").followedBy("b")}} and the input event stream 
> {{Event("A"), Event("B"), Event("C")}}, then you will generate the following 
> matching sequences: {{Event("A"), Event("B")}}, {{Event("A"), Event("C")}} 
> and {{Event("B"), Event("C")}}. 
> It would be useful to allow the user to define where the matching algorithm 
> should continue after a matching sequence has been found. Possible option 
> values could be 
>  * {{from first}} - continue keeping all events for future matches (that is 
> the current behaviour) 
>  * {{after first}} -  continue after the first element (remove first matching 
> event and continue with the second event)
>  * {{after last}} - continue after the last element (effectively discarding 
> all elements of the matching sequence)
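A toy model of the three options, using the {{Event("A"), Event("B"), 
Event("C")}} example from the description (plain Python; the pattern 
{{begin("a").followedBy("b")}} is simplified here to "any ordered pair of 
events", and the shared-buffer mechanics are ignored entirely):

```python
def match_pairs(events, mode):
    """Enumerate pair matches under a given after-match behaviour."""
    if mode == "from_first":
        # Every event may participate in several matches (current behaviour).
        return [(events[i], events[j])
                for i in range(len(events))
                for j in range(i + 1, len(events))]
    results, start = [], 0
    while start + 1 < len(events):
        results.append((events[start], events[start + 1]))
        # "after_first": drop only the first matched event and continue;
        # "after_last": drop the whole matched sequence and continue.
        start += 1 if mode == "after_first" else 2
    return results
```

On ["A", "B", "C"] this yields all three pairs for "from_first", the pairs
(A,B) and (B,C) for "after_first", and only (A,B) for "after_last".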





[GitHub] flink issue #2244: [FLINK-3874] Add a Kafka TableSink with JSON serializatio...

2016-08-08 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2244
  
@twalthr Great! Thank you for your help.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3874) Add a Kafka TableSink with JSON serialization

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411623#comment-15411623
 ] 

ASF GitHub Bot commented on FLINK-3874:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2244
  
@twalthr Great! Thank you for your help.


> Add a Kafka TableSink with JSON serialization
> -
>
> Key: FLINK-3874
> URL: https://issues.apache.org/jira/browse/FLINK-3874
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Fabian Hueske
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> Add a TableSink that writes JSON serialized data to Kafka.





[jira] [Created] (FLINK-4329) Streaming File Source Must Correctly Handle Timestamps/Watermarks

2016-08-08 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-4329:
---

 Summary: Streaming File Source Must Correctly Handle 
Timestamps/Watermarks
 Key: FLINK-4329
 URL: https://issues.apache.org/jira/browse/FLINK-4329
 Project: Flink
  Issue Type: Bug
  Components: Streaming Connectors
Affects Versions: 1.1.0
Reporter: Aljoscha Krettek
 Fix For: 1.1.1


The {{ContinuousFileReaderOperator}} does not correctly deal with watermarks, 
i.e. they are just passed through. This means that when the 
{{ContinuousFileMonitoringFunction}} closes and emits a {{Long.MAX_VALUE}} 
watermark, that watermark can "overtake" the records that are still to be 
emitted by the {{ContinuousFileReaderOperator}}. Together with the new "allowed 
lateness" setting in the window operator, this can lead to elements being 
dropped as late.
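A minimal model of the overtaking problem (a Python sketch with invented
names; a real window operator is far more involved). If the final
{{Long.MAX_VALUE}} watermark is forwarded before the reader drains its
buffered records, a zero-lateness downstream operator drops those records:

```python
MAX_WATERMARK = 2**63 - 1  # Long.MAX_VALUE

def run_downstream(stream):
    # Toy operator: a record whose timestamp is <= the current
    # watermark counts as late and is dropped (zero allowed lateness).
    watermark, emitted, dropped = -MAX_WATERMARK, [], []
    for kind, value in stream:
        if kind == "watermark":
            watermark = max(watermark, value)
        elif value <= watermark:
            dropped.append(value)
        else:
            emitted.append(value)
    return emitted, dropped

# The watermark overtakes a record still buffered in the reader:
overtaken = [("record", 5), ("watermark", MAX_WATERMARK), ("record", 7)]
print(run_downstream(overtaken))  # ([5], [7]) -- record 7 is lost
```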

Also, {{ContinuousFileReaderOperator}} does not correctly assign ingestion 
timestamps since it is not technically a source but looks like one to the user.





[GitHub] flink issue #2297: [FLINK-4081] [core] [table] FieldParsers should support e...

2016-08-08 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2297
  
Okay, I think the main source of confusion here is that `EMPTY_STRING` does 
not actually refer to an empty string (as specific to the string parser) but to 
something like "null", "empty column", or "missing value".

Let's call it something like that, and then it actually makes sense to me to 
communicate it via an error state, because it is actually not a regular value 
(as an empty string would be).
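As a sketch of that idea (a Python stand-in for the Java parsers; the state
name `MISSING_VALUE` is an invented placeholder for whatever replaces
`EMPTY_STRING`), the parser reports an empty column through its error state
instead of producing a value:

```python
from enum import Enum

class ParseErrorState(Enum):
    NONE = 0
    MISSING_VALUE = 1            # hypothetical rename of EMPTY_STRING
    NUMERIC_VALUE_FORMAT_ERROR = 2

def parse_double(field):
    """Return (value, error_state); an empty column is signalled via
    the error state rather than returned as a regular value."""
    if field == "":
        return None, ParseErrorState.MISSING_VALUE
    try:
        return float(field), ParseErrorState.NONE
    except ValueError:
        return None, ParseErrorState.NUMERIC_VALUE_FORMAT_ERROR
```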





[jira] [Commented] (FLINK-4081) FieldParsers should support empty strings

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411745#comment-15411745
 ] 

ASF GitHub Bot commented on FLINK-4081:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2297
  
Okay, I think the main source of confusion here is that `EMPTY_STRING` does 
not actually refer to an empty string (as specific to the string parser) but to 
something like "null", "empty column", or "missing value".

Let's call it something like that, and then it actually makes sense to me to 
communicate it via an error state, because it is actually not a regular value 
(as an empty string would be).



> FieldParsers should support empty strings
> -
>
> Key: FLINK-4081
> URL: https://issues.apache.org/jira/browse/FLINK-4081
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Reporter: Flavio Pompermaier
>Assignee: Timo Walther
>  Labels: csvparser, table-api
>
> In order to parse CSV files using the new Table API that converts rows to Row 
> objects (which support null values), FieldParser implementations should 
> support empty strings by setting the parser state to 
> ParseErrorState.EMPTY_STRING (for example, FloatParser and DoubleParser 
> do not respect this constraint)





[GitHub] flink issue #2261: [FLINK-4226] Typo: Define Keys using Field Expressions ex...

2016-08-08 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2261
  
merging ...




[jira] [Commented] (FLINK-4226) Typo: Define Keys using Field Expressions example should use window and not reduce

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411759#comment-15411759
 ] 

ASF GitHub Bot commented on FLINK-4226:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2261
  
merging ...


> Typo: Define Keys using Field Expressions example should use window and not 
> reduce
> --
>
> Key: FLINK-4226
> URL: https://issues.apache.org/jira/browse/FLINK-4226
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.0.3
>Reporter: Ahmad Ragab
>Priority: Trivial
>  Labels: documentation
>
> ...
> {code:java}
> val words: DataStream[WC] = // [...]
> val wordCounts = words.keyBy("word").window(/*window specification*/)
> // or, as a case class, which is less typing
> case class WC(word: String, count: Int)
> val words: DataStream[WC] = // [...]
> val wordCounts = words.keyBy("word").reduce(/*window specification*/)
> {code}
> Should be: 
> val wordCounts = words.keyBy("word").-reduce- window(/*window specification*/)





[jira] [Closed] (FLINK-4226) Typo: Define Keys using Field Expressions example should use window and not reduce

2016-08-08 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-4226.
-
   Resolution: Fixed
Fix Version/s: 1.2.0

Resolved in http://git-wip-us.apache.org/repos/asf/flink/commit/0e0c40d8

Thank you for the contribution.






[GitHub] flink issue #2261: [FLINK-4226] Typo: Define Keys using Field Expressions ex...

2016-08-08 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2261
  
@ASRagab: Could you close the pull request? I cannot close the PR from the 
GitHub interface, and I forgot to include the magic string to close it from the 
commit message.




[jira] [Commented] (FLINK-4226) Typo: Define Keys using Field Expressions example should use window and not reduce

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411767#comment-15411767
 ] 

ASF GitHub Bot commented on FLINK-4226:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2261
  
@ASRagab: Could you close the pull request? I cannot close the PR from the 
GitHub interface, and I forgot to include the magic string to close it from the 
commit message.







[jira] [Commented] (FLINK-4226) Typo: Define Keys using Field Expressions example should use window and not reduce

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411771#comment-15411771
 ] 

ASF GitHub Bot commented on FLINK-4226:
---

Github user ASRagab commented on the issue:

https://github.com/apache/flink/pull/2261
  
Closing...







[GitHub] flink issue #2261: [FLINK-4226] Typo: Define Keys using Field Expressions ex...

2016-08-08 Thread ASRagab
Github user ASRagab commented on the issue:

https://github.com/apache/flink/pull/2261
  
Closing...




[jira] [Commented] (FLINK-4226) Typo: Define Keys using Field Expressions example should use window and not reduce

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411772#comment-15411772
 ] 

ASF GitHub Bot commented on FLINK-4226:
---

Github user ASRagab closed the pull request at:

https://github.com/apache/flink/pull/2261







[GitHub] flink pull request #2261: [FLINK-4226] Typo: Define Keys using Field Express...

2016-08-08 Thread ASRagab
Github user ASRagab closed the pull request at:

https://github.com/apache/flink/pull/2261




[GitHub] flink issue #2318: [FLINK-4291] Add log entry for scheduled reporters

2016-08-08 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2318
  
merging




[GitHub] flink issue #2312: [FLINK-4276] Fix TextInputFormatTest#testNestedFileRead

2016-08-08 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2312
  
merging




[jira] [Commented] (FLINK-4291) No log entry for unscheduled reporters

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411777#comment-15411777
 ] 

ASF GitHub Bot commented on FLINK-4291:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2318
  
merging


> No log entry for unscheduled reporters
> --
>
> Key: FLINK-4291
> URL: https://issues.apache.org/jira/browse/FLINK-4291
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> When a non-Scheduled reporter is configured, no log message is printed. It 
> would be nice to print a log message for every instantiated reporter, not 
> just Scheduled ones.





[jira] [Commented] (FLINK-4276) TextInputFormatTest.testNestedFileRead fails on Windows OS

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411776#comment-15411776
 ] 

ASF GitHub Bot commented on FLINK-4276:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/2312
  
merging


> TextInputFormatTest.testNestedFileRead fails on Windows OS
> --
>
> Key: FLINK-4276
> URL: https://issues.apache.org/jira/browse/FLINK-4276
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats, Tests
>Affects Versions: 1.1.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Trivial
>  Labels: test-stability
>
> Stack trace I got when running the test on W10:
> Running org.apache.flink.api.java.io.TextInputFormatTest
> test failed with exception: null
> java.lang.AssertionError
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at 
> org.apache.flink.api.java.io.TextInputFormatTest.testNestedFileRead(TextInputFormatTest.java:133)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)





[jira] [Closed] (FLINK-4276) TextInputFormatTest.testNestedFileRead fails on Windows OS

2016-08-08 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-4276.
---
   Resolution: Fixed
Fix Version/s: 1.2.0

Fixed in a2219b0b6258077fedc355d71d1846e3142d3e15






[GitHub] flink pull request #2312: [FLINK-4276] Fix TextInputFormatTest#testNestedFil...

2016-08-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2312




[GitHub] flink pull request #2318: [FLINK-4291] Add log entry for scheduled reporters

2016-08-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2318




[jira] [Commented] (FLINK-4291) No log entry for unscheduled reporters

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411783#comment-15411783
 ] 

ASF GitHub Bot commented on FLINK-4291:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2318







[jira] [Commented] (FLINK-4276) TextInputFormatTest.testNestedFileRead fails on Windows OS

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411782#comment-15411782
 ] 

ASF GitHub Bot commented on FLINK-4276:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2312







[jira] [Closed] (FLINK-4291) No log entry for unscheduled reporters

2016-08-08 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-4291.
---
   Resolution: Fixed
 Assignee: Chesnay Schepler
Fix Version/s: 1.2.0

Fixed in f1e9daece516db98aad15818e90e4e3bf78a6e13






[GitHub] flink issue #2261: [FLINK-4226] Typo: Define Keys using Field Expressions ex...

2016-08-08 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2261
  
Thank you!




[jira] [Commented] (FLINK-4226) Typo: Define Keys using Field Expressions example should use window and not reduce

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411786#comment-15411786
 ] 

ASF GitHub Bot commented on FLINK-4226:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2261
  
Thank you!







[jira] [Created] (FLINK-4330) Consider removing min()/minBy()/max()/maxBy()/sum() utility methods from the DataStream API

2016-08-08 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-4330:
-

 Summary: Consider removing min()/minBy()/max()/maxBy()/sum() 
utility methods from the DataStream API
 Key: FLINK-4330
 URL: https://issues.apache.org/jira/browse/FLINK-4330
 Project: Flink
  Issue Type: Sub-task
  Components: DataStream API
Reporter: Robert Metzger
 Fix For: 2.0.0


I think we should consider removing the min()/minBy()/max()/maxBy()/sum() 
utility methods from the DataStream API. They make the maintenance of the code 
unnecessarily complicated, and don't add enough value for the users (who cannot 
access the window metadata).

If we are keeping the methods, we should consolidate the min/minBy methods: the 
difference is subtle, and minBy can subsume the min method.






[GitHub] flink issue #2323: [FLINK-2090] toString of CollectionInputFormat takes long...

2016-08-08 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2323
  
@mushketyk Are you going to update this pull request?




[jira] [Commented] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411840#comment-15411840
 ] 

ASF GitHub Bot commented on FLINK-2090:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2323
  
@mushketyk Are you going to update this pull request?


> toString of CollectionInputFormat takes long time when the collection is huge
> -
>
> Key: FLINK-2090
> URL: https://issues.apache.org/jira/browse/FLINK-2090
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on 
> its underlying {{Collection}}. Thus, {{toString}} is called for each element 
> of the collection. If the {{Collection}} contains many elements or the 
> individual {{toString}} calls for each element take a long time, then the 
> string generation can take a considerable amount of time. [~mikiobraun] 
> noticed that when he inserted several jBLAS matrices into Flink.
> The {{toString}} method is mainly used for logging statements in 
> {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and 
> in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it 
> is necessary to print the complete content of the underlying {{Collection}} 
> or if it's not enough to print only the first 3 elements in the {{toString}} 
> method.
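Printing only the first few elements is easy to sketch in plain Java. This is a minimal illustration of the truncation idea, not Flink's actual `CollectionInputFormat` code; the class and method names are hypothetical:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class BoundedToStringDemo {
    // Renders at most maxElements entries, then an ellipsis with the total size,
    // so toString stays cheap even for huge collections.
    static String describe(List<?> elements, int maxElements) {
        String head = elements.stream()
                .limit(maxElements)
                .map(Object::toString)
                .collect(Collectors.joining(", "));
        return elements.size() <= maxElements
                ? "[" + head + "]"
                : "[" + head + ", ... (" + elements.size() + " elements total)]";
    }

    public static void main(String[] args) {
        // A small list stands in for the "huge" collection from the issue.
        System.out.println(describe(Arrays.asList(1, 2, 3, 4, 5), 3));
    }
}
```

With this shape, log statements in `DataSourceNode` would see a bounded string regardless of collection size.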





[GitHub] flink issue #2337: [FLINK-3042] [FLINK-3060] [types] Define a way to let typ...

2016-08-08 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2337
  
@twalthr and I had an offline review of this.

The general approach is "+1".
We decided to see if the "generic input variable mapping" can be part of 
the `TypeInformation`, rather than the `TypeInfoFactory`. That would make for 
cleaner code.





[jira] [Commented] (FLINK-3042) Define a way to let types create their own TypeInformation

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411858#comment-15411858
 ] 

ASF GitHub Bot commented on FLINK-3042:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2337
  
@twalthr and I had an offline review of this.

The general approach is "+1".
We decided to see if the "generic input variable mapping" can be part of 
the `TypeInformation`, rather than the `TypeInfoFactory`. That would make for 
cleaner code.



> Define a way to let types create their own TypeInformation
> --
>
> Key: FLINK-3042
> URL: https://issues.apache.org/jira/browse/FLINK-3042
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.10.0
>Reporter: Stephan Ewen
>Assignee: Timo Walther
> Fix For: 1.0.0
>
>
> Currently, introducing new Types that should have specific TypeInformation 
> requires
>   - Either integration with the TypeExtractor
>   - Or manually constructing the TypeInformation (potentially at every place) 
> and using type hints everywhere.
> I propose to add a way to allow classes to create their own TypeInformation 
> (like a static method "createTypeInfo()").
> To support generic nested types (like Optional / Either), the type extractor 
> would provide a Map of what generic variables map to what types (deduced from 
> the input). The class can use that to create the correct nested 
> TypeInformation (possibly by calling the TypeExtractor again, passing the Map 
> of generic bindings).
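The proposal can be sketched in plain Java. Every name below (`TypeInfo`, `SimpleTypeInfo`, `EitherLike`, `createTypeInfo`) is a hypothetical stand-in for illustration only, not Flink's actual API: a generic type builds its own type information from a map of generic-variable bindings that the extractor deduced from the input.

```java
import java.util.HashMap;
import java.util.Map;

public class TypeInfoFactoryDemo {
    interface TypeInfo { String describe(); }

    static final class SimpleTypeInfo implements TypeInfo {
        private final String name;
        SimpleTypeInfo(String name) { this.name = name; }
        public String describe() { return name; }
    }

    // A generic type that knows how to create its own type information.
    // The extractor would call this, passing what each generic variable
    // (here "L" and "R") was bound to.
    static final class EitherLike<L, R> {
        static TypeInfo createTypeInfo(Map<String, TypeInfo> genericBindings) {
            return new SimpleTypeInfo("Either<"
                    + genericBindings.get("L").describe() + ", "
                    + genericBindings.get("R").describe() + ">");
        }
    }

    public static void main(String[] args) {
        // Bindings as the extractor would deduce them from the input type.
        Map<String, TypeInfo> bindings = new HashMap<>();
        bindings.put("L", new SimpleTypeInfo("String"));
        bindings.put("R", new SimpleTypeInfo("Integer"));
        System.out.println(EitherLike.createTypeInfo(bindings).describe());
    }
}
```

The point of the pattern is that the nested type information is assembled by the class itself, possibly by calling the extractor recursively with the same bindings map.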





[jira] [Commented] (FLINK-4271) There is no way to set parallelism of operators produced by CoGroupedStreams

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411852#comment-15411852
 ] 

ASF GitHub Bot commented on FLINK-4271:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2305
  
I am at the point where I would suggest actually breaking the API and applying 
a proper fix. Tough step, but the alternatives seem even worse.

I wrote a [discuss] mail to the dev list; let's see if someone objects, 
otherwise I would merge this in the next days (adding the japicmp exception 
rule).


> There is no way to set parallelism of operators produced by CoGroupedStreams
> 
>
> Key: FLINK-4271
> URL: https://issues.apache.org/jira/browse/FLINK-4271
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Wenlong Lyu
>Assignee: Jark Wu
>
> Currently, CoGroupedStreams packages the map/keyBy/window operators behind a 
> human-friendly interface, like: 
> dataStreamA.cogroup(streamB).where(...).equalsTo().window().apply(). Neither 
> the intermediate operators nor the final window operators can be accessed by 
> users, and we cannot set attributes of the operators, which makes co-group 
> hard to use in a production environment.





[jira] [Created] (FLINK-4331) Flink is not able to serialize scala classes / Task Not Serializable

2016-08-08 Thread Pushpendra Jaiswal (JIRA)
Pushpendra Jaiswal created FLINK-4331:
-

 Summary: Flink is not able to serialize scala classes / Task Not 
Serializable
 Key: FLINK-4331
 URL: https://issues.apache.org/jira/browse/FLINK-4331
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Affects Versions: 1.1.0
Reporter: Pushpendra Jaiswal


I have a Scala class with 2 fields that are vals, but Flink says it doesn't 
have setters, so the task is not serializable.

I tried adding setters using var, but then it reports a duplicate setter. The 
vals are public, so why is it asking for setters? Flink version 1.1.0

class Impression(val map: Map[String, String],val keySet:Set[String])
==

  val preAggregate = stream
.filter(impression => {
true
})
 .map(impression => {
  val xmap = impression.map
  val values = valFunction(xmap)
  new ImpressionRecord(impression, values._1, values._2, values._3)
})
class Impression does not contain a setter for field map
19:54:49.995 [main] INFO  o.a.f.a.java.typeutils.TypeExtractor - class Impression is not a valid POJO type
19:54:49.997 [main] DEBUG o.a.flink.api.scala.ClosureCleaner$ - accessedFields: Map(class -> Set())
Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: Task not serializable
	at org.apache.flink.api.scala.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:172)
	at )
Caused by: java.io.NotSerializableException: org.apache.flink.streaming.api.scala.DataStream
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
	at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
	at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:301)
	at org.apache.flink.api.scala.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:170)
	... 18 more





[GitHub] flink issue #2305: [FLINK-4271] [DataStreamAPI] Enable CoGroupedStreams and ...

2016-08-08 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2305
  
I am at the point where I would suggest actually breaking the API and applying 
a proper fix. Tough step, but the alternatives seem even worse.

I wrote a [discuss] mail to the dev list; let's see if someone objects, 
otherwise I would merge this in the next days (adding the japicmp exception 
rule).




[jira] [Commented] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411965#comment-15411965
 ] 

ASF GitHub Bot commented on FLINK-2090:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2323
  
@StephanEwen Sorry, somehow I missed your comment. I'll update the PR today.


> toString of CollectionInputFormat takes long time when the collection is huge
> -
>
> Key: FLINK-2090
> URL: https://issues.apache.org/jira/browse/FLINK-2090
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on 
> its underlying {{Collection}}. Thus, {{toString}} is called for each element 
> of the collection. If the {{Collection}} contains many elements or the 
> individual {{toString}} calls for each element take a long time, then the 
> string generation can take a considerable amount of time. [~mikiobraun] 
> noticed that when he inserted several jBLAS matrices into Flink.
> The {{toString}} method is mainly used for logging statements in 
> {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and 
> in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it 
> is necessary to print the complete content of the underlying {{Collection}} 
> or if it's not enough to print only the first 3 elements in the {{toString}} 
> method.





[GitHub] flink issue #2323: [FLINK-2090] toString of CollectionInputFormat takes long...

2016-08-08 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2323
  
@StephanEwen Sorry, somehow I missed your comment. I'll update the PR today.




[jira] [Commented] (FLINK-4331) Flink is not able to serialize scala classes / Task Not Serializable

2016-08-08 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412032#comment-15412032
 ] 

Stephan Ewen commented on FLINK-4331:
-

There are two different issues here:

  - The getters/setters which only have to do with an object being a POJO or 
not (that is only relevant for certain API features)
  - The serializability

The code is not serializable because it references a DataStream outside the 
function.

> Flink is not able to serialize scala classes / Task Not Serializable
> 
>
> Key: FLINK-4331
> URL: https://issues.apache.org/jira/browse/FLINK-4331
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Affects Versions: 1.1.0
>Reporter: Pushpendra Jaiswal
>
> I have a Scala class with 2 fields that are vals, but Flink says it doesn't 
> have setters, so the task is not serializable.
> I tried adding setters using var, but then it reports a duplicate setter. The 
> vals are public, so why is it asking for setters? Flink version 1.1.0
> class Impression(val map: Map[String, String],val keySet:Set[String])
> ==
>   val preAggregate = stream
> .filter(impression => {
> true
> })
>  .map(impression => {
>   val xmap = impression.map
>   val values = valFunction(xmap)
>   new ImpressionRecord(impression, values._1, values._2, values._3)
> })
> class Impression does not contain a setter for field map
> 19:54:49.995 [main] INFO  o.a.f.a.java.typeutils.TypeExtractor - class Impression is not a valid POJO type
> 19:54:49.997 [main] DEBUG o.a.flink.api.scala.ClosureCleaner$ - accessedFields: Map(class -> Set())
> Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: Task not serializable
> 	at org.apache.flink.api.scala.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:172)
> 	at )
> Caused by: java.io.NotSerializableException: org.apache.flink.streaming.api.scala.DataStream
> 	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
> 	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
> 	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
> 	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> 	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
> 	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
> 	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
> 	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> 	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
> 	at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
> 	at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:301)
> 	at org.apache.flink.api.scala.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:170)
> 	... 18 more





[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground

2016-08-08 Thread Milosz Tanski (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412040#comment-15412040
 ] 

Milosz Tanski commented on FLINK-4326:
--

This would also be valuable for people using systemd to run the service (which 
ships with pretty much all new Linux distros). That way users can rely on 
system policies for logging, since systemd automatically collects stdout.

> Flink start-up scripts should optionally start services on the foreground
> -
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been 
> addressed.  Flink start-up scripts start the job and task managers in the 
> background.  This makes it difficult to integrate Flink with most processes 
> supervisory tools and init systems, including Docker.  One can get around 
> this via hacking the scripts or manually starting the right classes via Java, 
> but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start-up scripts 
> should use exec instead of running the commands, so as to avoid forks.  Many 
> supervisory tools assume the PID of the process to be monitored is that of 
> the process it first executes, and fork chains make it difficult for the 
> supervisor to figure out what process to monitor.  Specifically, 
> jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and 
> flink-daemon.sh should exec java.





[jira] [Updated] (FLINK-4333) Name mixup in Savepoint versions

2016-08-08 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-4333:

Fix Version/s: 1.2.0

> Name mixup in Savepoint versions
> 
>
> Key: FLINK-4333
> URL: https://issues.apache.org/jira/browse/FLINK-4333
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>Priority: Trivial
> Fix For: 1.2.0
>
>
> The {{SavepointV0}} is serialized with the {{SavepointV1Serializer}}





[jira] [Created] (FLINK-4332) Savepoint Serializer mixed read()/readFully()

2016-08-08 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4332:
---

 Summary: Savepoint Serializer mixed read()/readFully()
 Key: FLINK-4332
 URL: https://issues.apache.org/jira/browse/FLINK-4332
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.1.0
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Priority: Critical
 Fix For: 1.2.0, 1.1.1


The {{SavepointV1Serializer}} accidentally used {{InputStream.read(byte[], int, 
int)}} where it should use {{InputStream.readFully(byte[], int, int)}}





[jira] [Updated] (FLINK-4333) Name mixup in Savepoint versions

2016-08-08 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-4333:

Affects Version/s: (was: 1.2.0)

> Name mixup in Savepoint versions
> 
>
> Key: FLINK-4333
> URL: https://issues.apache.org/jira/browse/FLINK-4333
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>Priority: Trivial
> Fix For: 1.2.0
>
>
> The {{SavepointV0}} is serialized with the {{SavepointV1Serializer}}





[jira] [Created] (FLINK-4333) Name mixup in Savepoint versions

2016-08-08 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4333:
---

 Summary: Name mixup in Savepoint versions
 Key: FLINK-4333
 URL: https://issues.apache.org/jira/browse/FLINK-4333
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.1.0, 1.2.0
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Priority: Trivial


The {{SavepointV0}} is serialized with the {{SavepointV1Serializer}}





[jira] [Commented] (FLINK-1707) Add an Affinity Propagation Library Method

2016-08-08 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412194#comment-15412194
 ] 

Vasia Kalavri commented on FLINK-1707:
--

Thank you for the example [~joseprupi]. I'll try to get back to you soon!

> Add an Affinity Propagation Library Method
> --
>
> Key: FLINK-1707
> URL: https://issues.apache.org/jira/browse/FLINK-1707
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Vasia Kalavri
>Assignee: Josep Rubió
>Priority: Minor
>  Labels: requires-design-doc
> Attachments: Binary_Affinity_Propagation_in_Flink_design_doc.pdf
>
>
> This issue proposes adding an implementation of the Affinity Propagation 
> algorithm as a Gelly library method and a corresponding example.
> The algorithm is described in paper [1] and a description of a vertex-centric 
> implementation can be found in [2].
> [1]: http://www.psi.toronto.edu/affinitypropagation/FreyDueckScience07.pdf
> [2]: http://event.cwi.nl/grades2014/00-ching-slides.pdf
> Design doc:
> https://docs.google.com/document/d/1QULalzPqMVICi8jRVs3S0n39pell2ZVc7RNemz_SGA4/edit?usp=sharing
> Example spreadsheet:
> https://docs.google.com/spreadsheets/d/1CurZCBP6dPb1IYQQIgUHVjQdyLxK0JDGZwlSXCzBcvA/edit?usp=sharing





[jira] [Commented] (FLINK-4330) Consider removing min()/minBy()/max()/maxBy()/sum() utility methods from the DataStream API

2016-08-08 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412244#comment-15412244
 ] 

Stephan Ewen commented on FLINK-4330:
-

In the {{DataSet}} API, these methods made sense. There was also a clear 
distinction between what {{max()}} and {{maxBy()}} do.
This does not work as well in the {{DataStream}} API due to some missing 
features.

I think we should deprecate and remove them by the time we have a suitable 
replacement via the Table API.

> Consider removing min()/minBy()/max()/maxBy()/sum() utility methods from the 
> DataStream API
> ---
>
> Key: FLINK-4330
> URL: https://issues.apache.org/jira/browse/FLINK-4330
> Project: Flink
>  Issue Type: Sub-task
>  Components: DataStream API
>Reporter: Robert Metzger
> Fix For: 2.0.0
>
>
> I think we should consider removing the min()/minBy()/max()/maxBy()/sum() 
> utility methods from the DataStream API. They make the maintenance of the 
> code unnecessary complicated, and don't add enough value for the users (who 
> can not access the window metadata).
> If we are keeping the methods, we should consolidate the min/minBy methods: 
> the difference is subtle, and minBy can subsume the min method.





[GitHub] flink pull request #2332: [FLINK-2055] Implement Streaming HBaseSink

2016-08-08 Thread delding
Github user delding commented on a diff in the pull request:

https://github.com/apache/flink/pull/2332#discussion_r73939778
  
--- Diff: 
flink-streaming-connectors/flink-connector-hbase/src/main/java/org/apache/flink/streaming/connectors/hbase/HBaseSink.java
 ---
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.hbase;
+
+import com.google.common.base.Strings;
+import com.google.common.collect.Lists;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
+import org.apache.flink.util.Preconditions;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.client.Durability;
+import org.apache.hadoop.hbase.client.Increment;
+import org.apache.hadoop.hbase.client.Mutation;
+import org.apache.hadoop.hbase.client.Put;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+
+public class HBaseSink<IN> extends RichSinkFunction<IN> {
+   private static final long serialVersionUID = 1L;
+
+   private static final Logger LOG = 
LoggerFactory.getLogger(HBaseSink.class);
+
+   private transient HBaseClient client;
+   private String tableName;
+   private HBaseMapper mapper;
+   private boolean writeToWAL = true;
+
+   public HBaseSink(String tableName, HBaseMapper mapper) {
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(tableName), 
"Table name cannot be null or empty");
+   Preconditions.checkArgument(mapper != null, "HBase mapper 
cannot be null");
+   this.tableName = tableName;
+   this.mapper = mapper;
+   }
+
+   public HBaseSink<IN> writeToWAL(boolean writeToWAL) {
+   this.writeToWAL = writeToWAL;
+   return this;
+   }
+
+   @Override
+   public void open(Configuration configuration) throws Exception {
+   try {
+   // use config files found in the classpath
+   client = new HBaseClient(HBaseConfiguration.create(), 
tableName);
+   } catch (IOException e) {
+   throw new RuntimeException("HBase sink preparation 
failed.", e);
+   }
+   }
+
+   @Override
+   public void invoke(IN value) throws Exception {
+   byte[] rowKey = mapper.rowKey(value);
+   List<Mutation> mutations = Lists.newArrayList();
+   for (HBaseMapper.HBaseColumn column : mapper.columns(value)) {
+   Mutation mutation;
+   if (column.isStandard()) {
--- End diff --

Thanks for these helpful comments :-)




[jira] [Commented] (FLINK-2055) Implement Streaming HBaseSink

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412334#comment-15412334
 ] 

ASF GitHub Bot commented on FLINK-2055:
---

Github user delding commented on a diff in the pull request:

https://github.com/apache/flink/pull/2332#discussion_r73939778
  
--- Diff: 
flink-streaming-connectors/flink-connector-hbase/src/main/java/org/apache/flink/streaming/connectors/hbase/HBaseSink.java
 ---
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.hbase;
+
+import com.google.common.base.Strings;
+import com.google.common.collect.Lists;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
+import org.apache.flink.util.Preconditions;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.client.Durability;
+import org.apache.hadoop.hbase.client.Increment;
+import org.apache.hadoop.hbase.client.Mutation;
+import org.apache.hadoop.hbase.client.Put;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+
+public class HBaseSink<IN> extends RichSinkFunction<IN> {
+   private static final long serialVersionUID = 1L;
+
+   private static final Logger LOG = 
LoggerFactory.getLogger(HBaseSink.class);
+
+   private transient HBaseClient client;
+   private String tableName;
+   private HBaseMapper mapper;
+   private boolean writeToWAL = true;
+
+   public HBaseSink(String tableName, HBaseMapper mapper) {
+   Preconditions.checkArgument(!Strings.isNullOrEmpty(tableName), 
"Table name cannot be null or empty");
+   Preconditions.checkArgument(mapper != null, "HBase mapper 
cannot be null");
+   this.tableName = tableName;
+   this.mapper = mapper;
+   }
+
+   public HBaseSink<IN> writeToWAL(boolean writeToWAL) {
+   this.writeToWAL = writeToWAL;
+   return this;
+   }
+
+   @Override
+   public void open(Configuration configuration) throws Exception {
+   try {
+   // use config files found in the classpath
+   client = new HBaseClient(HBaseConfiguration.create(), 
tableName);
+   } catch (IOException e) {
+   throw new RuntimeException("HBase sink preparation 
failed.", e);
+   }
+   }
+
+   @Override
+   public void invoke(IN value) throws Exception {
+   byte[] rowKey = mapper.rowKey(value);
+   List<Mutation> mutations = Lists.newArrayList();
+   for (HBaseMapper.HBaseColumn column : mapper.columns(value)) {
+   Mutation mutation;
+   if (column.isStandard()) {
--- End diff --

Thanks for these helpful comments :-)


> Implement Streaming HBaseSink
> -
>
> Key: FLINK-2055
> URL: https://issues.apache.org/jira/browse/FLINK-2055
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming, Streaming Connectors
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Hilmi Yildirim
>
> As per : 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Write-Stream-to-HBase-td1300.html





[jira] [Created] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart

2016-08-08 Thread Shannon Carey (JIRA)
Shannon Carey created FLINK-4334:


 Summary: Shaded Hadoop1 jar not fully excluded in Quickstart
 Key: FLINK-4334
 URL: https://issues.apache.org/jira/browse/FLINK-4334
 Project: Flink
  Issue Type: Bug
  Components: Quickstarts
Affects Versions: 1.0.3, 1.0.2, 1.0.1, 1.1.0
Reporter: Shannon Carey


The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink 
1.0.0 (see 
https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837),
 but the quickstart POMs both refer to it as flink-shaded-hadoop1.

If using "-Pbuild-jar", the problem is not encountered.





[GitHub] flink pull request #2341: [FLINK-4334] [quickstarts] Correctly exclude hadoo...

2016-08-08 Thread rehevkor5
GitHub user rehevkor5 opened a pull request:

https://github.com/apache/flink/pull/2341

[FLINK-4334] [quickstarts] Correctly exclude hadoop1 in quickstart POMs

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [ ] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rehevkor5/flink 
fix_hadoop1_not_excluded_in_quickstarts

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2341


commit 8cb7419a17359f0d61d632aae97f672d497ccc53
Author: Shannon Carey 
Date:   2016-08-08T20:04:23Z

[FLINK-4334] [quickstarts] Correctly exclude hadoop1 in quickstart POMs




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412412#comment-15412412
 ] 

ASF GitHub Bot commented on FLINK-4334:
---

GitHub user rehevkor5 opened a pull request:

https://github.com/apache/flink/pull/2341

[FLINK-4334] [quickstarts] Correctly exclude hadoop1 in quickstart POMs

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [ ] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rehevkor5/flink 
fix_hadoop1_not_excluded_in_quickstarts

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2341


commit 8cb7419a17359f0d61d632aae97f672d497ccc53
Author: Shannon Carey 
Date:   2016-08-08T20:04:23Z

[FLINK-4334] [quickstarts] Correctly exclude hadoop1 in quickstart POMs




> Shaded Hadoop1 jar not fully excluded in Quickstart
> ---
>
> Key: FLINK-4334
> URL: https://issues.apache.org/jira/browse/FLINK-4334
> Project: Flink
>  Issue Type: Bug
>  Components: Quickstarts
>Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3
>Reporter: Shannon Carey
>
> The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink 
> 1.0.0 (see 
> https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837),
>  but the quickstart POMs both refer to it as flink-shaded-hadoop1.
> If using "-Pbuild-jar", the problem is not encountered.





[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground

2016-08-08 Thread Elias Levy (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412490#comment-15412490
 ] 

Elias Levy commented on FLINK-4326:
---

Not just systemd, but also upstart (my particular case, within an Amazon Linux 
AMI), daemontools, daemon, and many other supervisory tools.

> Flink start-up scripts should optionally start services on the foreground
> -
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been 
> addressed.  Flink start-up scripts start the job and task managers in the 
> background.  This makes it difficult to integrate Flink with most process 
> supervisory tools and init systems, including Docker.  One can get around 
> this via hacking the scripts or manually starting the right classes via Java, 
> but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start-up scripts 
> should use exec instead of running the commands, so as to avoid forks.  Many 
> supervisory tools assume the PID of the process to be monitored is that of 
> the process it first executes, and fork chains make it difficult for the 
> supervisor to figure out what process to monitor.  Specifically, 
> jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and 
> flink-daemon.sh should exec java.





[GitHub] flink issue #2323: [FLINK-2090] toString of CollectionInputFormat takes long...

2016-08-08 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2323
  
@StephanEwen I've updated the PR according to your review.




[jira] [Commented] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge

2016-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412528#comment-15412528
 ] 

ASF GitHub Bot commented on FLINK-2090:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2323
  
@StephanEwen I've updated the PR according to your review.


> toString of CollectionInputFormat takes long time when the collection is huge
> -
>
> Key: FLINK-2090
> URL: https://issues.apache.org/jira/browse/FLINK-2090
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on 
> its underlying {{Collection}}. Thus, {{toString}} is called for each element 
> of the collection. If the {{Collection}} contains many elements or the 
> individual {{toString}} calls for each element take a long time, then the 
> string generation can take a considerable amount of time. [~mikiobraun] 
> noticed that when he inserted several jBLAS matrices into Flink.
> The {{toString}} method is mainly used for logging statements in 
> {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and 
> in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it 
> is necessary to print the complete content of the underlying {{Collection}} 
> or if it's not enough to print only the first 3 elements in the {{toString}} 
> method.
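The truncation proposed above — printing only the first few elements — can be sketched as follows; the method and constant names are illustrative, not Flink's actual implementation:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

public class BoundedToString {
    // Cap on how many elements the string representation includes.
    static final int MAX_ELEMENTS = 3;

    static String describe(Collection<?> data) {
        StringBuilder sb = new StringBuilder("[");
        Iterator<?> it = data.iterator();
        for (int i = 0; i < MAX_ELEMENTS && it.hasNext(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(it.next());
        }
        if (it.hasNext()) {
            sb.append(", ...");  // elide the rest; cost stays O(MAX_ELEMENTS)
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        // Huge collections no longer dominate the cost of log statements.
        System.out.println(describe(Arrays.asList(1, 2, 3, 4, 5)));  // [1, 2, 3, ...]
        System.out.println(describe(Arrays.asList(1, 2)));           // [1, 2]
    }
}
```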





[jira] [Created] (FLINK-4335) Add jar id, and job parameters information to job status rest call

2016-08-08 Thread Zhenzhong Xu (JIRA)
Zhenzhong Xu created FLINK-4335:
---

 Summary: Add jar id, and job parameters information to job status 
rest call
 Key: FLINK-4335
 URL: https://issues.apache.org/jira/browse/FLINK-4335
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Reporter: Zhenzhong Xu
Priority: Minor


From a declarative, reconciliation-based job management perspective, there is a 
need to identify the job jar id and all job parameters for a running job to 
determine whether the current job is up to date. 

I think this information needs to be available through the job manager rest 
call (/jobs/$id).







[jira] [Created] (FLINK-4336) Expose ability to take a savepoint from job manager rest api

2016-08-08 Thread Zhenzhong Xu (JIRA)
Zhenzhong Xu created FLINK-4336:
---

 Summary: Expose ability to take a savepoint from job manager rest 
api
 Key: FLINK-4336
 URL: https://issues.apache.org/jira/browse/FLINK-4336
 Project: Flink
  Issue Type: Improvement
  Components: Webfrontend
Reporter: Zhenzhong Xu
Priority: Minor


There is a need to interact with the job manager rest api to manage savepoint 
snapshots.





[jira] [Updated] (FLINK-4335) Add jar id, and job parameters information to job status rest call

2016-08-08 Thread Zhenzhong Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenzhong Xu updated FLINK-4335:

Description: 
From a declarative, reconciliation-based job management perspective, there is a 
need to identify the job jar id and all job parameters for a running job to 
determine whether the current job is up to date. 

I think this information needs to be available through the job manager rest 
call (/jobs/$id).

* Jar ID
* Job entry class
* parallelism
* all user defined parameters


  was:
From a declarative, reconciliation-based job management perspective, there is a 
need to identify the job jar id and all job parameters for a running job to 
determine whether the current job is up to date. 

I think this information needs to be available through the job manager rest 
call (/jobs/$id).




> Add jar id, and job parameters information to job status rest call
> --
>
> Key: FLINK-4335
> URL: https://issues.apache.org/jira/browse/FLINK-4335
> Project: Flink
>  Issue Type: Improvement
>  Components: Webfrontend
>Reporter: Zhenzhong Xu
>Priority: Minor
>
> From a declarative, reconciliation-based job management perspective, there is a 
> need to identify the job jar id and all job parameters for a running job to 
> determine whether the current job is up to date. 
> I think this information needs to be available through the job manager rest 
> call (/jobs/$id).
> * Jar ID
> * Job entry class
> * parallelism
> * all user defined parameters


