[jira] [Commented] (FLINK-3675) YARN ship folder inconsistent behavior

2016-06-27 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15351055#comment-15351055
 ] 

Stefano Baghino commented on FLINK-3675:


Of course, thanks for clarifying this for me. :)

> YARN ship folder inconsistent behavior
> -
>
> Key: FLINK-3675
> URL: https://issues.apache.org/jira/browse/FLINK-3675
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 1.1.0
>
>
> After [some discussion on the user mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-and-YARN-ship-folder-td5458.html]
>  it came up that the {{flink/lib}} folder is always supposed to be shipped to 
> the YARN cluster so that all the nodes have access to its contents.
> Currently, however, the Flink long-running YARN session actually ships the 
> folder because it's explicitly specified in the {{yarn-session.sh}} script, 
> while running a single job on YARN does not automatically ship it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3675) YARN ship folder inconsistent behavior

2016-06-27 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350992#comment-15350992
 ] 

Stefano Baghino commented on FLINK-3675:


I must have misread something: I thought that adding ship folders while also 
shipping {{lib}} was option 1. In any case, that was the behavior I was 
suggesting. :) Thanks for specifying that the Flink assembly is always shipped; 
I thought it had to be included in the {{lib}} folder for it to be found.

> YARN ship folder inconsistent behavior
> -
>
> Key: FLINK-3675
> URL: https://issues.apache.org/jira/browse/FLINK-3675
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 1.1.0
>
>
> After [some discussion on the user mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-and-YARN-ship-folder-td5458.html]
>  it came up that the {{flink/lib}} folder is always supposed to be shipped to 
> the YARN cluster so that all the nodes have access to its contents.
> Currently, however, the Flink long-running YARN session actually ships the 
> folder because it's explicitly specified in the {{yarn-session.sh}} script, 
> while running a single job on YARN does not automatically ship it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-3675) YARN ship folder inconsistent behavior

2016-06-27 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350970#comment-15350970
 ] 

Stefano Baghino edited comment on FLINK-3675 at 6/27/16 1:17 PM:
-

The {{lib}} folder includes important JARs, like the Flink assembly and the 
logging libraries. I would make the optional ship folders add to the default 
one, so that they can be used to ship additional libraries while the default 
ones are still shipped.


was (Author: stefanobaghino):
The `lib` folder includes important JARs, like the Flink assembly and the 
logging libraries, I would make optional folders to add up to the default one, 
so that this (or these) can be used to ship additional libraries while the 
default ones are shipped by default.

> YARN ship folder inconsistent behavior
> -
>
> Key: FLINK-3675
> URL: https://issues.apache.org/jira/browse/FLINK-3675
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 1.1.0
>
>
> After [some discussion on the user mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-and-YARN-ship-folder-td5458.html]
>  it came up that the {{flink/lib}} folder is always supposed to be shipped to 
> the YARN cluster so that all the nodes have access to its contents.
> Currently, however, the Flink long-running YARN session actually ships the 
> folder because it's explicitly specified in the {{yarn-session.sh}} script, 
> while running a single job on YARN does not automatically ship it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3675) YARN ship folder inconsistent behavior

2016-06-27 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350970#comment-15350970
 ] 

Stefano Baghino commented on FLINK-3675:


The `lib` folder includes important JARs, like the Flink assembly and the 
logging libraries. I would make the optional ship folders add to the default 
one, so that they can be used to ship additional libraries while the default 
ones are still shipped.

> YARN ship folder inconsistent behavior
> -
>
> Key: FLINK-3675
> URL: https://issues.apache.org/jira/browse/FLINK-3675
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 1.1.0
>
>
> After [some discussion on the user mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-and-YARN-ship-folder-td5458.html]
>  it came up that the {{flink/lib}} folder is always supposed to be shipped to 
> the YARN cluster so that all the nodes have access to its contents.
> Currently, however, the Flink long-running YARN session actually ships the 
> folder because it's explicitly specified in the {{yarn-session.sh}} script, 
> while running a single job on YARN does not automatically ship it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-05-24 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino updated FLINK-3239:
---
Assignee: Vijay Srinivasaraghavan  (was: Stefano Baghino)

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Vijay Srinivasaraghavan
> Attachments: flink3239-prototype.patch
>
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3699) Allow per-job Kerberos authentication

2016-05-24 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297875#comment-15297875
 ] 

Stefano Baghino commented on FLINK-3699:


Hi Eron, I concur with your opinion. Thanks for taking the time to organize 
the work needed to improve this aspect of Flink. Unfortunately I'm not able to 
work on this issue right now, so I'm switching it to unassigned. This issue can 
be used to track progress toward the overall goal while the finer-grained tasks 
you reported are being worked on.

> Allow per-job Kerberos authentication 
> --
>
> Key: FLINK-3699
> URL: https://issues.apache.org/jira/browse/FLINK-3699
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Scheduler, TaskManager, YARN Client
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>  Labels: kerberos, security, yarn
>
> Currently, authentication in a secure ("Kerberized") environment is performed 
> once as a standalone cluster or a YARN session is started up. This means that 
> jobs submitted will all be executed with the privileges of the user that 
> started up the cluster. This is reasonable in a lot of situations but 
> disallows a fine control over ACLs when Flink is involved.
> Adding a way for each job submission to be independently authenticated would 
> allow each job to run with the privileges of a specific user, enabling much 
> more granular control over ACLs, in particular in the context of existing 
> secure cluster setups.
> So far, a known workaround to this limitation (at least when running on YARN) 
> is to run a per-job cluster as a specific user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3699) Allow per-job Kerberos authentication

2016-05-24 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino updated FLINK-3699:
---
Assignee: (was: Stefano Baghino)

> Allow per-job Kerberos authentication 
> --
>
> Key: FLINK-3699
> URL: https://issues.apache.org/jira/browse/FLINK-3699
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Scheduler, TaskManager, YARN Client
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>  Labels: kerberos, security, yarn
>
> Currently, authentication in a secure ("Kerberized") environment is performed 
> once as a standalone cluster or a YARN session is started up. This means that 
> jobs submitted will all be executed with the privileges of the user that 
> started up the cluster. This is reasonable in a lot of situations but 
> disallows a fine control over ACLs when Flink is involved.
> Adding a way for each job submission to be independently authenticated would 
> allow each job to run with the privileges of a specific user, enabling much 
> more granular control over ACLs, in particular in the context of existing 
> secure cluster setups.
> So far, a known workaround to this limitation (at least when running on YARN) 
> is to run a per-job cluster as a specific user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3852) Use a StreamExecutionEnvironment in the quickstart job skeleton

2016-04-29 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263763#comment-15263763
 ] 

Stefano Baghino commented on FLINK-3852:


I agree, but would it make sense to have a quickstart for both the DataSet and 
the DataStream APIs instead of replacing the existing one completely?
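
For reference, a minimal sketch of what a streaming skeleton could look like (a 
hedged illustration, not the actual archetype code):

{code}
import org.apache.flink.streaming.api.scala._

object StreamingJob {
  def main(args: Array[String]): Unit = {
    // set up the streaming execution environment
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // a trivial placeholder pipeline; real jobs would add sources and sinks here
    env.fromElements(1, 2, 3).map(_ * 2).print()

    env.execute("Flink Streaming Job")
  }
}
{code}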

> Use a StreamExecutionEnvironment in the quickstart job skeleton
> ---
>
> Key: FLINK-3852
> URL: https://issues.apache.org/jira/browse/FLINK-3852
> Project: Flink
>  Issue Type: Task
>  Components: Quickstarts
>Reporter: Robert Metzger
>  Labels: starter
>
> The Job skeleton created by the maven archetype "quickstart" is still setting 
> up an ExecutionEnvironment, not a StreamExecutionEnvironment.
> These days, most users are using Flink for streaming, so we should reflect 
> that in the quickstart as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-04-29 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263664#comment-15263664
 ] 

Stefano Baghino commented on FLINK-3239:


Beware that in my prototype the {{krb.conf.path}} does not point to the keytab 
file location but to the Kerberos configuration file (the one that holds 
information about the realm(s) and so on). In a final solution that could of 
course be optional, as there is a series of default locations where that file 
can be looked up.
Regarding the keytab file, in my experiments I found it was needed at all 
levels (job, JobManager, TaskManager and connector) for the job to be compiled 
to Flink's DAG representation and then executed. To be fair, I'm not really 
sure why that is.
Of course I may be wrong, so feel free to experiment with my code yourself.

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Stefano Baghino
> Attachments: flink3239-prototype.patch
>
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-04-28 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262368#comment-15262368
 ] 

Stefano Baghino commented on FLINK-3239:


Thanks for reminding me [~vijikarthi], I've attached the patch.

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Stefano Baghino
> Attachments: flink3239-prototype.patch
>
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-04-28 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino updated FLINK-3239:
---
Attachment: flink3239-prototype.patch

This is the first rough prototype to make Kafka work on a secure cluster. It is 
based on the 1.0.x branch and works with YARN only. To use it, you have to set 
three new configuration keys; see the code for more information.

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Stefano Baghino
> Attachments: flink3239-prototype.patch
>
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-04-28 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262292#comment-15262292
 ] 

Stefano Baghino commented on FLINK-3239:


I've been able to come to a working prototype. The big issue is that, due to 
Kafka's lack of support for the Hadoop delegation token (tracked by 
[KAFKA-1696|https://issues.apache.org/jira/browse/KAFKA-1696]), it requires a 
kinit to be performed on each node of the cluster. I believe an optimal 
solution should be postponed until 
[KAFKA-1696|https://issues.apache.org/jira/browse/KAFKA-1696] is resolved.

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Stefano Baghino
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3788) Local variable values are not distributed to job runners

2016-04-19 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15247924#comment-15247924
 ] 

Stefano Baghino commented on FLINK-3788:


Yes, I experienced this very problem when trying Flink for the first time. I 
believe it depends on how the {{App}} trait is implemented in Scala 
([DelayedInit 
Scaladoc|http://www.scala-lang.org/api/current/#scala.DelayedInit]); perhaps 
this won't be an issue with Scala 2.12+.
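
To illustrate the suspected cause, a hedged sketch (assuming a local 
{{myTextFile.txt}}): fields of an object extending {{App}} are initialized 
through {{DelayedInit}}, so they can still be null when the closure is shipped, 
while a plain {{main}} method initializes the value first.

{code}
import org.apache.flink.api.scala._

case class IntWrapper(a1: Int)

// Pitfall: wrapped is initialized via the App trait's DelayedInit machinery
// and may still be null when the map closure is shipped to the cluster.
object BrokenJob extends App {
  val wrapped = IntWrapper(42)
  val env = ExecutionEnvironment.getExecutionEnvironment
  env.readTextFile("myTextFile.txt").map(line => wrapped.toString).print()
}

// Workaround: a plain main method initializes the local before the closure
// is serialized, so the captured value travels with the job.
object WorkingJob {
  def main(args: Array[String]): Unit = {
    val wrapped = IntWrapper(42)
    val env = ExecutionEnvironment.getExecutionEnvironment
    env.readTextFile("myTextFile.txt").map(line => wrapped.toString).print()
  }
}
{code}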

> Local variable values are not distributed to job runners
> 
>
> Key: FLINK-3788
> URL: https://issues.apache.org/jira/browse/FLINK-3788
> Project: Flink
>  Issue Type: Bug
>  Components: DataSet API
>Affects Versions: 1.0.0, 1.0.1
> Environment: Scala 2.11.8
> Sun JDK 1.8.0_65 or OpenJDK 1.8.0_77
> Fedora 25, 4.6.0-0.rc2.git3.1.fc25.x86_64
>Reporter: Andreas C. Osowski
> Attachments: FLINK-3788.tgz
>
>
> Variable values of non-elementary types aren't captured and distributed to job 
> runners, causing them to remain 'null' and to trigger NPEs upon access when 
> running on a cluster. Running locally through `flink-clients` works fine.
> Changing the parallelism or disabling the closure cleaner doesn't seem to have 
> any effect.
> Minimal example, also see the attached archive.
> {code:java}
> case class IntWrapper(a1: Int)
> val wrapped = IntWrapper(42)
> env.readTextFile("myTextFile.txt").map(line => wrapped.toString).collect
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (FLINK-3774) Flink configuration is not correctly forwarded to PlanExecutor in ScalaShellRemoteEnvironment

2016-04-18 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino updated FLINK-3774:
---
Comment: was deleted

(was: I've tested it on the 1.0 and the issue is there as well.)

> Flink configuration is not correctly forwarded to PlanExecutor in 
> ScalaShellRemoteEnvironment
> -
>
> Key: FLINK-3774
> URL: https://issues.apache.org/jira/browse/FLINK-3774
> Project: Flink
>  Issue Type: Bug
>  Components: Scala Shell
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Priority: Minor
> Fix For: 1.1.0
>
>
> Currently, the {{ScalaShellRemoteEnvironment}} does not correctly forward 
> the Flink configuration to the {{PlanExecutor}}. Therefore, it is not 
> possible to use the Scala shell in combination with an HA cluster, which 
> needs the configuration parameters set in the configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-3774) Flink configuration is not correctly forwarded to PlanExecutor in ScalaShellRemoteEnvironment

2016-04-18 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245310#comment-15245310
 ] 

Stefano Baghino edited comment on FLINK-3774 at 4/18/16 8:36 AM:
-

I've tested it on the 1.0 branch and the issue is there as well.


was (Author: stefanobaghino):
I've tested it on the 1.0 as well and the issue is there as well.

> Flink configuration is not correctly forwarded to PlanExecutor in 
> ScalaShellRemoteEnvironment
> -
>
> Key: FLINK-3774
> URL: https://issues.apache.org/jira/browse/FLINK-3774
> Project: Flink
>  Issue Type: Bug
>  Components: Scala Shell
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Priority: Minor
> Fix For: 1.1.0
>
>
> Currently, the {{ScalaShellRemoteEnvironment}} does not correctly forward 
> the Flink configuration to the {{PlanExecutor}}. Therefore, it is not 
> possible to use the Scala shell in combination with an HA cluster, which 
> needs the configuration parameters set in the configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3774) Flink configuration is not correctly forwarded to PlanExecutor in ScalaShellRemoteEnvironment

2016-04-18 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245310#comment-15245310
 ] 

Stefano Baghino commented on FLINK-3774:


I've tested it on the 1.0 branch and the issue is there as well.

> Flink configuration is not correctly forwarded to PlanExecutor in 
> ScalaShellRemoteEnvironment
> -
>
> Key: FLINK-3774
> URL: https://issues.apache.org/jira/browse/FLINK-3774
> Project: Flink
>  Issue Type: Bug
>  Components: Scala Shell
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Priority: Minor
> Fix For: 1.1.0
>
>
> Currently, the {{ScalaShellRemoteEnvironment}} does not correctly forward 
> the Flink configuration to the {{PlanExecutor}}. Therefore, it is not 
> possible to use the Scala shell in combination with an HA cluster, which 
> needs the configuration parameters set in the configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3729) Several SQL tests fail on Windows OS

2016-04-11 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235391#comment-15235391
 ] 

Stefano Baghino commented on FLINK-3729:


Perhaps the {{Table.explain(boolean)}} method could use 
{{System.lineSeparator()}} instead of hard-coded UNIX newlines ({{\n}}) when 
building the explain string ({{org.apache.flink.api.table.Table.scala:286}}). 
Would that work?
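
For illustration, a hedged sketch of the approach (a simplified stand-in, not 
the actual {{Table.scala}} code):

{code}
// Build the explain output with the platform-specific separator instead of
// hard-coding "\n", so expected and actual strings agree on any OS.
val sep = System.lineSeparator()

def explainString(ast: String, plan: String): String =
  Seq("== Abstract Syntax Tree ==", ast, "== Physical Execution Plan ==", plan)
    .mkString(sep)
{code}

Alternatively, the tests could normalize both sides (e.g. replacing {{\r\n}} 
with {{\n}}) before comparing.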

> Several SQL tests fail on Windows OS
> 
>
> Key: FLINK-3729
> URL: https://issues.apache.org/jira/browse/FLINK-3729
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 1.0.1
>Reporter: Chesnay Schepler
>
> The Table API SqlExplain(Test/ITCase) tests fail categorically on Windows due 
> to different line endings. These tests generate a string representation of an 
> abstract syntax tree; the problem is a difference in line endings: the 
> expected strings contain LF, the actual ones CRLF.
> The tests should be changed to either
> * include CRLF line endings in the expected string when run on Windows,
> * always use LF line endings regardless of OS, or
> * use a compare method that is aware of this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-3699) Allow per-job Kerberos authentication

2016-04-07 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino reassigned FLINK-3699:
--

Assignee: Stefano Baghino

> Allow per-job Kerberos authentication 
> --
>
> Key: FLINK-3699
> URL: https://issues.apache.org/jira/browse/FLINK-3699
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Scheduler, TaskManager, YARN Client
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Assignee: Stefano Baghino
>  Labels: kerberos, security, yarn
>
> Currently, authentication in a secure ("Kerberized") environment is performed 
> once as a standalone cluster or a YARN session is started up. This means that 
> jobs submitted will all be executed with the privileges of the user that 
> started up the cluster. This is reasonable in a lot of situations but 
> disallows a fine control over ACLs when Flink is involved.
> Adding a way for each job submission to be independently authenticated would 
> allow each job to run with the privileges of a specific user, enabling much 
> more granular control over ACLs, in particular in the context of existing 
> secure cluster setups.
> So far, a known workaround to this limitation (at least when running on YARN) 
> is to run a per-job cluster as a specific user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3699) Allow per-job Kerberos authentication

2016-04-05 Thread Stefano Baghino (JIRA)
Stefano Baghino created FLINK-3699:
--

 Summary: Allow per-job Kerberos authentication 
 Key: FLINK-3699
 URL: https://issues.apache.org/jira/browse/FLINK-3699
 Project: Flink
  Issue Type: Improvement
  Components: JobManager, Scheduler, TaskManager, YARN Client
Affects Versions: 1.0.0
Reporter: Stefano Baghino


Currently, authentication in a secure ("Kerberized") environment is performed 
once as a standalone cluster or a YARN session is started up. This means that 
jobs submitted will all be executed with the privileges of the user that 
started up the cluster. This is reasonable in a lot of situations but disallows 
a fine control over ACLs when Flink is involved.

Adding a way for each job submission to be independently authenticated would 
allow each job to run with the privileges of a specific user, enabling much 
more granular control over ACLs, in particular in the context of existing 
secure cluster setups.

So far, a known workaround to this limitation (at least when running on YARN) 
is to run a per-job cluster as a specific user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3678) Make Flink logs directory configurable

2016-03-29 Thread Stefano Baghino (JIRA)
Stefano Baghino created FLINK-3678:
--

 Summary: Make Flink logs directory configurable
 Key: FLINK-3678
 URL: https://issues.apache.org/jira/browse/FLINK-3678
 Project: Flink
  Issue Type: Improvement
  Components: Start-Stop Scripts
Affects Versions: 1.0.0
Reporter: Stefano Baghino
Assignee: Stefano Baghino
Priority: Minor
 Fix For: 1.0.1


Currently Flink logs are stored under {{$FLINK_HOME/log}} and the user cannot 
configure an alternative storage location. It would be nice to add a 
configuration key to {{flink-conf.yaml}} and edit the {{bin/flink}} launch 
script accordingly to use the configured value if present, or fall back to the 
current behavior if no value is provided.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3675) YARN ship folder inconsistent behavior

2016-03-29 Thread Stefano Baghino (JIRA)
Stefano Baghino created FLINK-3675:
--

 Summary: YARN ship folder inconsistent behavior
 Key: FLINK-3675
 URL: https://issues.apache.org/jira/browse/FLINK-3675
 Project: Flink
  Issue Type: Bug
  Components: YARN Client
Affects Versions: 1.0.0
Reporter: Stefano Baghino


After [some discussion on the user mailing 
list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-and-YARN-ship-folder-td5458.html]
 it came up that the {{flink/lib}} folder is always supposed to be shipped to 
the YARN cluster so that all the nodes have access to its contents.

Currently, however, the Flink long-running YARN session actually ships the 
folder because it's explicitly specified in the {{yarn-session.sh}} script, 
while running a single job on YARN does not automatically ship it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-3668) Potential null dereference in HadoopInputFormatBase#createInputSplits()

2016-03-25 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212095#comment-15212095
 ] 

Stefano Baghino edited comment on FLINK-3668 at 3/25/16 5:09 PM:
-

Maybe I don't have a clear picture, but judging from the snippet you posted it 
looks like when the {{InterruptedException}} is caught an {{IOException}} is 
thrown, so the following line wouldn't be accessible at that point, right?


was (Author: stefanobaghino):
Maybe I don't have a clear picture, but judging from the snippet you posted it 
looks like that when the {{InterruptedException}} is caught an {{IOException}} 
is thrown, so the following line wouldn't be accessible at that point, right?

> Potential null dereference in HadoopInputFormatBase#createInputSplits()
> -
>
> Key: FLINK-3668
> URL: https://issues.apache.org/jira/browse/FLINK-3668
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> Here is the code:
> {code}
> List<InputSplit> splits;
> try {
>   splits = this.mapreduceInputFormat.getSplits(jobContext);
> } catch (InterruptedException e) {
>   throw new IOException("Could not get Splits.", e);
> }
> HadoopInputSplit[] hadoopInputSplits = new HadoopInputSplit[splits.size()];
> {code}
> If InterruptedException is caught, splits would be null.
> Yet, the next line accesses splits.size() without a null check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3668) Potential null dereference in HadoopInputFormatBase#createInputSplits()

2016-03-25 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15212095#comment-15212095
 ] 

Stefano Baghino commented on FLINK-3668:


Maybe I don't have a clear picture, but judging from the snippet you posted it 
looks like when the {{InterruptedException}} is caught an {{IOException}} 
is thrown, so the following line wouldn't be accessible at that point, right?

> Potential null dereference in HadoopInputFormatBase#createInputSplits()
> -
>
> Key: FLINK-3668
> URL: https://issues.apache.org/jira/browse/FLINK-3668
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> Here is the code:
> {code}
> List<InputSplit> splits;
> try {
>   splits = this.mapreduceInputFormat.getSplits(jobContext);
> } catch (InterruptedException e) {
>   throw new IOException("Could not get Splits.", e);
> }
> HadoopInputSplit[] hadoopInputSplits = new HadoopInputSplit[splits.size()];
> {code}
> If InterruptedException is caught, splits would be null.
> Yet, the next line accesses splits.size() without a null check.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3653) recovery.zookeeper.storageDir is not documented on the configuration page

2016-03-22 Thread Stefano Baghino (JIRA)
Stefano Baghino created FLINK-3653:
--

 Summary: recovery.zookeeper.storageDir is not documented on the 
configuration page
 Key: FLINK-3653
 URL: https://issues.apache.org/jira/browse/FLINK-3653
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.0.0
Reporter: Stefano Baghino
Assignee: Stefano Baghino
Priority: Minor
 Fix For: 1.1.0


The {{recovery.zookeeper.storageDir}} option is documented in the HA page but 
is missing from the configuration page. Since it's required for HA, I think it 
would be a good idea to have it on both pages.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-03-19 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino reassigned FLINK-3239:
--

Assignee: Stefano Baghino

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Stefano Baghino
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-03-18 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199162#comment-15199162
 ] 

Stefano Baghino commented on FLINK-3239:


If no one is working on this, I'd like to assign it to myself.

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3518) Stale docs for quickstart setup

2016-02-26 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168681#comment-15168681
 ] 

Stefano Baghino commented on FLINK-3518:


Hi [~rmetzger], thanks for pointing it out, I didn't know. Should I keep the 
explicit stop or throw away my PR?

> Stale docs for quickstart setup
> ---
>
> Key: FLINK-3518
> URL: https://issues.apache.org/jira/browse/FLINK-3518
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Assignee: Stefano Baghino
>Priority: Trivial
> Fix For: 1.0.0
>
>
> As reported on [this post on the mailing 
> list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/suggestion-for-Quickstart-td5168.html]
>  the quickstart documentation page could use some fixes:
> * download link (wrong version) should be updated
> * usage line should be updated (due to changes in 
> [FLINK-2021|https://issues.apache.org/jira/browse/FLINK-2021])
> * the page should explicitly state to stop the local environment



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3518) Stale docs for quickstart setup

2016-02-26 Thread Stefano Baghino (JIRA)
Stefano Baghino created FLINK-3518:
--

 Summary: Stale docs for quickstart setup
 Key: FLINK-3518
 URL: https://issues.apache.org/jira/browse/FLINK-3518
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.0.0
Reporter: Stefano Baghino
Assignee: Stefano Baghino
Priority: Trivial
 Fix For: 1.0.0


As reported on [this post on the mailing 
list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/suggestion-for-Quickstart-td5168.html]
 the quickstart documentation page could use some fixes:

* download link (wrong version) should be updated
* usage line should be updated (due to changes in 
[FLINK-2021|https://issues.apache.org/jira/browse/FLINK-2021])
* the page should explicitly state to stop the local environment



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3438) ExternalProcessRunner fails to detect ClassNotFound exception because of locale settings

2016-02-23 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158844#comment-15158844
 ] 

Stefano Baghino commented on FLINK-3438:


I'm starting to think it would be better to search the code for all the calls 
to {{contains}}. :D
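
For illustration, a hedged sketch of one workaround direction (not the actual 
{{ExternalProcessRunner}} code): pin the locale of the spawned JVM with the 
same system properties the reproduction step uses, so its error output stays 
in English regardless of the host locale.

{code}
// Launch the external JVM with a fixed locale so error messages are stable.
val command = Seq(
  "java",
  "-Duser.language=en", "-Duser.country=US", // force English messages
  "-cp", System.getProperty("java.class.path"),
  "non.existing.MainClass" // stand-in main class for the sketch
)
val process = new ProcessBuilder(command: _*).redirectErrorStream(true).start()
val exitCode = process.waitFor()
{code}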

> ExternalProcessRunner fails to detect ClassNotFound exception because of 
> locale settings
> 
>
> Key: FLINK-3438
> URL: https://issues.apache.org/jira/browse/FLINK-3438
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Priority: Minor
> Fix For: 1.0.0
>
>
> ExternalProcessRunner tries to detect a ClassNotFoundException in the run 
> process by comparing its output with a fixed string of text; this means that 
> localized text reporting said exception is not interpreted as such.
> To reproduce:
> * test the `ExternalProcessRunnerTest.testClassNotFound` setting the 
> following system properties on the JVM: {{-Duser.country=IT 
> -Duser.language=it}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-3438) ExternalProcessRunner fails to detect ClassNotFound exception because of locale settings

2016-02-17 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino updated FLINK-3438:
---
Description: 
ExternalProcessRunner tries to detect a ClassNotFoundException in the run 
process by comparing its output with a fixed string of text; this means that 
localized text reporting said exception is not interpreted as such.

To reproduce:
* test the `ExternalProcessRunnerTest.testClassNotFound` setting the following 
system properties on the JVM: {{-Duser.country=IT -Duser.language=it}}

  was:
ExternalProcessRunner tries to detect a ClassNotFoundException in the run 
process by comparing its output with a fixed string of test; this means that 
localized text reporting said exception is not interpreted as such.

To reproduce:
* test the `ExternalProcessRunnerTest.testClassNotFound` setting the following 
system properties on the JVM: `-Duser.country=IT -Duser.language=it`


> ExternalProcessRunner fails to detect ClassNotFound exception because of 
> locale settings
> 
>
> Key: FLINK-3438
> URL: https://issues.apache.org/jira/browse/FLINK-3438
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Priority: Minor
> Fix For: 1.0.0
>
>
> ExternalProcessRunner tries to detect a ClassNotFoundException in the run 
> process by comparing its output with a fixed string of test; this means that 
> localized text reporting said exception is not interpreted as such.
> To reproduce:
> * test the `ExternalProcessRunnerTest.testClassNotFound` setting the 
> following system properties on the JVM: {{-Duser.country=IT 
> -Duser.language=it}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3438) ExternalProcessRunner fails to detect ClassNotFound exception because of locale settings

2016-02-17 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150782#comment-15150782
 ] 

Stefano Baghino commented on FLINK-3438:


This is my temporary fix to make the tests work: 
[https://github.com/radicalbit/flink/commit/23183d27de3368342bfe570bc37513dd4bbf5424]

If this solution is acceptable (but honestly I doubt it), I can open a PR.

> ExternalProcessRunner fails to detect ClassNotFound exception because of 
> locale settings
> 
>
> Key: FLINK-3438
> URL: https://issues.apache.org/jira/browse/FLINK-3438
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Priority: Minor
> Fix For: 1.0.0
>
>
> ExternalProcessRunner tries to detect a ClassNotFoundException in the run 
> process by comparing its output with a fixed string of text; this means that 
> localized text reporting said exception is not interpreted as such.
> To reproduce:
> * test the `ExternalProcessRunnerTest.testClassNotFound` setting the 
> following system properties on the JVM: `-Duser.country=IT -Duser.language=it`



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3438) ExternalProcessRunner fails to detect ClassNotFound exception because of locale settings

2016-02-17 Thread Stefano Baghino (JIRA)
Stefano Baghino created FLINK-3438:
--

 Summary: ExternalProcessRunner fails to detect ClassNotFound 
exception because of locale settings
 Key: FLINK-3438
 URL: https://issues.apache.org/jira/browse/FLINK-3438
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Stefano Baghino
Priority: Minor
 Fix For: 1.0.0


ExternalProcessRunner tries to detect a ClassNotFoundException in the run 
process by comparing its output with a fixed string of text; this means that 
localized text reporting said exception is not interpreted as such.

To reproduce:
* test the `ExternalProcessRunnerTest.testClassNotFound` setting the following 
system properties on the JVM: `-Duser.country=IT -Duser.language=it`



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1159) Case style anonymous functions not supported by Scala API

2016-02-17 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150174#comment-15150174
 ] 

Stefano Baghino commented on FLINK-1159:


If that's ok with [~aljoscha], I would assign this issue to myself, as I've 
started working on a solution 
([here|https://github.com/radicalbit/flink/commits/1159-implicit]) that 
resulted from a mailing list discussion with [~till.rohrmann] and 
[~StephanEwen] 
([here|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Case-style-anonymous-functions-not-supported-by-Scala-API-td10052.html]).
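
A hedged sketch of the implicit-extension direction explored there (names and 
signatures are assumptions, not the actual prototype); {{Bar}} and {{foobar}} 
refer to the example in the issue description below:

{code}
import scala.reflect.ClassTag
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.api.scala._

object CaseStyleExtensions {
  // mapWith accepts a function literal written with case syntax, without
  // forcing the caller to name the parameter.
  implicit class OnDataSet[A](ds: DataSet[A]) {
    def mapWith[B: TypeInformation: ClassTag](f: A => B): DataSet[B] = ds.map(f)
  }
}

// Usage:
// import CaseStyleExtensions._
// dataset.mapWith { case foo: Bar => foobar(foo) }
{code}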

> Case style anonymous functions not supported by Scala API
> -
>
> Key: FLINK-1159
> URL: https://issues.apache.org/jira/browse/FLINK-1159
> Project: Flink
>  Issue Type: Bug
>  Components: Scala API
>Reporter: Till Rohrmann
>Assignee: Aljoscha Krettek
>
> In Scala it is very common to define anonymous functions of the following form
> {code}
> {
>   case foo: Bar => foobar(foo)
>   case _ => throw new RuntimeException()
> }
> {code}
> These case style anonymous functions are not supported yet by the Scala API. 
> Thus, one has to write redundant code to name the function parameter.
> What works is the following pattern, but it is not intuitive for someone 
> coming from Scala:
> {code}
> dataset.map {
>   _ match {
>     case foo: Bar => ...
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3412) Remove implicit conversions JavaStream / ScalaStream

2016-02-16 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148498#comment-15148498
 ] 

Stefano Baghino commented on FLINK-3412:


I like the approach suggested by [~till.rohrmann], I've made a prototype here: 
[https://github.com/radicalbit/flink/commit/bff5870d5578de9d1aaffc648cacffac79da81a3]
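
For context, a hedged sketch of the batch-style explicit conversion the 
prototype moves toward (simplified, not the actual commit):

{code}
import org.apache.flink.streaming.api.datastream.{DataStream => JavaStream}
import org.apache.flink.streaming.api.scala.DataStream

object Conversions {
  // An explicit wrap() makes every Java-to-Scala conversion visible at the
  // call site, so no hidden implicit can introduce accidental recursion.
  def wrap[T](javaStream: JavaStream[T]): DataStream[T] =
    new DataStream[T](javaStream)
}
{code}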

> Remove implicit conversions JavaStream / ScalaStream
> 
>
> Key: FLINK-3412
> URL: https://issues.apache.org/jira/browse/FLINK-3412
> Project: Flink
>  Issue Type: Bug
>  Components: Scala API
>Affects Versions: 0.10.2
>Reporter: Stephan Ewen
> Fix For: 1.0.0
>
>
> I think the implicit conversions between the Java DataStream and the Scala 
> DataStream are dangerous.
> Because conversions exist in both directions, it is possible to write methods 
> that look like calling functions on the JavaStream, but instead convert it to 
> a Scala stream and call a different method.
> I just accidentally implemented an infinite recursion that way (via two 
> hidden implicit conversions).
> Making the conversions explicit (with a {{wrap()}} function like in the batch 
> API), we add minimally more code internally (nothing is different for users), 
> but avoid such accidental errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3412) Remove implicit conversions JavaStream / ScalaStream

2016-02-16 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148417#comment-15148417
 ] 

Stefano Baghino commented on FLINK-3412:


This one looks easy, I can take care of it immediately. Shall I assign it to me 
[~StephanEwen]?

> Remove implicit conversions JavaStream / ScalaStream
> 
>
> Key: FLINK-3412
> URL: https://issues.apache.org/jira/browse/FLINK-3412
> Project: Flink
>  Issue Type: Bug
>  Components: Scala API
>Affects Versions: 0.10.2
>Reporter: Stephan Ewen
> Fix For: 1.0.0
>
>
> I think the implicit conversions between the Java DataStream and the Scala 
> DataStream are dangerous.
> Because conversions exist in both directions, it is possible to write methods 
> that look like calling functions on the JavaStream, but instead convert it to 
> a Scala stream and call a different method.
> I just accidentally implemented an infinite recursion that way (via two 
> hidden implicit conversions).
> Making the conversions explicit (with a {{wrap()}} function like in the batch 
> API), we add minimally more code internally (nothing is different for users), 
> but avoid such accidental errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-2609) Automatic type registration is only called from the batch execution environment

2016-02-11 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142610#comment-15142610
 ] 

Stefano Baghino edited comment on FLINK-2609 at 2/11/16 11:28 AM:
--

[~rmetzger] can you elaborate on this issue in particular? I think I can work 
on it but I'm not sure of what you mean exactly. I'm starting to have a better 
understanding of Flink internals but I'm not very familiar with Kryo in 
particular (apart from knowing it's a serialization framework used by Flink to 
serialize generic types). Feel free to point me to any relevant documentation. 
Thank you in advance.


was (Author: stefanobaghino):
[~rmetzger] can you elaborate on this issue in particular? I think I can work 
on it but I'm not sure of what you mean exactly. I'm starting to have a better 
understanding of Flink interals but I'm not very familiar with Kryo in 
particular (apart from knowing it's a serialization framework used by Flink to 
serialize generic types). Feel free to point me to any relevant documentation. 
Thank you in advance.

> Automatic type registration is only called from the batch execution 
> environment
> ---
>
> Key: FLINK-2609
> URL: https://issues.apache.org/jira/browse/FLINK-2609
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.10.0
>Reporter: Robert Metzger
>
> Kryo types in the streaming API are quite expensive to serialize because they 
> are not automatically registered with Kryo.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2609) Automatic type registration is only called from the batch execution environment

2016-02-11 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142610#comment-15142610
 ] 

Stefano Baghino commented on FLINK-2609:


[~rmetzger] can you elaborate on this issue in particular? I think I can work 
on it but I'm not sure of what you mean exactly. I'm starting to have a better 
understanding of Flink internals but I'm not very familiar with Kryo in 
particular (apart from knowing it's a serialization framework used by Flink to 
serialize generic types). Feel free to point me to any relevant documentation. 
Thank you in advance.
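
In the meantime, a hedged sketch of what manual registration looks like on the 
streaming side (the type name is a stand-in):

{code}
import org.apache.flink.streaming.api.scala._

// A generic type that falls back to Kryo serialization.
class MyEvent(val id: Int, val payload: java.util.Map[String, String])

val env = StreamExecutionEnvironment.getExecutionEnvironment
// Registering the class up front gives it a compact Kryo id instead of
// writing the full class name with every record.
env.getConfig.registerKryoType(classOf[MyEvent])
{code}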

> Automatic type registration is only called from the batch execution 
> environment
> ---
>
> Key: FLINK-2609
> URL: https://issues.apache.org/jira/browse/FLINK-2609
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 0.10.0
>Reporter: Robert Metzger
>
> Kryo types in the streaming API are quite expensive to serialize because they 
> are not automatically registered with Kryo.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-2021) Rework examples to use ParameterTool

2016-02-06 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino reassigned FLINK-2021:
--

Assignee: Stefano Baghino

> Rework examples to use ParameterTool
> 
>
> Key: FLINK-2021
> URL: https://issues.apache.org/jira/browse/FLINK-2021
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Stefano Baghino
>Priority: Minor
>  Labels: starter
>
> In FLINK-1525, we introduced the {{ParameterTool}}.
> We should port the examples to use the tool.
> The examples could look like this (we should maybe discuss it first on the 
> mailing lists):
> {code}
> public static void main(String[] args) throws Exception {
>     ParameterTool pt = ParameterTool.fromArgs(args);
>     boolean fileOutput = pt.getNumberOfParameters() == 2;
>     String textPath = null;
>     String outputPath = null;
>     if (fileOutput) {
>         textPath = pt.getRequired("input");
>         outputPath = pt.getRequired("output");
>     }
>     // set up the execution environment
>     final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     env.getConfig().setUserConfig(pt);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3353) CSV-related tests may fail depending on locale

2016-02-06 Thread Stefano Baghino (JIRA)
Stefano Baghino created FLINK-3353:
--

 Summary: CSV-related tests may fail depending on locale
 Key: FLINK-3353
 URL: https://issues.apache.org/jira/browse/FLINK-3353
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.0.0
Reporter: Stefano Baghino
Assignee: Stefano Baghino
Priority: Trivial
 Fix For: 1.0.0


While running some tests, three suites ({{KMeansWithBroadcastSetITCase.java}}, 
{{ScalaCsvReaderWithPOJOITCase.scala}} and {{CsvReaderITCase.java}}) kept 
failing locally because the expected results (string literals) were matched 
against objects rendered as strings with {{String.format}}, a method whose 
result depends on the default locale. As my locale (Italian) renders doubles 
with a comma instead of a dot as the decimal separator, the rendered doubles 
diverged from the expected ones, making my tests fail despite the results 
actually being correct.

As the expected result is hard-coded, it makes sense to explicitly use the US 
locale to render those objects. I'll open a PR with my solution ASAP.
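
A small sketch of the failure mode and the fix:

{code}
import java.util.Locale

val x = 3.14
// Locale-dependent: under an Italian default locale this yields "3,14",
// which no longer matches a hard-coded expected string like "3.14".
println(String.format("%.2f", Double.box(x)))

// Locale-pinned: always yields "3.14", regardless of the default locale.
println(String.format(Locale.US, "%.2f", Double.box(x)))
{code}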



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-3289) Double reference to flink-contrib

2016-01-26 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino reassigned FLINK-3289:
--

Assignee: Stefano Baghino

> Double reference to flink-contrib
> -
>
> Key: FLINK-3289
> URL: https://issues.apache.org/jira/browse/FLINK-3289
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Assignee: Stefano Baghino
>Priority: Trivial
> Fix For: 1.0.0
>
>
> A [commit 
> |https://github.com/apache/flink/commit/f94112fbbaaf2ecc6a9ecb314a5565203ce779a7#diff-b57e0c6ee4a76b887f0f6a00398aa33dL78]
>  to solve FLINK-1452 introduced the {{flink-contrib}} sub-project in the 
> documentation. This other 
> [commit|https://github.com/apache/flink/commit/e9bf13d8626099a1d6ddb6ebe98c50be848fe79e#diff-6ac553dce1ab24b343bc66cc6b5d80bfR100]
>  to solve FLINK-1712 duplicated the {{flink-contrib}} line to specify it as a 
> container of early-stage projects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3289) Double reference to flink-contrib

2016-01-25 Thread Stefano Baghino (JIRA)
Stefano Baghino created FLINK-3289:
--

 Summary: Double reference to flink-contrib
 Key: FLINK-3289
 URL: https://issues.apache.org/jira/browse/FLINK-3289
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.0.0
Reporter: Stefano Baghino
Priority: Trivial
 Fix For: 1.0.0


A [commit 
|https://github.com/apache/flink/commit/f94112fbbaaf2ecc6a9ecb314a5565203ce779a7#diff-b57e0c6ee4a76b887f0f6a00398aa33dL78]
 to solve FLINK-1452 introduced the {{flink-contrib}} sub-project in the 
documentation. This other 
[commit|https://github.com/apache/flink/commit/e9bf13d8626099a1d6ddb6ebe98c50be848fe79e#diff-6ac553dce1ab24b343bc66cc6b5d80bfR100]
 to solve FLINK-1712 duplicated the {{flink-contrib}} line to specify it as a 
container of early-stage projects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3289) Double reference to flink-contrib

2016-01-25 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115627#comment-15115627
 ] 

Stefano Baghino commented on FLINK-3289:


Since the last line is very recent, my understanding is that {{flink-contrib}} 
contains both third-party add-ons and early-stage projects, so I would merge 
the two lines. I can quickly take care of this if that's alright.

> Double reference to flink-contrib
> -
>
> Key: FLINK-3289
> URL: https://issues.apache.org/jira/browse/FLINK-3289
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>Priority: Trivial
> Fix For: 1.0.0
>
>
> A [commit 
> |https://github.com/apache/flink/commit/f94112fbbaaf2ecc6a9ecb314a5565203ce779a7#diff-b57e0c6ee4a76b887f0f6a00398aa33dL78]
>  to solve FLINK-1452 introduced the {{flink-contrib}} sub-project in the 
> documentation. This other 
> [commit|https://github.com/apache/flink/commit/e9bf13d8626099a1d6ddb6ebe98c50be848fe79e#diff-6ac553dce1ab24b343bc66cc6b5d80bfR100]
>  to solve FLINK-1712 duplicated the {{flink-contrib}} line to specify it as a 
> container of early-stage projects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2021) Rework examples to use new ParameterTool

2016-01-20 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108268#comment-15108268
 ] 

Stefano Baghino commented on FLINK-2021:


I've noticed that the debate took place on the mailing list with a positive 
response, but no work has been done on this issue so far. Would it be ok if I 
took care of it?

> Rework examples to use new ParameterTool
> 
>
> Key: FLINK-2021
> URL: https://issues.apache.org/jira/browse/FLINK-2021
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Priority: Minor
>  Labels: starter
>
> In FLINK-1525, we introduced the {{ParameterTool}}.
> We should port the examples to use the tool.
> The examples could look like this (we should maybe discuss it first on the 
> mailing lists):
> {code}
> public static void main(String[] args) throws Exception {
>     ParameterTool pt = ParameterTool.fromArgs(args);
>     boolean fileOutput = pt.getNumberOfParameters() == 2;
>     String textPath = null;
>     String outputPath = null;
>     if (fileOutput) {
>         textPath = pt.getRequired("input");
>         outputPath = pt.getRequired("output");
>     }
>     // set up the execution environment
>     final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     env.getConfig().setUserConfig(pt);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2021) Rework examples to use ParameterTool

2016-01-20 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108537#comment-15108537
 ] 

Stefano Baghino commented on FLINK-2021:


Sounds fine, thanks for the tips, I'll start working on it.

> Rework examples to use ParameterTool
> 
>
> Key: FLINK-2021
> URL: https://issues.apache.org/jira/browse/FLINK-2021
> Project: Flink
>  Issue Type: Improvement
>  Components: Examples
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Priority: Minor
>  Labels: starter
>
> In FLINK-1525, we introduced the {{ParameterTool}}.
> We should port the examples to use the tool.
> The examples could look like this (we should maybe discuss it first on the 
> mailing lists):
> {code}
> public static void main(String[] args) throws Exception {
>     ParameterTool pt = ParameterTool.fromArgs(args);
>     boolean fileOutput = pt.getNumberOfParameters() == 2;
>     String textPath = null;
>     String outputPath = null;
>     if (fileOutput) {
>         textPath = pt.getRequired("input");
>         outputPath = pt.getRequired("output");
>     }
>     // set up the execution environment
>     final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
>     env.getConfig().setUserConfig(pt);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-992) Create CollectionDataSets by reading (client) local files.

2016-01-18 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15104929#comment-15104929
 ] 

Stefano Baghino commented on FLINK-992:
---

Sure, thanks for the reply. I just wanted to check whether I could be of help 
with this issue.

> Create CollectionDataSets by reading (client) local files.
> --
>
> Key: FLINK-992
> URL: https://issues.apache.org/jira/browse/FLINK-992
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API, Python API
>Reporter: Fabian Hueske
>Assignee: niraj rai
>Priority: Minor
>  Labels: starter
>
> {{CollectionDataSets}} are a nice way to feed data into programs.
> We could add support to read a client-local file at program construction time 
> using a FileInputFormat, put its data into a CollectionDataSet, and ship that 
> data together with the program.
> This would remove the need to upload small files into DFS when they are used 
> together with some large input (stored in DFS).
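
As a rough illustration of what this feature would save users from doing by 
hand, here is a hedged sketch of the manual equivalent (the file name is made 
up): read the small file on the client and turn it into a collection-backed 
DataSet, whose contents ship with the program rather than being read from a DFS.

{code}
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class ClientLocalSideInput {

    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Read the file on the client at program construction time; its
        // contents travel with the serialized program, so no DFS upload.
        List<String> lines = Files.readAllLines(Paths.get("side-input.txt"));
        DataSet<String> sideInput = env.fromCollection(lines);

        sideInput.print();
    }
}
{code}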



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2380) Allow to configure default FS for file inputs

2016-01-15 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15101879#comment-15101879
 ] 

Stefano Baghino commented on FLINK-2380:


If this is still of interest to the community, I'd like to work on it.

> Allow to configure default FS for file inputs
> -
>
> Key: FLINK-2380
> URL: https://issues.apache.org/jira/browse/FLINK-2380
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager
>Affects Versions: 0.9, 0.10.0
>Reporter: Ufuk Celebi
>Priority: Minor
>  Labels: starter
> Fix For: 1.0.0
>
>
> File inputs use "file://" as default prefix. A user asked to make this 
> configurable, e.g. "hdfs://" as default.
> (I'm not sure whether this is already possible or not. I will check and 
> either close or implement this for the user.)
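
The client-side mechanics of such a setting would be straightforward: qualify 
only scheme-less paths with the configured default. A hedged sketch (the 
configuration key name and the namenode address are assumptions):

{code}
import java.net.URI;

public class DefaultSchemeSketch {

    public static void main(String[] args) {
        // Would be read from flink-conf.yaml, e.g. via a key like "fs.default-scheme".
        final URI defaultFs = URI.create("hdfs://namenode:9000/");

        // Only qualify paths without a scheme of their own; explicit
        // "file://" or "hdfs://" paths stay untouched.
        final URI input = URI.create("/data/input.txt");
        final URI qualified = input.getScheme() == null
                ? defaultFs.resolve(input.getPath().replaceFirst("^/", ""))
                : input;

        System.out.println(qualified); // hdfs://namenode:9000/data/input.txt
    }
}
{code}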



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3010) Add link to powered-by wiki page to project website

2016-01-15 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15101842#comment-15101842
 ] 

Stefano Baghino commented on FLINK-3010:


Looking at the site, I'd say this issue can be closed.

> Add link to powered-by wiki page to project website
> ---
>
> Key: FLINK-3010
> URL: https://issues.apache.org/jira/browse/FLINK-3010
> Project: Flink
>  Issue Type: Improvement
>  Components: Project Website
>Reporter: Fabian Hueske
>Priority: Minor
>  Labels: starter
>
> We recently started a powered-by-Flink page in the Flink wiki:
> https://cwiki.apache.org/confluence/display/FLINK/Powered+by+Flink
> We should link to that page from the project website to give it some exposure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-2380) Allow to configure default FS for file inputs

2016-01-15 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15101879#comment-15101879
 ] 

Stefano Baghino edited comment on FLINK-2380 at 1/15/16 3:06 PM:
-

Is this still of interest to the community? If so, I'd like to work on it.


was (Author: stefanobaghino):
If this is still of interest to the community, I'd like to work on it.

> Allow to configure default FS for file inputs
> -
>
> Key: FLINK-2380
> URL: https://issues.apache.org/jira/browse/FLINK-2380
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager
>Affects Versions: 0.9, 0.10.0
>Reporter: Ufuk Celebi
>Priority: Minor
>  Labels: starter
> Fix For: 1.0.0
>
>
> File inputs use "file://" as default prefix. A user asked to make this 
> configurable, e.g. "hdfs://" as default.
> (I'm not sure whether this is already possible or not. I will check and 
> either close or implement this for the user.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2380) Allow to configure default FS for file inputs

2016-01-15 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15101935#comment-15101935
 ] 

Stefano Baghino commented on FLINK-2380:


Sure, please go ahead. :)

> Allow to configure default FS for file inputs
> -
>
> Key: FLINK-2380
> URL: https://issues.apache.org/jira/browse/FLINK-2380
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager
>Affects Versions: 0.9, 0.10.0
>Reporter: Ufuk Celebi
>Priority: Minor
>  Labels: starter
> Fix For: 1.0.0
>
>
> File inputs use "file://" as default prefix. A user asked to make this 
> configurable, e.g. "hdfs://" as default.
> (I'm not sure whether this is already possible or not. I will check and 
> either close or implement this for the user.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-992) Create CollectionDataSets by reading (client) local files.

2016-01-15 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15102061#comment-15102061
 ] 

Stefano Baghino commented on FLINK-992:
---

What is the status of this issue?

> Create CollectionDataSets by reading (client) local files.
> --
>
> Key: FLINK-992
> URL: https://issues.apache.org/jira/browse/FLINK-992
> Project: Flink
>  Issue Type: New Feature
>  Components: DataSet API, Python API
>Reporter: Fabian Hueske
>Assignee: niraj rai
>Priority: Minor
>  Labels: starter
>
> {{CollectionDataSets}} are a nice way to feed data into programs.
> We could add support to read a client-local file at program construction time 
> using a FileInputFormat, put its data into a CollectionDataSet, and ship that 
> data together with the program.
> This would remove the need to upload small files into DFS when they are used 
> together with some large input (stored in DFS).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1947) Make Avro and Tachyon test logging less verbose

2016-01-15 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15102000#comment-15102000
 ] 

Stefano Baghino commented on FLINK-1947:


I've just run {{mvn test}} on {{flink-batch-connectors}}. It looks like 
{{AvroExternalJarProgramITCase}} doesn't emit any output on stdout, even 
though logging has not been disabled.

{quote}
---
 T E S T S
---
Running org.apache.flink.api.avro.AvroExternalJarProgramITCase
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.88 sec - in 
org.apache.flink.api.avro.AvroExternalJarProgramITCase

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
{quote}

Is this issue still valid?

> Make Avro and Tachyon test logging less verbose
> ---
>
> Key: FLINK-1947
> URL: https://issues.apache.org/jira/browse/FLINK-1947
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Priority: Minor
>  Labels: starter
>
> Currently, the {{AvroExternalJarProgramITCase}} and the Tachyon test cases 
> write the cluster status messages to stdout. I think these messages are not 
> needed and only clutter the test output. Therefore, we should maybe suppress 
> these messages.
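
One low-effort way to suppress such messages, should any remain, is to raise 
the log level for the runtime packages in the test logging setup. A hedged 
sketch using the log4j 1.x API in use at the time (the package to silence is 
an assumption; the same effect is usually achieved via an entry in 
{{log4j-test.properties}}):

{code}
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public final class QuietTestLogging {

    private QuietTestLogging() {}

    // Call from a @BeforeClass hook to keep cluster status messages
    // out of the test output.
    public static void silenceClusterStatusMessages() {
        Logger.getLogger("org.apache.flink.runtime").setLevel(Level.OFF);
    }
}
{code}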



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)