Question about type cast in ExpressionReducer's reduce procedure

2017-06-28 Thread 郭健
Hi all,
I am implementing a STR_TO_DATE scalar SQL function for Flink, and
found that the return type is cast from java.sql.Date to Integer in Flink’s
ExpressionReducer:
https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/codegen/ExpressionReducer.scala#L56

    // we need to cast here for RexBuilder.makeLiteral
    case (SqlTypeName.DATE, e) =>
      Some(
        rexBuilder.makeCast(typeFactory.createTypeFromTypeInfo(BasicTypeInfo.INT_TYPE_INFO), e)
      )

so str_to_date('01,5,2013','%d,%m,%Y') must return an Integer,
which conflicts with my implementation.

My question is: why do we do this? The comment gives the reason as
“we need to cast here for RexBuilder.makeLiteral”,
but is it reasonable to change the user function’s return type? Should we restore
the original return type after the reduce?


Thanks,
Aegeaner




Re: Switch to Scala 2.11 as a default build profile

2017-06-28 Thread Bowen Li
+1.

The AWS EMR ecosystem uses Scala 2.11 and breaks with Scala 2.10. We had
to build several Flink components (e.g. flink-kinesis-connector) ourselves
in order to run on EMR. Defaulting to Scala 2.11 will greatly reduce the
adoption cost for Flink on EMR.


On Wed, Jun 28, 2017 at 9:34 AM, Till Rohrmann  wrote:

> I'm +1 for changing the profile and to start a discussion to drop Scala
> 2.10.
>
> Scala 2.10 is already quite old and the current stable version is 2.12. I
> would be surprised to see many people still using Scala 2.10.
>
> Cheers,
> Till
>
> On Wed, Jun 28, 2017 at 4:53 PM, Ted Yu  wrote:
>
> > Here is the KIP that drops support for Scala 2.10 in Kafka 0.11 :
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 119%3A+Drop+Support+for+Scala+2.10+in+Kafka+0.11
> >
> > FYI
> >
> > On Wed, Jun 28, 2017 at 7:23 AM, Piotr Nowojski  >
> > wrote:
> >
> > > Yes, I know and I’m proposing to change this in parent pom by default
> to
> > > scala-2.11.
> > >
> > > Changing parent pom every time anyone wants to touch/build in Intellij
> > > Kafka 0.11 connector is not a great idea. This would require a
> developer
> > to
> > > constantly stash those changes or commit and revert them before
> creating
> > a
> > > pull request.
> > >
> > > Piotrek
> > >
> > > > On Jun 28, 2017, at 3:49 PM, Greg Hogan  wrote:
> > > >
> > > > You don't need to use the build profile in IntelliJ, just change
> > > > scala.version and scala.binary.version in the parent pom (recent
> > > > refactorings made this possible without changing every pom).
> > > >
> > > > What is the benefit for changing the default without dropping older
> > > > versions when contributions are still limited to the functionality of
> > the
> > > > old version?
> > > >
> > > > On Wed, Jun 28, 2017 at 8:36 AM, Piotr Nowojski <
> > pi...@data-artisans.com
> > > >
> > > > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I propose to switch to Scala 2.11 as a default and to have a Scala
> > 2.10
> > > >> build profile. Now it is other way around. The reason for that is
> poor
> > > >> support for build profiles in Intellij, I was unable to make it work
> > > after
> > > >> I added Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala
> > > 2.10).
> > > >>
> > > >> As a side note, maybe we should also consider dropping Scala 2.10
> > > support?
> > > >>
> > > >> Piotrek
> > >
> > >
> >
>


[jira] [Created] (FLINK-7035) Flink Kinesis connector forces hard coding the AWS Region

2017-06-28 Thread Matt Pouttu-clarke (JIRA)
Matt Pouttu-clarke created FLINK-7035:
-

 Summary: Flink Kinesis connector forces hard coding the AWS Region
 Key: FLINK-7035
 URL: https://issues.apache.org/jira/browse/FLINK-7035
 Project: Flink
  Issue Type: Bug
  Components: Kinesis Connector
Affects Versions: 1.3.1, 1.2.0
 Environment: All AWS Amazon Linux nodes (including EMR) and AWS Lambda 
functions.
Reporter: Matt Pouttu-clarke


Hard-coding the region frequently causes cross-region network access, which in the 
worst cases can cause brown-outs of AWS services and nodes, violations of per-region 
availability requirements, security and compliance issues, and extremely poor 
performance for the end user's jobs.  All AWS nodes and services are aware of the 
region they are running in; please see: 
https://aws.amazon.com/blogs/developer/determining-an-applications-current-region/

We need to change the following line of code to use Regions.getCurrentRegion() 
rather than throwing an exception.  The code examples should also be changed to 
reflect correct practice.
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/util/KinesisConfigUtil.java#L174
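
For illustration, a rough sketch of the intended fallback. The class name, the 
"aws.region" property key, and the surrounding structure are simplified placeholders, 
not the actual KinesisConfigUtil code:

{code:java}
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import java.util.Properties;

public class RegionFallbackSketch {

    // "aws.region" is used here only for illustration; the real configuration key may differ.
    public static String resolveRegion(Properties config) {
        String configured = config.getProperty("aws.region");
        if (configured != null) {
            return configured;
        }
        // Regions.getCurrentRegion() returns null when not running on AWS infrastructure.
        Region current = Regions.getCurrentRegion();
        if (current != null) {
            return current.getName();
        }
        throw new IllegalArgumentException(
                "No AWS region was configured and none could be determined from the environment.");
    }
}
{code}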



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7034) GraphiteReporter cannot recover from lost connection

2017-06-28 Thread Aleksandr (JIRA)
Aleksandr created FLINK-7034:


 Summary: GraphiteReporter cannot recover from lost connection
 Key: FLINK-7034
 URL: https://issues.apache.org/jira/browse/FLINK-7034
 Project: Flink
  Issue Type: Bug
  Components: Metrics
Affects Versions: 1.3.0
Reporter: Aleksandr
Priority: Blocker


Flink currently uses metrics version 1.3.0, which contains a 
[bug|https://github.com/dropwizard/metrics/issues/694]. I think you should use 
version 1.3.1 or higher.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink-shaded pull request #5: [FLINK-7026] Add flink-shaded-asm-5 module

2017-06-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink-shaded/pull/5




[jira] [Created] (FLINK-7033) Update Flink's MapR documentation to include default trust store for secure servers

2017-06-28 Thread Aniket (JIRA)
Aniket created FLINK-7033:
-

 Summary: Update Flink's MapR documentation to include default 
trust store for secure servers
 Key: FLINK-7033
 URL: https://issues.apache.org/jira/browse/FLINK-7033
 Project: Flink
  Issue Type: Improvement
Reporter: Aniket
Priority: Minor


As discussed at 
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/MapR-libraries-shading-issue-td13988.html]
 and at 
[https://community.mapr.com/message/60673-re-flink-with-mapr-shading-issues], 
we will need to update Flink's MapR documentation to include an extra JVM 
argument: -Djavax.net.ssl.trustStore=$JAVA_HOME/jre/lib/security/cacerts



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7032) Intellij is constantly changing language level of sub projects back to 1.6

2017-06-28 Thread Piotr Nowojski (JIRA)
Piotr Nowojski created FLINK-7032:
-

 Summary: Intellij is constantly changing language level of sub 
projects back to 1.6 
 Key: FLINK-7032
 URL: https://issues.apache.org/jira/browse/FLINK-7032
 Project: Flink
  Issue Type: Improvement
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski


Every time I reimport the Maven projects, Intellij switches back to the 1.6 
language level. I tracked this issue down to a misconfiguration in our pom.xml 
file. It correctly configures the maven-compiler-plugin:

{code:xml}
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>3.1</version>
  <configuration>
    <source>${java.version}</source>
    <target>${java.version}</target>
    <compilerArgs>
      <arg>-Xlint:all</arg>
    </compilerArgs>
  </configuration>
</plugin>
{code}

where ${java.version} is set to 1.7 in the properties, but it forgets to 
overwrite the following properties from apache-18.pom:

{code:xml}
  <properties>
    <maven.compiler.source>1.6</maven.compiler.source>
    <maven.compiler.target>1.6</maven.compiler.target>
  </properties>
{code}

It seems like compiling from the console using Maven ignores those values, but they 
confuse Intellij.
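
A sketch of one possible fix, assuming we simply override the parent pom's compiler 
properties with our own {{java.version}}:

{code:xml}
  <properties>
    <maven.compiler.source>${java.version}</maven.compiler.source>
    <maven.compiler.target>${java.version}</maven.compiler.target>
  </properties>
{code}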



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7031) Document Gelly examples

2017-06-28 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-7031:
-

 Summary: Document Gelly examples
 Key: FLINK-7031
 URL: https://issues.apache.org/jira/browse/FLINK-7031
 Project: Flink
  Issue Type: New Feature
  Components: Documentation
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor
 Fix For: 1.4.0


The components comprising the Gelly examples runner (inputs, outputs, drivers, 
and soon transforms) were initially developed for internal Gelly use. As such, 
the Gelly documentation covers execution of the drivers but does not document 
the design and structure. The runner has become sufficiently advanced and 
integral to the development of new Gelly algorithms to warrant a page of 
documentation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Switch to Scala 2.11 as a default build profile

2017-06-28 Thread Till Rohrmann
I'm +1 for changing the profile and to start a discussion to drop Scala
2.10.

Scala 2.10 is already quite old and the current stable version is 2.12. I
would be surprised to see many people still using Scala 2.10.

Cheers,
Till

On Wed, Jun 28, 2017 at 4:53 PM, Ted Yu  wrote:

> Here is the KIP that drops support for Scala 2.10 in Kafka 0.11 :
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 119%3A+Drop+Support+for+Scala+2.10+in+Kafka+0.11
>
> FYI
>
> On Wed, Jun 28, 2017 at 7:23 AM, Piotr Nowojski 
> wrote:
>
> > Yes, I know and I’m proposing to change this in parent pom by default to
> > scala-2.11.
> >
> > Changing parent pom every time anyone wants to touch/build in Intellij
> > Kafka 0.11 connector is not a great idea. This would require a developer
> to
> > constantly stash those changes or commit and revert them before creating
> a
> > pull request.
> >
> > Piotrek
> >
> > > On Jun 28, 2017, at 3:49 PM, Greg Hogan  wrote:
> > >
> > > You don't need to use the build profile in IntelliJ, just change
> > > scala.version and scala.binary.version in the parent pom (recent
> > > refactorings made this possible without changing every pom).
> > >
> > > What is the benefit for changing the default without dropping older
> > > versions when contributions are still limited to the functionality of
> the
> > > old version?
> > >
> > > On Wed, Jun 28, 2017 at 8:36 AM, Piotr Nowojski <
> pi...@data-artisans.com
> > >
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> I propose to switch to Scala 2.11 as a default and to have a Scala
> 2.10
> > >> build profile. Now it is other way around. The reason for that is poor
> > >> support for build profiles in Intellij, I was unable to make it work
> > after
> > >> I added Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala
> > 2.10).
> > >>
> > >> As a side note, maybe we should also consider dropping Scala 2.10
> > support?
> > >>
> > >> Piotrek
> >
> >
>


[GitHub] flink-shaded issue #5: [FLINK-7026] Add flink-shaded-asm-5 module

2017-06-28 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink-shaded/pull/5
  
+1 to merge




Re: Switch to Scala 2.11 as a default build profile

2017-06-28 Thread Ted Yu
Here is the KIP that drops support for Scala 2.10 in Kafka 0.11 :

https://cwiki.apache.org/confluence/display/KAFKA/KIP-119%3A+Drop+Support+for+Scala+2.10+in+Kafka+0.11

FYI

On Wed, Jun 28, 2017 at 7:23 AM, Piotr Nowojski 
wrote:

> Yes, I know and I’m proposing to change this in parent pom by default to
> scala-2.11.
>
> Changing parent pom every time anyone wants to touch/build in Intellij
> Kafka 0.11 connector is not a great idea. This would require a developer to
> constantly stash those changes or commit and revert them before creating a
> pull request.
>
> Piotrek
>
> > On Jun 28, 2017, at 3:49 PM, Greg Hogan  wrote:
> >
> > You don't need to use the build profile in IntelliJ, just change
> > scala.version and scala.binary.version in the parent pom (recent
> > refactorings made this possible without changing every pom).
> >
> > What is the benefit for changing the default without dropping older
> > versions when contributions are still limited to the functionality of the
> > old version?
> >
> > On Wed, Jun 28, 2017 at 8:36 AM, Piotr Nowojski  >
> > wrote:
> >
> >> Hi,
> >>
> >> I propose to switch to Scala 2.11 as a default and to have a Scala 2.10
> >> build profile. Now it is other way around. The reason for that is poor
> >> support for build profiles in Intellij, I was unable to make it work
> after
> >> I added Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala
> 2.10).
> >>
> >> As a side note, maybe we should also consider dropping Scala 2.10
> support?
> >>
> >> Piotrek
>
>


Re: Switch to Scala 2.11 as a default build profile

2017-06-28 Thread Greg Hogan
You don't need to use the build profile in IntelliJ, just change
scala.version and scala.binary.version in the parent pom (recent
refactorings made this possible without changing every pom).
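
For example (illustrative values only; the exact Scala patch version may differ):

    <scala.version>2.11.11</scala.version>
    <scala.binary.version>2.11</scala.binary.version>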

What is the benefit of changing the default without dropping older
versions, when contributions are still limited to the functionality of the
old version?

On Wed, Jun 28, 2017 at 8:36 AM, Piotr Nowojski 
wrote:

> Hi,
>
> I propose to switch to Scala 2.11 as a default and to have a Scala 2.10
> build profile. Now it is other way around. The reason for that is poor
> support for build profiles in Intellij, I was unable to make it work after
> I added Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).
>
> As a side note, maybe we should also consider dropping Scala 2.10 support?
>
> Piotrek


Re: Switch to Scala 2.11 as a default build profile

2017-06-28 Thread Piotr Nowojski
I have created an issue for this:

https://issues.apache.org/jira/browse/FLINK-7030 


and PR:

https://github.com/apache/flink/pull/4209 


Piotrek

> On Jun 28, 2017, at 2:45 PM, Aljoscha Krettek  wrote:
> 
> +1 For changing the default. I think a lot of systems don’t even support 2.10 
> anymore.
> 
>> On 28. Jun 2017, at 14:38, Ted Yu  wrote:
>> 
>> +1 on using Scale 2.11 as default. 
>>  Original message From: Piotr Nowojski 
>>  Date: 6/28/17  5:36 AM  (GMT-08:00) To: 
>> dev@flink.apache.org Subject: Switch to Scala 2.11 as a default build 
>> profile 
>> Hi,
>> 
>> I propose to switch to Scala 2.11 as a default and to have a Scala 2.10 
>> build profile. Now it is other way around. The reason for that is poor 
>> support for build profiles in Intellij, I was unable to make it work after I 
>> added Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).
>> 
>> As a side note, maybe we should also consider dropping Scala 2.10 support?
>> 
>> Piotrek
> 



[jira] [Created] (FLINK-7030) Build with scala-2.11 by default

2017-06-28 Thread Piotr Nowojski (JIRA)
Piotr Nowojski created FLINK-7030:
-

 Summary: Build with scala-2.11 by default
 Key: FLINK-7030
 URL: https://issues.apache.org/jira/browse/FLINK-7030
 Project: Flink
  Issue Type: Improvement
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski


As proposed recently on the dev mailing list.

I propose to switch to Scala 2.11 as a default and to have a Scala 2.10 build 
profile. Now it is the other way around. The reason for that is poor support 
for build profiles in Intellij: I was unable to make it work after I added the 
Kafka 0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] FLIP-21 - Improve object Copying/Reuse Mode for Streaming Runtime

2017-06-28 Thread Aljoscha Krettek
+1 for changing the default if so many people encountered problems with 
serialisation costs.

The first two modes don’t require any code changes, correct? Only the last one 
would require changes to the stream input processors.

We should also keep this issue in mind: 
https://issues.apache.org/jira/browse/FLINK-3974, i.e. we always need to make 
shallow copies of the StreamRecord.

Best,
Aljoscha
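
P.S. For reference, a minimal illustrative sketch of the existing switch that 
controls today's copying behaviour (the ExecutionConfig object-reuse flag); the 
class and main method here are just scaffolding:

    import org.apache.flink.api.common.ExecutionConfig;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ObjectReuseExample {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            ExecutionConfig config = env.getConfig();
            // Disables the defensive per-element copies between chained operators;
            // only safe if user code treats records as immutable.
            config.enableObjectReuse();
        }
    }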

> On 27. Jun 2017, at 21:01, Zhenzhong Xu  wrote:
> 
> Stephan,
> 
> Fully supporting this FLIP. We originally encountered pretty big surprises 
> observing the object copy behavior causing significant performance 
> degradation for our massively parallel use case. 
> 
> In our case, (I think most appropriately SHOULD be the assumptions for all 
> streaming use case), is to assume object immutability for all the records 
> throughout processing pipeline, as a result, I don't see a need to 
> distinguish different object reuse behaviors for different (chained) 
> operators (or to the very extreme even the need to support COPY_PER_OPERATOR 
> other than we probably have to support for backward compatibility). I am also 
> a fan of failing fast if user asserts incorrect assumptions.
> 
> One feedback on the FLIP-21 itself, I am not very clear on the difference 
> between DEFAULT and FULL_REUSE enumeration, aren't them exactly the same 
> thing in new proposal? However, the model figures seem to indicate they are 
> slightly different? Can you elaborate a bit more?
> 
> Z. 
> 
> 
> On 2017-06-27 11:14 (-0700), Greg Hogan  > wrote: 
>> Hi Stephan,
>> 
>> Would this be an appropriate time to discuss allowing reuse to be a 
>> per-operator configuration? Object reuse for chained operators has lead to 
>> considerable surprise for some users of the DataSet API. This came up during 
>> the rework of the object reuse documentation for the DataSet API. With 
>> annotations a Function could mark whether input/iterator or output/collected 
>> objects should be copied or reused.
>> 
>> My distant observation is that is is safer to locally assert reuse at the 
>> operator level than to assume or guarantee the safety of object reuse across 
>> an entire program. It could also be handy to mix operators receiving 
>> copyable objects with operators not requiring copyable objects.
>> 
>> Greg
>> 
>> 
>>> On Jun 27, 2017, at 1:21 PM, Stephan Ewen  wrote:
>>> 
>>> Hi all!
>>> 
>>> I would like to propose the following FLIP:
>>> 
>>> FLIP-21 - Improve object Copying/Reuse Mode for Streaming Runtime:
>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=71012982
>>> 
>>> The FLIP is motivated by the fact that many users run into an unnecessary
>>> kind of performance problem caused by an old design artifact.
>>> 
>>> The required change should be reasonably small, and would help many users
>>> and Flink's general standing.
>>> 
>>> Happy to hear thoughts!
>>> 
>>> Stephan
>>> 
>>> ==
>>> 
>>> FLIP text is below. Pictures with illustrations are only in the Wiki, not
>>> supported on the mailing list.
>>> -
>>> 
>>> Motivation
>>> 
>>> The default behavior of the streaming runtime is to copy every element
>>> between chained operators.
>>> 
>>> That operation was introduced for “safety” reasons, to avoid the number 
>>> of
>>> cases where users can create incorrect programs by reusing mutable objects
>>> (a discouraged pattern, but possible). For example when using state
>>> backends that keep the state as objects on heap, reusing mutable objects
>>> can theoretically create cases where the same object is used in multiple
>>> state mappings.
>>> 
>>> The effect is that many people that try Flink get much lower performance
>>> than they could possibly get. From empirical evidence, almost all users
>>> that I (Stephan) have been in touch with eventually run into this issue
>>> eventually.
>>> 
>>> There are multiple observations about that design:
>>> 
>>> 
>>>  -
>>> 
>>>  Object copies are extremely costly. While some simple copy virtually for
>>>  free (types reliably detected as immutable are not copied at all), many
>>>  real pipelines use types like Avro, Thrift, JSON, etc, which are very
>>>  expensive to copy.
>>> 
>>> 
>>> 
>>>  -
>>> 
>>>  Keyed operations currently only occur after shuffles. The operations are
>>>  hence the first in a pipeline and will never have a reused object anyways.
>>>  That means for the most critical operation, this pre-caution is 
>>> unnecessary.
>>> 
>>> 
>>> 
>>>  -
>>> 
>>>  The mode is inconsistent with the contract of the DataSet API, which
>>>  does not copy at each step
>>> 
>>> 
>>> 
>>>  -
>>> 
>>>  To prevent these copies, users can select {{enableObjectReuse()}}, which
>>>  is misleading, since it does 

Re: Switch to Scala 2.11 as a default build profile

2017-06-28 Thread Aljoscha Krettek
+1 For changing the default. I think a lot of systems don’t even support 2.10 
anymore.

> On 28. Jun 2017, at 14:38, Ted Yu  wrote:
> 
> +1 on using Scale 2.11 as default. 
>  Original message From: Piotr Nowojski 
>  Date: 6/28/17  5:36 AM  (GMT-08:00) To: 
> dev@flink.apache.org Subject: Switch to Scala 2.11 as a default build profile 
> Hi,
> 
> I propose to switch to Scala 2.11 as a default and to have a Scala 2.10 build 
> profile. Now it is other way around. The reason for that is poor support for 
> build profiles in Intellij, I was unable to make it work after I added Kafka 
> 0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).
> 
> As a side note, maybe we should also consider dropping Scala 2.10 support?
> 
> Piotrek



Re: Switch to Scala 2.11 as a default build profile

2017-06-28 Thread Ted Yu
+1 on using Scala 2.11 as default.

 Original message 
From: Piotr Nowojski
Date: 6/28/17 5:36 AM (GMT-08:00)
To: dev@flink.apache.org
Subject: Switch to Scala 2.11 as a default build profile

Hi,

I propose to switch to Scala 2.11 as a default and to have a Scala 2.10 build 
profile. Now it is other way around. The reason for that is poor support for 
build profiles in Intellij, I was unable to make it work after I added Kafka 
0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).

As a side note, maybe we should also consider dropping Scala 2.10 support?

Piotrek

Switch to Scala 2.11 as a default build profile

2017-06-28 Thread Piotr Nowojski
Hi,

I propose to switch to Scala 2.11 as a default and to have a Scala 2.10 build 
profile. Now it is the other way around. The reason for that is poor support for 
build profiles in Intellij: I was unable to make it work after I added the Kafka 
0.11 dependency (Kafka 0.11 dropped support for Scala 2.10).

As a side note, maybe we should also consider dropping Scala 2.10 support?

Piotrek

[GitHub] flink-shaded pull request #5: [FLINK-7026] Add flink-shaded-asm-5 module

2017-06-28 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink-shaded/pull/5

[FLINK-7026] Add flink-shaded-asm-5 module



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink-shaded 7026

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink-shaded/pull/5.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5


commit 2bae90ea80f335cb55cddc754aeeb33168430937
Author: zentol 
Date:   2017-06-28T11:13:34Z

[FLINK-7026] Add flink-shaded-asm-5 module






[jira] [Created] (FLINK-7029) Documentation for WindowFunction is confusing

2017-06-28 Thread Felix Neutatz (JIRA)
Felix Neutatz created FLINK-7029:


 Summary: Documentation for WindowFunction is confusing
 Key: FLINK-7029
 URL: https://issues.apache.org/jira/browse/FLINK-7029
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Felix Neutatz
Priority: Trivial


Hi,

in the [example of the WindowFunction in the 
documentation|https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/windows.html#windowfunction---the-generic-case]
 we use WindowFunction<…, String, String, TimeWindow>. That means that our key 
data type is a String. For me, this is highly confusing, since we can only have a 
String key type if we implement a custom key selector. Usually people, especially 
beginners, will use something like keyBy(ID) or keyBy("attributeName"), which will 
always return a tuple, e.g. a Tuple1. It would be great if somebody could change 
the example to use a tuple key type. I am sure this would help beginners to 
understand that by default the key type is a tuple.

Moreover, another suggestion would be to change keyBy() so that if we key by just 
one attribute, it returns that type directly instead of wrapping it in a Tuple1.
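
For illustration, a sketch of what such an example could look like with the default 
tuple key. The class name, input type, and aggregation are made up and not taken 
from the current documentation:

{code:java}
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

// Input records are Tuple2<String, Long>; the key produced by keyBy(0) is a Tuple (a Tuple1 here).
public class SumWindowFunction
        implements WindowFunction<Tuple2<String, Long>, String, Tuple, TimeWindow> {

    @Override
    public void apply(Tuple key, TimeWindow window,
                      Iterable<Tuple2<String, Long>> input, Collector<String> out) {
        long sum = 0;
        for (Tuple2<String, Long> record : input) {
            sum += record.f1;
        }
        // The key is wrapped in a tuple; the original attribute is field 0.
        out.collect("key=" + key.getField(0) + ", window=" + window + ", sum=" + sum);
    }
}
{code}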



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7028) Replace asm dependencies

2017-06-28 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-7028:
---

 Summary: Replace asm dependencies
 Key: FLINK-7028
 URL: https://issues.apache.org/jira/browse/FLINK-7028
 Project: Flink
  Issue Type: Sub-task
  Components: Build System
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7027) Replace netty dependencies

2017-06-28 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-7027:
---

 Summary: Replace netty dependencies
 Key: FLINK-7027
 URL: https://issues.apache.org/jira/browse/FLINK-7027
 Project: Flink
  Issue Type: Sub-task
  Components: Build System, Network
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7026) Add shaded asm dependency

2017-06-28 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-7026:
---

 Summary: Add shaded asm dependency
 Key: FLINK-7026
 URL: https://issues.apache.org/jira/browse/FLINK-7026
 Project: Flink
  Issue Type: Sub-task
  Components: Build System
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] Backwards compatibility policy.

2017-06-28 Thread Sebastian Schelter
I haven't closely followed the discussion so far, but isn't it Apache
policy that major versions should stay backwards compatible with all previous
releases with the same major version?

-s

2017-06-28 12:26 GMT+02:00 Kostas Kloudas :

> I agree that 1.1 compatibility is the most important “pain point", as
> compatibility with the rest of the versions follows a more “systematic”
> approach.
>
> I think that discarding compatibility with 1.1 will clear some parts
> of the codebase significantly.
>
> Kostas
>
> > On Jun 27, 2017, at 6:03 PM, Stephan Ewen  wrote:
> >
> > I think that this discussion is probably motivated especially by the
> > "legacy state" handling of Flink 1.1.
> > The biggest gain in codebase and productivity would be won only by
> dropping
> > 1.1 compatibility in Flink 1.4.
> >
> > My gut feeling is that this is reasonable. We support two versions back,
> > which means that users can skip one upgrade, but not two.
> >
> > From what I can tell, users are usually eager to upgrade. They don't do
> it
> > immediately, but as soon as the new release is a bit battle tested.
> >
> > I would expect skipping two entire versions to be rare enough to be okay
> > with a solution which is a bit more effort for the user:
> > You can upgrade from Flink 1.1. to 1.4 by loading the 1.1 savepoint into
> > Flink 1.2, take a savepoint (1.2 format), and resume that in Flink 1.4.
> >
> > Greetings,
> > Stephan
> >
> >
> > On Tue, Jun 27, 2017 at 12:01 PM, Stefan Richter <
> > s.rich...@data-artisans.com> wrote:
> >
> >> For many parts of the code, I would agree with Aljoscha. However, I can
> >> also see notable exceptions, such as maintaining support for the legacy
> >> state from Flink <=1.1. For example, I think dropping support for this
> can
> >> simplify new developments such as fast local recovery or state
> replication
> >> quiet a bit because this is a special case that runs through a lot of
> code
> >> from backend to JM. So besides this general discussion about a backwards
> >> compatible policy, do you think it could make sense to start another
> >> concrete discussion about if we still must or want backwards
> compatibility
> >> to Flink 1.1 in Flink 1.4?
> >>
> >>> Am 29.05.2017 um 12:08 schrieb Aljoscha Krettek :
> >>>
> >>> Normally, I’m the first one to suggest removing everything that is not
> >> absolutely necessary in order to have a clean code base. On this issue,
> >> though, I think we should support restoring from old Savepoints as far
> back
> >> as possible if it does not make the code completely unmaintainable. Some
> >> users might jump versions and always forcing them to go though every
> >> version from their old version to the current version doesn’t seem
> feasible
> >> and might put off some users.
> >>>
> >>> So far, I think the burden of supporting restore from 1.1 is still
> small
> >> enough and with each new version the changes between versions become
> less
> >> and less. The changes from 1.2 to the upcoming 1.3 are quite minimal, I
> >> think.
> >>>
> >>> Best,
> >>> Aljoscha
>  On 24. May 2017, at 17:58, Ted Yu  wrote:
> 
>  bq. about having LTS versions once a year
> 
>  +1 to the above.
> 
>  There may be various reasons users don't want to upgrade (after new
>  releases come out). We should give such users enough flexibility on
> the
>  upgrade path.
> 
>  Cheers
> 
>  On Wed, May 24, 2017 at 8:39 AM, Kostas Kloudas <
> >> k.klou...@data-artisans.com
> > wrote:
> 
> > Hi all,
> >
> > For the proposal of having a third party tool, I agree with Ted.
> > Maintaining
> > it is a big and far from trivial effort.
> >
> > Now for the window of backwards compatibility, I would argue that
> even
> >> if
> > for some users 4 months (1 release) is not enough to bump their Flink
> > version,
> > the proposed policy guarantees that there will always be a path from
> >> any
> > old
> > version to any subsequent one.
> >
> > Finally, for the proposal about having LTS versions once a year, I am
> >> not
> > sure if this will reduce or create more overhead. If I understand the
> >> plan
> > correctly, this would mean that the community will have to maintain
> > 2 or 3 LTS versions and the last two major ones, right?
> >
> >> On May 22, 2017, at 7:31 PM, Ted Yu  wrote:
> >>
> >> For #2, it is difficult to achieve:
> >>
> >> a. maintaining savepoint migration is non-trivial and should be
> >> reviewed
> > by
> >> domain experts
> >> b. how to certify such third-party tool
> >>
> >> Cheers
> >>
> >> On Mon, May 22, 2017 at 3:04 AM, 施晓罡 
> wrote:
> >>
> >>> Hi all,
> >>>
> >>> Currently, we work a lot in the maintenance of compatibility.
> >>> There exist much 

Re: [DISCUSS] Backwards compatibility policy.

2017-06-28 Thread Kostas Kloudas
I agree that 1.1 compatibility is the most important “pain point”, as 
compatibility with the rest of the versions follows a more “systematic” 
approach.

I think that discarding compatibility with 1.1 will clean up some parts 
of the codebase significantly.

Kostas

> On Jun 27, 2017, at 6:03 PM, Stephan Ewen  wrote:
> 
> I think that this discussion is probably motivated especially by the
> "legacy state" handling of Flink 1.1.
> The biggest gain in codebase and productivity would be won only by dropping
> 1.1 compatibility in Flink 1.4.
> 
> My gut feeling is that this is reasonable. We support two versions back,
> which means that users can skip one upgrade, but not two.
> 
> From what I can tell, users are usually eager to upgrade. They don't do it
> immediately, but as soon as the new release is a bit battle tested.
> 
> I would expect skipping two entire versions to be rare enough to be okay
> with a solution which is a bit more effort for the user:
> You can upgrade from Flink 1.1. to 1.4 by loading the 1.1 savepoint into
> Flink 1.2, take a savepoint (1.2 format), and resume that in Flink 1.4.
> 
> Greetings,
> Stephan
> 
> 
> On Tue, Jun 27, 2017 at 12:01 PM, Stefan Richter <
> s.rich...@data-artisans.com> wrote:
> 
>> For many parts of the code, I would agree with Aljoscha. However, I can
>> also see notable exceptions, such as maintaining support for the legacy
>> state from Flink <=1.1. For example, I think dropping support for this can
>> simplify new developments such as fast local recovery or state replication
>> quiet a bit because this is a special case that runs through a lot of code
>> from backend to JM. So besides this general discussion about a backwards
>> compatible policy, do you think it could make sense to start another
>> concrete discussion about if we still must or want backwards compatibility
>> to Flink 1.1 in Flink 1.4?
>> 
>>> Am 29.05.2017 um 12:08 schrieb Aljoscha Krettek :
>>> 
>>> Normally, I’m the first one to suggest removing everything that is not
>> absolutely necessary in order to have a clean code base. On this issue,
>> though, I think we should support restoring from old Savepoints as far back
>> as possible if it does not make the code completely unmaintainable. Some
>> users might jump versions and always forcing them to go though every
>> version from their old version to the current version doesn’t seem feasible
>> and might put off some users.
>>> 
>>> So far, I think the burden of supporting restore from 1.1 is still small
>> enough and with each new version the changes between versions become less
>> and less. The changes from 1.2 to the upcoming 1.3 are quite minimal, I
>> think.
>>> 
>>> Best,
>>> Aljoscha
 On 24. May 2017, at 17:58, Ted Yu  wrote:
 
 bq. about having LTS versions once a year
 
 +1 to the above.
 
 There may be various reasons users don't want to upgrade (after new
 releases come out). We should give such users enough flexibility on the
 upgrade path.
 
 Cheers
 
 On Wed, May 24, 2017 at 8:39 AM, Kostas Kloudas <
>> k.klou...@data-artisans.com
> wrote:
 
> Hi all,
> 
> For the proposal of having a third party tool, I agree with Ted.
> Maintaining
> it is a big and far from trivial effort.
> 
> Now for the window of backwards compatibility, I would argue that even
>> if
> for some users 4 months (1 release) is not enough to bump their Flink
> version,
> the proposed policy guarantees that there will always be a path from
>> any
> old
> version to any subsequent one.
> 
> Finally, for the proposal about having LTS versions once a year, I am
>> not
> sure if this will reduce or create more overhead. If I understand the
>> plan
> correctly, this would mean that the community will have to maintain
> 2 or 3 LTS versions and the last two major ones, right?
> 
>> On May 22, 2017, at 7:31 PM, Ted Yu  wrote:
>> 
>> For #2, it is difficult to achieve:
>> 
>> a. maintaining savepoint migration is non-trivial and should be
>> reviewed
> by
>> domain experts
>> b. how to certify such third-party tool
>> 
>> Cheers
>> 
>> On Mon, May 22, 2017 at 3:04 AM, 施晓罡  wrote:
>> 
>>> Hi all,
>>> 
>>> Currently, we work a lot in the maintenance of compatibility.
>>> There exist much code in runtime to support the migration of
>> savepoints
>>> (most of which are deprecated), making it hard to focus on the
>> current
>>> implementation.
>>> When more versions are released, much more efforts will be needed if
>> we
>>> try to make these released versions compatible.
>>> 
>>> I agree with Tzu-Li that we should provide a method to let users
>> upgrade
>>> Flink in a reasonable pace.
>>> But i am against the proposal that we only offer 

[GitHub] flink-shaded pull request #3: [FLINK-7007] Add README.md

2017-06-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink-shaded/pull/3




[GitHub] flink-shaded pull request #4: Add flink-shaded-netty module

2017-06-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink-shaded/pull/4




[jira] [Created] (FLINK-7025) Using NullByteKeySelector for Unbounded ProcTime NonPartitioned Over

2017-06-28 Thread sunjincheng (JIRA)
sunjincheng created FLINK-7025:
--

 Summary: Using NullByteKeySelector for Unbounded ProcTime 
NonPartitioned Over
 Key: FLINK-7025
 URL: https://issues.apache.org/jira/browse/FLINK-7025
 Project: Flink
  Issue Type: Bug
Reporter: sunjincheng
Assignee: sunjincheng


We recently added the `Cleanup State` feature, but it does not work well when state 
cleaning is enabled on an Unbounded ProcTime NonPartitioned Over window, because 
`ProcessFunctionWithCleanupState` uses keyed state.

So, in this JIRA, I'll change the `Unbounded ProcTime NonPartitioned Over` to a 
`partitioned Over` by using a NullByteKeySelector, or alternatively create a 
`NonKeyedProcessFunctionWithCleanupState`. I think the first way is 
simpler. What do you think? [~fhueske]
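
For reference, a minimal sketch of the key-selector idea (illustrative only; whether 
an existing utility class can be reused still needs to be checked):

{code:java}
import org.apache.flink.api.java.functions.KeySelector;

// Maps every record to the same key so that a non-partitioned over window can run
// as a keyed (single-partition) operator and use keyed state for cleanup timers.
public class NullByteKeySelector<T> implements KeySelector<T, Byte> {

    private static final long serialVersionUID = 1L;

    @Override
    public Byte getKey(T value) {
        return (byte) 0;
    }
}
{code}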



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)