[jira] [Commented] (FLINK-24456) Support bounded offset in the Kafka table connector

2021-12-20 Thread Haohui Mai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17462498#comment-17462498
 ] 

Haohui Mai commented on FLINK-24456:


IMO, anything on par with the current Table API is sufficient.

> Support bounded offset in the Kafka table connector
> ---
>
> Key: FLINK-24456
> URL: https://issues.apache.org/jira/browse/FLINK-24456
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kafka, Table SQL / Ecosystem
>Reporter: Haohui Mai
>Assignee: ZhuoYu Chen
>Priority: Minor
>  Labels: pull-request-available
>
> The {{setBounded}} API in the DataStream connector of Kafka is particularly 
> useful when writing tests. Unfortunately the table connector of Kafka lacks 
> the same API.
> It would be good to have this API added.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-24456) Support bounded offset in the Kafka table connector

2021-11-13 Thread Haohui Mai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17443250#comment-17443250
 ] 

Haohui Mai commented on FLINK-24456:


It just needs to be on par with the DataStream API.






[jira] [Created] (FLINK-24485) COUNT(DISTINCT) should support binary field

2021-10-08 Thread Haohui Mai (Jira)
Haohui Mai created FLINK-24485:
--

 Summary: COUNT(DISTINCT) should support binary field
 Key: FLINK-24485
 URL: https://issues.apache.org/jira/browse/FLINK-24485
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.14.0
Reporter: Haohui Mai


Currently the SQL API fails when doing {{COUNT(DISTINCT)}} on a binary field. In 
our use case we store the UUID as a 16-byte binary string.

While it is possible to work around the limitation by base64-encoding the 
field, it should be relatively straightforward to support binary fields 
natively and gain optimal speed.
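The base64 workaround mentioned above can be sketched in plain Java. The class and method names below are hypothetical, for illustration only: the point is that encoding the 16-byte binary UUID as a Base64 string yields a VARCHAR-friendly value, and distinct UUIDs map to distinct encoded strings, so {{COUNT(DISTINCT)}} over the encoded column gives the same result.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper for the base64 workaround described above.
class UuidBase64 {
    // Pack a UUID into its 16-byte binary representation.
    static byte[] toBytes(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return buf.array();
    }

    // Base64-encode the binary UUID. The encoding is injective, so
    // COUNT(DISTINCT) over the encoded strings matches COUNT(DISTINCT)
    // over the raw binary values.
    static String encode(UUID uuid) {
        return Base64.getEncoder().encodeToString(toBytes(uuid));
    }

    public static void main(String[] args) {
        // 16 bytes always encode to a 24-character Base64 string.
        System.out.println(encode(UUID.randomUUID()));
    }
}
```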



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24456) Support bounded offset in the Kafka table connector

2021-10-08 Thread Haohui Mai (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426117#comment-17426117
 ] 

Haohui Mai commented on FLINK-24456:


[~MartijnVisser] feel free to take it forward






[jira] [Created] (FLINK-24456) Support bounded offset in the Kafka table connector

2021-10-05 Thread Haohui Mai (Jira)
Haohui Mai created FLINK-24456:
--

 Summary: Support bounded offset in the Kafka table connector
 Key: FLINK-24456
 URL: https://issues.apache.org/jira/browse/FLINK-24456
 Project: Flink
  Issue Type: Improvement
Reporter: Haohui Mai


The {{setBounded}} API in the DataStream connector of Kafka is particularly 
useful when writing tests. Unfortunately the table connector of Kafka lacks the 
same API.

It would be good to have this API added.





[jira] [Commented] (FLINK-8240) Create unified interfaces to configure and instantiate TableSources

2017-12-19 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296435#comment-16296435
 ] 

Haohui Mai commented on FLINK-8240:
---

It seems like a great use case for layered table sources / converters, so I'm 
not fully sure yet that all tables should be built using {{TableFactory}}.

Popping up one level, I have a related question: assuming we need to implement 
the {{CREATE EXTERNAL TABLE}} statement, what should it look like? Here is an 
example of Hive's {{CREATE EXTERNAL TABLE}} statement:

{code}
CREATE EXTERNAL TABLE weatherext (wban INT, date STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION '/hive/data/weatherext';
{code}

The combination of {{ROW FORMAT}} and {{LOCATION}} is effectively the same as 
what you proposed, but it does not seem to force all table sources to be aware 
of the composition of connector / converter (i.e., {{TableFactory}}), at least 
at the API level.

Thoughts?

> Create unified interfaces to configure and instantiate TableSources
> --
>
> Key: FLINK-8240
> URL: https://issues.apache.org/jira/browse/FLINK-8240
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> At the moment every table source has different ways for configuration and 
> instantiation. Some table sources are tailored to a specific encoding (e.g., 
> {{KafkaAvroTableSource}}, {{KafkaJsonTableSource}}) or only support one 
> encoding for reading (e.g., {{CsvTableSource}}). Each of them might implement 
> a builder or support table source converters for external catalogs.
> The table sources should have a unified interface for discovery, defining 
> common properties, and instantiation. The {{TableSourceConverters}} provide 
> similar functionality but use an external catalog. We might generalize this 
> interface.
> In general a table source declaration depends on the following parts:
> {code}
> - Source
>   - Type (e.g. Kafka, Custom)
>   - Properties (e.g. topic, connection info)
> - Encoding
>   - Type (e.g. Avro, JSON, CSV)
>   - Schema (e.g. Avro class, JSON field names/types)
> - Rowtime descriptor/Proctime
>   - Watermark strategy and Watermark properties
>   - Time attribute info
> - Bucketization
> {code}
> This issue needs a design document before implementation. Any discussion is 
> very welcome.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7980) Bump joda-time to 2.9.9

2017-11-08 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245232#comment-16245232
 ] 

Haohui Mai commented on FLINK-7980:
---

IMO it is a better idea to get rid of joda-time altogether, as the joda-time 
APIs are all supported in Java 8.
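As a minimal illustration of that claim, the sketch below (hypothetical class and method names, standard library only) shows two common joda-time idioms expressed with Java 8's java.time package:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Illustrative only: java.time equivalents of common joda-time idioms.
class JodaToJavaTime {
    // joda-time: new LocalDate(2021, 10, 8).toString("yyyy-MM-dd")
    static String isoDate(LocalDate date) {
        return date.format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    // joda-time: date.plusDays(days) -- java.time uses the same method name.
    static LocalDate plusDays(LocalDate date, int days) {
        return date.plusDays(days);
    }

    public static void main(String[] args) {
        System.out.println(isoDate(LocalDate.of(2021, 10, 8)));
    }
}
```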

> Bump joda-time to 2.9.9
> ---
>
> Key: FLINK-7980
> URL: https://issues.apache.org/jira/browse/FLINK-7980
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.4.0
>Reporter: Hai Zhou UTC+8
>Assignee: Hai Zhou UTC+8
> Fix For: 1.5.0
>
>
> joda-time is at version 2.5 (Oct 2014); this bumps it to 2.9.9 (the latest version).





[jira] [Commented] (FLINK-7548) Support watermark generation for TableSource

2017-10-23 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16215758#comment-16215758
 ] 

Haohui Mai commented on FLINK-7548:
---

Sorry for the late response. The APIs should work for our use cases as long as 
the timestamps can be extracted through an expression.

I think [~ykt836] brought up a good point -- it might be tricky to implement 
projection push-down in this case. What would our strategy be there?

> Support watermark generation for TableSource
> 
>
> Key: FLINK-7548
> URL: https://issues.apache.org/jira/browse/FLINK-7548
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Jark Wu
>Assignee: Fabian Hueske
>Priority: Blocker
> Fix For: 1.4.0
>
>
> As discussed in FLINK-7446, the TableSource currently only supports defining 
> the rowtime field, but does not support extracting watermarks from it. 
> We can provide a new interface called {{DefinedWatermark}}, which has two 
> methods: {{getRowtimeAttribute}} (can only be an existing field) and 
> {{getWatermarkGenerator}}. The {{DefinedRowtimeAttribute}} interface will be 
> marked deprecated.
> How to support periodic and punctuated watermarks and support some built-in 
> strategies needs further discussion.





[jira] [Commented] (FLINK-7051) Bump up Calcite version to 1.14

2017-10-15 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205312#comment-16205312
 ] 

Haohui Mai commented on FLINK-7051:
---

I plan to work on it this week to make sure it happens before the Flink 1.4 
release.



> Bump up Calcite version to 1.14
> ---
>
> Key: FLINK-7051
> URL: https://issues.apache.org/jira/browse/FLINK-7051
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>Priority: Critical
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.14 is released.





[jira] [Closed] (FLINK-6569) flink-table KafkaJsonTableSource example doesn't work

2017-10-10 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai closed FLINK-6569.
-
   Resolution: Invalid
Fix Version/s: (was: 1.4.0)

> flink-table KafkaJsonTableSource example doesn't work
> -
>
> Key: FLINK-6569
> URL: https://issues.apache.org/jira/browse/FLINK-6569
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation, Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Haohui Mai
>
> The code example uses 
> {code}
> TypeInformation typeInfo = Types.ROW(
>   new String[] { "id", "name", "score" },
>   new TypeInformation[] { Types.INT(), Types.STRING(), Types.DOUBLE() }
> );
> {code}
> the correct way of using it is something like
> {code}
> TypeInformation typeInfo = Types.ROW_NAMED(
> new String[] { "id", "zip", "date" },
> Types.LONG, Types.INT, Types.SQL_DATE);
> {code}





[jira] [Created] (FLINK-7787) Remove guava dependency in the cassandra connector

2017-10-09 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-7787:
-

 Summary: Remove guava dependency in the cassandra connector
 Key: FLINK-7787
 URL: https://issues.apache.org/jira/browse/FLINK-7787
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


As discovered in FLINK-6225, the cassandra connector uses the future classes in 
the guava library. We can get rid of the dependency by using the equivalent 
classes provided by Java 8.
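A minimal sketch of the idea, using plain values rather than the Cassandra driver's actual types (the class and method names here are hypothetical): Guava's {{Futures.transform}}-style callback chaining maps directly onto Java 8's {{CompletableFuture}}.

```java
import java.util.concurrent.CompletableFuture;

// Illustrative migration: guava's ListenableFuture combinators expressed
// with java.util.concurrent.CompletableFuture from the JDK.
class FutureMigration {
    // guava: Futures.transform(future, String::length)
    // JDK 8 equivalent: thenApply on a CompletableFuture.
    static CompletableFuture<Integer> lengthOf(CompletableFuture<String> f) {
        return f.thenApply(String::length);
    }

    public static void main(String[] args) {
        System.out.println(lengthOf(CompletableFuture.completedFuture("flink")).join());
    }
}
```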





[jira] [Created] (FLINK-7743) Remove the restriction of minimum memory of JM

2017-09-30 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-7743:
-

 Summary: Remove the restriction of minimum memory of JM
 Key: FLINK-7743
 URL: https://issues.apache.org/jira/browse/FLINK-7743
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


Per discussion on 
http://mail-archives.apache.org/mod_mbox/flink-user/201709.mbox/%3c4f77255e-1ddb-4e99-a667-73941b110...@apache.org%3E

It might be great to remove the restriction of the minimum heap size of the JM.





[jira] [Commented] (FLINK-7594) Add a SQL CLI client

2017-09-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156221#comment-16156221
 ] 

Haohui Mai commented on FLINK-7594:
---

We internally have a project (AthenaX) that addresses this requirement, and we 
are in the process of open-sourcing it. We would be happy to contribute it 
directly to the Flink repository as well.


> Add a SQL CLI client
> 
>
> Key: FLINK-7594
> URL: https://issues.apache.org/jira/browse/FLINK-7594
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> At the moment a user can only specify queries within a Java/Scala program 
> which is nice for integrating table programs or parts of it with DataSet or 
> DataStream API. With more connectors coming up, it is time to also provide a 
> programming-free SQL client. The SQL client should consist of a CLI interface 
> and maybe also a REST API. The concrete design is still up for discussion.





[jira] [Commented] (FLINK-6563) Expose time indicator attributes in the KafkaTableSource

2017-09-05 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154348#comment-16154348
 ] 

Haohui Mai commented on FLINK-6563:
---

Thanks for the PR. I'll take a look later today.

> Expose time indicator attributes in the KafkaTableSource
> 
>
> Key: FLINK-6563
> URL: https://issues.apache.org/jira/browse/FLINK-6563
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Critical
> Fix For: 1.4.0
>
>
> This is a follow-up to FLINK-5884, which requires the {{TableSource}} 
> interfaces to expose the processing time and the event time of the data 
> stream. This JIRA proposes to expose these two attributes in the Kafka table 
> source.





[jira] [Commented] (FLINK-7398) Table API operators/UDFs must not store Logger

2017-08-21 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136280#comment-16136280
 ] 

Haohui Mai commented on FLINK-7398:
---

+1 on logging trait. I'll submit a PR.

Adding a checkstyle rule is also a good idea.
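The underlying problem can be reproduced without Flink at all. The sketch below (hypothetical class names, java.util.logging standing in for slf4j) shows that an operator holding a {{Logger}} in an instance field fails Java serialization, while a static logger, which is the effect a logging trait achieves by keeping the reference out of the serialized instance, works fine:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.logging.Logger;

// Framework-free demo of why operators must not store a Logger in an
// instance field: the Logger is dragged into the serialized form.
class LoggerSerialization {
    static class BadOperator implements Serializable {
        // Instance field: serialized with the operator (Logger is not
        // Serializable, so writeObject throws NotSerializableException).
        final Logger log = Logger.getLogger("bad");
    }

    static class GoodOperator implements Serializable {
        // Static field: never part of the serialized instance state.
        static final Logger LOG = Logger.getLogger("good");
    }

    // Returns true iff the object survives Java serialization.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream oos =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            return false; // e.g. NotSerializableException for the Logger field
        }
    }

    public static void main(String[] args) {
        System.out.println("bad:  " + isSerializable(new BadOperator()));
        System.out.println("good: " + isSerializable(new GoodOperator()));
    }
}
```

A checkstyle (or scalastyle) rule banning non-static logger fields would catch regressions of exactly this pattern at build time.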


> Table API operators/UDFs must not store Logger
> --
>
> Key: FLINK-7398
> URL: https://issues.apache.org/jira/browse/FLINK-7398
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Aljoscha Krettek
>Assignee: Haohui Mai
>Priority: Blocker
> Fix For: 1.4.0, 1.3.3
>
> Attachments: Example.png
>
>
> Table API operators and UDFs store a reference to the (slf4j) {{Logger}} in 
> an instance field (c.f. 
> https://github.com/apache/flink/blob/f37988c19adc30d324cde83c54f2fa5d36efb9e7/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/FlatMapRunner.scala#L39).
>  This means that the {{Logger}} will be serialised with the UDF and sent to 
> the cluster. This in itself does not sound right and leads to problems when 
> the slf4j configuration on the Client is different from the cluster 
> environment.
> This is an example of a user running into that problem: 
> https://lists.apache.org/thread.html/01dd44007c0122d60c3fd2b2bb04fd2d6d2114bcff1e34d1d2079522@%3Cuser.flink.apache.org%3E.
>  Here, they have Logback on the client but the Logback classes are not 
> available on the cluster and so deserialisation of the UDFs fails with a 
> {{ClassNotFoundException}}.
> This is a rough list of the involved classes:
> {code}
> src/main/scala/org/apache/flink/table/catalog/ExternalCatalogSchema.scala:43: 
>  private val LOG: Logger = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/catalog/ExternalTableSourceUtil.scala:45:
>   private val LOG: Logger = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/codegen/calls/BuiltInMethods.scala:28:  
> val LOG10 = Types.lookupMethod(classOf[Math], "log10", classOf[Double])
> src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamGroupAggregate.scala:62:
>   private val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamGroupWindowAggregate.scala:59:
>   private val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamOverAggregate.scala:51:
>   private val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/plan/nodes/FlinkConventions.scala:28:  
> val LOGICAL: Convention = new Convention.Impl("LOGICAL", 
> classOf[FlinkLogicalRel])
> src/main/scala/org/apache/flink/table/plan/rules/FlinkRuleSets.scala:38:  val 
> LOGICAL_OPT_RULES: RuleSet = RuleSets.ofList(
> src/main/scala/org/apache/flink/table/runtime/aggregate/AggregateAggFunction.scala:36:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetAggFunction.scala:43:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetFinalAggFunction.scala:44:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetPreAggFunction.scala:44:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSessionWindowAggregatePreProcessor.scala:55:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSessionWindowAggReduceGroupFunction.scala:66:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSlideWindowAggReduceGroupFunction.scala:56:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSlideTimeWindowAggReduceGroupFunction.scala:64:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetTumbleCountWindowAggReduceGroupFunction.scala:46:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetWindowAggMapFunction.scala:52:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetTumbleTimeWindowAggReduceGroupFunction.scala:55:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/GroupAggProcessFunction.scala:48:
>   val LOG: Logger = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/ProcTimeBoundedRowsOver.scala:66:
>   val 

[jira] [Commented] (FLINK-6692) The flink-dist jar contains unshaded netty jar

2017-08-14 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125856#comment-16125856
 ] 

Haohui Mai commented on FLINK-6692:
---

Sounds good. Thanks for the effort!

> The flink-dist jar contains unshaded netty jar
> --
>
> Key: FLINK-6692
> URL: https://issues.apache.org/jira/browse/FLINK-6692
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.4.0
>
>
> The {{flink-dist}} jar contains unshaded netty 3 and netty 4 classes:
> {noformat}
> io/netty/handler/codec/http/router/
> io/netty/handler/codec/http/router/BadClientSilencer.class
> io/netty/handler/codec/http/router/MethodRouted.class
> io/netty/handler/codec/http/router/Handler.class
> io/netty/handler/codec/http/router/Router.class
> io/netty/handler/codec/http/router/DualMethodRouter.class
> io/netty/handler/codec/http/router/Routed.class
> io/netty/handler/codec/http/router/AbstractHandler.class
> io/netty/handler/codec/http/router/KeepAliveWrite.class
> io/netty/handler/codec/http/router/DualAbstractHandler.class
> io/netty/handler/codec/http/router/MethodRouter.class
> {noformat}
> {noformat}
> org/jboss/netty/util/internal/jzlib/InfBlocks.class
> org/jboss/netty/util/internal/jzlib/InfCodes.class
> org/jboss/netty/util/internal/jzlib/InfTree.class
> org/jboss/netty/util/internal/jzlib/Inflate$1.class
> org/jboss/netty/util/internal/jzlib/Inflate.class
> org/jboss/netty/util/internal/jzlib/JZlib$WrapperType.class
> org/jboss/netty/util/internal/jzlib/JZlib.class
> org/jboss/netty/util/internal/jzlib/StaticTree.class
> org/jboss/netty/util/internal/jzlib/Tree.class
> org/jboss/netty/util/internal/jzlib/ZStream$1.class
> org/jboss/netty/util/internal/jzlib/ZStream.class
> {noformat}
> Is this expected behavior?





[jira] [Resolved] (FLINK-6692) The flink-dist jar contains unshaded netty jar

2017-08-14 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved FLINK-6692.
---
Resolution: Fixed

> The flink-dist jar contains unshaded netty jar
> --
>
> Key: FLINK-6692
> URL: https://issues.apache.org/jira/browse/FLINK-6692
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.4.0
>
>
> The {{flink-dist}} jar contains unshaded netty 3 and netty 4 classes:
> {noformat}
> io/netty/handler/codec/http/router/
> io/netty/handler/codec/http/router/BadClientSilencer.class
> io/netty/handler/codec/http/router/MethodRouted.class
> io/netty/handler/codec/http/router/Handler.class
> io/netty/handler/codec/http/router/Router.class
> io/netty/handler/codec/http/router/DualMethodRouter.class
> io/netty/handler/codec/http/router/Routed.class
> io/netty/handler/codec/http/router/AbstractHandler.class
> io/netty/handler/codec/http/router/KeepAliveWrite.class
> io/netty/handler/codec/http/router/DualAbstractHandler.class
> io/netty/handler/codec/http/router/MethodRouter.class
> {noformat}
> {noformat}
> org/jboss/netty/util/internal/jzlib/InfBlocks.class
> org/jboss/netty/util/internal/jzlib/InfCodes.class
> org/jboss/netty/util/internal/jzlib/InfTree.class
> org/jboss/netty/util/internal/jzlib/Inflate$1.class
> org/jboss/netty/util/internal/jzlib/Inflate.class
> org/jboss/netty/util/internal/jzlib/JZlib$WrapperType.class
> org/jboss/netty/util/internal/jzlib/JZlib.class
> org/jboss/netty/util/internal/jzlib/StaticTree.class
> org/jboss/netty/util/internal/jzlib/Tree.class
> org/jboss/netty/util/internal/jzlib/ZStream$1.class
> org/jboss/netty/util/internal/jzlib/ZStream.class
> {noformat}
> Is it an expected behavior?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7398) Table API operators/UDFs must not store Logger

2017-08-10 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121265#comment-16121265
 ] 

Haohui Mai commented on FLINK-7398:
---

Putting it in a companion object is a viable option. My worry, however, is 
that the bug will come back, as it is nontrivial to spot these usages reliably.

Rewriting a lot of code just to fix this issue does not seem very productive. 
[~jark], are there any additional benefits of reimplementing the runtime in 
Java that we might not be aware of?


[jira] [Commented] (FLINK-7398) Table API operators/UDFs must not store Logger

2017-08-09 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120536#comment-16120536
 ] 

Haohui Mai commented on FLINK-7398:
---

Good catch!

I think we can fix it once, but I'm more worried that it will become a 
recurring issue if we don't have a way to reliably detect it.


[jira] [Assigned] (FLINK-7398) Table API operators/UDFs must not store Logger

2017-08-09 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-7398:
-

Assignee: Haohui Mai

> Table API operators/UDFs must not store Logger
> --
>
> Key: FLINK-7398
> URL: https://issues.apache.org/jira/browse/FLINK-7398
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.4.0, 1.3.2
>Reporter: Aljoscha Krettek
>Assignee: Haohui Mai
>Priority: Blocker
> Fix For: 1.4.0, 1.3.3
>
>
> Table API operators and UDFs store a reference to the (slf4j) {{Logger}} in 
> an instance field (c.f. 
> https://github.com/apache/flink/blob/f37988c19adc30d324cde83c54f2fa5d36efb9e7/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/FlatMapRunner.scala#L39).
>  This means that the {{Logger}} will be serialised with the UDF and sent to 
> the cluster. This in itself does not sound right and leads to problems when 
> the slf4j configuration on the Client is different from the cluster 
> environment.
> This is an example of a user running into that problem: 
> https://lists.apache.org/thread.html/01dd44007c0122d60c3fd2b2bb04fd2d6d2114bcff1e34d1d2079522@%3Cuser.flink.apache.org%3E.
>  Here, they have Logback on the client but the Logback classes are not 
> available on the cluster and so deserialisation of the UDFs fails with a 
> {{ClassNotFoundException}}.
> This is a rough list of the involved classes:
> {code}
> src/main/scala/org/apache/flink/table/catalog/ExternalCatalogSchema.scala:43: 
>  private val LOG: Logger = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/catalog/ExternalTableSourceUtil.scala:45:
>   private val LOG: Logger = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/codegen/calls/BuiltInMethods.scala:28:  
> val LOG10 = Types.lookupMethod(classOf[Math], "log10", classOf[Double])
> src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamGroupAggregate.scala:62:
>   private val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamGroupWindowAggregate.scala:59:
>   private val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamOverAggregate.scala:51:
>   private val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/plan/nodes/FlinkConventions.scala:28:  
> val LOGICAL: Convention = new Convention.Impl("LOGICAL", 
> classOf[FlinkLogicalRel])
> src/main/scala/org/apache/flink/table/plan/rules/FlinkRuleSets.scala:38:  val 
> LOGICAL_OPT_RULES: RuleSet = RuleSets.ofList(
> src/main/scala/org/apache/flink/table/runtime/aggregate/AggregateAggFunction.scala:36:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetAggFunction.scala:43:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetFinalAggFunction.scala:44:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetPreAggFunction.scala:44:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSessionWindowAggregatePreProcessor.scala:55:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSessionWindowAggReduceGroupFunction.scala:66:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSlideWindowAggReduceGroupFunction.scala:56:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSlideTimeWindowAggReduceGroupFunction.scala:64:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetTumbleCountWindowAggReduceGroupFunction.scala:46:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetWindowAggMapFunction.scala:52:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetTumbleTimeWindowAggReduceGroupFunction.scala:55:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/GroupAggProcessFunction.scala:48:
>   val LOG: Logger = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/ProcTimeBoundedRowsOver.scala:66:
>   val LOG = LoggerFactory.getLogger(this.getClass)
> src/main/scala/org/apache/flink/table/runtime/aggregate/ProcTimeBoundedRangeOver.scala:61:
>   val LOG 
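The usual fix for this class of problem is to keep the logger out of the object's serialized state: a {{static}} field in Java, or a {{@transient lazy val}} in Scala. A minimal stand-alone sketch of why the instance field is problematic (plain Java serialization, with a made-up {{FakeLogger}} standing in for a non-serializable slf4j {{Logger}} implementation; this is an illustration, not the issue's eventual patch):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class LoggerSerializationDemo {
    // Stand-in for an slf4j Logger implementation, which is typically not Serializable.
    static class FakeLogger {
        void info(String msg) { System.out.println(msg); }
    }

    // Problematic pattern: the logger lives in an instance field,
    // so Java serialization tries to ship it with the UDF.
    static class BadUdf implements Serializable {
        private final FakeLogger log = new FakeLogger();
    }

    // Fixed pattern: a static (or transient) field is not part of the
    // serialized state, so only the logger class must exist on the cluster.
    static class GoodUdf implements Serializable {
        private static final FakeLogger LOG = new FakeLogger();
    }

    static String trySerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return "serialized";
        } catch (NotSerializableException e) {
            return "not-serializable";
        } catch (IOException e) {
            return "io-error";
        }
    }

    public static void main(String[] args) {
        System.out.println("BadUdf:  " + trySerialize(new BadUdf()));   // not-serializable
        System.out.println("GoodUdf: " + trySerialize(new GoodUdf()));  // serialized
    }
}
```

Only {{BadUdf}} fails: serialization walks its instance fields and hits the non-serializable logger, while the static field of {{GoodUdf}} is never serialized at all.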

[jira] [Commented] (FLINK-7357) HOP_START() HOP_END() does not work when using HAVING clause with GROUP BY HOP window

2017-08-09 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120511#comment-16120511
 ] 

Haohui Mai commented on FLINK-7357:
---

I haven't started yet. Please go ahead.

> HOP_START() HOP_END() does not work when using HAVING clause with GROUP BY 
> HOP window
> -
>
> Key: FLINK-7357
> URL: https://issues.apache.org/jira/browse/FLINK-7357
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.1
>Reporter: Rong Rong
>Assignee: Haohui Mai
>
> The following SQL does not compile:
> {code:title=invalid_having_hop_start_sql}
> SELECT 
>   c AS k, 
>   COUNT(a) AS v, 
>   HOP_START(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE) AS 
> windowStart, 
>   HOP_END(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE) AS windowEnd 
> FROM 
>   T1 
> GROUP BY 
>   HOP(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE), 
>   c 
> HAVING 
>   SUM(b) > 1
> {code}
> Keeping either the HAVING clause or the HOP_START field individually, the 
> query compiles and runs without issue.
> more details: 
> https://github.com/apache/flink/compare/master...walterddr:having_does_not_work_with_hop_start_end



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-7159) Semantics of OVERLAPS in Table API diverge from the SQL standard

2017-08-08 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved FLINK-7159.
---
Resolution: Fixed

This has been resolved as a part of FLINK-6429.

> Semantics of OVERLAPS in Table API diverge from the SQL standard
> 
>
> Key: FLINK-7159
> URL: https://issues.apache.org/jira/browse/FLINK-7159
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> According to http://web.cecs.pdx.edu/~len/sql1999.pdf
> ISO/IEC 9075-2:1999 (E) ©ISO/IEC, 8.12 
> {noformat}
> The result of the <overlaps predicate> is the result of the following 
> expression:
> ( S1 > S2 AND NOT ( S1 >= T2 AND T1 >= T2 ) )
> OR
> ( S2 > S1 AND NOT ( S2 >= T1 AND T2 >= T1 ) )
> OR
> ( S1 = S2 AND ( T1 <> T2 OR T1 = T2 ) )
> {noformat}
> The Table API diverges from this semantic.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7051) Bump up Calcite version to 1.14

2017-08-08 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-7051:
-

Assignee: Haohui Mai

> Bump up Calcite version to 1.14
> ---
>
> Key: FLINK-7051
> URL: https://issues.apache.org/jira/browse/FLINK-7051
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.14 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7392) Enable more predicate push-down in joins

2017-08-08 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-7392:
-

 Summary: Enable more predicate push-down in joins
 Key: FLINK-7392
 URL: https://issues.apache.org/jira/browse/FLINK-7392
 Project: Flink
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai


This is a follow-up of FLINK-6429.

As a quick workaround to prevent pushing down projections for time indicators, 
FLINK-6429 reverts the behavior of {{ProjectJoinTransposeRule}} back to the one 
in Calcite 1.12.

As [~jark] suggested in FLINK-6429, we can selectively disable the push down 
for time indicators in {{ProjectJoinTransposeRule}}. This jira tracks the 
effort of implementing the suggestion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6429) Bump up Calcite version to 1.13

2017-08-07 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117500#comment-16117500
 ] 

Haohui Mai commented on FLINK-6429:
---

bq. In order to change the default behavior of the rule, the only thing we need 
to do is to create a new `ProjectJoinTransposeRule` with a custom 
`PushProjector.ExprCondition` which filters out time indicator rex nodes.

That's a good suggestion. I created the rules based on Calcite 1.12 in the 
current PR (hopefully unblocking other changes). I'll incorporate the suggestion 
in a later PR.

> Bump up Calcite version to 1.13
> ---
>
> Key: FLINK-6429
> URL: https://issues.apache.org/jira/browse/FLINK-6429
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.13 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6429) Bump up Calcite version to 1.13

2017-08-07 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117328#comment-16117328
 ] 

Haohui Mai commented on FLINK-6429:
---

It's difficult to say this is a pure Calcite bug. Technically SQL allows 
special characters like '-' in the column name while the Table API rejects them.

> Bump up Calcite version to 1.13
> ---
>
> Key: FLINK-6429
> URL: https://issues.apache.org/jira/browse/FLINK-6429
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.13 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-7357) HOP_START() HOP_END() does not work when using HAVING clause with GROUP BY HOP window

2017-08-03 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-7357:
-

Assignee: Haohui Mai

> HOP_START() HOP_END() does not work when using HAVING clause with GROUP BY 
> HOP window
> -
>
> Key: FLINK-7357
> URL: https://issues.apache.org/jira/browse/FLINK-7357
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.1
>Reporter: Rong Rong
>Assignee: Haohui Mai
>
> The following SQL does not compile:
> {code:title=invalid_having_hop_start_sql}
> SELECT 
>   c AS k, 
>   COUNT(a) AS v, 
>   HOP_START(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE) AS 
> windowStart, 
>   HOP_END(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE) AS windowEnd 
> FROM 
>   T1 
> GROUP BY 
>   HOP(rowtime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE), 
>   c 
> HAVING 
>   SUM(b) > 1
> {code}
> Keeping either the HAVING clause or the HOP_START field individually, the 
> query compiles and runs without issue.
> more details: 
> https://github.com/apache/flink/compare/master...walterddr:having_does_not_work_with_hop_start_end



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7344) Migrate usage of joda-time to the Java 8 DateTime API

2017-08-02 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-7344:
-

 Summary: Migrate usage of joda-time to the Java 8 DateTime API
 Key: FLINK-7344
 URL: https://issues.apache.org/jira/browse/FLINK-7344
 Project: Flink
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai


As the minimum Java version of Flink has been upgraded to 1.8, it is a good 
time to migrate all usage of the joda-time package to the native Java 8 
DateTime API.
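For illustration only (these mappings are not part of the issue text), the common joda-time types have direct java.time counterparts, so most call sites translate mechanically:

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class JodaToJavaTime {
    public static void main(String[] args) {
        // org.joda.time.DateTime  ->  java.time.ZonedDateTime / Instant
        ZonedDateTime epoch = Instant.ofEpochSecond(0).atZone(ZoneOffset.UTC);

        // org.joda.time.LocalDate ->  java.time.LocalDate
        LocalDate date = LocalDate.of(2017, 8, 2);

        // org.joda.time.format.DateTimeFormat -> java.time.format.DateTimeFormatter
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd");

        System.out.println(fmt.format(date));               // 2017-08-02
        System.out.println(epoch.toInstant().toEpochMilli()); // 0
    }
}
```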



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7237) Remove DateTimeUtils from Flink once Calcite is upgraded to 1.14

2017-07-19 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-7237:
-

 Summary: Remove DateTimeUtils from Flink once Calcite is upgraded 
to 1.14
 Key: FLINK-7237
 URL: https://issues.apache.org/jira/browse/FLINK-7237
 Project: Flink
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7236) Bump up the Calcite version to 1.14

2017-07-19 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-7236:
-

 Summary: Bump up the Calcite version to 1.14
 Key: FLINK-7236
 URL: https://issues.apache.org/jira/browse/FLINK-7236
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


This is the umbrella task to coordinate tasks to upgrade Calcite to 1.14.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7235) Backport CALCITE-1884 to the Flink repository before Calcite 1.14

2017-07-19 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-7235:
-

 Summary: Backport CALCITE-1884 to the Flink repository before 
Calcite 1.14
 Key: FLINK-7235
 URL: https://issues.apache.org/jira/browse/FLINK-7235
 Project: Flink
  Issue Type: Sub-task
Reporter: Haohui Mai
Assignee: Haohui Mai


We need to backport CALCITE-1884 in order to unblock upgrading Calcite to 1.13.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6429) Bump up Calcite version to 1.13

2017-07-18 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16091283#comment-16091283
 ] 

Haohui Mai commented on FLINK-6429:
---

Sorry for the delay. There are several issues that need to be addressed in the 
upgrade. I'll work on them first. I also have a WIP that I will upload shortly.

> Bump up Calcite version to 1.13
> ---
>
> Key: FLINK-6429
> URL: https://issues.apache.org/jira/browse/FLINK-6429
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.13 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (FLINK-6429) Bump up Calcite version to 1.13

2017-07-18 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-6429:
-

Assignee: Haohui Mai  (was: Timo Walther)

> Bump up Calcite version to 1.13
> ---
>
> Key: FLINK-6429
> URL: https://issues.apache.org/jira/browse/FLINK-6429
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.13 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7159) Semantics of OVERLAPS in Table API diverge from the SQL standard

2017-07-11 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-7159:
-

 Summary: Semantics of OVERLAPS in Table API diverge from the SQL 
standard
 Key: FLINK-7159
 URL: https://issues.apache.org/jira/browse/FLINK-7159
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


According to http://web.cecs.pdx.edu/~len/sql1999.pdf

ISO/IEC 9075-2:1999 (E) ©ISO/IEC, 8.12 

{noformat}
The result of the <overlaps predicate> is the result of the following 
expression:
( S1 > S2 AND NOT ( S1 >= T2 AND T1 >= T2 ) )
OR
( S2 > S1 AND NOT ( S2 >= T1 AND T2 >= T1 ) )
OR
( S1 = S2 AND ( T1 <> T2 OR T1 = T2 ) )
{noformat}

The Table API diverges from this semantic.
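The standard's expression can be checked mechanically; the following is a direct transcription in plain Java (not Flink code), with each period given as an integer pair (S, T):

```java
public class OverlapsCheck {
    // Direct transcription of the SQL:1999 expression for
    // (S1, T1) OVERLAPS (S2, T2).
    static boolean overlaps(int s1, int t1, int s2, int t2) {
        return (s1 > s2 && !(s1 >= t2 && t1 >= t2))
            || (s2 > s1 && !(s2 >= t1 && t2 >= t1))
            // The last conjunct is always true; kept verbatim from the standard.
            || (s1 == s2 && (t1 != t2 || t1 == t2));
    }

    public static void main(String[] args) {
        System.out.println(overlaps(1, 3, 2, 4)); // true: the periods share (2, 3)
        System.out.println(overlaps(1, 2, 2, 3)); // false: merely adjacent periods do not overlap
    }
}
```

Note that under these semantics, adjacent periods such as (1, 2) and (2, 3) do not overlap, which is one place where an ad-hoc interval check can diverge from the standard.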



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6429) Bump up Calcite version to 1.13

2017-07-11 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16083277#comment-16083277
 ] 

Haohui Mai commented on FLINK-6429:
---

[~twalthr] that would be great. I found a few issues and I'm adding them to this 
jira.

> Bump up Calcite version to 1.13
> ---
>
> Key: FLINK-6429
> URL: https://issues.apache.org/jira/browse/FLINK-6429
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.13 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6429) Bump up Calcite version to 1.13

2017-07-11 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081820#comment-16081820
 ] 

Haohui Mai commented on FLINK-6429:
---

I have a patch and will upload it shortly.

> Bump up Calcite version to 1.13
> ---
>
> Key: FLINK-6429
> URL: https://issues.apache.org/jira/browse/FLINK-6429
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> This is an umbrella issue for all tasks that need to be done once Apache 
> Calcite 1.13 is released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-6568) flink-table doesn't work without flink-streaming-scala dependency

2017-06-16 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai closed FLINK-6568.
-
Resolution: Duplicate

> flink-table doesn't work without flink-streaming-scala dependency
> -
>
> Key: FLINK-6568
> URL: https://issues.apache.org/jira/browse/FLINK-6568
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation, Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Haohui Mai
> Fix For: 1.4.0
>
>
> In my user application, I got errors because I didn't have the 
> flink-streaming-scala dependency defined in my user code (and I'm using Java).
> The documentation should be updated or flink-streaming-scala should not be a 
> provided dependency.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-6749) Table API / SQL Docs: SQL Page

2017-06-12 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved FLINK-6749.
---
Resolution: Fixed

> Table API / SQL Docs: SQL Page
> --
>
> Key: FLINK-6749
> URL: https://issues.apache.org/jira/browse/FLINK-6749
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Fabian Hueske
>Assignee: Haohui Mai
>
> Update and refine {{./docs/dev/table/sql.md}} in feature branch 
> https://github.com/apache/flink/tree/tableDocs



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-6693) Support DATE_FORMAT function in the Table / SQL API

2017-06-05 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6693:
--
Summary: Support DATE_FORMAT function in the Table / SQL API  (was: Support 
DATE_FORMAT function in the SQL API)

> Support DATE_FORMAT function in the Table / SQL API
> ---
>
> Key: FLINK-6693
> URL: https://issues.apache.org/jira/browse/FLINK-6693
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> It would be quite handy to support the {{DATE_FORMAT}} function in Flink to 
> support various date / time related operations:
> The specification of the {{DATE_FORMAT}} function can be found in 
> https://prestodb.io/docs/current/functions/datetime.html.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6810) Add Some built-in Scalar Function supported

2017-06-05 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037492#comment-16037492
 ] 

Haohui Mai commented on FLINK-6810:
---

Thanks for creating these jiras. They are quite useful.

Is it possible to share the timeline of when they will be implemented? Thanks.

> Add Some built-in Scalar Function supported
> ---
>
> Key: FLINK-6810
> URL: https://issues.apache.org/jira/browse/FLINK-6810
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: sunjincheng
>
> This JIRA will create sub-tasks for adding specific scalar functions, 
> such as the mathematical function {{LOG}}, date functions {{DATEADD}}, 
> string functions {{LPAD}}, etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (FLINK-5252) Migrate all YARN related configurations to ConfigOptions

2017-06-05 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-5252:
-

Assignee: Haohui Mai

> Migrate all YARN related configurations to ConfigOptions
> 
>
> Key: FLINK-5252
> URL: https://issues.apache.org/jira/browse/FLINK-5252
> Project: Flink
>  Issue Type: Sub-task
>  Components: YARN
> Environment: {{flip-6}} feature branch
>Reporter: Stephan Ewen
>Assignee: Haohui Mai
>  Labels: flip-6
>
> For the past few months, we have been using a more elaborate way of defining 
> configuration options, together with default values and deprecated keys.
> The options are defined via the {{ConfigOptions}} class. A good example of how 
> to use it is the {{HighAvailabilityOptions}} class.
> All YARN configuration options should be defined that way as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-6780) ExternalTableSource should add time attributes in the row type

2017-05-31 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6780:
--
Summary: ExternalTableSource should add time attributes in the row type  
(was: ExternalTableSource fails to add the processing time and the event time 
attribute in the row type)

> ExternalTableSource should add time attributes in the row type
> --
>
> Key: FLINK-6780
> URL: https://issues.apache.org/jira/browse/FLINK-6780
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Critical
>
> We observed that all streaming queries that refer to external tables fail 
> when the Volcano planner converts {{LogicalTableScan}} to 
> {{FlinkLogicalTableSourceScan}}:
> {noformat}
> Type mismatch:
> rowtype of new rel:
> RecordType(, TIMESTAMP(3) NOT NULL proctime) NOT NULL
> rowtype of set:
> RecordType(, ...) NOT NULL
> {noformat}
> Tables that are registered through 
> {{StreamTableEnvironment#registerTableSource()}} do not suffer from this 
> problem as {{StreamTableSourceTable}} adds the processing time / event time 
> attribute automatically.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5354) Split up Table API documentation into multiple pages

2017-05-30 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030728#comment-16030728
 ] 

Haohui Mai commented on FLINK-5354:
---

Thanks [~fhueske]. I will work on the SQL page.

> Split up Table API documentation into multiple pages 
> -
>
> Key: FLINK-5354
> URL: https://issues.apache.org/jira/browse/FLINK-5354
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Table API & SQL
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> The Table API documentation page is quite large at the moment. We should 
> split it up into multiple pages:
> Here is my suggestion:
> - Overview (Datatypes, Config, Registering Tables, Examples)
> - TableSources and Sinks
> - Table API
> - SQL
> - Functions



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (FLINK-6749) Table API / SQL Docs: SQL Page

2017-05-30 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-6749:
-

Assignee: Haohui Mai

> Table API / SQL Docs: SQL Page
> --
>
> Key: FLINK-6749
> URL: https://issues.apache.org/jira/browse/FLINK-6749
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Fabian Hueske
>Assignee: Haohui Mai
>
> Update and refine {{./docs/dev/table/sql.md}} in feature branch 
> https://github.com/apache/flink/tree/tableDocs



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6780) ExternalTableSource fails to add the processing time and the event time attribute in the row type

2017-05-30 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6780:
-

 Summary: ExternalTableSource fails to add the processing time and 
the event time attribute in the row type
 Key: FLINK-6780
 URL: https://issues.apache.org/jira/browse/FLINK-6780
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
Priority: Critical


We observed that all streaming queries that refer to external tables fail when 
the Volcano planner converts {{LogicalTableScan}} to 
{{FlinkLogicalTableSourceScan}}:

{noformat}
Type mismatch:
rowtype of new rel:
RecordType(, TIMESTAMP(3) NOT NULL proctime) NOT NULL
rowtype of set:
RecordType(, ...) NOT NULL
{noformat}

Tables that are registered through 
{{StreamTableEnvironment#registerTableSource()}} do not suffer from this 
problem as {{StreamTableSourceTable}} adds the processing time / event time 
attribute automatically.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6692) The flink-dist jar contains unshaded netty jar

2017-05-24 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023587#comment-16023587
 ] 

Haohui Mai commented on FLINK-6692:
---

The issue also presents in Flink 1.2.0.

> The flink-dist jar contains unshaded netty jar
> --
>
> Key: FLINK-6692
> URL: https://issues.apache.org/jira/browse/FLINK-6692
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> The {{flink-dist}} jar contains unshaded netty 3 and netty 4 classes:
> {noformat}
> io/netty/handler/codec/http/router/
> io/netty/handler/codec/http/router/BadClientSilencer.class
> io/netty/handler/codec/http/router/MethodRouted.class
> io/netty/handler/codec/http/router/Handler.class
> io/netty/handler/codec/http/router/Router.class
> io/netty/handler/codec/http/router/DualMethodRouter.class
> io/netty/handler/codec/http/router/Routed.class
> io/netty/handler/codec/http/router/AbstractHandler.class
> io/netty/handler/codec/http/router/KeepAliveWrite.class
> io/netty/handler/codec/http/router/DualAbstractHandler.class
> io/netty/handler/codec/http/router/MethodRouter.class
> {noformat}
> {noformat}
> org/jboss/netty/util/internal/jzlib/InfBlocks.class
> org/jboss/netty/util/internal/jzlib/InfCodes.class
> org/jboss/netty/util/internal/jzlib/InfTree.class
> org/jboss/netty/util/internal/jzlib/Inflate$1.class
> org/jboss/netty/util/internal/jzlib/Inflate.class
> org/jboss/netty/util/internal/jzlib/JZlib$WrapperType.class
> org/jboss/netty/util/internal/jzlib/JZlib.class
> org/jboss/netty/util/internal/jzlib/StaticTree.class
> org/jboss/netty/util/internal/jzlib/Tree.class
> org/jboss/netty/util/internal/jzlib/ZStream$1.class
> org/jboss/netty/util/internal/jzlib/ZStream.class
> {noformat}
> Is this expected behavior?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (FLINK-6693) Support DATE_FORMAT function in the SQL API

2017-05-23 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6693:
-

 Summary: Support DATE_FORMAT function in the SQL API
 Key: FLINK-6693
 URL: https://issues.apache.org/jira/browse/FLINK-6693
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Reporter: Haohui Mai
Assignee: Haohui Mai


It would be quite handy to support the {{DATE_FORMAT}} function in Flink to 
support various date / time related operations:

The specification of the {{DATE_FORMAT}} function can be found in 
https://prestodb.io/docs/current/functions/datetime.html.
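As a rough sketch of what such a function does (this is not the Flink implementation, and the token table is deliberately incomplete), Presto's {{date_format}} uses MySQL-style {{%}} tokens, which map onto java.time patterns:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class DateFormatSketch {
    // Hypothetical helper translating a few MySQL-style DATE_FORMAT tokens
    // (as used by Presto) into a java.time pattern. Token coverage is
    // intentionally incomplete.
    static String dateFormat(LocalDateTime ts, String mysqlPattern) {
        String javaPattern = mysqlPattern
            .replace("%Y", "yyyy")  // 4-digit year
            .replace("%m", "MM")    // 2-digit month
            .replace("%d", "dd")    // 2-digit day of month
            .replace("%H", "HH")    // hour 00-23
            .replace("%i", "mm")    // minutes
            .replace("%s", "ss");   // seconds
        return DateTimeFormatter.ofPattern(javaPattern).format(ts);
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2017, 5, 23, 9, 30, 5);
        System.out.println(dateFormat(ts, "%Y-%m-%d %H:%i:%s")); // 2017-05-23 09:30:05
    }
}
```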



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (FLINK-6672) Support CAST(timestamp AS BIGINT)

2017-05-23 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-6672:
-

Assignee: Haohui Mai

> Support CAST(timestamp AS BIGINT)
> -
>
> Key: FLINK-6672
> URL: https://issues.apache.org/jira/browse/FLINK-6672
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Timo Walther
>Assignee: Haohui Mai
>
> It is not possible to cast a TIMESTAMP, TIME, or DATE to BIGINT, INT, or INT, 
> respectively, in SQL. The Table API and the code generation support this, but 
> the SQL validation seems to prohibit it.
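The intended semantics is essentially an epoch conversion; sketched here with java.time rather than Flink internals (whether the cast yields seconds or milliseconds is an implementation choice, and seconds are used purely for illustration):

```java
import java.time.Instant;
import java.time.LocalTime;

public class TimestampCastSketch {
    public static void main(String[] args) {
        // CAST(timestamp AS BIGINT): count since the epoch (seconds here).
        Instant ts = Instant.parse("1970-01-01T00:00:10Z");
        long asBigint = ts.getEpochSecond();                    // 10

        // CAST(time AS INT): seconds since midnight.
        int timeAsInt = LocalTime.of(0, 1, 40).toSecondOfDay(); // 100

        System.out.println(asBigint + " " + timeAsInt);
    }
}
```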



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (FLINK-6692) The flink-dist jar contains unshaded netty jar

2017-05-23 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6692:
--
Summary: The flink-dist jar contains unshaded netty jar  (was: The 
flink-dist jar contains unshaded nettyjar)

> The flink-dist jar contains unshaded netty jar
> --
>
> Key: FLINK-6692
> URL: https://issues.apache.org/jira/browse/FLINK-6692
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> The {{flink-dist}} jar contains unshaded netty 3 and netty 4 classes:
> {noformat}
> io/netty/handler/codec/http/router/
> io/netty/handler/codec/http/router/BadClientSilencer.class
> io/netty/handler/codec/http/router/MethodRouted.class
> io/netty/handler/codec/http/router/Handler.class
> io/netty/handler/codec/http/router/Router.class
> io/netty/handler/codec/http/router/DualMethodRouter.class
> io/netty/handler/codec/http/router/Routed.class
> io/netty/handler/codec/http/router/AbstractHandler.class
> io/netty/handler/codec/http/router/KeepAliveWrite.class
> io/netty/handler/codec/http/router/DualAbstractHandler.class
> io/netty/handler/codec/http/router/MethodRouter.class
> {noformat}
> {noformat}
> org/jboss/netty/util/internal/jzlib/InfBlocks.class
> org/jboss/netty/util/internal/jzlib/InfCodes.class
> org/jboss/netty/util/internal/jzlib/InfTree.class
> org/jboss/netty/util/internal/jzlib/Inflate$1.class
> org/jboss/netty/util/internal/jzlib/Inflate.class
> org/jboss/netty/util/internal/jzlib/JZlib$WrapperType.class
> org/jboss/netty/util/internal/jzlib/JZlib.class
> org/jboss/netty/util/internal/jzlib/StaticTree.class
> org/jboss/netty/util/internal/jzlib/Tree.class
> org/jboss/netty/util/internal/jzlib/ZStream$1.class
> org/jboss/netty/util/internal/jzlib/ZStream.class
> {noformat}
> Is this expected behavior?





[jira] [Created] (FLINK-6692) The flink-dist jar contains unshaded nettyjar

2017-05-23 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6692:
-

 Summary: The flink-dist jar contains unshaded nettyjar
 Key: FLINK-6692
 URL: https://issues.apache.org/jira/browse/FLINK-6692
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 1.3.0


The {{flink-dist}} jar contains unshaded netty 3 and netty 4 classes:

{noformat}
io/netty/handler/codec/http/router/
io/netty/handler/codec/http/router/BadClientSilencer.class
io/netty/handler/codec/http/router/MethodRouted.class
io/netty/handler/codec/http/router/Handler.class
io/netty/handler/codec/http/router/Router.class
io/netty/handler/codec/http/router/DualMethodRouter.class
io/netty/handler/codec/http/router/Routed.class
io/netty/handler/codec/http/router/AbstractHandler.class
io/netty/handler/codec/http/router/KeepAliveWrite.class
io/netty/handler/codec/http/router/DualAbstractHandler.class
io/netty/handler/codec/http/router/MethodRouter.class
{noformat}

{noformat}
org/jboss/netty/util/internal/jzlib/InfBlocks.class
org/jboss/netty/util/internal/jzlib/InfCodes.class
org/jboss/netty/util/internal/jzlib/InfTree.class
org/jboss/netty/util/internal/jzlib/Inflate$1.class
org/jboss/netty/util/internal/jzlib/Inflate.class
org/jboss/netty/util/internal/jzlib/JZlib$WrapperType.class
org/jboss/netty/util/internal/jzlib/JZlib.class
org/jboss/netty/util/internal/jzlib/StaticTree.class
org/jboss/netty/util/internal/jzlib/Tree.class
org/jboss/netty/util/internal/jzlib/ZStream$1.class
org/jboss/netty/util/internal/jzlib/ZStream.class
{noformat}

Is this expected behavior?





[jira] [Updated] (FLINK-6692) The flink-dist jar contains unshaded nettyjar

2017-05-23 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6692:
--
Component/s: Build System

> The flink-dist jar contains unshaded nettyjar
> -
>
> Key: FLINK-6692
> URL: https://issues.apache.org/jira/browse/FLINK-6692
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> The {{flink-dist}} jar contains unshaded netty 3 and netty 4 classes:
> {noformat}
> io/netty/handler/codec/http/router/
> io/netty/handler/codec/http/router/BadClientSilencer.class
> io/netty/handler/codec/http/router/MethodRouted.class
> io/netty/handler/codec/http/router/Handler.class
> io/netty/handler/codec/http/router/Router.class
> io/netty/handler/codec/http/router/DualMethodRouter.class
> io/netty/handler/codec/http/router/Routed.class
> io/netty/handler/codec/http/router/AbstractHandler.class
> io/netty/handler/codec/http/router/KeepAliveWrite.class
> io/netty/handler/codec/http/router/DualAbstractHandler.class
> io/netty/handler/codec/http/router/MethodRouter.class
> {noformat}
> {noformat}
> org/jboss/netty/util/internal/jzlib/InfBlocks.class
> org/jboss/netty/util/internal/jzlib/InfCodes.class
> org/jboss/netty/util/internal/jzlib/InfTree.class
> org/jboss/netty/util/internal/jzlib/Inflate$1.class
> org/jboss/netty/util/internal/jzlib/Inflate.class
> org/jboss/netty/util/internal/jzlib/JZlib$WrapperType.class
> org/jboss/netty/util/internal/jzlib/JZlib.class
> org/jboss/netty/util/internal/jzlib/StaticTree.class
> org/jboss/netty/util/internal/jzlib/Tree.class
> org/jboss/netty/util/internal/jzlib/ZStream$1.class
> org/jboss/netty/util/internal/jzlib/ZStream.class
> {noformat}
> Is this expected behavior?





[jira] [Resolved] (FLINK-6595) Nested SQL queries do not expose proctime / rowtime attributes

2017-05-18 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved FLINK-6595.
---
Resolution: Duplicate

> Nested SQL queries do not expose proctime / rowtime attributes
> --
>
> Key: FLINK-6595
> URL: https://issues.apache.org/jira/browse/FLINK-6595
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> We found that group windows cannot be applied to nested queries 
> out of the box:
> {noformat}
> SELECT * FROM (
>   (SELECT ...)
>   UNION ALL
>   (SELECT ...)
> ) GROUP BY foo, TUMBLE(proctime, ...)
> {noformat}
> Flink complains that {{proctime}} is undefined.





[jira] [Commented] (FLINK-6595) Nested SQL queries do not expose proctime / rowtime attributes

2017-05-18 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016602#comment-16016602
 ] 

Haohui Mai commented on FLINK-6595:
---

This is addressed by FLINK-6483, which exposes the proctime in the row. Closing 
as a duplicate.

> Nested SQL queries do not expose proctime / rowtime attributes
> --
>
> Key: FLINK-6595
> URL: https://issues.apache.org/jira/browse/FLINK-6595
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> We found that group windows cannot be applied to nested queries 
> out of the box:
> {noformat}
> SELECT * FROM (
>   (SELECT ...)
>   UNION ALL
>   (SELECT ...)
> ) GROUP BY foo, TUMBLE(proctime, ...)
> {noformat}
> Flink complains that {{proctime}} is undefined.





[jira] [Commented] (FLINK-6605) Allow users to specify a default name for processing time

2017-05-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013242#comment-16013242
 ] 

Haohui Mai commented on FLINK-6605:
---

Discussed with [~fhueske] offline. The proposed solution is to add a 
configuration in {{TableConfig}} to specify a default column name for 
processing time in order to unblock 1.3.

Note that this jira only covers the case of processing time. The case of event 
time is much more complicated and will be covered in a separate jira.

> Allow users to specify a default name for processing time
> -
>
> Key: FLINK-6605
> URL: https://issues.apache.org/jira/browse/FLINK-6605
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> FLINK-5884 enables users to specify column names for both processing time and 
> event time. FLINK-6595 and FLINK-6584 break, as chained / nested queries will 
> no longer have a processing time / event time attribute.
> This jira proposes to add a default name for the processing time in order to 
> unbreak FLINK-6595 and FLINK-6584.





[jira] [Created] (FLINK-6605) Allow users to specify a default name for processing time

2017-05-16 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6605:
-

 Summary: Allow users to specify a default name for processing time
 Key: FLINK-6605
 URL: https://issues.apache.org/jira/browse/FLINK-6605
 Project: Flink
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai


FLINK-5884 enables users to specify column names for both processing time and 
event time. FLINK-6595 and FLINK-6584 break, as chained / nested queries will 
no longer have a processing time / event time attribute.

This jira proposes to add a default name for the processing time in order to 
unbreak FLINK-6595 and FLINK-6584.





[jira] [Updated] (FLINK-6605) Allow users to specify a default name for processing time

2017-05-16 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6605:
--
Fix Version/s: 1.3.0
  Component/s: Table API & SQL

> Allow users to specify a default name for processing time
> -
>
> Key: FLINK-6605
> URL: https://issues.apache.org/jira/browse/FLINK-6605
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> FLINK-5884 enables users to specify column names for both processing time and 
> event time. FLINK-6595 and FLINK-6584 break, as chained / nested queries will 
> no longer have a processing time / event time attribute.
> This jira proposes to add a default name for the processing time in order to 
> unbreak FLINK-6595 and FLINK-6584.





[jira] [Updated] (FLINK-6595) Nested SQL queries do not expose proctime / rowtime attributes

2017-05-15 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6595:
--
Component/s: Table API & SQL

> Nested SQL queries do not expose proctime / rowtime attributes
> --
>
> Key: FLINK-6595
> URL: https://issues.apache.org/jira/browse/FLINK-6595
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> We found that group windows cannot be applied to nested queries 
> out of the box:
> {noformat}
> SELECT * FROM (
>   (SELECT ...)
>   UNION ALL
>   (SELECT ...)
> ) GROUP BY foo, TUMBLE(proctime, ...)
> {noformat}
> Flink complains that {{proctime}} is undefined.





[jira] [Created] (FLINK-6595) Nested SQL queries do not expose proctime / rowtime attributes

2017-05-15 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6595:
-

 Summary: Nested SQL queries do not expose proctime / rowtime 
attributes
 Key: FLINK-6595
 URL: https://issues.apache.org/jira/browse/FLINK-6595
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 1.3.0


We found that group windows cannot be applied to nested queries 
out of the box:

{noformat}
SELECT * FROM (
  (SELECT ...)
  UNION ALL
  (SELECT ...)
) GROUP BY foo, TUMBLE(proctime, ...)
{noformat}

Flink complains that {{proctime}} is undefined.





[jira] [Updated] (FLINK-6574) Support nested catalogs in ExternalCatalog

2017-05-15 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6574:
--
Summary: Support nested catalogs in ExternalCatalog  (was: External catalog 
should support a single level catalog)

> Support nested catalogs in ExternalCatalog
> --
>
> Key: FLINK-6574
> URL: https://issues.apache.org/jira/browse/FLINK-6574
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Critical
> Fix For: 1.3.0
>
>
> We found that the current external catalog requires three layers of 
> references for any table. For example, the SQL would look like the following 
> when referencing an external table:
> {noformat}
> SELECT * FROM catalog.db.table
> {noformat}
> It would be great to support only two layers of indirection, which is closer 
> to many Presto / Hive deployments today.
> {noformat}
> SELECT * FROM db.table
> {noformat}





[jira] [Updated] (FLINK-6574) External catalog should support a single level catalog

2017-05-15 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6574:
--
Priority: Critical  (was: Major)

> External catalog should support a single level catalog
> --
>
> Key: FLINK-6574
> URL: https://issues.apache.org/jira/browse/FLINK-6574
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Critical
> Fix For: 1.3.0
>
>
> We found that the current external catalog requires three layers of 
> references for any table. For example, the SQL would look like the following 
> when referencing an external table:
> {noformat}
> SELECT * FROM catalog.db.table
> {noformat}
> It would be great to support only two layers of indirection, which is closer 
> to many Presto / Hive deployments today.
> {noformat}
> SELECT * FROM db.table
> {noformat}





[jira] [Commented] (FLINK-6574) External catalog should support a single level catalog

2017-05-15 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011516#comment-16011516
 ] 

Haohui Mai commented on FLINK-6574:
---

Raising the priority to critical as it introduces API changes in 1.3.

> External catalog should support a single level catalog
> --
>
> Key: FLINK-6574
> URL: https://issues.apache.org/jira/browse/FLINK-6574
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Critical
> Fix For: 1.3.0
>
>
> We found that the current external catalog requires three layers of 
> references for any table. For example, the SQL would look like the following 
> when referencing an external table:
> {noformat}
> SELECT * FROM catalog.db.table
> {noformat}
> It would be great to support only two layers of indirection, which is closer 
> to many Presto / Hive deployments today.
> {noformat}
> SELECT * FROM db.table
> {noformat}





[jira] [Commented] (FLINK-6529) Rework the shading model in Flink

2017-05-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008891#comment-16008891
 ] 

Haohui Mai commented on FLINK-6529:
---

Will do. Thanks!

> Rework the shading model in Flink
> -
>
> Key: FLINK-6529
> URL: https://issues.apache.org/jira/browse/FLINK-6529
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.3.0, 1.2.1
>Reporter: Stephan Ewen
>Assignee: Haohui Mai
>Priority: Critical
>
> h2. Problem
> Currently, Flink shades dependencies like ASM and Guava into all jars of 
> projects that reference it and relocate the classes.
> There are some drawbacks to that approach, let's discuss them at the example 
> of ASM:
>   - The ASM classes are for example in {{flink-core}}, {{flink-java}}, 
> {{flink-scala}}, {{flink-runtime}}, etc.
>   - Users that reference these dependencies have the classes multiple times 
> in the classpath. That is unclean (it works, though, because the classes are 
> identical). The same happens when building the final dist. jar.
>   - Some of these dependencies require to include license files in the shaded 
> jar. It is hard to impossible to build a good automatic solution for that, 
> partly due to Maven's very poor cross-project path support
>   - Scala does not support shading really well. Scala classes have references 
> to classes in more places than just the class names (apparently for Scala 
> reflect support). Referencing a Scala project with shaded ASM still requires 
> to add a reference to unshaded ASM (at least as a compile dependency).
> h2. Proposal
> I propose that we build and deploy a {{asm-flink-shaded}} version of ASM and 
> directly program against the relocated namespaces. Since we never use classes 
> that we relocate in public interfaces, Flink users will never see the 
> relocated class names. Internally, it does not hurt to use them.
>   - Proper maven dependency management, no hidden (shaded) dependencies
>   - one copy of each dependency
>   - proper Scala interoperability
>   - no clumsy license management (license is in the deployed 
> {{asm-flink-shaded}})
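A sketch of how such a pre-relocated {{asm-flink-shaded}} artifact could be built with the maven-shade-plugin; the module name and the relocated package below are illustrative, not final coordinates:

```xml
<!-- Illustrative shade configuration for a standalone asm-flink-shaded module -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Relocate ASM once, here; downstream modules depend on this
                 artifact and import the relocated package directly. -->
            <pattern>org.objectweb.asm</pattern>
            <shadedPattern>org.apache.flink.shaded.asm.org.objectweb.asm</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```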





[jira] [Created] (FLINK-6574) External catalog should support a single level catalog

2017-05-12 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6574:
-

 Summary: External catalog should support a single level catalog
 Key: FLINK-6574
 URL: https://issues.apache.org/jira/browse/FLINK-6574
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Reporter: Haohui Mai
Assignee: Haohui Mai
 Fix For: 1.3.0


We found that the current external catalog requires three layers of 
references for any table. For example, the SQL would look like the following 
when referencing an external table:

{noformat}
SELECT * FROM catalog.db.table
{noformat}

It would be great to support only two layers of indirection, which is closer to 
many Presto / Hive deployments today.

{noformat}
SELECT * FROM db.table
{noformat}






[jira] [Commented] (FLINK-6568) flink-table doesn't work without flink-streaming-scala dependency

2017-05-12 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008821#comment-16008821
 ] 

Haohui Mai commented on FLINK-6568:
---

Hit the same issue multiple times when testing out 1.3. Are we good to just 
update the documentation?

> flink-table doesn't work without flink-streaming-scala dependency
> -
>
> Key: FLINK-6568
> URL: https://issues.apache.org/jira/browse/FLINK-6568
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation, Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> In my user application, I got errors because I didn't have the 
> flink-streaming-scala dependency defined in my user code (and I'm using Java).
> The documentation should be updated or flink-streaming-scala should not be a 
> provided dependency.





[jira] [Assigned] (FLINK-6569) flink-table KafkaJsonTableSource example doesn't work

2017-05-12 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-6569:
-

Assignee: Haohui Mai

> flink-table KafkaJsonTableSource example doesn't work
> -
>
> Key: FLINK-6569
> URL: https://issues.apache.org/jira/browse/FLINK-6569
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation, Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> The code example uses 
> {code}
> TypeInformation typeInfo = Types.ROW(
>   new String[] { "id", "name", "score" },
>   new TypeInformation[] { Types.INT(), Types.STRING(), Types.DOUBLE() }
> );
> {code}
> the correct way of using it is something like
> {code}
> TypeInformation typeInfo = Types.ROW_NAMED(
> new String[] { "id", "zip", "date" },
> Types.LONG, Types.INT, Types.SQL_DATE);
> {code}





[jira] [Assigned] (FLINK-6568) flink-table doesn't work without flink-streaming-scala dependency

2017-05-12 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-6568:
-

Assignee: Haohui Mai

> flink-table doesn't work without flink-streaming-scala dependency
> -
>
> Key: FLINK-6568
> URL: https://issues.apache.org/jira/browse/FLINK-6568
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation, Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Robert Metzger
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> In my user application, I got errors because I didn't have the 
> flink-streaming-scala dependency defined in my user code (and I'm using Java).
> The documentation should be updated or flink-streaming-scala should not be a 
> provided dependency.





[jira] [Updated] (FLINK-6563) Expose time indicator attributes in the KafkaTableSource

2017-05-11 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6563:
--
   Priority: Critical  (was: Major)
Component/s: Table API & SQL

> Expose time indicator attributes in the KafkaTableSource
> 
>
> Key: FLINK-6563
> URL: https://issues.apache.org/jira/browse/FLINK-6563
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>Priority: Critical
>
> This is a follow up for FLINK-5884.
> FLINK-5884 requires the {{TableSource}} interfaces to expose the 
> processing time and the event time for the data stream. This jira proposes to 
> expose these two attributes in the Kafka table source.





[jira] [Created] (FLINK-6563) Expose time indicator attributes in the KafkaTableSource

2017-05-11 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6563:
-

 Summary: Expose time indicator attributes in the KafkaTableSource
 Key: FLINK-6563
 URL: https://issues.apache.org/jira/browse/FLINK-6563
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


This is a follow up for FLINK-5884.

FLINK-5884 requires the {{TableSource}} interfaces to expose the 
processing time and the event time for the data stream. This jira proposes to 
expose these two attributes in the Kafka table source.





[jira] [Created] (FLINK-6562) Support implicit table references for nested fields in SQL

2017-05-11 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6562:
-

 Summary: Support implicit table references for nested fields in SQL
 Key: FLINK-6562
 URL: https://issues.apache.org/jira/browse/FLINK-6562
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


Currently nested fields can only be accessed through fully qualified 
identifiers. For example, users need to specify the following query for the 
table {{f}} that has a nested field {{foo.bar}}

{noformat}
SELECT f.foo.bar FROM f
{noformat}

Other query engines like Hive / Presto support implicit table references. For 
example:

{noformat}
SELECT foo.bar FROM f
{noformat}

This jira proposes to support the latter syntax in the SQL API.





[jira] [Updated] (FLINK-6562) Support implicit table references for nested fields in SQL

2017-05-11 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6562:
--
Component/s: Table API & SQL

> Support implicit table references for nested fields in SQL
> --
>
> Key: FLINK-6562
> URL: https://issues.apache.org/jira/browse/FLINK-6562
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> Currently nested fields can only be accessed through fully qualified 
> identifiers. For example, users need to specify the following query for the 
> table {{f}} that has a nested field {{foo.bar}}
> {noformat}
> SELECT f.foo.bar FROM f
> {noformat}
> Other query engines like Hive / Presto support implicit table references. 
> For example:
> {noformat}
> SELECT foo.bar FROM f
> {noformat}
> This jira proposes to support the latter syntax in the SQL API.





[jira] [Commented] (FLINK-6529) Rework the shading model in Flink

2017-05-10 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005350#comment-16005350
 ] 

Haohui Mai commented on FLINK-6529:
---

That makes sense as we are being hit by this problem quite hard (especially 
guava). I can take this up.

> Rework the shading model in Flink
> -
>
> Key: FLINK-6529
> URL: https://issues.apache.org/jira/browse/FLINK-6529
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.3.0, 1.2.1
>Reporter: Stephan Ewen
>Assignee: Haohui Mai
>Priority: Critical
>
> h2. Problem
> Currently, Flink shades dependencies like ASM and Guava into all jars of 
> projects that reference it and relocate the classes.
> There are some drawbacks to that approach, let's discuss them at the example 
> of ASM:
>   - The ASM classes are for example in {{flink-core}}, {{flink-java}}, 
> {{flink-scala}}, {{flink-runtime}}, etc.
>   - Users that reference these dependencies have the classes multiple times 
> in the classpath. That is unclean (it works, though, because the classes are 
> identical). The same happens when building the final dist. jar.
>   - Some of these dependencies require to include license files in the shaded 
> jar. It is hard to impossible to build a good automatic solution for that, 
> partly due to Maven's very poor cross-project path support
>   - Scala does not support shading really well. Scala classes have references 
> to classes in more places than just the class names (apparently for Scala 
> reflect support). Referencing a Scala project with shaded ASM still requires 
> to add a reference to unshaded ASM (at least as a compile dependency).
> h2. Proposal
> I propose that we build and deploy a {{asm-flink-shaded}} version of ASM and 
> directly program against the relocated namespaces. Since we never use classes 
> that we relocate in public interfaces, Flink users will never see the 
> relocated class names. Internally, it does not hurt to use them.
>   - Proper maven dependency management, no hidden (shaded) dependencies
>   - one copy of each dependency
>   - proper Scala interoperability
>   - no clumsy license management (license is in the deployed 
> {{asm-flink-shaded}})





[jira] [Assigned] (FLINK-6529) Rework the shading model in Flink

2017-05-10 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-6529:
-

Assignee: Haohui Mai

> Rework the shading model in Flink
> -
>
> Key: FLINK-6529
> URL: https://issues.apache.org/jira/browse/FLINK-6529
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.3.0, 1.2.1
>Reporter: Stephan Ewen
>Assignee: Haohui Mai
>Priority: Critical
>
> h2. Problem
> Currently, Flink shades dependencies like ASM and Guava into all jars of 
> projects that reference it and relocate the classes.
> There are some drawbacks to that approach, let's discuss them at the example 
> of ASM:
>   - The ASM classes are for example in {{flink-core}}, {{flink-java}}, 
> {{flink-scala}}, {{flink-runtime}}, etc.
>   - Users that reference these dependencies have the classes multiple times 
> in the classpath. That is unclean (it works, though, because the classes are 
> identical). The same happens when building the final dist. jar.
>   - Some of these dependencies require to include license files in the shaded 
> jar. It is hard to impossible to build a good automatic solution for that, 
> partly due to Maven's very poor cross-project path support
>   - Scala does not support shading really well. Scala classes have references 
> to classes in more places than just the class names (apparently for Scala 
> reflect support). Referencing a Scala project with shaded ASM still requires 
> to add a reference to unshaded ASM (at least as a compile dependency).
> h2. Proposal
> I propose that we build and deploy a {{asm-flink-shaded}} version of ASM and 
> directly program against the relocated namespaces. Since we never use classes 
> that we relocate in public interfaces, Flink users will never see the 
> relocated class names. Internally, it does not hurt to use them.
>   - Proper maven dependency management, no hidden (shaded) dependencies
>   - one copy of each dependency
>   - proper Scala interoperability
>   - no clumsy license management (license is in the deployed 
> {{asm-flink-shaded}})





[jira] [Commented] (FLINK-6335) Parse DISTINCT over grouped window in stream SQL

2017-05-05 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997937#comment-15997937
 ] 

Haohui Mai commented on FLINK-6335:
---

No problem -- will do it next time.

> Parse DISTINCT over grouped window in stream SQL
> 
>
> Key: FLINK-6335
> URL: https://issues.apache.org/jira/browse/FLINK-6335
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> The SQL on the batch side supports the {{DISTINCT}} keyword over aggregation. 
> This jira proposes to support the {{DISTINCT}} keyword on streaming 
> aggregation using the same technique on the batch side.
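A sketch of the kind of streaming query this would allow (the schema is illustrative):

```sql
-- Hypothetical: distinct aggregation over a tumbling group window
SELECT COUNT(DISTINCT user_id) AS uniques
FROM Clicks
GROUP BY TUMBLE(proctime, INTERVAL '1' HOUR)
```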





[jira] [Assigned] (FLINK-6430) Remove Calcite classes for time resolution of auxiliary group functions

2017-05-02 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-6430:
-

Assignee: Haohui Mai

> Remove Calcite classes for time resolution of auxiliary group functions
> ---
>
> Key: FLINK-6430
> URL: https://issues.apache.org/jira/browse/FLINK-6430
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>
> TUMBLE/HOP/SESSION_START/END did not resolve the time field correctly. FLINK-6409 
> copied some classes from Calcite that are not necessary in Calcite 1.13 
> anymore. We can remove them again.





[jira] [Assigned] (FLINK-6424) Add basic helper functions for map type

2017-04-30 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-6424:
-

Assignee: Haohui Mai

> Add basic helper functions for map type
> ---
>
> Key: FLINK-6424
> URL: https://issues.apache.org/jira/browse/FLINK-6424
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Haohui Mai
>
> FLINK-6377 introduced the map type for the Table & SQL API. We still need to 
> implement functions around this type:
> - the value constructor in SQL that constructs a map {{MAP ‘[’ key, value [, 
> key, value ]* ‘]’}}
> - the value constructor in Table API {{map(key, value,...)}} (syntax up for 
> discussion)
> - {{ELEMENT, CARDINALITY}} for SQL API
> - {{.at(), .cardinality(), and .element()}} for Table API in Scala & Java
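
As a rough sketch of the intended semantics only (plain Java, not the actual
Table API implementation): {{CARDINALITY}} corresponds to the size of the map
and {{.at(key)}} to a key lookup.

```java
import java.util.HashMap;
import java.util.Map;

public class MapFunctionSemantics {
    // CARDINALITY(map): number of entries in the map.
    static int cardinality(Map<String, String> map) {
        return map.size();
    }

    // map.at(key): value lookup; null if the key is absent.
    static String at(Map<String, String> map, String key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        // MAP['a', '1', 'b', '2'] -- the SQL value constructor sketched above.
        Map<String, String> m = new HashMap<>();
        m.put("a", "1");
        m.put("b", "2");
        System.out.println(cardinality(m)); // 2
        System.out.println(at(m, "a"));     // 1
    }
}
```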





[jira] [Created] (FLINK-6377) Support map types in the Table / SQL API

2017-04-24 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6377:
-

 Summary: Support map types in the Table / SQL API
 Key: FLINK-6377
 URL: https://issues.apache.org/jira/browse/FLINK-6377
 Project: Flink
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Haohui Mai


This jira tracks the effort of adding support for maps in the Table / SQL 
APIs.





[jira] [Created] (FLINK-6373) Add runtime support for distinct aggregation over grouped windows

2017-04-24 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6373:
-

 Summary: Add runtime support for distinct aggregation over grouped 
windows
 Key: FLINK-6373
 URL: https://issues.apache.org/jira/browse/FLINK-6373
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


This is a follow up task for FLINK-6335. FLINK-6335 enables parsing the 
distinct aggregations over grouped windows. This jira tracks the effort of 
adding runtime support for the query.





[jira] [Updated] (FLINK-6335) Parse DISTINCT over grouped window in stream SQL

2017-04-24 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6335:
--
Summary: Parse DISTINCT over grouped window in stream SQL  (was: Support 
DISTINCT over grouped window in stream SQL)

> Parse DISTINCT over grouped window in stream SQL
> 
>
> Key: FLINK-6335
> URL: https://issues.apache.org/jira/browse/FLINK-6335
> Project: Flink
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> The SQL on the batch side supports the {{DISTINCT}} keyword over aggregation. 
> This jira proposes to support the {{DISTINCT}} keyword on streaming 
> aggregation using the same technique on the batch side.





[jira] [Commented] (FLINK-6335) Support DISTINCT over grouped window in stream SQL

2017-04-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976845#comment-15976845
 ] 

Haohui Mai commented on FLINK-6335:
---

The scope of this jira is about supporting the DISTINCT keyword for tumbling / 
sliding / session windows. Please correct me if I'm wrong; my understanding of 
FLINK-6250 is that it focuses on adding DISTINCT support for OVER windows.

> Support DISTINCT over grouped window in stream SQL
> --
>
> Key: FLINK-6335
> URL: https://issues.apache.org/jira/browse/FLINK-6335
> Project: Flink
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> The SQL on the batch side supports the {{DISTINCT}} keyword over aggregation. 
> This jira proposes to support the {{DISTINCT}} keyword on streaming 
> aggregation using the same technique on the batch side.





[jira] [Created] (FLINK-6335) Support DISTINCT over grouped window in stream SQL

2017-04-20 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6335:
-

 Summary: Support DISTINCT over grouped window in stream SQL
 Key: FLINK-6335
 URL: https://issues.apache.org/jira/browse/FLINK-6335
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


The SQL on the batch side supports the {{DISTINCT}} keyword over aggregation. 
This jira proposes to support the {{DISTINCT}} keyword on streaming aggregation 
using the same technique on the batch side.
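
As a sketch of the semantics only (plain Java, not the Flink runtime): a
distinct aggregate deduplicates the values inside each group/window before
aggregating.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class DistinctAggSketch {
    // COUNT(DISTINCT x) over one window's values: deduplicate, then count.
    static long countDistinct(List<Integer> windowValues) {
        return new HashSet<>(windowValues).size();
    }

    public static void main(String[] args) {
        List<Integer> window = Arrays.asList(1, 1, 2, 3, 3, 3);
        System.out.println(countDistinct(window)); // 3, whereas COUNT(x) = 6
    }
}
```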





[jira] [Created] (FLINK-6281) Create TableSink for JDBC

2017-04-07 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6281:
-

 Summary: Create TableSink for JDBC
 Key: FLINK-6281
 URL: https://issues.apache.org/jira/browse/FLINK-6281
 Project: Flink
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai


It would be nice to integrate the table APIs with the JDBC connectors so that 
the rows in the tables can be directly pushed into JDBC.





[jira] [Closed] (FLINK-5622) Support tumbling window on batch tables in the SQL API

2017-04-07 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai closed FLINK-5622.
-
Resolution: Duplicate

> Support tumbling window on batch tables in the SQL API
> --
>
> Key: FLINK-5622
> URL: https://issues.apache.org/jira/browse/FLINK-5622
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> This is a follow up of FLINK-4692.
> FLINK-4692 adds support for tumbling windows for batch tables. It would be 
> great to expose the functionality at the SQL layer as well.





[jira] [Commented] (FLINK-5622) Support tumbling window on batch tables in the SQL API

2017-04-07 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961476#comment-15961476
 ] 

Haohui Mai commented on FLINK-5622:
---

I think so. Thanks for the reminder. I'll close this jira.

> Support tumbling window on batch tables in the SQL API
> --
>
> Key: FLINK-5622
> URL: https://issues.apache.org/jira/browse/FLINK-5622
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> This is a follow up of FLINK-4692.
> FLINK-4692 adds support for tumbling windows for batch tables. It would be 
> great to expose the functionality at the SQL layer as well.





[jira] [Assigned] (FLINK-6225) Support Row Stream for CassandraSink

2017-04-07 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai reassigned FLINK-6225:
-

Assignee: Haohui Mai

> Support Row Stream for CassandraSink
> 
>
> Key: FLINK-6225
> URL: https://issues.apache.org/jira/browse/FLINK-6225
> Project: Flink
>  Issue Type: New Feature
>  Components: Cassandra Connector
>Affects Versions: 1.3.0
>Reporter: Jing Fan
>Assignee: Haohui Mai
> Fix For: 1.3.0
>
>
> Currently in CassandraSink, specifying a query is not supported for row 
> streams. The solution should be similar to CassandraTupleSink.





[jira] [Updated] (FLINK-6012) Support WindowStart / WindowEnd functions in streaming SQL

2017-04-07 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-6012:
--
Summary: Support WindowStart / WindowEnd functions in streaming SQL  (was: 
Support WindowStart / WindowEnd functions in stream SQL)

> Support WindowStart / WindowEnd functions in streaming SQL
> --
>
> Key: FLINK-6012
> URL: https://issues.apache.org/jira/browse/FLINK-6012
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> This jira proposes to add support for {{TUMBLE_START()}} / {{TUMBLE_END()}} / 
> {{HOP_START()}} / {{HOP_END()}} / {{SESSION_START()}} / {{SESSION_END()}} in 
> the planner in Flink.
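
A minimal sketch of what these functions compute (plain Java, not the planner
code): for a tumbling window of size {{s}}, the window containing timestamp
{{t}} starts at {{t - (t mod s)}} and ends one window size later.

```java
public class WindowBounds {
    // TUMBLE_START: align the timestamp down to a multiple of the window size.
    static long tumbleStart(long ts, long sizeMs) {
        return ts - (ts % sizeMs);
    }

    // TUMBLE_END: exclusive end of the same window.
    static long tumbleEnd(long ts, long sizeMs) {
        return tumbleStart(ts, sizeMs) + sizeMs;
    }

    public static void main(String[] args) {
        long ts = 125_000L;  // event time in ms
        long size = 60_000L; // 1-minute tumbling window
        System.out.println(tumbleStart(ts, size)); // 120000
        System.out.println(tumbleEnd(ts, size));   // 180000
    }
}
```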





[jira] [Commented] (FLINK-6012) Support WindowStart / WindowEnd functions in stream SQL

2017-04-06 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958592#comment-15958592
 ] 

Haohui Mai commented on FLINK-6012:
---

I'm okay with either way, but I think we have a decent chance of implementing 
both sides with the same code, as it can be implemented as a transformation that 
happens only at the logical plan level.

Maybe we can start with this plan and revisit after FLINK-6261 is landed? What 
do you think?


> Support WindowStart / WindowEnd functions in stream SQL
> ---
>
> Key: FLINK-6012
> URL: https://issues.apache.org/jira/browse/FLINK-6012
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> This jira proposes to add support for {{TUMBLE_START()}} / {{TUMBLE_END()}} / 
> {{HOP_START()}} / {{HOP_END()}} / {{SESSION_START()}} / {{SESSION_END()}} in 
> the planner in Flink.





[jira] [Commented] (FLINK-6261) Add support for TUMBLE, HOP, SESSION to batch SQL

2017-04-05 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956510#comment-15956510
 ] 

Haohui Mai commented on FLINK-6261:
---

That's even better -- looking forward to the PR :-)

> Add support for TUMBLE, HOP, SESSION to batch SQL
> -
>
> Key: FLINK-6261
> URL: https://issues.apache.org/jira/browse/FLINK-6261
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>
> Add support for the TUMBLE, HOP, SESSION keywords for batch SQL. 





[jira] [Commented] (FLINK-6261) Add support for TUMBLE, HOP, SESSION to batch SQL

2017-04-04 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956108#comment-15956108
 ] 

Haohui Mai commented on FLINK-6261:
---

[~fhueske], I am interested in contributing this feature. If you haven't 
started working on it, do you mind assigning to me? Thanks.

> Add support for TUMBLE, HOP, SESSION to batch SQL
> -
>
> Key: FLINK-6261
> URL: https://issues.apache.org/jira/browse/FLINK-6261
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.3.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>
> Add support for the TUMBLE, HOP, SESSION keywords for batch SQL. 





[jira] [Commented] (FLINK-5998) Un-fat Hadoop from Flink fat jar

2017-03-31 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951280#comment-15951280
 ] 

Haohui Mai commented on FLINK-5998:
---

All failed tests passed locally. :-|

[~rmetzger], is it possible for you to take a look? Much appreciated.

> Un-fat Hadoop from Flink fat jar
> 
>
> Key: FLINK-5998
> URL: https://issues.apache.org/jira/browse/FLINK-5998
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Robert Metzger
>Assignee: Haohui Mai
>
> As a first step towards FLINK-2268, I would suggest to put all hadoop 
> dependencies into a jar separate from Flink's fat jar.
> This would allow users to put a custom Hadoop jar in there, or even deploy 
> Flink without a Hadoop fat jar at all in environments where Hadoop is 
> provided (EMR).





[jira] [Created] (FLINK-6217) ContaineredTaskManagerParameters sets off heap memory size incorrectly

2017-03-29 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6217:
-

 Summary: ContaineredTaskManagerParameters sets off heap memory 
size incorrectly
 Key: FLINK-6217
 URL: https://issues.apache.org/jira/browse/FLINK-6217
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


Thanks [~bill.liu8904] for triaging the issue.

When {{taskmanager.memory.off-heap}} is disabled, we observed that the total 
memory that Flink allocates exceeds the total memory of the container:

For an 8G container, the JobManager starts the container with the following 
parameters:

{noformat}
$JAVA_HOME/bin/java -Xms6072m -Xmx6072m -XX:MaxDirectMemorySize=6072m ...
{noformat}

The total amount of heap memory plus the off-heap memory exceeds the total 
amount of memory of the container. As a result YARN occasionally kills the 
container.
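
The arithmetic behind the overcommit can be sketched as follows: with both
{{-Xmx}} and {{-XX:MaxDirectMemorySize}} set to the same value, the worst-case
JVM footprint is roughly their sum, which exceeds the container.

```java
public class ContainerMemoryCheck {
    // Worst-case JVM footprint: full heap plus full direct memory.
    static long worstCaseMb(long heapMb, long directMb) {
        return heapMb + directMb;
    }

    public static void main(String[] args) {
        long containerMb = 8 * 1024;          // 8G YARN container
        long worst = worstCaseMb(6072, 6072); // -Xmx and -XX:MaxDirectMemorySize above
        System.out.println(worst);               // 12144
        System.out.println(worst > containerMb); // true -> YARN may kill the container
    }
}
```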





[jira] [Commented] (FLINK-5668) passing taskmanager configuration through taskManagerEnv instead of file

2017-03-29 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946785#comment-15946785
 ] 

Haohui Mai commented on FLINK-5668:
---

Sorry for the delayed response.

Our main requirement is to allow Flink to support mission-critical, real-time 
applications. Our colleagues want to build such applications on top of Flink, 
and they are concerned that no jobs can be started when HDFS is down -- today 
there are no workarounds that let their applications keep their SLAs while HDFS 
is under maintenance.

As you pointed out, there are multiple issues (e.g., checkpoints) to keep the 
Flink job running in the above scenario. To get started we would like to be 
able to start the job when HDFS is down and address other issues in later jiras.

As a result this essentially reduces to one requirement -- Flink needs to have 
an option to bootstrap the jobs without persisting data on {{default.FS}}.

I think https://github.com/apache/flink/pull/2796/files will work as long as 
(1) Flink persists everything to that path, and (2) the path can specify a file 
system other than {{default.FS}}. [~bill.liu8904], can you elaborate on why it 
won't work for you?

Below are some inlined answers.

{quote}
All the paths are programatically generated and there are no configuration 
parameters for passing custom paths (correct me if I'm wrong).
Are you planning to basically fork Flink and create a custom YARN client / 
Application Master implementation that allows using custom paths?
{quote}

It is sufficient to just specify the root of the path -- I believe something 
like {{yarn.deploy.fs}} or https://github.com/apache/flink/pull/2796/files will 
work.

{quote}
I think we didn't have your use case in mind when implementing the code. We 
assumed that one file system will be used for distributing all required files. 
Also, this approach works nicely will all the Hadoop vendor's versions.
{quote}

We originally shared the same line of thought that HDFS HA should be 
sufficient. The problem is that mission-critical real-time applications have a 
much stricter SLA than HDFS, so they need to survive HDFS downtime.

{quote}
The general theme is: Some persistent store is needed currently, at least for 
high-availability modes. Decoupling Yarn from a persistent store pushes the 
responsibility to another layer.
{quote}

Totally agree. Whether it is in HA mode or not, having a distributed file 
system underneath simplifies things a lot. Passing state as configuration / 
environment variables is just one solution but not necessarily the best one. I 
think we are good to go as long as Flink is able to bootstrap the jobs from 
places other than {{default.FS}}.

Thoughts?
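
For illustration only ({{yarn.deploy.fs}} is the hypothetical option discussed
above, and the bucket name is made up): a fully qualified URI already carries
the target file system in its scheme, so bootstrap need not go through
{{default.FS}}.

```java
import java.net.URI;

public class DeployFsSketch {
    public static void main(String[] args) {
        // Hypothetical value of a yarn.deploy.fs-style option.
        URI deployFs = URI.create("s3://staging-bucket/flink/deploy");

        // The scheme selects the file system implementation, bypassing default.FS.
        System.out.println(deployFs.getScheme()); // s3
        System.out.println(deployFs.getPath());   // /flink/deploy
    }
}
```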



> passing taskmanager configuration through taskManagerEnv instead of file
> 
>
> Key: FLINK-5668
> URL: https://issues.apache.org/jira/browse/FLINK-5668
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN
>Reporter: Bill Liu
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When create a Flink cluster on Yarn,  JobManager depends on  HDFS to share  
> taskmanager-conf.yaml  with TaskManager.
> It's better to share the taskmanager-conf.yaml  on JobManager Web server 
> instead of HDFS, which could reduce the HDFS dependency  at job startup.





[jira] [Commented] (FLINK-6209) StreamPlanEnvironment always has a parallelism of 1

2017-03-28 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946043#comment-15946043
 ] 

Haohui Mai commented on FLINK-6209:
---

We found that FLINK-5808 removed this snippet of code, which causes the bug:

{noformat}
int parallelism = env.getParallelism();
if (parallelism > 0) {
    setParallelism(parallelism);
}
{noformat}

[~aljoscha] do you have an idea on why this snippet is removed? Thanks.


> StreamPlanEnvironment always has a parallelism of 1
> ---
>
> Key: FLINK-6209
> URL: https://issues.apache.org/jira/browse/FLINK-6209
> Project: Flink
>  Issue Type: Bug
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> Thanks [~bill.liu8904] for triaging the issue.
> After FLINK-5808 we saw that the Flink jobs that are uploaded through the UI 
> always have a parallelism of 1, even when the parallelism is explicitly set 
> in the UI.





[jira] [Created] (FLINK-6209) StreamPlanEnvironment always has a parallelism of 1

2017-03-28 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-6209:
-

 Summary: StreamPlanEnvironment always has a parallelism of 1
 Key: FLINK-6209
 URL: https://issues.apache.org/jira/browse/FLINK-6209
 Project: Flink
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai


Thanks [~bill.liu8904] for triaging the issue.

After FLINK-5808 we saw that the Flink jobs that are uploaded through the UI 
always have a parallelism of 1, even when the parallelism is explicitly set in 
the UI.





[jira] [Commented] (FLINK-5829) Bump Calcite version to 1.12 once available

2017-03-27 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943938#comment-15943938
 ] 

Haohui Mai commented on FLINK-5829:
---

Unfortunately it seems that it is infeasible to inherit and forward the calls, 
because Calcite declares most of the methods as package-private. It seems that 
Calcite intentionally keeps the class package-private.

Another option is to defer the functionality of unregistering tables until 
Calcite provides the API. We can open a jira in Calcite and get it fixed in 
Calcite 1.13. However, given the release schedule, it seems to me that the 
functionality will be deferred to Flink 1.4.

[~twalthr] does it sound okay to you? What do you think?

> Bump Calcite version to 1.12 once available
> ---
>
> Key: FLINK-5829
> URL: https://issues.apache.org/jira/browse/FLINK-5829
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Fabian Hueske
>Assignee: Haohui Mai
>
> Once Calcite 1.12 is released we should update to remove some copied classes. 





[jira] [Commented] (FLINK-5829) Bump Calcite version to 1.12 once available

2017-03-25 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941637#comment-15941637
 ] 

Haohui Mai commented on FLINK-5829:
---

I just pushed a PR. The migration process is relatively straightforward, except 
that Calcite 1.12 seems to conflict with FLINK-4288.

The {{tableMap}} field has become a protected member, thus unregistering a 
table becomes non-trivial. There are two options here.

1. Revert FLINK-4288. FLINK-4288 has not been released yet, so it is okay to 
pull it back with no concerns about backward compatibility.
2. Implement a proxy schema which inherits from {{CalciteSchema}} to regain 
access to the field.

[~fhueske] [~twalthr], what do you think?

> Bump Calcite version to 1.12 once available
> ---
>
> Key: FLINK-5829
> URL: https://issues.apache.org/jira/browse/FLINK-5829
> Project: Flink
>  Issue Type: Task
>  Components: Table API & SQL
>Reporter: Fabian Hueske
>Assignee: Haohui Mai
>
> Once Calcite 1.12 is released we should update to remove some copied classes. 





[jira] [Commented] (FLINK-5570) Support register external catalog to table environment

2017-03-23 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939344#comment-15939344
 ] 

Haohui Mai commented on FLINK-5570:
---

I think the APIs of the PR looks good to me overall.

One question -- does the PR need additional fixes to make functions like 
{{isRegistered}} and {{getRowType}} aware of databases?



> Support register external catalog to table environment
> --
>
> Key: FLINK-5570
> URL: https://issues.apache.org/jira/browse/FLINK-5570
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Kurt Young
>Assignee: jingzhang
>
> This issue aims to support registering one or more {{ExternalCatalog}} (which 
> is referred to in https://issues.apache.org/jira/browse/FLINK-5568) with 
> {{TableEnvironment}}. After registration, SQL and Table API queries can 
> access tables in the external catalogs without registering those tables one 
> by one with {{TableEnvironment}} beforehand.
> We plan to add two APIs in {{TableEnvironment}}:
> 1. register externalCatalog
> {code}
> def registerExternalCatalog(name: String, externalCatalog: ExternalCatalog): 
> Unit
> {code}
> 2. scan a table from a registered catalog and return the resulting {{Table}}; 
> this API is very useful in Table API queries.
> {code}
> def scan(catalogName: String, tableIdentifier: TableIdentifier): Table
> {code}
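
A minimal sketch of the registration/scan flow described above (plain Java with
illustrative names, not the actual Flink interfaces):

```java
import java.util.HashMap;
import java.util.Map;

public class CatalogSketch {
    // Stand-in for ExternalCatalog: maps table names to a schema string.
    static class ExternalCatalog {
        final Map<String, String> tables = new HashMap<>();
    }

    private final Map<String, ExternalCatalog> catalogs = new HashMap<>();

    // registerExternalCatalog(name, catalog)
    void registerExternalCatalog(String name, ExternalCatalog catalog) {
        catalogs.put(name, catalog);
    }

    // scan(catalogName, tableName): resolve a table with no per-table registration.
    String scan(String catalogName, String tableName) {
        ExternalCatalog c = catalogs.get(catalogName);
        if (c == null || !c.tables.containsKey(tableName)) {
            throw new IllegalArgumentException(
                "unknown table: " + catalogName + "." + tableName);
        }
        return c.tables.get(tableName);
    }

    public static void main(String[] args) {
        CatalogSketch env = new CatalogSketch();
        ExternalCatalog hive = new ExternalCatalog();
        hive.tables.put("orders", "id BIGINT, amount DOUBLE");
        env.registerExternalCatalog("hive", hive);
        System.out.println(env.scan("hive", "orders")); // id BIGINT, amount DOUBLE
    }
}
```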





[jira] [Commented] (FLINK-6102) Update protobuf to latest version

2017-03-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934151#comment-15934151
 ] 

Haohui Mai commented on FLINK-6102:
---

I would recommend keeping it as is. The problem is that the whole Hadoop 2.x 
ecosystem is on protobuf 2.5. Upgrading it causes significantly more harm than 
good.



> Update protobuf to latest version
> -
>
> Key: FLINK-6102
> URL: https://issues.apache.org/jira/browse/FLINK-6102
> Project: Flink
>  Issue Type: Task
>  Components: Core
>Affects Versions: 1.2.0
>Reporter: Su Ralph
> Fix For: 1.2.1
>
>
> In Flink release 1.2.0 we ship protobuf-java 2.5.0, and it's packaged into 
> the Flink fat jar. 
> This causes conflicts when a user application uses a newer version of 
> protobuf-java; it makes more sense to update to a later protobuf-java.





[jira] [Commented] (FLINK-5998) Un-fat Hadoop from Flink fat jar

2017-03-20 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934054#comment-15934054
 ] 

Haohui Mai commented on FLINK-5998:
---

Sorry for the delay, I'm just busy with other issues -- I'll take care of it in 
a day or two.

> Un-fat Hadoop from Flink fat jar
> 
>
> Key: FLINK-5998
> URL: https://issues.apache.org/jira/browse/FLINK-5998
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Robert Metzger
>Assignee: Haohui Mai
>
> As a first step towards FLINK-2268, I would suggest to put all hadoop 
> dependencies into a jar separate from Flink's fat jar.
> This would allow users to put a custom Hadoop jar in there, or even deploy 
> Flink without a Hadoop fat jar at all in environments where Hadoop is 
> provided (EMR).





[jira] [Commented] (FLINK-5570) Support register external catalog to table environment

2017-03-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928975#comment-15928975
 ] 

Haohui Mai commented on FLINK-5570:
---

Thanks for the ping, [~fhueske]. Please allow me a day or two to go through 
the PR again.

> Support register external catalog to table environment
> --
>
> Key: FLINK-5570
> URL: https://issues.apache.org/jira/browse/FLINK-5570
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Kurt Young
>Assignee: jingzhang
>
> This issue aims to support registering one or more {{ExternalCatalog}} (which 
> is referred to in https://issues.apache.org/jira/browse/FLINK-5568) with 
> {{TableEnvironment}}. After registration, SQL and Table API queries can 
> access tables in the external catalogs without registering those tables one 
> by one with {{TableEnvironment}} beforehand.
> We plan to add two APIs in {{TableEnvironment}}:
> 1. register externalCatalog
> {code}
> def registerExternalCatalog(name: String, externalCatalog: ExternalCatalog): 
> Unit
> {code}
> 2. scan a table from a registered catalog and return the resulting {{Table}}; 
> this API is very useful in Table API queries.
> {code}
> def scan(catalogName: String, tableIdentifier: TableIdentifier): Table
> {code}





[jira] [Commented] (FLINK-6033) Support UNNEST query in the stream SQL API

2017-03-13 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15922923#comment-15922923
 ] 

Haohui Mai commented on FLINK-6033:
---

Discussed offline with [~fhueske] -- the proposed approach is to build on top 
of the user-defined table function support introduced in FLINK-4469, by adding 
a rule to transform the {{UNNEST}} keyword into the {{explode()}} function.
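
A sketch of the rewrite's semantics (plain Java, not the planner rule):
{{UNNEST}} / {{explode()}} turns one row holding an array into one output row
per array element, joined with the original row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnnestSketch {
    // explode(): emit one (rowKey, element) output row per array element.
    static List<String> explode(String rowKey, List<Integer> arrayField) {
        List<String> out = new ArrayList<>();
        for (Integer element : arrayField) {
            out.add(rowKey + "," + element);
        }
        return out;
    }

    public static void main(String[] args) {
        // Row ("a", [1, 2, 3]) unnests into three rows.
        System.out.println(explode("a", Arrays.asList(1, 2, 3))); // [a,1, a,2, a,3]
    }
}
```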

> Support UNNEST query in the stream SQL API
> --
>
> Key: FLINK-6033
> URL: https://issues.apache.org/jira/browse/FLINK-6033
> Project: Flink
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> It would be nice to support the {{UNNEST}} keyword in the stream SQL API. 
> The keyword is widely used in queries that relate to nested fields.




