[jira] [Created] (FLINK-15240) is_generic key is missing for Flink table stored in HiveCatalog

2019-12-12 Thread Bowen Li (Jira)
Bowen Li created FLINK-15240:


 Summary: is_generic key is missing for Flink table stored in 
HiveCatalog
 Key: FLINK-15240
 URL: https://issues.apache.org/jira/browse/FLINK-15240
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.10.0, 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15239) TM Metaspace memory leak

2019-12-12 Thread Rui Li (Jira)
Rui Li created FLINK-15239:
--

 Summary: TM Metaspace memory leak
 Key: FLINK-15239
 URL: https://issues.apache.org/jira/browse/FLINK-15239
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Reporter: Rui Li
 Fix For: 1.10.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15238) A sql can't generate a valid execution plan

2019-12-12 Thread xiaojin.wy (Jira)
xiaojin.wy created FLINK-15238:
--

 Summary: A sql can't generate a valid execution plan
 Key: FLINK-15238
 URL: https://issues.apache.org/jira/browse/FLINK-15238
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.10.0
Reporter: xiaojin.wy


The table and the query are like this:

 

 

After executing the SQL, the following exception appears:

[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.table.api.TableException: Cannot generate a valid execution 
plan for the given query:

LogicalProject(deptno=[$0], x=[$3])
  LogicalJoin(condition=[true], joinType=[left])
    LogicalTableScan(table=[[default_catalog, default_database, scott_dept]])
    LogicalSort(sort0=[$0], dir0=[ASC], fetch=[1])
      LogicalProject(empno=[$0])
        LogicalTableScan(table=[[default_catalog, default_database, scott_emp]])

This exception indicates that the query uses an unsupported SQL feature.
Please check the documentation for the set of currently supported SQL features.

 

 

The whole exception is:

Caused by: org.apache.flink.table.api.TableException: Cannot generate a valid execution plan for the given query:

LogicalProject(deptno=[$0], x=[$3])
  LogicalJoin(condition=[true], joinType=[left])
    LogicalTableScan(table=[[default_catalog, default_database, scott_dept]])
    LogicalSort(sort0=[$0], dir0=[ASC], fetch=[1])
      LogicalProject(empno=[$0])
        LogicalTableScan(table=[[default_catalog, default_database, scott_emp]])

This exception indicates that the query uses an unsupported SQL feature.
Please check the documentation for the set of currently supported SQL features.
    at org.apache.flink.table.plan.Optimizer.runVolcanoPlanner(Optimizer.scala:284)
    at org.apache.flink.table.plan.Optimizer.optimizeLogicalPlan(Optimizer.scala:199)
    at org.apache.flink.table.plan.StreamOptimizer.optimize(StreamOptimizer.scala:66)
    at org.apache.flink.table.planner.StreamPlanner.translateToType(StreamPlanner.scala:389)
    at org.apache.flink.table.planner.StreamPlanner.writeToRetractSink(StreamPlanner.scala:308)
    at org.apache.flink.table.planner.StreamPlanner.org$apache$flink$table$planner$StreamPlanner$$writeToSink(StreamPlanner.scala:272)
    at org.apache.flink.table.planner.StreamPlanner$$anonfun$2.apply(StreamPlanner.scala:166)
    at org.apache.flink.table.planner.StreamPlanner$$anonfun$2.apply(StreamPlanner.scala:145)
    at scala.Option.map(Option.scala:146)
    at org.apache.flink.table.planner.StreamPlanner.org$apache$flink$table$planner$StreamPlanner$$translate(StreamPlanner.scala:145)
    at org.apache.flink.table.planner.StreamPlanner$$anonfun$translate$1.apply(StreamPlanner.scala:117)
    at org.apache.flink.table.planner.StreamPlanner$$anonfun$translate$1.apply(StreamPlanner.scala:117)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.Iterator$class.foreach(Iterator.scala:891)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at org.apache.flink.table.planner.StreamPlanner.translate(StreamPlanner.scala:117)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:680)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.insertIntoInternal(TableEnvironmentImpl.java:353)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.insertInto(TableEnvironmentImpl.java:341)
    at org.apache.flink.table.api.internal.TableImpl.insertInto(TableImpl.java:428)
    at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeQueryInternal$12(LocalExecutor.java:640)
    at org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:227)
    at org.apache.flink.table.client.gateway.local.LocalExecutor.executeQueryInternal(LocalExecutor.java:638)
    ... 8 more



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15237) CsvTableSource and ref class better to Extract into a module

2019-12-12 Thread xiaodao (Jira)
xiaodao created FLINK-15237:
---

 Summary: CsvTableSource  and ref class  better to  Extract into a 
module
 Key: FLINK-15237
 URL: https://issues.apache.org/jira/browse/FLINK-15237
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.10.0
Reporter: xiaodao


Currently CsvTableSource and its related classes are contained in 
flink-table-api-java-bridge. I think it would be better to extract them into a 
new connector module.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15236) Add a safety net for concurrent checkpoints on TM side

2019-12-12 Thread Congxian Qiu(klion26) (Jira)
Congxian Qiu(klion26) created FLINK-15236:
-

 Summary: Add a safety net for concurrent checkpoints on TM side
 Key: FLINK-15236
 URL: https://issues.apache.org/jira/browse/FLINK-15236
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing
Reporter: Congxian Qiu(klion26)


As discussed in FLINK-13808, we can add an additional config option 
{{taskmanager.checkpoints.max-concurrent}} that limits the number of concurrent 
checkpoints on the TM side as a safety net.

The default value of {{taskmanager.checkpoints.max-concurrent}} depends on the 
job's maxConcurrentCheckpoints setting:
 * If maxConcurrentCheckpoints = 1, the default is 1.
 * If maxConcurrentCheckpoints > 1, the default is unlimited.

The limit should not take manually triggered checkpoints/savepoints into account.
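
To make the intended behaviour concrete, here is a minimal hypothetical sketch of 
such a TM-side guard. The config key and the defaults come from the description 
above; the class and method names are illustrative assumptions, not actual Flink 
internals.

{code:java}
import java.util.concurrent.atomic.AtomicInteger;

public class CheckpointConcurrencyGuard {

    // taskmanager.checkpoints.max-concurrent; a non-positive value means "use the default"
    private final int maxConcurrent;
    private final AtomicInteger inFlight = new AtomicInteger();

    public CheckpointConcurrencyGuard(int maxConcurrentCheckpoints, int configuredLimit) {
        if (configuredLimit > 0) {
            this.maxConcurrent = configuredLimit;
        } else {
            // Default: 1 if the job uses maxConcurrentCheckpoints = 1, otherwise unlimited.
            this.maxConcurrent = maxConcurrentCheckpoints == 1 ? 1 : Integer.MAX_VALUE;
        }
    }

    /** Returns true if a new (non-manual) checkpoint may start on this TM. */
    public boolean tryStartCheckpoint(boolean manuallyTriggered) {
        if (manuallyTriggered) {
            return true; // manual checkpoints/savepoints are not counted
        }
        if (inFlight.incrementAndGet() > maxConcurrent) {
            inFlight.decrementAndGet();
            return false;
        }
        return true;
    }

    public void checkpointFinished(boolean manuallyTriggered) {
        if (!manuallyTriggered) {
            inFlight.decrementAndGet();
        }
    }
}
{code}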



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15235) create a Flink distribution for hive that includes all Hive dependencies

2019-12-12 Thread Bowen Li (Jira)
Bowen Li created FLINK-15235:


 Summary: create a Flink distribution for hive that includes all 
Hive dependencies 
 Key: FLINK-15235
 URL: https://issues.apache.org/jira/browse/FLINK-15235
 Project: Flink
  Issue Type: Task
  Components: Connectors / Hive, Release System
Reporter: Bowen Li


Consider creating a Flink distribution for Hive that includes all Hive 
dependencies, in addition to the existing Flink-only distribution, to improve 
the quickstart experience for Hive users.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15234) hive table created from flink catalog table cannot have null properties in parameters

2019-12-12 Thread Bowen Li (Jira)
Bowen Li created FLINK-15234:


 Summary: hive table created from flink catalog table cannot have 
null properties in parameters
 Key: FLINK-15234
 URL: https://issues.apache.org/jira/browse/FLINK-15234
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Reporter: Bowen Li
Assignee: Bowen Li
 Fix For: 1.10.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15233) Improve Kafka connector properties make append update-mode as default

2019-12-12 Thread Jark Wu (Jira)
Jark Wu created FLINK-15233:
---

 Summary: Improve Kafka connector properties make append 
update-mode as default
 Key: FLINK-15233
 URL: https://issues.apache.org/jira/browse/FLINK-15233
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Reporter: Jark Wu


Currently, {{update-mode}} is a required property of the Kafka connector. However, 
Kafka only supports the {{append}} update mode. It is confusing and user-unfriendly 
to make such a property mandatory. We can make this property optional and default 
it to {{append}}.
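
A minimal sketch of the intended defaulting, assuming a plain property-map 
normalization step; the helper below is hypothetical and not the actual Kafka 
table factory code.

{code:java}
import java.util.HashMap;
import java.util.Map;

public class UpdateModeDefaulting {

    static final String UPDATE_MODE = "update-mode";
    static final String UPDATE_MODE_APPEND = "append";

    /** Treat a missing 'update-mode' as 'append' instead of failing validation. */
    static Map<String, String> withDefaultUpdateMode(Map<String, String> tableProperties) {
        Map<String, String> normalized = new HashMap<>(tableProperties);
        normalized.putIfAbsent(UPDATE_MODE, UPDATE_MODE_APPEND);
        return normalized;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("connector.type", "kafka");
        props.put("connector.topic", "user_behavior");
        // prints "append" even though the user did not set the property
        System.out.println(withDefaultUpdateMode(props).get(UPDATE_MODE));
    }
}
{code}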



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15232) Print match candidates to improve NoMatchingTableFactoryException

2019-12-12 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-15232:


 Summary: Print match candidates to improve 
NoMatchingTableFactoryException
 Key: FLINK-15232
 URL: https://issues.apache.org/jira/browse/FLINK-15232
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Reporter: Jingsong Lee
 Fix For: 1.10.0


Currently, all the required properties must exist and match, otherwise 
{{NoMatchingTableFactoryException}} will be thrown.

We can pick the best candidate factory and print what is wrong, rendering its 
requiredContext and supportedProperties in different formats.

In this way, users can see which property from requiredContext is missing and 
which property is not allowed by supportedProperties.
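
A rough illustration of what such a report could look like; all class and method 
names below are hypothetical and do not refer to the existing TableFactoryService 
code.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class BestCandidateReporter {

    /** Describes why the best candidate factory did not match the user's properties. */
    static String describeMismatch(Map<String, String> requiredContext,
                                   List<String> supportedProperties,
                                   Map<String, String> userProperties) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> e : requiredContext.entrySet()) {
            if (!e.getValue().equals(userProperties.get(e.getKey()))) {
                missing.add(e.getKey() + "=" + e.getValue());
            }
        }
        List<String> unsupported = new ArrayList<>();
        for (String key : userProperties.keySet()) {
            if (!requiredContext.containsKey(key) && !supportedProperties.contains(key)) {
                unsupported.add(key);
            }
        }
        return "Best candidate mismatch:\n"
                + "  missing/mismatched required context: " + missing + "\n"
                + "  properties not in supportedProperties: " + unsupported;
    }
}
{code}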

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15231) Wrong HeapVector in AbstractHeapVector.createHeapColumn

2019-12-12 Thread Zhenghua Gao (Jira)
Zhenghua Gao created FLINK-15231:


 Summary: Wrong HeapVector in AbstractHeapVector.createHeapColumn
 Key: FLINK-15231
 URL: https://issues.apache.org/jira/browse/FLINK-15231
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.10.0
Reporter: Zhenghua Gao
 Fix For: 1.10.0


For TIMESTAMP WITHOUT TIME ZONE/TIMESTAMP WITH LOCAL TIME ZONE/DECIMAL types, 
AbstractHeapVector.createHeapColumn generates wrong HeapVectors.
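
For illustration, the kind of type-aware dispatch this asks for might look as 
follows; the enum and vector class names below are hypothetical placeholders, not 
the actual classes in the blink runtime.

{code:java}
enum LogicalKind { TIMESTAMP_WITHOUT_TIME_ZONE, TIMESTAMP_WITH_LOCAL_TIME_ZONE, DECIMAL, BIGINT }

interface ColumnVector {}
class TimestampColumnVector implements ColumnVector {}
class DecimalColumnVector implements ColumnVector {}
class LongColumnVector implements ColumnVector {}

class HeapColumnFactory {
    // Timestamps and decimals need their own (precision-aware) vectors instead of
    // falling through to a generic long vector.
    static ColumnVector createHeapColumn(LogicalKind kind) {
        switch (kind) {
            case TIMESTAMP_WITHOUT_TIME_ZONE:
            case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
                return new TimestampColumnVector();
            case DECIMAL:
                return new DecimalColumnVector();
            default:
                return new LongColumnVector();
        }
    }
}
{code}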



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP policy for introducing config option keys

2019-12-12 Thread Jark Wu
Hi all,

It seems that this discussion has gone idle, but I want to resume it because
I think this will make our community run faster and keep us on the safe
side.

I would like to summarize the discussions so far (please correct me if I'm
wrong):

1. we all agree to have a VOTE in mailing list for such small API changes
2. the voting period can be shorter (48h?)   -- Dawid, Jark, Timo
3. requiring a formal FLIP process is too heavy -- Zili Chen, Jark, Dawid,
Xingtong
 - FLIP number will explode
 - difficult to track major design changes
 - we can loosen some formal requirements for such changes
4. introduce a new kind of process -- Hequn, Jark, Xingtong

--

My proposal is to introduce a new process similar to what Xingtong pointed out.

1. For small API changes, we don't need a formal FLIP process.
2. The discussion can happen in JIRA issue, a [DISCUSS] thread is not
mandatory.
3. The JIRA issue should describe API changes clearly in the description.
4. Once the proposal is finalized call a [VOTE] to have the proposal
adopted.
   The vote requires 3 binding votes (committers) and stays open for at least 2 days.
   This is a new kind of voting action which should be added to the Flink Bylaws.

-

Further discussion:

1. We need to define **what is small API changes**.
2. Do we need a place (wiki?) to track all such proposals/JIRA issues and
how?


Best,
Jark

On Wed, 16 Oct 2019 at 17:56, Timo Walther  wrote:

> Hi all,
>
> I agree with Jark. Having a voting with at least 3 binding votes makes
> sense for API changes. It also forces people to question the
> introduction of another config option that might make the configuration
> of Flink more complicated. A FLIP is usually a bigger effort with long
> term impacts on the general usability. A shorter voting period of 48
> hours for just a little config option sounds reasonable to me.
>
> Regards,
> Timo
>
> On 16.10.19 10:36, Xintong Song wrote:
> > Hi all,
> >
> > I think config option changes deserves a voting process, but not
> > necessarily a FLIP.
> >
> > My concern for always having FLIPs on config option changes is that, we
> > might result in too many FLIPs, which makes it difficult for people who
> > wants to track the major design changes other than pure config option
> > changes.
> >
> >
> > My two cents,
> >
> > - For config option changes introduced by FLIPs, they need to be
> explicitly
> > described in the FLIP document and voted. We do have a section 'Public
> > Interface' for that in the FLIP template.
> >
> > - For config option changes introduced by other JIRA tickets / PRs, they
> > need to be voted in the ML. We can add a statement 'whether the PR
> > introduce any config option changes' in the PR template, and if the
> answer
> > is yes, a link to the ML vote thread is required to be attached before
> > getting the PR merged.
> >
> >
> > What do you think?
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Wed, Oct 16, 2019 at 2:50 PM Zili Chen  wrote:
> >
> >> Hi Jark & Hequn,
> >>
> >> Do you stick to introduce a looser FLIP? We possibly introduce a
> redundant
> >> extra type
> >> of community consensus if we are able to just reuse the process of
> current
> >> FLIP. Given
> >> the activity of our community I don't think it costs too much for a
> config
> >> option keys
> >> change with 3 days at least voting required >3 committer votes.
> >>
> >> Best,
> >> tison.
> >>
> >>
> >> Hequn Cheng  于2019年10月16日周三 下午2:29写道:
> >>
> >>> Hi all,
> >>>
> >>> +1 to have a looser FLIP policy for these API changes.
> >>>
> >>> I think the concerns raised above are all valid. Besides the feedbacks
> >> from
> >>> @Jark, if we want to keep track of these changes, maybe we can create a
> >> new
> >>> kind of FLIP that is dedicated to these minor API changes? For example,
> >> we
> >>> can add a single wiki page and list all related JIRAs in it. The design
> >>> details can be described in the JIRA.
> >>> Another option is to simply add a new JIRA label to track these
> changes.
> >>>
> >>> What do you think?
> >>>
> >>> Best, Hequn
> >>>
> >>> On Tue, Oct 15, 2019 at 8:43 PM Zili Chen 
> wrote:
> >>>
>  The naming concern above can be a separated issue since it looks also
>  affect FLIP-54 and isn't limited for config option changes FLIP.
> 
>  Best,
>  tison.
> 
> 
>  Aljoscha Krettek  于2019年10月15日周二 下午8:37写道:
> 
> > Another PR that introduces new config options:
> > https://github.com/apache/flink/pull/9759
> >
> >> On 15. Oct 2019, at 14:31, Zili Chen  wrote:
> >>
> >> Hi Aljoscha & Dawid & Kostas,
> >>
> >> I agree that changes on config option keys deserve a FLIP and it is
> >> reasonable
> >> we commit the changes with a standard FLIP process so that ensure
> >> the
> > change
> >> given proper visibility.
> >>
> >> My concern is about naming. Give

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

2019-12-12 Thread Rui Li
Hi Timo,

I understand we need further discussion about syntax/dialect for 1.11. But
as Jark has pointed out, the current implementation violates the accepted
design of FLIP-63, which IMO qualifies as a bug. Given that it's a bug and
has great impact on the usability of our Hive integration, do you think we
can fix it in 1.10?

On Fri, Dec 13, 2019 at 12:24 AM Jingsong Li  wrote:

> Hi Timo,
>
> I am OK if you think they are not bug and they should not be included in
> 1.10.
>
> I think they have been accepted in FLIP-63. And there is no objection. It
> has been more than three months since the discussion of FLIP-63. It's been
> six months since Flink added these two syntaxs.
>
> But I can also start discussion and vote thread for FLIP-63 again, to make
> sure once again that everyone is happy.
>
> Best,
> Jingsong Lee
>
> On Thu, Dec 12, 2019 at 11:35 PM Timo Walther  wrote:
>
> > Hi Jingsong,
> >
> > I will also add my opinion here for future discussions:
> >
> > We had long discussions around SQL syntax in the past (e.g. for
> > WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but in the
> > end all parties were happy and we came up with a good long-term solution
> > that is unlikely to be changed in the near future. IMHO we can differ
> > from the standard esp. for DDL statements (every SQL vendors has custom
> > syntax there) but still we should have time to hear many opinions.
> > However, for DQL and DML statements we need to be even more cautious
> > because here we enter SQL standard areas.
> >
> > Happy to discuss a partion syntax for 1.11 and go with the solution that
> > Jark proposed using a config option.
> >
> > Thanks,
> > Timo
> >
> >
> > On 12.12.19 09:40, Jark Wu wrote:
> > > Hi Jingsong,
> > >
> > > Thanks for the explanation, I think I misunderstood your point at the
> > > begining.
> > > As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax are
> > added
> > > to Flink's
> > > SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
> > > dialect.
> > > However, the current implementation is opposite, INSERT OVERWRITE
> > > and INSERT PARTITION are
> > > under the dialect limitation, but CREATE PARTITION TABLE is not.
> > >
> > > So it is indeed a bug which should be fixed in 1.10.
> > >
> > > Best,
> > > Jark
> > >
> > > On Thu, 12 Dec 2019 at 16:35, Jingsong Li 
> > wrote:
> > >
> > >> Hi Jark,
> > >>
> > >> Let's recall FLIP-63,
> > >> We supported these syntax in hive dialect at 1.9. All of my reasons
> for
> > >> launching FLIP-63 are to bring partition support to Flink itself.
> > >> Not only batch, but also we have the need to stream jobs to write
> > partition
> > >> files today, which is also one of our very important application
> > scenarios.
> > >>
> > >> The original intention of FLIP-63 is to bring all partition syntax to
> > >> Flink, but in the end you and I have some different opinion in
> creating
> > >> partition table, so our consensus is to leave it in hive dialect, only
> > it.
> > >>
> > >> [1]
> > >>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > >>
> > >> Best,
> > >> Jingsong Lee
> > >>
> > >> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu  wrote:
> > >>
> > >>> Hi jingsong,
> > >>>
> > >>> Watermark is not a standard syntax, that's why we had a FLIP and long
> > >>> discussion to add it to
> > >>> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
> > >>> PARTITION syntax to
> > >>>   Flink's own syntax,  we also need a FLIP or a VOTE, and this may
> > can't
> > >>> happen soon (we should
> > >>> hear more people's opinions on this).
> > >>>
> > >>> Regarding to the sql-dialect configuration, I was not saying to
> involve
> > >> the
> > >>> whole FLIP-89. I mean we can just
> > >>> start a VOTE to expose it as `table.planner.sql-dialect` and include
> it
> > >> in
> > >>> 1.10. The change can be very small, by
> > >>> adding a ConfigOption and changing the implementation of
> > >>> TableConfig#getSqlDialect/setSqlDialect. I believe
> > >>> it is smaller and safer than changing the parser.
> > >>>
> > >>> Btw, I cc'ed Yu Li and Gary into the discussion, because release
> > managers
> > >>> should be aware of this.
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>>
> > >>>
> > >>> On Thu, 12 Dec 2019 at 11:47, Danny Chan 
> wrote:
> > >>>
> >  Thanks Jingsong for bring up this discussion ~
> > 
> >  After reviewing FLIP-63, it seems that we have made a conclusion for
> > >> the
> >  syntax
> > 
> >  - INSERT OVERWRITE ...
> >  - INSERT INTO … PARTITION
> > 
> >  Which means that they should not have the Hive dialect limitation,
> so
> > >> I’m
> >  inclined that the behaviors for SQL-CLI is unexpected, or a “bug”
> that
> > >>> need
> >  to fix.
> > 
> >  We did not make a conclusion for the syntax:
> > 
> >  - CREATE TABLE … PARTITIONED BY ...
> > 
> >  Which means that the behavior of i

[jira] [Created] (FLINK-15230) flink1.9.1 table API JSON schema array type exception

2019-12-12 Thread kevin (Jira)
kevin created FLINK-15230:
-

 Summary: flink1.9.1 table API JSON schema array type exception
 Key: FLINK-15230
 URL: https://issues.apache.org/jira/browse/FLINK-15230
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
SQL / Planner
Affects Versions: 1.9.1, 1.9.0
 Environment: flink1.9.1
Reporter: kevin


strings: { type: 'array', items: { type: 'string' } }

.field("strings", Types.OBJECT_ARRAY(Types.STRING))

Exception in thread "main" org.apache.flink.table.api.ValidationException: Type 
LEGACY(BasicArrayTypeInfo) of table field 'strings' does not match with type 
BasicArrayTypeInfo of the field 'strings' of the TableSource return type.
    at org.apache.flink.table.planner.sources.TableSourceUtil$$anonfun$4.apply(TableSourceUtil.scala:121)
    at org.apache.flink.table.planner.sources.TableSourceUtil$$anonfun$4.apply(TableSourceUtil.scala:92)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
    at org.apache.flink.table.planner.sources.TableSourceUtil$.computeIndexMapping(TableSourceUtil.scala:92)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:100)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlanInternal(StreamExecTableSourceScan.scala:55)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecTableSourceScan.translateToPlan(StreamExecTableSourceScan.scala:55)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:86)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlanInternal(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecCalc.translateToPlan(StreamExecCalc.scala:46)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToTransformation(StreamExecSink.scala:185)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:154)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:50)
    at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:54)
    at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlan(StreamExecSink.scala:50)
    at org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:61)
    at org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:60)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.Iterator$class.foreach(Iterator.scala:891)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:60)
    at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:149)
    at org.apache.flink.table.api.java.internal.StreamTableEnvironmentImpl.toDataStream(StreamTableEnvironmentImpl.java:319)
    at org.apache.flink.table.api.java.internal.StreamTableEnvironmentImpl.toAppendStream(StreamTableEnvironmentImpl.java:227)
    at org.apache.flink.table.api.java.internal.StreamTableEnvironmentImpl.toAppendStream(StreamTableEnvironmentImpl.java:218)
    at com.wf.flink.sql.TableSqlJson.main(TableSqlJson.java:64)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.Nati

[jira] [Created] (FLINK-15229) DDL for kafka connector following documentation doesn't work

2019-12-12 Thread Bowen Li (Jira)
Bowen Li created FLINK-15229:


 Summary: DDL for kafka connector following documentation doesn't 
work
 Key: FLINK-15229
 URL: https://issues.apache.org/jira/browse/FLINK-15229
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka, Documentation
Reporter: Bowen Li
 Fix For: 1.10.0


https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connect.html#kafka-connector
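
For reference, a DDL of roughly the documented shape. The property keys below 
follow the 'connector.*' style of the linked page as remembered, so treat them as 
an assumption and verify them against the documentation.

{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class KafkaDdlExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

        // A Kafka source table declared via DDL, following the documented property style.
        tEnv.sqlUpdate(
                "CREATE TABLE user_behavior (\n"
                + "  user_id BIGINT,\n"
                + "  behavior STRING\n"
                + ") WITH (\n"
                + "  'connector.type' = 'kafka',\n"
                + "  'connector.version' = 'universal',\n"
                + "  'connector.topic' = 'user_behavior',\n"
                + "  'connector.properties.bootstrap.servers' = 'localhost:9092',\n"
                + "  'format.type' = 'json',\n"
                + "  'update-mode' = 'append'\n"
                + ")");
    }
}
{code}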



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15228) Drop vendor specific deployment documentation

2019-12-12 Thread Seth Wiesman (Jira)
Seth Wiesman created FLINK-15228:


 Summary: Drop vendor specific deployment documentation
 Key: FLINK-15228
 URL: https://issues.apache.org/jira/browse/FLINK-15228
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Seth Wiesman
 Fix For: 1.10.0, 1.11.0


Based on a mailing list discussion we want to drop vendor specific deployment 
documentation

ml discussion: 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Drop-vendor-specific-deployment-documentation-td35457.html



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15227) Fix broken documentation build

2019-12-12 Thread Seth Wiesman (Jira)
Seth Wiesman created FLINK-15227:


 Summary: Fix broken documentation build
 Key: FLINK-15227
 URL: https://issues.apache.org/jira/browse/FLINK-15227
 Project: Flink
  Issue Type: Bug
Reporter: Seth Wiesman
 Fix For: 1.10.0, 1.11.0


Incremental build: enabled
Generating...
Liquid Exception: Liquid syntax error (line 421): 'highlight' tag was never closed in dev/table/catalogs.zh.md
Liquid syntax error (line 421): 'highlight' tag was never closed
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-4.0.3/lib/liquid/block.rb:63:in `block in parse_body'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-c-4.0.0/lib/liquid/c.rb:30:in `block in parse'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-c-4.0.0/lib/liquid/c.rb:30:in `c_parse'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-c-4.0.0/lib/liquid/c.rb:30:in `parse'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-4.0.3/lib/liquid/block.rb:58:in `parse_body'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-4.0.3/lib/liquid/block.rb:12:in `parse'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-4.0.3/lib/liquid/tag.rb:10:in `parse'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-c-4.0.0/lib/liquid/c.rb:30:in `c_parse'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-c-4.0.0/lib/liquid/c.rb:30:in `parse'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-4.0.3/lib/liquid/document.rb:10:in `parse'
/Users/sjwiesman/flink/docs/.rubydeps/ruby/2.5.0/gems/liquid-4.0.3/lib/liquid/document.rb:5:in `parse'



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15226) Running job with application parameters fails on a job cluster

2019-12-12 Thread Biju Nair (Jira)
Biju Nair created FLINK-15226:
-

 Summary: Running job with application parameters fails on a job 
cluster
 Key: FLINK-15226
 URL: https://issues.apache.org/jira/browse/FLINK-15226
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task
Reporter: Biju Nair


Trying to move a job that takes application parameters from a session cluster to 
a job cluster fails with the following error in the task manager:


{noformat}
 2019-11-23 01:29:16,498 ERROR 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - Could not parse 
the command line options.
org.apache.flink.runtime.entrypoint.FlinkParseException: Failed to parse the 
command line arguments.
at 
org.apache.flink.runtime.entrypoint.parser.CommandLineParser.parse(CommandLineParser.java:52)
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.loadConfiguration(TaskManagerRunner.java:315)
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.main(TaskManagerRunner.java:284)
Caused by: org.apache.commons.cli.MissingOptionException: Missing required 
option: c
at 
org.apache.commons.cli.DefaultParser.checkRequiredOptions(DefaultParser.java:199)
at org.apache.commons.cli.DefaultParser.parse(DefaultParser.java:130)
at org.apache.commons.cli.DefaultParser.parse(DefaultParser.java:81)
at 
org.apache.flink.runtime.entrypoint.parser.CommandLineParser.parse(CommandLineParser.java:50)
... 2 more
usage: TaskManagerRunner -c  [-D ]
     -c,--configDir    Directory which contains the
                                                configuration file
                                                flink-conf.yml.
     -D                         use value for given property
Exception in thread "main" 
org.apache.flink.runtime.entrypoint.FlinkParseException: Failed to parse the 
command line arguments.
at 
org.apache.flink.runtime.entrypoint.parser.CommandLineParser.parse(CommandLineParser.java:52)
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.loadConfiguration(TaskManagerRunner.java:315)
at 
org.apache.flink.runtime.taskexecutor.TaskManagerRunner.main(TaskManagerRunner.java:284)
Caused by: org.apache.commons.cli.MissingOptionException: Missing required 
option: c
at 
org.apache.commons.cli.DefaultParser.checkRequiredOptions(DefaultParser.java:199)
at org.apache.commons.cli.DefaultParser.parse(DefaultParser.java:130)
at org.apache.commons.cli.DefaultParser.parse(DefaultParser.java:81)
at 
org.apache.flink.runtime.entrypoint.parser.CommandLineParser.parse(CommandLineParser.java:50)
... 2 more{noformat}
Looking at the code, this may be due to the {{configDir}} parameter being added at 
the end of the args in the {{taskmanager.sh}} script 
[here|https://github.com/apache/flink/blob/d43a1dd397dbc7c0559a07b83380fed164114241/flink-dist/src/main/flink-bin/bin/taskmanager.sh#L73].
 Based on the command-line parser 
[logic|https://github.com/apache/flink/blob/6258a4c333ce9dba914621b13eac2f7d91f5cb72/flink-runtime/src/main/java/org/apache/flink/runtime/entrypoint/parser/CommandLineParser.java#L50],
 parsing stops once a parameter without {{-D}} or {{--configDir/-c}} is 
encountered, which is the case here. If this diagnosis is correct, can the 
{{configDir}} parameter be added at the beginning instead of the end of the args 
list in the task manager shell script? Thanks for your input in advance.
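
A small standalone demo of why the ordering matters, using plain commons-cli 
(an assumption for illustration; Flink's CommandLineParser wraps it): with 
stopAtNonOption=true, parsing stops at the first token that is not a recognised 
option, so a {{-c}} placed after application arguments is never seen.

{code:java}
import org.apache.commons.cli.DefaultParser;
import org.apache.commons.cli.Option;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;

public class ArgOrderDemo {
    public static void main(String[] args) throws ParseException {
        Options options = new Options();
        options.addOption(Option.builder("c").longOpt("configDir").hasArg().required().build());

        DefaultParser parser = new DefaultParser();

        // Works: the required option comes before the application arguments.
        parser.parse(options, new String[] {"-c", "/opt/flink/conf", "appArg1"}, true);

        // Fails with MissingOptionException: parsing stops at "appArg1" before reaching -c.
        parser.parse(options, new String[] {"appArg1", "-c", "/opt/flink/conf"}, true);
    }
}
{code}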



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Improve documentation / tooling around security of Flink

2019-12-12 Thread Robert Metzger
Hi all,

There was recently a private report to the Flink PMC, as well as a public one
[1], about Flink's ability to execute arbitrary code. In scenarios where
Flink is accessible to somebody unauthorized, this can lead to issues.
The PMC received a similar report in November 2018.

I believe it would be good to warn our users a bit more prominently about
the risks of accidentally opening up Flink to the public internet, or other
unauthorized entities.

I have collected the following potential solutions discussed so far:

a) Add a check-security.sh script, or a check into the frontend if the
JobManager can be reached on the public internet
b) Add a prominent warning to the download page
c) add an opt-out warning to the Flink logs / UI that can be disabled via
the config.
d) Bind the REST endpoint to localhost only, by default
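
For option d), a minimal sketch assuming the existing rest.bind-address option,
shown via the Configuration API; the equivalent flink-conf.yaml entry would be a
single 'rest.bind-address: localhost' line.

import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.RestOptions;

public class LocalOnlyRestEndpoint {
    public static Configuration localhostOnly() {
        Configuration conf = new Configuration();
        // Only reachable from the machine itself unless the user changes it.
        conf.setString(RestOptions.BIND_ADDRESS, "localhost");
        return conf;
    }
}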


I'm curious to hear if others have other ideas on what to do.
I would personally like to kick things off with b).


Best,
Robert


[1] https://twitter.com/pyn3rd/status/1197397475897692160


Re: [DISCUSS] Overwrite and partition inserting support in 1.10

2019-12-12 Thread Jingsong Li
Hi Timo,

I am OK if you think they are not bugs and they should not be included in
1.10.

I think they have been accepted in FLIP-63. And there is no objection. It
has been more than three months since the discussion of FLIP-63. It's been
six months since Flink added these two syntaxes.

But I can also start discussion and vote thread for FLIP-63 again, to make
sure once again that everyone is happy.

Best,
Jingsong Lee

On Thu, Dec 12, 2019 at 11:35 PM Timo Walther  wrote:

> Hi Jingsong,
>
> I will also add my opinion here for future discussions:
>
> We had long discussions around SQL syntax in the past (e.g. for
> WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but in the
> end all parties were happy and we came up with a good long-term solution
> that is unlikely to be changed in the near future. IMHO we can differ
> from the standard esp. for DDL statements (every SQL vendors has custom
> syntax there) but still we should have time to hear many opinions.
> However, for DQL and DML statements we need to be even more cautious
> because here we enter SQL standard areas.
>
> Happy to discuss a partion syntax for 1.11 and go with the solution that
> Jark proposed using a config option.
>
> Thanks,
> Timo
>
>
> On 12.12.19 09:40, Jark Wu wrote:
> > Hi Jingsong,
> >
> > Thanks for the explanation, I think I misunderstood your point at the
> > begining.
> > As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax are
> added
> > to Flink's
> > SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
> > dialect.
> > However, the current implementation is opposite, INSERT OVERWRITE
> > and INSERT PARTITION are
> > under the dialect limitation, but CREATE PARTITION TABLE is not.
> >
> > So it is indeed a bug which should be fixed in 1.10.
> >
> > Best,
> > Jark
> >
> > On Thu, 12 Dec 2019 at 16:35, Jingsong Li 
> wrote:
> >
> >> Hi Jark,
> >>
> >> Let's recall FLIP-63,
> >> We supported these syntax in hive dialect at 1.9. All of my reasons for
> >> launching FLIP-63 are to bring partition support to Flink itself.
> >> Not only batch, but also we have the need to stream jobs to write
> partition
> >> files today, which is also one of our very important application
> scenarios.
> >>
> >> The original intention of FLIP-63 is to bring all partition syntax to
> >> Flink, but in the end you and I have some different opinion in creating
> >> partition table, so our consensus is to leave it in hive dialect, only
> it.
> >>
> >> [1]
> >>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >>
> >> Best,
> >> Jingsong Lee
> >>
> >> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu  wrote:
> >>
> >>> Hi jingsong,
> >>>
> >>> Watermark is not a standard syntax, that's why we had a FLIP and long
> >>> discussion to add it to
> >>> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
> >>> PARTITION syntax to
> >>>   Flink's own syntax,  we also need a FLIP or a VOTE, and this may
> can't
> >>> happen soon (we should
> >>> hear more people's opinions on this).
> >>>
> >>> Regarding to the sql-dialect configuration, I was not saying to involve
> >> the
> >>> whole FLIP-89. I mean we can just
> >>> start a VOTE to expose it as `table.planner.sql-dialect` and include it
> >> in
> >>> 1.10. The change can be very small, by
> >>> adding a ConfigOption and changing the implementation of
> >>> TableConfig#getSqlDialect/setSqlDialect. I believe
> >>> it is smaller and safer than changing the parser.
> >>>
> >>> Btw, I cc'ed Yu Li and Gary into the discussion, because release
> managers
> >>> should be aware of this.
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>>
> >>>
> >>> On Thu, 12 Dec 2019 at 11:47, Danny Chan  wrote:
> >>>
>  Thanks Jingsong for bring up this discussion ~
> 
>  After reviewing FLIP-63, it seems that we have made a conclusion for
> >> the
>  syntax
> 
>  - INSERT OVERWRITE ...
>  - INSERT INTO … PARTITION
> 
>  Which means that they should not have the Hive dialect limitation, so
> >> I’m
>  inclined that the behaviors for SQL-CLI is unexpected, or a “bug” that
> >>> need
>  to fix.
> 
>  We did not make a conclusion for the syntax:
> 
>  - CREATE TABLE … PARTITIONED BY ...
> 
>  Which means that the behavior of it is under-discussion, so it is okey
> >> to
>  be without the HIVE dialect limitation, we do not actually have any
> >> table
>  sources/sinks that support such a DDL so for current code base, users
>  should not be influenced by the behaviors change.
> 
>  So I’m
> 
>  +1 to remove the hive dialect limitations for INSERT OVERWRITE and
> >> INSERT
>  PARTITION
>  +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not finished
>  yet, we better do this until FLIP-89 is resolved.
> 
>  Best,
>  Danny Chan
>  在 2019年12月11日 +0800 PM5:29,Jingsong Li ,写道:
> > Hi Dev,
> >
> > After cutting 

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

2019-12-12 Thread Timo Walther

Hi Jingsong,

I will also add my opinion here for future discussions:

We had long discussions around SQL syntax in the past (e.g. for 
WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but in the 
end all parties were happy and we came up with a good long-term solution 
that is unlikely to be changed in the near future. IMHO we can differ 
from the standard esp. for DDL statements (every SQL vendor has custom 
syntax there) but still we should have time to hear many opinions. 
However, for DQL and DML statements we need to be even more cautious 
because here we enter SQL standard areas.


Happy to discuss a partition syntax for 1.11 and go with the solution that 
Jark proposed using a config option.


Thanks,
Timo


On 12.12.19 09:40, Jark Wu wrote:

Hi Jingsong,

Thanks for the explanation, I think I misunderstood your point at the
begining.
As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax are added
to Flink's
SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
dialect.
However, the current implementation is opposite, INSERT OVERWRITE
and INSERT PARTITION are
under the dialect limitation, but CREATE PARTITION TABLE is not.

So it is indeed a bug which should be fixed in 1.10.

Best,
Jark

On Thu, 12 Dec 2019 at 16:35, Jingsong Li  wrote:


Hi Jark,

Let's recall FLIP-63,
We supported these syntax in hive dialect at 1.9. All of my reasons for
launching FLIP-63 are to bring partition support to Flink itself.
Not only batch, but also we have the need to stream jobs to write partition
files today, which is also one of our very important application scenarios.

The original intention of FLIP-63 is to bring all partition syntax to
Flink, but in the end you and I have some different opinion in creating
partition table, so our consensus is to leave it in hive dialect, only it.

[1]

https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support

Best,
Jingsong Lee

On Thu, Dec 12, 2019 at 4:05 PM Jark Wu  wrote:


Hi jingsong,

Watermark is not a standard syntax, that's why we had a FLIP and long
discussion to add it to
Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
PARTITION syntax to
  Flink's own syntax,  we also need a FLIP or a VOTE, and this may can't
happen soon (we should
hear more people's opinions on this).

Regarding to the sql-dialect configuration, I was not saying to involve

the

whole FLIP-89. I mean we can just
start a VOTE to expose it as `table.planner.sql-dialect` and include it

in

1.10. The change can be very small, by
adding a ConfigOption and changing the implementation of
TableConfig#getSqlDialect/setSqlDialect. I believe
it is smaller and safer than changing the parser.

Btw, I cc'ed Yu Li and Gary into the discussion, because release managers
should be aware of this.

Best,
Jark



On Thu, 12 Dec 2019 at 11:47, Danny Chan  wrote:


Thanks Jingsong for bring up this discussion ~

After reviewing FLIP-63, it seems that we have made a conclusion for

the

syntax

- INSERT OVERWRITE ...
- INSERT INTO … PARTITION

Which means that they should not have the Hive dialect limitation, so

I’m

inclined that the behaviors for SQL-CLI is unexpected, or a “bug” that

need

to fix.

We did not make a conclusion for the syntax:

- CREATE TABLE … PARTITIONED BY ...

Which means that the behavior of it is under-discussion, so it is okey

to

be without the HIVE dialect limitation, we do not actually have any

table

sources/sinks that support such a DDL so for current code base, users
should not be influenced by the behaviors change.

So I’m

+1 to remove the hive dialect limitations for INSERT OVERWRITE and

INSERT

PARTITION
+0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not finished
yet, we better do this until FLIP-89 is resolved.

Best,
Danny Chan
在 2019年12月11日 +0800 PM5:29,Jingsong Li ,写道:

Hi Dev,

After cutting out the branch of 1.10, I tried the following functions

of

SQL-CLI and found that it does not support:
- insert overwrite
- PARTITION (partcol1=val1, partcol2=val2 ...)
The SQL pattern is:
INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION

(partcol1=val1,

partcol2=val2 ...) select_statement1 FROM from_statement;
It is a surprise to me.
The reason is that we only allow these two grammars in hive dialect.

And

SQL-CLI does not have an interface to switch dialects.

Because it directly hinders the SQL-CLI's insert syntax in hive

integration

and seriously hinders the practicability of SQL-CLI.
And we have introduced these two grammars in FLIP-63 [1] to Flink.
Here are my question:
1.Should we remove hive dialect limitation for these two grammars?
2.Should we fix this in 1.10?

[1]






https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support


Best,
Jingsong Lee







--
Best, Jingsong Lee







[jira] [Created] (FLINK-15225) LeaderChangeClusterComponentsTest#testReelectionOfDispatcher occasionally requires 30 seconds

2019-12-12 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-15225:


 Summary: 
LeaderChangeClusterComponentsTest#testReelectionOfDispatcher occasionally 
requires 30 seconds
 Key: FLINK-15225
 URL: https://issues.apache.org/jira/browse/FLINK-15225
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination, Tests
Affects Versions: 1.10.0
Reporter: Chesnay Schepler


{code:java}
20845 [flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl  - 
Starting the SlotManager.
20845 [mini-cluster-io-thread-1] INFO  
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess  - 
Recover all persisted job graphs.
20845 [mini-cluster-io-thread-1] INFO  
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess  - 
Successfully recovered 0 persisted job graphs.
20845 [flink-akka.actor.default-dispatcher-5] DEBUG 
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Trigger 
heartbeat request.
20845 [flink-akka.actor.default-dispatcher-5] DEBUG 
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Trigger 
heartbeat request.
20845 [flink-akka.actor.default-dispatcher-2] INFO  
org.apache.flink.runtime.highavailability.nonha.embedded.EmbeddedLeaderService  
- Received confirmation of leadership for leader 
akka://flink/user/resourcemanager9fa41f3a-7bd4-4f72-ae07-76b75499c88d , 
session=e73f31a7-c45f-4328-addd-3d7aa17fd083
20845 [mini-cluster-io-thread-1] INFO  
org.apache.flink.runtime.rpc.akka.AkkaRpcService  - Starting RPC endpoint for 
org.apache.flink.runtime.dispatcher.StandaloneDispatcher at 
akka://flink/user/dispatcherd5f8446f-b44d-4bcd-88b3-65105c1c6ca0 .
20845 [pool-1-thread-1] DEBUG org.apache.flink.runtime.rpc.akka.AkkaRpcService  
- Try to connect to remote RPC endpoint with address 
akka://flink/user/resourcemanager9fa41f3a-7bd4-4f72-ae07-76b75499c88d. 
Returning a org.apache.flink.runtime.resourcemanager.ResourceManagerGateway 
gateway.
20845 [flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor  - Connecting to 
ResourceManager 
akka://flink/user/resourcemanager9fa41f3a-7bd4-4f72-ae07-76b75499c88d(addd3d7aa17fd083e73f31a7c45f4328).
20845 [flink-akka.actor.default-dispatcher-5] DEBUG 
org.apache.flink.runtime.rpc.akka.AkkaRpcService  - Try to connect to remote 
RPC endpoint with address 
akka://flink/user/resourcemanager9fa41f3a-7bd4-4f72-ae07-76b75499c88d. 
Returning a org.apache.flink.runtime.resourcemanager.ResourceManagerGateway 
gateway.
20845 [pool-1-thread-1] DEBUG org.apache.flink.runtime.rpc.akka.AkkaRpcService  
- Try to connect to remote RPC endpoint with address 
akka://flink/user/resourcemanager9fa41f3a-7bd4-4f72-ae07-76b75499c88d. 
Returning a org.apache.flink.runtime.resourcemanager.ResourceManagerGateway 
gateway.
20846 [flink-akka.actor.default-dispatcher-2] INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor  - Resolved ResourceManager 
address, beginning registration
20846 [flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor  - Connecting to 
ResourceManager 
akka://flink/user/resourcemanager9fa41f3a-7bd4-4f72-ae07-76b75499c88d(addd3d7aa17fd083e73f31a7c45f4328).
20846 [flink-akka.actor.default-dispatcher-2] INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor  - Registration at 
ResourceManager attempt 1 (timeout=100ms)
20846 [flink-akka.actor.default-dispatcher-5] DEBUG 
org.apache.flink.runtime.rpc.akka.AkkaRpcService  - Try to connect to remote 
RPC endpoint with address 
akka://flink/user/resourcemanager9fa41f3a-7bd4-4f72-ae07-76b75499c88d. 
Returning a org.apache.flink.runtime.resourcemanager.ResourceManagerGateway 
gateway.
20846 [flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor  - Resolved ResourceManager 
address, beginning registration
20846 [flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor  - Registration at 
ResourceManager attempt 1 (timeout=100ms)
20846 [flink-akka.actor.default-dispatcher-3] DEBUG 
org.apache.flink.runtime.rpc.akka.AkkaRpcService  - Try to connect to remote 
RPC endpoint with address akka://flink/user/taskmanager_285. Returning a 
org.apache.flink.runtime.taskexecutor.TaskExecutorGateway gateway.
20846 [mini-cluster-io-thread-1] INFO  
org.apache.flink.runtime.highavailability.nonha.embedded.EmbeddedLeaderService  
- Received confirmation of leadership for leader 
akka://flink/user/dispatcherd5f8446f-b44d-4bcd-88b3-65105c1c6ca0 , 
session=d20872e6-0b96-43ea-9da2-7590a255e906
20846 [pool-1-thread-1] DEBUG org.apache.flink.runtime.rpc.akka.AkkaRpcService  
- Try to connect to remote RPC endpoint with address 
akka://flink/user/dispatcherd5f8446f-b44d-4bcd-88b3-65105c1c6ca0. Returning a 
org.apache.flink.runtime.dispatcher.Dispa

[jira] [Created] (FLINK-15224) Resource requirements are not respected when fulfilling a slot request with unresolvedRootSlots from a SlotSharingManager

2019-12-12 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-15224:
---

 Summary: Resource requirements are not respected when fulfilling a 
slot request with unresolvedRootSlots from a SlotSharingManager
 Key: FLINK-15224
 URL: https://issues.apache.org/jira/browse/FLINK-15224
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.11.0
Reporter: Zhu Zhu


In {{SchedulerImpl#allocateMultiTaskSlot}}, if a slot request cannot be fulfilled 
immediately with a resolved root slot (a MultiTaskSlot that is fulfilled by an 
allocated slot) or with available slots, it will be assigned to a random unresolved 
root slot. No resource requirement check is done in this case, so a large task slot 
can be assigned to a small shared slot (unresolved root slot). When the shared slot 
receives its physical slot offer, the oversubscription will be detected, the slot 
will be released, and the related tasks will fail.
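
An illustrative sketch of the kind of guard this suggests, using hypothetical 
types rather than the real SchedulerImpl/ResourceProfile code.

{code:java}
class Resources {
    final double cpu;
    final int memoryMb;
    Resources(double cpu, int memoryMb) { this.cpu = cpu; this.memoryMb = memoryMb; }
    Resources plus(Resources o) { return new Resources(cpu + o.cpu, memoryMb + o.memoryMb); }
    boolean fitsInto(Resources capacity) { return cpu <= capacity.cpu && memoryMb <= capacity.memoryMb; }
}

class SlotSharingResourceCheck {
    /** True if adding 'requested' to what the shared slot already hosts stays within its capacity. */
    static boolean canAssignToUnresolvedRootSlot(Resources sharedSlotCapacity,
                                                 Resources alreadyAllocated,
                                                 Resources requested) {
        return alreadyAllocated.plus(requested).fitsInto(sharedSlotCapacity);
    }
}
{code}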



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15223) Csv connector should unescape delimiter parameter character

2019-12-12 Thread Jark Wu (Jira)
Jark Wu created FLINK-15223:
---

 Summary: Csv connector should unescape delimiter parameter 
character
 Key: FLINK-15223
 URL: https://issues.apache.org/jira/browse/FLINK-15223
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Reporter: Jark Wu
 Fix For: 1.10.0


As described in the documentation [1], the CSV format can use 
{{'format.line-delimiter' = '\n'}} to specify the line delimiter. However, the 
property value is parsed as the two characters "\" and "n", which makes reading 
fail. There is no workaround for now other than fixing it. The delimiter should be 
unescaped, e.g. using {{StringEscapeUtils.unescapeJava}}. Note that both the old 
and the new CSV format have the same problem, for both {{field-delimiter}} and 
{{line-delimiter}}.

[1]: 
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connect.html#old-csv-format
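
A minimal demonstration of the proposed unescaping, assuming commons-lang3's 
StringEscapeUtils mentioned above:

{code:java}
import org.apache.commons.lang3.StringEscapeUtils;

public class DelimiterUnescapeDemo {
    public static void main(String[] args) {
        String rawProperty = "\\n";                         // what the properties map contains: '\' + 'n'
        String delimiter = StringEscapeUtils.unescapeJava(rawProperty);
        System.out.println(delimiter.equals("\n"));         // true: a single newline character
        System.out.println(rawProperty.length() + " -> " + delimiter.length()); // 2 -> 1
    }
}
{code}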



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15222) Move state benchmark utils into core repository

2019-12-12 Thread Yu Li (Jira)
Yu Li created FLINK-15222:
-

 Summary: Move state benchmark utils into core repository
 Key: FLINK-15222
 URL: https://issues.apache.org/jira/browse/FLINK-15222
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Reporter: Yu Li


Currently we're maintaining the state benchmark utils in the flink-benchmark 
repository instead of the core repository, which not only makes it hard to find 
compatibility issues when state backend code is refactored (and causes problems 
like FLINK-15199), but also disobeys the instructions of the flink-benchmark 
project:
{quote}
Recommended code structure is to define all benchmarks in Apache Flink and only 
wrap them here, in this repository, into executor classes.

Such code structure is due to using the GPL2 licensed jmh library for the actual 
execution of the benchmarks. Ideally we would prefer to have all of the code 
moved to Apache Flink.
{quote}

We will improve this and prevent future incompatibility problems in this JIRA.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [RESULT] [VOTE] Release 1.8.3, release candidate #3

2019-12-12 Thread Patrick Lucas
The upstream pull request[1] is open—usually they merge within a day and
the images are available shortly thereafter.

[1] https://github.com/docker-library/official-images/pull/7112

--
Patrick Lucas

On Thu, Dec 12, 2019 at 8:11 AM Hequn Cheng  wrote:

> Hi Patrick,
>
> The release has been announced.
>
> +1 to integrate the publication of Docker images into the Flink release
> process. Thus we can leverage the current release procedure for the docker
> image.
> Looking forward to the proposal.
>
> Best, Hequn
>
> On Thu, Dec 12, 2019 at 1:52 PM Yang Wang  wrote:
>
>> Hi Lucas,
>>
>> That's great if we could integrate the publication of Flink official
>> docker
>> images into
>> the Flink release process. Since many users are using or starting to use
>> Flink in
>> container environments.
>>
>>
>> Best,
>> Yang
>>
>> Patrick Lucas  于2019年12月11日周三 下午11:44写道:
>>
>> > Thanks, Hequn!
>> >
>> > The Dockerfiles for the Flink images on Docker Hub for the 1.8.3 release
>> > are prepared[1] and I'll open a pull request upstream[2] once the
>> release
>> > announcement has gone out.
>> >
>> > And stay tuned: I'm working on a proposal for integrating the
>> publication
>> > of these Docker images into the Flink release process and will send it
>> out
>> > to the dev list before or shortly after the holidays.
>> >
>> > [1]
>> >
>> >
>> https://github.com/docker-flink/docker-flink/commit/4d85b71b7cf9fa4a38f8682ed13aa0f55445e32e
>> > [2] https://github.com/docker-library/official-images/
>> >
>> > On Wed, Dec 11, 2019 at 3:30 AM Hequn Cheng 
>> wrote:
>> >
>> > > Hi everyone,
>> > >
>> > > I'm happy to announce that we have unanimously approved this release.
>> > >
>> > > There are 10 approving votes, 3 of which are binding:
>> > > * Jincheng (binding)
>> > > * Ufuk (binding)
>> > > * Till (binding)
>> > > * Jingsong
>> > > * Fabian
>> > > * Danny
>> > > * Yang Wang
>> > > * Dian
>> > > * Wei
>> > > * Hequn
>> > >
>> > > There are no disapproving votes.
>> > >
>> > > Thanks everyone!
>> > >
>> >
>>
>


[jira] [Created] (FLINK-15221) supporting Exactly-once for table APi

2019-12-12 Thread chaiyongqiang (Jira)
chaiyongqiang created FLINK-15221:
-

 Summary: supporting Exactly-once for table APi
 Key: FLINK-15221
 URL: https://issues.apache.org/jira/browse/FLINK-15221
 Project: Flink
  Issue Type: Wish
  Components: Table SQL / API
Affects Versions: 1.9.0
Reporter: chaiyongqiang


The Table API doesn't support end-to-end exactly-once semantics like the 
DataStream API does. Does Flink have a plan for this?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Adding e2e tests for Flink's Mesos integration

2019-12-12 Thread Gary Yao
Thanks for your explanation. I think the proposal is reasonable.

On Thu, Dec 12, 2019 at 3:32 AM Yangze Guo  wrote:

> Thanks for the feedback, Gary.
>
> Regarding the WordCount test:
> - True. There is no test coverage increment compared to others.
> However, I think each test case better not have multiple purposes so
> that we could find out the root cause quickly. As discussed in
> FLINK-15135[1], I prefer only including WordCount test as the first
> step. If the time overhead of E2E tests become severe in the future, I
> agree to remove it. WDYT?
> - I think the main overhead comes from building the image. The
> subsequent tests will run fast since they will not build it again.
>
> Regarding the Rocks test, I think it is a typical scenario using
> off-heap memory. The main purpose is to verify the memory usage and
> memory configuration in Mesos mode. Two typical use cases are off-heap
> and on-heap. Thus, I think the following two test cases are valuable
> to be included:
> - A streaming task using heap backend. It should explicitly set the
> “taskmanager.memory.managed.size” to zero to check the potential
> unexpected usage of off-heap memory.
> - A streaming task using rocks backend. It covers the scenario using
> off-heap memory.
>
> Look forward to your kind feedback.
>
> [1]https://issues.apache.org/jira/browse/FLINK-15135
>
> Best,
> Yangze Guo
>
>
>
> On Wed, Dec 11, 2019 at 6:14 PM Gary Yao  wrote:
> >
> > Thanks for driving this effort. Also +1 from my side. I have left a few
> > questions below.
> >
> > > - Wordcount end-to-end test. For verifying the basic process of Mesos
> > > deployment.
> >
> > Would this add additional test coverage compared to the
> > "multiple submissions" test case? I am asking because the E2E tests are
> > already
> > expensive to run, and adding new tests should be carefully considered.
> >
> > > - State TTL RocksDb backend end-to-end test. For verifying memory
> > > configuration behaviors, since Mesos has it’s own config options and
> > > logics.
> >
> > Can you elaborate more on this? Which config options are relevant here?
> >
> > On Wed, Dec 11, 2019 at 9:58 AM Till Rohrmann 
> wrote:
> >
> > > +1 for building the image locally. If need should arise, then we could
> > > change it always later.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Dec 11, 2019 at 4:05 AM Xintong Song 
> > > wrote:
> > >
> > > > Thanks, Yangtze.
> > > >
> > > > +1 for building the image locally.
> > > > The time consumption for both building the image locally and pulling it
> > > > from DockerHub sounds reasonable and affordable. Therefore, I'm also in
> > > > favor of avoiding the cost of maintaining a custom image.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Wed, Dec 11, 2019 at 10:11 AM Yangze Guo 
> wrote:
> > > >
> > > > > Thanks for the feedback, Yang.
> > > > >
> > > > > Some updates I want to share in this thread.
> > > > > I have built a PoC version of the Mesos e2e test with a WordCount
> > > > > workflow.[1] Then, I ran it in the testing environment. As the results
> > > > > here[2] show:
> > > > > - Pulling the image from DockerHub took 1 minute and 21 seconds.
> > > > > - Building it locally took 2 minutes and 54 seconds.
> > > > >
> > > > > I prefer building it locally. Although it is slower, I think the time
> > > > > overhead of building or pulling the image, compared with the cost of
> > > > > maintaining the image in DockerHub and with the whole test process, is
> > > > > trivial.
> > > > >
> > > > > I look forward to hearing from you. ;)
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > [1]
> > > > > https://github.com/KarmaGYZ/flink/commit/0406d942446a1b17f81d93235b21a829bf88ccf0
> > > > > [2]https://travis-ci.org/KarmaGYZ/flink/jobs/623207957
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Mon, Dec 9, 2019 at 2:39 PM Yang Wang 
> > > wrote:
> > > > > >
> > > > > > Thanks Yangze for starting this discussion.
> > > > > >
> > > > > > Just share my thoughts.
> > > > > >
> > > > > > If the official Mesos docker image could not meet our requirement, I
> > > > > > suggest building the image locally.
> > > > > > We have done the same thing for the YARN e2e tests. This way is more
> > > > > > flexible and easier to maintain. However, I have no idea how long
> > > > > > building the Mesos image locally will take. Based on previous
> > > > > > experience with YARN, I think it may not take too much time.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Yang
> > > > > >
> > > > > > Yangze Guo  wrote on Sat, Dec 7, 2019 at 4:25 PM:
> > > > > >>
> > > > > >> Thanks for your feedback!
> > > > > >>
> > > > > >> @Till
> > > > > >> Regarding the time overhead, I think it mainly comes from the
> > > > > >> network transmission. Building the image locally will download
> > > > > >> about 260 MB of files in total, including the base image and
> > > > > >> packages. For pulling from
> 

[jira] [Created] (FLINK-15220) Add startFromTimestamp in KafkaTableSource

2019-12-12 Thread Paul Lin (Jira)
Paul Lin created FLINK-15220:


 Summary: Add startFromTimestamp in KafkaTableSource
 Key: FLINK-15220
 URL: https://issues.apache.org/jira/browse/FLINK-15220
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Kafka
Reporter: Paul Lin


KafkaTableSource supports all startup modes available in the DataStream API
except `startFromTimestamp`, even though it is a common and valid use case.
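For reference, the startup mode that is missing on the table side is the one
below from the DataStream API; the topic, properties and timestamp are
placeholders.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class StartFromTimestampSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder

            FlinkKafkaConsumer<String> consumer =
                    new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), props);

            // DataStream API only today: start from the first offsets whose record
            // timestamps are >= the given epoch milliseconds.
            consumer.setStartFromTimestamp(1576108800000L); // placeholder timestamp

            env.addSource(consumer).print();
            env.execute("startFromTimestamp sketch");
        }
    }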



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15219) LocalEnvironment is not initializing plugins

2019-12-12 Thread Arvid Heise (Jira)
Arvid Heise created FLINK-15219:
---

 Summary: LocalEnvironment is not initializing plugins
 Key: FLINK-15219
 URL: https://issues.apache.org/jira/browse/FLINK-15219
 Project: Flink
  Issue Type: Improvement
Reporter: Arvid Heise






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-12 Thread tison
A quick idea is that we separate the deployment from the user program, so
that it is always done outside the program. When the user program executes,
there is always a ClusterClient that communicates with an existing cluster,
remote or local. That would be another thread of discussion, so this is
just for your information.
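To make the constraint concrete: a user program today looks roughly like the
sketch below, and deployment only happens inside env.execute(), i.e. in whatever
process runs the main method; the class and job names here are just placeholders.

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ClientSideProgram {
        public static void main(String[] args) throws Exception {
            // Assembled in the JVM that runs main(), i.e. the client today.
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.fromElements(1, 2, 3)
               .map(i -> i * 2).returns(Integer.class)
               .print();

            // Only this call hands the pipeline to an Executor / ClusterClient for
            // deployment, which is why the program is currently tied to the client side.
            env.execute("client-side compilation sketch");
        }
    }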

Best,
tison.


tison  wrote on Thu, Dec 12, 2019 at 4:40 PM:

> Hi Peter,
>
> Another concern I realized recently is that, with the current Executors
> abstraction (FLIP-73), the user program is designed to ALWAYS run on the
> client side. Specifically, we deploy the job in the executor when
> env.execute is called. This abstraction possibly prevents Flink from
> running the user program on the cluster side.
>
> For your proposal, in this case we have already compiled the program and
> run it on the client side, so even if we deploy a cluster and retrieve
> the job graph from the program metadata, it doesn't make much sense.
>
> cc Aljoscha & Kostas what do you think about this constraint?
>
> Best,
> tison.
>
>
> Peter Huang  wrote on Tue, Dec 10, 2019 at 12:45 PM:
>
>> Hi Tison,
>>
>> Yes, you are right. I think I made the wrong argument in the doc.
>> Basically, the jar packaging problem only affects platform users. In our
>> internal deploy service, we further optimized the deployment latency by
>> letting users package flink-runtime together with the uber jar, so that
>> we don't need to consider supporting multiple Flink versions for now. In
>> session client mode, since the Flink libs are shipped anyway as local
>> YARN resources, users don't actually need to package those libs into the
>> job jar.
>>
>>
>>
>> Best Regards
>> Peter Huang
>>
>> On Mon, Dec 9, 2019 at 8:35 PM tison  wrote:
>>
>> > > 3. What do you mean by the packaging? Do users need to compile their
>> > > jars including flink-clients, flink-optimizer, and flink-table code?
>> >
>> > The answer should be no because they exist in system classpath.
>> >
>> > Best,
>> > tison.
>> >
>> >
>> > Yang Wang  wrote on Tue, Dec 10, 2019 at 12:18 PM:
>> >
>> > > Hi Peter,
>> > >
>> > > Thanks a lot for starting this discussion. I think this is a very
>> > > useful feature.
>> > >
>> > > Not only for YARN: I am focused on the Flink on Kubernetes integration
>> > > and came across the same problem. I do not want the job graph generated
>> > > on the client side. Instead, the user jars are built into a user-defined
>> > > image. When the job manager is launched, we just need to generate the
>> > > job graph based on the local user jars.
>> > >
>> > > I have some small suggestions about this.
>> > >
>> > > 1. `ProgramJobGraphRetriever` is very similar to
>> > > `ClasspathJobGraphRetriever`; the difference is that the former needs
>> > > `ProgramMetadata` while the latter needs some arguments. Is it possible
>> > > to have a unified `JobGraphRetriever` to support both?
>> > > 2. Is it possible to not use a local user jar to start a per-job
>> > > cluster? In your case, the user jars already exist on HDFS, and we still
>> > > need to download the jars to the deployer service. Currently, we always
>> > > need a local user jar to start a Flink cluster. It would be great if we
>> > > could support remote user jars.
>> > > >> In the implementation, we assume users package flink-clients,
>> > > flink-optimizer, flink-table together within the job jar. Otherwise, the
>> > > job graph generation within JobClusterEntryPoint will fail.
>> > > 3. What do you mean by the packaging? Do users need to compile their
>> > > jars including flink-clients, flink-optimizer, and flink-table code?
>> > >
>> > >
>> > >
>> > > Best,
>> > > Yang
>> > >
>> > > Peter Huang  wrote on Tue, Dec 10, 2019 at 2:37 AM:
>> > >
>> > > > Dear All,
>> > > >
>> > > > Recently, the Flink community has started improving the YARN cluster
>> > > > descriptor to make the job jar and config files configurable from the
>> > > > CLI. It improves the flexibility of Flink deployment in YARN per-job
>> > > > mode. For platform users who manage tens or hundreds of streaming
>> > > > pipelines for a whole org or company, we found that the job graph
>> > > > generation on the client side is another pain point. Thus, we want to
>> > > > propose a configurable feature for FlinkYarnSessionCli. The feature
>> > > > allows users to move the job graph generation into the Flink
>> > > > ClusterEntryPoint, so that the job jar doesn't need to be available
>> > > > locally for the job graph generation. The proposal is organized as a
>> > > > FLIP:
>> > > >
>> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Delayed+JobGraph+Generation
>> > > >
>> > > > Any questions and suggestions are welcomed. Thank you in advance.
>> > > >
>> > > >
>> > > > Best Regards
>> > > > Peter Huang
>> > > >
>> > >
>> >
>>
>


Re: [DISCUSS] Overwrite and partition inserting support in 1.10

2019-12-12 Thread Jark Wu
Hi Jingsong,

Thanks for the explanation, I think I misunderstood your point at the
beginning.
As FLIP-63 proposed, the INSERT OVERWRITE and INSERT PARTITION syntax is
added to Flink's SQL syntax, but CREATE PARTITION TABLE should be limited
to the Hive dialect.
However, the current implementation is the opposite: INSERT OVERWRITE and
INSERT PARTITION are under the dialect limitation, but CREATE PARTITION
TABLE is not.

So it is indeed a bug which should be fixed in 1.10.
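To make the two statements concrete, a rough, untested sketch of how they would
be submitted with the 1.10 API is below; the table names and the blink-planner
batch settings are assumptions, and whether this parses without switching to the
Hive dialect is exactly the bug described above.

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class PartitionInsertSketch {
        public static void main(String[] args) throws Exception {
            TableEnvironment tEnv = TableEnvironment.create(
                    EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build());

            // `sales` / `staging_sales` are placeholders; `sales` is assumed to be a
            // partitioned sink registered in the current catalog.
            tEnv.sqlUpdate(
                    "INSERT OVERWRITE sales PARTITION (dt='2019-12-12') "
                            + "SELECT item, amount FROM staging_sales");
            tEnv.sqlUpdate(
                    "INSERT INTO sales PARTITION (dt='2019-12-12') "
                            + "SELECT item, amount FROM staging_sales");

            tEnv.execute("insert overwrite / partition sketch");
        }
    }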

Best,
Jark

On Thu, 12 Dec 2019 at 16:35, Jingsong Li  wrote:

> Hi Jark,
>
> Let's recall FLIP-63:
> We supported this syntax in the Hive dialect in 1.9. My whole reason for
> launching FLIP-63 was to bring partition support to Flink itself.
> Not only for batch: we also need streaming jobs to write partitioned
> files today, which is one of our very important application scenarios.
>
> The original intention of FLIP-63 was to bring all partition syntax to
> Flink, but in the end you and I had different opinions on creating
> partitioned tables, so our consensus was to leave only that part in the
> Hive dialect.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>
> Best,
> Jingsong Lee
>
> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu  wrote:
>
> > Hi Jingsong,
> >
> > Watermark is not a standard syntax; that's why we had a FLIP and a long
> > discussion to add it to Flink's SQL syntax. I think if we want to add the
> > INSERT OVERWRITE and PARTITION syntax to Flink's own syntax, we also need
> > a FLIP or a VOTE, and this may not happen soon (we should hear more
> > people's opinions on this).
> >
> > Regarding the sql-dialect configuration, I was not saying to involve the
> > whole FLIP-89. I mean we can just start a VOTE to expose it as
> > `table.planner.sql-dialect` and include it in 1.10. The change can be
> > very small, by adding a ConfigOption and changing the implementation of
> > TableConfig#getSqlDialect/setSqlDialect. I believe it is smaller and
> > safer than changing the parser.
> >
> > Btw, I cc'ed Yu Li and Gary into the discussion, because release managers
> > should be aware of this.
> >
> > Best,
> > Jark
> >
> >
> >
> > On Thu, 12 Dec 2019 at 11:47, Danny Chan  wrote:
> >
> > > Thanks Jingsong for bringing up this discussion ~
> > >
> > > After reviewing FLIP-63, it seems that we have made a conclusion for
> the
> > > syntax
> > >
> > > - INSERT OVERWRITE ...
> > > - INSERT INTO … PARTITION
> > >
> > > Which means that they should not have the Hive dialect limitation, so
> > > I'm inclined to think the behavior of the SQL-CLI is unexpected, i.e.,
> > > a "bug" that needs to be fixed.
> > >
> > > We did not make a conclusion for the syntax:
> > >
> > > - CREATE TABLE … PARTITIONED BY ...
> > >
> > > Which means that its behavior is under discussion, so it is okay to
> > > leave it without the Hive dialect limitation; we do not actually have
> > > any table sources/sinks that support such a DDL, so for the current
> > > code base users should not be influenced by the behavior change.
> > >
> > > So I’m
> > >
> > > +1 to remove the Hive dialect limitations for INSERT OVERWRITE and
> > > INSERT PARTITION
> > > +0 to add a yaml dialect conf to the SQL-CLI, because FLIP-89 is not
> > > finished yet; we had better wait until FLIP-89 is resolved to do this.
> > >
> > > Best,
> > > Danny Chan
> > > On Dec 11, 2019 at 5:29 PM +0800, Jingsong Li  wrote:
> > > > Hi Dev,
> > > >
> > > > After cutting the 1.10 branch, I tried the following functions of the
> > > > SQL-CLI and found that it does not support:
> > > > - insert overwrite
> > > > - PARTITION (partcol1=val1, partcol2=val2 ...)
> > > > The SQL pattern is:
> > > > INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
> > > > partcol2=val2 ...)] select_statement1 FROM from_statement;
> > > > It was a surprise to me.
> > > > The reason is that we only allow these two grammars in the Hive dialect,
> > > > and the SQL-CLI does not have an interface to switch dialects.
> > > >
> > > > This directly hinders the SQL-CLI's insert syntax in the Hive
> > > > integration and seriously hinders the practicability of the SQL-CLI.
> > > > And we have introduced these two grammars in FLIP-63 [1] to Flink.
> > > > Here are my questions:
> > > > 1. Should we remove the Hive dialect limitation for these two grammars?
> > > > 2. Should we fix this in 1.10?
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > > >
> > > > Best,
> > > > Jingsong Lee
> > >
> >
>
>
> --
> Best, Jingsong Lee
>


Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

2019-12-12 Thread tison
Hi Peter,

Another concern I realized recently is that, with the current Executors
abstraction (FLIP-73), the user program is designed to ALWAYS run on the
client side. Specifically, we deploy the job in the executor when
env.execute is called. This abstraction possibly prevents Flink from
running the user program on the cluster side.

For your proposal, in this case we have already compiled the program and
run it on the client side, so even if we deploy a cluster and retrieve
the job graph from the program metadata, it doesn't make much sense.

cc Aljoscha & Kostas what do you think about this constraint?

Best,
tison.


Peter Huang  wrote on Tue, Dec 10, 2019 at 12:45 PM:

> Hi Tison,
>
> Yes, you are right. I think I made the wrong argument in the doc.
> Basically, the jar packaging problem only affects platform users. In our
> internal deploy service, we further optimized the deployment latency by
> letting users package flink-runtime together with the uber jar, so that
> we don't need to consider supporting multiple Flink versions for now. In
> session client mode, since the Flink libs are shipped anyway as local
> YARN resources, users don't actually need to package those libs into the
> job jar.
>
>
>
> Best Regards
> Peter Huang
>
> On Mon, Dec 9, 2019 at 8:35 PM tison  wrote:
>
> > > 3. What do you mean by the packaging? Do users need to compile their
> > > jars including flink-clients, flink-optimizer, and flink-table code?
> >
> > The answer should be no because they exist in system classpath.
> >
> > Best,
> > tison.
> >
> >
> > Yang Wang  wrote on Tue, Dec 10, 2019 at 12:18 PM:
> >
> > > Hi Peter,
> > >
> > > Thanks a lot for starting this discussion. I think this is a very
> > > useful feature.
> > >
> > > Not only for YARN: I am focused on the Flink on Kubernetes integration
> > > and came across the same problem. I do not want the job graph generated
> > > on the client side. Instead, the user jars are built into a user-defined
> > > image. When the job manager is launched, we just need to generate the
> > > job graph based on the local user jars.
> > >
> > > I have some small suggestions about this.
> > >
> > > 1. `ProgramJobGraphRetriever` is very similar to
> > > `ClasspathJobGraphRetriever`; the difference is that the former needs
> > > `ProgramMetadata` while the latter needs some arguments. Is it possible
> > > to have a unified `JobGraphRetriever` to support both?
> > > 2. Is it possible to not use a local user jar to start a per-job
> > > cluster? In your case, the user jars already exist on HDFS, and we still
> > > need to download the jars to the deployer service. Currently, we always
> > > need a local user jar to start a Flink cluster. It would be great if we
> > > could support remote user jars.
> > > >> In the implementation, we assume users package flink-clients,
> > > flink-optimizer, flink-table together within the job jar. Otherwise, the
> > > job graph generation within JobClusterEntryPoint will fail.
> > > 3. What do you mean by the packaging? Do users need to compile their
> > > jars including flink-clients, flink-optimizer, and flink-table code?
> > >
> > >
> > >
> > > Best,
> > > Yang
> > >
> > > Peter Huang  wrote on Tue, Dec 10, 2019 at 2:37 AM:
> > >
> > > > Dear All,
> > > >
> > > > Recently, the Flink community has started improving the YARN cluster
> > > > descriptor to make the job jar and config files configurable from the
> > > > CLI. It improves the flexibility of Flink deployment in YARN per-job
> > > > mode. For platform users who manage tens or hundreds of streaming
> > > > pipelines for a whole org or company, we found that the job graph
> > > > generation on the client side is another pain point. Thus, we want to
> > > > propose a configurable feature for FlinkYarnSessionCli. The feature
> > > > allows users to move the job graph generation into the Flink
> > > > ClusterEntryPoint, so that the job jar doesn't need to be available
> > > > locally for the job graph generation. The proposal is organized as a
> > > > FLIP:
> > > >
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Delayed+JobGraph+Generation
> > > >
> > > > Any questions and suggestions are welcomed. Thank you in advance.
> > > >
> > > >
> > > > Best Regards
> > > > Peter Huang
> > > >
> > >
> >
>


Re: [DISCUSS] Overwrite and partition inserting support in 1.10

2019-12-12 Thread Jingsong Li
Hi Jark,

Let's recall FLIP-63:
We supported this syntax in the Hive dialect in 1.9. My whole reason for
launching FLIP-63 was to bring partition support to Flink itself.
Not only for batch: we also need streaming jobs to write partitioned
files today, which is one of our very important application scenarios.

The original intention of FLIP-63 was to bring all partition syntax to
Flink, but in the end you and I had different opinions on creating
partitioned tables, so our consensus was to leave only that part in the
Hive dialect.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support

Best,
Jingsong Lee

On Thu, Dec 12, 2019 at 4:05 PM Jark Wu  wrote:

> Hi Jingsong,
>
> Watermark is not a standard syntax; that's why we had a FLIP and a long
> discussion to add it to Flink's SQL syntax. I think if we want to add the
> INSERT OVERWRITE and PARTITION syntax to Flink's own syntax, we also need
> a FLIP or a VOTE, and this may not happen soon (we should hear more
> people's opinions on this).
>
> Regarding the sql-dialect configuration, I was not saying to involve the
> whole FLIP-89. I mean we can just start a VOTE to expose it as
> `table.planner.sql-dialect` and include it in 1.10. The change can be
> very small, by adding a ConfigOption and changing the implementation of
> TableConfig#getSqlDialect/setSqlDialect. I believe it is smaller and
> safer than changing the parser.
>
> Btw, I cc'ed Yu Li and Gary into the discussion, because release managers
> should be aware of this.
>
> Best,
> Jark
>
>
>
> On Thu, 12 Dec 2019 at 11:47, Danny Chan  wrote:
>
> > Thanks Jingsong for bringing up this discussion ~
> >
> > After reviewing FLIP-63, it seems that we have made a conclusion for the
> > syntax
> >
> > - INSERT OVERWRITE ...
> > - INSERT INTO … PARTITION
> >
> > Which means that they should not have the Hive dialect limitation, so I'm
> > inclined to think the behavior of the SQL-CLI is unexpected, i.e., a "bug"
> > that needs to be fixed.
> >
> > We did not make a conclusion for the syntax:
> >
> > - CREATE TABLE … PARTITIONED BY ...
> >
> > Which means that its behavior is under discussion, so it is okay to leave
> > it without the Hive dialect limitation; we do not actually have any table
> > sources/sinks that support such a DDL, so for the current code base users
> > should not be influenced by the behavior change.
> >
> > So I’m
> >
> > +1 to remove the Hive dialect limitations for INSERT OVERWRITE and INSERT
> > PARTITION
> > +0 to add a yaml dialect conf to the SQL-CLI, because FLIP-89 is not
> > finished yet; we had better wait until FLIP-89 is resolved to do this.
> >
> > Best,
> > Danny Chan
> > On Dec 11, 2019 at 5:29 PM +0800, Jingsong Li  wrote:
> > > Hi Dev,
> > >
> > > After cutting the 1.10 branch, I tried the following functions of the
> > > SQL-CLI and found that it does not support:
> > > - insert overwrite
> > > - PARTITION (partcol1=val1, partcol2=val2 ...)
> > > The SQL pattern is:
> > > INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
> > > partcol2=val2 ...)] select_statement1 FROM from_statement;
> > > It was a surprise to me.
> > > The reason is that we only allow these two grammars in the Hive dialect,
> > > and the SQL-CLI does not have an interface to switch dialects.
> > >
> > > This directly hinders the SQL-CLI's insert syntax in the Hive
> > > integration and seriously hinders the practicability of the SQL-CLI.
> > > And we have introduced these two grammars in FLIP-63 [1] to Flink.
> > > Here are my questions:
> > > 1. Should we remove the Hive dialect limitation for these two grammars?
> > > 2. Should we fix this in 1.10?
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > >
> > > Best,
> > > Jingsong Lee
> >
>


-- 
Best, Jingsong Lee


Re: [DISCUSS] Overwrite and partition inserting support in 1.10

2019-12-12 Thread Jark Wu
Hi Jingsong,

Watermark is not a standard syntax; that's why we had a FLIP and a long
discussion to add it to Flink's SQL syntax. I think if we want to add the
INSERT OVERWRITE and PARTITION syntax to Flink's own syntax, we also need
a FLIP or a VOTE, and this may not happen soon (we should hear more
people's opinions on this).

Regarding the sql-dialect configuration, I was not saying to involve the
whole FLIP-89. I mean we can just start a VOTE to expose it as
`table.planner.sql-dialect` and include it in 1.10. The change can be
very small, by adding a ConfigOption and changing the implementation of
TableConfig#getSqlDialect/setSqlDialect. I believe it is smaller and
safer than changing the parser.
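A rough sketch of how small that change could be is below; the option key is the
one proposed above, while the builder calls, enum values and the wiring into
TableConfig are assumptions of this sketch rather than existing Flink code.

    import org.apache.flink.configuration.ConfigOption;
    import org.apache.flink.configuration.ConfigOptions;
    import org.apache.flink.table.api.SqlDialect;
    import org.apache.flink.table.api.TableConfig;

    public class SqlDialectOptionSketch {

        // Hypothetical option, as proposed in this thread; not part of Flink yet.
        public static final ConfigOption<String> TABLE_SQL_DIALECT =
                ConfigOptions.key("table.planner.sql-dialect")
                        .defaultValue(SqlDialect.DEFAULT.name());

        public static void main(String[] args) {
            // Today the dialect can only be switched programmatically:
            TableConfig config = new TableConfig();
            config.setSqlDialect(SqlDialect.HIVE);

            // The proposal: let the SQL CLI / yaml set the same thing through the
            // string-valued option above and map it onto setSqlDialect(...).
            config.setSqlDialect(SqlDialect.valueOf(TABLE_SQL_DIALECT.defaultValue()));
        }
    }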

Btw, I cc'ed Yu Li and Gary into the discussion, because release managers
should be aware of this.

Best,
Jark



On Thu, 12 Dec 2019 at 11:47, Danny Chan  wrote:

> Thanks Jingsong for bringing up this discussion ~
>
> After reviewing FLIP-63, it seems that we have made a conclusion for the
> syntax
>
> - INSERT OVERWRITE ...
> - INSERT INTO … PARTITION
>
> Which means that they should not have the Hive dialect limitation, so I'm
> inclined to think the behavior of the SQL-CLI is unexpected, i.e., a "bug"
> that needs to be fixed.
>
> We did not make a conclusion for the syntax:
>
> - CREATE TABLE … PARTITIONED BY ...
>
> Which means that its behavior is under discussion, so it is okay to leave
> it without the Hive dialect limitation; we do not actually have any table
> sources/sinks that support such a DDL, so for the current code base users
> should not be influenced by the behavior change.
>
> So I’m
>
> +1 to remove the Hive dialect limitations for INSERT OVERWRITE and INSERT
> PARTITION
> +0 to add a yaml dialect conf to the SQL-CLI, because FLIP-89 is not
> finished yet; we had better wait until FLIP-89 is resolved to do this.
>
> Best,
> Danny Chan
> On Dec 11, 2019 at 5:29 PM +0800, Jingsong Li  wrote:
> > Hi Dev,
> >
> > After cutting the 1.10 branch, I tried the following functions of the
> > SQL-CLI and found that it does not support:
> > - insert overwrite
> > - PARTITION (partcol1=val1, partcol2=val2 ...)
> > The SQL pattern is:
> > INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
> > partcol2=val2 ...)] select_statement1 FROM from_statement;
> > It was a surprise to me.
> > The reason is that we only allow these two grammars in the Hive dialect,
> > and the SQL-CLI does not have an interface to switch dialects.
> >
> > This directly hinders the SQL-CLI's insert syntax in the Hive
> > integration and seriously hinders the practicability of the SQL-CLI.
> > And we have introduced these two grammars in FLIP-63 [1] to Flink.
> > Here are my questions:
> > 1. Should we remove the Hive dialect limitation for these two grammars?
> > 2. Should we fix this in 1.10?
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >
> > Best,
> > Jingsong Lee
>


Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-12 Thread Zhu Zhu
Thanks Hequn for driving the release, and thanks to everyone who made this
release possible!

Thanks,
Zhu Zhu

Wei Zhong  wrote on Thu, Dec 12, 2019 at 3:45 PM:

> Thanks Hequn for being the release manager. Great work!
>
> Best,
> Wei
>
> On Dec 12, 2019, at 15:27, Jingsong Li  wrote:
>
> Thanks Hequn for driving this release. 1.8.3 fixed a lot of issues and is
> very useful to users.
> Great work!
>
> Best,
> Jingsong Lee
>
> On Thu, Dec 12, 2019 at 3:25 PM jincheng sun 
> wrote:
>
>> Thanks for being the release manager and the great work Hequn :)
>> Also thanks to the community making this release possible!
>>
>> Best,
>> Jincheng
>>
>> Jark Wu  wrote on Thu, Dec 12, 2019 at 3:23 PM:
>>
>>> Thanks Hequn for helping out this release and being the release manager.
>>> Great work!
>>>
>>> Best,
>>> Jark
>>>
>>> On Thu, 12 Dec 2019 at 15:02, Jeff Zhang  wrote:
>>>
>>> > Great work, Hequn
>>> >
>>> > Dian Fu  wrote on Thu, Dec 12, 2019 at 2:32 PM:
>>> >
>>> >> Thanks Hequn for being the release manager and everyone who
>>> contributed
>>> >> to this release.
>>> >>
>>> >> Regards,
>>> >> Dian
>>> >>
>>> >> On Dec 12, 2019, at 2:24 PM, Hequn Cheng  wrote:
>>> >>
>>> >> Hi,
>>> >>
>>> >> The Apache Flink community is very happy to announce the release of
>>> >> Apache Flink 1.8.3, which is the third bugfix release for the Apache
>>> Flink
>>> >> 1.8 series.
>>> >>
>>> >> Apache Flink® is an open-source stream processing framework for
>>> >> distributed, high-performing, always-available, and accurate data
>>> streaming
>>> >> applications.
>>> >>
>>> >> The release is available for download at:
>>> >> https://flink.apache.org/downloads.html
>>> >>
>>> >> Please check out the release blog post for an overview of the
>>> >> improvements for this bugfix release:
>>> >> https://flink.apache.org/news/2019/12/11/release-1.8.3.html
>>> >>
>>> >> The full release notes are available in Jira:
>>> >> https://issues.apache.org/jira/projects/FLINK/versions/12346112
>>> >>
>>> >> We would like to thank all contributors of the Apache Flink community
>>> who
>>> >> made this release possible!
>>> >> Great thanks to @Jincheng as a mentor during this release.
>>> >>
>>> >> Regards,
>>> >> Hequn
>>> >>
>>> >>
>>> >>
>>> >
>>> > --
>>> > Best Regards
>>> >
>>> > Jeff Zhang
>>> >
>>>
>>
>
> --
> Best, Jingsong Lee
>
>
>