[GitHub] yanghua commented on issue #6567: [FLINK-10074] Allowable number of checkpoint failures

2018-08-22 Thread GitBox
yanghua commented on issue #6567: [FLINK-10074] Allowable number of checkpoint 
failures
URL: https://github.com/apache/flink/pull/6567#issuecomment-415303507
 
 
   @tillrohrmann When you have free time, please review this PR. thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10074) Allowable number of checkpoint failures

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589739#comment-16589739
 ] 

ASF GitHub Bot commented on FLINK-10074:


yanghua commented on issue #6567: [FLINK-10074] Allowable number of checkpoint 
failures
URL: https://github.com/apache/flink/pull/6567#issuecomment-415303507
 
 
   @tillrohrmann When you have free time, please review this PR. thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allowable number of checkpoint failures 
> 
>
> Key: FLINK-10074
> URL: https://issues.apache.org/jira/browse/FLINK-10074
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Reporter: Thomas Weise
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
>
> For intermittent checkpoint failures it is desirable to have a mechanism to 
> avoid restarts. If, for example, a transient S3 error prevents checkpoint 
> completion, the next checkpoint may very well succeed. The user may wish to 
> not incur the expense of restart under such scenario and this could be 
> expressed with a failure threshold (number of subsequent checkpoint 
> failures), possibly combined with a list of exceptions to tolerate.
>  
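For illustration only, a minimal sketch of how such a failure threshold could be tracked; the `TolerableCheckpointFailureTracker` class and its methods below are hypothetical and are not part of the Flink API discussed in this PR:

{code:java}
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: tolerate up to N consecutive failures of tolerated
// exception types before asking the job to restart.
public class TolerableCheckpointFailureTracker {

    private final int tolerableConsecutiveFailures;
    private final Set<Class<? extends Throwable>> toleratedExceptions;
    private final AtomicInteger consecutiveFailures = new AtomicInteger(0);

    public TolerableCheckpointFailureTracker(
            int tolerableConsecutiveFailures,
            Set<Class<? extends Throwable>> toleratedExceptions) {
        this.tolerableConsecutiveFailures = tolerableConsecutiveFailures;
        this.toleratedExceptions = toleratedExceptions;
    }

    /** A successful checkpoint resets the failure counter. */
    public void onCheckpointSuccess() {
        consecutiveFailures.set(0);
    }

    /** Returns true if the job should be restarted because of this failure. */
    public boolean onCheckpointFailure(Throwable cause) {
        boolean tolerated =
                toleratedExceptions.stream().anyMatch(c -> c.isInstance(cause));
        if (!tolerated) {
            return true; // not in the tolerated list: fail immediately
        }
        return consecutiveFailures.incrementAndGet() > tolerableConsecutiveFailures;
    }
}
{code}

With a threshold of 3 and transient S3 errors in the tolerated set, a single flaky checkpoint would not trigger a restart, matching the scenario described above.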



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10163) Support CREATE VIEW in SQL Client

2018-08-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-10163:
---
Labels: pull-request-available  (was: )

> Support CREATE VIEW in SQL Client
> -
>
> Key: FLINK-10163
> URL: https://issues.apache.org/jira/browse/FLINK-10163
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
>
> The possibility to define a name for a subquery would improve the usability 
> of the SQL Client. The SQL standard defines {{CREATE VIEW}} for defining a 
> virtual table.
>  
> Example:
> {code}
>  CREATE VIEW viewName
>  [ '(' columnName [, columnName ]* ')' ]
>  AS Query
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10163) Support CREATE VIEW in SQL Client

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589718#comment-16589718
 ] 

ASF GitHub Bot commented on FLINK-10163:


twalthr opened a new pull request #6606: [FLINK-10163] [sql-client] Support 
views in SQL Client
URL: https://github.com/apache/flink/pull/6606
 
 
   ## What is the purpose of the change
   
   Adds initial support for views in SQL Client. It adds the following 
statements:
   - `CREATE VIEW`: Creates a virtual table from a SQL query. Syntax: 'CREATE VIEW <name> AS <query>'
   - `SHOW VIEW`: Describes a previously created virtual table. Syntax: 'SHOW VIEW <name>'
   - `DROP VIEW`: Deletes a previously created virtual table. Syntax: 'DROP VIEW <name>'
   
   It also adds the section `views` to environment files.
   
   
   ## Brief change log
   
   - Rewritten statement parser for statements with more than 1 operand
   - View support added throughout the SQL Client
   - Initial validation of environment files during startup
   - Code clean up
   
   
   ## Verifying this change
   
   - `SqlCommandParserTest`
   - `ExecutionContextTest`
   - `LocalExecutorITCase`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? docs
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support CREATE VIEW in SQL Client
> -
>
> Key: FLINK-10163
> URL: https://issues.apache.org/jira/browse/FLINK-10163
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
>
> The possibility to define a name for a subquery would improve the usability 
> of the SQL Client. The SQL standard defines {{CREATE VIEW}} for defining a 
> virtual table.
>  
> Example:
> {code}
>  CREATE VIEW viewName
>  [ '(' columnName [, columnName ]* ')' ]
>  AS Query
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] twalthr opened a new pull request #6606: [FLINK-10163] [sql-client] Support views in SQL Client

2018-08-22 Thread GitBox
twalthr opened a new pull request #6606: [FLINK-10163] [sql-client] Support 
views in SQL Client
URL: https://github.com/apache/flink/pull/6606
 
 
   ## What is the purpose of the change
   
   Adds initial support for views in SQL Client. It adds the following 
statements:
   - `CREATE VIEW`: Creates a virtual table from a SQL query. Syntax: 'CREATE VIEW <name> AS <query>'
   - `SHOW VIEW`: Describes a previously created virtual table. Syntax: 'SHOW VIEW <name>'
   - `DROP VIEW`: Deletes a previously created virtual table. Syntax: 'DROP VIEW <name>'
   
   It also adds the section `views` to environment files.
   
   
   ## Brief change log
   
   - Rewritten statement parser for statements with more than 1 operand
   - View support added throughout the SQL Client
   - Initial validation of environment files during startup
   - Code clean up
   
   
   ## Verifying this change
   
   - `SqlCommandParserTest`
   - `ExecutionContextTest`
   - `LocalExecutorITCase`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? docs
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10201) The batchTestUtil was mistakenly used in some stream sql tests

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589679#comment-16589679
 ] 

ASF GitHub Bot commented on FLINK-10201:


xccui opened a new pull request #6605: [FLINK-10201] [table] [test] The 
batchTestUtil was mistakenly used in some stream sql tests
URL: https://github.com/apache/flink/pull/6605
 
 
   ## What is the purpose of the change
   
   This PR fixes two improper test cases that mistakenly use batchTestUtil for 
stream sql.
   
   ## Brief change log
   
   Replaces the `batchTestUtil` with `streamTestUtil` in 
`SetOperatorsTest.testValuesWithCast()` and 
`CorrelateTest.testLeftOuterJoinAsSubQuery()`.
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> The batchTestUtil was mistakenly used in some stream sql tests
> --
>
> Key: FLINK-10201
> URL: https://issues.apache.org/jira/browse/FLINK-10201
> Project: Flink
>  Issue Type: Test
>  Components: Table API & SQL
>Reporter: Xingcan Cui
>Assignee: Xingcan Cui
>Priority: Minor
>  Labels: pull-request-available
>
> The {{batchTestUtil}} was mistakenly used in stream sql tests 
> {{SetOperatorsTest.testValuesWithCast()}} and 
> {{CorrelateTest.testLeftOuterJoinAsSubQuery()}}. That should be fixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10201) The batchTestUtil was mistakenly used in some stream sql tests

2018-08-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-10201:
---
Labels: pull-request-available  (was: )

> The batchTestUtil was mistakenly used in some stream sql tests
> --
>
> Key: FLINK-10201
> URL: https://issues.apache.org/jira/browse/FLINK-10201
> Project: Flink
>  Issue Type: Test
>  Components: Table API & SQL
>Reporter: Xingcan Cui
>Assignee: Xingcan Cui
>Priority: Minor
>  Labels: pull-request-available
>
> The {{batchTestUtil}} was mistakenly used in stream sql tests 
> {{SetOperatorsTest.testValuesWithCast()}} and 
> {{CorrelateTest.testLeftOuterJoinAsSubQuery()}}. That should be fixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] xccui opened a new pull request #6605: [FLINK-10201] [table] [test] The batchTestUtil was mistakenly used in some stream sql tests

2018-08-22 Thread GitBox
xccui opened a new pull request #6605: [FLINK-10201] [table] [test] The 
batchTestUtil was mistakenly used in some stream sql tests
URL: https://github.com/apache/flink/pull/6605
 
 
   ## What is the purpose of the change
   
   This PR fixes two improper test cases that mistakenly use batchTestUtil for 
stream sql.
   
   ## Brief change log
   
   Replaces the `batchTestUtil` with `streamTestUtil` in 
`SetOperatorsTest.testValuesWithCast()` and 
`CorrelateTest.testLeftOuterJoinAsSubQuery()`.
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-10201) The batchTestUtil was mistakenly used in some stream sql tests

2018-08-22 Thread Xingcan Cui (JIRA)
Xingcan Cui created FLINK-10201:
---

 Summary: The batchTestUtil was mistakenly used in some stream sql 
tests
 Key: FLINK-10201
 URL: https://issues.apache.org/jira/browse/FLINK-10201
 Project: Flink
  Issue Type: Test
  Components: Table API & SQL
Reporter: Xingcan Cui
Assignee: Xingcan Cui


The {{batchTestUtil}} was mistakenly used in stream sql tests 
{{SetOperatorsTest.testValuesWithCast()}} and 
{{CorrelateTest.testLeftOuterJoinAsSubQuery()}}. That should be fixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10172) Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589439#comment-16589439
 ] 

ASF GitHub Bot commented on FLINK-10172:


walterddr edited a comment on issue #6585: [FLINK-10172][table]re-adding asc & 
desc suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415214473
 
 
   @twalthr I just did a cross-check and all touched expressions are covered in 
tests. :-) The only thing not covered is `"w.start()"` in addition to 
`"start(w)"`, which I think we can leave as is, since it is not necessary to treat it as 
a special expression.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc
> --
>
> Key: FLINK-10172
> URL: https://issues.apache.org/jira/browse/FLINK-10172
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.3, 1.4.2, 1.5.3, 1.6.0, 1.7.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.3.3, 1.4.3, 1.6.1, 1.7.0, 1.5.4
>
>
> The following expression throws an exception when parsing the {{"id.asc"}} term.
> {code:java}
> Table allOrders = orderTable
> .select("id,order_date,amount,customer_id")
> .orderBy("id.asc");
> {code}
> while it is correctly parsed for Scala:
> {code:scala}
> val allOrders:Table = orderTable
> .select('id, 'order_date, 'amount, 'customer_id)
> .orderBy('id.asc)
> {code}
> This suggests some inconsistency between ExpressionParser and ExpressionDsl.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] walterddr edited a comment on issue #6585: [FLINK-10172][table]re-adding asc & desc suffix expression to expressionParser

2018-08-22 Thread GitBox
walterddr edited a comment on issue #6585: [FLINK-10172][table]re-adding asc & 
desc suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415214473
 
 
   @twalthr I just did a cross-check and all touched expressions are covered in 
tests. :-) The only thing not covered is `"w.start()"` in addition to 
`"start(w)"`, which I think we can leave as is, since it is not necessary to treat it as 
a special expression.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10172) Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589438#comment-16589438
 ] 

ASF GitHub Bot commented on FLINK-10172:


walterddr commented on issue #6585: [FLINK-10172][table]re-adding asc & desc 
suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415214473
 
 
   @twalthr I just did a cross-check and all touched expressions are covered in 
tests. :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc
> --
>
> Key: FLINK-10172
> URL: https://issues.apache.org/jira/browse/FLINK-10172
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.3, 1.4.2, 1.5.3, 1.6.0, 1.7.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.3.3, 1.4.3, 1.6.1, 1.7.0, 1.5.4
>
>
> The following expression throws an exception when parsing the {{"id.asc"}} term.
> {code:java}
> Table allOrders = orderTable
> .select("id,order_date,amount,customer_id")
> .orderBy("id.asc");
> {code}
> while it is correctly parsed for Scala:
> {code:scala}
> val allOrders:Table = orderTable
> .select('id, 'order_date, 'amount, 'customer_id)
> .orderBy('id.asc)
> {code}
> This suggests some inconsistency between ExpressionParser and ExpressionDsl.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] walterddr commented on issue #6585: [FLINK-10172][table]re-adding asc & desc suffix expression to expressionParser

2018-08-22 Thread GitBox
walterddr commented on issue #6585: [FLINK-10172][table]re-adding asc & desc 
suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415214473
 
 
   @twalthr I just did a cross-check and all touched expressions are covered in 
tests. :-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9061) add entropy to s3 path for better scalability

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589347#comment-16589347
 ] 

ASF GitHub Bot commented on FLINK-9061:
---

StephanEwen commented on issue #6302:  [FLINK-9061][checkpointing] add entropy 
to s3 path for better scalability
URL: https://github.com/apache/flink/pull/6302#issuecomment-415179311
 
 
   I rebased and adopted this PR for Flink 1.5/1.6 under #6604 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> add entropy to s3 path for better scalability
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jamie Grier
>Assignee: Indrajit Roychoudhury
>Priority: Critical
>  Labels: pull-request-available
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  
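For illustration only, a minimal sketch of the path-rewriting idea described above (the class below is hypothetical, not the hook proposed in the PR): prefix the configured checkpoint directory with a hash of the original path so writes spread across S3 partitions.

{code:java}
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class EntropyPathDemo {

    /** Rewrites s3://bucket/flink/checkpoints to s3://bucket/<hash>/flink/checkpoints. */
    static String injectEntropy(String checkpointPath) throws Exception {
        URI uri = URI.create(checkpointPath);
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(uri.getPath().getBytes(StandardCharsets.UTF_8));
        StringBuilder hash = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            hash.append(String.format("%02x", digest[i] & 0xff)); // short hex prefix
        }
        return uri.getScheme() + "://" + uri.getAuthority() + "/" + hash + uri.getPath();
    }

    public static void main(String[] args) throws Exception {
        // Prints the rewritten path, e.g. s3://bucket/<hash>/flink/checkpoints
        System.out.println(injectEntropy("s3://bucket/flink/checkpoints"));
    }
}
{code}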



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] StephanEwen commented on issue #6302: [FLINK-9061][checkpointing] add entropy to s3 path for better scalability

2018-08-22 Thread GitBox
StephanEwen commented on issue #6302:  [FLINK-9061][checkpointing] add entropy 
to s3 path for better scalability
URL: https://github.com/apache/flink/pull/6302#issuecomment-415179311
 
 
   I rebased and adopted this PR for Flink 1.5/1.6 under #6604 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9061) add entropy to s3 path for better scalability

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589342#comment-16589342
 ] 

ASF GitHub Bot commented on FLINK-9061:
---

StephanEwen opened a new pull request #6604: [FLINK-9061] Optionally add 
entropy to checkpoint paths for better S3 scalability
URL: https://github.com/apache/flink/pull/6604
 
 
   ## What is the purpose of the change
   
   This pull request adds hooks to optionally inject entropy to the checkpoint 
path based upon a user defined pattern in the configuration for better S3 
scalability.
   
   This is a revised version of #6302 by @indrc (based on Flink 1.4) that ports 
the changes to the new FileStateStorage code introduced in Flink 1.5.
   
   ## Brief change log
   
 - Adds an optional `CheckpointPathFilter` that is applied to data and 
metadata paths before creating the files
 - Adds an implementation of that filter that replaces entropy keys with 
entropy (data files) or removes the entropy key (metadata files)
 - Adds functionality to read the entropy pattern from the config or set it 
on the state backend, and configure the CheckpointPathFilter based on that
   
   ## Verifying this change
   
 - Added unit tests under `org.apache.flink.runtime.state.filesystem` 
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): **no**
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
 - The serializers: **no**
 - The runtime per-record code paths (performance sensitive): **no**
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: **yes**
 - The S3 file system connector: **no**
   
   ## Documentation
   
 - Does this pull request introduce a new feature? **yes**
 - If yes, how is the feature documented? **docs**
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> add entropy to s3 path for better scalability
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jamie Grier
>Assignee: Indrajit Roychoudhury
>Priority: Critical
>  Labels: pull-request-available
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] StephanEwen opened a new pull request #6604: [FLINK-9061] Optionally add entropy to checkpoint paths for better S3 scalability

2018-08-22 Thread GitBox
StephanEwen opened a new pull request #6604: [FLINK-9061] Optionally add 
entropy to checkpoint paths for better S3 scalability
URL: https://github.com/apache/flink/pull/6604
 
 
   ## What is the purpose of the change
   
   This pull request adds hooks to optionally inject entropy to the checkpoint 
path based upon a user defined pattern in the configuration for better S3 
scalability.
   
   This is a revised version of #6302 by @indrc (based on Flink 1.4) that ports 
the changes to the new FileStateStorage code introduced in Flink 1.5.
   
   ## Brief change log
   
 - Adds an optional `CheckpointPathFilter` that is applied to data and 
metadata paths before creating the files
 - Adds an implementation of that filter that replaces entropy keys with 
entropy (data files) or removes the entropy key (metadata files)
 - Adds functionality to read the entropy pattern from the config or set it 
on the state backend, and configure the CheckpointPathFilter based on that
   
   ## Verifying this change
   
 - Added unit tests under `org.apache.flink.runtime.state.filesystem` 
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): **no**
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
 - The serializers: **no**
 - The runtime per-record code paths (performance sensitive): **no**
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: **yes**
 - The S3 file system connector: **no**
   
   ## Documentation
   
 - Does this pull request introduce a new feature? **yes**
 - If yes, how is the feature documented? **docs**
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-8500) Get the timestamp of the Kafka message from kafka consumer(Kafka010Fetcher)

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-8500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589289#comment-16589289
 ] 

ASF GitHub Bot commented on FLINK-8500:
---

FredTing commented on a change in pull request #6105: [FLINK-8500] Get the 
timestamp of the Kafka message from kafka consumer
URL: https://github.com/apache/flink/pull/6105#discussion_r212082669
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java
 ##
 @@ -45,6 +45,22 @@
 */
T deserialize(byte[] messageKey, byte[] message, String topic, int 
partition, long offset) throws IOException;
 
+   /**
+* Deserializes the byte message.
+*
+* @param messageKey the key as a byte array (null if no key has been 
set).
+* @param message The message, as a byte array (null if the message was 
empty or deleted).
+* @param partition The partition the message has originated from.
+* @param offset the offset of the message in the original source (for 
example the Kafka offset).
+* @param timestamp the timestamp of the consumer record
+* @param timestampType The timestamp type, could be NO_TIMESTAMP, 
CREATE_TIME or INGEST_TIME.
+*
+* @return The deserialized message as an object (null if the message 
cannot be deserialized).
+*/
+   default T deserialize(byte[] messageKey, byte[] message, String topic, 
int partition, long offset, long timestamp, TimestampType timestampType) throws 
IOException {
 
 Review comment:
   @tzulitai We can remove the non-timestamped version of the `deserialize` 
method completely. It will break the interface, but if you implemented this 
method it is quite easy to migrate to the new method. I've only introduced the new 
`deserialize` method as a default to make this change backwards compatible.
   When backwards compatibility is required, we could mark the non-timestamped 
version of the `deserialize` method as `@Deprecated` and give it a default 
implementation that throws a `NotImplementedException` or a 
`FlinkRuntimeException` with the message "you must implement the deserialize 
method when implementing the KeyedDeserializationSchema interface".
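
   A minimal sketch of the deprecation approach described above (names abbreviated; this is not the actual Flink interface, and `UnsupportedOperationException` stands in for the `FlinkRuntimeException` mentioned in the comment):

```java
import java.io.IOException;
import java.io.Serializable;

// Simplified stand-in for Kafka's TimestampType.
enum TimestampType { NO_TIMESTAMP_TYPE, CREATE_TIME, LOG_APPEND_TIME }

interface KeyedSchemaSketch<T> extends Serializable {

    /** Old, non-timestamped method: deprecated, implementors should migrate. */
    @Deprecated
    default T deserialize(byte[] messageKey, byte[] message, String topic,
                          int partition, long offset) throws IOException {
        throw new UnsupportedOperationException(
                "You must implement the timestamp-aware deserialize method.");
    }

    /** New method carrying the record timestamp and its type. */
    default T deserialize(byte[] messageKey, byte[] message, String topic,
                          int partition, long offset, long timestamp,
                          TimestampType timestampType) throws IOException {
        // Default delegates to the old method so existing implementations keep working.
        return deserialize(messageKey, message, topic, partition, offset);
    }
}
```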


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Get the timestamp of the Kafka message from kafka consumer(Kafka010Fetcher)
> ---
>
> Key: FLINK-8500
> URL: https://issues.apache.org/jira/browse/FLINK-8500
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.4.0
>Reporter: yanxiaobin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
> Attachments: image-2018-01-30-14-58-58-167.png, 
> image-2018-01-31-10-48-59-633.png
>
>
> The method deserialize of KeyedDeserializationSchema needs a parameter 
> 'kafka message timestamp' (from ConsumerRecord). In some business scenarios, 
> this is useful!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] FredTing commented on a change in pull request #6105: [FLINK-8500] Get the timestamp of the Kafka message from kafka consumer

2018-08-22 Thread GitBox
FredTing commented on a change in pull request #6105: [FLINK-8500] Get the 
timestamp of the Kafka message from kafka consumer
URL: https://github.com/apache/flink/pull/6105#discussion_r212082669
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java
 ##
 @@ -45,6 +45,22 @@
 */
T deserialize(byte[] messageKey, byte[] message, String topic, int 
partition, long offset) throws IOException;
 
+   /**
+* Deserializes the byte message.
+*
+* @param messageKey the key as a byte array (null if no key has been 
set).
+* @param message The message, as a byte array (null if the message was 
empty or deleted).
+* @param partition The partition the message has originated from.
+* @param offset the offset of the message in the original source (for 
example the Kafka offset).
+* @param timestamp the timestamp of the consumer record
+* @param timestampType The timestamp type, could be NO_TIMESTAMP, 
CREATE_TIME or INGEST_TIME.
+*
+* @return The deserialized message as an object (null if the message 
cannot be deserialized).
+*/
+   default T deserialize(byte[] messageKey, byte[] message, String topic, 
int partition, long offset, long timestamp, TimestampType timestampType) throws 
IOException {
 
 Review comment:
   @tzulitai We can remove the non-timestamped version of the `deserialize` 
method completely. It will break the interface, but if you implemented this 
method it is quite easy to migrate to the new method. I've only introduced the new 
`deserialize` method as a default to make this change backwards compatible.
   When backwards compatibility is required, we could mark the non-timestamped 
version of the `deserialize` method as `@Deprecated` and give it a default 
implementation that throws a `NotImplementedException` or a 
`FlinkRuntimeException` with the message "you must implement the deserialize 
method when implementing the KeyedDeserializationSchema interface".


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10167) SessionWindows not compatible with typed DataStreams in scala

2018-08-22 Thread Andrew Roberts (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589125#comment-16589125
 ] 

Andrew Roberts commented on FLINK-10167:


Interesting - I was making several assumptions that turned out to be wrong. I 
follow you now on the value-vs-object (AnyVal vs AnyRef) split. The underlying 
issue I was seeing was due to keying by a string that I thought had to be 
immutable because it contained only immutable objects and case classes, but 
changing the key to a "more immutable" version somehow fixed the issue. Also, 
it's worth noting that switching the stream to a tuple of (key, value) also 
worked to "solve" the problem, if I keyed that stream by the first tuple 
element. I don't quite understand why, since it was the same string that didn't 
work when used directly as the key, but perhaps some kind of evaluation order 
weirdness was in play. In any case, I've gotten my job to where I want it to be 
- thanks for your help!

> SessionWindows not compatible with typed DataStreams in scala
> -
>
> Key: FLINK-10167
> URL: https://issues.apache.org/jira/browse/FLINK-10167
> Project: Flink
>  Issue Type: Bug
>Reporter: Andrew Roberts
>Priority: Major
>
> I'm trying to construct a trivial job that uses session windows, and it looks 
> like the data type parameter is hardcoded to `Object`/`AnyRef`. Due to the 
> invariance of java classes in scala, this means that we can't use the 
> provided SessionWindow helper classes in scala on typed streams.
>  
> Example job:
> {code:java}
> import org.apache.flink.api.scala._
> import org.apache.flink.streaming.api.scala.{DataStream, 
> StreamExecutionEnvironment}
> import 
> org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows
> import org.apache.flink.streaming.api.windowing.time.Time
> import org.apache.flink.streaming.api.windowing.windows.{TimeWindow, Window}
> import org.apache.flink.util.Collector
> object TestJob {
>   val jobName = "TestJob"
>   def main(args: Array[String]): Unit = {
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> env.fromCollection(Range(0, 100).toList)
>   .keyBy(_ / 10)
>   .window(ProcessingTimeSessionWindows.withGap(Time.minutes(1)))
>   .reduce(
> (a: Int, b: Int) => a + b,
> (key: Int, window: Window, items: Iterable[Int], out: 
> Collector[String]) => s"${key}: ${items}"
>   )
>   .map(println(_))
> env.execute(jobName)
>   }
> }{code}
>  
> Compile error:
> {code:java}
> [error]  found   : 
> org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows
> [error]  required: 
> org.apache.flink.streaming.api.windowing.assigners.WindowAssigner[_ >: Int, ?]
> [error] Note: Object <: Any (and 
> org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows
>  <: 
> org.apache.flink.streaming.api.windowing.assigners.MergingWindowAssigner[Object,org.apache.flink.streaming.api.windowing.windows.TimeWindow]),
>  but Java-defined class WindowAssigner is invariant in type T.
> [error] You may wish to investigate a wildcard type such as `_ <: Any`. (SLS 
> 3.2.10)
> [error]       
> .window(ProcessingTimeSessionWindows.withGap(Time.minutes(1))){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10200) Consolidate FileBasedStateOutputStream and FsCheckpointStateOutputStream

2018-08-22 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-10200:
--

 Summary: Consolidate FileBasedStateOutputStream and 
FsCheckpointStateOutputStream
 Key: FLINK-10200
 URL: https://issues.apache.org/jira/browse/FLINK-10200
 Project: Flink
  Issue Type: Improvement
Reporter: Stefan Richter
Assignee: Stefan Richter


We should be able to find a common denominator that avoids code duplication.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10199) StandaloneSessionClusterEntrypoint does not respect host/ui-port given from jobmanager.sh

2018-08-22 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-10199:


 Summary: StandaloneSessionClusterEntrypoint does not respect 
host/ui-port given from jobmanager.sh
 Key: FLINK-10199
 URL: https://issues.apache.org/jira/browse/FLINK-10199
 Project: Flink
  Issue Type: Bug
  Components: Startup Shell Scripts
Affects Versions: 1.6.0, 1.5.0
Reporter: Dawid Wysakowicz


The parameters {{--host}} and {{--webui-port}} provided to the jobmanager.sh 
script are ignored in standalonesession mode. You cannot start, e.g., an HA setup via 
{{start-cluster.sh}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] yanghua commented on a change in pull request #6593: [hotfix][flink-streaming-java] modify typo in the StreamingJobGraphGenerator.java

2018-08-22 Thread GitBox
yanghua commented on a change in pull request #6593: 
[hotfix][flink-streaming-java] modify typo in the 
StreamingJobGraphGenerator.java
URL: https://github.com/apache/flink/pull/6593#discussion_r211993277
 
 

 ##
 File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java
 ##
 @@ -130,7 +130,7 @@ private JobGraph createJobGraph() {
jobGraph.setScheduleMode(ScheduleMode.EAGER);
 
// Generate deterministic hashes for the nodes in order to 
identify them across
-   // submission iff they didn't change.
+   // submission if they didn't change.
 
 Review comment:
   There are indeed abbreviations of this form, and there is no problem with 
this understanding. Agreed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10172) Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588986#comment-16588986
 ] 

ASF GitHub Bot commented on FLINK-10172:


twalthr commented on issue #6585: [FLINK-10172][table]re-adding asc & desc 
suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415064636
 
 
   Thanks for the explanation @fhueske and @walterddr. Feel free to add more 
tests if they make sense. But actually most of the operators should already 
have tests. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc
> --
>
> Key: FLINK-10172
> URL: https://issues.apache.org/jira/browse/FLINK-10172
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.3, 1.4.2, 1.5.3, 1.6.0, 1.7.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.3.3, 1.4.3, 1.6.1, 1.7.0, 1.5.4
>
>
> The following expression throws an exception when parsing the {{"id.asc"}} term.
> {code:java}
> Table allOrders = orderTable
> .select("id,order_date,amount,customer_id")
> .orderBy("id.asc");
> {code}
> while it is correctly parsed for Scala:
> {code:scala}
> val allOrders:Table = orderTable
> .select('id, 'order_date, 'amount, 'customer_id)
> .orderBy('id.asc)
> {code}
> This suggests some inconsistency between ExpressionParser and ExpressionDsl.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] twalthr commented on issue #6585: [FLINK-10172][table]re-adding asc & desc suffix expression to expressionParser

2018-08-22 Thread GitBox
twalthr commented on issue #6585: [FLINK-10172][table]re-adding asc & desc 
suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415064636
 
 
   Thanks for the explanation @fhueske and @walterddr. Feel free to add more 
tests if they make sense. But actually most of the operators should already 
have tests. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10193) Default RPC timeout is used when triggering savepoint via JobMasterGateway

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588975#comment-16588975
 ] 

ASF GitHub Bot commented on FLINK-10193:


GJL commented on issue #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601#issuecomment-415061650
 
 
   @tillrohrmann The value of the timeout from the perspective of the method 
implementation will always be what is passed to the RPC. This is regardless of 
whether the annotation is present or not because if the annotation is missing, 
the `AkkaInvocationHandler` cannot know which item in the args array should be 
replaced with the default timeout. 
   
   If the call originates from the REST API, the value will be 
`RpcUtils.INF_TIMEOUT`:
   
https://github.com/apache/flink/blob/6258a4c333ce9dba914621b13eac2f7d91f5cb72/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/savepoints/SavepointHandlers.java#L132
 
   
   I can add a test to the handler test suite but this will not verify my 
change. 
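   
   For context, a minimal sketch of the annotated gateway method this PR adds (simplified and renamed as a sketch; the signature follows the one quoted in the issue, and the package locations of `RpcTimeout` and `Time` are assumptions):
   
```java
import java.util.concurrent.CompletableFuture;

import org.apache.flink.api.common.time.Time;
import org.apache.flink.runtime.rpc.RpcTimeout;

// Only the savepoint method is shown. With @RpcTimeout on the Time parameter,
// the RPC framework knows which argument carries the ask timeout, so the value
// passed by the caller (e.g. RpcUtils.INF_TIMEOUT from the REST handler) is used
// instead of the default RPC timeout.
interface JobMasterGatewaySketch {

    CompletableFuture<String> triggerSavepoint(
            String targetDirectory,
            boolean cancelJob,
            @RpcTimeout Time timeout);
}
```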
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Default RPC timeout is used when triggering savepoint via JobMasterGateway
> --
>
> Key: FLINK-10193
> URL: https://issues.apache.org/jira/browse/FLINK-10193
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.5.3, 1.6.0, 1.7.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.6.1, 1.7.0, 1.5.4
>
> Attachments: SlowToCheckpoint.java
>
>
> When calling {{JobMasterGateway#triggerSavepoint(String, boolean, Time)}}, 
> the default timeout is used because the time parameter of the method  is not 
> annotated with {{@RpcTimeout}}. 
> *Expected behavior*
> * timeout for the RPC should be {{RpcUtils.INF_TIMEOUT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] GJL commented on issue #6601: [FLINK-10193] Add @RpcTimeout to JobMasterGateway.triggerSavepoint

2018-08-22 Thread GitBox
GJL commented on issue #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601#issuecomment-415061650
 
 
   @tillrohrmann The value of the timeout from the perspective of the method 
implementation will always be what is passed to the RPC. This is regardless of 
whether the annotation is present or not because if the annotation is missing, 
the `AkkaInvocationHandler` cannot know which item in the args array should be 
replaced with the default timeout. 
   
   If the call originates from the REST API, the value will be 
`RpcUtils.INF_TIMEOUT`:
   
https://github.com/apache/flink/blob/6258a4c333ce9dba914621b13eac2f7d91f5cb72/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/savepoints/SavepointHandlers.java#L132
 
   
   I can add a test to the handler test suite but this will not verify my 
change. 
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10172) Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588945#comment-16588945
 ] 

ASF GitHub Bot commented on FLINK-10172:


walterddr edited a comment on issue #6585: [FLINK-10172][table]re-adding asc & 
desc suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415057802
 
 
   @twalthr yes, @fhueske is correct: the 2 functions are special expressions 
which require special expression parsing. So far I have looked over the diff in 
https://github.com/apache/flink/pull/1926/files and didn't find anything 
abnormal, but I will add additional tests for the lists of functions originally 
in the `ExpressionParser` class. Should I open a JIRA for it? (haven't found a 
bug yet)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc
> --
>
> Key: FLINK-10172
> URL: https://issues.apache.org/jira/browse/FLINK-10172
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.3, 1.4.2, 1.5.3, 1.6.0, 1.7.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.3.3, 1.4.3, 1.6.1, 1.7.0, 1.5.4
>
>
> The following expression throws an exception when parsing the {{"id.asc"}} term.
> {code:java}
> Table allOrders = orderTable
> .select("id,order_date,amount,customer_id")
> .orderBy("id.asc");
> {code}
> while it is correctly parsed for Scala:
> {code:scala}
> val allOrders:Table = orderTable
> .select('id, 'order_date, 'amount, 'customer_id)
> .orderBy('id.asc)
> {code}
> This suggests some inconsistency between ExpressionParser and ExpressionDsl.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10172) Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588944#comment-16588944
 ] 

ASF GitHub Bot commented on FLINK-10172:


walterddr commented on issue #6585: [FLINK-10172][table]re-adding asc & desc 
suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415057802
 
 
   @twalthr yes, @fhueske is correct: the 2 functions are special expressions 
which require special expression parsing. So far I have looked over the diff in 
https://github.com/apache/flink/pull/1926/files and didn't find anything 
abnormal, but I will add additional tests for the lists of functions originally 
in the `ExpressionParser` class. Should I open a JIRA for it? (haven't found a 
bug yet)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Inconsistency in ExpressionParser and ExpressionDsl for order by asc/desc
> --
>
> Key: FLINK-10172
> URL: https://issues.apache.org/jira/browse/FLINK-10172
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Affects Versions: 1.3.3, 1.4.2, 1.5.3, 1.6.0, 1.7.0
>Reporter: Rong Rong
>Assignee: Rong Rong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.3.3, 1.4.3, 1.6.1, 1.7.0, 1.5.4
>
>
> The following expression throws an exception when parsing the {{"id.asc"}} term.
> {code:java}
> Table allOrders = orderTable
> .select("id,order_date,amount,customer_id")
> .orderBy("id.asc");
> {code}
> while it is correctly parsed for Scala:
> {code:scala}
> val allOrders:Table = orderTable
> .select('id, 'order_date, 'amount, 'customer_id)
> .orderBy('id.asc)
> {code}
> This suggests some inconsistency between ExpressionParser and ExpressionDsl.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] walterddr edited a comment on issue #6585: [FLINK-10172][table]re-adding asc & desc suffix expression to expressionParser

2018-08-22 Thread GitBox
walterddr edited a comment on issue #6585: [FLINK-10172][table]re-adding asc & 
desc suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415057802
 
 
   @twalthr yes, @fhueske is correct: the 2 functions are special expressions 
which require special expression parsing. So far I have looked over the diff in 
https://github.com/apache/flink/pull/1926/files and didn't find anything 
abnormal, but I will add additional tests for the lists of functions originally 
in the `ExpressionParser` class. Should I open a JIRA for it? (haven't found a 
bug yet)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] walterddr commented on issue #6585: [FLINK-10172][table]re-adding asc & desc suffix expression to expressionParser

2018-08-22 Thread GitBox
walterddr commented on issue #6585: [FLINK-10172][table]re-adding asc & desc 
suffix expression to expressionParser
URL: https://github.com/apache/flink/pull/6585#issuecomment-415057802
 
 
   @twalthr yes, @fhueske is correct: the 2 functions are special expressions 
which require special expression parsing. So far I have looked over the diff in 
https://github.com/apache/flink/pull/1926/files and didn't find anything 
abnormal, but I will add additional tests for the lists of functions originally 
in the `ExpressionParser` class. Should I open a JIRA for it? (haven't found a 
bug yet)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10198) Set Env object in DBOptions for RocksDB

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588943#comment-16588943
 ] 

ASF GitHub Bot commented on FLINK-10198:


StefanRRichter commented on issue #6603: [FLINK-10198][state] Set Env object in 
DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603#issuecomment-415057325
 
 
   I think it should (in general) help to get resources in check and you can 
set the number of threads on the default env in an `OptionsFactory` to a value 
so that your disk can be saturated.
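   
   A minimal sketch of such an `OptionsFactory` (assuming the Flink 1.6-era `OptionsFactory` interface and the RocksJava `Env`/`DBOptions` setters; exact signatures may differ between versions):
   
```java
import org.apache.flink.contrib.streaming.state.OptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;
import org.rocksdb.Env;

// Share the process-wide default Env (and its background thread pools) across all
// RocksDB instances and size the background thread pool so the disk can be saturated.
public class SharedEnvOptionsFactory implements OptionsFactory {

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions) {
        Env env = Env.getDefault();
        env.setBackgroundThreads(4); // tune to the local disk's capabilities
        return currentOptions.setEnv(env);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
        return currentOptions;
    }
}
```
   The factory could then be registered on the state backend, e.g. via `RocksDBStateBackend#setOptions`.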


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Set Env object in DBOptions for RocksDB
> ---
>
> Key: FLINK-10198
> URL: https://issues.apache.org/jira/browse/FLINK-10198
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Major
>  Labels: pull-request-available
>
> I think we should consider to always set a default environment when we create 
> the DBOptions.
> See https://github.com/facebook/rocksdb/wiki/rocksdb-basics:
> *Support for Multiple Embedded Databases in the same process*
> A common use-case for RocksDB is that applications inherently partition their 
> data set into logical partitions or shards. This technique benefits 
> application load balancing and fast recovery from faults. This means that a 
> single server process should be able to operate multiple RocksDB databases 
> simultaneously. This is done via an environment object named Env. Among other 
> things, a thread pool is associated with an Env. If applications want to 
> share a common thread pool (for background compactions) among multiple 
> database instances, then it should use the same Env object for opening those 
> databases.
> Similarly, multiple database instances may share the same block cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] StefanRRichter commented on issue #6603: [FLINK-10198][state] Set Env object in DBOptions for RocksDB

2018-08-22 Thread GitBox
StefanRRichter commented on issue #6603: [FLINK-10198][state] Set Env object in 
DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603#issuecomment-415057325
 
 
   I think it should (in general) help to get resources in check and you can 
set the number of threads on the default env in an `OptionsFactory` to a value 
so that your disk can be saturated.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Aitozi commented on a change in pull request #6593: [hotfix][flink-streaming-java] modify typo in the StreamingJobGraphGenerator.java

2018-08-22 Thread GitBox
Aitozi commented on a change in pull request #6593: 
[hotfix][flink-streaming-java] modify typo in the 
StreamingJobGraphGenerator.java
URL: https://github.com/apache/flink/pull/6593#discussion_r211980992
 
 

 ##
 File path: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java
 ##
 @@ -130,7 +130,7 @@ private JobGraph createJobGraph() {
jobGraph.setScheduleMode(ScheduleMode.EAGER);
 
// Generate deterministic hashes for the nodes in order to 
identify them across
-   // submission iff they didn't change.
+   // submission if they didn't change.
 
 Review comment:
   I think it means `if and only if` here, so it is not a typo.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10198) Set Env object in DBOptions for RocksDB

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588929#comment-16588929
 ] 

ASF GitHub Bot commented on FLINK-10198:


Aitozi commented on issue #6603: [FLINK-10198][state] Set Env object in 
DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603#issuecomment-415055316
 
 
   Can this help to save CPU resources by sharing the background 
compaction thread pool? Or will it encounter write stalls when there are not enough 
compaction threads to share across the whole process?  
@StefanRRichter 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Set Env object in DBOptions for RocksDB
> ---
>
> Key: FLINK-10198
> URL: https://issues.apache.org/jira/browse/FLINK-10198
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Major
>  Labels: pull-request-available
>
> I think we should consider always setting a default environment when we create 
> the DBOptions.
> See https://github.com/facebook/rocksdb/wiki/rocksdb-basics:
> *Support for Multiple Embedded Databases in the same process*
> A common use-case for RocksDB is that applications inherently partition their 
> data set into logical partitions or shards. This technique benefits 
> application load balancing and fast recovery from faults. This means that a 
> single server process should be able to operate multiple RocksDB databases 
> simultaneously. This is done via an environment object named Env. Among other 
> things, a thread pool is associated with an Env. If applications want to 
> share a common thread pool (for background compactions) among multiple 
> database instances, then it should use the same Env object for opening those 
> databases.
> Similarly, multiple database instances may share the same block cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Aitozi commented on issue #6603: [FLINK-10198][state] Set Env object in DBOptions for RocksDB

2018-08-22 Thread GitBox
Aitozi commented on issue #6603: [FLINK-10198][state] Set Env object in 
DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603#issuecomment-415055316
 
 
   Can this help to save CPU resources by sharing the background 
compaction thread pool? Or will it encounter write stalls when there are not enough 
compaction threads to share across the whole process?  
@StefanRRichter 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10198) Set Env object in DBOptions for RocksDB

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588902#comment-16588902
 ] 

ASF GitHub Bot commented on FLINK-10198:


StefanRRichter edited a comment on issue #6603: [FLINK-10198][state] Set Env 
object in DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603#issuecomment-415044671
 
 
   @aljoscha can you take a look, in case there was a good reason why this was 
not always done from the start?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Set Env object in DBOptions for RocksDB
> ---
>
> Key: FLINK-10198
> URL: https://issues.apache.org/jira/browse/FLINK-10198
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Major
>  Labels: pull-request-available
>
> I think we should consider always setting a default environment when we create 
> the DBOptions.
> See https://github.com/facebook/rocksdb/wiki/rocksdb-basics:
> *Support for Multiple Embedded Databases in the same process*
> A common use-case for RocksDB is that applications inherently partition their 
> data set into logical partitions or shards. This technique benefits 
> application load balancing and fast recovery from faults. This means that a 
> single server process should be able to operate multiple RocksDB databases 
> simultaneously. This is done via an environment object named Env. Among other 
> things, a thread pool is associated with an Env. If applications want to 
> share a common thread pool (for background compactions) among multiple 
> database instances, then it should use the same Env object for opening those 
> databases.
> Similarly, multiple database instances may share the same block cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] StefanRRichter edited a comment on issue #6603: [FLINK-10198][state] Set Env object in DBOptions for RocksDB

2018-08-22 Thread GitBox
StefanRRichter edited a comment on issue #6603: [FLINK-10198][state] Set Env 
object in DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603#issuecomment-415044671
 
 
   @aljoscha can you take a look, in case there was a good reason why this was 
not always done from the start?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10198) Set Env object in DBOptions for RocksDB

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588899#comment-16588899
 ] 

ASF GitHub Bot commented on FLINK-10198:


StefanRRichter commented on issue #6603: [FLINK-10198][state] Set Env object in 
DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603#issuecomment-415044671
 
 
   @aljoscha can you take a look, in case there was a good reason why this was 
not always done from the start?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Set Env object in DBOptions for RocksDB
> ---
>
> Key: FLINK-10198
> URL: https://issues.apache.org/jira/browse/FLINK-10198
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Major
>  Labels: pull-request-available
>
> I think we should consider always setting a default environment when we create 
> the DBOptions.
> See https://github.com/facebook/rocksdb/wiki/rocksdb-basics:
> *Support for Multiple Embedded Databases in the same process*
> A common use-case for RocksDB is that applications inherently partition their 
> data set into logical partitions or shards. This technique benefits 
> application load balancing and fast recovery from faults. This means that a 
> single server process should be able to operate multiple RocksDB databases 
> simultaneously. This is done via an environment object named Env. Among other 
> things, a thread pool is associated with an Env. If applications want to 
> share a common thread pool (for background compactions) among multiple 
> database instances, then it should use the same Env object for opening those 
> databases.
> Similarly, multiple database instances may share the same block cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] StefanRRichter commented on issue #6603: [FLINK-10198][state] Set Env object in DBOptions for RocksDB

2018-08-22 Thread GitBox
StefanRRichter commented on issue #6603: [FLINK-10198][state] Set Env object in 
DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603#issuecomment-415044671
 
 
   @aljoscha can you take a look, in case there was a good reason why this was 
not always done from the start?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10198) Set Env object in DBOptions for RocksDB

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588897#comment-16588897
 ] 

ASF GitHub Bot commented on FLINK-10198:


StefanRRichter opened a new pull request #6603: [FLINK-10198][state] Set Env 
object in DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603
 
 
   ## What is the purpose of the change
   
   This PR always sets a default environment when creating the `DBOptions`. 
This could simplify resource management for multiple RocksDB instances on one 
machine.
   
   See https://github.com/facebook/rocksdb/wiki/rocksdb-basics:
   Support for Multiple Embedded Databases in the same process
   A common use-case for RocksDB is that applications inherently partition 
their data set into logical partitions or shards. This technique benefits 
application load balancing and fast recovery from faults. This means that a 
single server process should be able to operate multiple RocksDB databases 
simultaneously. This is done via an environment object named Env. Among other 
things, a thread pool is associated with an Env. If applications want to share 
a common thread pool (for background compactions) among multiple database 
instances, then it should use the same Env object for opening those databases.
   
   Similarly, multiple database instances may share the same block cache.
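
   As a purely illustrative sketch (not part of this PR), the idea of sharing the default Env across option objects looks roughly like this in plain RocksJava terms:

```java
import org.rocksdb.DBOptions;
import org.rocksdb.Env;
import org.rocksdb.RocksDB;

public class SharedEnvExample {
	public static void main(String[] args) {
		RocksDB.loadLibrary();
		// Both option objects point at the same default Env, so embedded databases
		// opened with them would share one background thread pool for flushes/compactions.
		try (DBOptions optionsForShardA = new DBOptions().setCreateIfMissing(true).setEnv(Env.getDefault());
			DBOptions optionsForShardB = new DBOptions().setCreateIfMissing(true).setEnv(Env.getDefault())) {
			// ... open one RocksDB instance per shard with these options ...
		}
	}
}
```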
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   `RocksDBStateBackendConfigTest#testSetDefaultEnvInOptions`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (yes)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Set Env object in DBOptions for RocksDB
> ---
>
> Key: FLINK-10198
> URL: https://issues.apache.org/jira/browse/FLINK-10198
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Major
>  Labels: pull-request-available
>
> I think we should consider always setting a default environment when we create 
> the DBOptions.
> See https://github.com/facebook/rocksdb/wiki/rocksdb-basics:
> *Support for Multiple Embedded Databases in the same process*
> A common use-case for RocksDB is that applications inherently partition their 
> data set into logical partitions or shards. This technique benefits 
> application load balancing and fast recovery from faults. This means that a 
> single server process should be able to operate multiple RocksDB databases 
> simultaneously. This is done via an environment object named Env. Among other 
> things, a thread pool is associated with an Env. If applications want to 
> share a common thread pool (for background compactions) among multiple 
> database instances, then it should use the same Env object for opening those 
> databases.
> Similarly, multiple database instances may share the same block cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-10198) Set Env object in DBOptions for RocksDB

2018-08-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-10198:
---
Labels: pull-request-available  (was: )

> Set Env object in DBOptions for RocksDB
> ---
>
> Key: FLINK-10198
> URL: https://issues.apache.org/jira/browse/FLINK-10198
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Major
>  Labels: pull-request-available
>
> I think we should consider always setting a default environment when we create 
> the DBOptions.
> See https://github.com/facebook/rocksdb/wiki/rocksdb-basics:
> *Support for Multiple Embedded Databases in the same process*
> A common use-case for RocksDB is that applications inherently partition their 
> data set into logical partitions or shards. This technique benefits 
> application load balancing and fast recovery from faults. This means that a 
> single server process should be able to operate multiple RocksDB databases 
> simultaneously. This is done via an environment object named Env. Among other 
> things, a thread pool is associated with an Env. If applications want to 
> share a common thread pool (for background compactions) among multiple 
> database instances, then it should use the same Env object for opening those 
> databases.
> Similarly, multiple database instances may share the same block cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] StefanRRichter opened a new pull request #6603: [FLINK-10198][state] Set Env object in DBOptions for RocksDB

2018-08-22 Thread GitBox
StefanRRichter opened a new pull request #6603: [FLINK-10198][state] Set Env 
object in DBOptions for RocksDB
URL: https://github.com/apache/flink/pull/6603
 
 
   ## What is the purpose of the change
   
   This PR always sets a default environment when creating the `DBOptions`. 
This could simplify resource management for multiple RocksDB instances on one 
machine.
   
   See https://github.com/facebook/rocksdb/wiki/rocksdb-basics:
   Support for Multiple Embedded Databases in the same process
   A common use-case for RocksDB is that applications inherently partition 
their data set into logical partitions or shards. This technique benefits 
application load balancing and fast recovery from faults. This means that a 
single server process should be able to operate multiple RocksDB databases 
simultaneously. This is done via an environment object named Env. Among other 
things, a thread pool is associated with an Env. If applications want to share 
a common thread pool (for background compactions) among multiple database 
instances, then it should use the same Env object for opening those databases.
   
   Similarly, multiple database instances may share the same block cache.
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   `RocksDBStateBackendConfigTest#testSetDefaultEnvInOptions`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (yes)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-10198) Set Env object in DBOptions for RocksDB

2018-08-22 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-10198:
--

 Summary: Set Env object in DBOptions for RocksDB
 Key: FLINK-10198
 URL: https://issues.apache.org/jira/browse/FLINK-10198
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.7.0
Reporter: Stefan Richter
Assignee: Stefan Richter


I think we should consider always setting a default environment when we create 
the DBOptions.

See https://github.com/facebook/rocksdb/wiki/rocksdb-basics:
*Support for Multiple Embedded Databases in the same process*
A common use-case for RocksDB is that applications inherently partition their 
data set into logical partitions or shards. This technique benefits application 
load balancing and fast recovery from faults. This means that a single server 
process should be able to operate multiple RocksDB databases simultaneously. 
This is done via an environment object named Env. Among other things, a thread 
pool is associated with an Env. If applications want to share a common thread 
pool (for background compactions) among multiple database instances, then it 
should use the same Env object for opening those databases.

Similarly, multiple database instances may share the same block cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10164) Add support for resuming from savepoints to StandaloneJobClusterEntrypoint

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588856#comment-16588856
 ] 

ASF GitHub Bot commented on FLINK-10164:


tillrohrmann closed pull request #6573: [Backport 1.6][FLINK-10164] Add 
fromSavepoint and allowNonRestoredState CLI options to 
StandaloneJobClusterEntrypoint
URL: https://github.com/apache/flink/pull/6573
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/ops/deployment/kubernetes.md 
b/docs/ops/deployment/kubernetes.md
index 298e473d5f2..5244f5ed544 100644
--- a/docs/ops/deployment/kubernetes.md
+++ b/docs/ops/deployment/kubernetes.md
@@ -34,9 +34,8 @@ Please follow [Kubernetes' setup 
guide](https://kubernetes.io/docs/setup/) in or
 If you want to run Kubernetes locally, we recommend using 
[MiniKube](https://kubernetes.io/docs/setup/minikube/).
 
 
-  Note: If using MiniKube please make sure to execute 
`minikube ssh 'sudo ip link set docker0 
-  promisc on'` before deploying a Flink cluster. Otherwise Flink components 
are not able to self reference 
-  themselves through a Kubernetes service. 
+  Note: If using MiniKube please make sure to execute 
`minikube ssh 'sudo ip link set docker0 promisc on'` before deploying a Flink 
cluster. 
+  Otherwise Flink components are not able to self reference themselves through 
a Kubernetes service. 
 
 
 ## Flink session cluster on Kubernetes
diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
index 0adb8cf75b2..357a87e4fbc 100644
--- 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
+++ 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
@@ -19,6 +19,7 @@
 package org.apache.flink.client.cli;
 
 import org.apache.flink.configuration.CheckpointingOptions;
+import org.apache.flink.runtime.jobgraph.SavepointRestoreSettings;
 
 import org.apache.commons.cli.CommandLine;
 import org.apache.commons.cli.DefaultParser;
@@ -76,10 +77,10 @@
"Address of the JobManager (master) to which to 
connect. " +
"Use this flag to connect to a different JobManager 
than the one specified in the configuration.");
 
-   static final Option SAVEPOINT_PATH_OPTION = new Option("s", 
"fromSavepoint", true,
+   public static final Option SAVEPOINT_PATH_OPTION = new Option("s", 
"fromSavepoint", true,
"Path to a savepoint to restore the job from (for 
example hdfs:///flink/savepoint-1537).");
 
-   static final Option SAVEPOINT_ALLOW_NON_RESTORED_OPTION = new 
Option("n", "allowNonRestoredState", false,
+   public static final Option SAVEPOINT_ALLOW_NON_RESTORED_OPTION = new 
Option("n", "allowNonRestoredState", false,
"Allow to skip savepoint state that cannot be restored. 
" +
"You need to allow this if you removed 
an operator from your " +
"program that was part of the program 
when the savepoint was triggered.");
@@ -401,6 +402,16 @@ private static void printCustomCliOptions(
}
}
 
+   public static SavepointRestoreSettings 
createSavepointRestoreSettings(CommandLine commandLine) {
+   if (commandLine.hasOption(SAVEPOINT_PATH_OPTION.getOpt())) {
+   String savepointPath = 
commandLine.getOptionValue(SAVEPOINT_PATH_OPTION.getOpt());
+   boolean allowNonRestoredState = 
commandLine.hasOption(SAVEPOINT_ALLOW_NON_RESTORED_OPTION.getOpt());
+   return SavepointRestoreSettings.forPath(savepointPath, 
allowNonRestoredState);
+   } else {
+   return SavepointRestoreSettings.none();
+   }
+   }
+
// 

//  Line Parsing
// 

diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java 
b/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
index 1acda1b5265..ccaa4916f9c 100644
--- 
a/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
+++ 
b/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
@@ -36,8 +36,6 @@
 import static org.apache.flink.client.cli.CliFrontendParser.JAR_OPTION;
 import static org.apache.flink.client.cli.CliFrontendParser.LOGGING_OPTION;
 import static org.apache.

[jira] [Closed] (FLINK-10164) Add support for resuming from savepoints to StandaloneJobClusterEntrypoint

2018-08-22 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-10164.
-
Resolution: Fixed

Fixed via
1.7.0: 6258a4c
1.6.1: 5434ca5

> Add support for resuming from savepoints to StandaloneJobClusterEntrypoint
> --
>
> Key: FLINK-10164
> URL: https://issues.apache.org/jira/browse/FLINK-10164
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Affects Versions: 1.6.0, 1.7.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.1, 1.7.0
>
>
> The {{StandaloneJobClusterEntrypoint}} should support resuming from a 
> savepoint/checkpoint. I suggest introducing an optional command line 
> parameter for specifying the savepoint/checkpoint path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] tillrohrmann closed pull request #6573: [Backport 1.6][FLINK-10164] Add fromSavepoint and allowNonRestoredState CLI options to StandaloneJobClusterEntrypoint

2018-08-22 Thread GitBox
tillrohrmann closed pull request #6573: [Backport 1.6][FLINK-10164] Add 
fromSavepoint and allowNonRestoredState CLI options to 
StandaloneJobClusterEntrypoint
URL: https://github.com/apache/flink/pull/6573
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/ops/deployment/kubernetes.md 
b/docs/ops/deployment/kubernetes.md
index 298e473d5f2..5244f5ed544 100644
--- a/docs/ops/deployment/kubernetes.md
+++ b/docs/ops/deployment/kubernetes.md
@@ -34,9 +34,8 @@ Please follow [Kubernetes' setup 
guide](https://kubernetes.io/docs/setup/) in or
 If you want to run Kubernetes locally, we recommend using 
[MiniKube](https://kubernetes.io/docs/setup/minikube/).
 
 
-  Note: If using MiniKube please make sure to execute 
`minikube ssh 'sudo ip link set docker0 
-  promisc on'` before deploying a Flink cluster. Otherwise Flink components 
are not able to self reference 
-  themselves through a Kubernetes service. 
+  Note: If using MiniKube please make sure to execute 
`minikube ssh 'sudo ip link set docker0 promisc on'` before deploying a Flink 
cluster. 
+  Otherwise Flink components are not able to self reference themselves through 
a Kubernetes service. 
 
 
 ## Flink session cluster on Kubernetes
diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
index 0adb8cf75b2..357a87e4fbc 100644
--- 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
+++ 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
@@ -19,6 +19,7 @@
 package org.apache.flink.client.cli;
 
 import org.apache.flink.configuration.CheckpointingOptions;
+import org.apache.flink.runtime.jobgraph.SavepointRestoreSettings;
 
 import org.apache.commons.cli.CommandLine;
 import org.apache.commons.cli.DefaultParser;
@@ -76,10 +77,10 @@
"Address of the JobManager (master) to which to 
connect. " +
"Use this flag to connect to a different JobManager 
than the one specified in the configuration.");
 
-   static final Option SAVEPOINT_PATH_OPTION = new Option("s", 
"fromSavepoint", true,
+   public static final Option SAVEPOINT_PATH_OPTION = new Option("s", 
"fromSavepoint", true,
"Path to a savepoint to restore the job from (for 
example hdfs:///flink/savepoint-1537).");
 
-   static final Option SAVEPOINT_ALLOW_NON_RESTORED_OPTION = new 
Option("n", "allowNonRestoredState", false,
+   public static final Option SAVEPOINT_ALLOW_NON_RESTORED_OPTION = new 
Option("n", "allowNonRestoredState", false,
"Allow to skip savepoint state that cannot be restored. 
" +
"You need to allow this if you removed 
an operator from your " +
"program that was part of the program 
when the savepoint was triggered.");
@@ -401,6 +402,16 @@ private static void printCustomCliOptions(
}
}
 
+   public static SavepointRestoreSettings 
createSavepointRestoreSettings(CommandLine commandLine) {
+   if (commandLine.hasOption(SAVEPOINT_PATH_OPTION.getOpt())) {
+   String savepointPath = 
commandLine.getOptionValue(SAVEPOINT_PATH_OPTION.getOpt());
+   boolean allowNonRestoredState = 
commandLine.hasOption(SAVEPOINT_ALLOW_NON_RESTORED_OPTION.getOpt());
+   return SavepointRestoreSettings.forPath(savepointPath, 
allowNonRestoredState);
+   } else {
+   return SavepointRestoreSettings.none();
+   }
+   }
+
// 

//  Line Parsing
// 

diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java 
b/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
index 1acda1b5265..ccaa4916f9c 100644
--- 
a/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
+++ 
b/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
@@ -36,8 +36,6 @@
 import static org.apache.flink.client.cli.CliFrontendParser.JAR_OPTION;
 import static org.apache.flink.client.cli.CliFrontendParser.LOGGING_OPTION;
 import static org.apache.flink.client.cli.CliFrontendParser.PARALLELISM_OPTION;
-import static 
org.apache.flink.client.cli.CliFrontendParser.SAVEPOINT_ALLOW_NON_RESTORED_OPTION;
-import static 
org.apache.flink.client.cli.CliFrontendParser.SAVEPOINT_PATH_OPTION;
 import static 
org.a

[jira] [Commented] (FLINK-10164) Add support for resuming from savepoints to StandaloneJobClusterEntrypoint

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588854#comment-16588854
 ] 

ASF GitHub Bot commented on FLINK-10164:


tillrohrmann closed pull request #6572: [FLINK-10164] Add fromSavepoint and 
allowNonRestoredState CLI options to StandaloneJobClusterEntrypoint
URL: https://github.com/apache/flink/pull/6572
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/ops/deployment/kubernetes.md 
b/docs/ops/deployment/kubernetes.md
index 298e473d5f2..5244f5ed544 100644
--- a/docs/ops/deployment/kubernetes.md
+++ b/docs/ops/deployment/kubernetes.md
@@ -34,9 +34,8 @@ Please follow [Kubernetes' setup 
guide](https://kubernetes.io/docs/setup/) in or
 If you want to run Kubernetes locally, we recommend using 
[MiniKube](https://kubernetes.io/docs/setup/minikube/).
 
 
-  Note: If using MiniKube please make sure to execute 
`minikube ssh 'sudo ip link set docker0 
-  promisc on'` before deploying a Flink cluster. Otherwise Flink components 
are not able to self reference 
-  themselves through a Kubernetes service. 
+  Note: If using MiniKube please make sure to execute 
`minikube ssh 'sudo ip link set docker0 promisc on'` before deploying a Flink 
cluster. 
+  Otherwise Flink components are not able to self reference themselves through 
a Kubernetes service. 
 
 
 ## Flink session cluster on Kubernetes
diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
index 0adb8cf75b2..357a87e4fbc 100644
--- 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
+++ 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
@@ -19,6 +19,7 @@
 package org.apache.flink.client.cli;
 
 import org.apache.flink.configuration.CheckpointingOptions;
+import org.apache.flink.runtime.jobgraph.SavepointRestoreSettings;
 
 import org.apache.commons.cli.CommandLine;
 import org.apache.commons.cli.DefaultParser;
@@ -76,10 +77,10 @@
"Address of the JobManager (master) to which to 
connect. " +
"Use this flag to connect to a different JobManager 
than the one specified in the configuration.");
 
-   static final Option SAVEPOINT_PATH_OPTION = new Option("s", 
"fromSavepoint", true,
+   public static final Option SAVEPOINT_PATH_OPTION = new Option("s", 
"fromSavepoint", true,
"Path to a savepoint to restore the job from (for 
example hdfs:///flink/savepoint-1537).");
 
-   static final Option SAVEPOINT_ALLOW_NON_RESTORED_OPTION = new 
Option("n", "allowNonRestoredState", false,
+   public static final Option SAVEPOINT_ALLOW_NON_RESTORED_OPTION = new 
Option("n", "allowNonRestoredState", false,
"Allow to skip savepoint state that cannot be restored. 
" +
"You need to allow this if you removed 
an operator from your " +
"program that was part of the program 
when the savepoint was triggered.");
@@ -401,6 +402,16 @@ private static void printCustomCliOptions(
}
}
 
+   public static SavepointRestoreSettings 
createSavepointRestoreSettings(CommandLine commandLine) {
+   if (commandLine.hasOption(SAVEPOINT_PATH_OPTION.getOpt())) {
+   String savepointPath = 
commandLine.getOptionValue(SAVEPOINT_PATH_OPTION.getOpt());
+   boolean allowNonRestoredState = 
commandLine.hasOption(SAVEPOINT_ALLOW_NON_RESTORED_OPTION.getOpt());
+   return SavepointRestoreSettings.forPath(savepointPath, 
allowNonRestoredState);
+   } else {
+   return SavepointRestoreSettings.none();
+   }
+   }
+
// 

//  Line Parsing
// 

diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java 
b/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
index 1acda1b5265..ccaa4916f9c 100644
--- 
a/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
+++ 
b/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
@@ -36,8 +36,6 @@
 import static org.apache.flink.client.cli.CliFrontendParser.JAR_OPTION;
 import static org.apache.flink.client.cli.CliFrontendParser.LOGGING_OPTION;
 import static org.apache.flink.client.cl

[GitHub] tillrohrmann closed pull request #6572: [FLINK-10164] Add fromSavepoint and allowNonRestoredState CLI options to StandaloneJobClusterEntrypoint

2018-08-22 Thread GitBox
tillrohrmann closed pull request #6572: [FLINK-10164] Add fromSavepoint and 
allowNonRestoredState CLI options to StandaloneJobClusterEntrypoint
URL: https://github.com/apache/flink/pull/6572
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/ops/deployment/kubernetes.md 
b/docs/ops/deployment/kubernetes.md
index 298e473d5f2..5244f5ed544 100644
--- a/docs/ops/deployment/kubernetes.md
+++ b/docs/ops/deployment/kubernetes.md
@@ -34,9 +34,8 @@ Please follow [Kubernetes' setup 
guide](https://kubernetes.io/docs/setup/) in or
 If you want to run Kubernetes locally, we recommend using 
[MiniKube](https://kubernetes.io/docs/setup/minikube/).
 
 
-  Note: If using MiniKube please make sure to execute 
`minikube ssh 'sudo ip link set docker0 
-  promisc on'` before deploying a Flink cluster. Otherwise Flink components 
are not able to self reference 
-  themselves through a Kubernetes service. 
+  Note: If using MiniKube please make sure to execute 
`minikube ssh 'sudo ip link set docker0 promisc on'` before deploying a Flink 
cluster. 
+  Otherwise Flink components are not able to self reference themselves through 
a Kubernetes service. 
 
 
 ## Flink session cluster on Kubernetes
diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
index 0adb8cf75b2..357a87e4fbc 100644
--- 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
+++ 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontendParser.java
@@ -19,6 +19,7 @@
 package org.apache.flink.client.cli;
 
 import org.apache.flink.configuration.CheckpointingOptions;
+import org.apache.flink.runtime.jobgraph.SavepointRestoreSettings;
 
 import org.apache.commons.cli.CommandLine;
 import org.apache.commons.cli.DefaultParser;
@@ -76,10 +77,10 @@
"Address of the JobManager (master) to which to 
connect. " +
"Use this flag to connect to a different JobManager 
than the one specified in the configuration.");
 
-   static final Option SAVEPOINT_PATH_OPTION = new Option("s", 
"fromSavepoint", true,
+   public static final Option SAVEPOINT_PATH_OPTION = new Option("s", 
"fromSavepoint", true,
"Path to a savepoint to restore the job from (for 
example hdfs:///flink/savepoint-1537).");
 
-   static final Option SAVEPOINT_ALLOW_NON_RESTORED_OPTION = new 
Option("n", "allowNonRestoredState", false,
+   public static final Option SAVEPOINT_ALLOW_NON_RESTORED_OPTION = new 
Option("n", "allowNonRestoredState", false,
"Allow to skip savepoint state that cannot be restored. 
" +
"You need to allow this if you removed 
an operator from your " +
"program that was part of the program 
when the savepoint was triggered.");
@@ -401,6 +402,16 @@ private static void printCustomCliOptions(
}
}
 
+   public static SavepointRestoreSettings 
createSavepointRestoreSettings(CommandLine commandLine) {
+   if (commandLine.hasOption(SAVEPOINT_PATH_OPTION.getOpt())) {
+   String savepointPath = 
commandLine.getOptionValue(SAVEPOINT_PATH_OPTION.getOpt());
+   boolean allowNonRestoredState = 
commandLine.hasOption(SAVEPOINT_ALLOW_NON_RESTORED_OPTION.getOpt());
+   return SavepointRestoreSettings.forPath(savepointPath, 
allowNonRestoredState);
+   } else {
+   return SavepointRestoreSettings.none();
+   }
+   }
+
// 

//  Line Parsing
// 

diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java 
b/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
index 1acda1b5265..ccaa4916f9c 100644
--- 
a/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
+++ 
b/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
@@ -36,8 +36,6 @@
 import static org.apache.flink.client.cli.CliFrontendParser.JAR_OPTION;
 import static org.apache.flink.client.cli.CliFrontendParser.LOGGING_OPTION;
 import static org.apache.flink.client.cli.CliFrontendParser.PARALLELISM_OPTION;
-import static 
org.apache.flink.client.cli.CliFrontendParser.SAVEPOINT_ALLOW_NON_RESTORED_OPTION;
-import static 
org.apache.flink.client.cli.CliFrontendParser.SAVEPOINT_PATH_OPTION;
 import static 
org.apache.flink.cli

[jira] [Commented] (FLINK-10193) Default RPC timeout is used when triggering savepoint via JobMasterGateway

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588799#comment-16588799
 ] 

ASF GitHub Bot commented on FLINK-10193:


tillrohrmann commented on issue #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601#issuecomment-415020422
 
 
   Can't we check in the `JobMaster#triggerSavepoint` method which `timeout` 
value was passed in? It is not a super good test but it would guard against 
this specific failure in the future.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Default RPC timeout is used when triggering savepoint via JobMasterGateway
> --
>
> Key: FLINK-10193
> URL: https://issues.apache.org/jira/browse/FLINK-10193
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.5.3, 1.6.0, 1.7.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.6.1, 1.7.0, 1.5.4
>
> Attachments: SlowToCheckpoint.java
>
>
> When calling {{JobMasterGateway#triggerSavepoint(String, boolean, Time)}}, 
> the default timeout is used because the time parameter of the method  is not 
> annotated with {{@RpcTimeout}}. 
> *Expected behavior*
> * timeout for the RPC should be {{RpcUtils.INF_TIMEOUT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] tillrohrmann commented on issue #6601: [FLINK-10193] Add @RpcTimeout to JobMasterGateway.triggerSavepoint

2018-08-22 Thread GitBox
tillrohrmann commented on issue #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601#issuecomment-415020422
 
 
   Can't we check in the `JobMaster#triggerSavepoint` method which `timeout` 
value was passed in? It is not a super good test but it would guard against 
this specific failure in the future.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10162) Add host in checkpoint detail

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588784#comment-16588784
 ] 

ASF GitHub Bot commented on FLINK-10162:


yanghua commented on issue #6566: 
[FLINK-10162][flink-runtime][flink-runtime-web] add host in checkpoint detail
URL: https://github.com/apache/flink/pull/6566#issuecomment-415014507
 
 
   +1


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add host in checkpoint detail
> -
>
> Key: FLINK-10162
> URL: https://issues.apache.org/jira/browse/FLINK-10162
> Project: Flink
>  Issue Type: Improvement
>Reporter: xymaqingxiang
>Assignee: xymaqingxiang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2018-08-16-21-14-57-221.png, 
> image-2018-08-16-21-15-42-167.png, image-2018-08-16-21-15-46-585.png, 
> image-2018-08-16-21-36-04-977.png
>
>
> When a streaming job runs for a long time, there will be failures at 
> checkpoints. We will then need to look at the log for the 
> corresponding task and the performance of the machine on which the task is 
> located. 
> Therefore, I suggest adding the host information to the checkpoint so that, 
> if a checkpoint fails, you can log in to the appropriate 
> machine to see the cause of the failure.
> The failures are as follows:
> !image-2018-08-16-21-14-57-221.png!
> !image-2018-08-16-21-15-46-585.png!
> The revised representation is as follows:
>   !image-2018-08-16-21-36-04-977.png!
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10193) Default RPC timeout is used when triggering savepoint via JobMasterGateway

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588783#comment-16588783
 ] 

ASF GitHub Bot commented on FLINK-10193:


GJL edited a comment on issue #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601#issuecomment-415012048
 
 
   I see the following possibilities to unit test this:
   
   1. Override `triggerSavepoint` to a NOP, and trigger savepoint twice with 
increasing timeouts. The second invocation should fail later.
   2. Check for presence of the annotation with reflection.
   
   I deliberately did not add tests because Option 1 is slow and depends on
   _"good"_ thread scheduling. Option 2 is fragile and not refactoring friendly.
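
   For illustration, a reflection-based check along the lines of option 2 might look roughly like the sketch below. The parameter list `(String, boolean, Time)` is an assumption about the gateway signature, and the test is shown only to make the trade-off concrete, not as a recommendation:

```java
import static org.junit.Assert.assertTrue;

import java.lang.reflect.Method;
import java.util.Arrays;

import org.apache.flink.api.common.time.Time;
import org.apache.flink.runtime.jobmaster.JobMasterGateway;
import org.apache.flink.runtime.rpc.RpcTimeout;

import org.junit.Test;

public class TriggerSavepointAnnotationTest {

	@Test
	public void timeoutParameterIsAnnotatedWithRpcTimeout() throws Exception {
		// Assumed signature: triggerSavepoint(String targetDirectory, boolean cancelJob, Time timeout).
		Method method = JobMasterGateway.class.getMethod(
			"triggerSavepoint", String.class, boolean.class, Time.class);

		// The timeout parameter is the third one; check that it carries @RpcTimeout.
		boolean annotated = Arrays.stream(method.getParameterAnnotations()[2])
			.anyMatch(annotation -> annotation instanceof RpcTimeout);

		assertTrue("The timeout parameter should be annotated with @RpcTimeout.", annotated);
	}
}
```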
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Default RPC timeout is used when triggering savepoint via JobMasterGateway
> --
>
> Key: FLINK-10193
> URL: https://issues.apache.org/jira/browse/FLINK-10193
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.5.3, 1.6.0, 1.7.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.6.1, 1.7.0, 1.5.4
>
> Attachments: SlowToCheckpoint.java
>
>
> When calling {{JobMasterGateway#triggerSavepoint(String, boolean, Time)}}, 
> the default timeout is used because the time parameter of the method  is not 
> annotated with {{@RpcTimeout}}. 
> *Expected behavior*
> * timeout for the RPC should be {{RpcUtils.INF_TIMEOUT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] yanghua commented on issue #6566: [FLINK-10162][flink-runtime][flink-runtime-web] add host in checkpoint detail

2018-08-22 Thread GitBox
yanghua commented on issue #6566: 
[FLINK-10162][flink-runtime][flink-runtime-web] add host in checkpoint detail
URL: https://github.com/apache/flink/pull/6566#issuecomment-415014507
 
 
   +1


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] GJL edited a comment on issue #6601: [FLINK-10193] Add @RpcTimeout to JobMasterGateway.triggerSavepoint

2018-08-22 Thread GitBox
GJL edited a comment on issue #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601#issuecomment-415012048
 
 
   I see the following possibilities to unit test this:
   
   1. Override `triggerSavepoint` to a NOP, and trigger savepoint twice with 
increasing timeouts. The second invocation should fail later.
   2. Check for presence of the annotation with reflection.
   
   I deliberately did not add tests because Option 1 is slow and depends on
   _"good"_ thread scheduling. Option 2 is fragile and not refactoring friendly.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-10193) Default RPC timeout is used when triggering savepoint via JobMasterGateway

2018-08-22 Thread Gary Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao updated FLINK-10193:
-
Affects Version/s: 1.7.0

> Default RPC timeout is used when triggering savepoint via JobMasterGateway
> --
>
> Key: FLINK-10193
> URL: https://issues.apache.org/jira/browse/FLINK-10193
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.5.3, 1.6.0, 1.7.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.6.1, 1.7.0, 1.5.4
>
> Attachments: SlowToCheckpoint.java
>
>
> When calling {{JobMasterGateway#triggerSavepoint(String, boolean, Time)}}, 
> the default timeout is used because the time parameter of the method  is not 
> annotated with {{@RpcTimeout}}. 
> *Expected behavior*
> * timeout for the RPC should be {{RpcUtils.INF_TIMEOUT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10193) Default RPC timeout is used when triggering savepoint via JobMasterGateway

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588778#comment-16588778
 ] 

ASF GitHub Bot commented on FLINK-10193:


GJL commented on issue #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601#issuecomment-415012048
 
 
   I see the following possibilities to test this:
   
   1. Override `triggerSavepoint` to a NOP, and trigger savepoint twice with 
increasing timeouts. The second invocation should fail later.
   2. Check for presence of the annotation with reflection.
   
   I deliberately did not add tests because Option 1 is slow and depends on
   _"good"_ thread scheduling. Option 2 is fragile and not refactoring friendly.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Default RPC timeout is used when triggering savepoint via JobMasterGateway
> --
>
> Key: FLINK-10193
> URL: https://issues.apache.org/jira/browse/FLINK-10193
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.5.3, 1.6.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.6.1, 1.7.0, 1.5.4
>
> Attachments: SlowToCheckpoint.java
>
>
> When calling {{JobMasterGateway#triggerSavepoint(String, boolean, Time)}}, 
> the default timeout is used because the time parameter of the method  is not 
> annotated with {{@RpcTimeout}}. 
> *Expected behavior*
> * timeout for the RPC should be {{RpcUtils.INF_TIMEOUT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] GJL commented on issue #6601: [FLINK-10193] Add @RpcTimeout to JobMasterGateway.triggerSavepoint

2018-08-22 Thread GitBox
GJL commented on issue #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601#issuecomment-415012048
 
 
   I see the following possibilities to test this:
   
   1. Override `triggerSavepoint` to a NOP, and trigger savepoint twice with 
increasing timeouts. The second invocation should fail later.
   2. Check for presence of the annotation with reflection.
   
   I deliberately did not add tests because Option 1 is slow and depends on
   _"good"_ thread scheduling. Option 2 is fragile and not refactoring friendly.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-7551) Add VERSION to the REST urls.

2018-08-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7551:
--
Labels: pull-request-available  (was: )

> Add VERSION to the REST urls. 
> --
>
> Key: FLINK-7551
> URL: https://issues.apache.org/jira/browse/FLINK-7551
> Project: Flink
>  Issue Type: Improvement
>  Components: REST
>Affects Versions: 1.4.0
>Reporter: Kostas Kloudas
>Assignee: Chesnay Schepler
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> This is to guarantee that we can update the REST API without breaking 
> existing third-party clients.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7551) Add VERSION to the REST urls.

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588773#comment-16588773
 ] 

ASF GitHub Bot commented on FLINK-7551:
---

zentol opened a new pull request #6602:  [FLINK-7551][rest] Add versioning to 
REST API
URL: https://github.com/apache/flink/pull/6602
 
 
   ## What is the purpose of the change
   
   This PR adds a versioning scheme to the REST API.
   
   Versions are represented as a simple prefix (e.g. `v1`) that is prepended to 
the request URL, like `/v1/foo/bar`.
   
   Supported versions are encoded in the `MessageHeaders`, and used by the 
`RestServerEndpoint` to register handlers for specific versioned URLs.
   Additionally, handlers are also registered for the _unversioned_ URLs, 
effectively adding a default version to every request, namely the oldest 
supported one. This means that users working against unversioned URLs 
will, for the time being, always work against version 1, even when new versions 
are added.
   
   ## Brief change log
   
   * [hotfix] Update error handling for `*FileServerHandlers` to be in-line 
with other handlers
   
   * added `RestAPIVersion` enum
   * added `RestHandlerSpecification#getSupportedAPIVersions`
   * `RestClient#sendRequest` now accepts an optional `RestAPIVersion` 
argument; default is latest supported version
   
   `RestServerEndpoint`:
   * modified sorting logic so that newer versions are registered first, to 
ensure that the oldest version is the default (since they override previous 
registrations)
   * additionally register handlers for versioned URLs
   
   * updated the `RestAPIDocsGenerator` to create separate tables for each 
version
   * updated REST documentation to include information about versioning
   * reworked REST API docs layout to better separate versioned and legacy APIs
   
   
![untitled](https://user-images.githubusercontent.com/5725237/44462438-e0732780-a614-11e8-96c0-b49244bb68e8.png)
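
   To make the registration scheme concrete, here is a small self-contained toy model (not Flink code; all names are made up for illustration) of how a handler that declares its supported versions could be registered under versioned URLs plus the unversioned default:

```java
import java.util.Collection;
import java.util.EnumSet;

public class VersioningSketch {

	enum ApiVersion { V1, V2 }

	interface HandlerSpec {
		String unversionedUrl();
		Collection<ApiVersion> supportedVersions();
	}

	// Register the handler once per supported version, plus once under the plain URL,
	// which acts as the default (mapped to the oldest supported version).
	static void register(HandlerSpec spec) {
		for (ApiVersion version : spec.supportedVersions()) {
			System.out.println("/" + version.name().toLowerCase() + spec.unversionedUrl());
		}
		System.out.println(spec.unversionedUrl());
	}

	public static void main(String[] args) {
		register(new HandlerSpec() {
			public String unversionedUrl() { return "/foo/bar"; }
			public Collection<ApiVersion> supportedVersions() { return EnumSet.of(ApiVersion.V1); }
		});
		// Prints: /v1/foo/bar and /foo/bar
	}
}
```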
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
   * RestServerEndpointITCase#testVersioning
   * RestAPIVersionTest
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add VERSION to the REST urls. 
> --
>
> Key: FLINK-7551
> URL: https://issues.apache.org/jira/browse/FLINK-7551
> Project: Flink
>  Issue Type: Improvement
>  Components: REST
>Affects Versions: 1.4.0
>Reporter: Kostas Kloudas
>Assignee: Chesnay Schepler
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> This is to guarantee that we can update the REST API without breaking 
> existing third-party clients.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zentol opened a new pull request #6602: [FLINK-7551][rest] Add versioning to REST API

2018-08-22 Thread GitBox
zentol opened a new pull request #6602:  [FLINK-7551][rest] Add versioning to 
REST API
URL: https://github.com/apache/flink/pull/6602
 
 
   ## What is the purpose of the change
   
   This PR adds a versioning scheme to the REST API.
   
   Versions are represented as a simple prefix (e.g. `v1`) that is prepended to 
the request URL, like `/v1/foo/bar`.
   
   Supported versions are encoded in the `MessageHeaders`, and used by the 
`RestServerEndpoint` to register handlers for specific versioned URLs.
   Additionally, handlers are also registered for the _unversioned_ URLs, 
effectively adding a default version to every request, namely the oldest 
supported one. This means that users working against unversioned URLs 
will, for the time being, always work against version 1, even when new versions 
are added.
   
   ## Brief change log
   
   * [hotfix] Update error handling for `*FileServerHandlers` to be in-line 
with other handlers
   
   * added `RestAPIVersion` enum
   * added `RestHandlerSpecification#getSupportedAPIVersions`
   * `RestClient#sendRequest` now accepts an optional `RestAPIVersion` 
argument; default is latest supported version
   
   `RestServerEndpoint`:
   * modified sorting logic so that newer versions are registered first, to 
ensure that the oldest version is the default (since they override previous 
registrations)
   * additionally register handlers for versioned URLs
   
   * updated the `RestAPIDocsGenerator` to create separate tables for each 
version
   * updated REST documentation to include information about versioning
   * reworked REST API docs layout to better separate versioned and legacy APIs
   
   
![untitled](https://user-images.githubusercontent.com/5725237/44462438-e0732780-a614-11e8-96c0-b49244bb68e8.png)
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
   * RestServerEndpointITCase#testVersioning
   * RestAPIVersionTest
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-10193) Default RPC timeout is used when triggering savepoint via JobMasterGateway

2018-08-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-10193:
---
Labels: pull-request-available  (was: )

> Default RPC timeout is used when triggering savepoint via JobMasterGateway
> --
>
> Key: FLINK-10193
> URL: https://issues.apache.org/jira/browse/FLINK-10193
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.5.3, 1.6.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.6.1, 1.7.0, 1.5.4
>
> Attachments: SlowToCheckpoint.java
>
>
> When calling {{JobMasterGateway#triggerSavepoint(String, boolean, Time)}}, 
> the default timeout is used because the time parameter of the method  is not 
> annotated with {{@RpcTimeout}}. 
> *Expected behavior*
> * timeout for the RPC should be {{RpcUtils.INF_TIMEOUT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10193) Default RPC timeout is used when triggering savepoint via JobMasterGateway

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588768#comment-16588768
 ] 

ASF GitHub Bot commented on FLINK-10193:


GJL opened a new pull request #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601
 
 
   ## What is the purpose of the change
   
   *This adds the `@RpcTimeout` annotation to 
`JobMasterGateway.triggerSavepoint` so that the default timeout is overridden.*
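
   For context, a minimal sketch of the pattern this change applies (the `SavepointGateway` interface below is made up for illustration; the real `JobMasterGateway` has more methods and may differ in detail):

```java
import java.util.concurrent.CompletableFuture;

import org.apache.flink.api.common.time.Time;
import org.apache.flink.runtime.rpc.RpcTimeout;

// Sketch only: a Time parameter annotated with @RpcTimeout is picked up by the
// RPC framework and used as the timeout of this call instead of the default one.
interface SavepointGateway {
    CompletableFuture<String> triggerSavepoint(
            String targetDirectory,
            boolean cancelJob,
            @RpcTimeout Time timeout);
}
```

   Without the annotation, the framework falls back to the default RPC timeout, which is what this PR fixes for `triggerSavepoint`.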
   
   cc: @uce @tillrohrmann 
   
   ## Brief change log
   
 - *Add @RpcTimeout to `JobMasterGateway.triggerSavepoint`.*
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - *Manually verified the change by running `SlowToCheckpoint` job attached 
to FLINK-10193 issue.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (**yes** / no / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Default RPC timeout is used when triggering savepoint via JobMasterGateway
> --
>
> Key: FLINK-10193
> URL: https://issues.apache.org/jira/browse/FLINK-10193
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.5.3, 1.6.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.6.1, 1.7.0, 1.5.4
>
> Attachments: SlowToCheckpoint.java
>
>
> When calling {{JobMasterGateway#triggerSavepoint(String, boolean, Time)}}, 
> the default timeout is used because the time parameter of the method  is not 
> annotated with {{@RpcTimeout}}. 
> *Expected behavior*
> * timeout for the RPC should be {{RpcUtils.INF_TIMEOUT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] GJL opened a new pull request #6601: [FLINK-10193] Add @RpcTimeout to JobMasterGateway.triggerSavepoint

2018-08-22 Thread GitBox
GJL opened a new pull request #6601: [FLINK-10193] Add @RpcTimeout to 
JobMasterGateway.triggerSavepoint
URL: https://github.com/apache/flink/pull/6601
 
 
   ## What is the purpose of the change
   
   *This adds the `@RpcTimeout` annotation to 
`JobMasterGateway.triggerSavepoint` so that the default timeout is overridden.*
   
   cc: @uce @tillrohrmann 
   
   ## Brief change log
   
 - *Add @RpcTimeout to `JobMasterGateway.triggerSavepoint`.*
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - *Manually verified the change by running `SlowToCheckpoint` job attached 
to FLINK-10193 issue.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (**yes** / no / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10164) Add support for resuming from savepoints to StandaloneJobClusterEntrypoint

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588762#comment-16588762
 ] 

ASF GitHub Bot commented on FLINK-10164:


tillrohrmann commented on issue #6572: [FLINK-10164] Add fromSavepoint and 
allowNonRestoredState CLI options to StandaloneJobClusterEntrypoint
URL: https://github.com/apache/flink/pull/6572#issuecomment-415005785
 
 
   Travis passed. Merging this PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add support for resuming from savepoints to StandaloneJobClusterEntrypoint
> --
>
> Key: FLINK-10164
> URL: https://issues.apache.org/jira/browse/FLINK-10164
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Affects Versions: 1.6.0, 1.7.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.6.1, 1.7.0
>
>
> The {{StandaloneJobClusterEntrypoint}} should support resuming from a 
> savepoint/checkpoint. I suggest introducing an optional command line 
> parameter for specifying the savepoint/checkpoint path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] tillrohrmann commented on issue #6572: [FLINK-10164] Add fromSavepoint and allowNonRestoredState CLI options to StandaloneJobClusterEntrypoint

2018-08-22 Thread GitBox
tillrohrmann commented on issue #6572: [FLINK-10164] Add fromSavepoint and 
allowNonRestoredState CLI options to StandaloneJobClusterEntrypoint
URL: https://github.com/apache/flink/pull/6572#issuecomment-415005785
 
 
   Travis passed. Merging this PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-8868) Support Table Function as Table for Stream Sql

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588757#comment-16588757
 ] 

ASF GitHub Bot commented on FLINK-8868:
---

Xpray commented on a change in pull request #6574: [FLINK-8868] [table] Support 
Table Function as Table Source for Stream Sql
URL: https://github.com/apache/flink/pull/6574#discussion_r211924061
 
 

 ##
 File path: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/sql/SqlITCase.scala
 ##
 @@ -897,6 +897,45 @@ class SqlITCase extends StreamingWithStateTestBase {
 
 assertEquals(List(expected.toString()), StreamITCase.testResults.sorted)
   }
+
+  @Test
 
 Review comment:
   @pnowojski, I think it's better to support SQL only for now; the Table API 
needs more effort.



This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Support Table Function as Table for Stream Sql
> --
>
> Key: FLINK-8868
> URL: https://issues.apache.org/jira/browse/FLINK-8868
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>Priority: Major
>  Labels: pull-request-available
>
> for stream SQL:
> support SQL like:  SELECT * FROM TABLE(tf("a"))
> for batch SQL:
> a UDTF might produce infinite records; this needs to be discussed
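
A minimal sketch of what the proposed usage could look like from the Java Table API side (the {{Split}} function and the {{tableEnv}} variable are made up for illustration; the query itself only works once this feature is implemented):

{code:java}
// Hypothetical user-defined table function, used only for illustration.
public static class Split extends TableFunction<String> {
    public void eval(String input) {
        for (String token : input.split(",")) {
            collect(token);
        }
    }
}

// Register the function and query it as a table, as proposed in this issue.
tableEnv.registerFunction("tf", new Split());
Table result = tableEnv.sqlQuery("SELECT * FROM TABLE(tf('a,b,c'))");
{code}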



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Xpray commented on a change in pull request #6574: [FLINK-8868] [table] Support Table Function as Table Source for Stream Sql

2018-08-22 Thread GitBox
Xpray commented on a change in pull request #6574: [FLINK-8868] [table] Support 
Table Function as Table Source for Stream Sql
URL: https://github.com/apache/flink/pull/6574#discussion_r211924061
 
 

 ##
 File path: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/sql/SqlITCase.scala
 ##
 @@ -897,6 +897,45 @@ class SqlITCase extends StreamingWithStateTestBase {
 
 assertEquals(List(expected.toString()), StreamITCase.testResults.sorted)
   }
+
+  @Test
 
 Review comment:
   @pnowojski, I think it's better to support SQL only for now; the Table API 
needs more effort.



This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-10153) Add tutorial section to documentation

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588749#comment-16588749
 ] 

ASF GitHub Bot commented on FLINK-10153:


TisonKun commented on issue #6565: [FLINK-10153] [docs] Add Tutorials section 
and rework structure.
URL: https://github.com/apache/flink/pull/6565#issuecomment-415003199
 
 
   @fhueske go ahead!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add tutorial section to documentation
> -
>
> Key: FLINK-10153
> URL: https://issues.apache.org/jira/browse/FLINK-10153
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>Priority: Major
>  Labels: pull-request-available
>
> The current documentation does not feature a dedicated tutorials section and 
> has a few issues that should be fixed in order to help our (future) users 
> get started with Flink.
> I propose to add a single "Tutorials" section to the documentation where 
> users find step-by-step guides. The tutorials section helps users with 
> different goals:
>   * Get a quick idea of the overall system
>   * Implement a DataStream/DataSet/Table API/SQL job
>   * Set up Flink on a local machine (or run a Docker container)
> There are already a few guides to get started but they are located at 
> different places and should be moved into the Tutorials section. Moreover, 
> some sections such as "Project Setup" contain content that addresses users 
> with very different intentions.
> I propose to
> * add a new Tutorials section and move all existing tutorials there (and 
> later add new ones).
> * move the "Quickstart" section to "Tutorials".
> * remove the "Project Setup" section and move the pages to other sections 
> (some pages will be split up or adjusted).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10162) Add host in checkpoint detail

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588746#comment-16588746
 ] 

ASF GitHub Bot commented on FLINK-10162:


maqingxiang commented on issue #6566: 
[FLINK-10162][flink-runtime][flink-runtime-web] add host in checkpoint detail
URL: https://github.com/apache/flink/pull/6566#issuecomment-415002987
 
 
   cc @yanghua 
   
   Thank you for the review, please help review it again.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add host in checkpoint detail
> -
>
> Key: FLINK-10162
> URL: https://issues.apache.org/jira/browse/FLINK-10162
> Project: Flink
>  Issue Type: Improvement
>Reporter: xymaqingxiang
>Assignee: xymaqingxiang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2018-08-16-21-14-57-221.png, 
> image-2018-08-16-21-15-42-167.png, image-2018-08-16-21-15-46-585.png, 
> image-2018-08-16-21-36-04-977.png
>
>
> When a streaming job runs for a long time, there will be failures at 
> checkpoints. We then need to look at the log of the 
> corresponding task and the performance of the machine on which the task is 
> located. 
> Therefore, I suggest adding the host information to the checkpoint so that, 
> if the checkpoint fails, you can log in to the appropriate 
> machine to see the cause of the failure.
> The failures are as follows:
> !image-2018-08-16-21-14-57-221.png!
> !image-2018-08-16-21-15-46-585.png!
> The revised representation is as follows:
>   !image-2018-08-16-21-36-04-977.png!
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] TisonKun commented on issue #6565: [FLINK-10153] [docs] Add Tutorials section and rework structure.

2018-08-22 Thread GitBox
TisonKun commented on issue #6565: [FLINK-10153] [docs] Add Tutorials section 
and rework structure.
URL: https://github.com/apache/flink/pull/6565#issuecomment-415003199
 
 
   @fhueske go ahead!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] maqingxiang commented on issue #6566: [FLINK-10162][flink-runtime][flink-runtime-web] add host in checkpoint detail

2018-08-22 Thread GitBox
maqingxiang commented on issue #6566: 
[FLINK-10162][flink-runtime][flink-runtime-web] add host in checkpoint detail
URL: https://github.com/apache/flink/pull/6566#issuecomment-415002987
 
 
   cc @yanghua 
   
   Thank you for the review, please help review it again.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-10162) Add host in checkpoint detail

2018-08-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-10162:
---
Labels: pull-request-available  (was: )

> Add host in checkpoint detail
> -
>
> Key: FLINK-10162
> URL: https://issues.apache.org/jira/browse/FLINK-10162
> Project: Flink
>  Issue Type: Improvement
>Reporter: xymaqingxiang
>Assignee: xymaqingxiang
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2018-08-16-21-14-57-221.png, 
> image-2018-08-16-21-15-42-167.png, image-2018-08-16-21-15-46-585.png, 
> image-2018-08-16-21-36-04-977.png
>
>
> When a streaming job runs for a long time, there will be failures at 
> checkpoints. We then need to look at the log of the 
> corresponding task and the performance of the machine on which the task is 
> located. 
> Therefore, I suggest adding the host information to the checkpoint so that, 
> if the checkpoint fails, you can log in to the appropriate 
> machine to see the cause of the failure.
> The failures are as follows:
> !image-2018-08-16-21-14-57-221.png!
> !image-2018-08-16-21-15-46-585.png!
> The revised representation is as follows:
>   !image-2018-08-16-21-36-04-977.png!
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-10176) Replace ByteArrayData[Input|Output]View with Data[Output|InputDe]Serializer

2018-08-22 Thread Stefan Richter (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Richter closed FLINK-10176.
--
Resolution: Implemented

Merged in:
master: 3fd6587

> Replace ByteArrayData[Input|Output]View with Data[Output|InputDe]Serializer
> ---
>
> Key: FLINK-10176
> URL: https://issues.apache.org/jira/browse/FLINK-10176
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Major
>
> I have found that the functionality of {{ByteArrayData[Input|Output]View}} is 
> very similar to the already existing {{Data[Output|InputDe]Serializer}}. With 
> some very small additions, we can replace the former with the latter.
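
For illustration, a small sketch of the replacement pattern when deserializing from a plain byte array (based on the calls visible in the related change; the {{RecordReader}} wrapper and the concrete serializer are assumptions of the example, not Flink classes):

{code:java}
import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.core.memory.DataInputDeserializer;

import java.io.IOException;

// Sketch: reuse one DataInputDeserializer across records by rebinding its buffer.
final class RecordReader<T> {
    private final TypeSerializer<T> serializer;        // any Flink TypeSerializer
    private final DataInputDeserializer in = new DataInputDeserializer();

    RecordReader(TypeSerializer<T> serializer) {
        this.serializer = serializer;
    }

    T read(byte[] message) throws IOException {
        in.setBuffer(message);                          // rebind, no new allocation
        return serializer.deserialize(in);
    }
}
{code}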



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-10175) Fix concurrent access to shared buffer in map state / querable state

2018-08-22 Thread Stefan Richter (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Richter closed FLINK-10175.
--
Resolution: Fixed

Merged in:
master: 2b6beed

> Fix concurrent access to shared buffer in map state / querable state
> 
>
> Key: FLINK-10175
> URL: https://issues.apache.org/jira/browse/FLINK-10175
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Accidental sharing of buffers between event processing loop and queryable 
> state thread can happen in {{RocksDBMapState::deserializeUserKey}}. Queryable 
> state should provide a separate buffer.
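
To illustrate the general idea of the fix (a sketch only, not the actual RocksDBMapState code): the deserialization routine is handed the reusable view to use, so the queryable-state thread can pass its own instance instead of racing with the event-processing loop on a shared one.

{code:java}
import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.core.memory.DataInputDeserializer;

import java.io.IOException;

final class KeyDeserialization {

    // Caller owns and provides the view; each thread keeps its own instance.
    static <K> K deserializeUserKey(
            byte[] rawKeyBytes,
            TypeSerializer<K> keySerializer,
            DataInputDeserializer dataInputView) throws IOException {
        dataInputView.setBuffer(rawKeyBytes);
        return keySerializer.deserialize(dataInputView);
    }
}
{code}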



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10175) Fix concurrent access to shared buffer in map state / querable state

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588737#comment-16588737
 ] 

ASF GitHub Bot commented on FLINK-10175:


asfgit closed pull request #6583: [FLINK-10175] Fix concurrent access to shared 
buffer between RocksDBMapState and querable state
URL: https://github.com/apache/flink/pull/6583
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/TypeInformationKeyValueSerializationSchema.java
 
b/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/TypeInformationKeyValueSerializationSchema.java
index 3be5779ceec..cc4a54bfeef 100644
--- 
a/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/TypeInformationKeyValueSerializationSchema.java
+++ 
b/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/TypeInformationKeyValueSerializationSchema.java
@@ -101,11 +101,11 @@ public 
TypeInformationKeyValueSerializationSchema(Class keyClass, Class va
V value = null;
 
if (messageKey != null) {
-   inputDeserializer.setBuffer(messageKey, 0, 
messageKey.length);
+   inputDeserializer.setBuffer(messageKey);
key = keySerializer.deserialize(inputDeserializer);
}
if (message != null) {
-   inputDeserializer.setBuffer(message, 0, message.length);
+   inputDeserializer.setBuffer(message);
value = valueSerializer.deserialize(inputDeserializer);
}
return new Tuple2<>(key, value);
diff --git 
a/flink-core/src/main/java/org/apache/flink/api/common/serialization/TypeInformationSerializationSchema.java
 
b/flink-core/src/main/java/org/apache/flink/api/common/serialization/TypeInformationSerializationSchema.java
index 78da3fadc3e..6be265a5152 100644
--- 
a/flink-core/src/main/java/org/apache/flink/api/common/serialization/TypeInformationSerializationSchema.java
+++ 
b/flink-core/src/main/java/org/apache/flink/api/common/serialization/TypeInformationSerializationSchema.java
@@ -81,9 +81,9 @@ public TypeInformationSerializationSchema(TypeInformation 
typeInfo, TypeSeria
@Override
public T deserialize(byte[] message) {
if (dis != null) {
-   dis.setBuffer(message, 0, message.length);
+   dis.setBuffer(message);
} else {
-   dis = new DataInputDeserializer(message, 0, 
message.length);
+   dis = new DataInputDeserializer(message);
}
 
try {
diff --git 
a/flink-core/src/main/java/org/apache/flink/core/memory/ByteArrayDataInputView.java
 
b/flink-core/src/main/java/org/apache/flink/core/memory/ByteArrayDataInputView.java
deleted file mode 100644
index 698a9f97dc0..000
--- 
a/flink-core/src/main/java/org/apache/flink/core/memory/ByteArrayDataInputView.java
+++ /dev/null
@@ -1,60 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.core.memory;
-
-import javax.annotation.Nonnull;
-
-/**
- * Reusable adapter to {@link DataInputView} that operates on given 
byte-arrays.
- */
-public class ByteArrayDataInputView extends DataInputViewStreamWrapper {
-
-   @Nonnull
-   private final ByteArrayInputStreamWithPos inStreamWithPos;
-
-   public ByteArrayDataInputView() {
-   super(new ByteArrayInputStreamWithPos());
-   this.inStreamWithPos = (ByteArrayInputStreamWithPos) in;
-   }
-
-   public ByteArrayDataInputView(@Nonnull byte[] buffer) {
-   this(buffer, 0, buffer.length);
-   }
-
-  

[GitHub] asfgit closed pull request #6583: [FLINK-10175] Fix concurrent access to shared buffer between RocksDBMapState and querable state

2018-08-22 Thread GitBox
asfgit closed pull request #6583: [FLINK-10175] Fix concurrent access to shared 
buffer between RocksDBMapState and querable state
URL: https://github.com/apache/flink/pull/6583
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/TypeInformationKeyValueSerializationSchema.java
 
b/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/TypeInformationKeyValueSerializationSchema.java
index 3be5779ceec..cc4a54bfeef 100644
--- 
a/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/TypeInformationKeyValueSerializationSchema.java
+++ 
b/flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/TypeInformationKeyValueSerializationSchema.java
@@ -101,11 +101,11 @@ public 
TypeInformationKeyValueSerializationSchema(Class keyClass, Class va
V value = null;
 
if (messageKey != null) {
-   inputDeserializer.setBuffer(messageKey, 0, 
messageKey.length);
+   inputDeserializer.setBuffer(messageKey);
key = keySerializer.deserialize(inputDeserializer);
}
if (message != null) {
-   inputDeserializer.setBuffer(message, 0, message.length);
+   inputDeserializer.setBuffer(message);
value = valueSerializer.deserialize(inputDeserializer);
}
return new Tuple2<>(key, value);
diff --git 
a/flink-core/src/main/java/org/apache/flink/api/common/serialization/TypeInformationSerializationSchema.java
 
b/flink-core/src/main/java/org/apache/flink/api/common/serialization/TypeInformationSerializationSchema.java
index 78da3fadc3e..6be265a5152 100644
--- 
a/flink-core/src/main/java/org/apache/flink/api/common/serialization/TypeInformationSerializationSchema.java
+++ 
b/flink-core/src/main/java/org/apache/flink/api/common/serialization/TypeInformationSerializationSchema.java
@@ -81,9 +81,9 @@ public TypeInformationSerializationSchema(TypeInformation 
typeInfo, TypeSeria
@Override
public T deserialize(byte[] message) {
if (dis != null) {
-   dis.setBuffer(message, 0, message.length);
+   dis.setBuffer(message);
} else {
-   dis = new DataInputDeserializer(message, 0, 
message.length);
+   dis = new DataInputDeserializer(message);
}
 
try {
diff --git 
a/flink-core/src/main/java/org/apache/flink/core/memory/ByteArrayDataInputView.java
 
b/flink-core/src/main/java/org/apache/flink/core/memory/ByteArrayDataInputView.java
deleted file mode 100644
index 698a9f97dc0..000
--- 
a/flink-core/src/main/java/org/apache/flink/core/memory/ByteArrayDataInputView.java
+++ /dev/null
@@ -1,60 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.flink.core.memory;
-
-import javax.annotation.Nonnull;
-
-/**
- * Reusable adapter to {@link DataInputView} that operates on given 
byte-arrays.
- */
-public class ByteArrayDataInputView extends DataInputViewStreamWrapper {
-
-   @Nonnull
-   private final ByteArrayInputStreamWithPos inStreamWithPos;
-
-   public ByteArrayDataInputView() {
-   super(new ByteArrayInputStreamWithPos());
-   this.inStreamWithPos = (ByteArrayInputStreamWithPos) in;
-   }
-
-   public ByteArrayDataInputView(@Nonnull byte[] buffer) {
-   this(buffer, 0, buffer.length);
-   }
-
-   public ByteArrayDataInputView(@Nonnull byte[] buffer, int offset, int 
length) {
-   this();
-   setData(buffer, offset, length);
-   }
-
-   public int getPosition() {
-   return inStreamWithPos.getPosition();
-   

[jira] [Updated] (FLINK-10138) Queryable state (rocksdb) end-to-end test failed on Travis

2018-08-22 Thread Chesnay Schepler (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-10138:
-
Component/s: Queryable State

> Queryable state (rocksdb) end-to-end test failed on Travis
> --
>
> Key: FLINK-10138
> URL: https://issues.apache.org/jira/browse/FLINK-10138
> Project: Flink
>  Issue Type: Bug
>  Components: Queryable State, Tests
>Affects Versions: 1.7.0
>Reporter: Till Rohrmann
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.7.0
>
>
> The {{Queryable state (rocksdb) end-to-end test}} failed on Travis with the 
> following exception
> {code}
> Exception in thread "main" java.util.concurrent.ExecutionException: 
> java.lang.RuntimeException: Failed request 54.
>  Caused by: java.lang.RuntimeException: Failed request 54.
>  Caused by: java.lang.RuntimeException: Error while processing request with 
> ID 54. Caused by: org.apache.flink.util.FlinkRuntimeException: Error while 
> deserializing the user value.
>   at 
> org.apache.flink.contrib.streaming.state.RocksDBMapState$RocksDBMapEntry.getValue(RocksDBMapState.java:446)
>   at 
> org.apache.flink.queryablestate.client.state.serialization.KvStateSerializer.serializeMap(KvStateSerializer.java:222)
>   at 
> org.apache.flink.contrib.streaming.state.RocksDBMapState.getSerializedValue(RocksDBMapState.java:296)
>   at 
> org.apache.flink.queryablestate.server.KvStateServerHandler.getSerializedValue(KvStateServerHandler.java:107)
>   at 
> org.apache.flink.queryablestate.server.KvStateServerHandler.handleRequest(KvStateServerHandler.java:84)
>   at 
> org.apache.flink.queryablestate.server.KvStateServerHandler.handleRequest(KvStateServerHandler.java:48)
>   at 
> org.apache.flink.queryablestate.network.AbstractServerHandler$AsyncRequestTask.run(AbstractServerHandler.java:236)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.EOFException: No more bytes left.
>   at 
> org.apache.flink.api.java.typeutils.runtime.NoFetchingInput.require(NoFetchingInput.java:79)
>   at com.esotericsoftware.kryo.io.Input.readString(Input.java:452)
>   at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:132)
>   at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>   at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:641)
>   at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:752)
>   at 
> org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:315)
>   at 
> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.deserialize(PojoSerializer.java:411)
>   at 
> org.apache.flink.contrib.streaming.state.RocksDBMapState.deserializeUserValue(RocksDBMapState.java:347)
>   at 
> org.apache.flink.contrib.streaming.state.RocksDBMapState.access$100(RocksDBMapState.java:66)
>   at 
> org.apache.flink.contrib.streaming.state.RocksDBMapState$RocksDBMapEntry.getValue(RocksDBMapState.java:444)
>   ... 11 more
>   at 
> org.apache.flink.queryablestate.server.KvStateServerHandler.handleRequest(KvStateServerHandler.java:95)
>   at 
> org.apache.flink.queryablestate.server.KvStateServerHandler.handleRequest(KvStateServerHandler.java:48)
>   at 
> org.apache.flink.queryablestate.network.AbstractServerHandler$AsyncRequestTask.run(AbstractServerHandler.java:236)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
>   at 
> org.apache.flink.queryablestate.network.AbstractServerHandler$AsyncRequestTask.lambda$run$0(AbstractServerHandler.java:273)
>   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
>   at 
> java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:778)
>   at 
> java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2140)
>   at 
> org.apache.flink.queryablestate.network.AbstractServerHandler$AsyncRequestTask.run(AbstractServerHandler.java:236)
>   at 
> java.util.con

[jira] [Commented] (FLINK-10175) Fix concurrent access to shared buffer in map state / querable state

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588675#comment-16588675
 ] 

ASF GitHub Bot commented on FLINK-10175:


StefanRRichter commented on issue #6583: [FLINK-10175] Fix concurrent access to 
shared buffer between RocksDBMapState and querable state
URL: https://github.com/apache/flink/pull/6583#issuecomment-414987221
 
 
   Thanks for the review @kl0u ! Merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix concurrent access to shared buffer in map state / querable state
> 
>
> Key: FLINK-10175
> URL: https://issues.apache.org/jira/browse/FLINK-10175
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.7.0
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Accidental sharing of buffers between event processing loop and queryable 
> state thread can happen in {{RocksDBMapState::deserializeUserKey}}. Queryable 
> state should provide a separate buffer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] StefanRRichter commented on issue #6583: [FLINK-10175] Fix concurrent access to shared buffer between RocksDBMapState and querable state

2018-08-22 Thread GitBox
StefanRRichter commented on issue #6583: [FLINK-10175] Fix concurrent access to 
shared buffer between RocksDBMapState and querable state
URL: https://github.com/apache/flink/pull/6583#issuecomment-414987221
 
 
   Thanks for the review @kl0u ! Merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-10193) Default RPC timeout is used when triggering savepoint via JobMasterGateway

2018-08-22 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-10193:
--
Fix Version/s: 1.5.4
   1.7.0
   1.6.1

> Default RPC timeout is used when triggering savepoint via JobMasterGateway
> --
>
> Key: FLINK-10193
> URL: https://issues.apache.org/jira/browse/FLINK-10193
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.5.3, 1.6.0
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Critical
> Fix For: 1.6.1, 1.7.0, 1.5.4
>
> Attachments: SlowToCheckpoint.java
>
>
> When calling {{JobMasterGateway#triggerSavepoint(String, boolean, Time)}}, 
> the default timeout is used because the time parameter of the method  is not 
> annotated with {{@RpcTimeout}}. 
> *Expected behavior*
> * timeout for the RPC should be {{RpcUtils.INF_TIMEOUT}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10167) SessionWindows not compatible with typed DataStreams in scala

2018-08-22 Thread Aljoscha Krettek (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588629#comment-16588629
 ] 

Aljoscha Krettek commented on FLINK-10167:
--

I meant values in the sense that the Scala language uses the term, i.e. subclasses of 
{{AnyVal}}, which are roughly the types that the Java language calls 
primitives (for reference: 
[https://www.scala-lang.org/api/current/scala/AnyVal.html]). A Scala {{Tuple}} 
is not a value so it works. The issue is that Java {{Object}} is roughly equal 
to Scala {{AnyRef}}, so only subclasses of {{AnyRef}} work here.

Regarding your second question, there was no specific reasoning behind this; 
{{Object}} was used here because it allows using streams of any Java type that 
is not a primitive, and likewise it works for Scala types that are not values. I 
think if we implemented it now and took Scala into consideration more carefully, 
we would have done it differently.

Regarding your exception, I think this could be caused by the key not being 
deterministic. If you post your code I could have a look.

> SessionWindows not compatible with typed DataStreams in scala
> -
>
> Key: FLINK-10167
> URL: https://issues.apache.org/jira/browse/FLINK-10167
> Project: Flink
>  Issue Type: Bug
>Reporter: Andrew Roberts
>Priority: Major
>
> I'm trying to construct a trivial job that uses session windows, and it looks 
> like the data type parameter is hardcoded to `Object`/`AnyRef`. Due to the 
> invariance of Java classes in Scala, this means that we can't use the 
> provided SessionWindow helper classes in Scala on typed streams.
>  
> Example job:
> {code:java}
> import org.apache.flink.api.scala._
> import org.apache.flink.streaming.api.scala.{DataStream, 
> StreamExecutionEnvironment}
> import 
> org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows
> import org.apache.flink.streaming.api.windowing.time.Time
> import org.apache.flink.streaming.api.windowing.windows.{TimeWindow, Window}
> import org.apache.flink.util.Collector
> object TestJob {
>   val jobName = "TestJob"
>   def main(args: Array[String]): Unit = {
> val env = StreamExecutionEnvironment.getExecutionEnvironment
> env.fromCollection(Range(0, 100).toList)
>   .keyBy(_ / 10)
>   .window(ProcessingTimeSessionWindows.withGap(Time.minutes(1)))
>   .reduce(
> (a: Int, b: Int) => a + b,
> (key: Int, window: Window, items: Iterable[Int], out: 
> Collector[String]) => s"${key}: ${items}"
>   )
>   .map(println(_))
> env.execute(jobName)
>   }
> }{code}
>  
> Compile error:
> {code:java}
> [error]  found   : 
> org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows
> [error]  required: 
> org.apache.flink.streaming.api.windowing.assigners.WindowAssigner[_ >: Int, ?]
> [error] Note: Object <: Any (and 
> org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows
>  <: 
> org.apache.flink.streaming.api.windowing.assigners.MergingWindowAssigner[Object,org.apache.flink.streaming.api.windowing.windows.TimeWindow]),
>  but Java-defined class WindowAssigner is invariant in type T.
> [error] You may wish to investigate a wildcard type such as `_ <: Any`. (SLS 
> 3.2.10)
> [error]       
> .window(ProcessingTimeSessionWindows.withGap(Time.minutes(1))){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-10197) flink CEP error when use sql

2018-08-22 Thread Dawid Wysakowicz (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz closed FLINK-10197.

Resolution: Not A Bug

> flink CEP error when use sql
> 
>
> Key: FLINK-10197
> URL: https://issues.apache.org/jira/browse/FLINK-10197
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.6.0
>Reporter: sean.miao
>Priority: Major
>
> When I used SQL to write a Flink CEP task, I encountered some problems, even 
> though Calcite 1.16 supports using MATCH_RECOGNIZE to express CEP.
> my code :
> {code:java}
> // set up execution environment
>   ExecutionEnvironment env = 
> ExecutionEnvironment.getExecutionEnvironment();
>   BatchTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);
>   DataSet<WC> input = env.fromElements(
>  new WC("1", 1),
>  new WC("2", 2),
>  new WC("1", 3),
>  new WC("2", 3),
>  new WC("2", 2),
>  new WC("1", 1));
>   // register the DataSet as table "WordCount"
>   tEnv.registerDataSet("t", input, "word, id");
>   final String sql = "select * \n"
> + "  from t match_recognize \n"
> + "  (\n"
> + "   measures STRT.word as  start_word ,"
> + "FINAL LAST(A.id) as A_id \n"
> + "pattern (STRT A+) \n"
> + "define \n"
> + "  A AS A.word = '1' \n"
> + "  ) mr";
> // run a SQL query on the Table and retrieve the result as a new Table
> //Table table = tEnv.sqlQuery(
> //   "SELECT word, SUM(frequency) as frequency FROM WordCount GROUP BY 
> word");
> //tEnv.sqlUpdate(sql);
>   tEnv.sqlQuery(sql);
> {code}
> ERROR:
> {code:java}
> Exception in thread "main" org.apache.flink.table.api.ValidationException: 
> SQL validation failed. From line 0, column 0 to line 7, column 17: No match 
> found for function signature PREV(, )
> at 
> org.apache.flink.table.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:91)
> at 
> org.apache.flink.table.api.TableEnvironment.sqlQuery(TableEnvironment.scala:652)
> at TableSQL.WordCountSQL.main(WordCountSQL.java:71)
> Caused by: org.apache.calcite.runtime.CalciteContextException: From line 0, 
> column 0 to line 7, column 17: No match found for function signature 
> PREV(, )
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463)
> at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:803)
> at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:788)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4706)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.handleUnresolvedFunction(SqlValidatorImpl.java:1690)
> at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:278)
> at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:223)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5432)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5419)
> at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:138)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1606)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1591)
> at 
> org.apache.calcite.sql.type.InferTypes$1.inferOperandTypes(InferTypes.java:51)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1777)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1783)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateDefinitions(SqlValidatorImpl.java:5030)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateMatchRecognize(SqlValidatorImpl.java:4903)
> at 
> org.apache.calcite.sql.validate.MatchRecognizeNamespace.validateImpl(MatchRecognizeNamespace.java:38)
> at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2975)
> at 
> org.apache.calcite.sql.validate.SqlValidator

[jira] [Commented] (FLINK-10197) flink CEP error when use sql

2018-08-22 Thread Dawid Wysakowicz (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588618#comment-16588618
 ] 

Dawid Wysakowicz commented on FLINK-10197:
--

Flink does not yet support the MATCH_RECOGNIZE clause; adding it is an ongoing 
effort. See [FLINK-7062]

> flink CEP error when use sql
> 
>
> Key: FLINK-10197
> URL: https://issues.apache.org/jira/browse/FLINK-10197
> Project: Flink
>  Issue Type: Bug
>  Components: CEP
>Affects Versions: 1.6.0
>Reporter: sean.miao
>Priority: Major
>
> When I used SQL to write a Flink CEP task, I encountered some problems, even 
> though Calcite 1.16 supports using MATCH_RECOGNIZE to express CEP.
> my code :
> {code:java}
> // set up execution environment
>   ExecutionEnvironment env = 
> ExecutionEnvironment.getExecutionEnvironment();
>   BatchTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);
>   DataSet<WC> input = env.fromElements(
>  new WC("1", 1),
>  new WC("2", 2),
>  new WC("1", 3),
>  new WC("2", 3),
>  new WC("2", 2),
>  new WC("1", 1));
>   // register the DataSet as table "WordCount"
>   tEnv.registerDataSet("t", input, "word, id");
>   final String sql = "select * \n"
> + "  from t match_recognize \n"
> + "  (\n"
> + "   measures STRT.word as  start_word ,"
> + "FINAL LAST(A.id) as A_id \n"
> + "pattern (STRT A+) \n"
> + "define \n"
> + "  A AS A.word = '1' \n"
> + "  ) mr";
> // run a SQL query on the Table and retrieve the result as a new Table
> //Table table = tEnv.sqlQuery(
> //   "SELECT word, SUM(frequency) as frequency FROM WordCount GROUP BY 
> word");
> //tEnv.sqlUpdate(sql);
>   tEnv.sqlQuery(sql);
> {code}
> ERROR:
> {code:java}
> Exception in thread "main" org.apache.flink.table.api.ValidationException: 
> SQL validation failed. From line 0, column 0 to line 7, column 17: No match 
> found for function signature PREV(, )
> at 
> org.apache.flink.table.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:91)
> at 
> org.apache.flink.table.api.TableEnvironment.sqlQuery(TableEnvironment.scala:652)
> at TableSQL.WordCountSQL.main(WordCountSQL.java:71)
> Caused by: org.apache.calcite.runtime.CalciteContextException: From line 0, 
> column 0 to line 7, column 17: No match found for function signature 
> PREV(, )
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463)
> at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:803)
> at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:788)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4706)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.handleUnresolvedFunction(SqlValidatorImpl.java:1690)
> at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:278)
> at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:223)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5432)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5419)
> at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:138)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1606)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1591)
> at 
> org.apache.calcite.sql.type.InferTypes$1.inferOperandTypes(InferTypes.java:51)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1777)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1783)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateDefinitions(SqlValidatorImpl.java:5030)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateMatchRecognize(SqlValidatorImpl.java:4903)
> at 
> org.apache.calcite.sql.validate.MatchRecognizeNamespace.validateImpl(MatchRecognizeNamespace.java:38)
> at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947)
> at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928)
> at 
> org.apache.calci

[jira] [Created] (FLINK-10197) flink CEP error when use sql

2018-08-22 Thread sean.miao (JIRA)
sean.miao created FLINK-10197:
-

 Summary: flink CEP error when use sql
 Key: FLINK-10197
 URL: https://issues.apache.org/jira/browse/FLINK-10197
 Project: Flink
  Issue Type: Bug
  Components: CEP
Affects Versions: 1.6.0
Reporter: sean.miao


When I used SQL to write a Flink CEP task, I encountered some problems, even 
though Calcite 1.16 supports using MATCH_RECOGNIZE to express CEP.

my code :
{code:java}
// set up execution environment
  ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
  BatchTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);


  DataSet<WC> input = env.fromElements(
 new WC("1", 1),
 new WC("2", 2),
 new WC("1", 3),
 new WC("2", 3),
 new WC("2", 2),
 new WC("1", 1));

  // register the DataSet as table "WordCount"
  tEnv.registerDataSet("t", input, "word, id");
  final String sql = "select * \n"
+ "  from t match_recognize \n"
+ "  (\n"
+ "   measures STRT.word as  start_word ,"
+ "FINAL LAST(A.id) as A_id \n"
+ "pattern (STRT A+) \n"
+ "define \n"
+ "  A AS A.word = '1' \n"
+ "  ) mr";
// run a SQL query on the Table and retrieve the result as a new Table
//Table table = tEnv.sqlQuery(
//   "SELECT word, SUM(frequency) as frequency FROM WordCount GROUP BY 
word");
//tEnv.sqlUpdate(sql);
  tEnv.sqlQuery(sql);
{code}
ERROR:
{code:java}
Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL 
validation failed. From line 0, column 0 to line 7, column 17: No match found 
for function signature PREV(, )
at 
org.apache.flink.table.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:91)
at 
org.apache.flink.table.api.TableEnvironment.sqlQuery(TableEnvironment.scala:652)
at TableSQL.WordCountSQL.main(WordCountSQL.java:71)
Caused by: org.apache.calcite.runtime.CalciteContextException: From line 0, 
column 0 to line 7, column 17: No match found for function signature 
PREV(, )
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463)
at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:803)
at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:788)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4706)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.handleUnresolvedFunction(SqlValidatorImpl.java:1690)
at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:278)
at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:223)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5432)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5419)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:138)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1606)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1591)
at 
org.apache.calcite.sql.type.InferTypes$1.inferOperandTypes(InferTypes.java:51)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1777)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1783)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateDefinitions(SqlValidatorImpl.java:5030)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateMatchRecognize(SqlValidatorImpl.java:4903)
at 
org.apache.calcite.sql.validate.MatchRecognizeNamespace.validateImpl(MatchRecognizeNamespace.java:38)
at 
org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:947)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:928)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2975)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2960)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3219)
at 
org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
at 
org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(Sq

[jira] [Commented] (FLINK-7525) Add config option to disable Cancel functionality on UI

2018-08-22 Thread Chesnay Schepler (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588608#comment-16588608
 ] 

Chesnay Schepler commented on FLINK-7525:
-

Then you will have to intercept all user requests and filter out the cancel 
requests, etc.

> Add config option to disable Cancel functionality on UI
> ---
>
> Key: FLINK-7525
> URL: https://issues.apache.org/jira/browse/FLINK-7525
> Project: Flink
>  Issue Type: Improvement
>  Components: Web Client, Webfrontend
>Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Major
>
> In this email thread 
> http://search-hadoop.com/m/Flink/VkLeQlf0QOnc7YA?subj=Security+Control+of+running+Flink+Jobs+on+Flink+UI
>  , Raja was asking for a way to control how users cancel Job(s).
> Robert proposed adding a config option which disables the Cancel 
> functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7964) Add Apache Kafka 1.0/1.1 connectors

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588599#comment-16588599
 ] 

ASF GitHub Bot commented on FLINK-7964:
---

yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 
connectors
URL: https://github.com/apache/flink/pull/6577#issuecomment-414967223
 
 
   @pnowojski Agreed; I will first refactor the duplicated code based on 
1.0, then add the 1.1 module. In fact, 2.0 is also in the process of being 
prepared, but it is still on another local branch of mine.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Apache Kafka 1.0/1.1 connectors
> ---
>
> Key: FLINK-7964
> URL: https://issues.apache.org/jira/browse/FLINK-7964
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.4.0
>Reporter: Hai Zhou
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Kafka 1.0.0 is no mere bump of the version number. The Apache Kafka Project 
> Management Committee has packed a number of valuable enhancements into the 
> release. Here is a summary of a few of them:
> * Since its introduction in version 0.10, the Streams API has become hugely 
> popular among Kafka users, including the likes of Pinterest, Rabobank, 
> Zalando, and The New York Times. In 1.0, the API continues to evolve at a 
> healthy pace. To begin with, the builder API has been improved (KIP-120). A 
> new API has been added to expose the state of active tasks at runtime 
> (KIP-130). The new cogroup API makes it much easier to deal with partitioned 
> aggregates with fewer StateStores and fewer moving parts in your code 
> (KIP-150). Debuggability gets easier with enhancements to the print() and 
> writeAsText() methods (KIP-160). And if that’s not enough, check out KIP-138 
> and KIP-161 too. For more on streams, check out the Apache Kafka Streams 
> documentation, including some helpful new tutorial videos.
> * Operating Kafka at scale requires that the system remain observable, and to 
> make that easier, we’ve made a number of improvements to metrics. These are 
> too many to summarize without becoming tedious, but Connect metrics have been 
> significantly improved (KIP-196), a litany of new health check metrics are 
> now exposed (KIP-188), and we now have a global topic and partition count 
> (KIP-168). Check out KIP-164 and KIP-187 for even more.
> * We now support Java 9, leading, among other things, to significantly faster 
> TLS and CRC32C implementations. Over-the-wire encryption will be faster now, 
> which will keep Kafka fast and compute costs low when encryption is enabled.
> * In keeping with the security theme, KIP-152 cleans up the error handling on 
> Simple Authentication Security Layer (SASL) authentication attempts. 
> Previously, some authentication error conditions were indistinguishable from 
> broker failures and were not logged in a clear way. This is cleaner now.
> * Kafka can now tolerate disk failures better. Historically, JBOD storage 
> configurations have not been recommended, but the architecture has 
> nevertheless been tempting: after all, why not rely on Kafka’s own 
> replication mechanism to protect against storage failure rather than using 
> RAID? With KIP-112, Kafka now handles disk failure more gracefully. A single 
> disk failure in a JBOD broker will not bring the entire broker down; rather, 
> the broker will continue serving any log files that remain on functioning 
> disks.
> * Since release 0.11.0, the idempotent producer (which is the producer used 
> in the presence of a transaction, which of course is the producer we use for 
> exactly-once processing) required max.in.flight.requests.per.connection to be 
> equal to one. As anyone who has written or tested a wire protocol can attest, 
> this put an upper bound on throughput. Thanks to KAFKA-5949, this can now be 
> as large as five, relaxing the throughput constraint quite a bit.
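
For readers skimming this thread, the last point above can be made concrete. The 
following is a minimal, illustrative sketch (not taken from the Flink codebase) 
of an idempotent Kafka producer configured with more than one in-flight request, 
as permitted since KAFKA-5949; the broker address "localhost:9092" and the topic 
"my-topic" are placeholder values.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class IdempotentProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Idempotence implies acks=all and retries enabled.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        // Before KAFKA-5949 this had to be 1 for the idempotent producer; up to 5 is now allowed.
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "5");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            producer.flush();
        }
    }
}
```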



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 connectors

2018-08-22 Thread GitBox
yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 
connectors
URL: https://github.com/apache/flink/pull/6577#issuecomment-414967223
 
 
   @pnowojski Agreed. Then I will first refactor the duplicate code based on 
1.0 and then add the 1.1 module. In fact, a 2.0 connector is also being 
prepared, but it is still on a local branch of mine.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-7964) Add Apache Kafka 1.0/1.1 connectors

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588592#comment-16588592
 ] 

ASF GitHub Bot commented on FLINK-7964:
---

pnowojski commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 
connectors
URL: https://github.com/apache/flink/pull/6577#issuecomment-414964833
 
 
   There is a risk that our reflection code will break on a sub-version upgrade; 
however, it's not that big a problem, since in our connectors we freeze the 
Kafka client version that we use. Keep in mind that the Kafka client version we 
use is independent of the Kafka server version we talk to. For example, our 
Kafka 0.11.3 connector, as far as I know, is able to work with both Kafka 1.0 
and 1.1 (with lower performance, however).
   
   However, I thought a little bit more about the arguments for providing a 
separate module for each major Kafka version, and there is one huge argument in 
favour of that: bug fixes in Kafka. There are quite a lot of bugs in Kafka, and 
sometimes fixes are not backported to older versions. I think this reason is 
good enough for adding a `flink-connector-kafka-1.1` module, even if our code 
is almost identical. This would also reduce our users' confusion: "Does 
flink-connector-kafka-0.11 work with Kafka 1.0?".
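
To make the kind of breakage mentioned above concrete, the following is a small 
illustrative sketch of reflection-based access to Kafka client internals. It is 
not the actual Flink connector code; which private member a connector reads is 
left as a parameter and is purely an assumption here.

```java
import java.lang.reflect.Field;
import org.apache.kafka.clients.producer.KafkaProducer;

final class KafkaReflectionAccess {

    /**
     * Reads a private field from the producer. This is the pattern that can break
     * when a newer Kafka client renames or removes the field.
     */
    static Object readPrivateField(KafkaProducer<?, ?> producer, String fieldName) {
        try {
            Field field = KafkaProducer.class.getDeclaredField(fieldName);
            field.setAccessible(true);
            return field.get(producer);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "Kafka client internals changed; field '" + fieldName + "' not found or not readable", e);
        }
    }
}
```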
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Apache Kafka 1.0/1.1 connectors
> ---
>
> Key: FLINK-7964
> URL: https://issues.apache.org/jira/browse/FLINK-7964
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.4.0
>Reporter: Hai Zhou
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Kafka 1.0.0 is no mere bump of the version number. The Apache Kafka Project 
> Management Committee has packed a number of valuable enhancements into the 
> release. Here is a summary of a few of them:
> * Since its introduction in version 0.10, the Streams API has become hugely 
> popular among Kafka users, including the likes of Pinterest, Rabobank, 
> Zalando, and The New York Times. In 1.0, the API continues to evolve at a 
> healthy pace. To begin with, the builder API has been improved (KIP-120). A 
> new API has been added to expose the state of active tasks at runtime 
> (KIP-130). The new cogroup API makes it much easier to deal with partitioned 
> aggregates with fewer StateStores and fewer moving parts in your code 
> (KIP-150). Debuggability gets easier with enhancements to the print() and 
> writeAsText() methods (KIP-160). And if that’s not enough, check out KIP-138 
> and KIP-161 too. For more on streams, check out the Apache Kafka Streams 
> documentation, including some helpful new tutorial videos.
> * Operating Kafka at scale requires that the system remain observable, and to 
> make that easier, we’ve made a number of improvements to metrics. These are 
> too many to summarize without becoming tedious, but Connect metrics have been 
> significantly improved (KIP-196), a litany of new health check metrics are 
> now exposed (KIP-188), and we now have a global topic and partition count 
> (KIP-168). Check out KIP-164 and KIP-187 for even more.
> * We now support Java 9, leading, among other things, to significantly faster 
> TLS and CRC32C implementations. Over-the-wire encryption will be faster now, 
> which will keep Kafka fast and compute costs low when encryption is enabled.
> * In keeping with the security theme, KIP-152 cleans up the error handling on 
> Simple Authentication Security Layer (SASL) authentication attempts. 
> Previously, some authentication error conditions were indistinguishable from 
> broker failures and were not logged in a clear way. This is cleaner now.
> * Kafka can now tolerate disk failures better. Historically, JBOD storage 
> configurations have not been recommended, but the architecture has 
> nevertheless been tempting: after all, why not rely on Kafka’s own 
> replication mechanism to protect against storage failure rather than using 
> RAID? With KIP-112, Kafka now handles disk failure more gracefully. A single 
> disk failure in a JBOD broker will not bring the entire broker down; rather, 
> the broker will continue serving any log files that remain on functioning 
> disks.
> * Since release 0.11.0, the idempotent producer (which is the producer used 
> in the presence of a transaction, which of course is the producer we use for 
> exactly-once processing) required max.in.flight.requests.per.conne

[GitHub] pnowojski commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 connectors

2018-08-22 Thread GitBox
pnowojski commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 
connectors
URL: https://github.com/apache/flink/pull/6577#issuecomment-414964833
 
 
   There is a risk that our reflection code will break on a sub-version upgrade; 
however, it's not that big a problem, since in our connectors we freeze the 
Kafka client version that we use. Keep in mind that the Kafka client version we 
use is independent of the Kafka server version we talk to. For example, our 
Kafka 0.11.3 connector, as far as I know, is able to work with both Kafka 1.0 
and 1.1 (with lower performance, however).
   
   However, I thought a little bit more about the arguments for providing a 
separate module for each major Kafka version, and there is one huge argument in 
favour of that: bug fixes in Kafka. There are quite a lot of bugs in Kafka, and 
sometimes fixes are not backported to older versions. I think this reason is 
good enough for adding a `flink-connector-kafka-1.1` module, even if our code 
is almost identical. This would also reduce our users' confusion: "Does 
flink-connector-kafka-0.11 work with Kafka 1.0?".
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-7525) Add config option to disable Cancel functionality on UI

2018-08-22 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588583#comment-16588583
 ] 

vinoyang commented on FLINK-7525:
-

OK, then I have no objection to your approach. It's just that we do have such a 
requirement. We built a platform around Flink, and the lifecycle management of 
jobs is not based on the Flink UI. Instead, it is done through our own web 
interface, while users are still allowed to access the Flink UI. So we need to 
disable the Cancel button.

> Add config option to disable Cancel functionality on UI
> ---
>
> Key: FLINK-7525
> URL: https://issues.apache.org/jira/browse/FLINK-7525
> Project: Flink
>  Issue Type: Improvement
>  Components: Web Client, Webfrontend
>Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Major
>
> In this email thread 
> http://search-hadoop.com/m/Flink/VkLeQlf0QOnc7YA?subj=Security+Control+of+running+Flink+Jobs+on+Flink+UI
>  , Raja was asking for a way to control how users cancel Job(s).
> Robert proposed adding a config option which disables the Cancel 
> functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7525) Add config option to disable Cancel functionality on UI

2018-08-22 Thread Chesnay Schepler (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588570#comment-16588570
 ] 

Chesnay Schepler commented on FLINK-7525:
-

no.

> Add config option to disable Cancel functionality on UI
> ---
>
> Key: FLINK-7525
> URL: https://issues.apache.org/jira/browse/FLINK-7525
> Project: Flink
>  Issue Type: Improvement
>  Components: Web Client, Webfrontend
>Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Major
>
> In this email thread 
> http://search-hadoop.com/m/Flink/VkLeQlf0QOnc7YA?subj=Security+Control+of+running+Flink+Jobs+on+Flink+UI
>  , Raja was asking for a way to control how users cancel Job(s).
> Robert proposed adding a config option which disables the Cancel 
> functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7525) Add config option to disable Cancel functionality on UI

2018-08-22 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588566#comment-16588566
 ] 

vinoyang commented on FLINK-7525:
-

I don't know much about the details of this part, but is there a way to 
distinguish the REST client from other forms of HTTP requests?

> Add config option to disable Cancel functionality on UI
> ---
>
> Key: FLINK-7525
> URL: https://issues.apache.org/jira/browse/FLINK-7525
> Project: Flink
>  Issue Type: Improvement
>  Components: Web Client, Webfrontend
>Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Major
>
> In this email thread 
> http://search-hadoop.com/m/Flink/VkLeQlf0QOnc7YA?subj=Security+Control+of+running+Flink+Jobs+on+Flink+UI
>  , Raja was asking for a way to control how users cancel Job(s).
> Robert proposed adding a config option which disables the Cancel 
> functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7964) Add Apache Kafka 1.0/1.1 connectors

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588556#comment-16588556
 ] 

ASF GitHub Bot commented on FLINK-7964:
---

yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 
connectors
URL: https://github.com/apache/flink/pull/6577#issuecomment-414957744
 
 
   @pnowojski Ok, you have been very generous with good advice. I think it is 
necessary to create a new module for each major version of Kafka, but creating 
a separate module for every sub-version does seem like a bit much. However, we 
did use some non-public Kafka APIs, and I don't know whether they might change 
in a sub-version. If we don't add a separate module for a sub-version, we will 
only be able to handle the differences with branching logic like if/else in the 
future.
   
   Do you work in the same place as @aljoscha? If so, could you discuss this 
issue offline? In the meantime, I will first refactor the current duplicate 
test code. Do you agree?
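
As an illustration of the if/else-style branching mentioned above, a sketch of 
what version-based dispatch against the Kafka client could look like follows. 
Reading the client's own version via `AppInfoParser` is standard Kafka client 
behavior; the branching criterion itself is a made-up example, not the 
connector's actual logic.

```java
import org.apache.kafka.common.utils.AppInfoParser;

final class KafkaClientVersionBranching {

    /** Returns true if the Kafka client on the classpath is 1.x or newer. */
    static boolean isAtLeastOneDotZero() {
        String version = AppInfoParser.getVersion(); // e.g. "0.11.0.2", "1.0.0", "1.1.0"
        return !version.startsWith("0.");
    }

    static void configure() {
        if (isAtLeastOneDotZero()) {
            // use the newer client API / behavior
        } else {
            // fall back to the 0.x code path
        }
    }
}
```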


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Apache Kafka 1.0/1.1 connectors
> ---
>
> Key: FLINK-7964
> URL: https://issues.apache.org/jira/browse/FLINK-7964
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.4.0
>Reporter: Hai Zhou
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Kafka 1.0.0 is no mere bump of the version number. The Apache Kafka Project 
> Management Committee has packed a number of valuable enhancements into the 
> release. Here is a summary of a few of them:
> * Since its introduction in version 0.10, the Streams API has become hugely 
> popular among Kafka users, including the likes of Pinterest, Rabobank, 
> Zalando, and The New York Times. In 1.0, the API continues to evolve at a 
> healthy pace. To begin with, the builder API has been improved (KIP-120). A 
> new API has been added to expose the state of active tasks at runtime 
> (KIP-130). The new cogroup API makes it much easier to deal with partitioned 
> aggregates with fewer StateStores and fewer moving parts in your code 
> (KIP-150). Debuggability gets easier with enhancements to the print() and 
> writeAsText() methods (KIP-160). And if that’s not enough, check out KIP-138 
> and KIP-161 too. For more on streams, check out the Apache Kafka Streams 
> documentation, including some helpful new tutorial videos.
> * Operating Kafka at scale requires that the system remain observable, and to 
> make that easier, we’ve made a number of improvements to metrics. These are 
> too many to summarize without becoming tedious, but Connect metrics have been 
> significantly improved (KIP-196), a litany of new health check metrics are 
> now exposed (KIP-188), and we now have a global topic and partition count 
> (KIP-168). Check out KIP-164 and KIP-187 for even more.
> * We now support Java 9, leading, among other things, to significantly faster 
> TLS and CRC32C implementations. Over-the-wire encryption will be faster now, 
> which will keep Kafka fast and compute costs low when encryption is enabled.
> * In keeping with the security theme, KIP-152 cleans up the error handling on 
> Simple Authentication Security Layer (SASL) authentication attempts. 
> Previously, some authentication error conditions were indistinguishable from 
> broker failures and were not logged in a clear way. This is cleaner now.
> * Kafka can now tolerate disk failures better. Historically, JBOD storage 
> configurations have not been recommended, but the architecture has 
> nevertheless been tempting: after all, why not rely on Kafka’s own 
> replication mechanism to protect against storage failure rather than using 
> RAID? With KIP-112, Kafka now handles disk failure more gracefully. A single 
> disk failure in a JBOD broker will not bring the entire broker down; rather, 
> the broker will continue serving any log files that remain on functioning 
> disks.
> * Since release 0.11.0, the idempotent producer (which is the producer used 
> in the presence of a transaction, which of course is the producer we use for 
> exactly-once processing) required max.in.flight.requests.per.connection to be 
> equal to one. As anyone who has written or tested a wire protocol can attest, 
> this put an upper bound on throughput. Thanks to KAFKA-5949, this can now be 
> as large as five, relaxing the throughput constraint quite a bit.



--
This message was sent by Atlassian JIRA
(v7.

[GitHub] yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 connectors

2018-08-22 Thread GitBox
yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 
connectors
URL: https://github.com/apache/flink/pull/6577#issuecomment-414957744
 
 
   @pnowojski Ok, you have been very generous with good advice. I think it is 
necessary to create a new module for each major version of Kafka, but creating 
a separate module for every sub-version does seem like a bit much. However, we 
did use some non-public Kafka APIs, and I don't know whether they might change 
in a sub-version. If we don't add a separate module for a sub-version, we will 
only be able to handle the differences with branching logic like if/else in the 
future.
   
   Do you work in the same place as @aljoscha? If so, could you discuss this 
issue offline? In the meantime, I will first refactor the current duplicate 
test code. Do you agree?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-7964) Add Apache Kafka 1.0/1.1 connectors

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588543#comment-16588543
 ] 

ASF GitHub Bot commented on FLINK-7964:
---

pnowojski commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 
connectors
URL: https://github.com/apache/flink/pull/6577#issuecomment-414953729
 
 
   Yes, I missed those changes.
   
   Btw, don't get me wrong: with "depressing" I wasn't complaining about this 
contribution, since it fully follows the already established pattern :) A 
pattern also established by me, since when I was creating the `0.11` connector 
I only refactored/deduplicated the producer side of the tests and copied/pasted 
most of the consumer-side tests.
   
   Regarding @aljoscha's comment, I will clarify it, since the issues he 
referred to are a little bit irrelevant to the question of whether we need 
another module if our client is forward compatible (i.e. able to talk to newer 
Kafka servers). FLINK-9690/FLINK-9700 were/are about incompatibilities of our 
`0.11` connector when compiling it against Kafka `1.0` or `1.1`. That's 
something else.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Apache Kafka 1.0/1.1 connectors
> ---
>
> Key: FLINK-7964
> URL: https://issues.apache.org/jira/browse/FLINK-7964
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.4.0
>Reporter: Hai Zhou
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Kafka 1.0.0 is no mere bump of the version number. The Apache Kafka Project 
> Management Committee has packed a number of valuable enhancements into the 
> release. Here is a summary of a few of them:
> * Since its introduction in version 0.10, the Streams API has become hugely 
> popular among Kafka users, including the likes of Pinterest, Rabobank, 
> Zalando, and The New York Times. In 1.0, the API continues to evolve at a 
> healthy pace. To begin with, the builder API has been improved (KIP-120). A 
> new API has been added to expose the state of active tasks at runtime 
> (KIP-130). The new cogroup API makes it much easier to deal with partitioned 
> aggregates with fewer StateStores and fewer moving parts in your code 
> (KIP-150). Debuggability gets easier with enhancements to the print() and 
> writeAsText() methods (KIP-160). And if that’s not enough, check out KIP-138 
> and KIP-161 too. For more on streams, check out the Apache Kafka Streams 
> documentation, including some helpful new tutorial videos.
> * Operating Kafka at scale requires that the system remain observable, and to 
> make that easier, we’ve made a number of improvements to metrics. These are 
> too many to summarize without becoming tedious, but Connect metrics have been 
> significantly improved (KIP-196), a litany of new health check metrics are 
> now exposed (KIP-188), and we now have a global topic and partition count 
> (KIP-168). Check out KIP-164 and KIP-187 for even more.
> * We now support Java 9, leading, among other things, to significantly faster 
> TLS and CRC32C implementations. Over-the-wire encryption will be faster now, 
> which will keep Kafka fast and compute costs low when encryption is enabled.
> * In keeping with the security theme, KIP-152 cleans up the error handling on 
> Simple Authentication Security Layer (SASL) authentication attempts. 
> Previously, some authentication error conditions were indistinguishable from 
> broker failures and were not logged in a clear way. This is cleaner now.
> * Kafka can now tolerate disk failures better. Historically, JBOD storage 
> configurations have not been recommended, but the architecture has 
> nevertheless been tempting: after all, why not rely on Kafka’s own 
> replication mechanism to protect against storage failure rather than using 
> RAID? With KIP-112, Kafka now handles disk failure more gracefully. A single 
> disk failure in a JBOD broker will not bring the entire broker down; rather, 
> the broker will continue serving any log files that remain on functioning 
> disks.
> * Since release 0.11.0, the idempotent producer (which is the producer used 
> in the presence of a transaction, which of course is the producer we use for 
> exactly-once processing) required max.in.flight.requests.per.connection to be 
> equal to one. As anyone who has written or tested a wire protocol can attest, 
> this put an upper bound on throughput. Thanks to KAFKA-5949, this can now be 
> as large as f

[GitHub] pnowojski commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 connectors

2018-08-22 Thread GitBox
pnowojski commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 
connectors
URL: https://github.com/apache/flink/pull/6577#issuecomment-414953729
 
 
   Yes, I missed those changes.
   
   Btw, don't get me wrong: with "depressing" I wasn't complaining about this 
contribution, since it fully follows the already established pattern :) A 
pattern also established by me, since when I was creating the `0.11` connector 
I only refactored/deduplicated the producer side of the tests and copied/pasted 
most of the consumer-side tests.
   
   Regarding @aljoscha's comment, I will clarify it, since the issues he 
referred to are a little bit irrelevant to the question of whether we need 
another module if our client is forward compatible (i.e. able to talk to newer 
Kafka servers). FLINK-9690/FLINK-9700 were/are about incompatibilities of our 
`0.11` connector when compiling it against Kafka `1.0` or `1.1`. That's 
something else.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-7525) Add config option to disable Cancel functionality on UI

2018-08-22 Thread Chesnay Schepler (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588542#comment-16588542
 ] 

Chesnay Schepler commented on FLINK-7525:
-

That would still disable the cancel/stop/cancel-with-savepoint functionality in 
the client, which isn't a viable option.

> Add config option to disable Cancel functionality on UI
> ---
>
> Key: FLINK-7525
> URL: https://issues.apache.org/jira/browse/FLINK-7525
> Project: Flink
>  Issue Type: Improvement
>  Components: Web Client, Webfrontend
>Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Major
>
> In this email thread 
> http://search-hadoop.com/m/Flink/VkLeQlf0QOnc7YA?subj=Security+Control+of+running+Flink+Jobs+on+Flink+UI
>  , Raja was asking for a way to control how users cancel Job(s).
> Robert proposed adding a config option which disables the Cancel 
> functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7525) Add config option to disable Cancel functionality on UI

2018-08-22 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588533#comment-16588533
 ] 

vinoyang commented on FLINK-7525:
-

That is, when a user initiates a cancel request over HTTP and this 
configuration option is enabled, the cancel logic is not actually executed; 
instead, an exception or a message is returned, indicating that the job cannot 
be cancelled.
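
A minimal sketch of what such a guard could look like is shown below. It is 
purely hypothetical: the option key "web.cancel.enable", the class name, and the 
exception type are assumptions for illustration, not the actual change proposed 
for this ticket.

```java
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;
import org.apache.flink.configuration.Configuration;

public class CancelGuard {

    // Assumed option key; the real name would be decided in the PR.
    public static final ConfigOption<Boolean> WEB_CANCEL_ENABLE =
            ConfigOptions.key("web.cancel.enable").defaultValue(true);

    private final boolean cancelEnabled;

    public CancelGuard(Configuration conf) {
        this.cancelEnabled = conf.getBoolean(WEB_CANCEL_ENABLE);
    }

    /** Invoked by a web handler before triggering job cancellation. */
    public void checkCancelAllowed() {
        if (!cancelEnabled) {
            throw new UnsupportedOperationException(
                    "Job cancellation via the web UI has been disabled by configuration.");
        }
    }
}
```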

> Add config option to disable Cancel functionality on UI
> ---
>
> Key: FLINK-7525
> URL: https://issues.apache.org/jira/browse/FLINK-7525
> Project: Flink
>  Issue Type: Improvement
>  Components: Web Client, Webfrontend
>Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Major
>
> In this email thread 
> http://search-hadoop.com/m/Flink/VkLeQlf0QOnc7YA?subj=Security+Control+of+running+Flink+Jobs+on+Flink+UI
>  , Raja was asking for a way to control how users cancel Job(s).
> Robert proposed adding a config option which disables the Cancel 
> functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7964) Add Apache Kafka 1.0/1.1 connectors

2018-08-22 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16588528#comment-16588528
 ] 

ASF GitHub Bot commented on FLINK-7964:
---

yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 
connectors
URL: https://github.com/apache/flink/pull/6577#issuecomment-414951008
 
 
   @pnowojski I am refactoring the code to reduce unnecessary code. 
   
   Actually, the change from Kafka v0.11 to v1.0 is not just a matter of one 
variable; some test code is no longer usable, see 
[here](https://github.com/apache/flink/pull/6577/files#diff-527838fe63568e530d871d45a98e3fe4R157)
 and 
[here](https://github.com/apache/flink/pull/6577/files#diff-527838fe63568e530d871d45a98e3fe4R168).
 Some [old methods that used Zookeeper to fetch 
metadata](https://github.com/apache/kafka/commit/1d24e10aeab616eede416201336e928b9a8efa98#commitcomment-30035522)
 are no longer available.
   
   For the third point, according to @aljoscha's 
[comment](https://issues.apache.org/jira/browse/FLINK-10067?focusedCommentId=16571628&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16571628),
 I think we also need a new connector for Kafka v1.1, to keep the problems 
encountered with different Kafka versions separated.
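
For context on the removed Zookeeper-based metadata helpers mentioned above: in 
Kafka 1.0/1.1 the public AdminClient API covers that use case. A minimal sketch, 
assuming a reachable broker (the bootstrap address and topic name are 
placeholders supplied by the caller):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class TopicMetadataLookup {

    /** Describes a topic (partitions, replicas) without going through Zookeeper. */
    public static TopicDescription describe(String bootstrapServers, String topic) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        try (AdminClient admin = AdminClient.create(props)) {
            // describeTopics returns one future per topic; block on the single result here.
            return admin.describeTopics(Collections.singleton(topic))
                    .values().get(topic).get();
        }
    }
}
```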


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add Apache Kafka 1.0/1.1 connectors
> ---
>
> Key: FLINK-7964
> URL: https://issues.apache.org/jira/browse/FLINK-7964
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Affects Versions: 1.4.0
>Reporter: Hai Zhou
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0
>
>
> Kafka 1.0.0 is no mere bump of the version number. The Apache Kafka Project 
> Management Committee has packed a number of valuable enhancements into the 
> release. Here is a summary of a few of them:
> * Since its introduction in version 0.10, the Streams API has become hugely 
> popular among Kafka users, including the likes of Pinterest, Rabobank, 
> Zalando, and The New York Times. In 1.0, the API continues to evolve at a 
> healthy pace. To begin with, the builder API has been improved (KIP-120). A 
> new API has been added to expose the state of active tasks at runtime 
> (KIP-130). The new cogroup API makes it much easier to deal with partitioned 
> aggregates with fewer StateStores and fewer moving parts in your code 
> (KIP-150). Debuggability gets easier with enhancements to the print() and 
> writeAsText() methods (KIP-160). And if that’s not enough, check out KIP-138 
> and KIP-161 too. For more on streams, check out the Apache Kafka Streams 
> documentation, including some helpful new tutorial videos.
> * Operating Kafka at scale requires that the system remain observable, and to 
> make that easier, we’ve made a number of improvements to metrics. These are 
> too many to summarize without becoming tedious, but Connect metrics have been 
> significantly improved (KIP-196), a litany of new health check metrics are 
> now exposed (KIP-188), and we now have a global topic and partition count 
> (KIP-168). Check out KIP-164 and KIP-187 for even more.
> * We now support Java 9, leading, among other things, to significantly faster 
> TLS and CRC32C implementations. Over-the-wire encryption will be faster now, 
> which will keep Kafka fast and compute costs low when encryption is enabled.
> * In keeping with the security theme, KIP-152 cleans up the error handling on 
> Simple Authentication Security Layer (SASL) authentication attempts. 
> Previously, some authentication error conditions were indistinguishable from 
> broker failures and were not logged in a clear way. This is cleaner now.
> * Kafka can now tolerate disk failures better. Historically, JBOD storage 
> configurations have not been recommended, but the architecture has 
> nevertheless been tempting: after all, why not rely on Kafka’s own 
> replication mechanism to protect against storage failure rather than using 
> RAID? With KIP-112, Kafka now handles disk failure more gracefully. A single 
> disk failure in a JBOD broker will not bring the entire broker down; rather, 
> the broker will continue serving any log files that remain on functioning 
> disks.
> * Since release 0.11.0, the idempotent producer (which is the producer used 
> in the presence of a transaction, which of course is the producer we use for 
> exactly-once processing) required max.in.flight.requests.per.connection to be 
> equal to one.

  1   2   >