[GitHub] [flink] flinkbot edited a comment on pull request #17665: [docs][typos] Fix typos in fault_tolerance.md

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17665:
URL: https://github.com/apache/flink/pull/17665#issuecomment-960462017


   
   ## CI report:
   
   * 4d46a90b64d13546a93cc5a9f540575d6a38e0d9 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25908)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #17665: [docs][typos] Fix typos in fault_tolerance.md

2021-11-03 Thread GitBox


flinkbot commented on pull request #17665:
URL: https://github.com/apache/flink/pull/17665#issuecomment-960462017


   
   ## CI report:
   
   * 4d46a90b64d13546a93cc5a9f540575d6a38e0d9 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #17665: [docs][typos] Fix typos in fault_tolerance.md

2021-11-03 Thread GitBox


flinkbot commented on pull request #17665:
URL: https://github.com/apache/flink/pull/17665#issuecomment-960461887


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 4d46a90b64d13546a93cc5a9f540575d6a38e0d9 (Thu Nov 04 
05:05:10 UTC 2021)
   
   **Warnings:**
* Documentation files were touched, but no `docs/content.zh/` files: Update 
Chinese documentation or file Jira ticket.
* **Invalid pull request title: No valid Jira ID provided**
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] mincwang opened a new pull request #17665: [docs][typos]

2021-11-03 Thread GitBox


mincwang opened a new pull request #17665:
URL: https://github.com/apache/flink/pull/17665


   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4-node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17532: [FLINK-22096][tests] Fix port conflict for ServerTransportErrorHandlingTest#testRemoteClose

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17532:
URL: https://github.com/apache/flink/pull/17532#issuecomment-948251540


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 633084a4927bccc3958373c12ccd3441d303be68 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25907)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17652: [FLINK-24717][table-planner]Push down partitions before filters

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17652:
URL: https://github.com/apache/flink/pull/17652#issuecomment-958647607


   
   ## CI report:
   
   * 2821a77a60feefd959cdf1f54e336ab4fcaa1a71 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25855)
 
   * 008027f8b93ee34ec258e91203cb2d46c9607462 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25906)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17605:
URL: https://github.com/apache/flink/pull/17605#issuecomment-954693023


   
   ## CI report:
   
   * 0056d2cc65fb1d0ff971c0394a32fd65972ff148 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25838)
 
   * c16aa625ec99f0e29bc9d768f027c9ee993b1b25 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25905)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17532: [FLINK-22096][tests] Fix port conflict for ServerTransportErrorHandlingTest#testRemoteClose

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17532:
URL: https://github.com/apache/flink/pull/17532#issuecomment-948251540


   
   ## CI report:
   
   *  Unknown: [CANCELED](TBD) 
   * 633084a4927bccc3958373c12ccd3441d303be68 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TanYuxin-tyx removed a comment on pull request #17532: [FLINK-22096][tests] Fix port conflict for ServerTransportErrorHandlingTest#testRemoteClose

2021-11-03 Thread GitBox


TanYuxin-tyx removed a comment on pull request #17532:
URL: https://github.com/apache/flink/pull/17532#issuecomment-958978355






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] TanYuxin-tyx commented on pull request #17532: [FLINK-22096][tests] Fix port conflict for ServerTransportErrorHandlingTest#testRemoteClose

2021-11-03 Thread GitBox


TanYuxin-tyx commented on pull request #17532:
URL: https://github.com/apache/flink/pull/17532#issuecomment-960456231


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17652: [FLINK-24717][table-planner]Push down partitions before filters

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17652:
URL: https://github.com/apache/flink/pull/17652#issuecomment-958647607


   
   ## CI report:
   
   * 2821a77a60feefd959cdf1f54e336ab4fcaa1a71 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25855)
 
   * 008027f8b93ee34ec258e91203cb2d46c9607462 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17605:
URL: https://github.com/apache/flink/pull/17605#issuecomment-954693023


   
   ## CI report:
   
   * 0056d2cc65fb1d0ff971c0394a32fd65972ff148 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25838)
 
   * c16aa625ec99f0e29bc9d768f027c9ee993b1b25 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xuyangzhong commented on a change in pull request #17652: [FLINK-24717][table-planner]Push down partitions before filters

2021-11-03 Thread GitBox


xuyangzhong commented on a change in pull request #17652:
URL: https://github.com/apache/flink/pull/17652#discussion_r742532909



##
File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/rules/FlinkBatchRuleSets.scala
##
@@ -246,77 +260,80 @@ object FlinkBatchRuleSets {
 * RuleSet to do logical optimize.
 * This RuleSet is a sub-set of [[LOGICAL_OPT_RULES]].
 */
-  private val LOGICAL_RULES: RuleSet = RuleSets.ofList(
-// scan optimization
-PushProjectIntoTableSourceScanRule.INSTANCE,
-PushProjectIntoLegacyTableSourceScanRule.INSTANCE,
-PushFilterIntoTableSourceScanRule.INSTANCE,
-PushFilterIntoLegacyTableSourceScanRule.INSTANCE,
-
-// reorder sort and projection
-CoreRules.SORT_PROJECT_TRANSPOSE,
-// remove unnecessary sort rule
-CoreRules.SORT_REMOVE,
-
-// join rules
-FlinkJoinPushExpressionsRule.INSTANCE,
-SimplifyJoinConditionRule.INSTANCE,
-
-// remove union with only a single child
-CoreRules.UNION_REMOVE,
-// convert non-all union into all-union + distinct
-CoreRules.UNION_TO_DISTINCT,
-
-// aggregation and projection rules
-CoreRules.AGGREGATE_PROJECT_MERGE,
-CoreRules.AGGREGATE_PROJECT_PULL_UP_CONSTANTS,
-
-// remove aggregation if it does not aggregate and input is already 
distinct
-FlinkAggregateRemoveRule.INSTANCE,
-// push aggregate through join
-FlinkAggregateJoinTransposeRule.EXTENDED,
-// aggregate union rule
-CoreRules.AGGREGATE_UNION_AGGREGATE,
-// expand distinct aggregate to normal aggregate with groupby
-FlinkAggregateExpandDistinctAggregatesRule.INSTANCE,
-
-// reduce aggregate functions like AVG, STDDEV_POP etc.
-CoreRules.AGGREGATE_REDUCE_FUNCTIONS,
-WindowAggregateReduceFunctionsRule.INSTANCE,
-
-// reduce group by columns
-AggregateReduceGroupingRule.INSTANCE,
-// reduce useless aggCall
-PruneAggregateCallRule.PROJECT_ON_AGGREGATE,
-PruneAggregateCallRule.CALC_ON_AGGREGATE,
-
-// expand grouping sets
-DecomposeGroupingSetsRule.INSTANCE,
-
-// rank rules
-FlinkLogicalRankRule.CONSTANT_RANGE_INSTANCE,
-// transpose calc past rank to reduce rank input fields
-CalcRankTransposeRule.INSTANCE,
-// remove output of rank number when it is a constant
-ConstantRankNumberColumnRemoveRule.INSTANCE,
-
-// calc rules
-CoreRules.FILTER_CALC_MERGE,
-CoreRules.PROJECT_CALC_MERGE,
-CoreRules.FILTER_TO_CALC,
-CoreRules.PROJECT_TO_CALC,
-FlinkCalcMergeRule.INSTANCE,
-
-// semi/anti join transpose rule
-FlinkSemiAntiJoinJoinTransposeRule.INSTANCE,
-FlinkSemiAntiJoinProjectTransposeRule.INSTANCE,
-FlinkSemiAntiJoinFilterTransposeRule.INSTANCE,
-
-// set operators
-ReplaceIntersectWithSemiJoinRule.INSTANCE,
-RewriteIntersectAllRule.INSTANCE,
-ReplaceMinusWithAntiJoinRule.INSTANCE,
-RewriteMinusAllRule.INSTANCE
+  private val LOGICAL_RULES: RuleSet = RuleSets.ofList((
+RuleSets.ofList(

Review comment:
   I will reset the code to keep the previous format here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] xuyangzhong commented on a change in pull request #17652: [FLINK-24717][table-planner]Push down partitions before filters

2021-11-03 Thread GitBox


xuyangzhong commented on a change in pull request #17652:
URL: https://github.com/apache/flink/pull/17652#discussion_r742535921



##
File path: 
flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/planner/plan/batch/sql/PartitionableSourceTest.scala
##
@@ -76,30 +98,41 @@ class PartitionableSourceTest(
 val catalogPartitionSpec = new CatalogPartitionSpec(partition)
 val catalogPartition = new CatalogPartitionImpl(
   new java.util.HashMap[String, String](), "")
-catalog.createPartition(mytablePath, catalogPartitionSpec, 
catalogPartition, true)
+catalog.createPartition(
+  partitionableTablePath, catalogPartitionSpec, catalogPartition, true)
+catalog.createPartition(
+  partitionableAndFilterableTablePath, catalogPartitionSpec, 
catalogPartition, true)
   })
 }
   }
 
   @Test
   def testSimplePartitionFieldPredicate1(): Unit = {
-util.verifyExecPlan("SELECT * FROM MyTable WHERE part1 = 'A'")
+util.verifyExecPlan("SELECT * FROM PartitionableTable WHERE part1 = 'A'")
   }
 
   @Test
   def testPartialPartitionFieldPredicatePushDown(): Unit = {
-util.verifyExecPlan("SELECT * FROM MyTable WHERE (id > 2 OR part1 = 'A') 
AND part2 > 1")
+util.verifyExecPlan(
+  "SELECT * FROM PartitionableTable WHERE (id > 2 OR part1 = 'A') AND 
part2 > 1")
   }
 
   @Test
   def testWithUdfAndVirtualColumn(): Unit = {
 util.addFunction("MyUdf", Func1)
-util.verifyExecPlan("SELECT * FROM MyTable WHERE id > 2 AND MyUdf(part2) < 
3")
+util.verifyExecPlan("SELECT * FROM PartitionableTable WHERE id > 2 AND 
MyUdf(part2) < 3")
   }
 
   @Test
   def testUnconvertedExpression(): Unit = {
-util.verifyExecPlan("select * from MyTable where trim(part1) = 'A' and 
part2 > 1")
+util.verifyExecPlan("select * from PartitionableTable where trim(part1) = 
'A' and part2 > 1")
+  }
+
+  @Test
+  def testPushDownPartitionAndFiltersContainPartitionKeys(): Unit = {
+util.verifyExecPlan(
+  "select * from PartitionableAndFilterableTable " +

Review comment:
   Actually, this case will only keep `name` in the projected fields, 
because here the filters are pushed down before the projection, and the 
projection does not contain fields that appear only in the filters and will 
not be used later. 
   However, if the table contains a watermark, the filters are pushed down 
after the projection, and the projection fields will then contain the fields 
used in the filters, just like you said.
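   The effect of pushdown order described above can be sketched as a toy model 
(plain Python for illustration only, not Flink planner code; the function and 
field names are hypothetical):

```python
def projected_fields(select_fields, filter_fields, filter_pushed_before_projection):
    """Fields the projection must read from the source.

    If the filter is pushed into the source before the projection, the
    projection only needs the fields in the SELECT list. If the filter runs
    downstream of the projection, the filter's fields must survive it too.
    """
    if filter_pushed_before_projection:
        return set(select_fields)
    return set(select_fields) | set(filter_fields)

# Query shape: SELECT name FROM t WHERE part1 = 'A'
# Filter pushed first (no watermark case): only `name` remains projected.
assert projected_fields(["name"], ["part1"], True) == {"name"}
# Filter applied after projection (watermark case): `part1` must be kept.
assert projected_fields(["name"], ["part1"], False) == {"name", "part1"}
```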




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-24759) Metrics on web UI is not correct in a filesystem source to filesystem sink streaming job

2021-11-03 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-24759:

Description: 
I'm running a yarn-per-job streaming job (submitted via SQL client) which reads 
a large csv file (flights.csv in [Kaggle flight delay 
data|https://www.kaggle.com/usdot/flight-delays], ~500MB) from HDFS and writes 
to another location (csv format) on the same HDFS. The resulting file seems OK, 
but the metrics on the web UI seem strange: bytes/records received/sent are 
extremely small.

I've tried other formats like reading from csv and writing to avro and the 
phenomenon is the same.

!metrics.png!

SQL script
{code:sql}
CREATE TABLE `flights` (
  `_YEAR` CHAR(4),
  `_MONTH` CHAR(2),
  `_DAY` CHAR(2),
  `_DAY_OF_WEEK` TINYINT,
  `AIRLINE` CHAR(2),
  `FLIGHT_NUMBER` SMALLINT,
  `TAIL_NUMBER` CHAR(6),
  `ORIGIN_AIRPORT` CHAR(3),
  `DESTINATION_AIRPORT` CHAR(3),
  `_SCHEDULED_DEPARTURE` CHAR(4),
  `_DEPARTURE_TIME` CHAR(4),
  `DEPARTURE_DELAY` SMALLINT,
  `TAXI_OUT` SMALLINT,
  `WHEELS_OFF` CHAR(4),
  `SCHEDULED_TIME` SMALLINT,
  `ELAPSED_TIME` SMALLINT,
  `AIR_TIME` SMALLINT,
  `DISTANCE` SMALLINT,
  `WHEELS_ON` CHAR(4),
  `TAXI_IN` SMALLINT,
  `SCHEDULED_ARRIVAL` CHAR(4),
  `ARRIVAL_TIME` CHAR(4),
  `ARRIVAL_DELAY` SMALLINT,
  `DIVERTED` BOOLEAN,
  `CANCELLED` BOOLEAN,
  `CANCELLATION_REASON` CHAR(1),
  `AIR_SYSTEM_DELAY` SMALLINT,
  `SECURITY_DELAY` SMALLINT,
  `AIRLINE_DELAY` SMALLINT,
  `LATE_AIRCRAFT_DELAY` SMALLINT,
  `WEATHER_DELAY` SMALLINT
) WITH (
  'connector' = 'filesystem',
  'path' = 'hdfs://source.csv',
  'format' = 'csv',
  'csv.null-literal' = ''
);

CREATE TABLE `my_sink` (
  `_YEAR` CHAR(4),
  `_MONTH` CHAR(2),
  `_DAY` CHAR(2),
  `_DAY_OF_WEEK` TINYINT,
  `AIRLINE` CHAR(2),
  `FLIGHT_NUMBER` SMALLINT,
  `TAIL_NUMBER` CHAR(6),
  `ORIGIN_AIRPORT` CHAR(3),
  `DESTINATION_AIRPORT` CHAR(3),
  `_SCHEDULED_DEPARTURE` CHAR(4),
  `_DEPARTURE_TIME` CHAR(4),
  `DEPARTURE_DELAY` SMALLINT,
  `TAXI_OUT` SMALLINT,
  `WHEELS_OFF` CHAR(4),
  `SCHEDULED_TIME` SMALLINT,
  `ELAPSED_TIME` SMALLINT,
  `AIR_TIME` SMALLINT,
  `DISTANCE` SMALLINT,
  `WHEELS_ON` CHAR(4),
  `TAXI_IN` SMALLINT,
  `SCHEDULED_ARRIVAL` CHAR(4),
  `ARRIVAL_TIME` CHAR(4),
  `ARRIVAL_DELAY` SMALLINT,
  `DIVERTED` BOOLEAN,
  `CANCELLED` BOOLEAN,
  `CANCELLATION_REASON` CHAR(1),
  `AIR_SYSTEM_DELAY` SMALLINT,
  `SECURITY_DELAY` SMALLINT,
  `AIRLINE_DELAY` SMALLINT,
  `LATE_AIRCRAFT_DELAY` SMALLINT,
  `WEATHER_DELAY` SMALLINT
) WITH (
  'connector' = 'filesystem',
  'path' = 'hdfs://sink.csv',
  'format' = 'csv'
);

insert into my_sink select * from flights;
{code}
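As a hedged aside, one way to cross-check the suspiciously small byte/record 
counts is to count the rows and on-disk size of the sink file independently and 
compare them with what the web UI reports. The sketch below is illustrative 
only: it uses a small local temp file standing in for the real HDFS paths from 
the job.

```python
import csv
import os
import tempfile

def count_records_and_bytes(path):
    """Count CSV rows and the file's size on disk for comparison with UI metrics."""
    with open(path, newline="") as f:
        records = sum(1 for _ in csv.reader(f))
    return records, os.path.getsize(path)

# Demo: write two rows to a local temp file standing in for hdfs://sink.csv.
tmp = tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="")
csv.writer(tmp).writerows([["2015", "1", "1"], ["2015", "1", "2"]])
tmp.close()

records, size = count_records_and_bytes(tmp.name)
assert records == 2   # far larger numbers would be expected for flights.csv
assert size > 0
os.unlink(tmp.name)
```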

  was:
I'm running a yarn-per-job streaming job (submitted via SQL client) which reads 
a large csv file (flights.csv in [Kaggle flight delay 
data|https://www.kaggle.com/usdot/flight-delays], ~500MB) from HDFS and writes 
to another location (csv format) on the same HDFS. The resulting file seems OK, 
but the metrics on the web UI seem strange.

I've tried other formats like reading from csv and writing to avro and the 
phenomenon is the same.

!metrics.png!

SQL script
{code:sql}
CREATE TABLE `flights` (
  `_YEAR` CHAR(4),
  `_MONTH` CHAR(2),
  `_DAY` CHAR(2),
  `_DAY_OF_WEEK` TINYINT,
  `AIRLINE` CHAR(2),
  `FLIGHT_NUMBER` SMALLINT,
  `TAIL_NUMBER` CHAR(6),
  `ORIGIN_AIRPORT` CHAR(3),
  `DESTINATION_AIRPORT` CHAR(3),
  `_SCHEDULED_DEPARTURE` CHAR(4),
  `_DEPARTURE_TIME` CHAR(4),
  `DEPARTURE_DELAY` SMALLINT,
  `TAXI_OUT` SMALLINT,
  `WHEELS_OFF` CHAR(4),
  `SCHEDULED_TIME` SMALLINT,
  `ELAPSED_TIME` SMALLINT,
  `AIR_TIME` SMALLINT,
  `DISTANCE` SMALLINT,
  `WHEELS_ON` CHAR(4),
  `TAXI_IN` SMALLINT,
  `SCHEDULED_ARRIVAL` CHAR(4),
  `ARRIVAL_TIME` CHAR(4),
  `ARRIVAL_DELAY` SMALLINT,
  `DIVERTED` BOOLEAN,
  `CANCELLED` BOOLEAN,
  `CANCELLATION_REASON` CHAR(1),
  `AIR_SYSTEM_DELAY` SMALLINT,
  `SECURITY_DELAY` SMALLINT,
  `AIRLINE_DELAY` SMALLINT,
  `LATE_AIRCRAFT_DELAY` SMALLINT,
  `WEATHER_DELAY` SMALLINT
) WITH (
  'connector' = 'filesystem',
  'path' = 'hdfs://source.csv',
  'format' = 'csv',
  'csv.null-literal' = ''
);

CREATE TABLE `my_sink` (
  `_YEAR` CHAR(4),
  `_MONTH` CHAR(2),
  `_DAY` CHAR(2),
  `_DAY_OF_WEEK` TINYINT,
  `AIRLINE` CHAR(2),
  `FLIGHT_NUMBER` SMALLINT,
  `TAIL_NUMBER` CHAR(6),
  `ORIGIN_AIRPORT` CHAR(3),
  `DESTINATION_AIRPORT` CHAR(3),
  `_SCHEDULED_DEPARTURE` CHAR(4),
  `_DEPARTURE_TIME` CHAR(4),
  `DEPARTURE_DELAY` SMALLINT,
  `TAXI_OUT` SMALLINT,
  `WHEELS_OFF` CHAR(4),
  `SCHEDULED_TIME` SMALLINT,
  `ELAPSED_TIME` SMALLINT,
  `AIR_TIME` SMALLINT,
  `DISTANCE` SMALLINT,
  `WHEELS_ON` CHAR(4),
  `TAXI_IN` SMALLINT,
  `SCHEDULED_ARRIVAL` CHAR(4),
  `ARRIVAL_TIME` CHAR(4),
  `ARRIVAL_DELAY` SMALLINT,
  `DIVERTED` BOOLEAN,
  `CANCELLED` BOOLEAN,
  `CANCELLATION_REASON` CHAR(1),
  `AIR_SYSTEM_DELAY` SMALLINT,
  `SECURITY_DELAY` SMALLINT,
  `AIRLINE_DELAY` SMALLINT,
  `LATE_AIRCRAFT_DELAY` SMALLINT,
  `WEATHER_DELAY` SMALLINT
) WITH (
  

[jira] [Updated] (FLINK-24759) Metrics on web UI is not correct in a filesystem source to filesystem sink streaming job

2021-11-03 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-24759:

Description: 
I'm running a yarn-per-job streaming job (submitted via SQL client) which reads 
a large csv file (flights.csv in [Kaggle flight delay 
data|https://www.kaggle.com/usdot/flight-delays], ~500MB) from HDFS and writes 
to another location (csv format) on the same HDFS. The resulting file seems OK, 
but the metrics on the web UI seem strange.

I've tried other formats like reading from csv and writing to avro and the 
phenomenon is the same.

!metrics.png!

SQL script
{code:sql}
CREATE TABLE `flights` (
  `_YEAR` CHAR(4),
  `_MONTH` CHAR(2),
  `_DAY` CHAR(2),
  `_DAY_OF_WEEK` TINYINT,
  `AIRLINE` CHAR(2),
  `FLIGHT_NUMBER` SMALLINT,
  `TAIL_NUMBER` CHAR(6),
  `ORIGIN_AIRPORT` CHAR(3),
  `DESTINATION_AIRPORT` CHAR(3),
  `_SCHEDULED_DEPARTURE` CHAR(4),
  `_DEPARTURE_TIME` CHAR(4),
  `DEPARTURE_DELAY` SMALLINT,
  `TAXI_OUT` SMALLINT,
  `WHEELS_OFF` CHAR(4),
  `SCHEDULED_TIME` SMALLINT,
  `ELAPSED_TIME` SMALLINT,
  `AIR_TIME` SMALLINT,
  `DISTANCE` SMALLINT,
  `WHEELS_ON` CHAR(4),
  `TAXI_IN` SMALLINT,
  `SCHEDULED_ARRIVAL` CHAR(4),
  `ARRIVAL_TIME` CHAR(4),
  `ARRIVAL_DELAY` SMALLINT,
  `DIVERTED` BOOLEAN,
  `CANCELLED` BOOLEAN,
  `CANCELLATION_REASON` CHAR(1),
  `AIR_SYSTEM_DELAY` SMALLINT,
  `SECURITY_DELAY` SMALLINT,
  `AIRLINE_DELAY` SMALLINT,
  `LATE_AIRCRAFT_DELAY` SMALLINT,
  `WEATHER_DELAY` SMALLINT
) WITH (
  'connector' = 'filesystem',
  'path' = 'hdfs://source.csv',
  'format' = 'csv',
  'csv.null-literal' = ''
);

CREATE TABLE `my_sink` (
  `_YEAR` CHAR(4),
  `_MONTH` CHAR(2),
  `_DAY` CHAR(2),
  `_DAY_OF_WEEK` TINYINT,
  `AIRLINE` CHAR(2),
  `FLIGHT_NUMBER` SMALLINT,
  `TAIL_NUMBER` CHAR(6),
  `ORIGIN_AIRPORT` CHAR(3),
  `DESTINATION_AIRPORT` CHAR(3),
  `_SCHEDULED_DEPARTURE` CHAR(4),
  `_DEPARTURE_TIME` CHAR(4),
  `DEPARTURE_DELAY` SMALLINT,
  `TAXI_OUT` SMALLINT,
  `WHEELS_OFF` CHAR(4),
  `SCHEDULED_TIME` SMALLINT,
  `ELAPSED_TIME` SMALLINT,
  `AIR_TIME` SMALLINT,
  `DISTANCE` SMALLINT,
  `WHEELS_ON` CHAR(4),
  `TAXI_IN` SMALLINT,
  `SCHEDULED_ARRIVAL` CHAR(4),
  `ARRIVAL_TIME` CHAR(4),
  `ARRIVAL_DELAY` SMALLINT,
  `DIVERTED` BOOLEAN,
  `CANCELLED` BOOLEAN,
  `CANCELLATION_REASON` CHAR(1),
  `AIR_SYSTEM_DELAY` SMALLINT,
  `SECURITY_DELAY` SMALLINT,
  `AIRLINE_DELAY` SMALLINT,
  `LATE_AIRCRAFT_DELAY` SMALLINT,
  `WEATHER_DELAY` SMALLINT
) WITH (
  'connector' = 'filesystem',
  'path' = 'hdfs://sink.csv',
  'format' = 'csv'
);

insert into my_sink select * from flights;
{code}

  was:
I'm running a yarn-per-job streaming job which reads a large csv file 
(flights.csv in [Kaggle flight delay 
data|https://www.kaggle.com/usdot/flight-delays], ~500MB) from HDFS and writes 
to another location (csv format) on the same HDFS. The resulting file seems OK, 
but the metrics on the web UI seem strange.

I've tried other formats like reading from csv and writing to avro and the 
phenomenon is the same.

!metrics.png!


> Metrics on web UI is not correct in a filesystem source to filesystem sink 
> streaming job
> 
>
> Key: FLINK-24759
> URL: https://issues.apache.org/jira/browse/FLINK-24759
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics, Runtime / Web Frontend
>Affects Versions: 1.14.0
>Reporter: Caizhi Weng
>Priority: Major
> Attachments: metrics.png
>
>
> I'm running a yarn-per-job streaming job (submitted via SQL client) which 
> reads a large csv file (flights.csv in [Kaggle flight delay 
> data|https://www.kaggle.com/usdot/flight-delays], ~500MB) from HDFS and 
> writes to another location (csv format) on the same HDFS. The resulting file 
> seems OK, but the metrics on the web UI seem strange.
> I've tried other formats like reading from csv and writing to avro and the 
> phenomenon is the same.
> !metrics.png!
> SQL script
> {code:sql}
> CREATE TABLE `flights` (
>   `_YEAR` CHAR(4),
>   `_MONTH` CHAR(2),
>   `_DAY` CHAR(2),
>   `_DAY_OF_WEEK` TINYINT,
>   `AIRLINE` CHAR(2),
>   `FLIGHT_NUMBER` SMALLINT,
>   `TAIL_NUMBER` CHAR(6),
>   `ORIGIN_AIRPORT` CHAR(3),
>   `DESTINATION_AIRPORT` CHAR(3),
>   `_SCHEDULED_DEPARTURE` CHAR(4),
>   `_DEPARTURE_TIME` CHAR(4),
>   `DEPARTURE_DELAY` SMALLINT,
>   `TAXI_OUT` SMALLINT,
>   `WHEELS_OFF` CHAR(4),
>   `SCHEDULED_TIME` SMALLINT,
>   `ELAPSED_TIME` SMALLINT,
>   `AIR_TIME` SMALLINT,
>   `DISTANCE` SMALLINT,
>   `WHEELS_ON` CHAR(4),
>   `TAXI_IN` SMALLINT,
>   `SCHEDULED_ARRIVAL` CHAR(4),
>   `ARRIVAL_TIME` CHAR(4),
>   `ARRIVAL_DELAY` SMALLINT,
>   `DIVERTED` BOOLEAN,
>   `CANCELLED` BOOLEAN,
>   `CANCELLATION_REASON` CHAR(1),
>   `AIR_SYSTEM_DELAY` SMALLINT,
>   `SECURITY_DELAY` SMALLINT,
>   `AIRLINE_DELAY` SMALLINT,
>   `LATE_AIRCRAFT_DELAY` SMALLINT,
>   

[jira] [Created] (FLINK-24759) Metrics on web UI is not correct in a filesystem source to filesystem sink streaming job

2021-11-03 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-24759:
---

 Summary: Metrics on web UI is not correct in a filesystem source 
to filesystem sink streaming job
 Key: FLINK-24759
 URL: https://issues.apache.org/jira/browse/FLINK-24759
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Metrics, Runtime / Web Frontend
Affects Versions: 1.14.0
Reporter: Caizhi Weng
 Attachments: metrics.png

I'm running a yarn-per-job streaming job which reads a large csv file 
(flights.csv in [Kaggle flight delay 
data|https://www.kaggle.com/usdot/flight-delays], ~500MB) from HDFS and writes 
to another location (csv format) on the same HDFS. The resulting file seems OK, 
but the metrics on the web UI seem strange.

I've tried other formats like reading from csv and writing to avro and the 
phenomenon is the same.

 !metrics.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-24759) Metrics on web UI is not correct in a filesystem source to filesystem sink streaming job

2021-11-03 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-24759:

Description: 
I'm running a yarn-per-job streaming job which reads a large csv file 
(flights.csv in [Kaggle flight delay 
data|https://www.kaggle.com/usdot/flight-delays], ~500MB) from HDFS and writes 
to another location (csv format) on the same HDFS. The resulting file seems OK, 
but the metrics on web UI seems strange.

I've tried other formats like reading from csv and writing to avro and the 
phenomenon is the same.

!metrics.png!

  was:
I'm running a yarn-per-job streaming job which reads a large csv file 
(flights.csv in [Kaggle flight delay 
data|https://www.kaggle.com/usdot/flight-delays], ~500MB) from HDFS and writes 
to another location (csv format) on the same HDFS. The resulting file seems OK, 
but the metrics on the web UI seems strage.

I've tried other formats like reading from csv and writing to avro and the 
phenomenon is the same.

 !metrics.png! 


> Metrics on web UI is not correct in a filesystem source to filesystem sink 
> streaming job
> 
>
> Key: FLINK-24759
> URL: https://issues.apache.org/jira/browse/FLINK-24759
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics, Runtime / Web Frontend
>Affects Versions: 1.14.0
>Reporter: Caizhi Weng
>Priority: Major
> Attachments: metrics.png
>
>
> I'm running a yarn-per-job streaming job which reads a large csv file 
> (flights.csv in [Kaggle flight delay 
> data|https://www.kaggle.com/usdot/flight-delays], ~500MB) from HDFS and 
> writes to another location (csv format) on the same HDFS. The resulting file 
> seems OK, but the metrics on web UI seems strange.
> I've tried other formats like reading from csv and writing to avro and the 
> phenomenon is the same.
> !metrics.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] xuyangzhong commented on a change in pull request #17652: [FLINK-24717][table-planner]Push down partitions before filters

2021-11-03 Thread GitBox


xuyangzhong commented on a change in pull request #17652:
URL: https://github.com/apache/flink/pull/17652#discussion_r742532909



##
File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/rules/FlinkBatchRuleSets.scala
##
@@ -246,77 +260,80 @@ object FlinkBatchRuleSets {
 * RuleSet to do logical optimize.
 * This RuleSet is a sub-set of [[LOGICAL_OPT_RULES]].
 */
-  private val LOGICAL_RULES: RuleSet = RuleSets.ofList(
-// scan optimization
-PushProjectIntoTableSourceScanRule.INSTANCE,
-PushProjectIntoLegacyTableSourceScanRule.INSTANCE,
-PushFilterIntoTableSourceScanRule.INSTANCE,
-PushFilterIntoLegacyTableSourceScanRule.INSTANCE,
-
-// reorder sort and projection
-CoreRules.SORT_PROJECT_TRANSPOSE,
-// remove unnecessary sort rule
-CoreRules.SORT_REMOVE,
-
-// join rules
-FlinkJoinPushExpressionsRule.INSTANCE,
-SimplifyJoinConditionRule.INSTANCE,
-
-// remove union with only a single child
-CoreRules.UNION_REMOVE,
-// convert non-all union into all-union + distinct
-CoreRules.UNION_TO_DISTINCT,
-
-// aggregation and projection rules
-CoreRules.AGGREGATE_PROJECT_MERGE,
-CoreRules.AGGREGATE_PROJECT_PULL_UP_CONSTANTS,
-
-// remove aggregation if it does not aggregate and input is already 
distinct
-FlinkAggregateRemoveRule.INSTANCE,
-// push aggregate through join
-FlinkAggregateJoinTransposeRule.EXTENDED,
-// aggregate union rule
-CoreRules.AGGREGATE_UNION_AGGREGATE,
-// expand distinct aggregate to normal aggregate with groupby
-FlinkAggregateExpandDistinctAggregatesRule.INSTANCE,
-
-// reduce aggregate functions like AVG, STDDEV_POP etc.
-CoreRules.AGGREGATE_REDUCE_FUNCTIONS,
-WindowAggregateReduceFunctionsRule.INSTANCE,
-
-// reduce group by columns
-AggregateReduceGroupingRule.INSTANCE,
-// reduce useless aggCall
-PruneAggregateCallRule.PROJECT_ON_AGGREGATE,
-PruneAggregateCallRule.CALC_ON_AGGREGATE,
-
-// expand grouping sets
-DecomposeGroupingSetsRule.INSTANCE,
-
-// rank rules
-FlinkLogicalRankRule.CONSTANT_RANGE_INSTANCE,
-// transpose calc past rank to reduce rank input fields
-CalcRankTransposeRule.INSTANCE,
-// remove output of rank number when it is a constant
-ConstantRankNumberColumnRemoveRule.INSTANCE,
-
-// calc rules
-CoreRules.FILTER_CALC_MERGE,
-CoreRules.PROJECT_CALC_MERGE,
-CoreRules.FILTER_TO_CALC,
-CoreRules.PROJECT_TO_CALC,
-FlinkCalcMergeRule.INSTANCE,
-
-// semi/anti join transpose rule
-FlinkSemiAntiJoinJoinTransposeRule.INSTANCE,
-FlinkSemiAntiJoinProjectTransposeRule.INSTANCE,
-FlinkSemiAntiJoinFilterTransposeRule.INSTANCE,
-
-// set operators
-ReplaceIntersectWithSemiJoinRule.INSTANCE,
-RewriteIntersectAllRule.INSTANCE,
-ReplaceMinusWithAntiJoinRule.INSTANCE,
-RewriteMinusAllRule.INSTANCE
+  private val LOGICAL_RULES: RuleSet = RuleSets.ofList((
+RuleSets.ofList(

Review comment:
   I keep the previous format here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17635: [FLINK-24724][buildsystem] Update Sun XML Bind dependencies to latest minor version

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17635:
URL: https://github.com/apache/flink/pull/17635#issuecomment-956476981


   
   ## CI report:
   
   * 052d2a12ed9a10c86db344dad25b292cbb1a Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25888)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17646: [FLINK-24740][testinfrastructure] Update testcontainers dependency to the latest version

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17646:
URL: https://github.com/apache/flink/pull/17646#issuecomment-957837700


   
   ## CI report:
   
   * b90ca62a59048e9ea369de6942203139e372533e Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25889)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-24758) partition.time-extractor.kind support "yyyyMMdd"

2021-11-03 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-24758:
-
Description: 
Now, only supports yyyy-mm-dd hh:mm:ss, we can add a new time-extractor kind to 
support yyyyMMdd in a single partition field.

  was:Now, only supports yyyy-mm-dd hh:mm:ss, we can add a new time-extractor 
kind to support yyyyMMdd.


> partition.time-extractor.kind support "yyyyMMdd"
> 
>
> Key: FLINK-24758
> URL: https://issues.apache.org/jira/browse/FLINK-24758
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Priority: Major
>
> Now, only supports yyyy-mm-dd hh:mm:ss, we can add a new time-extractor kind 
> to support yyyyMMdd in a single partition field.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-24758) filesystem sink: partition.time-extractor.kind support "yyyyMMdd"

2021-11-03 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-24758:
-
Summary: filesystem sink: partition.time-extractor.kind support "yyyyMMdd"  
(was: partition.time-extractor.kind support "yyyyMMdd")

> filesystem sink: partition.time-extractor.kind support "yyyyMMdd"
> -
>
> Key: FLINK-24758
> URL: https://issues.apache.org/jira/browse/FLINK-24758
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Reporter: Jingsong Lee
>Priority: Major
>
> Now, only supports yyyy-mm-dd hh:mm:ss, we can add a new time-extractor kind 
> to support yyyyMMdd in a single partition field.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24758) partition.time-extractor.kind support "yyyyMMdd"

2021-11-03 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-24758:


 Summary: partition.time-extractor.kind support "yyyyMMdd"
 Key: FLINK-24758
 URL: https://issues.apache.org/jira/browse/FLINK-24758
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / FileSystem
Reporter: Jingsong Lee


Now, only supports yyyy-mm-dd hh:mm:ss, we can add a new time-extractor kind to 
support yyyyMMdd.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
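The proposed extractor kind boils down to parsing a compact "yyyyMMdd" partition value into a timestamp. A JDK-only sketch of that parsing — this is not Flink's actual `PartitionTimeExtractor` implementation; the class name and the midnight-of-day semantics are assumptions for illustration:

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: parse a compact "yyyyMMdd" partition value,
// e.g. a partition column holding "20211103".
public class CompactDatePartition {

    public static LocalDateTime extract(String partitionValue) {
        // BASIC_ISO_DATE is exactly the yyyyMMdd pattern.
        LocalDate date = LocalDate.parse(partitionValue, DateTimeFormatter.BASIC_ISO_DATE);
        // Report midnight of that day as the partition time
        // (an assumption about the desired extractor semantics).
        return date.atStartOfDay();
    }

    public static void main(String[] args) {
        System.out.println(extract("20211103")); // 2021-11-03T00:00
    }
}
```

The same formatter would reject malformed values such as "2021113", which is the behaviour one would likely want from a stricter single-field extractor.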


[GitHub] [flink] zhuzhurk commented on a change in pull request #15229: [FLINK-19142][runtime] Fix slot hijacking after task failover

2021-11-03 Thread GitBox


zhuzhurk commented on a change in pull request #15229:
URL: https://github.com/apache/flink/pull/15229#discussion_r742527302



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultScheduler.java
##
@@ -669,6 +704,11 @@ public SchedulingTopology getSchedulingTopology() {
 public Optional 
getStateLocation(ExecutionVertexID executionVertexId) {
 return stateLocationRetriever.getStateLocation(executionVertexId);
 }
+
+@Override
+public Set getAllocationsToReserve() {

Review comment:
   Agreed. `reservedAllocations` is more accurate.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (FLINK-24757) Yarn application is not terminated after the job finishes when submitting a yarn-per-job insert job with SQL client

2021-11-03 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-24757:

Description: 
I've seen this problem for about three times in the user mailing thread 
(previously I suspect that the users are specifying the wrong 
{{execution.target}}) until I myself also bumped into this problem. I've 
submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
job finishes Yarn application is not terminated.

This is because yarn job cluster is using {{MiniDispatcher}} and it will 
directly terminate only in detached execution mode. This execution mode is 
(through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
is true by default.

When submitting an insert job, SQL client will not wait for the job to finish. 
Instead it only reports the job id. So I think it is reasonable to set detached 
mode for every insert job.

  was:
I've seen this problem for about three times in the user mailing thread 
(previously I suspect that the users are specifying the wrong 
{{execution.target}}) until I myself also bumped into this problem. I've 
submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
job finishes the Yarn application is not terminated.

This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
directly terminate only in the detached execution mode. This execution mode is 
(through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
is true by default.

When submitting an insert job, SQL client will not wait for the job to finish. 
Instead it only reports the job id. So I think it is reasonable to set detached 
mode for every insert job.


> Yarn application is not terminated after the job finishes when submitting a 
> yarn-per-job insert job with SQL client
> ---
>
> Key: FLINK-24757
> URL: https://issues.apache.org/jira/browse/FLINK-24757
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.14.0
>Reporter: Caizhi Weng
>Priority: Major
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
>
> I've seen this problem for about three times in the user mailing thread 
> (previously I suspect that the users are specifying the wrong 
> {{execution.target}}) until I myself also bumped into this problem. I've 
> submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
> job finishes Yarn application is not terminated.
> This is because yarn job cluster is using {{MiniDispatcher}} and it will 
> directly terminate only in detached execution mode. This execution mode is 
> (through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
> is true by default.
> When submitting an insert job, SQL client will not wait for the job to 
> finish. Instead it only reports the job id. So I think it is reasonable to 
> set detached mode for every insert job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
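The attached/detached interplay described in this report reduces to a small decision table. A self-contained toy model of it — the enum values only echo Flink's execution modes, and nothing below touches real Flink classes:

```java
// Toy model of the shutdown behaviour described in FLINK-24757.
public class ShutdownDecision {

    public enum ExecutionMode { DETACHED, NORMAL }

    // DeploymentOptions#ATTACHED is reported to default to true,
    // which maps to NORMAL (attached) execution.
    public static ExecutionMode fromAttachedFlag(boolean attached) {
        return attached ? ExecutionMode.NORMAL : ExecutionMode.DETACHED;
    }

    // Per the report, only a detached MiniDispatcher tears the cluster
    // down by itself once its single job reaches a terminal state.
    public static boolean terminatesAfterJob(ExecutionMode mode) {
        return mode == ExecutionMode.DETACHED;
    }

    public static void main(String[] args) {
        // Default (attached) submission: the Yarn application lingers.
        System.out.println(terminatesAfterJob(fromAttachedFlag(true)));  // false
        // Detached submission, as proposed for SQL client insert jobs.
        System.out.println(terminatesAfterJob(fromAttachedFlag(false))); // true
    }
}
```

The proposed fix then amounts to forcing the detached branch for every insert job submitted through the SQL client.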


[jira] [Updated] (FLINK-24757) Yarn application is not terminated after the job finishes when submitting a yarn-per-job insert job with SQL client

2021-11-03 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-24757:

Description: 
I've seen this problem for about three times in the user mailing thread 
(previously I suspect that the users are specifying the wrong 
{{execution.target}}) until I myself also bumped into this problem. I've 
submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
job finishes the Yarn application is not terminated.

This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
directly terminate only in the detached execution mode. This execution mode is 
(through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
is true by default.

When submitting an insert job, SQL client will not wait for the job to finish. 
Instead it only reports the job id. So I think it is reasonable to set detached 
mode for every insert job.

  was:
I've seen this problem for about three times in the user mailing thread 
(previously I suspect that the users are specifying the wrong 
{{execution.target}}) until I myself also bumped into this problem. I've 
submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
job finishes the Yarn application is not terminated.

This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
directly terminated only in the detached execution mode. This execution mode is 
(through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
is true by default.

When submitting an insert job, SQL client will not wait for the job to finish. 
Instead it only reports the job id. So I think it is reasonable to set detached 
mode for every insert job.


> Yarn application is not terminated after the job finishes when submitting a 
> yarn-per-job insert job with SQL client
> ---
>
> Key: FLINK-24757
> URL: https://issues.apache.org/jira/browse/FLINK-24757
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.14.0
>Reporter: Caizhi Weng
>Priority: Major
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
>
> I've seen this problem for about three times in the user mailing thread 
> (previously I suspect that the users are specifying the wrong 
> {{execution.target}}) until I myself also bumped into this problem. I've 
> submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
> job finishes the Yarn application is not terminated.
> This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
> directly terminate only in the detached execution mode. This execution mode 
> is (through some function calls) related to {{DeploymentOptions#ATTACHED}} 
> which is true by default.
> When submitting an insert job, SQL client will not wait for the job to 
> finish. Instead it only reports the job id. So I think it is reasonable to 
> set detached mode for every insert job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] zhuzhurk commented on a change in pull request #15229: [FLINK-19142][runtime] Fix slot hijacking after task failover

2021-11-03 Thread GitBox


zhuzhurk commented on a change in pull request #15229:
URL: https://github.com/apache/flink/pull/15229#discussion_r742521923



##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/DefaultSchedulerComponents.java
##
@@ -124,9 +126,19 @@ private static SlotSelectionStrategy 
selectSlotSelectionStrategy(
 ? 
LocationPreferenceSlotSelectionStrategy.createEvenlySpreadOut()
 : 
LocationPreferenceSlotSelectionStrategy.createDefault();
 
-return configuration.getBoolean(CheckpointingOptions.LOCAL_RECOVERY)
-? PreviousAllocationSlotSelectionStrategy.create(
-locationPreferenceSlotSelectionStrategy)
-: locationPreferenceSlotSelectionStrategy;
+final boolean isLocalRecoveryEnabled =
+configuration.getBoolean(CheckpointingOptions.LOCAL_RECOVERY);
+if (isLocalRecoveryEnabled) {
+if (jobType == JobType.STREAMING) {
+return PreviousAllocationSlotSelectionStrategy.create(
+locationPreferenceSlotSelectionStrategy);
+} else {
+throw new IllegalArgumentException(
+"Local recovery does not support batch jobs. Please 
set configuration "
++ "'state.backend.local-recovery' to false for 
batch jobs.");

Review comment:
   I think you are right. I will change it to override the strategy to 
`LocationPreferenceSlotSelectionStrategy` and log a warning for it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
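The fallback agreed in this review (override the strategy and warn instead of throwing) can be sketched with plain JDK types; the class and enum names below are illustrative stand-ins, not Flink's actual scheduler classes:

```java
import java.util.logging.Logger;

// Sketch of the reviewed behaviour: keep scheduling batch jobs when local
// recovery is enabled, but fall back to location preference and warn.
public class SlotStrategyFallback {

    private static final Logger LOG =
            Logger.getLogger(SlotStrategyFallback.class.getName());

    public enum JobType { STREAMING, BATCH }
    public enum Strategy { PREVIOUS_ALLOCATION, LOCATION_PREFERENCE }

    public static Strategy select(boolean localRecoveryEnabled, JobType jobType) {
        if (localRecoveryEnabled && jobType == JobType.STREAMING) {
            // Streaming jobs can reuse previous allocations for local recovery.
            return Strategy.PREVIOUS_ALLOCATION;
        }
        if (localRecoveryEnabled) {
            // Instead of throwing IllegalArgumentException for batch jobs,
            // override the strategy and surface a warning.
            LOG.warning("Local recovery is not supported for batch jobs; "
                    + "falling back to location-preference slot selection.");
        }
        return Strategy.LOCATION_PREFERENCE;
    }
}
```

This mirrors the review outcome: the configuration mismatch degrades gracefully rather than failing job submission.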




[jira] [Updated] (FLINK-24757) Yarn application is not terminated after the job finishes when submitting a yarn-per-job insert job with SQL client

2021-11-03 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-24757:

Description: 
I've seen this problem for about three times in the user mailing thread 
(previously I suspect that the users are specifying the wrong 
{{execution.target}}) until I myself also bumped into this problem. I've 
submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
job finishes the Yarn application is not terminated.

This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
directly terminated only in the detached execution mode. This execution mode is 
(through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
is true by default.

When submitting an insert job, SQL client will not wait for the job to finish. 
Instead it only reports the job id. So I think it is reasonable to set detached 
mode for every insert job.

  was:
I've seen this problem for about three times in the user mailing thread 
(previously I suspect that the users are suspecting the wrong 
{{execution.target}}) until I myself also bumped into this problem. I've 
submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
job finishes the Yarn application is not terminated.

This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
directly terminated only in the detached execution mode. This execution mode is 
(through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
is true by default.

When submitting an insert job, SQL client will not wait for the job to finish. 
Instead it only reports the job id. So I think it is reasonable to set detached 
mode for every insert job.


> Yarn application is not terminated after the job finishes when submitting a 
> yarn-per-job insert job with SQL client
> ---
>
> Key: FLINK-24757
> URL: https://issues.apache.org/jira/browse/FLINK-24757
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.14.0
>Reporter: Caizhi Weng
>Priority: Major
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
>
> I've seen this problem for about three times in the user mailing thread 
> (previously I suspect that the users are specifying the wrong 
> {{execution.target}}) until I myself also bumped into this problem. I've 
> submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
> job finishes the Yarn application is not terminated.
> This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
> directly terminated only in the detached execution mode. This execution mode 
> is (through some function calls) related to {{DeploymentOptions#ATTACHED}} 
> which is true by default.
> When submitting an insert job, SQL client will not wait for the job to 
> finish. Instead it only reports the job id. So I think it is reasonable to 
> set detached mode for every insert job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-24757) Yarn application is not terminated after the job finishes when submitting a yarn-per-job insert job with SQL client

2021-11-03 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-24757:

Description: 
I've seen this problem for about three times in the user mailing thread 
(previously I suspect that the users are suspecting the wrong 
{{execution.target}}) until I myself also bumped into this problem. I've 
submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
job finishes the Yarn application is not terminated.

This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
directly terminated only in the detached execution mode. This execution mode is 
(through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
is true by default.

When submitting an insert job, SQL client will not wait for the job to finish. 
Instead it only reports the job id. So I think it is reasonable to set detached 
mode for every insert job.

  was:
I've seen this problem for about three times in the user mailing thread 
(previously I suspect that the users are suspecting the wrong 
{{execution.target}}) until I myself also bumped into this problem. I've 
submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
job finishes the Yarn application is not terminated.

This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
directly terminated only the detached execution mode. This execution mode is 
(through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
is true by default.

When submitting an insert job, SQL client will not wait for the job to finish. 
Instead it only reports the job id. So I think it is reasonable to set detached 
mode for every insert job.


> Yarn application is not terminated after the job finishes when submitting a 
> yarn-per-job insert job with SQL client
> ---
>
> Key: FLINK-24757
> URL: https://issues.apache.org/jira/browse/FLINK-24757
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.14.0
>Reporter: Caizhi Weng
>Priority: Major
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
>
> I've seen this problem for about three times in the user mailing thread 
> (previously I suspect that the users are suspecting the wrong 
> {{execution.target}}) until I myself also bumped into this problem. I've 
> submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
> job finishes the Yarn application is not terminated.
> This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
> directly terminated only in the detached execution mode. This execution mode 
> is (through some function calls) related to {{DeploymentOptions#ATTACHED}} 
> which is true by default.
> When submitting an insert job, SQL client will not wait for the job to 
> finish. Instead it only reports the job id. So I think it is reasonable to 
> set detached mode for every insert job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24757) Yarn application is not terminated after the job finishes when submitting a yarn-per-job insert job with SQL client

2021-11-03 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-24757:
---

 Summary: Yarn application is not terminated after the job finishes 
when submitting a yarn-per-job insert job with SQL client
 Key: FLINK-24757
 URL: https://issues.apache.org/jira/browse/FLINK-24757
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.14.0
Reporter: Caizhi Weng
 Fix For: 1.15.0, 1.14.1, 1.13.4


I've seen this problem for about three times in the user mailing thread 
(previously I suspect that the users are specifying the wrong 
{{execution.target}}) until I myself also bumped into this problem. I've 
submitted a yarn-per-job batch insert SQL with Flink SQL client and after the 
job finishes the Yarn application is not terminated.

This is because yarn job cluster is using the {{MiniDispatcher}} and it will 
directly terminate only in the detached execution mode. This execution mode is 
(through some function calls) related to {{DeploymentOptions#ATTACHED}} which 
is true by default.

When submitting an insert job, SQL client will not wait for the job to finish. 
Instead it only reports the job id. So I think it is reasonable to set detached 
mode for every insert job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #17650: [FLINK-24475] Remove no longer used NestedMap* classes

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17650:
URL: https://github.com/apache/flink/pull/17650#issuecomment-958638599


   
   ## CI report:
   
   * 150b5f9f464913cc83756cbadd233cdb279db0c7 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25839)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17655: [FLINK-24728][tests] Add tests to ensure SQL file sink closes all created files

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17655:
URL: https://github.com/apache/flink/pull/17655#issuecomment-958717470


   
   ## CI report:
   
   * cbb562453f733959c1f7e4ca96f4c74309f1ee81 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25874)
 
   * 503b11ea2c10bef6caf3d4894d416bc19fb0b6c4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25902)
 
   * 876e0ba044ff4f274814e9f22624fc3f2900c80a Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25904)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17655: [FLINK-24728][tests] Add tests to ensure SQL file sink closes all created files

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17655:
URL: https://github.com/apache/flink/pull/17655#issuecomment-958717470


   
   ## CI report:
   
   * cbb562453f733959c1f7e4ca96f4c74309f1ee81 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25874)
 
   * 503b11ea2c10bef6caf3d4894d416bc19fb0b6c4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25902)
 
   * 876e0ba044ff4f274814e9f22624fc3f2900c80a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17655: [FLINK-24728][tests] Add tests to ensure SQL file sink closes all created files

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17655:
URL: https://github.com/apache/flink/pull/17655#issuecomment-958717470


   
   ## CI report:
   
   * cbb562453f733959c1f7e4ca96f4c74309f1ee81 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25874)
 
   * 503b11ea2c10bef6caf3d4894d416bc19fb0b6c4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25902)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17655: [FLINK-24728][tests] Add tests to ensure SQL file sink closes all created files

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17655:
URL: https://github.com/apache/flink/pull/17655#issuecomment-958717470


   
   ## CI report:
   
   * cbb562453f733959c1f7e4ca96f4c74309f1ee81 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25874)
 
   * 503b11ea2c10bef6caf3d4894d416bc19fb0b6c4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25902)
 
   * 876e0ba044ff4f274814e9f22624fc3f2900c80a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17655: [FLINK-24728][tests] Add tests to ensure SQL file sink closes all created files

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17655:
URL: https://github.com/apache/flink/pull/17655#issuecomment-958717470


   
   ## CI report:
   
   * cbb562453f733959c1f7e4ca96f4c74309f1ee81 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25874)
 
   * 503b11ea2c10bef6caf3d4894d416bc19fb0b6c4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25902)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] meetjunsu commented on pull request #17542: [FLINK-17782] Add array,map,row types support for parquet row writer

2021-11-03 Thread GitBox


meetjunsu commented on pull request #17542:
URL: https://github.com/apache/flink/pull/17542#issuecomment-960418603


   Rebased onto the latest master branch to fix the Python test errors.






[GitHub] [flink] tsreaper commented on a change in pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


tsreaper commented on a change in pull request #17605:
URL: https://github.com/apache/flink/pull/17605#discussion_r742497844



##
File path: 
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/rank/UpdatableTopNFunction.java
##
@@ -312,6 +357,37 @@ private void processElementWithRowNumber(RowData inputRow, Collector<RowData> out
 "Failed to find the sortKey, rowkey in the buffer. This should 
never happen");
 }
 
+private void emitRecordsWithRowNumberIgnoreStateError(
+RowData newRow, int newRank, RankRow oldRow, int oldRank, Collector<RowData> out) {
+Iterator<Map.Entry<RowData, Collection<RowData>>> iterator = buffer.entrySet().iterator();
+int currentRank = 0;
+RowData currentRow = null;
+RowData prevRow = null;
+
+while (iterator.hasNext() && currentRank <= newRank) {
+Map.Entry<RowData, Collection<RowData>> entry = iterator.next();
+Collection<RowData> rowKeys = entry.getValue();
+Iterator<RowData> rowKeyIter = rowKeys.iterator();
+while (rowKeyIter.hasNext()) {
+RowData rowKey = rowKeyIter.next();
+currentRank += 1;
+currentRow = rowKeyMap.get(rowKey).row;
+if (oldRank <= currentRank) {
+if (currentRank == oldRank) {
+collectUpdateBefore(out, oldRow.row, oldRank);
+} else {
+collectUpdateBefore(out, prevRow, currentRank);
+collectUpdateAfter(out, prevRow, currentRank - 1);
+if (currentRank == newRank) {
+collectUpdateAfter(out, newRow, currentRank);
+}
+}
+}
+prevRow = currentRow;
+}
+}
+}

Review comment:
   This piece of the algorithm seems awkward to me. Consider modifying it to:
   ```java
   while (iterator.hasNext() && currentRank < newRank) {
   // ...
   while (rowKeyIter.hasNext()) {
   // ...
   if (oldRank <= currentRank) {
   collectUpdateBefore(out, currentRow, currentRank + 1);
   collectUpdateAfter(out, currentRow, currentRank);
   }
   }
   }
   collectUpdateBefore(out, oldRow.row, oldRank);
   collectUpdateAfter(out, newRow, newRank);
   ```
   so that there is no `prevRow` thingy. It is misleading to see a `prevRow` 
and a `currentRank` sent within the same message.
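
   To make the suggested emission order concrete, here is a minimal stand-alone trace (plain strings stand in for `RowData`, and `UB(row,rank)` / `UA(row,rank)` markers stand in for `collectUpdateBefore` / `collectUpdateAfter`; this is a sketch of the suggested control flow, not Flink code):

   ```java
   import java.util.ArrayList;
   import java.util.List;

   public class TopNEmitSketch {
       // Rows between oldRank and newRank shift up by one rank; the moved row
       // is retracted at oldRank and re-emitted at newRank at the very end.
       static List<String> emit(List<String> buffer, String newRow, int newRank,
                                String oldRow, int oldRank) {
           List<String> out = new ArrayList<>();
           int currentRank = 0;
           for (String currentRow : buffer) {
               currentRank++;
               if (currentRank >= newRank) {
                   break; // mirrors the outer `currentRank < newRank` condition
               }
               if (oldRank <= currentRank) {
                   out.add("UB(" + currentRow + "," + (currentRank + 1) + ")");
                   out.add("UA(" + currentRow + "," + currentRank + ")");
               }
           }
           out.add("UB(" + oldRow + "," + oldRank + ")");
           out.add("UA(" + newRow + "," + newRank + ")");
           return out;
       }

       public static void main(String[] args) {
           // "b" drops from rank 2 to rank 4; the post-update buffer order is a, c, d, b.
           System.out.println(emit(List.of("a", "c", "d", "b"), "b", 4, "b", 2));
       }
   }
   ```

   Traced this way, each intermediate row produces exactly one UB/UA pair, with no `prevRow` bookkeeping.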








[GitHub] [flink] flinkbot edited a comment on pull request #17643: [FLINK-24733][connector/pulsar] Data loss in pulsar source when using shared mode

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17643:
URL: https://github.com/apache/flink/pull/17643#issuecomment-957416596


   
   ## CI report:
   
   * a4275b6ff459651c730cce9ddb48216dad635197 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25891)
 
   * 6e6edbd0fa4ef91ed9f92e93f676ef2607302e08 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25903)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] meetjunsu commented on pull request #17654: [FLINK-24614][Connectors/PARQUET] Add Complex types support for parquet reader

2021-11-03 Thread GitBox


meetjunsu commented on pull request #17654:
URL: https://github.com/apache/flink/pull/17654#issuecomment-960417686


   > @meetjunsu Thanks for the PR! Would this also make it possible to support 
this for DataStream API users?
   
   As far as I know, `ParquetColumnarRowInputFormat` is a subclass of 
`BulkFormat`, so users can use it through `FileSource.forBulkFileFormat`.






[GitHub] [flink] flinkbot edited a comment on pull request #17643: [FLINK-24733][connector/pulsar] Data loss in pulsar source when using shared mode

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17643:
URL: https://github.com/apache/flink/pull/17643#issuecomment-957416596


   
   ## CI report:
   
   * a4275b6ff459651c730cce9ddb48216dad635197 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25891)
 
   * 6e6edbd0fa4ef91ed9f92e93f676ef2607302e08 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Commented] (FLINK-24751) flink SQL comile failed cause by: java.lang.StackOverflowError

2021-11-03 Thread tao.yang03 (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438463#comment-17438463
 ] 

tao.yang03 commented on FLINK-24751:


Thanks. I added env.java.opts.taskmanager: '-Xss100M' and this solved the 
problem. The following is my application. !image-2021-11-04-11-03-22-937.png!
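
A quick stdlib-only illustration of why the thread stack size (the `-Xss` setting mentioned above, or the stack size passed to a `Thread` constructor) bounds recursion depth before a `StackOverflowError` (a hypothetical demo, unrelated to Flink's generated code itself):

```java
public class StackSizeDemo {
    // Recurse until the stack overflows and report how deep we got.
    static int depth(int d) {
        try {
            return depth(d + 1);
        } catch (StackOverflowError e) {
            return d;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] results = new int[2];
        // The stackSize argument to Thread is only a hint to the JVM, but
        // HotSpot honors it; 256 KiB vs 8 MiB should yield very different
        // maximum recursion depths.
        Thread small = new Thread(null, () -> results[0] = depth(0), "small-stack", 256 * 1024);
        Thread large = new Thread(null, () -> results[1] = depth(0), "large-stack", 8 * 1024 * 1024);
        small.start(); small.join();
        large.start(); large.join();
        System.out.println("small=" + results[0] + " large=" + results[1]);
    }
}
```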

> flink SQL comile  failed  cause by: java.lang.StackOverflowError
> 
>
> Key: FLINK-24751
> URL: https://issues.apache.org/jira/browse/FLINK-24751
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.13.1
>Reporter: tao.yang03
>Priority: Minor
> Attachments: image-2021-11-04-11-03-22-937.png
>
>
> I have two temporal joins, but they sometimes fail with a
> StackOverflowError. I tried to add -Xss, but it had no effect.
>  
> Caused by: org.apache.flink.api.common.InvalidProgramException: Table program 
> cannot be compiled. This is a bug. Please file an issue.
>  
> Caused by: org.apache.flink.api.common.InvalidProgramException: Table program 
> cannot be compiled. This is a bug. Please file an issue. at 
> org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:89)
>  at 
> org.apache.flink.table.runtime.generated.CompileUtils.lambda$compile$1(CompileUtils.java:74)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
>  ... 15 more
>  
> Caused by: java.lang.StackOverflowError
> at org.codehaus.janino.CodeContext.flowAnalysis(CodeContext.java:429)
> at org.codehaus.janino.CodeContext.flowAnalysis(CodeContext.java:591)
> at 
> org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1357)
>  at 
> org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1357)
>  at 
> org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1330)
>  at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:822) at 
> org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:432) at 
> org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:215) at 
> org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:411)
>  at 
> org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:406)
>  at 
> org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1414) 
> at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:406) at 
> org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:378) at 
> org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:237) at 
> org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:465)
>  at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:216) at 
> org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:207) at 
> org.codehaus.commons.compiler.Cookable.cook(Cookable.java:80) at 
> org.codehaus.commons.compiler.Cookable.cook(Cookable.java:75) at 
> org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:86)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (FLINK-24751) flink SQL comile failed cause by: java.lang.StackOverflowError

2021-11-03 Thread tao.yang03 (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tao.yang03 updated FLINK-24751:
---
Attachment: image-2021-11-04-11-03-22-937.png

> flink SQL comile  failed  cause by: java.lang.StackOverflowError
> 
>
> Key: FLINK-24751
> URL: https://issues.apache.org/jira/browse/FLINK-24751
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.13.1
>Reporter: tao.yang03
>Priority: Minor
> Attachments: image-2021-11-04-11-03-22-937.png
>
>
> I have two temporal joins, but they sometimes fail with a
> StackOverflowError. I tried to add -Xss, but it had no effect.
>  
> Caused by: org.apache.flink.api.common.InvalidProgramException: Table program 
> cannot be compiled. This is a bug. Please file an issue.
>  
> Caused by: org.apache.flink.api.common.InvalidProgramException: Table program 
> cannot be compiled. This is a bug. Please file an issue. at 
> org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:89)
>  at 
> org.apache.flink.table.runtime.generated.CompileUtils.lambda$compile$1(CompileUtils.java:74)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
>  at 
> org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
>  ... 15 more
>  
> Caused by: java.lang.StackOverflowError
> at org.codehaus.janino.CodeContext.flowAnalysis(CodeContext.java:429)
> at org.codehaus.janino.CodeContext.flowAnalysis(CodeContext.java:591)
> at 
> org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1357)
>  at 
> org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1357)
>  at 
> org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1330)
>  at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:822) at 
> org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:432) at 
> org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:215) at 
> org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:411)
>  at 
> org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:406)
>  at 
> org.codehaus.janino.Java$PackageMemberClassDeclaration.accept(Java.java:1414) 
> at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:406) at 
> org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:378) at 
> org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:237) at 
> org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:465)
>  at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:216) at 
> org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:207) at 
> org.codehaus.commons.compiler.Cookable.cook(Cookable.java:80) at 
> org.codehaus.commons.compiler.Cookable.cook(Cookable.java:75) at 
> org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:86)





[GitHub] [flink] wsz94 commented on a change in pull request #17643: [FLINK-24733][connector/pulsar] Data loss in pulsar source when using shared mode

2021-11-03 Thread GitBox


wsz94 commented on a change in pull request #17643:
URL: https://github.com/apache/flink/pull/17643#discussion_r742503458



##
File path: 
flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/source/reader/split/PulsarPartitionSplitReaderBase.java
##
@@ -114,6 +116,9 @@ protected PulsarPartitionSplitReaderBase(
 try {
 Duration timeout = deadline.timeLeftIfAny();
 Message<byte[]> message = pollMessage(timeout);
+if (message == null) {
+break;
+}
 
 // Deserialize message.
 collector.setMessage(message);

Review comment:
   `Duration timeout = deadline.timeLeftIfAny()`  indicates that it met the 
for loop condition `deadline.hasTimeLeft()`. So we should take another poll.








[jira] [Updated] (FLINK-24728) Batch SQL file sink forgets to close the output stream

2021-11-03 Thread Caizhi Weng (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng updated FLINK-24728:

Affects Version/s: (was: 1.13.4)
   (was: 1.14.1)
   (was: 1.15.0)
   (was: 1.12.6)
   (was: 1.11.5)
   1.11.4
   1.14.0
   1.12.5
   1.13.3

> Batch SQL file sink forgets to close the output stream
> --
>
> Key: FLINK-24728
> URL: https://issues.apache.org/jira/browse/FLINK-24728
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.11.4, 1.14.0, 1.12.5, 1.13.3
>Reporter: Caizhi Weng
>Assignee: Caizhi Weng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0, 1.14.1, 1.13.4
>
>
> I tried to write a large Avro file into HDFS and discovered that the displayed
> file size in HDFS is extremely small, but copying that file to the local
> filesystem yields the correct size. If we create another Flink job that reads
> that Avro file from HDFS, the job will finish without outputting any record,
> because the file size Flink gets from HDFS is that very small size.
> This is because the output format created in
> {{FileSystemTableSink#createBulkWriterOutputFormat}} only finishes the
> {{BulkWriter}}. According to the Javadoc of {{BulkWriter#finish}}, bulk
> writers should not close the output stream and should leave that to the
> framework.
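
The contract described above — finish flushes the writer's own buffers while closing the stream is left to the framework — can be sketched with a hypothetical stdlib-only writer (an illustration of the described contract, not Flink's actual `BulkWriter` interface):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class BulkWriterContractSketch {
    // Hypothetical stand-in for a bulk writer: finish() flushes, never closes.
    interface SimpleBulkWriter {
        void addElement(String element) throws IOException;
        void finish() throws IOException;
    }

    static class LineWriter implements SimpleBulkWriter {
        private final OutputStream out;

        LineWriter(OutputStream out) {
            this.out = out;
        }

        @Override
        public void addElement(String element) throws IOException {
            out.write((element + "\n").getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void finish() throws IOException {
            out.flush(); // flush our own buffers, but do NOT close the stream
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        LineWriter writer = new LineWriter(stream);
        writer.addElement("record-1");
        writer.finish();
        stream.close(); // the framework, not the writer, closes the stream
        System.out.println(stream.toString(StandardCharsets.UTF_8));
    }
}
```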





[GitHub] [flink] syhily commented on pull request #17643: [FLINK-24733][connector/pulsar] Data loss in pulsar source when using shared mode

2021-11-03 Thread GitBox


syhily commented on pull request #17643:
URL: https://github.com/apache/flink/pull/17643#issuecomment-960414913


   > > @wsz94 I have created a patch by using `consumer.receive(timeout, 
timeunit)`. Can you confirm this fix works?
   > > 
[gist.github.com/syhily/fbf47b581257f94869c59660002bbfa1](https://gist.github.com/syhily/fbf47b581257f94869c59660002bbfa1)_
 (too large to embed)_
   > > You can checkout a brand new branch and run the command below:
   > > ```shell
   > > curl 
https://gist.githubusercontent.com/syhily/fbf47b581257f94869c59660002bbfa1/raw/072f26a8946a753b41be832c54f1c54f342a41e8/FLINK-24733.patch
 | git apply
   > > ```
   > 
   > Thanks. According to your suggestion, I modified the commit to fix the 
`PulsarOrderedPartitionSplitReader` too. Could you help me confirm this PR?
   
   We may need some tests to confirm this issue has been fixed.






[GitHub] [flink] lincoln-lil commented on a change in pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


lincoln-lil commented on a change in pull request #17605:
URL: https://github.com/apache/flink/pull/17605#discussion_r742502402



##
File path: 
flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/planner/runtime/harness/RankHarnessTest.scala
##
@@ -220,4 +221,74 @@ class RankHarnessTest(mode: StateBackendMode) extends 
HarnessTestBase(mode) {
 assertor.assertOutputEqualsSorted("result mismatch", expectedOutput, 
result)
 testHarness.close()
   }
+
+  @Test
+  def testUpdateRankWithRowNumber(): Unit = {

Review comment:
   Thanks for your review, I'll update it.








[GitHub] [flink] lincoln-lil commented on a change in pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


lincoln-lil commented on a change in pull request #17605:
URL: https://github.com/apache/flink/pull/17605#discussion_r742502276



##
File path: 
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/rank/UpdatableTopNFunction.java
##
@@ -312,6 +357,37 @@ private void processElementWithRowNumber(RowData inputRow, Collector<RowData> out
 "Failed to find the sortKey, rowkey in the buffer. This should 
never happen");
 }
 
+private void emitRecordsWithRowNumberIgnoreStateError(
+RowData newRow, int newRank, RankRow oldRow, int oldRank, Collector<RowData> out) {
+Iterator<Map.Entry<RowData, Collection<RowData>>> iterator = buffer.entrySet().iterator();
+int currentRank = 0;
+RowData currentRow = null;
+RowData prevRow = null;
+
+while (iterator.hasNext() && currentRank <= newRank) {
+Map.Entry<RowData, Collection<RowData>> entry = iterator.next();
+Collection<RowData> rowKeys = entry.getValue();
+Iterator<RowData> rowKeyIter = rowKeys.iterator();
+while (rowKeyIter.hasNext()) {
+RowData rowKey = rowKeyIter.next();
+currentRank += 1;
+currentRow = rowKeyMap.get(rowKey).row;
+if (oldRank <= currentRank) {
+if (currentRank == oldRank) {
+collectUpdateBefore(out, oldRow.row, oldRank);
+} else {
+collectUpdateBefore(out, prevRow, currentRank);
+collectUpdateAfter(out, prevRow, currentRank - 1);
+if (currentRank == newRank) {
+collectUpdateAfter(out, newRow, currentRank);
+}
+}
+}
+prevRow = currentRow;
+}
+}
+}

Review comment:
   Cool! It's simpler and more readable. A small change is to emit the UB 
of the old row before the following changes.








[GitHub] [flink] syhily commented on a change in pull request #17643: [FLINK-24733][connector/pulsar] Data loss in pulsar source when using shared mode

2021-11-03 Thread GitBox


syhily commented on a change in pull request #17643:
URL: https://github.com/apache/flink/pull/17643#discussion_r742501504



##
File path: 
flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/source/reader/split/PulsarPartitionSplitReaderBase.java
##
@@ -114,6 +116,9 @@ protected PulsarPartitionSplitReaderBase(
 try {
 Duration timeout = deadline.timeLeftIfAny();
 Message<byte[]> message = pollMessage(timeout);
+if (message == null) {
+break;
+}
 
 // Deserialize message.
 collector.setMessage(message);

Review comment:
   A null message indicates that this poll has exceeded the timeout 
limit, so we should switch to another poll.
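
The poll-until-deadline behavior under discussion can be sketched with stdlib types only (the queue and `pollMessage` here are hypothetical stand-ins for the Pulsar consumer; this is not the connector's actual code):

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class PollLoopSketch {
    // Hypothetical stand-in for consumer.receive(timeout, unit): returns null
    // when the queue is empty, mimicking a poll that timed out.
    static String pollMessage(Queue<String> queue, Duration timeout) {
        return queue.poll();
    }

    // Drains messages until the time budget is used up or a poll times out
    // (returns null). A null message ends this fetch; the reader takes another
    // poll on the next fetch() call instead of busy-looping.
    static List<String> fetch(Queue<String> queue, Duration budget) {
        List<String> collected = new ArrayList<>();
        long deadline = System.nanoTime() + budget.toNanos();
        while (System.nanoTime() < deadline) {
            Duration left = Duration.ofNanos(deadline - System.nanoTime());
            String message = pollMessage(queue, left);
            if (message == null) {
                break; // timed out: stop this fetch rather than spin
            }
            collected.add(message);
        }
        return collected;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>(List.of("m1", "m2"));
        System.out.println(fetch(queue, Duration.ofMillis(500)));
    }
}
```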








[GitHub] [flink] tsreaper commented on a change in pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


tsreaper commented on a change in pull request #17605:
URL: https://github.com/apache/flink/pull/17605#discussion_r742501383



##
File path: 
flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/planner/runtime/harness/RankHarnessTest.scala
##
@@ -220,4 +221,74 @@ class RankHarnessTest(mode: StateBackendMode) extends 
HarnessTestBase(mode) {
 assertor.assertOutputEqualsSorted("result mismatch", expectedOutput, 
result)
 testHarness.close()
   }
+
+  @Test
+  def testUpdateRankWithRowNumber(): Unit = {

Review comment:
   Add more test scenarios.
   1. Calculate top 5 but there are more than 5 candidates.
   2. Sort key drops but ranking does not change.
   3. Sort key drops but does not drop to the last ranking.
   4. Calculate top 5, 7 candidates, previous rank 3 drops to rank 6 (but it is 
still "rank 5").








[GitHub] [flink] lincoln-lil commented on a change in pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


lincoln-lil commented on a change in pull request #17605:
URL: https://github.com/apache/flink/pull/17605#discussion_r742500845



##
File path: flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/rank/UpdatableTopNFunction.java
##
@@ -312,6 +357,37 @@ private void processElementWithRowNumber(RowData inputRow, Collector<RowData> out
                 "Failed to find the sortKey, rowkey in the buffer. This should never happen");
     }
 
+    private void emitRecordsWithRowNumberIgnoreStateError(
+            RowData newRow, int newRank, RankRow oldRow, int oldRank, Collector<RowData> out) {
+        Iterator<Map.Entry<RowData, Collection<RowData>>> iterator = buffer.entrySet().iterator();
+        int currentRank = 0;
+        RowData currentRow = null;
+        RowData prevRow = null;
+
+        while (iterator.hasNext() && currentRank <= newRank) {
+            Map.Entry<RowData, Collection<RowData>> entry = iterator.next();
+            Collection<RowData> rowKeys = entry.getValue();
+            Iterator<RowData> rowKeyIter = rowKeys.iterator();
+            while (rowKeyIter.hasNext()) {
+                RowData rowKey = rowKeyIter.next();
+                currentRank += 1;
+                currentRow = rowKeyMap.get(rowKey).row;
+                if (oldRank <= currentRank) {
+                    if (currentRank == oldRank) {
+                        collectUpdateBefore(out, oldRow.row, oldRank);
+                    } else {
+                        collectUpdateBefore(out, prevRow, currentRank);
+                        collectUpdateAfter(out, prevRow, currentRank - 1);
+                        if (currentRank == newRank) {
+                            collectUpdateAfter(out, newRow, currentRank);
+                        }
+                    }
+                }

Review comment:
   Good catch!  will add case to cover it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17655: [FLINK-24728][tests] Add tests to ensure SQL file sink closes all created files

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17655:
URL: https://github.com/apache/flink/pull/17655#issuecomment-958717470


   
   ## CI report:
   
   * cbb562453f733959c1f7e4ca96f4c74309f1ee81 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25874)
 
   * 503b11ea2c10bef6caf3d4894d416bc19fb0b6c4 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25902)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17650: [FLINK-24475] Remove no longer used NestedMap* classes

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17650:
URL: https://github.com/apache/flink/pull/17650#issuecomment-958638599


   
   ## CI report:
   
   * 150b5f9f464913cc83756cbadd233cdb279db0c7 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25839)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] Zakelly commented on pull request #17650: [FLINK-24475] Remove no longer used NestedMap* classes

2021-11-03 Thread GitBox


Zakelly commented on pull request #17650:
URL: https://github.com/apache/flink/pull/17650#issuecomment-960404229


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tsreaper commented on a change in pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


tsreaper commented on a change in pull request #17605:
URL: https://github.com/apache/flink/pull/17605#discussion_r742497844



##
File path: flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/rank/UpdatableTopNFunction.java
##
@@ -312,6 +357,37 @@ private void processElementWithRowNumber(RowData inputRow, Collector<RowData> out
                 "Failed to find the sortKey, rowkey in the buffer. This should never happen");
     }
 
+    private void emitRecordsWithRowNumberIgnoreStateError(
+            RowData newRow, int newRank, RankRow oldRow, int oldRank, Collector<RowData> out) {
+        Iterator<Map.Entry<RowData, Collection<RowData>>> iterator = buffer.entrySet().iterator();
+        int currentRank = 0;
+        RowData currentRow = null;
+        RowData prevRow = null;
+
+        while (iterator.hasNext() && currentRank <= newRank) {
+            Map.Entry<RowData, Collection<RowData>> entry = iterator.next();
+            Collection<RowData> rowKeys = entry.getValue();
+            Iterator<RowData> rowKeyIter = rowKeys.iterator();
+            while (rowKeyIter.hasNext()) {
+                RowData rowKey = rowKeyIter.next();
+                currentRank += 1;
+                currentRow = rowKeyMap.get(rowKey).row;
+                if (oldRank <= currentRank) {
+                    if (currentRank == oldRank) {
+                        collectUpdateBefore(out, oldRow.row, oldRank);
+                    } else {
+                        collectUpdateBefore(out, prevRow, currentRank);
+                        collectUpdateAfter(out, prevRow, currentRank - 1);
+                        if (currentRank == newRank) {
+                            collectUpdateAfter(out, newRow, currentRank);
+                        }
+                    }
+                }
+                prevRow = currentRow;
+            }
+        }
+    }

Review comment:
   This piece of algorithm seems awkward to me. Consider modifying it to:
   ```java
   while (iterator.hasNext() && currentRank < newRank) {
   // ...
   while (rowKeyIter.hasNext()) {
   // ...
   if (oldRank <= currentRank) {
   collectUpdateBefore(out, currentRow, currentRank);
   collectUpdateAfter(out, currentRow, currentRank - 1);
   }
   }
   }
   collectUpdateBefore(out, oldRow.row, oldRank);
   collectUpdateAfter(out, newRow, newRank);
   ```
   so that there is no `prevRow` thingy. It is misleading to see a `prevRow` and a `currentRank` sent within the same message.
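The shift-up pattern the suggestion aims for can be sketched in a self-contained way (hypothetical simplified types: plain strings stand in for `RowData`, and the change messages are collected into a list instead of sent through a `Collector`). When a row drops from `oldRank` to `newRank`, every row in between moves up one rank, then the moved row itself is retracted and re-emitted:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RankShiftSketch {
    // rows: top-N ordering *before* the update; the row at rank oldRank is
    // replaced by newRow, which now sorts at newRank (newRank >= oldRank).
    // Rows previously ranked oldRank+1 .. newRank each move up by one rank.
    static List<String> emit(List<String> rows, String newRow, int newRank, int oldRank) {
        String oldRow = rows.get(oldRank - 1);
        List<String> out = new ArrayList<>();
        for (int rank = oldRank + 1; rank <= newRank; rank++) {
            String shifted = rows.get(rank - 1);
            out.add("UB(" + shifted + "," + rank + ")");       // update-before at old rank
            out.add("UA(" + shifted + "," + (rank - 1) + ")"); // update-after at new rank
        }
        out.add("UB(" + oldRow + "," + oldRank + ")");
        out.add("UA(" + newRow + "," + newRank + ")");
        return out;
    }

    public static void main(String[] args) {
        // b drops from rank 2 to rank 4; c and d each move up one rank.
        System.out.println(emit(Arrays.asList("a", "b", "c", "d", "e"), "b2", 4, 2));
    }
}
```

Note that when `oldRank == newRank` the loop body never runs, so the method degenerates to a single retract/emit pair — the `oldRank == currentRank` case discussed elsewhere in this thread.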




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tsreaper commented on a change in pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


tsreaper commented on a change in pull request #17605:
URL: https://github.com/apache/flink/pull/17605#discussion_r742494069



##
File path: flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/rank/UpdatableTopNFunction.java
##
@@ -312,6 +357,37 @@ private void processElementWithRowNumber(RowData inputRow, Collector<RowData> out
                 "Failed to find the sortKey, rowkey in the buffer. This should never happen");
     }
 
+    private void emitRecordsWithRowNumberIgnoreStateError(
+            RowData newRow, int newRank, RankRow oldRow, int oldRank, Collector<RowData> out) {
+        Iterator<Map.Entry<RowData, Collection<RowData>>> iterator = buffer.entrySet().iterator();
+        int currentRank = 0;
+        RowData currentRow = null;
+        RowData prevRow = null;
+
+        while (iterator.hasNext() && currentRank <= newRank) {
+            Map.Entry<RowData, Collection<RowData>> entry = iterator.next();
+            Collection<RowData> rowKeys = entry.getValue();
+            Iterator<RowData> rowKeyIter = rowKeys.iterator();
+            while (rowKeyIter.hasNext()) {
+                RowData rowKey = rowKeyIter.next();
+                currentRank += 1;
+                currentRow = rowKeyMap.get(rowKey).row;
+                if (oldRank <= currentRank) {
+                    if (currentRank == oldRank) {
+                        collectUpdateBefore(out, oldRow.row, oldRank);
+                    } else {
+                        collectUpdateBefore(out, prevRow, currentRank);
+                        collectUpdateAfter(out, prevRow, currentRank - 1);
+                        if (currentRank == newRank) {
+                            collectUpdateAfter(out, newRow, currentRank);
+                        }
+                    }
+                }

Review comment:
   It seems that a better solution to this is to not send any message if `oldRank == currentRank`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tsreaper commented on a change in pull request #17605: [FLINK-24704][table] Fix exception when the input record loses monotonicity on the sort key field of UpdatableTopNFunction

2021-11-03 Thread GitBox


tsreaper commented on a change in pull request #17605:
URL: https://github.com/apache/flink/pull/17605#discussion_r742492674



##
File path: flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/rank/UpdatableTopNFunction.java
##
@@ -312,6 +357,37 @@ private void processElementWithRowNumber(RowData inputRow, Collector<RowData> out
                 "Failed to find the sortKey, rowkey in the buffer. This should never happen");
     }
 
+    private void emitRecordsWithRowNumberIgnoreStateError(
+            RowData newRow, int newRank, RankRow oldRow, int oldRank, Collector<RowData> out) {
+        Iterator<Map.Entry<RowData, Collection<RowData>>> iterator = buffer.entrySet().iterator();
+        int currentRank = 0;
+        RowData currentRow = null;
+        RowData prevRow = null;
+
+        while (iterator.hasNext() && currentRank <= newRank) {
+            Map.Entry<RowData, Collection<RowData>> entry = iterator.next();
+            Collection<RowData> rowKeys = entry.getValue();
+            Iterator<RowData> rowKeyIter = rowKeys.iterator();
+            while (rowKeyIter.hasNext()) {
+                RowData rowKey = rowKeyIter.next();
+                currentRank += 1;
+                currentRow = rowKeyMap.get(rowKey).row;
+                if (oldRank <= currentRank) {
+                    if (currentRank == oldRank) {
+                        collectUpdateBefore(out, oldRow.row, oldRank);
+                    } else {
+                        collectUpdateBefore(out, prevRow, currentRank);
+                        collectUpdateAfter(out, prevRow, currentRank - 1);
+                        if (currentRank == newRank) {
+                            collectUpdateAfter(out, newRow, currentRank);
+                        }
+                    }
+                }

Review comment:
   What if `newSortKey > oldSortKey` but `oldRank == currentRank`? The update-after message will be lost. Please also add a test about this after fixing it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-24687) flink-table uber jar should not include flink-connector-files dependency

2021-11-03 Thread Jingsong Lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438460#comment-17438460
 ] 

Jingsong Lee commented on FLINK-24687:
--

I think it should be possible, because the file-related implementations were previously placed in flink-core. After flink-connector-files was introduced, it remained a basic component that other modules depend on; that is why flink-table-common used to rely on flink-connector-files. It's OK to flip the dependency now.

> flink-table uber jar should not include flink-connector-files dependency
> 
>
> Key: FLINK-24687
> URL: https://issues.apache.org/jira/browse/FLINK-24687
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.14.0, 1.13.3
>Reporter: Thomas Weise
>Priority: Major
>
> flink-table-common currently depends on flink-connector-files due to 
> BulkReaderFormatFactory. Since the connector isn't relocated and the jar file 
> is part of the distribution's lib directory, it conflicts with application 
> packaged connector dependencies, including flink-connector-base
> https://lists.apache.org/thread.html/r4345a9ec53b1d5a1c3e4f6143ceb2fa0f950bd92d0266dc24b69a255%40%3Cdev.flink.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-24713) Postpone resourceManager serving after the recovery phase has finished

2021-11-03 Thread Yangze Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438459#comment-17438459
 ] 

Yangze Guo commented on FLINK-24713:


[~aitozi] I second Till's proposal. A configurable interval is more flexible than just waiting for all old TMs. I would suggest giving it a conservative default value so as not to introduce much regression in the job's failover. In our internal environment, we found that most of the old TMs can register back within 1s, so maybe that value would be good as a first step.
Or we can disable this feature by default; users who suffer from this issue can configure it according to their own environment.

> Postpone resourceManager serving after the recovery phase has finished
> --
>
> Key: FLINK-24713
> URL: https://issues.apache.org/jira/browse/FLINK-24713
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0
>Reporter: Aitozi
>Priority: Major
>
> When the ResourceManager is started, the JobManager will connect to the ResourceManager; 
> this means the ResourceManager will begin to try to serve the resource requests 
> from the SlotManager.
> If the ResourceManager fails over, although it will try to recover the pods / 
> containers from the previous attempt, new resource requirements may arrive 
> before the old TaskManagers register to the SlotManager. 
> In this case, it may double the required TaskManagers when the JobManager 
> fails over. We may need a mechanism to postpone ResourceManager serving after 
> the recovery phase has finished.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] flinkbot edited a comment on pull request #17655: [FLINK-24728][tests] Add tests to ensure SQL file sink closes all created files

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17655:
URL: https://github.com/apache/flink/pull/17655#issuecomment-958717470


   
   ## CI report:
   
   * cbb562453f733959c1f7e4ca96f4c74309f1ee81 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25874)
 
   * 503b11ea2c10bef6caf3d4894d416bc19fb0b6c4 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wenlong88 commented on a change in pull request #17509: [FLINK-24501][table-runtime] Stores window progress into state in order to check whether an input element is late or not for wi

2021-11-03 Thread GitBox


wenlong88 commented on a change in pull request #17509:
URL: https://github.com/apache/flink/pull/17509#discussion_r742482235



##
File path: flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/window/slicing/SlicingWindowOperator.java
##
@@ -150,6 +158,8 @@ public void open() throws Exception {
                         getKeyedStateBackend(),
                         collector,
                         getRuntimeContext()));
+        // initialize progress of window processor
+        windowProcessor.advanceProgress(currentWatermark);

Review comment:
   I think we should not call advanceProgress here, because advanceProgress may generate outputs, which is not allowed in open. Maybe just add an initialProgress parameter to the constructor?
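The suggested alternative can be sketched as follows (hypothetical names — `WindowProcessorSketch` and `isLate` are illustrative, not the actual Flink API): the restored watermark is handed to the constructor, where no output can be produced, instead of being fed through `advanceProgress()` during `open()`:

```java
public class WindowProcessorSketch {
    private long currentProgress;

    // Taking the initial progress as a constructor argument is safe during
    // open(): a constructor cannot emit records, unlike advanceProgress().
    public WindowProcessorSketch(long initialProgress) {
        this.currentProgress = initialProgress;
    }

    // An element is late if its timestamp is at or behind the restored progress.
    public boolean isLate(long elementTimestamp) {
        return elementTimestamp <= currentProgress;
    }

    public static void main(String[] args) {
        WindowProcessorSketch processor = new WindowProcessorSketch(1000L);
        System.out.println(processor.isLate(500L));
        System.out.println(processor.isLate(2000L));
    }
}
```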




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-16833) Make JDBCDialect pluggable for JDBC SQL connector

2021-11-03 Thread Lijie Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-16833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438452#comment-17438452
 ] 

Lijie Wang commented on FLINK-16833:


Hi [~MartijnVisser] , sorry for the late reply. I see it's resolved, thanks for 
your great work. 

> Make JDBCDialect pluggable for JDBC SQL connector
> -
>
> Key: FLINK-16833
> URL: https://issues.apache.org/jira/browse/FLINK-16833
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / JDBC, Table SQL / Ecosystem
>Reporter: Jark Wu
>Priority: Minor
>  Labels: auto-deprioritized-major
>
> Currently, we only natively support very limited JDBC dialects in flink-jdbc. 
> However, there are a lot of JDBC drivers in the world. We should expose this 
> ability to users to make it pluggable. 
> Some initial ideas:
>  - expose a connector configuration to accept a JDBCDialect class name.
>  - find supported JDBCDialects via SPI.
> A more detailed design should be proposed for discussion. 
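The SPI idea above can be sketched with Java's `ServiceLoader` (the `JdbcDialect` interface and `canHandle` method here are hypothetical placeholders, not the interface Flink eventually shipped):

```java
import java.util.ServiceLoader;

public class DialectLoader {
    // Hypothetical dialect contract; implementations would be registered in
    // META-INF/services/DialectLoader$JdbcDialect on the classpath.
    public interface JdbcDialect {
        boolean canHandle(String jdbcUrl);
    }

    // Discover registered dialects and pick the first one matching the URL.
    public static JdbcDialect load(String jdbcUrl) {
        for (JdbcDialect dialect : ServiceLoader.load(JdbcDialect.class)) {
            if (dialect.canHandle(jdbcUrl)) {
                return dialect;
            }
        }
        throw new IllegalStateException("No JDBC dialect found for " + jdbcUrl);
    }

    public static void main(String[] args) {
        // With no providers registered, the lookup fails with a clear error.
        try {
            load("jdbc:mysql://localhost:3306/db");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```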



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] mans2singh commented on pull request #17661: [hotfix][connectors][table][jdbc][docs] - Updated key handling section text

2021-11-03 Thread GitBox


mans2singh commented on pull request #17661:
URL: https://github.com/apache/flink/pull/17661#issuecomment-959384274


   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] slinkydeveloper commented on a change in pull request #17658: [FLINK-24684][table-planner] Add to string cast rules using the new CastRule stack

2021-11-03 Thread GitBox


slinkydeveloper commented on a change in pull request #17658:
URL: https://github.com/apache/flink/pull/17658#discussion_r742034054



##
File path: 
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/functions/casting/rules/AbstractCodeGeneratorCastRule.java
##
@@ -148,9 +147,10 @@ protected AbstractCodeGeneratorCastRule(CastRulePredicate 
predicate) {
 classCode)
 .getConstructors()[0]
 .newInstance(constructorArgs);
-} catch (InstantiationException | IllegalAccessException | 
InvocationTargetException e) {
+} catch (Throwable e) {

Review comment:
   IIRC I converted to `Throwable` to also catch `Error`s, not just 
`Exception`s. I can revert if you want, but I think it's better to keep it this 
way: invoking this constructor should never fail, for any reason, and if it does 
fail we need to catch it immediately in order to wrap it with the associated 
generated code.
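   A minimal sketch of the pattern under discussion (illustrative names, not the 
actual `AbstractCodeGeneratorCastRule` code): catching `Throwable` around the 
reflective constructor call so that linkage `Error`s thrown while loading a 
freshly generated class are also wrapped together with the generated code:

   ```java
import java.lang.reflect.Constructor;

public class ReflectiveInstantiation {

    /** Stand-in for a generated class; name and shape are illustrative. */
    public static class Generated {
        public Generated() {}
    }

    public static Object instantiate(Class<?> clazz, String generatedCode, Object... args) {
        try {
            Constructor<?> ctor = clazz.getConstructors()[0];
            return ctor.newInstance(args);
        } catch (Throwable t) {
            // Throwable, not Exception: NoClassDefFoundError and friends from a
            // just-compiled class would otherwise escape without the code attached.
            throw new IllegalStateException(
                    "Cannot instantiate generated class. Code:\n" + generatedCode, t);
        }
    }

    public static void main(String[] args) {
        Object instance = instantiate(Generated.class, "public class Generated {}");
        System.out.println(instance.getClass().getSimpleName()); // prints Generated
    }
}
   ```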

##
File path: 
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/casting/CastRulesTest.java
##
@@ -76,7 +155,77 @@
 
Thread.currentThread().getContextClassLoader()),
 TimestampData.fromLocalDateTime(
 
LocalDateTime.parse("2021-09-24T12:34:56.123456")),
-StringData.fromString("2021-09-24 
14:34:56.123456")),
+StringData.fromString("2021-09-24 
14:34:56.123456"))

Review comment:
   Check now

##
File path: 
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/functions/casting/rules/RowToStringCastRule.java
##
@@ -0,0 +1,185 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.planner.functions.casting.rules;
+
+import org.apache.flink.table.data.ArrayData;
+import org.apache.flink.table.planner.functions.casting.CastCodeBlock;
+import org.apache.flink.table.planner.functions.casting.CastRulePredicate;
+import org.apache.flink.table.planner.functions.casting.CastRuleProvider;
+import org.apache.flink.table.planner.functions.casting.CodeGeneratorCastRule;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.LogicalTypeFamily;
+import org.apache.flink.table.types.logical.LogicalTypeRoot;
+import org.apache.flink.table.types.logical.RowType;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+import static org.apache.flink.table.planner.codegen.CodeGenUtils.className;
+import static org.apache.flink.table.planner.codegen.CodeGenUtils.newName;
+import static 
org.apache.flink.table.planner.codegen.CodeGenUtils.rowFieldReadAccess;
+import static 
org.apache.flink.table.planner.codegen.calls.BuiltInMethods.BINARY_STRING_DATA_FROM_STRING;
+import static 
org.apache.flink.table.planner.functions.casting.rules.CastRuleUtils.constructorCall;
+import static 
org.apache.flink.table.planner.functions.casting.rules.CastRuleUtils.functionCall;
+import static 
org.apache.flink.table.planner.functions.casting.rules.CastRuleUtils.methodCall;
+import static 
org.apache.flink.table.planner.functions.casting.rules.CastRuleUtils.strLiteral;
+
+/** {@link LogicalTypeRoot#ROW} to {@link LogicalTypeFamily#CHARACTER_STRING} 
cast rule. */
+public class RowToStringCastRule extends 
AbstractNullAwareCodeGeneratorCastRule {
+
+public static final RowToStringCastRule INSTANCE = new 
RowToStringCastRule();
+
+private RowToStringCastRule() {
+super(
+CastRulePredicate.builder()
+.predicate(
+(input, target) ->
+input.is(LogicalTypeRoot.ROW)
+&& 
target.is(LogicalTypeFamily.CHARACTER_STRING)
+&& ((RowType) input)
+.getFields().stream()
+.allMatch(
+

[jira] [Comment Edited] (FLINK-24755) sun.misc doesn't exist

2021-11-03 Thread Aitozi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438169#comment-17438169
 ] 

Aitozi edited comment on FLINK-24755 at 11/4/21, 2:01 AM:
--

[~chesnay]  You can also refer to 
[https://stackoverflow.com/questions/46477989/intellij-doesnt-see-some-non-public-jdk-9-classes-during-compilation]
 for some information .

I tested by running {{rm -rf flink-test-utils-parent/flink-test-utils-junit/target}} and 
then running the tests with the {{--release}} option enabled; this reproduces the 
"sun.misc does not exist" error.

Regarding the previous ticket that changed the target to 8, I think it makes 
sense to keep supporting Flink on JDK 8, but can we produce JDK 11 bytecode by 
default and give users an option to build a target-8 artifact (because building 
with JDK 11 and producing JDK 11 bytecode is the standard way; JDK 11 targeting 
8 is the compatibility way)? I think this problem would then disappear.
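A minimal probe of the behavior described above (a sketch, assuming a JDK 11 
toolchain): `sun.misc` classes live in the `jdk.unsupported` module and are 
visible to a plain compile, but `--release 8` checks against the official JDK 8 
API, where they are hidden:

```java
// Compile commands (illustrative):
//   javac -source 8 -target 8 UnsafeProbe.java   // compiles, with an internal-API warning
//   javac --release 8 UnsafeProbe.java           // error: package sun.misc does not exist
public class UnsafeProbe {
    public static void main(String[] args) {
        // Merely referencing the class is enough to trigger the compile-time difference.
        System.out.println(sun.misc.Unsafe.class.getName()); // prints sun.misc.Unsafe
    }
}
```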


was (Author: aitozi):
[~chesnay]  You can also refer to 
[https://stackoverflow.com/questions/46477989/intellij-doesnt-see-some-non-public-jdk-9-classes-during-compilation]
 for some information .

I tested by running {{rm -rf flink-test-utils-parent/flink-test-utils-junit/target}} and 
then running the tests with the {{--release}} option enabled; this reproduces the 
"sun.misc does not exist" error.

 Regarding the previous ticket that changed the target to 8, I think it makes 
sense to keep supporting Flink on JDK 8, but can we produce JDK 11 bytecode by 
default and give users an option to build a target-8 artifact (because JDK 11 
bytecode running on JDK 11 is the standard way)? I think this problem would then disappear.

> sun.misc doesn't exist
> --
>
> Key: FLINK-24755
> URL: https://issues.apache.org/jira/browse/FLINK-24755
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.15.0
>Reporter: Aitozi
>Priority: Major
> Attachments: image-2021-11-03-23-19-58-036.png, 
> image-2021-11-03-23-42-30-441.png
>
>
> After FLINK-24634, the default target is JDK 8. When I run tests from the 
> IDE, some compiler errors come up. I reverted the change and tested again; it 
> works now. It's also fine to build from the command line.
> I cannot quite figure it out; can you give some input, [~chesnay]?
> !image-2021-11-03-23-19-58-036.png|width=1227,height=108!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] wsz94 edited a comment on pull request #17643: [FLINK-24733][connector/pulsar] Data loss in pulsar source when using shared mode

2021-11-03 Thread GitBox


wsz94 edited a comment on pull request #17643:
URL: https://github.com/apache/flink/pull/17643#issuecomment-959583041


   > @wsz94 I have created a patch by using `consumer.receive(timeout, 
timeunit)`. Can you confirm this fix works?
   > 
   > https://gist.github.com/syhily/fbf47b581257f94869c59660002bbfa1
   > 
   > You can checkout a brand new branch and run the command below:
   > 
   > ```shell
   > curl 
https://gist.githubusercontent.com/syhily/fbf47b581257f94869c59660002bbfa1/raw/072f26a8946a753b41be832c54f1c54f342a41e8/FLINK-24733.patch
 | git apply
   > ```
   
   Thanks. According to your suggestion, I modified the commit to also fix 
`PulsarOrderedPartitionSplitReader`. Could you help me confirm this PR? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tsreaper commented on pull request #17520: [FLINK-24565][avro] Port avro file format factory to BulkReaderFormatFactory

2021-11-03 Thread GitBox


tsreaper commented on pull request #17520:
URL: https://github.com/apache/flink/pull/17520#issuecomment-959002265


   @slinkydeveloper @fapaul @JingGe
   
   I've done some benchmarking on a testing yarn cluster.
   
   * Test data: The [Kaggle flight delay 
data](https://www.kaggle.com/usdot/flight-delays), a ~500MB csv file
   * Number of task slots: 8
   * Number of task managers: 1
   * Configuration
   ```yaml
   # common JVM configurations used in a lot of our production job, also for 
producing the TPCDS benchmark result in Flink 1.12
   env.java.opts.jobmanager: -XX:+UseParNewGC 
-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 
-verbose:gc -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC 
-XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=1000 
-XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-XX:SurvivorRatio=5 -XX:ParallelGCThreads=4
   env.java.opts.taskmanager: -XX:+UseParNewGC 
-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 
-verbose:gc -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC 
-XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=1000 
-XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-XX:SurvivorRatio=5 -XX:ParallelGCThreads=4
   
   # default memory configuration of Flink
   jobmanager.memory.process.size: 1600m
   taskmanager.memory.process.size: 1728m
   ```
   
   I've tested three implementations.
   * Bulk Format + Lazy Reading: This is the implementation of this PR.
   * Bulk Format + ArrayList: This implementation reads and deserializes all 
records of the whole block into an array list and sends it to the reader thread. 
This implementation does not have a blocking pool, as @JingGe suggested. See 
[here](https://github.com/tsreaper/flink/commit/3b86337cea499cd4245a34550a6b597239be3066)
 for code.
   * Stream Format: This is the implementation based on Stephan's 
[draft](https://github.com/apache/flink/commit/11c606096f6beeac45c4f4dabe0fde93cc91923d#diff-edfd2d187d920f781382054f22fb4e6e5b5d9361b95a87ebeda68ba3a49d5a55R51).
 See 
[here](https://github.com/tsreaper/flink/commit/6b3a65fd099fcffb4d7a5b20c9bde9aeace18f69)
 for code. I didn't implement projection pushdown for this, but it should be 
fine because there is no projection pushdown in the benchmark.
   
   Here are the test results.
   
   ||xz compression, 64kb block size|xz compression, 2mb block size|
   |---|---|---|
   |bulk format + lazy reading|14s|10s|
   |bulk format + array list|14s|30s, due to GC|
   |stream format|2m24s, due to heavy GC|51s, due to GC|
   
   It is obvious that any implementation that loads all records of a block 
into memory at once will suffer from GC more or less. Also, for smaller block 
sizes, the blocking pool has almost no impact on performance. So I would say the 
implementation in this PR is the best-suited one so far.
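   The two strategies compared above can be sketched roughly like this 
(illustrative code, not the actual Flink formats): the eager variant 
materializes every record of a block on the heap at once, which is what drives 
the GC pressure for large blocks, while the lazy variant keeps at most one 
record live at a time:

   ```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.IntFunction;

public class BlockReading {

    // "Bulk format + ArrayList": deserializes every record of the block at
    // once, so large blocks hold all their records on the heap simultaneously.
    static List<String> readEager(int recordsInBlock, IntFunction<String> deserialize) {
        List<String> records = new ArrayList<>(recordsInBlock);
        for (int i = 0; i < recordsInBlock; i++) {
            records.add(deserialize.apply(i));
        }
        return records;
    }

    // "Bulk format + lazy reading": deserializes only when the consumer asks
    // for the next record, keeping at most one record live at a time.
    static Iterator<String> readLazy(int recordsInBlock, IntFunction<String> deserialize) {
        return new Iterator<String>() {
            private int next = 0;
            public boolean hasNext() { return next < recordsInBlock; }
            public String next() { return deserialize.apply(next++); }
        };
    }

    public static void main(String[] args) {
        IntFunction<String> deser = i -> "record-" + i;
        System.out.println(readEager(3, deser)); // prints [record-0, record-1, record-2]
        Iterator<String> it = readLazy(3, deser);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
   ```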


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #17651: [FLINK-24656][docs] Add documentation for window deduplication

2021-11-03 Thread GitBox


flinkbot commented on pull request #17651:
URL: https://github.com/apache/flink/pull/17651#issuecomment-958639580






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #17662: [FLINK-24747][table-planner] Add producedDataType to SupportsProjectionPushDown.applyProjection

2021-11-03 Thread GitBox


flinkbot commented on pull request #17662:
URL: https://github.com/apache/flink/pull/17662#issuecomment-959309745






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] matriv commented on pull request #17658: [FLINK-24684][table-planner] Add to string cast rules using the new CastRule stack

2021-11-03 Thread GitBox


matriv commented on pull request #17658:
URL: https://github.com/apache/flink/pull/17658#issuecomment-959587919


   > @matriv regarding testing the rules in `CastFunctionsITCase`, I 
effectively found out that the row, array and map rules don't work, because 
some validation steps before (`SqlValidatorImpl` and `LogicalTypeCasts`) think 
the cast combination is invalid. Because the goal of this PR is simply to port 
the logic from `ScalarOperatorGens` to the new rules, perhaps we should tackle 
the issue of "exposing" the rules in the context of 
https://issues.apache.org/jira/browse/FLINK-21456?
   
   
   Yes, of course, my mistake; I thought those casts worked end to end now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] tsreaper edited a comment on pull request #17520: [FLINK-24565][avro] Port avro file format factory to BulkReaderFormatFactory

2021-11-03 Thread GitBox


tsreaper edited a comment on pull request #17520:
URL: https://github.com/apache/flink/pull/17520#issuecomment-959002265






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] JingsongLi commented on pull request #17571: [FLINK-23015][table-runtime-blink] Implement streaming window Deduplicate operator

2021-11-03 Thread GitBox


JingsongLi commented on pull request #17571:
URL: https://github.com/apache/flink/pull/17571#issuecomment-958618670


   Minor: @beyond1920 We don't need to add "blink" in the commit message; the 
Blink planner is the only planner.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-24755) sun.misc doesn't exist

2021-11-03 Thread Aitozi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438169#comment-17438169
 ] 

Aitozi edited comment on FLINK-24755 at 11/4/21, 2:00 AM:
--

[~chesnay]  You can also refer to 
[https://stackoverflow.com/questions/46477989/intellij-doesnt-see-some-non-public-jdk-9-classes-during-compilation]
 for some information .

I tested by running {{rm -rf flink-test-utils-parent/flink-test-utils-junit/target}} and 
then running the tests with the {{--release}} option enabled; this reproduces the 
"sun.misc does not exist" error.

 Regarding the previous ticket that changed the target to 8, I think it makes 
sense to keep supporting Flink on JDK 8, but can we produce JDK 11 bytecode by 
default and give users an option to build a target-8 artifact (because JDK 11 
bytecode running on JDK 11 is the standard way)? I think this problem would then disappear.


was (Author: aitozi):
You can also refer to 
[https://stackoverflow.com/questions/46477989/intellij-doesnt-see-some-non-public-jdk-9-classes-during-compilation]
 for some information .

I tested by running {{rm -rf flink-test-utils-parent/flink-test-utils-junit/target}} and 
then running the tests with the {{--release}} option enabled; this reproduces the 
"sun.misc does not exist" error.

Regarding the previous ticket that changed the target to 8, I think it makes 
sense to keep supporting Flink on JDK 8, but can we produce JDK 11 bytecode by 
default and give users an option to build a target-8 artifact (because JDK 11 
bytecode running on JDK 11 is the standard way)? I think this problem would then disappear.

> sun.misc doesn't exist
> --
>
> Key: FLINK-24755
> URL: https://issues.apache.org/jira/browse/FLINK-24755
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.15.0
>Reporter: Aitozi
>Priority: Major
> Attachments: image-2021-11-03-23-19-58-036.png, 
> image-2021-11-03-23-42-30-441.png
>
>
> After FLINK-24634, the default target is JDK 8. When I run tests from the 
> IDE, some compiler errors come up. I reverted the change and tested again; it 
> works now. It's also fine to build from the command line.
> I cannot quite figure it out; can you give some input, [~chesnay]?
> !image-2021-11-03-23-19-58-036.png|width=1227,height=108!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [flink] nlu90 commented on a change in pull request #17643: [FLINK-24733][connector/pulsar] Data loss in pulsar source when using shared mode

2021-11-03 Thread GitBox


nlu90 commented on a change in pull request #17643:
URL: https://github.com/apache/flink/pull/17643#discussion_r742253290



##
File path: 
flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/source/reader/split/PulsarPartitionSplitReaderBase.java
##
@@ -114,6 +116,9 @@ protected PulsarPartitionSplitReaderBase(
 try {
 Duration timeout = deadline.timeLeftIfAny();
 Message message = pollMessage(timeout);
+if (message == null) {
+break;
+}
 
 // Deserialize message.
 collector.setMessage(message);

Review comment:
   Why break directly after one message polling failure? Should the change 
here be:
   ```
   if (message != null) {
   collector.setMessage(message)
   }
   ```
   
   This way, the for loop will keep trying until one of the exit conditions is met.
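   The alternative the review suggests could be sketched like this (hypothetical 
names, with a plain queue standing in for the Pulsar consumer; not the actual 
connector code): a null poll result is skipped rather than aborting the loop, so 
fetching continues until an exit condition is met:

   ```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class PollLoop {

    static List<String> fetch(Queue<String> source, int maxRecords, int maxPolls) {
        List<String> batch = new ArrayList<>();
        for (int poll = 0; poll < maxPolls && batch.size() < maxRecords; poll++) {
            String message = source.poll(); // stands in for pollMessage(timeout); may return null
            if (message != null) {
                batch.add(message); // only collect real messages
            }
            // a null result falls through: the loop keeps trying until an exit condition is met
        }
        return batch;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>(Arrays.asList("a", "b"));
        System.out.println(fetch(queue, 10, 5)); // prints [a, b]
    }
}
   ```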




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zentol commented on a change in pull request #17608: [FLINK-24550][rpc] Use ContextClassLoader for message deserialization

2021-11-03 Thread GitBox


zentol commented on a change in pull request #17608:
URL: https://github.com/apache/flink/pull/17608#discussion_r742040650



##
File path: 
flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/akka/AkkaInvocationHandler.java
##
@@ -413,7 +413,7 @@ static Object deserializeValueIfNeeded(Object o, Method 
method) {
 if (o instanceof AkkaRpcSerializedValue) {
 try {
 return ((AkkaRpcSerializedValue) o)
-
.deserializeValue(AkkaInvocationHandler.class.getClassLoader());
+
.deserializeValue(Thread.currentThread().getContextClassLoader());

Review comment:
   of course, good catch
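   The fix under discussion can be sketched as follows (simplified; 
`AkkaRpcSerializedValue` internals are not reproduced here): classes are 
resolved against the calling thread's context class loader instead of the 
loader that loaded the RPC framework itself, so user-code classes living in a 
different class loader can be found during deserialization:

   ```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

public class ContextClassLoaderDeserializer {

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        // Resolve against the context class loader, not the framework's own loader.
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes)) {
                    @Override
                    protected Class<?> resolveClass(ObjectStreamClass desc)
                            throws IOException, ClassNotFoundException {
                        return Class.forName(desc.getName(), false, cl);
                    }
                }) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject("payload");
        }
        System.out.println(deserialize(bos.toByteArray())); // prints payload
    }
}
   ```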




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] CrynetLogistics commented on pull request #17345: [FLINK-24227][connectors] Added Kinesis Data Streams Sink i…

2021-11-03 Thread GitBox


CrynetLogistics commented on pull request #17345:
URL: https://github.com/apache/flink/pull/17345#issuecomment-959763267






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] slinkydeveloper commented on a change in pull request #17656: [FLINK-24741]Deprecate FileRecordFormat

2021-11-03 Thread GitBox


slinkydeveloper commented on a change in pull request #17656:
URL: https://github.com/apache/flink/pull/17656#discussion_r742125995



##
File path: 
flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/impl/FileRecordFormatAdapter.java
##
@@ -37,7 +37,12 @@
 import static org.apache.flink.util.Preconditions.checkArgument;
 import static org.apache.flink.util.Preconditions.checkNotNull;
 
-/** The FormatReaderAdapter turns a {@link FileRecordFormat} into a {@link 
BulkFormat}. */
+/**
+ * This interface is Deprecated, use {@link StreamFormatAdapter} instead.

Review comment:
   Same

##
File path: 
flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/reader/FileRecordFormat.java
##
@@ -34,7 +34,14 @@
 import java.io.Serializable;
 
 /**
- * A reader format that reads individual records from a file.
+ * This interface is Deprecated, use {@link StreamFormat} instead. The main 
motivation for removing

Review comment:
   Same

##
File path: 
flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSource.java
##
@@ -181,12 +178,15 @@ private FileSource(
 }
 
 /**
- * Builds a new {@code FileSource} using a {@link FileRecordFormat} to 
read record-by-record
+ * This method is Deprecated, use {@link 
#forRecordStreamFormat(StreamFormat, Path...)} instead.

Review comment:
   Please use the javadoc `@deprecated` tag




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17525: [FLINK-24597][state] fix the problem that RocksStateKeysAndNamespaceIterator would return duplicate data

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17525:
URL: https://github.com/apache/flink/pull/17525#issuecomment-947380227






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #17660: [FLINK-24657][runtime] Added metric of the total real size of input/o…

2021-11-03 Thread GitBox


flinkbot commented on pull request #17660:
URL: https://github.com/apache/flink/pull/17660#issuecomment-958945516






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #17649: [FLINK-24742][table][docs] Add info about SQL client key strokes to docs

2021-11-03 Thread GitBox


flinkbot edited a comment on pull request #17649:
URL: https://github.com/apache/flink/pull/17649#issuecomment-958288236


   
   ## CI report:
   
   * 7081987de10cb9601d16a1292862fb062cf002d5 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=25830)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



