[jira] [Assigned] (FLINK-10866) Queryable state can prevent cluster from starting
[ https://issues.apache.org/jira/browse/FLINK-10866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BoWang reassigned FLINK-10866: Assignee: BoWang

> Queryable state can prevent cluster from starting
> Key: FLINK-10866
> URL: https://issues.apache.org/jira/browse/FLINK-10866
> Project: Flink
> Issue Type: Improvement
> Components: Local Runtime
> Affects Versions: 1.5.5, 1.6.2, 1.7.0
> Reporter: Till Rohrmann
> Assignee: BoWang
> Priority: Critical
> Fix For: 1.8.0
>
> The {{KvStateServerImpl}} can currently prevent the {{TaskExecutor}} from starting.
> Currently, the QS server starts by default on port {{9067}}. If this port is not free, it fails and stops the whole initialization of the {{TaskExecutor}}. I think the QS server should not stop the {{TaskExecutor}} from starting.
> We should at least change the default port to {{0}} to avoid port conflicts. However, this will break all setups which don't explicitly set the QS port, because the port then either needs to be set up explicitly or extracted from the logs.
> Additionally, we should think about whether a QS server startup failure should lead to a {{TaskExecutor}} failure or simply be logged. Both approaches have pros and cons. Currently, a failing QS server will also affect users who don't want to use QS. If we tolerate failures in the QS server, then a user who wants to use QS might run into problems with state not being reachable.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
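The port-conflict discussion above boils down to letting the OS pick a free port. A minimal, hypothetical Java sketch (plain java.net, not Flink's actual {{KvStateServerImpl}}) showing that binding to port {{0}} yields an OS-assigned ephemeral port, which would then need to be logged or advertised:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortDemo {

    // Binding to port 0 asks the OS for any free port, which avoids the
    // conflicts a fixed default like 9067 can cause.
    static int bindToFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            // The actual port is only known after binding; a real server
            // would log or advertise it so clients can discover it.
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bound ephemeral port: " + bindToFreePort());
    }
}
```

This also illustrates the trade-off raised in the issue: with port 0 the port is no longer predictable, so setups that relied on the fixed default must read it from the logs or configure one explicitly.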
[jira] [Commented] (FLINK-10866) Queryable state can prevent cluster from starting
[ https://issues.apache.org/jira/browse/FLINK-10866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733902#comment-16733902 ] BoWang commented on FLINK-10866: Hi, has anyone taken this JIRA? If not, I would like to take it.
[GitHub] yanghua commented on issue #7405: [FLINK-11249] FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer
yanghua commented on issue #7405: [FLINK-11249] FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer URL: https://github.com/apache/flink/pull/7405#issuecomment-451369403 cc @pnowojski and @aljoscha This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Fokko commented on issue #7407: [FLINK-11260] Bump the Janino compiler to 3.0.11
Fokko commented on issue #7407: [FLINK-11260] Bump the Janino compiler to 3.0.11 URL: https://github.com/apache/flink/pull/7407#issuecomment-451364596 It passed on my fork: https://travis-ci.org/Fokko/flink/builds/474918896
[GitHub] Fokko commented on issue #7406: [FLINK-11259] Bump Zookeeper to 3.4.13
Fokko commented on issue #7406: [FLINK-11259] Bump Zookeeper to 3.4.13 URL: https://github.com/apache/flink/pull/7406#issuecomment-451364495 The error looks related, let me dive into it.
[jira] [Assigned] (FLINK-11264) Fail to start scala shell
[ https://issues.apache.org/jira/browse/FLINK-11264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shuiqiangchen reassigned FLINK-11264: Assignee: shuiqiangchen

> Fail to start scala shell
> Key: FLINK-11264
> URL: https://issues.apache.org/jira/browse/FLINK-11264
> Project: Flink
> Issue Type: Bug
> Components: Scala Shell
> Affects Versions: 1.7.1
> Reporter: Jeff Zhang
> Assignee: shuiqiangchen
> Priority: Major
>
> {code:java}
> ➜ build-target git:(master) ✗ bin/start-scala-shell.sh local
> bin/start-scala-shell.sh: line 57: cd: ../bin/opt: No such file or directory
> Starting Flink Shell:
> log4j:WARN No appenders could be found for logger (org.apache.flink.configuration.GlobalConfiguration).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> Starting local Flink cluster (host: localhost, port: 8083).
> Connecting to Flink cluster (host: localhost, port: 8083).
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/table/api/TableEnvironment$
> at org.apache.flink.api.scala.FlinkILoop.<init>(FlinkILoop.scala:103)
> at org.apache.flink.api.scala.FlinkILoop.<init>(FlinkILoop.scala:57)
> at org.apache.flink.api.scala.FlinkShell$.liftedTree1$1(FlinkShell.scala:211)
> at org.apache.flink.api.scala.FlinkShell$.startShell(FlinkShell.scala:197)
> at org.apache.flink.api.scala.FlinkShell$.main(FlinkShell.scala:135)
> at org.apache.flink.api.scala.FlinkShell.main(FlinkShell.scala)
> Caused by: java.lang.ClassNotFoundException: org.apache.flink.table.api.TableEnvironment$
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 6 more{code}
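The {{NoClassDefFoundError}} above names {{TableEnvironment$}} — the trailing {{$}} denotes a Scala object's companion class — which suggests the flink-table jar (shipped in the distribution's {{opt/}} directory at that time) never made it onto the shell's classpath, consistent with the failed {{cd ../bin/opt}}. A small, hypothetical Java probe for diagnosing this kind of missing-jar problem:

```java
public class ClasspathProbe {

    // Returns true if the named class is loadable from the current
    // classpath; a trailing '$' names a Scala object's companion class.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isOnClasspath("java.lang.String"));
        System.out.println(isOnClasspath("org.apache.flink.table.api.TableEnvironment$"));
    }
}
```

If the second probe prints false in an environment that should have flink-table, the jar is missing from the classpath, matching the symptom in this report.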
[jira] [Commented] (FLINK-11230) Sum of FlinkSql after two table union all.The value is too large.
[ https://issues.apache.org/jira/browse/FLINK-11230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733806#comment-16733806 ] jiwei commented on FLINK-11230: [~hequn8128], it may be useful for me. I will try it. Thanks for your advice!

> Sum of FlinkSql after two table union all. The value is too large.
> Key: FLINK-11230
> URL: https://issues.apache.org/jira/browse/FLINK-11230
> Project: Flink
> Issue Type: Bug
> Components: Table API SQL
> Affects Versions: 1.7.0
> Reporter: jiwei
> Priority: Blocker
> Labels: test
> Attachments: image-2019-01-02-14-18-33-890.png, image-2019-01-02-14-18-43-710.png, screenshot-1.png
>
> SELECT k AS KEY, SUM(p) AS pv
> FROM (
>   SELECT tumble_start(stime, INTERVAL '1' minute) AS k, COUNT(*) AS p
>   FROM flink_test1
>   GROUP BY tumble(stime, INTERVAL '1' minute)
>   UNION ALL
>   SELECT tumble_start(stime, INTERVAL '1' minute) AS k, COUNT(*) AS p
>   FROM flink_test2
>   GROUP BY tumble(stime, INTERVAL '1' minute)
> ) t
> GROUP BY k
>
> The result of executing this SQL is about 7000 per minute and keeps increasing, but the result is 60 per minute for each table. Is there an error in my SQL statement?
[jira] [Commented] (FLINK-11263) yarn session script doesn't boot up task managers
[ https://issues.apache.org/jira/browse/FLINK-11263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733813#comment-16733813 ] Hongtao Zhang commented on FLINK-11263: Hi [~Tison] [~csq], so currently the per-job mode has the same behavior as the yarn session, other than pre-starting the dispatcher and resource manager. The question is: if I want to deploy a job to the cluster, say the job needs a parallelism of 6 and each TM only has 4 slots, how does Flink handle this? Will Flink create two TMs, since one can't meet the need?

> yarn session script doesn't boot up task managers
> Key: FLINK-11263
> URL: https://issues.apache.org/jira/browse/FLINK-11263
> Project: Flink
> Issue Type: Bug
> Components: Cluster Management
> Affects Versions: 1.6.3
> Environment: Flink 1.6.3, Hadoop 2.7.5
> Reporter: Hongtao Zhang
> Priority: Critical
>
> {{./bin/yarn-session.sh -n 4 -jm 1024m -tm 4096m}}
> I want to boot up a YARN session cluster using the command above, but Flink doesn't create the task executors, only the application master (YarnSessionClusterEntryPoint).
> The task executors will be created when a job is submitted.
> The point is that the YARN ResourceManager doesn't know how many task executors should be created, because the -n option is not accepted by the ResourceManager.
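For the slot question above: assuming each TaskManager offers a fixed number of slots, the number of TaskManagers the ResourceManager must request is a simple ceiling division. A sketch under that assumption, not Flink's actual scheduling code:

```java
public class SlotMath {

    // Ceiling division: how many TaskManagers are needed to cover the
    // requested parallelism when each TM provides slotsPerTaskManager slots.
    static int taskManagersNeeded(int parallelism, int slotsPerTaskManager) {
        return (parallelism + slotsPerTaskManager - 1) / slotsPerTaskManager;
    }

    public static void main(String[] args) {
        // The case from the comment: parallelism 6, 4 slots per TM.
        System.out.println(taskManagersNeeded(6, 4));
    }
}
```

So under that assumption, a parallelism of 6 with 4 slots per TM needs two TaskManagers, since one cannot cover the demand on its own.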
[jira] [Commented] (FLINK-11263) yarn session script doesn't boot up task managers
[ https://issues.apache.org/jira/browse/FLINK-11263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733757#comment-16733757 ] shuiqiangchen commented on FLINK-11263: Hi HongTao, when the yarn session is started, the Flink cluster deployed in YARN containers will only start up a dispatcher, which is currently called JobManager — a misleading name. Only when a job is submitted will the dispatcher start a JobMaster and request TaskManager slots from the ResourceManager. For more detail please refer to FLIP-6: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077 Best regards.
[jira] [Commented] (FLINK-11263) yarn session script doesn't boot up task managers
[ https://issues.apache.org/jira/browse/FLINK-11263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733749#comment-16733749 ] TisonKun commented on FLINK-11263: [~hongtao12310] It's an unimplemented feature, see also FLINK-11078. I'm working on it, and what you see for now is how Flink (after FLIP-6) currently works.
[jira] [Updated] (FLINK-11264) Fail to start scala shell
[ https://issues.apache.org/jira/browse/FLINK-11264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated FLINK-11264: Component/s: Scala Shell
[jira] [Updated] (FLINK-11264) Fail to start scala shell
[ https://issues.apache.org/jira/browse/FLINK-11264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated FLINK-11264: Affects Version/s: 1.7.1
[jira] [Created] (FLINK-11264) Fail to start scala shell
Jeff Zhang created FLINK-11264: Summary: Fail to start scala shell Key: FLINK-11264 URL: https://issues.apache.org/jira/browse/FLINK-11264 Project: Flink Issue Type: Bug Reporter: Jeff Zhang
[jira] [Updated] (FLINK-11263) yarn session script doesn't boot up task managers
[ https://issues.apache.org/jira/browse/FLINK-11263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hongtao Zhang updated FLINK-11263: Priority: Critical (was: Major)
[jira] [Comment Edited] (FLINK-10333) Rethink ZooKeeper based stores (SubmittedJobGraph, MesosWorker, CompletedCheckpoints)
[ https://issues.apache.org/jira/browse/FLINK-10333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16729516#comment-16729516 ] TisonKun edited comment on FLINK-10333 at 1/4/19 2:20 AM: Following the ZK transaction idea, I drafted a design doc on how we rework ZK based stores to achieve atomicity. Mainly we will re-implement the leader election service to get the {{election-node-path}}. It could be done by porting the implementation of {{LeaderLatch}} in Curator to Flink. Then we generate the dispatcher/jm/rm's {{session id}} based on the {{election-node-path}}, and when communicating with ZK we pass the {{session id}}, which is underneath converted to the {{election-node-path}} to ensure that the caller is the leader, just as in the code snippet above. Subtasks can be separated as:
# re-layout zookeeper content
# re-implement ZK leader election, expose the election node path, which can be converted to a session id
# re-implement the ZK submitted job graph store, r/w with a session id (the session id is then converted to the election node path)
# re-implement the ZK completed checkpoint store, r/w with a session id (the session id is then converted to the election node path)
# let only the dispatcher maintain the running job registry
What do you think? [~till.rohrmann] [~StephanEwen]

> Rethink ZooKeeper based stores (SubmittedJobGraph, MesosWorker, CompletedCheckpoints)
> Key: FLINK-10333
> URL: https://issues.apache.org/jira/browse/FLINK-10333
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination
> Affects Versions: 1.5.3, 1.6.0, 1.7.0
> Reporter: Till Rohrmann
> Priority: Major
> Fix For: 1.8.0
>
> While going over the ZooKeeper based stores ({{ZooKeeperSubmittedJobGraphStore}}, {{ZooKeeperMesosWorkerStore}}, {{ZooKeeperCompletedCheckpointStore}}) and the underlying {{ZooKeeperStateHandleStore}} I noticed several inconsistencies which were introduced with past incremental changes.
> * Depending on whether {{ZooKeeperStateHandleStore#getAllSortedByNameAndLock}} or {{ZooKeeperStateHandleStore#getAllAndLock}} is called, deserialization problems will either lead to removing the Znode or not
> * {{ZooKeeperStateHandleStore}} leaves inconsistent state in case of exceptions (e.g. {{#getAllAndLock}} won't release the acquired locks in case of a failure)
> * {{ZooKeeperStateHandleStore}} has too many responsibilities. It would be better to move {{RetrievableStateStorageHelper}} out of it for a better separation of concerns
> * {{ZooKeeperSubmittedJobGraphStore}} overwrites a stored {{JobGraph}} even if it is locked. This should not happen since it could leave another system in an inconsistent state (imagine a changed {{JobGraph}} which restores from an old checkpoint)
> * Redundant but also somewhat inconsistent put logic in the different stores
> * Shadowing of ZooKeeper specific exceptions in {{ZooKeeperStateHandleStore}} which were expected to be caught in {{ZooKeeperSubmittedJobGraphStore}}
> * Getting rid of the {{SubmittedJobGraphListener}} would be helpful
> These problems made me think how reliably these components actually work. Since these components are very important, I propose to refactor them.
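The session-id scheme discussed above — every store access carries a token derived from the election node path, and writes from a stale leader are rejected — can be sketched independently of ZooKeeper and Curator. All names here are hypothetical illustrations, not Flink code:

```java
import java.util.HashMap;
import java.util.Map;

public class LeaderFencedStore {

    // Hypothetical stand-in for the session id derived from the
    // election-node-path of the current leader.
    private String currentLeaderSessionId;
    private final Map<String, String> store = new HashMap<>();

    void grantLeadership(String sessionId) {
        this.currentLeaderSessionId = sessionId;
    }

    // Writes succeed only when the caller's session id matches the current
    // leader's, giving the fencing the ZK transaction idea aims for.
    boolean put(String sessionId, String key, String value) {
        if (!sessionId.equals(currentLeaderSessionId)) {
            return false; // stale leader: reject instead of corrupting state
        }
        store.put(key, value);
        return true;
    }

    String get(String key) {
        return store.get(key);
    }
}
```

In the real design the check-and-write would happen atomically inside a ZooKeeper transaction; this sketch only shows the fencing logic itself.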
[jira] [Created] (FLINK-11263) yarn session script doesn't boot up task managers
Hongtao Zhang created FLINK-11263: Summary: yarn session script doesn't boot up task managers Key: FLINK-11263 URL: https://issues.apache.org/jira/browse/FLINK-11263 Project: Flink Issue Type: Bug Components: Cluster Management Affects Versions: 1.6.3 Environment: Flink 1.6.3, Hadoop 2.7.5 Reporter: Hongtao Zhang
[GitHub] KarmaGYZ commented on issue #7380: [hotfix][docs] Fix incorrect example in cep doc
KarmaGYZ commented on issue #7380: [hotfix][docs] Fix incorrect example in cep doc URL: https://github.com/apache/flink/pull/7380#issuecomment-451328811 @dawidwys Thanks for the review! I've added a new commit fixing the problems, and it now passes Travis CI.
[GitHub] HuangZhenQiu commented on issue #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager
HuangZhenQiu commented on issue #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager URL: https://github.com/apache/flink/pull/7356#issuecomment-451305740 @suez1224 Thanks for the input. I will revise the PR tonight.
[GitHub] HuangZhenQiu commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager
HuangZhenQiu commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager URL: https://github.com/apache/flink/pull/7356#discussion_r245161387

File path: flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java

@@ -381,11 +393,20 @@ public void onContainersAllocated(List<Container> containers) {
 } catch (Throwable t) {
 log.error("Could not start TaskManager in container {}.", container.getId(), t);
+ failedContainerSoFar.getAndAdd(1);
 // release the failed container
 workerNodeMap.remove(resourceId);
 resourceManagerClient.releaseAssignedContainer(container.getId());
- // and ask for a new one
- requestYarnContainerIfRequired(container.getResource(), container.getPriority());
+
+ if (failedContainerSoFar.intValue() < maxFailedContainers) {

Review comment: Yes, you are right. Will change it as well.
[GitHub] HuangZhenQiu commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager
HuangZhenQiu commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager URL: https://github.com/apache/flink/pull/7356#discussion_r245161384

File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java

@@ -145,6 +146,9 @@
 /** All registered listeners for status updates of the ResourceManager. */
 private ConcurrentMap infoMessageListeners;
+
+ /** The number of failed containers since the master became active. */
+ protected AtomicInteger failedContainerSoFar = new AtomicInteger(0);

Review comment: There are three code paths that can produce a failed container. One is a container that can't be started by YARN, which is notified via the NMClient callback; the other two are failures of registerTaskManager and disconnectTaskManager (when tasks fail). Thus, the concurrency is between the akka thread and the main thread?
[GitHub] suez1224 commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager
suez1224 commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager URL: https://github.com/apache/flink/pull/7356#discussion_r245156585

File path: flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java

@@ -381,11 +393,20 @@ public void onContainersAllocated(List<Container> containers) {
 } catch (Throwable t) {
 log.error("Could not start TaskManager in container {}.", container.getId(), t);
+ failedContainerSoFar.getAndAdd(1);
 // release the failed container
 workerNodeMap.remove(resourceId);
 resourceManagerClient.releaseAssignedContainer(container.getId());
- // and ask for a new one
- requestYarnContainerIfRequired(container.getResource(), container.getPriority());
+
+ if (failedContainerSoFar.intValue() < maxFailedContainers) {

Review comment: I think we need to fail the cluster when failedContainerSoFar exceeds the threshold after line 362 as well, right?
[GitHub] suez1224 commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager
suez1224 commented on a change in pull request #7356: [FLINK-10868][flink-yarn] Enforce maximum failed TMs in YarnResourceManager URL: https://github.com/apache/flink/pull/7356#discussion_r245151839

File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java

@@ -145,6 +146,9 @@
 /** All registered listeners for status updates of the ResourceManager. */
 private ConcurrentMap infoMessageListeners;
+
+ /** The number of failed containers since the master became active. */
+ protected AtomicInteger failedContainerSoFar = new AtomicInteger(0);

Review comment: Do we need Atomic here? Is there concurrent access?
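The review thread above revolves around counting container failures, possibly across threads, and failing the cluster once a configured maximum is reached. A standalone sketch of that pattern (field names follow the PR's discussion, but this is not the actual YarnResourceManager code):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ContainerFailureTracker {

    private final int maxFailedContainers;
    // AtomicInteger because, as discussed above, failures may be reported
    // from more than one thread (e.g. NMClient callbacks vs. the rpc main thread).
    private final AtomicInteger failedContainersSoFar = new AtomicInteger(0);

    ContainerFailureTracker(int maxFailedContainers) {
        this.maxFailedContainers = maxFailedContainers;
    }

    // Records one container failure. Returns true if a replacement container
    // may still be requested, false if the threshold is reached and the
    // cluster should be failed.
    boolean recordFailureAndCheck() {
        return failedContainersSoFar.incrementAndGet() < maxFailedContainers;
    }
}
```

Note that a single incrementAndGet both counts and reads atomically; a separate getAndAdd followed by a later intValue, as in the hunk above, could let two concurrent failures both observe a value below the threshold.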
[GitHub] stevenzwu commented on issue #7020: [FLINK-10774] [Kafka] connection leak when partition discovery is disabled an…
stevenzwu commented on issue #7020: [FLINK-10774] [Kafka] connection leak when partition discovery is disabled an… URL: https://github.com/apache/flink/pull/7020#issuecomment-451278571 @tillrohrmann can you take a look at the latest update?
[GitHub] zentol closed pull request #7342: [FLINK-11023][ES] Add NOTICE file
zentol closed pull request #7342: [FLINK-11023][ES] Add NOTICE file URL: https://github.com/apache/flink/pull/7342 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/flink-connectors/flink-sql-connector-elasticsearch6/pom.xml b/flink-connectors/flink-sql-connector-elasticsearch6/pom.xml index 85b73a9c837..7438566105c 100644 --- a/flink-connectors/flink-sql-connector-elasticsearch6/pom.xml +++ b/flink-connectors/flink-sql-connector-elasticsearch6/pom.xml @@ -115,6 +115,7 @@ under the License. META-INF/services/com.fasterxml.** META-INF/services/org.apache.lucene.** META-INF/services/org.elasticsearch.** + META-INF/LICENSE.txt diff --git a/flink-connectors/flink-sql-connector-elasticsearch6/src/main/resources/META-INF/NOTICE b/flink-connectors/flink-sql-connector-elasticsearch6/src/main/resources/META-INF/NOTICE new file mode 100644 index 000..9880ee926b3 --- /dev/null +++ b/flink-connectors/flink-sql-connector-elasticsearch6/src/main/resources/META-INF/NOTICE @@ -0,0 +1,45 @@ +flink-sql-connector-elasticsearch6 +Copyright 2014-2018 The Apache Software Foundation + +This product includes software developed at +The Apache Software Foundation (http://www.apache.org/). + +This project bundles the following dependencies under the Apache Software License 2.0. 
(http://www.apache.org/licenses/LICENSE-2.0.txt) + +- com.fasterxml.jackson.core:jackson-core:2.8.10 +- com.fasterxml.jackson.dataformat:jackson-dataformat-cbor:2.8.10 +- com.fasterxml.jackson.dataformat:jackson-dataformat-smile:2.8.10 +- com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.8.10 +- commons-codec:commons-codec:1.10 +- commons-logging:commons-logging:1.1.3 +- org.apache.httpcomponents:httpasyncclient:4.1.2 +- org.apache.httpcomponents:httpclient:4.5.3 +- org.apache.httpcomponents:httpcore:4.4.6 +- org.apache.httpcomponents:httpcore-nio:4.4.5 +- org.apache.logging.log4j:log4j-api:2.9.1 +- org.apache.logging.log4j:log4j-to-slf4j:2.9.1 +- org.apache.lucene:lucene-analyzers-common:7.3.1 +- org.apache.lucene:lucene-backward-codecs:7.3.1 +- org.apache.lucene:lucene-core:7.3.1 +- org.apache.lucene:lucene-grouping:7.3.1 +- org.apache.lucene:lucene-highlighter:7.3.1 +- org.apache.lucene:lucene-join:7.3.1 +- org.apache.lucene:lucene-memory:7.3.1 +- org.apache.lucene:lucene-misc:7.3.1 +- org.apache.lucene:lucene-queries:7.3.1 +- org.apache.lucene:lucene-queryparser:7.3.1 +- org.apache.lucene:lucene-sandbox:7.3.1 +- org.apache.lucene:lucene-spatial:7.3.1 +- org.apache.lucene:lucene-spatial-extras:7.3.1 +- org.apache.lucene:lucene-spatial3d:7.3.1 +- org.apache.lucene:lucene-suggest:7.3.1 +- org.elasticsearch:elasticsearch:6.3.1 +- org.elasticsearch:elasticsearch-cli:6.3.1 +- org.elasticsearch:elasticsearch-core:6.3.1 +- org.elasticsearch:elasticsearch-secure-sm:6.3.1 +- org.elasticsearch:elasticsearch-x-content:6.3.1 +- org.elasticsearch.client:elasticsearch-rest-client:6.3.1 +- org.elasticsearch.client:elasticsearch-rest-high-level-client:6.3.1 +- org.elasticsearch.plugin:aggs-matrix-stats-client:6.3.1 +- org.elasticsearch.plugin:parent-join-client:6.3.1 +- org.elasticsearch.plugin:rank-eval-client:6.3.1 This is an automated message from the Apache Git Service. 
[jira] [Closed] (FLINK-11079) Skip deployment for flink-storm-examples
[ https://issues.apache.org/jira/browse/FLINK-11079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-11079. Resolution: Fixed master: 9e0ea45645b242c1b13f59c684fbe9d47eaeda06 1.7: 7679a84f5cae29ae429ba5ef867f883e0acf38aa 1.6: 437f06bc6d99e101ff2d62bf4b829f038e0c5d9d > Skip deployment for flink-storm-examples > --- > > Key: FLINK-11079 > URL: https://issues.apache.org/jira/browse/FLINK-11079 > Project: Flink > Issue Type: Bug > Components: Storm Compatibility >Affects Versions: 1.5.5, 1.6.2, 1.7.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.6.4, 1.7.2, 1.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Similar to FLINK-10987 we should also update the {{LICENSE}} and {{NOTICE}} > for {{flink-storm-examples}}. > > This project creates several fat example jars that are deployed to maven > central. > Alternatively we could think about dropping these examples. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11194) missing Scala 2.12 build of HBase connector
[ https://issues.apache.org/jira/browse/FLINK-11194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-11194. Resolution: Fixed Fix Version/s: 1.8.0 1.7.2 master: 7a0a525504ec4b5e82bbdb3138f6ee4170b48350 1.7: bb560c55c662675611a4921ff38da27cc435426d > missing Scala 2.12 build of HBase connector > > > Key: FLINK-11194 > URL: https://issues.apache.org/jira/browse/FLINK-11194 > Project: Flink > Issue Type: Bug > Components: Batch Connectors and Input/Output Formats, Build System >Affects Versions: 1.7.0 > Environment: Scala version 2.12.7 > Flink version 1.7.0 >Reporter: Zhenhao Li >Assignee: Chesnay Schepler >Priority: Major > Labels: artifact, build, hbase, pull-request-available, scala > Fix For: 1.7.2, 1.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > See the following SBT log. > ``` > [error] (update) sbt.librarymanagement.ResolveException: unresolved > dependency: org.apache.flink#flink-hbase_2.12;1.7.0: Resolution failed > several times for dependency: org.apache.flink#flink-hbase_2.12;1.7.0 > \{compile=[default(compile)]}:: > [error] java.text.ParseException: inconsistent module descriptor file > found in > 'https://repo1.maven.org/maven2/org/apache/flink/flink-hbase_2.12/1.7.0/flink-hbase_2.12-1.7.0.pom': > bad module name: expected='flink-hbase_2.12' found='flink-hbase_2.11'; > [error] java.text.ParseException: inconsistent module descriptor file > found in > 'https://repo1.maven.org/maven2/org/apache/flink/flink-hbase_2.12/1.7.0/flink-hbase_2.12-1.7.0.pom': > bad module name: expected='flink-hbase_2.12' found='flink-hbase_2.11'; > ``` -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] zentol closed pull request #7336: [FLINK-11194][hbase] Use type instead of classifier
zentol closed pull request #7336: [FLINK-11194][hbase] Use type instead of classifier URL: https://github.com/apache/flink/pull/7336 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/flink-connectors/flink-hbase/pom.xml b/flink-connectors/flink-hbase/pom.xml index 75b9470f45c..2ab0ec4c2aa 100644 --- a/flink-connectors/flink-hbase/pom.xml +++ b/flink-connectors/flink-hbase/pom.xml @@ -49,19 +49,6 @@ under the License. 1 - - org.apache.maven.plugins - maven-shade-plugin - - - - - - shade-flink - none - - - @@ -255,7 +242,7 @@ under the License. org.apache.hbase hbase-server ${hbase.version} - tests + test-jar test This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] zentol closed pull request #7343: [FLINK-11079][storm] Skip deployment of storm-examples
zentol closed pull request #7343: [FLINK-11079][storm] Skip deployment of storm-examples URL: https://github.com/apache/flink/pull/7343 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/flink-contrib/flink-storm-examples/pom.xml b/flink-contrib/flink-storm-examples/pom.xml
index 42038cd1e11..47323c05b70 100644
--- a/flink-contrib/flink-storm-examples/pom.xml
+++ b/flink-contrib/flink-storm-examples/pom.xml
@@ -111,6 +111,14 @@ under the License.

+			<plugin>
+				<groupId>org.apache.maven.plugins</groupId>
+				<artifactId>maven-deploy-plugin</artifactId>
+				<configuration>
+					<skip>true</skip>
+				</configuration>
+			</plugin>
+
 			<plugin>
 				<groupId>org.apache.maven.plugins</groupId>

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] tillrohrmann commented on a change in pull request #7342: [FLINK-11023][ES] Add NOTICE file
tillrohrmann commented on a change in pull request #7342: [FLINK-11023][ES] Add NOTICE file URL: https://github.com/apache/flink/pull/7342#discussion_r245084804 ## File path: flink-connectors/flink-sql-connector-elasticsearch6/src/main/resources/META-INF/NOTICE ## @@ -0,0 +1,45 @@ +flink-sql-connector-elasticsearch6 +Copyright 2014-2018 The Apache Software Foundation + +This product includes software developed at +The Apache Software Foundation (http://www.apache.org/). + +This project bundles the following dependencies under the Apache Software License 2.0. (http://www.apache.org/licenses/LICENSE-2.0.txt) + +- com.fasterxml.jackson.core:jackson-core:2.8.10 +- com.fasterxml.jackson.dataformat:jackson-dataformat-cbor:2.8.10 +- com.fasterxml.jackson.dataformat:jackson-dataformat-smile:2.8.10 +- com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.8.10 +- commons-codec:commons-codec:1.10 +- commons-logging:commons-logging:1.1.3 +- org.apache.httpcomponents:httpasyncclient:4.1.2 +- org.apache.httpcomponents:httpclient:4.5.3 +- org.apache.httpcomponents:httpcore-nio:4.4.5 +- org.apache.httpcomponents:httpcore:4.4.6 +- org.apache.logging.log4j:log4j-api:2.9.1 +- org.apache.logging.log4j:log4j-to-slf4j:2.9.1 +- org.apache.lucene:lucene-analyzers-common:7.3.1 +- org.apache.lucene:lucene-backward-codecs:7.3.1 +- org.apache.lucene:lucene-core:7.3.1 +- org.apache.lucene:lucene-grouping:7.3.1 +- org.apache.lucene:lucene-highlighter:7.3.1 +- org.apache.lucene:lucene-join:7.3.1 +- org.apache.lucene:lucene-memory:7.3.1 +- org.apache.lucene:lucene-misc:7.3.1 +- org.apache.lucene:lucene-queries:7.3.1 +- org.apache.lucene:lucene-queryparser:7.3.1 +- org.apache.lucene:lucene-sandbox:7.3.1 +- org.apache.lucene:lucene-spatial-extras:7.3.1 +- org.apache.lucene:lucene-spatial3d:7.3.1 +- org.apache.lucene:lucene-spatial:7.3.1 Review comment: wrong sort order. Should be before `lucene-spatial-extras` This is an automated message from the Apache Git Service. 
[GitHub] tillrohrmann commented on a change in pull request #7342: [FLINK-11023][ES] Add NOTICE file
tillrohrmann commented on a change in pull request #7342: [FLINK-11023][ES] Add NOTICE file URL: https://github.com/apache/flink/pull/7342#discussion_r245084507 ## File path: flink-connectors/flink-sql-connector-elasticsearch6/src/main/resources/META-INF/NOTICE ## @@ -0,0 +1,45 @@ +flink-sql-connector-elasticsearch6 +Copyright 2014-2018 The Apache Software Foundation + +This product includes software developed at +The Apache Software Foundation (http://www.apache.org/). + +This project bundles the following dependencies under the Apache Software License 2.0. (http://www.apache.org/licenses/LICENSE-2.0.txt) + +- com.fasterxml.jackson.core:jackson-core:2.8.10 +- com.fasterxml.jackson.dataformat:jackson-dataformat-cbor:2.8.10 +- com.fasterxml.jackson.dataformat:jackson-dataformat-smile:2.8.10 +- com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.8.10 +- commons-codec:commons-codec:1.10 +- commons-logging:commons-logging:1.1.3 +- org.apache.httpcomponents:httpasyncclient:4.1.2 +- org.apache.httpcomponents:httpclient:4.5.3 +- org.apache.httpcomponents:httpcore-nio:4.4.5 +- org.apache.httpcomponents:httpcore:4.4.6 Review comment: wrong sort order between `httpcore-nio` and `httpcore` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-10848) Flink's Yarn ResourceManager can allocate too many excess containers
[ https://issues.apache.org/jira/browse/FLINK-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733245#comment-16733245 ] Shuyi Chen commented on FLINK-10848: [~xinpu], thanks for sharing your thoughts. You are right, ideally, we should prevent step 2. However, in AMRMClientAsync.CallbackHandler, AFAIK, we can only know that the previous n container requests have been sent to the RM through the onContainersAllocated callback, so step 2 might be difficult to prevent. > Flink's Yarn ResourceManager can allocate too many excess containers > > > Key: FLINK-10848 > URL: https://issues.apache.org/jira/browse/FLINK-10848 > Project: Flink > Issue Type: Bug > Components: YARN >Affects Versions: 1.3.3, 1.4.2, 1.5.5, 1.6.2 >Reporter: Shuyi Chen >Assignee: Shuyi Chen >Priority: Major > Labels: pull-request-available > > Currently, both the YarnFlinkResourceManager and YarnResourceManager do not > call removeContainerRequest() on container allocation success. Because the > YARN AM-RM protocol is not a delta protocol (please see YARN-1902), > AMRMClient will keep all ContainerRequests that are added and send them to the RM. > In production, we observe the following, which verifies the theory: 16 > containers are allocated and used upon cluster startup; when a TM is killed, > 17 containers are allocated, 1 container is used, and 16 excess containers > are returned; when another TM is killed, 18 containers are allocated, 1 > container is used, and 17 excess containers are returned. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
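The non-delta behavior described in the issue can be sketched with a toy model (the class below is illustrative only, not Hadoop's actual AMRMClient implementation): every stored request is re-sent on each heartbeat until it is explicitly removed, which is exactly why a missing removeContainerRequest() call produces the 17 and 18 allocations observed in production.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model of the AM-RM request table (illustrative, not Hadoop's real code):
 * every stored request is re-sent on each heartbeat until it is explicitly
 * removed, mirroring the non-delta protocol described in YARN-1902.
 */
public class RequestTable {
    private final List<String> pending = new ArrayList<>();

    public void addContainerRequest(String request) {
        pending.add(request);
    }

    public void removeContainerRequest(String request) {
        pending.remove(request);
    }

    /** Requests included in the next heartbeat to the RM. */
    public List<String> heartbeatPayload() {
        return new ArrayList<>(pending);
    }

    public static void main(String[] args) {
        RequestTable table = new RequestTable();
        // Cluster startup: 16 containers requested, never removed on success.
        for (int i = 0; i < 16; i++) {
            table.addContainerRequest("container-" + i);
        }
        // A TaskManager dies; one replacement container is requested.
        table.addContainerRequest("replacement-0");
        // The next heartbeat re-sends all 17 requests, although 16 are
        // already satisfied -- matching the 17 allocations seen in production.
        System.out.println(table.heartbeatPayload().size()); // prints 17
    }
}
```

Under this model, calling removeContainerRequest() for each satisfied request once onContainersAllocated fires shrinks the heartbeat payload back to only the genuinely outstanding requests.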
[jira] [Closed] (FLINK-10761) MetricGroup#getAllVariables can deadlock
[ https://issues.apache.org/jira/browse/FLINK-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-10761. Resolution: Fixed Fix Version/s: 1.7.2 1.6.4 master: aa92438c4aa6218c5d7e67083152733271697089 1.7: 45d53d27e4dec934057404674e38c6eab2e66e62 1.6: cb85e62b5b122b7cbb9eeb72a116dd133467a920 > MetricGroup#getAllVariables can deadlock > > > Key: FLINK-10761 > URL: https://issues.apache.org/jira/browse/FLINK-10761 > Project: Flink > Issue Type: Bug > Components: Metrics >Affects Versions: 1.5.5, 1.6.2, 1.7.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Critical > Labels: pull-request-available > Fix For: 1.6.4, 1.7.2, 1.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{AbstractMetricGroup#getAllVariables}} acquires the locks of both the > current and all parent groups when assembling the variables map. This can > lead to a deadlock if metrics are registered concurrently on a child and > parent if the child registration is applied first and the reporter uses said > method (which many do). > Assume we have a MetricGroup Mc(hild) and Mp(arent). > 2 separate threads Tc and Tp each register a metric on their respective > group, acquiring the lock. > Let's assume that Tc has a slight headstart. > Tc will now call {{MetricRegistry#register}} first, acquiring the MR lock. > Tp will block on this lock. > Tc now iterates over all reporters calling > {{MetricReporter#notifyOfAddedMetric}}. Assume that in this method > {{MetricGroup#getAllVariables}} is called on Mc by Tc. > Tc still holds the lock to Mc, and attempts to acquire the lock to Mp. > The lock to Mp is still held by Tp however, which waits for the MR lock to be > released by Tc. > Thus a deadlock is created. This may deadlock anything, be it minor threads, > tasks, or entire components. 
> This has not surfaced so far since usually metrics are no longer added to a > group once children have been created (since the component initialization at > that point is complete). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
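One way to break the lock cycle, sketched below with hypothetical names rather than Flink's actual AbstractMetricGroup code, is to assemble the variables map without taking any group lock: concurrent callers may both build the map (a benign race, since the result is identical), but no thread ever holds its own group's lock while waiting for the parent's.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Minimal sketch of the lock-free direction (hypothetical names, not Flink's
 * real class): getAllVariables() builds the map without synchronization, so
 * the Tc/Tp lock-order cycle described above cannot form.
 */
public class MetricGroupSketch {
    private final MetricGroupSketch parent;
    private final String name;
    private volatile Map<String, String> variables; // cached; racy but benign

    public MetricGroupSketch(MetricGroupSketch parent, String name) {
        this.parent = parent;
        this.name = name;
    }

    public Map<String, String> getAllVariables() {
        if (variables == null) {            // no lock taken: no lock-order cycle
            Map<String, String> tmp = new HashMap<>();
            tmp.put("<" + name + ">", name); // this group's own variable
            if (parent != null) {            // walk up without holding our lock
                tmp.putAll(parent.getAllVariables());
            }
            variables = tmp;                 // publish the fully built map
        }
        return variables;
    }

    public static void main(String[] args) {
        MetricGroupSketch parent = new MetricGroupSketch(null, "taskmanager");
        MetricGroupSketch child = new MetricGroupSketch(parent, "task");
        System.out.println(child.getAllVariables().size()); // prints 2
    }
}
```

The trade-off is that two concurrent first calls may each compute the map once; since both compute the same content and publish via a volatile write, the duplicated work is harmless, whereas the double-checked locking it replaces could deadlock.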
[GitHub] zentol closed pull request #7347: [FLINK-10761][metrics] Do not acquire lock for getAllVariables()
zentol closed pull request #7347: [FLINK-10761][metrics] Do not acquire lock for getAllVariables() URL: https://github.com/apache/flink/pull/7347 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroup.java b/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroup.java index 4400b14a10d..fb321303222 100644 --- a/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroup.java +++ b/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroup.java @@ -111,17 +111,13 @@ public AbstractMetricGroup(MetricRegistry registry, String[] scope, A parent) { } public Map getAllVariables() { - if (variables == null) { // avoid synchronization for common case - synchronized (this) { - if (variables == null) { - Map tmpVariables = new HashMap<>(); - putVariables(tmpVariables); - if (parent != null) { // not true for Job-/TaskManagerMetricGroup - tmpVariables.putAll(parent.getAllVariables()); - } - variables = tmpVariables; - } + if (variables == null) { + Map tmpVariables = new HashMap<>(); + putVariables(tmpVariables); + if (parent != null) { // not true for Job-/TaskManagerMetricGroup + tmpVariables.putAll(parent.getAllVariables()); } + variables = tmpVariables; } return variables; } diff --git a/flink-runtime/src/test/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroupTest.java b/flink-runtime/src/test/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroupTest.java index f3f8b42b851..d750f63cfd8 100644 --- a/flink-runtime/src/test/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroupTest.java +++ 
b/flink-runtime/src/test/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroupTest.java @@ -21,17 +21,22 @@ import org.apache.flink.configuration.ConfigConstants; import org.apache.flink.configuration.Configuration; import org.apache.flink.configuration.MetricOptions; +import org.apache.flink.core.testutils.BlockerSync; import org.apache.flink.metrics.CharacterFilter; import org.apache.flink.metrics.Metric; import org.apache.flink.metrics.MetricGroup; import org.apache.flink.metrics.reporter.MetricReporter; +import org.apache.flink.runtime.metrics.MetricRegistry; import org.apache.flink.runtime.metrics.MetricRegistryConfiguration; import org.apache.flink.runtime.metrics.MetricRegistryImpl; import org.apache.flink.runtime.metrics.dump.QueryScopeInfo; +import org.apache.flink.runtime.metrics.scope.ScopeFormats; import org.apache.flink.runtime.metrics.util.TestReporter; import org.junit.Test; +import javax.annotation.Nullable; + import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertTrue; @@ -252,4 +257,89 @@ public void testScopeGenerationWithoutReporters() throws Exception { testRegistry.shutdown().get(); } } + + @Test + public void testGetAllVariablesDoesNotDeadlock() throws InterruptedException { + final TestMetricRegistry registry = new TestMetricRegistry(); + + final MetricGroup parent = new GenericMetricGroup(registry, UnregisteredMetricGroups.createUnregisteredTaskManagerMetricGroup(), "parent"); + final MetricGroup child = parent.addGroup("child"); + + final Thread parentRegisteringThread = new Thread(() -> parent.counter("parent_counter")); + final Thread childRegisteringThread = new Thread(() -> child.counter("child_counter")); + + final BlockerSync parentSync = new BlockerSync(); + final BlockerSync childSync = new BlockerSync(); + + try { + // start both threads and have them block in the registry, so they acquire the lock of their respective group + registry.setOnRegistrationAction(childSync::blockNonInterruptible); 
+ childRegisteringThread.start(); + childSync.awaitBlocker(); + + registry.setOnRegistrationAction(parentSync::blockNonInterruptible); +
[GitHub] zentol closed pull request #7246: [FLINK-11023] Add LICENSE & NOTICE files for connectors (batch #1)
zentol closed pull request #7246: [FLINK-11023] Add LICENSE & NOTICE files for connectors (batch #1) URL: https://github.com/apache/flink/pull/7246 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/flink-connectors/flink-connector-cassandra/src/main/resources/META-INF/NOTICE b/flink-connectors/flink-connector-cassandra/src/main/resources/META-INF/NOTICE new file mode 100644 index 000..a0449ff3007 --- /dev/null +++ b/flink-connectors/flink-connector-cassandra/src/main/resources/META-INF/NOTICE @@ -0,0 +1,16 @@ +flink-connector-cassandra +Copyright 2014-2018 The Apache Software Foundation + +This product includes software developed at +The Apache Software Foundation (http://www.apache.org/). + +This project bundles the following dependencies under the Apache Software License 2.0. (http://www.apache.org/licenses/LICENSE-2.0.txt) + +- com.datastax.cassandra:cassandra-driver-core:3.0.0 +- com.datastax.cassandra:cassandra-driver-mapping:3.0.0 +- com.google.guava:guava:18.0 +- io.netty:netty-handler:4.0.33.Final +- io.netty:netty-buffer:4.0.33.Final +- io.netty:netty-common:4.0.33.Final +- io.netty:netty-transport:4.0.33.Final +- io.netty:netty-codec:4.0.33.Final diff --git a/flink-connectors/flink-connector-elasticsearch/src/main/resources/META-INF/NOTICE b/flink-connectors/flink-connector-elasticsearch/src/main/resources/META-INF/NOTICE index f588e0e873c..a58d21ae6c6 100644 --- a/flink-connectors/flink-connector-elasticsearch/src/main/resources/META-INF/NOTICE +++ b/flink-connectors/flink-connector-elasticsearch/src/main/resources/META-INF/NOTICE @@ -1,109 +1,28 @@ -This project includes software developed at -The Apache Software Foundation (http://www.apache.org/). 
- -- - -This project bundles the following dependencies under -the Apache Software License 2.0 - - - org.apache.lucene : lucene-core version 4.10.4 - - org.apache.lucene : lucene-analyzers-common version 4.10.4 - - org.apache.lucene : lucene-grouping version 4.10.4 - - org.apache.lucene : lucene-highlighter version 4.10.4 - - org.apache.lucene : lucene-join version 4.10.4 - - org.apache.lucene : lucene-memory version 4.10.4 - - org.apache.lucene : lucene-misc version 4.10.4 - - org.apache.lucene : lucene-queries version 4.10.4 - - org.apache.lucene : lucene-queryparser version 4.10.4 - - org.apache.lucene : lucene-sandbox version 4.10.4 - - org.apache.lucene : lucene-spatial version 4.10.4 - - org.apache.lucene : lucene-suggest version 4.10.4 - - com.spatial4j : spatial4j version 0.4.1 - - com.fasterxml.jackson.core : jackson-core version 2.5.3 - - com.fasterxml.jackson.dataformat : jackson-dataformat-smile version 2.5.3 - - com.fasterxml.jackson.dataformat : jackson-dataformat-yaml version 2.5.3 - - com.fasterxml.jackson.dataformat : jackson-dataformat-cbor version 2.5.3 - - org.joda : joda-convert (copied classes) - -=== - Notice for Yaml -=== - -This project bundles yaml (v. 1.12) under the Creative Commons License (CC-BY 2.0). - -Original project website: http://www.yaml.de - -Copyright (c) 2005-2013, Dirk Jesse - -YAML under Creative Commons License (CC-BY 2.0) -=== - -The YAML framework is published under the Creative Commons Attribution 2.0 License (CC-BY 2.0), which permits -both private and commercial use (http://creativecommons.org/licenses/by/2.0/). - -Condition: For the free use of the YAML framework, a backlink to the YAML homepage (http://www.yaml.de) in a -suitable place (e.g.: footer of the website or in the imprint) is required. - -=== - Notice for Tartarus -=== - -This project bundles tartarus under the MIT License. 
+flink-connector-elasticsearch +Copyright 2014-2018 The Apache Software Foundation -Original source repository: https://github.com/sergiooramas/tartarus - -Copyright (c) 2017 Sergio Oramas and Oriol Nieto - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
[jira] [Closed] (FLINK-11163) RestClusterClientTest#testSendIsNotRetriableIfHttpNotFound with BindException
[ https://issues.apache.org/jira/browse/FLINK-11163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-11163. Resolution: Fixed master: 903a9090324f8d14af15621b7ec026f4a8485050 > RestClusterClientTest#testSendIsNotRetriableIfHttpNotFound with BindException > - > > Key: FLINK-11163 > URL: https://issues.apache.org/jira/browse/FLINK-11163 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.8.0 >Reporter: TisonKun >Assignee: Chesnay Schepler >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {quote} > 03:14:22.321 [ERROR] Tests run: 10, Failures: 0, Errors: 1, Skipped: 0, Time > elapsed: 3.189 s <<< FAILURE! - in > org.apache.flink.client.program.rest.RestClusterClientTest > 03:14:22.322 [ERROR] > testSendIsNotRetriableIfHttpNotFound(org.apache.flink.client.program.rest.RestClusterClientTest) > Time elapsed: 0.043 s <<< ERROR! > java.net.BindException: Address already in use > {quote} > https://api.travis-ci.org/v3/job/467812798/log.txt -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] zentol closed pull request #7345: [FLINK-11163][tests] Use random port in RestClusterClientTest
zentol closed pull request #7345: [FLINK-11163][tests] Use random port in RestClusterClientTest URL: https://github.com/apache/flink/pull/7345 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/flink-clients/src/test/java/org/apache/flink/client/program/rest/RestClusterClientTest.java b/flink-clients/src/test/java/org/apache/flink/client/program/rest/RestClusterClientTest.java index 3e0f2f525a0..bc0432ea4ee 100644 --- a/flink-clients/src/test/java/org/apache/flink/client/program/rest/RestClusterClientTest.java +++ b/flink-clients/src/test/java/org/apache/flink/client/program/rest/RestClusterClientTest.java @@ -32,7 +32,6 @@ import org.apache.flink.runtime.client.JobStatusMessage; import org.apache.flink.runtime.clusterframework.ApplicationStatus; import org.apache.flink.runtime.concurrent.FutureUtils; -import org.apache.flink.runtime.dispatcher.Dispatcher; import org.apache.flink.runtime.dispatcher.DispatcherGateway; import org.apache.flink.runtime.jobgraph.JobGraph; import org.apache.flink.runtime.jobgraph.JobStatus; @@ -88,7 +87,9 @@ import org.apache.flink.runtime.rest.util.RestClientException; import org.apache.flink.runtime.rpc.RpcUtils; import org.apache.flink.runtime.util.ExecutorThreadFactory; +import org.apache.flink.runtime.webmonitor.TestingDispatcherGateway; import org.apache.flink.runtime.webmonitor.retriever.GatewayRetriever; +import org.apache.flink.util.ConfigurationException; import org.apache.flink.util.ExceptionUtils; import org.apache.flink.util.FlinkException; import org.apache.flink.util.OptionalFailure; @@ -105,8 +106,6 @@ import org.junit.Assert; import org.junit.Before; import org.junit.Test; -import org.mockito.Mock; -import org.mockito.MockitoAnnotations; import javax.annotation.Nonnull; @@ -150,16 +149,12 @@ */ 
public class RestClusterClientTest extends TestLogger { - @Mock - private Dispatcher mockRestfulGateway; + private final DispatcherGateway mockRestfulGateway = new TestingDispatcherGateway.Builder().build(); - @Mock private GatewayRetriever mockGatewayRetriever; private RestServerEndpointConfiguration restServerEndpointConfiguration; - private RestClusterClient restClusterClient; - private volatile FailHttpRequestPredicate failHttpRequest = FailHttpRequestPredicate.never(); private ExecutorService executor; @@ -167,28 +162,58 @@ private JobGraph jobGraph; private JobID jobId; - @Before - public void setUp() throws Exception { - MockitoAnnotations.initMocks(this); + private static final Configuration restConfig; + static { final Configuration config = new Configuration(); config.setString(JobManagerOptions.ADDRESS, "localhost"); config.setInteger(RestOptions.RETRY_MAX_ATTEMPTS, 10); config.setLong(RestOptions.RETRY_DELAY, 0); + config.setInteger(RestOptions.PORT, 0); + + restConfig = config; + } - restServerEndpointConfiguration = RestServerEndpointConfiguration.fromConfiguration(config); + @Before + public void setUp() throws Exception { + restServerEndpointConfiguration = RestServerEndpointConfiguration.fromConfiguration(restConfig); mockGatewayRetriever = () -> CompletableFuture.completedFuture(mockRestfulGateway); executor = Executors.newSingleThreadExecutor(new ExecutorThreadFactory(RestClusterClientTest.class.getSimpleName())); - final RestClient restClient = new RestClient(RestClientConfiguration.fromConfiguration(config), executor) { + + jobGraph = new JobGraph("testjob"); + jobId = jobGraph.getJobID(); + } + + @After + public void tearDown() { + if (executor != null) { + executor.shutdown(); + } + } + + private RestClusterClient createRestClusterClient(final int port) throws Exception { + final Configuration clientConfig = new Configuration(restConfig); + clientConfig.setInteger(RestOptions.PORT, port); + return new RestClusterClient<>( + clientConfig, + 
createRestClient(), + StandaloneClusterId.getInstance(), + (attempt) -> 0, + null); + } + + @Nonnull + private RestClient createRestClient() throws ConfigurationException { + return new RestClient(RestClientConfiguration.fromConfiguration(restConfig), executor) { @Override public , U extends
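The pattern this PR applies, setting RestOptions.PORT to 0, delegates the port choice to the operating system. The same idea in plain Java (a minimal sketch, not Flink code):

```java
import java.io.IOException;
import java.net.ServerSocket;

/**
 * Binding port 0 asks the OS for a free ephemeral port, so concurrent test
 * runs cannot collide on a hard-coded port and fail with BindException.
 */
public class EphemeralPort {
    public static int pickFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort(); // the port the OS actually assigned
        } catch (IOException e) {
            throw new RuntimeException("could not bind an ephemeral port", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bound to port " + pickFreePort());
    }
}
```

The only cost, as with the queryable-state port discussion in FLINK-10866, is that the actual port must afterwards be discovered from the bound server instead of being known up front.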
[GitHub] pnowojski commented on a change in pull request #6445: [FLINK-8302] [table] Add SHIFT_LEFT and SHIFT_RIGHT
pnowojski commented on a change in pull request #6445: [FLINK-8302] [table] Add SHIFT_LEFT and SHIFT_RIGHT URL: https://github.com/apache/flink/pull/6445#discussion_r245056827

## File path: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/expressions/ScalarOperatorsTest.scala ##

@@ -74,6 +74,238 @@ class ScalarOperatorsTest extends ScalarOperatorsTestBase {
       "true")
   }
+
+  @Test
+  def testShiftLeft(): Unit = {
+    testAllApis(
+      3.shiftLeft(3),
+      "3.shiftLeft(3)",
+      "SHIFTLEFT(3,3)",
+      "24"
+    )
+
+    testAllApis(
+      2147483647.shiftLeft(-2147483648),
+      "2147483647.shiftLeft(-2147483648)",
+      "SHIFTLEFT(2147483647,-2147483648)",
+      "2147483647"
+    )
+
+    testAllApis(
+      -2147483648.shiftLeft(2147483647),
+      "-2147483648.shiftLeft(2147483647)",
+      "SHIFTLEFT(-2147483648,2147483647)",
+      "0"
+    )
+
+    testAllApis(
+      9223372036854775807L.shiftLeft(-2147483648),
+      "9223372036854775807L.shiftLeft(-2147483648)",
+      "SHIFTLEFT(9223372036854775807,-2147483648)",
+      "9223372036854775807"
+    )
+
+    testAllApis(
+      'f3.shiftLeft(5),
+      "f3.shiftLeft(5)",
+      "SHIFTLEFT(f3,5)",
+      "32"
+    )
+
+    testAllApis(
+      1.shiftLeft(Null(Types.INT)),
+      "1.shiftLeft(Null(INT))",
+      "SHIFTLEFT(1, CAST(NULL AS INT))",
+      "null"
+    )
+
+    testAllApis( // test tinyint

Review comment: Regarding the tinyint, smallint, int, and bigint tests: I think we need the following tests, which show the quirky nature of Java bit shifts:

```
select cast(1 as tinyint) << 9, cast(1 as tinyint) << 17;
select cast(1 as smallint) << 17, cast(1 as smallint) << 33;
select 1 << 17, 1 << 33;
select cast(1 as bigint) << 33, cast(1 as bigint) << 65;
```

expected results:

```
0, 2
0, 2
131072, 2
8589934592, 2
```

Bonus points for anyone who understands those results. For me the most confusing part is why `select cast(1 as smallint) << 17` returns `0`, while `select 1 << 33` returns `2`...

Also we need the same test cases (shifting by 9, 17, 33 and 65) for both versions of right shifts, but instead of right-shifting `1`, right-shift the min values (`Byte.MIN_VALUE`, `Short.MIN_VALUE`, ...), and the results should also be different. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
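The "quirky" results in the review follow from two Java rules, assuming the generated SQL code performs the shift on the promoted int/long value and then narrows back to the declared type (an assumption about the codegen, not confirmed here): the shift distance of an int is masked to its low 5 bits (mod 32) and that of a long to its low 6 bits (mod 64), while a narrowing cast simply truncates high bits.

```java
/**
 * JLS §15.19: for int shifts only the low 5 bits of the distance are used,
 * for long shifts the low 6 bits; byte/short operands are promoted to int
 * before shifting, so their width only matters when the result is cast back.
 */
public class ShiftQuirks {
    public static void main(String[] args) {
        System.out.println(1 << 17);           // 131072
        System.out.println(1 << 33);           // 33 & 31 == 1, so: 2
        System.out.println(1L << 33);          // 8589934592
        System.out.println(1L << 65);          // 65 & 63 == 1, so: 2
        // "smallint"-like behavior: shift as int, then truncate to 16 bits.
        System.out.println((short) (1 << 17)); // 131072 loses its high bits: 0
        System.out.println((short) (1 << 33)); // masked to 1 << 1 == 2, fits: 2
    }
}
```

This resolves the puzzle posed in the comment: `cast(1 as smallint) << 17` yields 0 because the unmasked int shift overflows the 16-bit range and the narrowing cast truncates it, while `1 << 33` yields 2 because the distance itself is masked before any truncation can occur.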
[GitHub] pnowojski commented on a change in pull request #6445: [FLINK-8302] [table] Add SHIFT_LEFT and SHIFT_RIGHT
pnowojski commented on a change in pull request #6445: [FLINK-8302] [table] Add SHIFT_LEFT and SHIFT_RIGHT URL: https://github.com/apache/flink/pull/6445#discussion_r244966034

## File path: docs/dev/table/functions.md

```
@@ -1095,6 +1095,40 @@ MOD(numeric1, numeric2)
+
+{% highlight text %}
+SHIFTLEFT(numeric1, numeric2)
+{% endhighlight %}
+
+Returns numeric1 shifted left by numeric2. The result is numeric1 << numeric2.
+
```

Review comment: Here and for `SHIFTRIGHT` and `SHIFTRIGHTUNSIGNED` please add the information about the return type:

```
The return type is the same as numeric1.
```

and please also state that the semantics of those operations are the same as the equivalent Java operation with the result cast to the return type.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
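For the right-shift variants, the same promote-shift-narrow pipeline applies, plus the signed (`>>`) versus unsigned (`>>>`) distinction the docs should spell out. A small sketch (plain Java, not the proposed Flink SQL functions) of how min values behave once the result is cast back to the return type:

```java
public class RightShiftQuirks {
    static void check(boolean cond, String label) {
        if (!cond) throw new AssertionError(label);
    }

    public static void main(String[] args) {
        // Arithmetic shift (>>) replicates the sign bit; logical shift (>>>) fills with zeros.
        check((Integer.MIN_VALUE >> 1) == -1073741824, "sign bit preserved");
        check((Integer.MIN_VALUE >>> 1) == 1073741824, "sign bit shifted out");

        // Shift distances are masked here as well: >> 33 on an int is really >> 1.
        check((Integer.MIN_VALUE >> 33) == (Integer.MIN_VALUE >> 1), "distance masked with & 31");

        // Byte.MIN_VALUE is promoted to int (0xFFFFFF80) before shifting, and the
        // result is narrowed back afterwards, so even >>> can look "signed" once
        // the result is cast back to the return type.
        check((byte) (Byte.MIN_VALUE >> 1) == -64, "signed shift on promoted byte");
        check((byte) (Byte.MIN_VALUE >>> 1) == -64, "0x7FFFFFC0 narrows to 0xC0 == -64");

        System.out.println("all right-shift checks passed");
    }
}
```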
[jira] [Assigned] (FLINK-11261) BlobServer moves file with open OutputStream
[ https://issues.apache.org/jira/browse/FLINK-11261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vinoyang reassigned FLINK-11261: Assignee: vinoyang > BlobServer moves file with open OutputStream > > > Key: FLINK-11261 > URL: https://issues.apache.org/jira/browse/FLINK-11261 > Project: Flink > Issue Type: Bug > Components: Local Runtime >Affects Versions: 1.8.0 >Reporter: Chesnay Schepler >Assignee: vinoyang >Priority: Major > Fix For: 1.8.0 > > > Various tests fail on Windows because the BlobServer attempts to move a file > while a {{FileOutputStream}} is still open: > BlobServer#putInputStream(): > {code} > try (FileOutputStream fos = new FileOutputStream(incomingFile)) { > [ ... use fos ... ] > // moves file even though fos is still open > blobKey = moveTempFileToStore(incomingFile, jobId, md.digest(), > blobType); > return blobKey; > } finally { > ... > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
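The fix direction is to finish writing and close the stream before handing the file to the move. A minimal, self-contained sketch (hypothetical names, not Flink's actual BlobServer code) of the write, close, then move ordering that also works on Windows:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CloseBeforeMove {

    // Writes a temp file, closes it, and only then moves it into the "store".
    static boolean writeCloseThenMove() throws IOException {
        File incomingFile = File.createTempFile("blob-incoming", ".tmp");
        Path target = Files.createTempDirectory("blob-store").resolve("blob-key");

        try (FileOutputStream fos = new FileOutputStream(incomingFile)) {
            fos.write(new byte[] {1, 2, 3});
        } // fos is closed here; Windows will now allow the file to be moved

        // The move happens AFTER the try-with-resources block, not inside it.
        Files.move(incomingFile.toPath(), target, StandardCopyOption.REPLACE_EXISTING);
        return Files.exists(target);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeCloseThenMove());
    }
}
```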
[jira] [Updated] (FLINK-11064) Setup a new flink-table module structure
[ https://issues.apache.org/jira/browse/FLINK-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11064: --- Labels: pull-request-available (was: ) > Setup a new flink-table module structure > > > Key: FLINK-11064 > URL: https://issues.apache.org/jira/browse/FLINK-11064 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > Labels: pull-request-available > > This issue covers the first step of the migration plan mentioned in > [FLIP-28|https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free]. > Move all files to their corresponding modules as they are. No migration > happens at this stage. Modules might contain both Scala and Java classes. > Classes that should be placed in `flink-table-common` but are in Scala so far > remain in `flink-table-api-planner` for now. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] twalthr opened a new pull request #7409: [FLINK-11064] [table] Setup a new flink-table module structure
twalthr opened a new pull request #7409: [FLINK-11064] [table] Setup a new flink-table module structure URL: https://github.com/apache/flink/pull/7409 ## What is the purpose of the change This commit splits the `flink-table` module into multiple submodules in accordance with FLIP-28 (step 1). The new module structure looks as follows:

```
flink-table-common
        ^
        |
flink-table-api-base
        ^
        |
flink-table-api-java/scala <-- flink-table-planner --> flink-table-runtime
                                       |
                                       `-> Calcite
```

The module `flink-table-planner` contains the content of the old `flink-table` module. From there we can distribute ported classes to their final module without breaking backwards compatibility or forcing users to update their dependencies again. If a user wants to implement a table program in Scala, `flink-table-api-scala` and `flink-table-planner` need to be added to the project. If a user wants to implement a UDF or format, `flink-table-common` needs to be added. If a user wants to implement a table source, `flink-table-api-java` and `flink-table-planner` need to be added for now. But soon only `flink-table-api-java` will be required (see FLIP-28 step 5). All Flink modules have been updated to the new dependency design. End-to-end tests succeed and docs have been updated. ## Brief change log - Update the `flink-table` modules ## Verifying this change This change is already covered by existing tests, such as SQL Client and Streaming SQL e2e tests. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): yes - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? 
no - If yes, how is the feature documented? docs This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] zentol commented on issue #7346: [FLINK-11134][rest] Do not log stacktrace handled exceptions
zentol commented on issue #7346: [FLINK-11134][rest] Do not log stacktrace handled exceptions URL: https://github.com/apache/flink/pull/7346#issuecomment-451198244 Yeah the error handling is really all over the place. I've consolidated it in `AbstractHandler` as you suggested. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-11064) Setup a new flink-table module structure
[ https://issues.apache.org/jira/browse/FLINK-11064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther updated FLINK-11064: - Description: This issue covers the first step of the migration plan mentioned in [FLIP-28|https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free]. Move all files to their corresponding modules as they are. No migration happens at this stage. Modules might contain both Scala and Java classes. Classes that should be placed in `flink-table-common` but are in Scala so far remain in `flink-table-api-planner` for now. was: This issue covers the first step of the migration plan mentioned in [FLIP-28|https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free]. Move all files to their corresponding modules as they are. No migration happens at this stage. Modules might contain both Scala and Java classes. Classes that should be placed in `flink-table-spi` but are in Scala so far remain in `flink-table-api-base` for now. > Setup a new flink-table module structure > > > Key: FLINK-11064 > URL: https://issues.apache.org/jira/browse/FLINK-11064 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Major > > This issue covers the first step of the migration plan mentioned in > [FLIP-28|https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free]. > Move all files to their corresponding modules as they are. No migration > happens at this stage. Modules might contain both Scala and Java classes. > Classes that should be placed in `flink-table-common` but are in Scala so far > remain in `flink-table-api-planner` for now. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11262) Bump jython-standalone to 2.7.1
[ https://issues.apache.org/jira/browse/FLINK-11262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11262: --- Labels: pull-request-available (was: ) > Bump jython-standalone to 2.7.1 > --- > > Key: FLINK-11262 > URL: https://issues.apache.org/jira/browse/FLINK-11262 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > > Due to security issue: > https://ossindex.sonatype.org/vuln/7a4be7b3-74f5-4a9b-a24f-d1fd80ed6bbca -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] zentol commented on issue #7347: [FLINK-10761][metrics] Do not acquire lock for getAllVariables()
zentol commented on issue #7347: [FLINK-10761][metrics] Do not acquire lock for getAllVariables() URL: https://github.com/apache/flink/pull/7347#issuecomment-451193997 Well _performance_ of course. If no reporter is configured and the WebUI is not used, we would spend memory on maps and strings that are never gonna be used. Personally I don't think this is worth the additional headache, but these are the conventions we have gone with since the metric system was added. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] Fokko opened a new pull request #7408: [FLINK-11262] Bump jython-standalone to 2.7.1
Fokko opened a new pull request #7408: [FLINK-11262] Bump jython-standalone to 2.7.1 URL: https://github.com/apache/flink/pull/7408 Bump the jython dependency because of a security issue. https://ossindex.sonatype.org/vuln/7a4be7b3-74f5-4a9b-a24f-d1fd80ed6bbc ## What is the purpose of the change Bump minor version to patch a security issue. ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): yes - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-11261) BlobServer moves file with open OutputStream
Chesnay Schepler created FLINK-11261: Summary: BlobServer moves file with open OutputStream Key: FLINK-11261 URL: https://issues.apache.org/jira/browse/FLINK-11261 Project: Flink Issue Type: Bug Components: Local Runtime Affects Versions: 1.8.0 Reporter: Chesnay Schepler Fix For: 1.8.0 Various tests fail on Windows because the BlobServer attempts to move a file while a {{FileOutputStream}} is still open: BlobServer#putInputStream(): {code} try (FileOutputStream fos = new FileOutputStream(incomingFile)) { [ ... use fos ... ] // moves file even though fos is still open blobKey = moveTempFileToStore(incomingFile, jobId, md.digest(), blobType); return blobKey; } finally { ... } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-11262) Bump jython-standalone to 2.7.1
Fokko Driesprong created FLINK-11262: Summary: Bump jython-standalone to 2.7.1 Key: FLINK-11262 URL: https://issues.apache.org/jira/browse/FLINK-11262 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Fokko Driesprong Assignee: Fokko Driesprong Due to security issue: https://ossindex.sonatype.org/vuln/7a4be7b3-74f5-4a9b-a24f-d1fd80ed6bbca -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] tillrohrmann commented on issue #7246: [FLINK-11023] Add LICENSE & NOTICE files for connectors (batch #1)
tillrohrmann commented on issue #7246: [FLINK-11023] Add LICENSE & NOTICE files for connectors (batch #1) URL: https://github.com/apache/flink/pull/7246#issuecomment-451190133 Thanks for addressing my comments @zentol. +1 for merging. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-11260) Bump Janino compiler dependency
[ https://issues.apache.org/jira/browse/FLINK-11260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong updated FLINK-11260: - Description: Bump the Janino dependency: http://janino-compiler.github.io/janino/changelog.html (was: Bump the Janino depdency: http://janino-compiler.github.io/janino/changelog.html) > Bump Janino compiler dependency > --- > > Key: FLINK-11260 > URL: https://issues.apache.org/jira/browse/FLINK-11260 > Project: Flink > Issue Type: Improvement > Components: Core >Affects Versions: 1.7.1 >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Fix For: 1.7.2 > > Time Spent: 10m > Remaining Estimate: 0h > > Bump the Janino dependency: > http://janino-compiler.github.io/janino/changelog.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11260) Bump Janino compiler dependency
[ https://issues.apache.org/jira/browse/FLINK-11260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fokko Driesprong updated FLINK-11260: - Description: Bump the Janino depdency: http://janino-compiler.github.io/janino/changelog.html (was: Bump the commons-compiler) > Bump Janino compiler dependency > --- > > Key: FLINK-11260 > URL: https://issues.apache.org/jira/browse/FLINK-11260 > Project: Flink > Issue Type: Improvement > Components: Core >Affects Versions: 1.7.1 >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Fix For: 1.7.2 > > Time Spent: 10m > Remaining Estimate: 0h > > Bump the Janino depdency: > http://janino-compiler.github.io/janino/changelog.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11259) Bump Zookeeper dependency to 3.4.13
[ https://issues.apache.org/jira/browse/FLINK-11259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11259: --- Labels: pull-request-available (was: ) > Bump Zookeeper dependency to 3.4.13 > --- > > Key: FLINK-11259 > URL: https://issues.apache.org/jira/browse/FLINK-11259 > Project: Flink > Issue Type: Improvement > Components: Cluster Management >Affects Versions: 1.7.1 >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Fix For: 1.7.2 > > > Bump Zookeeper to 3.4.13 > https://zookeeper.apache.org/doc/r3.4.13/releasenotes.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11260) Bump Janino compiler dependency
[ https://issues.apache.org/jira/browse/FLINK-11260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11260: --- Labels: pull-request-available (was: ) > Bump Janino compiler dependency > --- > > Key: FLINK-11260 > URL: https://issues.apache.org/jira/browse/FLINK-11260 > Project: Flink > Issue Type: Improvement > Components: Core >Affects Versions: 1.7.1 >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Fix For: 1.7.2 > > > Bump the commons-compiler -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] Fokko opened a new pull request #7407: [FLINK-11260] Bump the Janino compiler
Fokko opened a new pull request #7407: [FLINK-11260] Bump the Janino compiler URL: https://github.com/apache/flink/pull/7407 Fixes minor (NPE) issues and brings performance improvements: http://janino-compiler.github.io/janino/changelog.html ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log Bump the Janino compiler dependency ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): yes - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-11260) Bump Janino compiler dependency
Fokko Driesprong created FLINK-11260: Summary: Bump Janino compiler dependency Key: FLINK-11260 URL: https://issues.apache.org/jira/browse/FLINK-11260 Project: Flink Issue Type: Improvement Components: Core Affects Versions: 1.7.1 Reporter: Fokko Driesprong Assignee: Fokko Driesprong Fix For: 1.7.2 Bump the commons-compiler -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] Fokko opened a new pull request #7406: [FLINK-11259] Bump Zookeeper to 3.4.13
Fokko opened a new pull request #7406: [FLINK-11259] Bump Zookeeper to 3.4.13 URL: https://github.com/apache/flink/pull/7406 The patched version provides some hotfixes: https://zookeeper.apache.org/doc/r3.4.13/releasenotes.html ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. - Dependencies (does it add or upgrade a dependency): yes - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: yes - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? not applicable This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-11249) FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer
[ https://issues.apache.org/jira/browse/FLINK-11249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11249: --- Labels: pull-request-available (was: ) > FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer > --- > > Key: FLINK-11249 > URL: https://issues.apache.org/jira/browse/FLINK-11249 > Project: Flink > Issue Type: Sub-task > Components: Kafka Connector >Affects Versions: 1.7.0, 1.7.1 >Reporter: Piotr Nowojski >Assignee: vinoyang >Priority: Blocker > Labels: pull-request-available > Fix For: 1.7.2, 1.8.0 > > > As reported by a user on the mailing list "How to migrate Kafka Producer ?" > (on 18th December 2018), {{FlinkKafkaProducer011}} can not be migrated to > {{FlinkKafkaProducer}} and the same problem can occur in the future Kafka > producer versions/refactorings. > The issue is that {{ListState > FlinkKafkaProducer#nextTransactionalIdHintState}} field is serialized using > java serializers and this is causing problems/collisions on > {{FlinkKafkaProducer011.NextTransactionalIdHint}} vs > {{FlinkKafkaProducer.NextTransactionalIdHint}}. > To fix that we probably need to release new versions of those classes, that > will rewrite/upgrade this state field to a new one, that doesn't rely on > java serialization. After this, we could drop the support for the old field > and that in turn will allow users to upgrade from 0.11 connector to the > universal one. > One bright side is that technically speaking our {{FlinkKafkaProducer011}} > has the same compatibility matrix as the universal one (it's also forward & > backward compatible with the same Kafka versions), so for the time being > users can stick to {{FlinkKafkaProducer011}}. > FYI [~tzulitai] [~yanghua] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
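The collision can be made concrete with a few lines of plain Java. The nested class below is a hypothetical stand-in for `NextTransactionalIdHint`; the point is that Java serialization bakes the fully-qualified name of the enclosing class into the payload, so bytes written under `FlinkKafkaProducer011` can never be restored as the identically-shaped nested class of `FlinkKafkaProducer`:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

public class SerializedNameDemo {

    // Hypothetical stand-in for FlinkKafkaProducer011.NextTransactionalIdHint.
    static class NextTransactionalIdHint implements Serializable {
        private static final long serialVersionUID = 1L;
        int lastParallelism = 4;
    }

    // Serializes one hint and reports whether the enclosing class name
    // ended up inside the serialized bytes.
    static boolean payloadContainsEnclosingClassName() throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new NextTransactionalIdHint());
        }
        String payload = new String(bos.toByteArray(), StandardCharsets.ISO_8859_1);
        return payload.contains("SerializedNameDemo$NextTransactionalIdHint");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(payloadContainsEnclosingClassName());
    }
}
```

A hand-written `TypeSerializer` that writes only the fields avoids embedding the class name entirely, which is what makes the state field upgradeable across the two producer classes.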
[GitHub] yanghua opened a new pull request #7405: [FLINK-11249] FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer
yanghua opened a new pull request #7405: [FLINK-11249] FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer URL: https://github.com/apache/flink/pull/7405 ## What is the purpose of the change *This pull request fixes the issue that FlinkKafkaProducer011 cannot be migrated to FlinkKafkaProducer* ## Brief change log - *Implemented a new `TypeSerializer` for state class `NextTransactionalIdHint`* ## Verifying this change This change added tests and can be verified as follows: - *NextTransactionalIdHintSerializerTest* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / **no**) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**) - The serializers: (**yes** / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know) - The S3 file system connector: (yes / **no** / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / **no**) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / **not documented**) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-11259) Bump Zookeeper dependency to 3.4.13
Fokko Driesprong created FLINK-11259: Summary: Bump Zookeeper dependency to 3.4.13 Key: FLINK-11259 URL: https://issues.apache.org/jira/browse/FLINK-11259 Project: Flink Issue Type: Improvement Components: Cluster Management Affects Versions: 1.7.1 Reporter: Fokko Driesprong Assignee: Fokko Driesprong Fix For: 1.7.2 Bump Zookeeper to 3.4.13 https://zookeeper.apache.org/doc/r3.4.13/releasenotes.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11258) Add badge to the README
[ https://issues.apache.org/jira/browse/FLINK-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11258: --- Labels: pull-request-available (was: ) > Add badge to the README > --- > > Key: FLINK-11258 > URL: https://issues.apache.org/jira/browse/FLINK-11258 > Project: Flink > Issue Type: Improvement > Components: Documentation >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > > I think we should add the badge to the docs to check if master is still happy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] Fokko opened a new pull request #7404: [FLINK-11258] Add badge to the README
Fokko opened a new pull request #7404: [FLINK-11258] Add badge to the README URL: https://github.com/apache/flink/pull/7404 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request 
introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-11258) Add badge to the README
Fokko Driesprong created FLINK-11258: Summary: Add badge to the README Key: FLINK-11258 URL: https://issues.apache.org/jira/browse/FLINK-11258 Project: Flink Issue Type: Improvement Components: Documentation Reporter: Fokko Driesprong Assignee: Fokko Driesprong I think we should add the badge to the docs to check if master is still happy. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
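A build-status badge of the kind proposed in this issue is a single line of Markdown near the top of README.md. The exact badge URL below is an assumption (Flink used Travis CI for its master builds at the time):

```markdown
[![Build Status](https://travis-ci.org/apache/flink.svg?branch=master)](https://travis-ci.org/apache/flink)
```

The image renders the latest build result for the given branch, so a red badge on master is visible directly from the repository front page.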
[jira] [Closed] (FLINK-10960) CEP: Job Failure when .times(2) is used
[ https://issues.apache.org/jira/browse/FLINK-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Wozniakowski closed FLINK-10960. --- Resolution: Workaround Release Note: It seems this issue was as David described, due to restoring into a smaller state machine than had existed before > CEP: Job Failure when .times(2) is used > --- > > Key: FLINK-10960 > URL: https://issues.apache.org/jira/browse/FLINK-10960 > Project: Flink > Issue Type: Bug > Components: CEP >Affects Versions: 1.6.2 >Reporter: Thomas Wozniakowski >Priority: Critical > > Hi Guys, > Encountered a strange one today. We use the CEP library in a configurable way > where we plug a config file into the Flink Job JAR and it programmatically > sets up a bunch of CEP operators matching the config file. > I encountered a strange bug when I was testing with some artificially low > numbers in our testing environment today. The CEP code we're using (modified > slightly) is: > {code:java} > Pattern.begin(EVENT_SEQUENCE, AfterMatchSkipStrategy.skipPastLastEvent()) > .times(config.getNumberOfUniqueEvents()) > .where(uniquenessCheckOnAlreadyMatchedEvents()) > .within(seconds(config.getWithinSeconds())); > {code} > When using the {{numberOfUniqueEvents: 2}}, I started seeing the following > error killing the job whenever a match was detected: > {quote} > ava.lang.RuntimeException: Exception occurred while processing valve output > watermark: > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:265) > at > org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:189) > at > org.apache.flink.streaming.runtime.streamstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:111) > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:184) > at > 
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.flink.util.FlinkRuntimeException: State eventSequence:2 > does not exist in the NFA. NFA has states [Final State $endState$ [ > ]), Normal State eventSequence [ > StateTransition(TAKE, from eventSequenceto $endState$, with condition), > StateTransition(IGNORE, from eventSequenceto eventSequence, with > condition), > ]), Start State eventSequence:0 [ > StateTransition(TAKE, from eventSequence:0to eventSequence, with > condition), > ])] > at org.apache.flink.cep.nfa.NFA.isStartState(NFA.java:144) > at org.apache.flink.cep.nfa.NFA.isStateTimedOut(NFA.java:270) > at org.apache.flink.cep.nfa.NFA.advanceTime(NFA.java:244) > at > org.apache.flink.cep.operator.AbstractKeyedCEPPatternOperator.advanceTime(AbstractKeyedCEPPatternOperator.java:389) > at > org.apache.flink.cep.operator.AbstractKeyedCEPPatternOperator.onEventTime(AbstractKeyedCEPPatternOperator.java:293) > at > org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:251) > at > org.apache.flink.streaming.api.operators.InternalTimeServiceManager.advanceWatermark(InternalTimeServiceManager.java:128) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:746) > at > org.apache.flink.streaming.runtime.io.StreamInputProcessor$ForwardingValveOutputHandler.handleWatermark(StreamInputProcessor.java:262) > {quote} > Changing the config to {{numberOfUniqueEvents: 3}} fixed the problem. > Changing it back to 2 brought the problem back. It seems to be specifically > related to the value of 2. 
> This is not a blocking issue for me because we typically use much higher > numbers than this in production anyway, but I figured you guys might want to > know about this issue. > Let me know if you need any more information. > Tom
[GitHub] tillrohrmann commented on a change in pull request #7345: [FLINK-11163][tests] Use random port in RestClusterClientTest
tillrohrmann commented on a change in pull request #7345: [FLINK-11163][tests] Use random port in RestClusterClientTest URL: https://github.com/apache/flink/pull/7345#discussion_r245025401 ## File path: flink-clients/src/test/java/org/apache/flink/client/program/rest/RestClusterClientTest.java ## @@ -167,20 +163,25 @@ private JobGraph jobGraph; private JobID jobId; - @Before - public void setUp() throws Exception { - MockitoAnnotations.initMocks(this); + private static final Configuration restConfig; + static { final Configuration config = new Configuration(); config.setString(JobManagerOptions.ADDRESS, "localhost"); config.setInteger(RestOptions.RETRY_MAX_ATTEMPTS, 10); config.setLong(RestOptions.RETRY_DELAY, 0); + config.setInteger(RestOptions.PORT, 0); + + restConfig = config; + } - restServerEndpointConfiguration = RestServerEndpointConfiguration.fromConfiguration(config); + @Before + public void setUp() throws Exception { + restServerEndpointConfiguration = RestServerEndpointConfiguration.fromConfiguration(restConfig); mockGatewayRetriever = () -> CompletableFuture.completedFuture(mockRestfulGateway); executor = Executors.newSingleThreadExecutor(new ExecutorThreadFactory(RestClusterClientTest.class.getSimpleName())); - final RestClient restClient = new RestClient(RestClientConfiguration.fromConfiguration(config), executor) { + restClient = new RestClient(RestClientConfiguration.fromConfiguration(restConfig), executor) { Review comment: I think we should only create a `RestClient` if we also create a `RestClusterClient`. Otherwise, we might not properly close the `RestClient` (e.g. in the `testRESTManualConfigurationOverride` test). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
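The review above concerns a test that sets `RestOptions.PORT` to 0 so the operating system assigns a free port. A minimal, Flink-independent sketch of why binding to port 0 avoids port conflicts:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortExample {

    // Binding to port 0 asks the OS for any free ephemeral port, so
    // concurrently running tests never race for a fixed port number.
    static int bindToFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            // getLocalPort() reveals the port the OS actually assigned
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Bound to port " + bindToFreePort());
    }
}
```

This is the same trick the patched test applies through `RestOptions.PORT`: the server binds to an OS-chosen port and reports the actual port after startup.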
[jira] [Commented] (FLINK-11220) Can not Select row time field in JOIN query
[ https://issues.apache.org/jira/browse/FLINK-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733129#comment-16733129 ] Fabian Hueske commented on FLINK-11220: --- Hi, I think manually assigning timestamps and watermarks should be the last resort. Users would need to have detailed knowledge about the execution strategy of the window join (or any other operator that affects timestamps and watermarks) to make a sound choice about the watermark generation strategy. In fact, I don't think that an average user will be able to choose the right watermark generation strategy. Hence, I think we should always try to forward watermarks and time attributes as record timestamps. There are a few things that we could do: 1) Add a query configuration switch to drop all timestamps and watermarks and keep the CAST option to choose the timestamp from multiple time attributes. 2) Add a method (or method variant) to choose the timestamp that should be forwarded (in case more than one time attribute is available) and a method (or method variant) to drop all time attributes and watermarks. What do you think? Fabian > Can not Select row time field in JOIN query > --- > > Key: FLINK-11220 > URL: https://issues.apache.org/jira/browse/FLINK-11220 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Affects Versions: 1.8.0 >Reporter: sunjincheng >Priority: Major > > SQL: > {code:java} > Orders...toTable(tEnv, 'orderId, 'orderTime.rowtime) > Payment...toTable(tEnv, 'orderId, 'payTime.rowtime) > SELECT orderTime, o.orderId, payTime > FROM Orders AS o JOIN Payment AS p > ON o.orderId = p.orderId AND > p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR > {code} > Execption: > {code:java} > org.apache.flink.table.api.TableException: Found more than one rowtime field: > [orderTime, payTime] in the table that should be converted to a DataStream. 
> Please select the rowtime field that should be used as event-time timestamp > for the DataStream by casting all other fields to TIMESTAMP. > at > org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:906) > {code} > The reason for the error is that we have 2 time fields `orderTime` and > `payTime`. I think we do not need to throw the exception, and we can remove > the logic of `plan.process(new OutputRowtimeProcessFunction[A](conversion, > rowtimeFields.head.getIndex))`. If we want to use the timestamp after > toDataStream, we should use `assignTimestampsAndWatermarks()`. > What do you think? [~twalthr] [~fhueske]
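The exception message points at the current workaround: keep one time attribute and cast the others to plain TIMESTAMP before converting the table to a DataStream. A sketch against the tables from this issue (column names taken from the report):

```sql
-- Keep payTime as the event-time attribute; cast orderTime so that only
-- one rowtime field remains when the result is converted to a DataStream.
SELECT CAST(orderTime AS TIMESTAMP) AS orderTime, o.orderId, payTime
FROM Orders AS o JOIN Payment AS p
  ON o.orderId = p.orderId
 AND p.payTime BETWEEN orderTime AND orderTime + INTERVAL '1' HOUR
```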
[GitHub] lamber-ken commented on issue #7396: [FLINK-11250][streaming] fix thread lack when StreamTask switched from DEPLOYING to CANCELING
lamber-ken commented on issue #7396: [FLINK-11250][streaming] fix thread lack when StreamTask switched from DEPLOYING to CANCELING URL: https://github.com/apache/flink/pull/7396#issuecomment-451164639 @zentol, I closed the streamRecordWriters in the `StreamTask#cancel` method and tested it on my Flink cluster. It works well now. Best, lamber-ken.
[jira] [Updated] (FLINK-11256) Referencing StreamNode objects directly in StreamEdge causes the sizes of JobGraph and TDD to become unnecessarily large
[ https://issues.apache.org/jira/browse/FLINK-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11256: --- Labels: pull-request-available (was: ) > Referencing StreamNode objects directly in StreamEdge causes the sizes of > JobGraph and TDD to become unnecessarily large > > > Key: FLINK-11256 > URL: https://issues.apache.org/jira/browse/FLINK-11256 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 1.7.0, 1.7.1 >Reporter: Haibo Suen >Assignee: Haibo Suen >Priority: Major > Labels: pull-request-available > > When a job graph is generated from StreamGraph, StreamEdge(s) on the stream > graph are serialized to StreamConfig and stored into the job graph. After > that, the serialized bytes will be included in the TDD and distributed to TM. > Because StreamEdge directly reference to StreamNode objects including > sourceVertex and targetVertex, these objects are also written transitively on > serializing StreamEdge. But these StreamNode objects are not needed in JM and > Task. For a large size topology, this will causes JobGraph/TDD to become much > larger than that actually need, and more likely to occur rpc timeout when > transmitted. > In StreamEdge, only the ID of StreamNode should be stored to avoid this > situation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] sunhaibotb opened a new pull request #7403: [FLINK-11256] Replace the reference of StreamNode object with ID in S…
sunhaibotb opened a new pull request #7403: [FLINK-11256] Replace the reference of StreamNode object with ID in S… URL: https://github.com/apache/flink/pull/7403 ## What is the purpose of the change This pull request replaces the references to StreamNode objects with their IDs in StreamEdge to reduce the sizes of the JobGraph and TDD. The referenced objects, including sourceVertex and targetVertex, are written transitively when a StreamEdge is serialized, but they are not needed in the JM and Task. For a large topology, this causes the JobGraph and TDD to become much larger than they need to be and makes RPC timeouts during transmission more likely. ## Brief change log - Replaces the reference of StreamNode object with ID in StreamEdge - Migrates methods getSourceVertex() and getTargetVertex() in StreamEdge to StreamGraph ## Verifying this change This change is already covered by existing tests, such as StreamGraphGeneratorTest, StreamingJobGraphGeneratorTest, etc. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable)
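The size effect described in this PR can be demonstrated with plain Java serialization. The classes below are a hypothetical stand-in for StreamEdge/StreamNode, not Flink's actual types: an edge that references node objects drags their payloads into its serialized form, while an edge that stores only node IDs stays small and lets the graph resolve IDs back to nodes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class EdgeSerializationSketch {

    static class Node implements Serializable {
        final int id;
        final byte[] payload = new byte[4096]; // stands in for heavy per-node data
        Node(int id) { this.id = id; }
    }

    // Edge holding object references: both Node payloads are serialized with it.
    static class EdgeWithRefs implements Serializable {
        final Node source, target;
        EdgeWithRefs(Node s, Node t) { source = s; target = t; }
    }

    // Edge holding only IDs: the serialized form is a few dozen bytes.
    static class EdgeWithIds implements Serializable {
        final int sourceId, targetId;
        EdgeWithIds(int s, int t) { sourceId = s; targetId = t; }
    }

    static int serializedSize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        Node a = new Node(1), b = new Node(2);
        System.out.println("with refs: " + serializedSize(new EdgeWithRefs(a, b))
                + " bytes, with IDs: " + serializedSize(new EdgeWithIds(a.id, b.id)) + " bytes");
    }
}
```

For a graph with many edges, the per-edge overhead of the object-reference variant multiplies, which is why the PR stores only IDs in StreamEdge.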
[jira] [Commented] (FLINK-11256) Referencing StreamNode objects directly in StreamEdge causes the sizes of JobGraph and TDD to become unnecessarily large
[ https://issues.apache.org/jira/browse/FLINK-11256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733025#comment-16733025 ] Haibo Suen commented on FLINK-11256: The simplest solution is to put the ID of StreamNode in StreamEdge. But as Till said, introducing a runtime StreamEdge type is cleaner and more reasonable. Also, I think the StreamConfig type is confusing. The chained head StreamConfig contains the information about task-wide, while the non-head StreamConfig contains the information about operator-wide. So I consider adopting the simplest solution to make a pull request for this JIRA, then creating another JIRA issue to introduce a runtime StreamEdge type and to rework StreamConfig. That issue is bigger than this. What do you think? > Referencing StreamNode objects directly in StreamEdge causes the sizes of > JobGraph and TDD to become unnecessarily large > > > Key: FLINK-11256 > URL: https://issues.apache.org/jira/browse/FLINK-11256 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 1.7.0, 1.7.1 >Reporter: Haibo Suen >Assignee: Haibo Suen >Priority: Major > > When a job graph is generated from StreamGraph, StreamEdge(s) on the stream > graph are serialized to StreamConfig and stored into the job graph. After > that, the serialized bytes will be included in the TDD and distributed to TM. > Because StreamEdge directly reference to StreamNode objects including > sourceVertex and targetVertex, these objects are also written transitively on > serializing StreamEdge. But these StreamNode objects are not needed in JM and > Task. For a large size topology, this will causes JobGraph/TDD to become much > larger than that actually need, and more likely to occur rpc timeout when > transmitted. > In StreamEdge, only the ID of StreamNode should be stored to avoid this > situation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] KarmaGYZ opened a new pull request #7402: [hotfix][docs] Fulfill some empty description in Config Doc
KarmaGYZ opened a new pull request #7402: [hotfix][docs] Fulfill some empty description in Config Doc URL: https://github.com/apache/flink/pull/7402 ## What is the purpose of the change This PR fills in some empty descriptions in the Config Doc and the relevant Javadoc. ## Brief change log Fills in empty descriptions in: - common section - core configuration section - history server configuration - job manager configuration - metric configuration - resource manager configuration - task manager configuration - web configuration - Javadoc in ResourceManagerOptions.java ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts (all no): - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? no
[GitHub] dawidwys commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc
dawidwys commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc URL: https://github.com/apache/flink/pull/7380#discussion_r244991333 ## File path: docs/dev/libs/cep.md ## @@ -1324,7 +1324,7 @@ For example, for a given pattern `b+ c` and a data stream `b1 b2 b3 c`, the diff Have a look also at another example to better see the difference between NO_SKIP and SKIP_TO_FIRST: -Pattern: `(a | c) (b | c) c+.greedy d` and sequence: `a b c1 c2 c3 d` Then the results will be: +Pattern: `(a | b | c) (b | c) c+.greedy d` and sequence: `a b c1 c2 c3 d` Then the results will be: Review comment: You're right. :( I'm afraid that if we remove the `(b | c)` it will not differ from `SKIP_TO_NEXT` which I would prefer to keep distinguished. Let's keep your suggestion from this PR.
[GitHub] KarmaGYZ commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc
KarmaGYZ commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc URL: https://github.com/apache/flink/pull/7380#discussion_r244988094 ## File path: docs/dev/libs/cep.md ## @@ -1373,7 +1372,7 @@ Pattern: `a b+` and sequence: `a b1 b2 b3` Then the results will be: After found matching a b1, the match process will not discard any result. -SKIP_TO_NEXT[b*] +SKIP_TO_NEXT[b] Review comment: Good Catch!
[GitHub] KarmaGYZ commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc
KarmaGYZ commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc URL: https://github.com/apache/flink/pull/7380#discussion_r244988003 ## File path: docs/dev/libs/cep.md ## @@ -1324,7 +1324,7 @@ For example, for a given pattern `b+ c` and a data stream `b1 b2 b3 c`, the diff Have a look also at another example to better see the difference between NO_SKIP and SKIP_TO_FIRST: -Pattern: `(a | c) (b | c) c+.greedy d` and sequence: `a b c1 c2 c3 d` Then the results will be: +Pattern: `(a | b | c) (b | c) c+.greedy d` and sequence: `a b c1 c2 c3 d` Then the results will be: Review comment: That was my first idea. But it seems that this pattern could not match c1 c2 c3 d either. IMO, just skipping the middle sequence while keeping the first and the last would help understanding.
[jira] [Commented] (FLINK-11222) Change api.scala.DataStream to api.datastream.DataStream for createHarnessTester in HarnessTestBase
[ https://issues.apache.org/jira/browse/FLINK-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732989#comment-16732989 ] Hequn Cheng commented on FLINK-11222: - [~aljoscha] Great to have your suggestions! > Change api.scala.DataStream to api.datastream.DataStream for > createHarnessTester in HarnessTestBase > --- > > Key: FLINK-11222 > URL: https://issues.apache.org/jira/browse/FLINK-11222 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Hequn Cheng >Assignee: Dian Fu >Priority: Major > > Thanks to FLINK-11074, we can create a harness tester from a DataStream, which > makes it easier to write harness tests. > However, it would be better if we change the parameter type from > api.scala.DataStream to api.datastream.DataStream for the > {{createHarnessTester()}} method, so that both java.DataStream and > scala.DataStream can use this method.
[jira] [Commented] (FLINK-11230) Sum of FlinkSql after two table union all.The value is too large.
[ https://issues.apache.org/jira/browse/FLINK-11230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732986#comment-16732986 ] Hequn Cheng commented on FLINK-11230: - [~jiweiautohome] The bottom aggregate after the window will still output values bigger than what you expect, even for a single stream. If you want to get early-fired results, you can use a non-window aggregate. For example, you can write the SQL as follows: {code:java} SELECT minuteTime, count(*) AS pv FROM ( SELECT getMinute(stime) as minuteTime FROM flink_test1 UNION ALL SELECT getMinute(stime) as minuteTime FROM flink_test2 ) t GROUP BY minuteTime {code} > Sum of FlinkSql after two table union all.The value is too large. > - > > Key: FLINK-11230 > URL: https://issues.apache.org/jira/browse/FLINK-11230 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Affects Versions: 1.7.0 >Reporter: jiwei >Priority: Blocker > Labels: test > Attachments: image-2019-01-02-14-18-33-890.png, > image-2019-01-02-14-18-43-710.png, screenshot-1.png > > > SELECT k AS KEY, SUM(p) AS pv > FROM ( > SELECT tumble_start(stime, INTERVAL '1' minute) AS k > , COUNT(*) AS p > FROM flink_test1 > GROUP BY tumble(stime, INTERVAL '1' minute) > UNION ALL > SELECT tumble_start(stime, INTERVAL '1' minute) AS k > , COUNT(*) AS p > FROM flink_test2 > GROUP BY tumble(stime, INTERVAL '1' minute) > ) t > GROUP BY k > The result of executing this SQL is about 7000 per minute and keeps > increasing. But the result is 60 per minute per table. Is there an error in > my SQL statement?
[jira] [Updated] (FLINK-11187) StreamingFileSink with S3 backend transient socket timeout issues
[ https://issues.apache.org/jira/browse/FLINK-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-11187: -- Fix Version/s: 1.8.0 > StreamingFileSink with S3 backend transient socket timeout issues > -- > > Key: FLINK-11187 > URL: https://issues.apache.org/jira/browse/FLINK-11187 > Project: Flink > Issue Type: Bug > Components: FileSystem, Streaming Connectors >Affects Versions: 1.7.0, 1.7.1 >Reporter: Addison Higham >Assignee: Addison Higham >Priority: Major > Fix For: 1.7.2, 1.8.0 > > > When using the StreamingFileSink with S3A backend, occasionally, errors like > this will occur: > {noformat} > Caused by: > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: > Your socket connection to the server was not read from or written to within > the timeout period. Idle connections will be closed. (Service: Amazon S3; > Status Code: 400; Error Code: RequestTimeout; Request ID: xxx; S3 Extended > Request ID: xxx, S3 Extended Request ID: xxx > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056){noformat} > This causes a restart of flink job, which is often able to recover from, but > under heavy load, this can become very frequent. 
> > Turning on debug logs you can find the following relevant stack trace: > {noformat} > 2018-12-17 05:55:46,546 DEBUG > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient - FYI: > failed to reset content inputstream before throwing up > java.io.IOException: Resetting to invalid mark > at java.io.BufferedInputStream.reset(BufferedInputStream.java:448) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:168) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.lastReset(AmazonHttpClient.java:1145) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1070) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) > at > 
org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3306) > at > org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3291) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:1576) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$8(WriteOperationHelper.java:474) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317) > at > org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256) > at >
[jira] [Commented] (FLINK-8823) Add network profiling/diagnosing metrics.
[ https://issues.apache.org/jira/browse/FLINK-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732964#comment-16732964 ] boshu Zheng commented on FLINK-8823: I haven't started working on this issue yet. Do you have any ideas, [~pnowojski]? > Add network profiling/diagnosing metrics. > - > > Key: FLINK-8823 > URL: https://issues.apache.org/jira/browse/FLINK-8823 > Project: Flink > Issue Type: Sub-task > Components: Network >Affects Versions: 1.5.0 >Reporter: Piotr Nowojski >Assignee: boshu Zheng >Priority: Major >
[GitHub] kisimple opened a new pull request #7401: [hotfix][doc] Fix an error in state_backends
kisimple opened a new pull request #7401: [hotfix][doc] Fix an error in state_backends URL: https://github.com/apache/flink/pull/7401 The class of the state backend factory to be implemented is `StateBackendFactory`, not `FsStateBackendFactory`.
[jira] [Commented] (FLINK-11254) Unify serialization format of savepoint for switching state backends
[ https://issues.apache.org/jira/browse/FLINK-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732948#comment-16732948 ] Congxian Qiu commented on FLINK-11254: -- Thanks for your quick response [~tzulitai], I'll first write the document to clarify the format of each different backend/state structure as you suggested. > Unify serialization format of savepoint for switching state backends > > > Key: FLINK-11254 > URL: https://issues.apache.org/jira/browse/FLINK-11254 > Project: Flink > Issue Type: Improvement >Affects Versions: 1.7.1 >Reporter: Congxian Qiu >Assignee: Congxian Qiu >Priority: Major > > For the current version, the serialization formats of savepoint between > HeapKeyedStateBackend and RocksDBStateBackend are different, so we can not > switch state backend when using savepoint. We should unify the serialization > formats of the savepoint to support state backend switch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-11226) Lack of getKeySelector in Scala KeyedStream API unlike Java KeyedStream
[ https://issues.apache.org/jira/browse/FLINK-11226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732942#comment-16732942 ] Albert Bikeev edited comment on FLINK-11226 at 1/3/19 11:27 AM: Yep. In short, in my case - a window size (`EventTimeSessionWindows.withDynamicGap`) is determined by the key value. And key value itself is determined by the underlying data. So I need a way to access key by a data-point. I see now, that I could drag key-extracting lambda all around with me, but it seems rather tedious. Why not access it via KeyedStream? Thanks. was (Author: albert.bik...@gmail.com): Yep. In short, in my case - a window size (`EventTimeSessionWindows.withDynamicGap`) is determined by the key value. And key value itself is determined by the underlying data. So I need a way to access key a data-point. I see now, that I could drag key-extracting lambda all around with me, but it seems rather tedious. Why not access it via KeyedStream? Thanks. > Lack of getKeySelector in Scala KeyedStream API unlike Java KeyedStream > --- > > Key: FLINK-11226 > URL: https://issues.apache.org/jira/browse/FLINK-11226 > Project: Flink > Issue Type: Improvement > Components: DataStream API >Affects Versions: 1.7.1 >Reporter: Albert Bikeev >Assignee: vinoyang >Priority: Major > > There is no simple way to access key via Scala KeyedStream API because there > is no > getKeySelector method, unlike in Java KeyedStream. > Temporary workarounds are appreciated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-11254) Unify serialization format of savepoint for switching state backends
[ https://issues.apache.org/jira/browse/FLINK-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732931#comment-16732931 ] Tzu-Li (Gordon) Tai edited comment on FLINK-11254 at 1/3/19 11:20 AM: -- Yes, we've discussed on this a bit offline. I'll be willing to shepherd this effort [~klion26]. Before anything else, to kick this off, it would be best if we can start with a document that first clarifies the format of each different backend / state structure. This is something that is already missing as of the current state - there is no clear documentation of the serialization formats of state backends. With this sorted out first, it'll then be easier to discuss towards a draft solution for the unified format and migration paths, which we can then propose to the community as a FLIP. was (Author: tzulitai): Yes, we've discussed on this a bit offline during Flink Forward. I'll be willing to shepherd this effort [~klion26]. Before anything else, to kick this off, it would be best if we can start with a document that first clarifies the format of each different backend / state structure. This is something that is already missing as of the current state - there is no clear documentation of the serialization formats of state backends. With this sorted out first, it'll then be easier to discuss towards a draft solution for the unified format and migration paths, which we can then propose to the community as a FLIP. > Unify serialization format of savepoint for switching state backends > > > Key: FLINK-11254 > URL: https://issues.apache.org/jira/browse/FLINK-11254 > Project: Flink > Issue Type: Improvement >Affects Versions: 1.7.1 >Reporter: Congxian Qiu >Assignee: Congxian Qiu >Priority: Major > > For the current version, the serialization formats of savepoint between > HeapKeyedStateBackend and RocksDBStateBackend are different, so we can not > switch state backend when using savepoint. 
We should unify the serialization > formats of the savepoint to support state backend switch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11249) FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer
[ https://issues.apache.org/jira/browse/FLINK-11249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732912#comment-16732912 ] vinoyang commented on FLINK-11249: -- [~aljoscha] Understood, I will provide a custom serializer for the {{NextTransactionalIdHint}} state class. > FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer > --- > > Key: FLINK-11249 > URL: https://issues.apache.org/jira/browse/FLINK-11249 > Project: Flink > Issue Type: Sub-task > Components: Kafka Connector >Affects Versions: 1.7.0, 1.7.1 >Reporter: Piotr Nowojski >Assignee: vinoyang >Priority: Blocker > Fix For: 1.7.2, 1.8.0 > > > As reported by a user on the mailing list "How to migrate Kafka Producer ?" > (on 18th December 2018), {{FlinkKafkaProducer011}} can not be migrated to > {{FlinkKafkaProducer}} and the same problem can occur in the future Kafka > producer versions/refactorings. > The issue is that {{ListState > FlinkKafkaProducer#nextTransactionalIdHintState}} field is serialized using > java serializers and this is causing problems/collisions on > {{FlinkKafkaProducer011.NextTransactionalIdHint}} vs > {{FlinkKafkaProducer.NextTransactionalIdHint}}. > To fix that we probably need to release new versions of those classes, that > will rewrite/upgrade this state field to a new one, that doesn't rely on > java serialization. After this, we could drop the support for the old field > and that in turn will allow users to upgrade from 0.11 connector to the > universal one. > One bright side is that technically speaking our {{FlinkKafkaProducer011}} > has the same compatibility matrix as the universal one (it's also forward & > backward compatible with the same Kafka versions), so for the time being > users can stick to {{FlinkKafkaProducer011}}.
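The custom serializer vinoyang mentions could, for instance, write the state's fields with an explicit binary layout instead of Java serialization, so the owning class can later be renamed or moved without breaking restore. A hedged sketch, assuming the hint holds an int parallelism and a long next-free transactional id; the class name and layout here are illustrative, not Flink's actual serializer:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hedged sketch: a stable, class-independent binary layout for the
// hint state, instead of Java serialization of the enclosing class.
public class IdHintSerde {

    public static byte[] serialize(int lastParallelism, long nextFreeTransactionalId) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeInt(lastParallelism);          // fixed 4-byte field
            out.writeLong(nextFreeTransactionalId); // fixed 8-byte field
        } catch (IOException e) {
            throw new UncheckedIOException(e);      // cannot happen on in-memory streams
        }
        return bos.toByteArray();
    }

    public static long[] deserialize(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return new long[] { in.readInt(), in.readLong() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] roundTrip = deserialize(serialize(4, 42L));
        System.out.println(roundTrip[0] + " " + roundTrip[1]); // prints "4 42"
    }
}
```

Because nothing of the owning class's identity is written, `FlinkKafkaProducer011.NextTransactionalIdHint` and `FlinkKafkaProducer.NextTransactionalIdHint` could share the format, which is the migration property the ticket asks for.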
[jira] [Commented] (FLINK-11226) Lack of getKeySelector in Scala KeyedStream API unlike Java KeyedStream
[ https://issues.apache.org/jira/browse/FLINK-11226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732894#comment-16732894 ] vinoyang commented on FLINK-11226: -- I guess there may be some need for extension and customization. [~albert.bik...@gmail.com] can you tell your reason? > Lack of getKeySelector in Scala KeyedStream API unlike Java KeyedStream > --- > > Key: FLINK-11226 > URL: https://issues.apache.org/jira/browse/FLINK-11226 > Project: Flink > Issue Type: Improvement > Components: DataStream API >Affects Versions: 1.7.1 >Reporter: Albert Bikeev >Assignee: vinoyang >Priority: Major > > There is no simple way to access key via Scala KeyedStream API because there > is no > getKeySelector method, unlike in Java KeyedStream. > Temporary workarounds are appreciated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] dawidwys commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc
dawidwys commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc URL: https://github.com/apache/flink/pull/7380#discussion_r244961085 ## File path: docs/dev/libs/cep.md ## @@ -1373,7 +1372,7 @@ Pattern: `a b+` and sequence: `a b1 b2 b3` Then the results will be: After found matching a b1, the match process will not discard any result. -SKIP_TO_NEXT[b*] +SKIP_TO_NEXT[b] Review comment: The pattern variable here is invalid; `SKIP_TO_NEXT` does not take one. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] dawidwys commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc
dawidwys commented on a change in pull request #7380: [hotfix][docs] Fix incorrect example in cep doc URL: https://github.com/apache/flink/pull/7380#discussion_r244960959 ## File path: docs/dev/libs/cep.md ## @@ -1324,7 +1324,7 @@ For example, for a given pattern `b+ c` and a data stream `b1 b2 b3 c`, the diff Have a look also at another example to better see the difference between NO_SKIP and SKIP_TO_FIRST: -Pattern: `(a | c) (b | c) c+.greedy d` and sequence: `a b c1 c2 c3 d` Then the results will be: +Pattern: `(a | b | c) (b | c) c+.greedy d` and sequence: `a b c1 c2 c3 d` Then the results will be: Review comment: I think `(a | b) (b | c) c+.greedy d` would be simpler and still valid, right?
[jira] [Commented] (FLINK-11257) FlinkKafkaConsumer should support assign partition
[ https://issues.apache.org/jira/browse/FLINK-11257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732855#comment-16732855 ] shengjk1 commented on FLINK-11257: -- Hi [~till.rohrmann], to keep subsequent code compatible, I think we should change FlinkKafkaConsumerBase: add a class KafkaTopicAndPartition and an interface KafkaDescriptor, have both KafkaTopicAndPartition and KafkaTopicsDescriptor implement KafkaDescriptor, change FlinkKafkaConsumerBase's topicsDescriptor field to the KafkaDescriptor type, and add a constructor to FlinkKafkaConsumerBase that sets this.kafkaDescriptor = new KafkaTopicsAndPartitionDescriptor(topicAndPartitions, topicAndPartitionPattern). There may be better ways; discussion is welcome. > FlinkKafkaConsumer should support assign partition > --- > > Key: FLINK-11257 > URL: https://issues.apache.org/jira/browse/FLINK-11257 > Project: Flink > Issue Type: Sub-task > Components: Kafka Connector >Affects Versions: 1.7.1 >Reporter: shengjk1 >Assignee: vinoyang >Priority: Major > > I find that Flink 1.7 also has the universal Kafka connector; it would be better if the connector supported assigning specific partitions. For example, if a Kafka topic has 3 partitions and I only use 1 partition, I still have to read all partitions and then filter, which not only wastes resources but is also relatively inefficient. So I suggest FlinkKafkaConsumer should support assigning partitions.
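The descriptor hierarchy the comment proposes might look roughly like the sketch below. All type names are taken from the comment itself; the code is a hypothetical illustration, not Flink source:

```java
import java.util.List;

// Hedged sketch of the proposal: a common KafkaDescriptor so the
// consumer base can be built either from whole topics (current
// behaviour) or from explicit topic/partition pairs.
interface KafkaDescriptor { }

class KafkaTopicsDescriptor implements KafkaDescriptor {
    final List<String> topics;
    KafkaTopicsDescriptor(List<String> topics) { this.topics = topics; }
}

class KafkaTopicAndPartition implements KafkaDescriptor {
    final String topic;
    final int partition;
    KafkaTopicAndPartition(String topic, int partition) {
        this.topic = topic;
        this.partition = partition;
    }
}

public class ConsumerSketch {
    // FlinkKafkaConsumerBase's field would widen from
    // KafkaTopicsDescriptor to the KafkaDescriptor interface:
    final KafkaDescriptor descriptor;

    ConsumerSketch(KafkaDescriptor descriptor) { this.descriptor = descriptor; }

    public static void main(String[] args) {
        // subscribe to a single partition instead of the whole topic
        ConsumerSketch c = new ConsumerSketch(new KafkaTopicAndPartition("events", 1));
        System.out.println(c.descriptor instanceof KafkaTopicAndPartition); // prints "true"
    }
}
```

The design choice here is to widen one field's type rather than add parallel code paths, which keeps existing topic-based constructors compatible.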
[jira] [Closed] (FLINK-11213) Is there any condition that large amount of redis connection created on each TM?
[ https://issues.apache.org/jira/browse/FLINK-11213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek closed FLINK-11213. Resolution: Invalid Please ask questions like this on the user mailing lists. > Is there any condition that large amount of redis connection created on each > TM? > > > Key: FLINK-11213 > URL: https://issues.apache.org/jira/browse/FLINK-11213 > Project: Flink > Issue Type: Task >Reporter: lzh9 >Priority: Major > > In the job, a large amount of redis connections are created on each TM; are there any ideas? Code like:
> def main(args: Array[String]): Unit = {
>   val env = StreamExecutionEnvironment.getExecutionEnvironment
>   env.addSource(new Source).setParallelism(4).addSink(new Sinker).setParallelism(4)
>   env.execute()
> }
> class Sinker extends RichSinkFunction[String] {
>   lazy val applicationContext = new ClassPathXmlApplicationContext("application-redis-context.xml")
>   lazy val redisTemplate = applicationContext.getBean("redisTemplate", classOf[RedisTemplate[String, String]])
>   override def invoke(value: String, context: SinkFunction.Context[_]): Unit = {
>     val value = redisTemplate.opsForValue().get("key01")
>     println(value)
>   }
> }
[jira] [Commented] (FLINK-11222) Change api.scala.DataStream to api.datastream.DataStream for createHarnessTester in HarnessTestBase
[ https://issues.apache.org/jira/browse/FLINK-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732844#comment-16732844 ] Aljoscha Krettek commented on FLINK-11222: -- This change makes sense! +1 > Change api.scala.DataStream to api.datastream.DataStream for > createHarnessTester in HarnessTestBase > --- > > Key: FLINK-11222 > URL: https://issues.apache.org/jira/browse/FLINK-11222 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Hequn Cheng >Assignee: Dian Fu >Priority: Major > > Thanks to FLINK-11074, we can create harness tester from a DataStream which > makes easier to write harness test. > However, it would be better if we change the parameter type from > api.scala.DataStream to api.datastream.DataStream for the > \{{createHarnessTester()}} method, so that both java.DataStream and > scala.DataStream can use this method. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11254) Unify serialization format of savepoint for switching state backends
[ https://issues.apache.org/jira/browse/FLINK-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732839#comment-16732839 ] Congxian Qiu commented on FLINK-11254: -- Thanks for the quick response [~till.rohrmann]. We had some offline discussion with [~tzulitai] and will discuss more to form a design doc for review before implementation. Does this sound good to you? Thanks > Unify serialization format of savepoint for switching state backends > > > Key: FLINK-11254 > URL: https://issues.apache.org/jira/browse/FLINK-11254 > Project: Flink > Issue Type: Improvement >Affects Versions: 1.7.1 >Reporter: Congxian Qiu >Assignee: Congxian Qiu >Priority: Major > > For the current version, the serialization formats of savepoint between > HeapKeyedStateBackend and RocksDBStateBackend are different, so we can not > switch state backend when using savepoint. We should unify the serialization > formats of the savepoint to support state backend switch. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11226) Lack of getKeySelector in Scala KeyedStream API unlike Java KeyedStream
[ https://issues.apache.org/jira/browse/FLINK-11226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732836#comment-16732836 ] Aljoscha Krettek commented on FLINK-11226: -- {{getKeySelector()}} is an internal method that shouldn't be exposed. What's your reason for using it? > Lack of getKeySelector in Scala KeyedStream API unlike Java KeyedStream > --- > > Key: FLINK-11226 > URL: https://issues.apache.org/jira/browse/FLINK-11226 > Project: Flink > Issue Type: Improvement > Components: DataStream API >Affects Versions: 1.7.1 >Reporter: Albert Bikeev >Assignee: vinoyang >Priority: Major > > There is no simple way to access key via Scala KeyedStream API because there > is no > getKeySelector method, unlike in Java KeyedStream. > Temporary workarounds are appreciated.
[jira] [Updated] (FLINK-11228) Flink web UI can not read stdout logs
[ https://issues.apache.org/jira/browse/FLINK-11228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-11228: -- Priority: Minor (was: Blocker) > Flink web UI can not read stdout logs > - > > Key: FLINK-11228 > URL: https://issues.apache.org/jira/browse/FLINK-11228 > Project: Flink > Issue Type: Bug > Components: Web Client >Affects Versions: 1.7.1 >Reporter: wxmimperio >Assignee: leesf >Priority: Minor > Attachments: image-2018-12-28-18-58-41-111.png, > image-2018-12-28-19-00-17-652.png, image-2018-12-28-19-01-33-793.png, > image-2018-12-28-19-02-55-770.png > > > I start flink as local cluster. > $ tar xzf flink-*.tgz > $ cd flink-1.6.1 > $ ./bin/start-cluster.sh # Start Flink > I createRemoteEnvironment to submit a remote job and job run successful. > However when I read stdout log from web ui, I find nothing. And can see job > manager logs. > I can find logs on linux server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11231) Unified the definition of Java DataStream and Scala DataStream interface for set parallelism
[ https://issues.apache.org/jira/browse/FLINK-11231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732833#comment-16732833 ] Aljoscha Krettek commented on FLINK-11231: -- Also, {{getTransformation()}} is an internal method that shouldn't really be exposed. > Unified the definition of Java DataStream and Scala DataStream interface for > set parallelism > > > Key: FLINK-11231 > URL: https://issues.apache.org/jira/browse/FLINK-11231 > Project: Flink > Issue Type: Improvement > Components: DataStream API >Affects Versions: 1.8.0 >Reporter: sunjincheng >Priority: Major > > For Unified the definition of Java DataStream and Scala DataStream interface, > in this JIRA. will do the following changes: > # Remove or deprecated `setParallelism` and `setMaxParallelism` for > `DataStream.scala` > # Add `getTransformation` for for `DataStream.scala`. > What to you think? [~till.rohrmann] welcome any feedback... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11226) Lack of getKeySelector in Scala KeyedStream API unlike Java KeyedStream
[ https://issues.apache.org/jira/browse/FLINK-11226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-11226: - Issue Type: Improvement (was: Bug) > Lack of getKeySelector in Scala KeyedStream API unlike Java KeyedStream > --- > > Key: FLINK-11226 > URL: https://issues.apache.org/jira/browse/FLINK-11226 > Project: Flink > Issue Type: Improvement > Components: DataStream API >Affects Versions: 1.7.1 >Reporter: Albert Bikeev >Assignee: vinoyang >Priority: Major > > There is no simple way to access key via Scala KeyedStream API because there > is no > getKeySelector method, unlike in Java KeyedStream. > Temporary workarounds are appreciated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11250) fix thread leaked when StreamTask switched from DEPLOYING to CANCELING
[ https://issues.apache.org/jira/browse/FLINK-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-11250: -- Description: Since flink-1.5.x, streamRecordWriters are created in StreamTask's constructor, which starts the OutputFlusher daemon thread, so when a task switches from DEPLOYING to CANCELING state the daemon thread is leaked.

*reproducible example*
{code:java}
public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(5000);
    env
        .addSource(new SourceFunction<String>() {
            @Override
            public void run(SourceContext<String> ctx) throws Exception {
                for (int i = 0; i < 1; i++) {
                    Thread.sleep(100);
                    ctx.collect("data " + i);
                }
            }

            @Override
            public void cancel() {
            }
        })
        .addSink(new RichSinkFunction<String>() {
            @Override
            public void open(Configuration parameters) throws Exception {
                System.out.println(1 / 0); // fails the sink in open(), triggering cancellation
            }

            @Override
            public void invoke(String value, Context context) throws Exception {
            }
        }).setParallelism(2);
    env.execute();
}
{code}

*some useful log*
{code:java}
2019-01-02 03:03:47.525 [thread==> jobmanager-future-thread-2] executiongraph.Execution#transitionState:1316 Source: Custom Source (1/1) (74a4ed4bb2f80aa2b98e11bd09ea64ef) switched from CREATED to SCHEDULED.
2019-01-02 03:03:47.526 [thread==> flink-akka.actor.default-dispatcher-5] slotpool.SlotPool#allocateSlot:326 Received slot request [SlotRequestId{12bfcf1674f5b96567a076086dbbfd1b}] for task: Attempt #1 (Source: Custom Source (1/1)) @ (unassigned) - [SCHEDULED]
2019-01-02 03:03:47.527 [thread==> flink-akka.actor.default-dispatcher-5] slotpool.SlotSharingManager#createRootSlot:151 Create multi task slot [SlotRequestId{494e47eb8318e2c0a1db91dda6b8}] in slot [SlotRequestId{6d7f0173c1d48e5559f6a14b080ee817}].
2019-01-02 03:03:47.527 [thread==> flink-akka.actor.default-dispatcher-5] slotpool.SlotSharingManager$MultiTaskSlot#allocateSingleTaskSlot:426 Create single task slot [SlotRequestId{12bfcf1674f5b96567a076086dbbfd1b}] in multi task slot [SlotRequestId{494e47eb8318e2c0a1db91dda6b8}] for group bc764cd8ddf7a0cff126f51c16239658. 2019-01-02 03:03:47.528 [thread==> flink-akka.actor.default-dispatcher-2] slotpool.SlotSharingManager$MultiTaskSlot#allocateSingleTaskSlot:426 Create single task slot [SlotRequestId{8a877431375df8aeadb2fd845cae15fc}] in multi task slot [SlotRequestId{494e47eb8318e2c0a1db91dda6b8}] for group 0a448493b4782967b150582570326227. 2019-01-02 03:03:47.528 [thread==> flink-akka.actor.default-dispatcher-2] slotpool.SlotSharingManager#createRootSlot:151 Create multi task slot [SlotRequestId{56a36d3902ee1a7d0e2e84f50039c1ca}] in slot [SlotRequestId{dbf5c9fa39f1e5a0b34a4a8c10699ee5}]. 2019-01-02 03:03:47.528 [thread==> flink-akka.actor.default-dispatcher-2] slotpool.SlotSharingManager$MultiTaskSlot#allocateSingleTaskSlot:426 Create single task slot [SlotRequestId{5929c12b52dccee682f86afbe1cff5cf}] in multi task slot [SlotRequestId{56a36d3902ee1a7d0e2e84f50039c1ca}] for group 0a448493b4782967b150582570326227. 2019-01-02 03:03:47.529 [thread==> flink-akka.actor.default-dispatcher-5] executiongraph.Execution#transitionState:1316 Source: Custom Source (1/1) (74a4ed4bb2f80aa2b98e11bd09ea64ef) switched from SCHEDULED to DEPLOYING. 
2019-01-02 03:03:47.529 [thread==> flink-akka.actor.default-dispatcher-5] executiongraph.Execution#deploy:576 Deploying Source: Custom Source (1/1) (attempt #1) to localhost 2019-01-02 03:03:47.530 [thread==> flink-akka.actor.default-dispatcher-2] state.TaskExecutorLocalStateStoresManager#localStateStoreForSubtask:162 Registered new local state store with configuration LocalRecoveryConfig{localRecoveryMode=false, localStateDirectories=LocalRecoveryDirectoryProvider{rootDirectories=[/tmp/localState/aid_AllocationID{7b5faad9073d7fac6759e40981197b8d}], jobID=06e76f6e31728025b22fdda9fadd6f01, jobVertexID=bc764cd8ddf7a0cff126f51c16239658, subtaskIndex=0}} for 06e76f6e31728025b22fdda9fadd6f01 - bc764cd8ddf7a0cff126f51c16239658 - 0 under allocation id AllocationID{7b5faad9073d7fac6759e40981197b8d}. 2019-01-02 03:03:47.530 [thread==> flink-akka.actor.default-dispatcher-2] partition.ResultPartition#:172 Source: Custom Source (1/1) (74a4ed4bb2f80aa2b98e11bd09ea64ef): Initialized ResultPartition 85c7415bae559d6198b8bb69d4c6e49f@74a4ed4bb2f80aa2b98e11bd09ea64ef [PIPELINED_BOUNDED, 2 subpartitions, 2 pending references] 2019-01-02 03:03:47.530 [thread==> flink-akka.actor.default-dispatcher-2]
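The nature of the leak, and the shape of a possible fix, can be sketched without Flink: start the flusher thread only when the task actually runs (not in the constructor), and interrupt and join it in a finally block so a task cancelled while still DEPLOYING never leaves the daemon behind. A hedged, self-contained sketch, not Flink's actual StreamTask/OutputFlusher code:

```java
// Hedged sketch of a leak-free flusher lifecycle: lazy start plus
// guaranteed interrupt+join on the way out.
public class FlusherLifecycle {

    static final class OutputFlusher extends Thread {
        private volatile boolean running = true;

        OutputFlusher() {
            setDaemon(true);
            setName("OutputFlusher");
        }

        @Override
        public void run() {
            while (running) {
                try {
                    Thread.sleep(10); // stand-in for periodic flushing
                } catch (InterruptedException e) {
                    return; // cooperative shutdown
                }
            }
        }

        void terminate() {
            running = false;
            interrupt();
            try {
                join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        OutputFlusher flusher = new OutputFlusher();
        try {
            flusher.start(); // started at run time, not at construction
            // ... the task body would execute here ...
        } finally {
            flusher.terminate(); // guaranteed cleanup on cancel/failure
        }
        System.out.println(flusher.isAlive()); // prints "false"
    }
}
```

If the thread were started in the constructor instead, a failure between construction and run (the DEPLOYING-to-CANCELING transition in the ticket) would skip the finally block and leak the daemon, which is what the logs above show.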
[jira] [Commented] (FLINK-11231) Unified the definition of Java DataStream and Scala DataStream interface for set parallelism
[ https://issues.apache.org/jira/browse/FLINK-11231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732829#comment-16732829 ] Aljoscha Krettek commented on FLINK-11231: -- The Scala API and Java API work slightly differently. For Java, specific operations are represented by a {{SingleOutputStreamOperator}}, which is a subclass of {{DataStream}}. For Scala, everything is a {{DataStream}}. For Java it's like this because you technically can only set a parallelism on an actual operation. Both have a method {{setParallelism()}}. If we wanted to unify the interfaces, we would have to unify how {{DataStream}} behaves, changing methods is not enough. > Unified the definition of Java DataStream and Scala DataStream interface for > set parallelism > > > Key: FLINK-11231 > URL: https://issues.apache.org/jira/browse/FLINK-11231 > Project: Flink > Issue Type: Improvement > Components: DataStream API >Affects Versions: 1.8.0 >Reporter: sunjincheng >Priority: Major > > For Unified the definition of Java DataStream and Scala DataStream interface, > in this JIRA. will do the following changes: > # Remove or deprecated `setParallelism` and `setMaxParallelism` for > `DataStream.scala` > # Add `getTransformation` for for `DataStream.scala`. > What to you think? [~till.rohrmann] welcome any feedback... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10981) Add or modify metrics to show the maximum usage of InputBufferPool/OutputBufferPool to help debugging back pressure
[ https://issues.apache.org/jira/browse/FLINK-10981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732826#comment-16732826 ] Piotr Nowojski commented on FLINK-10981: [~gaoyunhaii] aren't the already existing metrics enough and basically equivalent to what you are proposing?
* {{totalQueueLen}} Total number of queued buffers in all input/output channels.
* {{minQueueLen}} Minimum number of queued buffers in all input/output channels.
* {{maxQueueLen}} Maximum number of queued buffers in all input/output channels.
* {{avgQueueLen}} Average number of queued buffers in all input/output channels.
https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#network
Actually, one thing I was missing more is the aggregation of the metrics: metrics could be collected at the lowest possible level (some at the operator level, others at the task level, etc) and then aggregated up: operator -> operator chain (?) -> task -> stage (all tasks doing the same thing across multiple task managers) -> job. Each level could present both aggregated stats AND define some new custom ones. I was especially missing something like this while analysing where the bottleneck was on a large cluster with a huge job that had ~50 tasks with parallelism ~200. > Add or modify metrics to show the maximum usage of > InputBufferPool/OutputBufferPool to help debugging back pressure > --- > > Key: FLINK-10981 > URL: https://issues.apache.org/jira/browse/FLINK-10981 > Project: Flink > Issue Type: Improvement > Components: Metrics, Network >Reporter: Yun Gao >Assignee: Yun Gao >Priority: Major > > Currently the network layer has provided two metrics items, namely > _InputBufferPoolUsageGauge_ and _OutputBufferPoolUsageGauge_ to show the > usage of input buffer pool and output buffer pool. When there are multiple > inputs(SingleInputGate) or outputs(ResultPartition), the two metrics items > show their average usage.
> > However, we found that the maximum usage of all the InputBufferPool or > OutputBufferPool is also useful in debugging back pressure. Suppose we have a > job with the following job graph: > > {code:java} > F >\ > \ > _\/ > A ---> B > C ---> D >\ > \ > \-> E > {code} > Besides, also suppose D is very slow and thus cause back pressure, but E is > very fast and F outputs few records, thus the usage of the corresponding > input/output buffer pool is almost 0. > > Then the average input/output buffer usage of each task will be: > > {code:java} > A(100%) --> (100%) B (50%) --> (50%) C (100%) --> (100%) D > {code} > > > But the maximum input/output buffer usage of each task will be: > > {code:java} > A(100%) --> (100%) B (100%) --> (100%) C (100%) --> (100%) D > {code} > Users will be able to find the slowest task by finding the first task whose > input buffer usage is 100% but output usage is less than 100%. > > > If it is reasonable to show the maximum input/output buffer usage, I think > there may be three options: > # Modify the current computation logic of _InputBufferPoolUsageGauge_ and > _OutputBufferPoolUsageGauge._ > # Add two _new metrics items InputBufferPoolMaxUsageGauge and > OutputBufferPoolMaxUsageGauge._ > # Try to show distinct usage for each input/output buffer pool. > and I think maybe the second option is the most preferred. > > How do you think about that? > > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
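Option 2 from the ticket (a separate max-usage gauge alongside the existing average) can be illustrated with task B from the example, where one input is saturated and the other is nearly idle. A hypothetical sketch, not Flink's actual gauge implementation:

```java
// Hedged sketch: averaging hides a single saturated buffer pool,
// while the maximum exposes it.
public class BufferUsage {

    static float avgUsage(float[] poolUsages) {
        if (poolUsages.length == 0) return 0f;
        float sum = 0f;
        for (float u : poolUsages) sum += u;
        return sum / poolUsages.length;
    }

    static float maxUsage(float[] poolUsages) {
        float max = 0f;
        for (float u : poolUsages) max = Math.max(max, u);
        return max;
    }

    public static void main(String[] args) {
        // Task B from the example: one input gate fully backed up (D's
        // path), the other (F's path) almost empty.
        float[] usages = { 1.0f, 0.0f };
        System.out.println("avg=" + avgUsage(usages) + " max=" + maxUsage(usages));
        // avg reports 0.5 and masks the back pressure; max reports 1.0
    }
}
```

With a max gauge, the heuristic from the description works: the slowest task is the first one whose max input usage is 100% while its output usage stays below 100%.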