Re: Performance test Flink vs Storm

2020-07-16 Thread Xintong Song
y have large enough heap space, then you can hardly benefit from further increasing it. I'm not aware of any benchmark for Kafka connectors. You can check flink-benchmarks[1], and maybe fork the repository and develop your own Kafka connector benchmark based on it. Thank you~ Xintong

Re: Performance test Flink vs Storm

2020-07-16 Thread Xintong Song
> > *I had set Checkpoint to use the Job manager backend.* The JobManager backend also runs in JVM heap space and does not use managed memory. Setting the managed memory fraction to 0 will give you a larger JVM heap space, and thus less GC pressure. Thank you~ Xintong Song On Thu, Jul 16, 2020 at

Re: Error --GC Cleaner Provider -- Flink 1.11.0

2020-07-13 Thread Xintong Song
Hi Murali, A proper fix for this problem could take some time. It may or may not make it into the next bug-fix release (1.11.1). In the meantime, you can try to work around this by upgrading your JDK 8 to a recent release, e.g., the latest JDK8u252 [1]. Thank you~ Xintong Song [1] https

Re: Error --GC Cleaner Provider -- Flink 1.11.0

2020-07-13 Thread Xintong Song
FYI, I've opened FLINK-18581[1] for tracking this. Thank you~ Xintong Song [1] https://issues.apache.org/jira/browse/FLINK-18581 On Mon, Jul 13, 2020 at 4:54 PM Xintong Song wrote: > I think the problem is that the package-private method > `Reference.tryHandlePending` does n

Re: Error --GC Cleaner Provider -- Flink 1.11.0

2020-07-13 Thread Xintong Song
I think the problem is that the package-private method `Reference.tryHandlePending` does not exist in 1.8.0_40. The method does not exist in OpenJDK 8u40[1], but can be found in the latest AdoptOpenJDK [2]. It seems the method was first introduced in 8u202[3]. Thank you~ Xintong Song [1] https

Re: Implications of taskmanager.memory.process.size

2020-07-09 Thread Xintong Song
the `process.size`. The Flink framework will still use some off-heap memory, for purposes like network buffering and JVM overhead. Thank you~ Xintong Song On Thu, Jul 9, 2020 at 10:57 PM Vishal Santoshi wrote: > ager.memory.process.size(none)MemorySizeTotal Process Memory size for

Re: Manual allocation of slot usage

2020-07-07 Thread Xintong Song
y TMs do you have? And how many slots does each TM have? Thank you~ Xintong Song [1] https://issues.apache.org/jira/browse/FLINK-12122 On Tue, Jul 7, 2020 at 8:33 PM Mu Kong wrote: > Hi, Guo, > > Thanks for helping out. > > My application has a kafka source with 60 subtasks(

Re: Heartbeat of TaskManager timed out.

2020-07-07 Thread Xintong Song
Thanks for the updates, Ori. I'm not familiar with Scala. Just curious, if what you suspect is true, is it a bug of Scala? Thank you~ Xintong Song On Tue, Jul 7, 2020 at 1:41 PM Ori Popowski wrote: > Hi, > > I just wanted to update that the problem is now solved! > > I

Re: Flink AskTimeoutException killing the jobs

2020-07-05 Thread Xintong Song
As I already mentioned, > I would suggest looking into the jobmanager logs and gc logs, to see if > there's any problem that prevents the process from handling the rpc messages > in a timely manner. > The Akka ask timeout does not seem to be the root problem to me. Thank you~ Xintong Song

Re: Re: How to dynamically initialize flink metrics in invoke method and then reuse it?

2020-07-02 Thread Xintong Song
Thank you~ Xintong Song On Fri, Jul 3, 2020 at 2:39 PM wangl...@geekplus.com.cn < wangl...@geekplus.com.cn> wrote: > > Seems there's no direct solution. > Perhaps i can implement this by initializing a HashMap with > all the possible value of tableName in `open` mehtod a

Re: How to dynamically initialize flink metrics in invoke method and then reuse it?

2020-07-02 Thread Xintong Song
Hi Lei, I think you should initialize the metric in the `open` method. Then you can save the initialized metric as a class field, and update it in the `invoke` method for each record. Thank you~ Xintong Song On Fri, Jul 3, 2020 at 11:50 AM wangl...@geekplus.com.cn < wangl...@geekplus.com
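The lifecycle described in this reply (register the metric once in `open`, keep it as a field, update it per record in `invoke`) can be sketched as below. This is a minimal illustration in plain Python, not Flink code: `Counter`, `MetricGroup`, `CountingSink`, and the metric name `records_written` are hypothetical stand-ins for the corresponding Flink classes.

```python
class Counter:
    """Hypothetical stand-in for a Flink counter metric."""

    def __init__(self):
        self.count = 0

    def inc(self):
        self.count += 1


class MetricGroup:
    """Hypothetical stand-in for a metric group that registers counters by name."""

    def __init__(self):
        self.counters = {}

    def counter(self, name):
        # Register the counter on first use; later calls return the same object.
        if name not in self.counters:
            self.counters[name] = Counter()
        return self.counters[name]


class CountingSink:
    """Sink that registers its metric once in open() and reuses it in invoke()."""

    def __init__(self):
        self.records_written = None  # assigned in open(), not in invoke()

    def open(self, metric_group):
        # Called once per sink instance, before any record is processed.
        self.records_written = metric_group.counter("records_written")

    def invoke(self, record):
        # Called once per record: only update the already-registered metric.
        self.records_written.inc()


group = MetricGroup()
sink = CountingSink()
sink.open(group)
for record in ["a", "b", "c"]:
    sink.invoke(record)
```

Keeping the registration in `open` means the per-record path only touches an existing field, which is the point of the suggestion above.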

Re: Flink AskTimeoutException killing the jobs

2020-07-02 Thread Xintong Song
suggest looking into the jobmanager logs and gc logs, to see if there's any problem that prevents the process from handling the rpc messages in a timely manner. Thank you~ Xintong Song On Fri, Jul 3, 2020 at 3:51 AM M Singh wrote: > Hi: > > I am using Flink 1.10 on AWS EMR clu

Re: Heartbeat of TaskManager timed out.

2020-07-02 Thread Xintong Song
nk 1.10 has less heap size compared to Flink 1.9, due to the memory model changes. Thank you~ Xintong Song On Thu, Jul 2, 2020 at 8:58 PM Ori Popowski wrote: > Thank you very much for your analysis. > > When I said there was no memory leak - I meant that from the specific > Ta

Re: Heartbeat of TaskManager timed out.

2020-07-01 Thread Xintong Song
Maybe you can share the log and gc-log of the problematic TaskManager? See if we can find any clue. Thank you~ Xintong Song On Wed, Jul 1, 2020 at 8:11 PM Ori Popowski wrote: > I've found out that sometimes one of my TaskManagers experiences a GC > pause of 40-50 seconds and I h

Re: Optimal Flink configuration for Standalone cluster.

2020-06-28 Thread Xintong Song
sk managers (say tens of GBs) unless absolutely necessary. Alternatively, you can try to launch multiple TMs on one physical machine, to reduce the memory size of each TM process. BTW, what kind of workload are you running? Is it streaming or batch? Thank you~ Xintong Song On Mon, Jun 29, 20

Re: Heartbeat of TaskManager timed out.

2020-06-28 Thread Xintong Song
n guide [1]. Thank you~ Xintong Song [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_migration.html On Sun, Jun 28, 2020 at 10:12 PM Ori Popowski wrote: > Thanks for the suggestions! > > > i recently tried 1.10 and see this error frequently. and

Re: Optimal Flink configuration for Standalone cluster.

2020-06-27 Thread Xintong Song
o set `.task.heap.size` and `managed.size`. 2. If you don't know how much heap/managed memory to configure, you can look for the configuration options at the beginning of the TM logs (`-Dkey=value`). Those are the values derived from your current configuration. Thank you~ Xi

Re: Heartbeat of TaskManager timed out.

2020-06-27 Thread Xintong Song
not timely handled before the timeout check. - Is there any metrics monitoring the network condition between the JM and timeouted TM? Possibly any jitters? Thank you~ Xintong Song [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html#heartbeat-timeout On Thu

Re: [ANNOUNCE] Yu Li is now part of the Flink PMC

2020-06-16 Thread Xintong Song
Congratulations Yu, well deserved~! Thank you~ Xintong Song On Wed, Jun 17, 2020 at 9:15 AM jincheng sun wrote: > Hi all, > > On behalf of the Flink PMC, I'm happy to announce that Yu Li is now > part of the Apache Flink Project Management Committee (PMC). > > Yu Li

Re: Dynamic rescaling in Flink

2020-06-14 Thread Xintong Song
single job mode. The session mode is not supported. But I haven't checked this for quite a while. It could have been changed. Thank you~ Xintong Song [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/yarn_setup.html#run-a-single-flink-job-on-yarn [2] https://ci.apach

Re: The network memory min (64 mb) and max (1 gb) mismatch

2020-06-12 Thread Xintong Song
Yes, that is correct. 'taskmanager.memory.process.size' is the most recommended. Thank you~ Xintong Song On Fri, Jun 12, 2020 at 10:59 PM Clay Teeter wrote: > Ok, this is great to know. So in my case; I have a k8 pod that has a > limit of 4Gb. I should remove the -Xmx and

Re: Insufficient number of network buffers- what does Total mean on the Flink Dashboard

2020-06-12 Thread Xintong Song
l whether "env.java.opts" works for you. Thank you~ Xintong Song On Fri, Jun 12, 2020 at 5:33 PM Vijay Balakrishnan wrote: > Hi Xintong, > Just to be clear. I haven't set any -Xmx -i will check our scripts again. > Assuming no -Xmx is set, the doc above says 1/4 of

Re: The network memory min (64 mb) and max (1 gb) mismatch

2020-06-12 Thread Xintong Song
leverage the configuration option "taskmanager.memory.task.heap.size", and an additional constant framework overhead will be added to this value for -Xmx. Thank you~ Xintong Song [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_detail.html#jvm-parameters O
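The relationship described here, where `-Xmx` is the configured task heap plus a constant framework share, can be sketched as follows. This is a rough illustration only: the 128 MB constant is assumed to be the Flink 1.10 default of `taskmanager.memory.framework.heap.size`, and `derive_xmx_mb` and the 2048 MB input are hypothetical.

```python
# Assumed default of `taskmanager.memory.framework.heap.size` in Flink 1.10.
ASSUMED_FRAMEWORK_HEAP_MB = 128


def derive_xmx_mb(task_heap_mb, framework_heap_mb=ASSUMED_FRAMEWORK_HEAP_MB):
    """Return the -Xmx value in MB: task heap plus the constant framework heap."""
    return task_heap_mb + framework_heap_mb


# Hypothetical setting: taskmanager.memory.task.heap.size: 2048m
xmx = derive_xmx_mb(2048)
print(f"-Xmx{xmx}m")
```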

Re: Insufficient number of network buffers- what does Total mean on the Flink Dashboard

2020-06-12 Thread Xintong Song
-Xmx on Mesos. BTW, from your screenshot the physical memory is 123GB, so 1/4 of that is much closer to 29GB if we consider there are some rounding errors and accuracy loss. Thank you~ Xintong Song On Fri, Jun 12, 2020 at 4:33 PM Vijay Balakrishnan wrote: > Thx, Xintong for a great

Re: Flink 1.10.1 not using FLINK_TM_HEAP for TaskManager JVM Heap size correctly?

2020-06-12 Thread Xintong Song
you~ Xintong Song On Fri, Jun 12, 2020 at 4:27 PM Xintong Song wrote: > Hi Li, > > FLINK_TM_HEAP corresponds to the legacy configuration option > "taskmanager.heap.size". It is supported for backwards compatibility. I > strongly recommend you to use "

Re: Flink 1.10.1 not using FLINK_TM_HEAP for TaskManager JVM Heap size correctly?

2020-06-12 Thread Xintong Song
he configuration option but not for the environment variable) > The previous options which were responsible for the total memory used by > Flink are taskmanager.heap.size or taskmanager.heap.mb. Despite their > naming, they included not only JVM heap but also other off-heap memory > compon

Re: Insufficient number of network buffers- what does Total mean on the Flink Dashboard

2020-06-11 Thread Xintong Song
jvmHeap = (total - Max(cutoff-min, total * cutoff-ratio)) * (1 - networkFraction) = (102GB - Max(600MB, 102GB * 0.25)) * (1 - 0.48) = 40.6GB Have you specified a custom "-Xmx" parameter? Thank you~ Xintong Song On Fri, Jun 12, 2020 at 7:50 AM Vijay Balakrishnan wrote: > Hi
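The formula in this message can be evaluated directly. The sketch below assumes the figures exactly as quoted (102GB total, 600MB minimum cutoff, 0.25 cutoff ratio, 0.48 network fraction); the function name `legacy_jvm_heap_gb` is hypothetical.

```python
def legacy_jvm_heap_gb(total_gb, cutoff_min_gb, cutoff_ratio, network_fraction):
    """Heap size per the quoted formula:
    (total - max(cutoff_min, total * cutoff_ratio)) * (1 - network_fraction)
    """
    cutoff_gb = max(cutoff_min_gb, total_gb * cutoff_ratio)
    return (total_gb - cutoff_gb) * (1 - network_fraction)


# Inputs as quoted in the message; 600MB is expressed in GB.
heap_gb = legacy_jvm_heap_gb(
    total_gb=102.0,
    cutoff_min_gb=600 / 1024,
    cutoff_ratio=0.25,
    network_fraction=0.48,
)
print(round(heap_gb, 2))
```

Evaluated with the figures as quoted, this gives roughly 39.8GB. The small gap to the 40.6GB stated in the message presumably comes from rounding of the inputs in the thread.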

Re: The network memory min (64 mb) and max (1 gb) mismatch

2020-06-11 Thread Xintong Song
igurations will be read by Flink task manager so that memory will be managed accordingly. Flink task manager expects all the memory configurations to be already set (thus network min/max should have the same value) before it's started. In your case, it seems such configurations are missin

Re: Dynamic rescaling in Flink

2020-06-09 Thread Xintong Song
dynamically adapt to the available resources (e.g., add/reduce pods on kubernetes). AFAIK, this is still in the design discussion. Thank you~ Xintong Song On Wed, Jun 10, 2020 at 2:44 AM Prasanna kumar < prasannakumarram...@gmail.com> wrote: > Hi all, > > Does flink support dynamic s

Re: Flink on yarn : yarn-session understanding

2020-06-08 Thread Xintong Song
ing only one job. Thank you~ Xintong Song [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/cluster_setup.html [2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/concepts/glossary.html#flink-application-cluster [3] https://ci.apache.org/projects/flink

Re: Flink Dashboard UI Tasks hard limit

2020-06-04 Thread Xintong Song
ould need to look into the *log of the task manager that is not responding* to understand what's wrong with it. Thank you~ Xintong Song On Fri, Jun 5, 2020 at 6:06 AM Vijay Balakrishnan wrote: > Thx a ton, Xintong. > I am using this configuration now: > taskman

Re: Flink Dashboard UI Tasks hard limit

2020-05-31 Thread Xintong Song
.NioEventLoop >> .processSelectedKeys(NioEventLoop.java:508) >> at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop >> .run(NioEventLoop.java:470) >> at org.apache.flink.shaded.netty4.io.netty.util.concurrent. >> SingleThreadEventExecutor$5.run(SingleThre

Re: Flink Dashboard UI Tasks hard limit

2020-05-27 Thread Xintong Song
etwork_fraction, network_min), network_max)`. According to the error message, your current network memory size is `85922 buffers * 32KB/buffer = 2685MB`, smaller than your "max" (4gb). That means increasing the "max" does not help in your case. It is the "fraction" that you
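The clamp described above, and the buffer arithmetic from the error message, can be checked with a short sketch. The function names are hypothetical, and the 26850MB total and 0.1 fraction are made-up inputs chosen only so that the result matches the 2685MB in the message.

```python
def network_memory_mb(total_mb, fraction, min_mb, max_mb):
    """Clamp total * fraction into the [min, max] range."""
    return min(max(total_mb * fraction, min_mb), max_mb)


def buffers_to_mb(num_buffers, buffer_kb=32):
    """Convert a number of network buffers to megabytes."""
    return num_buffers * buffer_kb / 1024


# Figure from the error message: 85922 buffers of 32KB each.
current_mb = buffers_to_mb(85922)

# Hypothetical inputs: the fraction, not the max, is the binding limit here.
with_max_4g = network_memory_mb(26850, 0.1, 64, 4096)
with_max_8g = network_memory_mb(26850, 0.1, 64, 8192)
with_higher_fraction = network_memory_mb(26850, 0.15, 64, 4096)
```

In this sketch, raising the max from 4096MB to 8192MB leaves the result unchanged, while raising the fraction increases it, which mirrors the advice in the reply.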

Re: Flink Dashboard UI Tasks hard limit

2020-05-26 Thread Xintong Song
t the execution plan only shows 5. Thank you~ Xintong Song On Wed, May 27, 2020 at 3:16 AM Vijay Balakrishnan wrote: > Hi Xintong, > Thanks for the excellent clarification for tasks. > > I attached a sample screenshot above and din't reflect the slots used and > the tasks li

Re: Flink Dashboard UI Tasks hard limit

2020-05-24 Thread Xintong Song
an argument for the `flink run` command, to set parallelism for all operators. - Set `parallelism.default` in your `flink-conf.yaml`, to set a default parallelism for your jobs. This will be used for jobs that have not set parallelism with either of the above methods. Thank you~ Xintong So

Re: Flink Dashboard UI Tasks hard limit

2020-05-22 Thread Xintong Song
lower parallelism. Could you share some more information about your use case? - What kind of job are you executing? Is it a streaming or batch processing job? - Which Flink deployment do you use? Standalone? Yarn? - It would be helpful if you could share the Flink logs. Thank you~ Xintong

Re: Flink BLOB server port exposed externally

2020-05-18 Thread Xintong Song
1.11.0 is feature freezing today. The final release date depends on the progress of release testing / bug fixing. Thank you~ Xintong Song On Mon, May 18, 2020 at 6:36 PM Omar Gawi wrote: > Thanks Till! > Do you know what is 1.11.0 release date? > > > On Mon, May 18, 2020 a

Re: Flink Memory analyze on AWS EMR

2020-05-12 Thread Xintong Song
PREFIX}" with "/your-file-name.jit". The token "" should be replaced with proper log directory path by Yarn automatically. I noticed that the usage of ${FLINK_LOG_PREFIX} is recommended by Flink's documentation [1]. This is IMO a bit misleading. I'll try to file

Re: Flink Memory analyze on AWS EMR

2020-05-11 Thread Xintong Song
Hi Jacky, Could you search for "Application Master start command:" in the debug log and post the result and a few lines before & after that? This is not included in the clip of attached log file. Thank you~ Xintong Song On Tue, May 12, 2020 at 5:33 AM Jacky D wrote: > hi,

Re: No Slots available exception in Apache Flink Job Manager while Scheduling

2020-05-08 Thread Xintong Song
Linking to the jira ticket, for the record. https://issues.apache.org/jira/browse/FLINK-17560 Thank you~ Xintong Song On Sat, May 9, 2020 at 2:14 AM Josson Paul wrote: > Set up > -- > Flink verson 1.8.3 > > Zookeeper HA cluster > > 1 ResourceManager/Dispa

Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.10

2020-04-29 Thread Xintong Song
se a little direct memory. But that's quite opportunistic. So it would be better to configure a non-zero task.off-heap if you know your tasks/operators use some direct memory. Thank you~ Xintong Song On Thu, Apr 30, 2020 at 12:14 PM Jiahui Jiang wrote: > Hey Xintong, thanks for the explanat

Re: 1.11 snapshot: Name or service not knownname localhost and taskMgr not started

2020-04-29 Thread Xintong Song
Hi Lei, Could you check whether the hostname 'localhost' is available on your CentOS machine? This is usually defined in "/etc/hosts". You can also try to modify the slaves file, replacing 'localhost' with '127.0.0.1'. The path is: /conf/slaves Thank you~

Re: Flink Task Manager GC overhead limit exceeded

2020-04-29 Thread Xintong Song
ner". I suspect there might be some argument passing problem regarding the spaces and double quotation marks. Thank you~ Xintong Song On Thu, Apr 30, 2020 at 11:39 AM Eleanore Jin wrote: > Hi Xintong, > > Thanks for the detailed explanation! > > as for the 2nd question: I mou

Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.10

2020-04-29 Thread Xintong Song
led by JVM. In Flink, managed memory and jvm-overhead are using native memory. That means, if you see a JVM OOM, increasing jvm-overhead should not help. Thank you~ Xintong Song On Thu, Apr 30, 2020 at 11:06 AM Jiahui Jiang wrote: > Hey Xintong, Steven, thanks for replies! > > @Steven W

Re: Flink Task Manager GC overhead limit exceeded

2020-04-29 Thread Xintong Song
tions look good to me. Is the configured path '/dumps/oom.bin' a local path of the pod or a path of the host mounted onto the pod? The restarted pod is a completely new pod. Everything you write to the old pod goes away when the pod terminates, unless it is written to the host

Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.10

2020-04-28 Thread Xintong Song
'task.off-heap.size' being 0 only represents that in most cases user code / operators do not use off-heap memory. Users would need to explicitly increase this configuration if UDFs or libraries of the job use off-heap memory. Thank you~ Xintong Song On Wed, Apr 29, 2020 at 11:07 AM

Re: Flink 1.10 Out of memory

2020-04-24 Thread Xintong Song
True. Thanks for the clarification. Thank you~ Xintong Song On Fri, Apr 24, 2020 at 5:21 PM Stephan Ewen wrote: > I think native methods are not in a forked process. It is just a malloc() > call that failed, probably an I/O buffer or so. > This might mean that there really is

Re: Flink 1.10 Out of memory

2020-04-24 Thread Xintong Song
ative method, I think the problem is that not enough native memory can be allocated for executing the native method. Thank you~ Xintong Song On Fri, Apr 24, 2020 at 3:40 PM Stephan Ewen wrote: > @Xintong - out of curiosity, where do you see that this tries to fork a > process?

Re: IntelliJ java formatter

2020-04-23 Thread Xintong Song
Hi Flavio, I'm not aware of any way to automatically format the code. The only thing I can find that might help is to set up a checkstyle plugin in your IDE. https://ci.apache.org/projects/flink/flink-docs-stable/flinkDev/ide_setup.html#checkstyle-for-java Thank you~ Xintong Song On Thu

Re: Flink 1.10 Out of memory

2020-04-23 Thread Xintong Song
@Stephan, I don't think so. If JVM hits the direct memory limit, you should see the error message "OutOfMemoryError: Direct buffer memory". Thank you~ Xintong Song On Thu, Apr 23, 2020 at 6:11 PM Stephan Ewen wrote: > @Xintong and @Lasse could it be that the JVM hits

Re: A Strategy for Capacity Testing

2020-04-23 Thread Xintong Song
performance to stabilize. Depending on your workload, this could take up to tens of minutes. Please also be careful with aggregations over large windows. The emitting of windows might introduce large processing workloads, causing the measured throughput to fluctuate. Thank you~ Xintong Song On Thu, Apr 23

Re: Flink 1.10 Out of memory

2020-04-19 Thread Xintong Song
heap / direct memory. My suggestion is to try increasing the JVM overhead configuration. You can leverage the configuration options 'taskmanager.memory.jvm-overhead.[min|max|fraction]'. See more details in the documentation[1]. Thank you~ Xintong Song [1] https://ci.apache.org/pr
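How the three `taskmanager.memory.jvm-overhead.[min|max|fraction]` options interact can be sketched as a clamp. This is a rough illustration, assuming the overhead is derived from a configured total process size and assuming the Flink 1.10 defaults of 0.1 for the fraction, 192MB for the min, and 1GB for the max; the function name and the example process sizes are hypothetical.

```python
def jvm_overhead_mb(process_mb, fraction=0.1, min_mb=192, max_mb=1024):
    """Clamp process_mb * fraction into [min_mb, max_mb].

    The default arguments are assumed Flink 1.10 defaults, not values from the thread.
    """
    return min(max(process_mb * fraction, min_mb), max_mb)


# Hypothetical process sizes showing each regime of the clamp.
small = jvm_overhead_mb(1024)    # fraction gives 102.4, so the min applies
medium = jvm_overhead_mb(4096)   # fraction gives about 409.6, within range
large = jvm_overhead_mb(20480)   # fraction gives 2048, so the max applies

# Raising only the fraction helps while the result stays below the max.
raised = jvm_overhead_mb(4096, fraction=0.2)
```

In this sketch, increasing the overhead for a large process requires raising the max as well, because the fraction alone is capped.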

Re: Flink On Yarn , ResourceManager is HA , if active ResourceManager changed,what is flink task status ?

2020-04-15 Thread Xintong Song
Normally, a Yarn RM switch should not cause any problem for the running Flink instance. However, if the RM switch takes too long and Flink happens to request new containers during that time, it might lead to a resource allocation timeout. Thank you~ Xintong Song On Wed, Apr 15, 2020 at 3:49 PM LakeShen

Re: Flink job consuming all available memory on host

2020-04-12 Thread Xintong Song
ny native memory? E.g., launch another process, calling a JNI library or so? Thank you~ Xintong Song On Sat, Apr 11, 2020 at 3:56 AM Mitch Lloyd wrote: > We are having an issue with a Flink Job that gradually consumes all > available memory on a Docker host machine, crashing the machin

Re: on YARN question

2020-04-10 Thread Xintong Song
d, including "-d". As a result, you're running the session cluster in attached mode, and the client will not exit until the session is shut down. Thank you~ Xintong Song On Fri, Apr 10, 2020 at 1:10 PM Yangze Guo wrote: > Do you mean to run it in detach mode? If so, you could add

Re: Question about the flink 1.6 memory config

2020-03-31 Thread Xintong Song
environment and workloads. For standalone clusters, the cut-off will not take any effect. For containerized environments, depending on Yarn/Mesos configurations your container may or may not get killed due to exceeding the container memory. Thank you~ Xintong Song On Tue, Mar 31, 2020 at 5:34 PM

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Xintong Song
for a job cluster, but does not cover the scenarios of session clusters. Thank you~ Xintong Song On Mon, Mar 30, 2020 at 12:03 PM Yangze Guo wrote: > Thanks for your feedbacks, @Xintong and @Jeff. > > @Jeff > I think it would always be good to leverage exist logic in Flink, such >

Re: [Third-party Tool] Flink memory calculator

2020-03-29 Thread Xintong Song
Thanks Yangze, I've tried the tool and I think it's very helpful. Thank you~ Xintong Song On Mon, Mar 30, 2020 at 9:40 AM Yangze Guo wrote: > Hi, Yun, > > I'm sorry that it currently could not handle it. But I think it is a > really good idea and that feature woul

Re: ClusterSpecification and Configuration questions

2020-03-23 Thread Xintong Song
helpful to that end. In addition, would you be able to check the Yarn logs? See if the container requests are received and containers are allocated. Thank you~ Xintong Song On Tue, Mar 24, 2020 at 6:45 AM Vitaliy Semochkin wrote: > Hi, > > I create a job with following p

Re: usae of ClusterSpecificationBuilder.taskManagerMemoryMB

2020-03-23 Thread Xintong Song
16 [1], which replaces masterMemoryMB with `jobmanager.memory.process.size`. That would also involve refactoring YarnClusterDescriptor, which is not in good shape (e.g. the method startAppMaster has more than 400 lines) and is closely coupled with ClusterSpecification. Thank you~ Xintong Song O

Re: How can i set the value of taskmanager.network.numberOfBuffers ?

2020-03-20 Thread Xintong Song
Hi Forideal, Do you mean you have 700 slots per TM or in total? How many TMs do you have? And how many slots do you have per TM? Also, when is the screenshot taken? Is it after the job is fully initiated? It seems you only need 1k+ network buffers. Thank you~ Xintong Song On Fri, Mar 20

Re: JobMaster does not register with ResourceManager in high availability setup

2020-03-17 Thread Xintong Song
I'm not familiar with ZK either. I've copied Yang Wang, who might be able to provide some suggestions. Alternatively, you can try to post your question to the Apache ZooKeeper community, see if they have any clue. Thank you~ Xintong Song On Wed, Mar 18, 2020 at 8:12 AM Bajaj, Abhi

Re: JobMaster does not register with ResourceManager in high availability setup

2020-03-16 Thread Xintong Song
Flink 1.7 through the latest 1.10, and I'm not aware of any reported issue where the JM does not try to connect to the RM once the address is received. Thank you~ Xintong Song On Tue, Mar 17, 2020 at 7:45 AM Bajaj, Abhinav wrote: > Hi Xintong, > > > > Apologies for delayed response

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

2020-03-16 Thread Xintong Song
link Master will interact with Kubernetes Master, and actively request pods/containers, like on Yarn/Mesos. Thank you~ Xintong Song On Mon, Mar 16, 2020 at 4:03 PM Pankaj Chand wrote: > Hi all, > > I want to run Flink, Spark and other processing engines on a single > K

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

2020-03-16 Thread Xintong Song
Forgot to mention that "running Flink natively on Kubernetes" is newly introduced and is only available for Flink 1.10 and above. Thank you~ Xintong Song On Mon, Mar 16, 2020 at 5:40 PM Xintong Song wrote: > Hi Pankaj, > > "Running Flink on Kubernetes" refers

Re: how to specify yarnqueue when starting a new job programmatically?

2020-03-12 Thread Xintong Song
e.g., in a Flink YARN Session.[1] Thank you~ Xintong Song [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/yarn_setup.html#flink-yarn-session On Thu, Mar 12, 2020 at 6:20 PM Vitaliy Semochkin wrote: > Thank you Xintong Song, > > is there any way to queue pr

Re: Flink 1.10 container memory configuration with Mesos.

2020-03-11 Thread Xintong Song
size' is missing. You can take a look at the launching command, see if there's anything unexpected before the memory dynamic configurations. Thank you~ Xintong Song On Thu, Mar 12, 2020 at 2:26 PM Yangze Guo wrote: > Hi, Alexander > > I could not reproduce it in my local

Re: how to specify yarnqueue when starting a new job programmatically?

2020-03-11 Thread Xintong Song
Hi Vitaliy, You can specify a yarn queue by either setting the configuration option 'yarn.application.queue' [1], or using the command line option '-qu' (or '--queue') [2]. Thank you~ Xintong Song [1] https://ci.apache.org/projects/flink/flink-docs-rel

Re: scaling issue Running Flink on Kubernetes

2020-03-10 Thread Xintong Song
rea. Thank you~ Xintong Song On Wed, Mar 11, 2020 at 10:37 AM Eleanore Jin wrote: > _Hi Xintong, > > Thanks for the prompt reply! To answer your question: > >- Which Flink version are you using? > >v1.8.2 > >- Is this skew observed on

Re: scaling issue Running Flink on Kubernetes

2020-03-10 Thread Xintong Song
skew ease? I suspect the performance difference might be an outcome of some warming up issues. E.g., the existing TMs might have some file already localized, or some memory buffers already promoted to the JVM tenured area, while the new TMs have not. Thank you~ Xintong Song On Wed, Mar 11

Re: JobMaster does not register with ResourceManager in high availability setup

2020-03-05 Thread Xintong Song
the rest part of the log (from where the current one ends to the NoResourceAvailableException) to tell what happened during the scheduling. Also, could you confirm how many TMs do you use? Thank you~ Xintong Song On Fri, Mar 6, 2020 at 5:55 AM Bajaj, Abhinav wrote: > Hi Xintong, &g

Re: JobMaster does not register with ResourceManager in high availability setup

2020-03-04 Thread Xintong Song
hose from the job restart to the NoResourceAvailableException) to find out which is the case. Thank you~ Xintong Song On Thu, Mar 5, 2020 at 7:30 AM Bajaj, Abhinav wrote: > While I setup to reproduce the issue with debug logs, I would like to > share more information I noticed in INFO logs. &

Re: JobMaster does not register with ResourceManager in high availability setup

2020-03-03 Thread Xintong Song
ime.highavailability org.apache.flink.runtime.leaderretrieval org.apache.zookeeper Thank you~ Xintong Song On Wed, Mar 4, 2020 at 5:42 AM Bajaj, Abhinav wrote: > Hi, > > > > We recently came across an issue where JobMaster does not register with > ResourceManager in Fink high availability set

Re: MaxMetaspace default may be to low?

2020-02-25 Thread Xintong Song
rrently running different jobs. However, such cases usually require various configuration changes (process.size/flink.size, numOfSlots, etc.) and we think it makes sense to make metaspace one of them. Thank you~ Xintong Song On Tue, Feb 25, 2020 at 9:22 PM John Smith wrote: >

Re: yarn session: one JVM per task

2020-02-25 Thread Xintong Song
uding) 1.8. Thank you~ Xintong Song On Tue, Feb 25, 2020 at 7:41 PM David Morin wrote: > Hi Gary, > > Sorry I was probably not very clear. > Yes that's exactly what I want to hear :) > I use the -s 1 parameter and what I expect to have is one task of my Sink > (one insta

Re: Colocating Sub-Tasks across jobs / Sharing Task Slots across jobs

2020-02-25 Thread Xintong Song
> > Do you believe the code of the operators of the restarted Region can be > changed between restarts? I'm not an expert on the restart strategies, but AFAIK the answer is probably not. Sorry I overlooked that you need to modify the job. Thank you~ Xintong Song On Tue, Feb 25

Re: MaxMetaspace default may be to low?

2020-02-24 Thread Xintong Song
hanged that in Flink 1.10 to have stricter control on the overall memory usage of Flink processes. Thank you~ Xintong Song On Tue, Feb 25, 2020 at 1:24 PM John Smith wrote: > I would like to also add the same exact jobs on Flink 1.8 where running > perfectly fine. > > On Tue, 25 Feb

Re: MaxMetaspace default may be to low?

2020-02-24 Thread Xintong Song
ace memory leak. Thank you~ Xintong Song On Tue, Feb 25, 2020 at 7:50 AM John Smith wrote: > Hi, I just upgraded to 1.10 and I started deploying my jobs. Eventually > task nodes started shutting down with OutOfMemory Metaspace. > > I look at the logs and the tas

Re: Colocating Sub-Tasks across jobs / Sharing Task Slots across jobs

2020-02-24 Thread Xintong Song
Strategy [1], which restarts only the tasks connected by pipelined edges instead of the whole job graph. Thank you~ Xintong Song [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/task_failure_recovery.html#restart-pipelined-region-failover-strategy On Mon, Feb 24, 2020 at 11:28

Re: yarn session: one JVM per task

2020-02-24 Thread Xintong Song
s a way to make some of the tasks scheduled to the same JVM. Not that I'm aware of. Thank you~ Xintong Song On Mon, Feb 24, 2020 at 8:43 PM David Morin wrote: > Hi, > > Thanks Xintong. > I've noticed than when I use yarn-session.sh with --slots (-s) parameter > but

Re: yarn session: one JVM per task

2020-02-23 Thread Xintong Song
task manager has enough cpu/memory resources and slots for running your job. Thank you~ Xintong Song On Sun, Feb 23, 2020 at 1:11 AM David Morin wrote: > Hi, > My app is based on a lib that is not thread safe (yet...). > In waiting of the patch has been pushed, how can I be sure that

Re: How JobManager and TaskManager find each other?

2020-02-20 Thread Xintong Song
different JM address/ports in your TM configurations, so the TM knows which JM to connect to. Thank you~ Xintong Song On Fri, Feb 21, 2020 at 6:38 AM KristoffSC wrote: > Hi all, > I was wondering how JobManager and TaskManager find each other? > Do they use multicast for this? >

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-20 Thread Xintong Song
Congratulations, Jingsong. Well deserved~! Thank you~ Xintong Song On Fri, Feb 21, 2020 at 11:05 AM Kurt Young wrote: > Hi everyone, > > I'm very happy to announce that Jingsong Lee accepted the offer of the > Flink PMC to > become a committer of the Flink project.

Re: [ANNOUNCE] Apache Flink 1.10.0 released

2020-02-12 Thread Xintong Song
Great to hear that! Thanks for being the release managers, Gary & Yu. Great work! Thank you~ Xintong Song On Wed, Feb 12, 2020 at 9:31 PM Yu Li wrote: > The Apache Flink community is very happy to announce the release of Apache > Flink 1.10.0, which is the latest major release.

Re: Slots Leak Observed when

2020-01-14 Thread Xintong Song
Hi, It would be helpful for understanding the problem if you could share the logs. Thank you~ Xintong Song On Wed, Jan 15, 2020 at 12:23 AM burgesschen wrote: > Hi guys, > > Out team is observing a stability issue on our Standalone Flink clusters. > > Background: The kafka cl

Re: managedMemoryInMB failure

2020-01-08 Thread Xintong Song
pu cores and decide thread pool sizes accordingly. But this is just my guess and I cannot confirm it. I would suggest configuring "taskmanager.memory.size" explicitly anyway, to avoid potential problems caused by the uncertainty of JVM free heap memory size. BTW, this randomness is el

Re: managedMemoryInMB failure

2020-01-07 Thread Xintong Song
nning the same version of flink? Thank you~ Xintong Song On Wed, Jan 8, 2020 at 8:18 AM Fanbin Bu wrote: > Hi, > > with Flink 1.9 running in docker mode, I have a batch job and got the > following error message. > > However, it works totally fine with the same code on EM

Re: Best way set max heap size via env variables or program arguments?

2020-01-01 Thread Xintong Song
so account for some off-heap memory, such as network direct buffers and off-heap managed memory (if used). That's why you see the java heap size is always slightly smaller than the configured 'taskmanager.heap.size'. Thank you~ Xintong Song On Wed, Jan 1, 2020 at 3:10 AM Li P

Re: question: jvm heap size per task?

2019-12-25 Thread Xintong Song
53 [1] and FLIP-56 [2] for more details. Another related effort is the pluggable slot manager [3], which allows pluggable resource scheduling strategies, such as launching task managers with customized resources according to the tasks' requirements. Thank you~ Xintong Song [1] https://cwiki.

Re: Flink On K8s, build docker image very slowly, is there some way to make it faster?

2019-12-22 Thread Xintong Song
e", change the line "FROM openjdk:8-jre-alpine" to point to a domestic or local image source. Thank you~ Xintong Song On Mon, Dec 23, 2019 at 2:46 PM LakeShen wrote: > Hi community , when I run the flink task on k8s , the first thing is that > to build the flink task jar

Re: Questions about taskmanager.memory.off-heap and taskmanager.memory.preallocate

2019-12-16 Thread Xintong Song
, but will have to wait for the GC to be truly released. Thank you~ Xintong Song On Tue, Dec 17, 2019 at 12:30 PM Ethan Li wrote: > Thank you very much Xintong! It’s much clearer to me now. > > I am still on standalone cluster setup. Before I was using 350GB on-heap > memory on a

Re: Documentation tasks for release-1.10

2019-12-16 Thread Xintong Song
Thank you Kostas. Big +1 for keeping all the documentation related issues at one place. I've added the documentation task for resource management. Thank you~ Xintong Song On Mon, Dec 16, 2019 at 5:29 PM Kostas Kloudas wrote: > Hi all, > > With the feature-freeze for th

Re: [DISCUSS] Make Managed Memory always off-heap (Adjustment to FLIP-49)

2019-12-02 Thread Xintong Song
Sorry, I just realized that I've sent my feedback to Jingsong's email address, instead of the dev / user mailing list. Please find my comments below. Thank you~ Xintong Song On Wed, Nov 27, 2019 at 4:32 PM Xintong Song wrote: > As a participant of the discussion yesterday, I

Re: Batch Job in a Flink 1.9 Standalone Cluster

2019-10-10 Thread Xintong Song
allocation/deallocation overheads and optimizing performance. Thank you~ Xintong Song On Thu, Oct 10, 2019 at 7:55 PM Timothy Victor wrote: > After a batch job finishes in a flink standalone cluster, I notice that > the memory isn't freed up. I understand Flink uses its o

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-22 Thread Xintong Song
Congratulations! Thanks Gordon and Kurt for being the release managers, and thanks all the contributors. Thank you~ Xintong Song On Thu, Aug 22, 2019 at 2:39 PM Yun Gao wrote: > Congratulations ! > > Very thanks for Gordon and Kurt for managing the release and very >

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Xintong Song
Congratulations Andrey~! Thank you~ Xintong Song On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez wrote: > Congratulations Andrey! > > I am glad the Flink committer team is growing at such a pace! > > --- > Oytun Tez > > *M O T A W O R D* > The World's Fastest

Re: Flink program,Full GC (System.gc())

2019-08-13 Thread Xintong Song
t automatically, as long as there is continuous activity creating / destroying objects on the heap, e.g., due to heartbeats. Please refer to the Java garbage collection documentation [1] for more details. Thank you~ Xintong Song [1] https://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01

Re: some slots are not be available,when job is not running

2019-08-12 Thread Xintong Song
Hi, It would be good if you could provide the job manager and task manager log files, so that others can analyze the problem. Thank you~ Xintong Song On Mon, Aug 12, 2019 at 10:12 AM pengcheng...@bonc.com.cn < pengcheng...@bonc.com.cn> wrote: > Hi all, > some slots are not be av

Re: [ANNOUNCE] Hequn becomes a Flink committer

2019-08-07 Thread Xintong Song
Congratulations~! Thank you~ Xintong Song On Wed, Aug 7, 2019 at 4:00 PM vino yang wrote: > Congratulations! > > highfei2...@126.com 于2019年8月7日周三 下午7:09写道: > > > Congrats Hequn! > > > > Best, > > Jeff Yang > > > > > > Origi

Re: Dynamically allocating right-sized task resources

2019-08-05 Thread Xintong Song
e.org/projects/flink/flink-docs-release-1.8/concepts/runtime.html#task-slots-and-resources Thank you~ Xintong Song On Sun, Aug 4, 2019 at 9:40 PM Chad Dombrova wrote: > Hi all, > First time poster, so go easy on me :) > > What is Flink's story for accommodating task workloads w
