RE: Direct buffer memory in job with hbase client
Looks like I set the wrong parameter. It should have been taskmanager.memory.task.off-heap.size.

From: Anton [mailto:anton...@yandex.ru]
Sent: Friday, December 17, 2021 10:12 PM
To: 'Xintong Song'
Cc: 'user'
Subject: RE: Direct buffer memory in job with hbase client

[quoted message elided; it appears in full below]
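For reference, a minimal flink-conf.yaml sketch of the intended change. The 512m value is an illustrative guess, not from this thread; size it to your HBase client's actual direct-memory needs:

```yaml
# Reserve explicit direct memory for the HBase client's Netty buffers
# instead of pinning task heap. 512m is an illustrative guess.
taskmanager.memory.task.off-heap.size: 512m
```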
RE: Direct buffer memory in job with hbase client
Hi Xintong,

After a recent job failure I set taskmanager.memory.task.heap.size to 128m, but the cluster was unable to start, with the following output:

Starting cluster.
Starting standalonesession daemon on host ***.
Password:
[ERROR] The execution result is empty.
[ERROR] Could not get JVM parameters and dynamic configurations properly.
[ERROR] Raw output from BashJavaUtils:
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
INFO [] - Loading configuration property: jobmanager.rpc.address, ***
INFO [] - Loading configuration property: jobmanager.rpc.port, 6123
INFO [] - Loading configuration property: jobmanager.memory.process.size, 16m
INFO [] - Loading configuration property: taskmanager.memory.process.size, 172800m
INFO [] - Loading configuration property: taskmanager.numberOfTaskSlots, 31
INFO [] - Loading configuration property: parallelism.default, 1
INFO [] - Loading configuration property: jobmanager.execution.failover-strategy, region
INFO [] - Loading configuration property: taskmanager.memory.task.heap.size, 128m
INFO [] - The derived from fraction jvm overhead memory (16.875gb (18119393550 bytes)) is greater than its max value 1024.000mb (1073741824 bytes), max value will be used instead
Exception in thread "main" org.apache.flink.configuration.IllegalConfigurationException: TaskManager memory configuration failed: If Total Flink, Task Heap and (or) Managed Memory sizes are explicitly configured then the Network Memory size is the rest of the Total Flink memory after subtracting all other configured types of memory, but the derived Network Memory is inconsistent with its configuration.
    at org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils.processSpecFromConfig(TaskExecutorProcessUtils.java:163)
    at org.apache.flink.runtime.util.bash.BashJavaUtils.getTmResourceParams(BashJavaUtils.java:85)
    at org.apache.flink.runtime.util.bash.BashJavaUtils.runCommand(BashJavaUtils.java:67)
    at org.apache.flink.runtime.util.bash.BashJavaUtils.main(BashJavaUtils.java:56)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: If Total Flink, Task Heap and (or) Managed Memory sizes are explicitly configured then the Network Memory size is the rest of the Total Flink memory after subtracting all other configured types of memory, but the derived Network Memory is inconsistent with its configuration.
    at org.apache.flink.runtime.util.config.memory.taskmanager.TaskExecutorFlinkMemoryUtils.sanityCheckNetworkMemoryWithExplicitlySetTotalFlinkAndHeapMemory(TaskExecutorFlinkMemoryUtils.java:344)
    at org.apache.flink.runtime.util.config.memory.taskmanager.TaskExecutorFlinkMemoryUtils.deriveFromTotalFlinkMemory(TaskExecutorFlinkMemoryUtils.java:147)
    at org.apache.flink.runtime.util.config.memory.taskmanager.TaskExecutorFlinkMemoryUtils.deriveFromTotalFlinkMemory(TaskExecutorFlinkMemoryUtils.java:42)
    at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.deriveProcessSpecWithTotalProcessMemory(ProcessMemoryUtils.java:119)
    at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.memoryProcessSpecFromConfig(ProcessMemoryUtils.java:84)
    at org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils.processSpecFromConfig(TaskExecutorProcessUtils.java:160)
    ... 3 more
Caused by: org.apache.flink.configuration.IllegalConfigurationException: Derived Network Memory size (100.125gb (107508399056 bytes)) is not in configured Network Memory range [64.000mb (67108864 bytes), 1024.000mb (1073741824 bytes)].
    at org.apache.flink.runtime.util.config.memory.taskmanager.TaskExecutorFlinkMemoryUtils.sanityCheckNetworkMemory(TaskExecutorFlinkMemoryUtils.java:378)
    at org.apache.flink.runtime.util.config.memory.taskmanager.TaskExecutorFlinkMemoryUtils.sanityCheckNetworkMemoryWithExplicitlySetTotalFlinkAndHeapMemory(TaskExecutorFlinkMemoryUtils.java:342)
    ... 8 more

How do I choose the memory parameters properly?

From: Xintong Song [mailto:tonysong...@gmail.com]
Sent: Wednesday, December 15, 2021 12:17 PM
To: Anton <anton...@yandex.ru>
Cc: user <user@flink.apache.org>
Subject: Re: Direct buffer memory in job with hbase client

[quoted reply elided; it appears in full below]
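The failing derivation above can be reproduced by hand. A sketch using Flink 1.13's documented defaults (JVM overhead fraction 0.1 capped at 1g, metaspace 256m, framework heap and framework off-heap 128m each, managed fraction 0.4, task off-heap 0); all values in MB:

```python
# Reproduce Flink 1.13's TaskManager memory derivation for the failing
# configuration from the log above (values in MB).
total_process = 172800                         # taskmanager.memory.process.size
jvm_overhead = min(0.1 * total_process, 1024)  # 17280 capped at 1g, as the log says
metaspace = 256                                # default jvm-metaspace size
total_flink = total_process - jvm_overhead - metaspace  # 171520

framework_heap = 128        # default
task_heap = 128             # explicitly configured in flink-conf.yaml
managed = 0.4 * total_flink # default managed fraction -> 68608
framework_off_heap = 128    # default
task_off_heap = 0           # default

# Network memory becomes "the rest", but by default it must fall
# within [64mb, 1024mb] -- hence the sanity-check failure.
network = (total_flink - framework_heap - task_heap - managed
           - framework_off_heap - task_off_heap)
print(network, network / 1024)  # 102528.0 MB = 100.125 GB, matching the error
```

The 102528 MB (100.125 GB) result matches the exception exactly: with such a large process size, leaving network memory to absorb "the rest" blows past its 1g cap, so the large components (managed, off-heap, or network) need explicit sizes rather than pinning only the task heap.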
Re: Direct buffer memory in job with hbase client
Hi Anton,

You may want to try increasing the task off-heap memory, as your tasks use the hbase client, which needs off-heap (direct) memory. The default task off-heap memory is 0 because most tasks do not use off-heap memory. Unfortunately, I cannot advise on how much task off-heap memory your job needs; that probably depends on your hbase client configuration.

Thank you~

Xintong Song

On Wed, Dec 15, 2021 at 1:40 PM Anton <anton...@yandex.ru> wrote:
[quoted message elided; it appears in full below]
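Why raising task off-heap helps: Flink derives the JVM's direct memory limit from the off-heap components, so increasing taskmanager.memory.task.off-heap.size raises -XX:MaxDirectMemorySize and gives the HBase client's Netty direct buffers room. A sketch of that relationship (the 512m increase and the 128m network figure are illustrative, not from this thread):

```python
# Flink sets -XX:MaxDirectMemorySize to the sum of framework off-heap,
# task off-heap, and network memory. With task off-heap at its default
# of 0, all direct allocations (including Netty in the hbase client)
# compete for what framework off-heap and network leave over.
def max_direct_memory_mb(framework_off_heap: int,
                         task_off_heap: int,
                         network: int) -> int:
    return framework_off_heap + task_off_heap + network

before = max_direct_memory_mb(128, 0, 128)    # defaults -> 256 MB
after = max_direct_memory_mb(128, 512, 128)   # task off-heap raised -> 768 MB
print(before, after)
```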
Direct buffer memory in job with hbase client
Hi,

From time to time my job stops processing messages with the warning listed below. I tried to increase jobmanager.memory.process.size and taskmanager.memory.process.size, but it didn't help.

What else can I try? "Framework Off-Heap" is 128mb now, as seen in the task manager dashboard, and "Task Off-Heap" is 0b. The documentation says "You should only change this value if you are sure that the Flink framework needs more memory", and I'm not sure about it.

Flink version is 1.13.2.

2021-11-29 14:06:53,659 WARN org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline [] - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
org.apache.hbase.thirdparty.io.netty.channel.ChannelPipelineException: org.apache.hadoop.hbase.security.NettyHBaseSaslRpcClientHandler.handlerAdded() has thrown an exception; removed.
    at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:624) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.addFirst(DefaultChannelPipeline.java:181) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.addFirst(DefaultChannelPipeline.java:358) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.addFirst(DefaultChannelPipeline.java:339) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection.saslNegotiate(NettyRpcConnection.java:215) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection.access$600(NettyRpcConnection.java:76) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection$2.operationComplete(NettyRpcConnection.java:289) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hadoop.hbase.ipc.NettyRpcConnection$2.operationComplete(NettyRpcConnection.java:277) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:605) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:300) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:335) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:707) [blob_p-6eb282e9e614ab47d8c0b446632a1a9cba8a3955-6e6e09bc9b5fae2679cbbb261caa9da2:?]
    at org.apache.hbase.thirdparty.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) [bl