to sum up: the lines you were seeing seem to be the down- and upload of the TaskManager logs from the web interface which go through the BlobServer and its components.
Nico On Thursday, 31 August 2017 11:51:27 CEST Federico D'Ambrosio wrote: > Hi, > > 1) I'm using Flink 1.3.2 > > 2) Th JobManager log is pretty much the same concerning those lines: > > 2017-08-30 14:16:53,343 INFO > org.apache.zookeeper.ClientCnxn - Opening > socket connection to server master-1.localdomain/10.0.0.55:2181 > 2017-08-30 14:16:53,344 INFO > org.apache.zookeeper.ClientCnxn - Socket > connection established to master-1.localdomain/10.0.0.55:2181, initiating > session > 2017-08-30 14:16:53,348 INFO > org.apache.zookeeper.ClientCnxn - Session > establishment complete on server master-1.localdomain/10.0.0.55:2181, > sessionid = 0x15e326a8fe6000e, negotiated timeout = 40000 > 2017-08-30 14:16:53,467 INFO > org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher > - Subtask 0 has no active shards to read on startup; marking the subtask as > temporarily idle ... > 2017-08-30 14:16:53,469 INFO > org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer - > Subtask 1 will be seeded with initial shard > StreamShardHandle{streamName='fdt', shard='{ShardId: > shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey: > 340282366920938463463374607431768211455},SequenceNumberRange: > {StartingSequenceNumber: > 49576318619583361934524589342704326364484033599642796034,}}'}, starting > state set as sequence number LATEST_SEQUENCE_NUM > 2017-08-30 14:16:53,470 INFO > org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher > - Subtask 1 will start consuming seeded shard > StreamShardHandle{streamName='fdt', shard='{ShardId: > shardId-000000000000,HashKeyRange: {StartingHashKey: 0,EndingHashKey: > 340282366920938463463374607431768211455},SequenceNumberRange: > {StartingSequenceNumber: > 49576318619583361934524589342704326364484033599642796034,}}'} from sequence > number LATEST_SEQUENCE_NUM with ShardConsumer 0 > 2017-08-30 14:16:53,608 INFO > lab.vardata.HBaseBatchFormat - Task 0: > Opening connection to currentDay to execute 1 tasks on Single Put job > > 2017-08-30 14:17:21,318 INFO org.apache.flink.runtime.blob. > BlobCache - Created BLOB cache storage directory > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f > 2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:59937 > 2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:17:21,324 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 3ff486dff4c4eaafdab42b30a877326e62bfca82 > from localhost/127.0.0.1:43268 > 2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938 > 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:59976 > 2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:18:13,710 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 2f5283326aab77faa047b705cd1d6470035b3b7d > from localhost/127.0.0.1:43268 > 2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978 > 2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60022 > 2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:19:29,814 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 > from localhost/127.0.0.1:43268 > 2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024 > 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60110 > 2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:21:42,858 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 8638bdf78b0e540786de6c291f710a8db447a2b4 > from localhost/127.0.0.1:43268 > 2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112 > 2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60295 > 2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:26:11,247 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 6d30c88539d511bb9acc13b53bb2a128614f5621 > from localhost/127.0.0.1:43268 > 2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297 > 2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60410 > 2017-08-30 14:29:20,943 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:29:20,945 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 345ae4be7a2a8077ecefd6a155d7e975b38efd48 > from localhost/127.0.0.1:43268 > 2017-08-30 14:29:20,946 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 345ae4be7a2a8077ecefd6a155d7e975b38efd48 from /127.0.0.1:60412 > 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60525 > 2017-08-30 14:32:10,004 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:32:10,006 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 709e2f64a484ff39634fe53d50dc760ceefd1f8e > from localhost/127.0.0.1:43268 > 2017-08-30 14:32:10,006 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 709e2f64a484ff39634fe53d50dc760ceefd1f8e from /127.0.0.1:60527 > 2017-08-30 14:37:51,282 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60734 > 2017-08-30 14:37:51,283 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:37:51,285 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 98fce553f71d4cc5a11c4345482e6cc3a37a91e3 > from localhost/127.0.0.1:43268 > 2017-08-30 14:37:51,286 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 98fce553f71d4cc5a11c4345482e6cc3a37a91e3 from /127.0.0.1:60736 > 2017-08-30 14:37:54,222 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60740 > 2017-08-30 14:37:54,223 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:37:54,225 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 96387068e0bf71c5ff2019dbb1a43b74fb53067a > from localhost/127.0.0.1:43268 > 2017-08-30 14:37:54,225 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 96387068e0bf71c5ff2019dbb1a43b74fb53067a from /127.0.0.1:60742 > 2017-08-30 14:40:44,005 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:60852 > 2017-08-30 14:40:44,006 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:40:44,007 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 430066a8a3febb68ada10d78663b0972b92e66e5 > from localhost/127.0.0.1:43268 > 2017-08-30 14:40:44,008 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 430066a8a3febb68ada10d78663b0972b92e66e5 from /127.0.0.1:60854 > 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:32820 > 2017-08-30 14:45:52,671 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 14:45:52,672 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > ff0c1f371621c2f4ffb7ed06f448841a8a2458ac > from localhost/127.0.0.1:43268 > 2017-08-30 14:45:52,673 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > ff0c1f371621c2f4ffb7ed06f448841a8a2458ac from /127.0.0.1:32822 > 2017-08-30 15:15:24,635 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:33955 > 2017-08-30 15:15:24,636 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 15:15:24,638 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 291f5aeb9306cb94098255237e01ab6735cf42ea > from localhost/127.0.0.1:43268 > 2017-08-30 15:15:24,638 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 291f5aeb9306cb94098255237e01ab6735cf42ea from /127.0.0.1:33957 > 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:34170 > 2017-08-30 15:21:04,189 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 15:21:04,191 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > e46b51d3bd0476b6a8a656469e7546cd933fa478 > from localhost/127.0.0.1:43268 > 2017-08-30 15:21:04,191 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > e46b51d3bd0476b6a8a656469e7546cd933fa478 from /127.0.0.1:34172 > 2017-08-30 15:22:10,141 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:34223 > 2017-08-30 15:22:10,142 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 15:22:10,144 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > cf9700032c229b39e634eeda73284e116314f7bb > from localhost/127.0.0.1:43268 > 2017-08-30 15:22:10,144 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > cf9700032c229b39e634eeda73284e116314f7bb from /127.0.0.1:34225 > 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:34544 > 2017-08-30 15:30:10,133 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 15:30:10,135 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 > from localhost/127.0.0.1:43268 > 2017-08-30 15:30:10,136 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 3b1d4b4d4820c2b74fd4a45d5741f2df940f5189 from /127.0.0.1:34546 > 2017-08-30 16:01:58,159 DEBUG org.apache.flink.runtime.blob. > BlobClient - PUT content addressable BLOB stream to / > 127.0.0.1:35741 > 2017-08-30 16:01:58,160 DEBUG org.apache.flink.runtime.blob. > BlobServerConnection - Received PUT request for content > addressable BLOB > 2017-08-30 16:01:58,162 INFO org.apache.flink.runtime.blob. > BlobCache - Downloading > 6bde2f7a709181065c6710c2252a5846f361ad68 > from localhost/127.0.0.1:43268 > 2017-08-30 16:01:58,162 DEBUG org.apache.flink.runtime.blob. > BlobClient - GET content addressable BLOB > 6bde2f7a709181065c6710c2252a5846f361ad68 from /127.0.0.1:35743 > > 3) There actually was CPU load, but I thought Flink was stuck in a loop or > something, because as you can see from the timestamp on the logs these > lines went on for a couple hours with no sign of actual writes on HBase > (before crashing for unrelated reasons, Kinesis stream was deleted). Since > I was writing on HBase, which I was accessing via Zookeeper, I was > expecting to see the Logs I put inside the writeRecord method of the > OutputFormat. > As for the jstack trace, I'm currently unable to provide one (because I > can't access the kinesis stream right now), but I'll try to emulate the > stream and provide that stack as soon as I can. > > 4) Yes, those were log lines from the TM log in the web dashboard > > Thank you very much for your help > > 2017-08-31 10:29 GMT+02:00 Nico Kruber <n...@data-artisans.com>: > > Hi Federico, > > 1) Which version of Flink are you using? > > 2) Can you also share the JobManager log? > > 3) Why do you think, Flink is stuck at the BlobCache? Is it really > > blocked, or > > do you still have CPU load? Can you post stack traces of the TaskManager > > (TM) > > and JobManager processes when you think they are stuck (using jstack)? > > 4) These PUT requests in the TM logs are strange, unless you showed the TM > > logs in the web interface - did you? > > > > > > Nico > > > > On Thursday, 31 August 2017 09:45:59 CEST Fabian Hueske wrote: > > > Hi Federico, > > > > > > Not sure what's going on there but Nico (in CC) is more familiar with > > > the > > > blob cache and might be able to help. > > > > > > Best, Fabian > > > > > > 2017-08-30 15:35 GMT+02:00 Federico D'Ambrosio <fedex...@gmail.com>: > > > > Hi, > > > > > > > > I have a rather simple Flink job which has a KinesisConsumer as a > > > > source > > > > > > and an HBase table as sink, in which I write using writeOutputFormat. > > > > I'm > > > > > > running it on a local machine with a single taskmanager (2 slots, 2G). > > > > The > > > > > > KinesisConsumer works fine and the connection to the HBase table gets > > > > opened fine (i.e. the open method of the class implementing > > > > OutputFormat > > > > > > gets actually called). > > > > > > > > I'm running the job at a parallelism of 2, while the sink has a > > > > parallelism of 1. The > > > > > > > > Still, looking at the log I see that after opening the connection, the > > > > job > > > > > > gets stuck at lines like this one: > > > > > > > > INFO org.apache.flink.runtime.blob.BlobCache - > > > > Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from > > > > localhost/127.0.0.1:43268 > > > > > > > > Each following one another, like this: > > > > > > > > 2017-08-30 14:17:21,318 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Created BLOB cache storage directory > > > > > > > > /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f 2017-08-30 > > > > 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient > > > > > > > > - PUT content addressable BLOB stream to /127.0.0.1:59937 > > > > > > > > 2017-08-30 14:17:21,323 DEBUG > > > > org.apache.flink.runtime.blob.BlobServerConnection - > > > > Received > > > > > > PUT request for content addressable BLOB 2017-08-30 14:17:21,324 INFO > > > > org.apache.flink.runtime.blob.BlobCache - > > > > Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from > > > > localhost/127.0.0.1:43268 2017-08-30 14:17:21,324 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from > > > > /127.0.0.1:59938 2017-08-30 14:18:13,708 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:59976 2017-08-30 > > > > 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > > > > > > > - Received PUT request for content addressable BLOB 2017-08-30 > > > > > > > > 14:18:13,710 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from > > > > > > > > localhost/127.0.0.1:43268 2017-08-30 14:18:13,710 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from > > > > /127.0.0.1:59978 2017-08-30 14:19:29,811 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:60022 2017-08-30 > > > > 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > > > > > > > - Received PUT request for content addressable BLOB 2017-08-30 > > > > > > > > 14:19:29,814 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from > > > > > > > > localhost/127.0.0.1:43268 2017-08-30 14:19:29,814 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from > > > > /127.0.0.1:60024 2017-08-30 14:21:42,856 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:60110 2017-08-30 > > > > 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > > > > > > > - Received PUT request for content addressable BLOB 2017-08-30 > > > > > > > > 14:21:42,858 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from > > > > > > > > localhost/127.0.0.1:43268 2017-08-30 14:21:42,859 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from > > > > /127.0.0.1:60112 2017-08-30 14:26:11,242 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:60295 2017-08-30 > > > > 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection > > > > > > > > - Received PUT request for content addressable BLOB 2017-08-30 > > > > > > > > 14:26:11,247 INFO org.apache.flink.runtime.blob.BlobCache > > > > > > > > - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from > > > > > > > > localhost/127.0.0.1:43268 2017-08-30 14:26:11,247 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - GET > > > > content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from > > > > /127.0.0.1:60297 2017-08-30 14:29:20,942 DEBUG > > > > org.apache.flink.runtime.blob.BlobClient - PUT > > > > content addressable BLOB stream to /127.0.0.1:60410 > > > > > > > > > > > > My questions are: what is the jobmanager doing here? Why is he taking > > > > ages > > > > > > to do this? How do i speed up this behaviour? > > > > > > > > Thank you very much for your attention, > > > > > > > > Federico D'Ambrosio
signature.asc
Description: This is a digitally signed message part.