Hi,

I have a fairly simple Flink job with a KinesisConsumer as its source and
an HBase table as its sink, which I write to using writeOutputFormat. I'm
running it on a local machine with a single taskmanager (2 slots, 2G). The
KinesisConsumer works fine and the connection to the HBase table is opened
correctly (i.e. the open method of the class implementing OutputFormat
actually gets called).

I'm running the job at a parallelism of 2, while the sink has a parallelism
of 1.

Still, looking at the log I see that after opening the connection, the job
gets stuck at lines like this one:

INFO  org.apache.flink.runtime.blob.BlobCache - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268

These entries follow one another, with pauses of up to several minutes in
between, like this:

2017-08-30 14:17:21,318 INFO  org.apache.flink.runtime.blob.BlobCache - Created BLOB cache storage directory /tmp/blobStore-8a2a96af-b836-4c95-b79a-a4b80929126f
2017-08-30 14:17:21,321 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:59937
2017-08-30 14:17:21,323 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:17:21,324 INFO  org.apache.flink.runtime.blob.BlobCache - Downloading 3ff486dff4c4eaafdab42b30a877326e62bfca82 from localhost/127.0.0.1:43268
2017-08-30 14:17:21,324 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 3ff486dff4c4eaafdab42b30a877326e62bfca82 from /127.0.0.1:59938
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:59976
2017-08-30 14:18:13,708 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:18:13,710 INFO  org.apache.flink.runtime.blob.BlobCache - Downloading 2f5283326aab77faa047b705cd1d6470035b3b7d from localhost/127.0.0.1:43268
2017-08-30 14:18:13,710 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 2f5283326aab77faa047b705cd1d6470035b3b7d from /127.0.0.1:59978
2017-08-30 14:19:29,811 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60022
2017-08-30 14:19:29,812 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:19:29,814 INFO  org.apache.flink.runtime.blob.BlobCache - Downloading f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from localhost/127.0.0.1:43268
2017-08-30 14:19:29,814 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB f91fd7ecec6f90809f52ee189cb48aa1e30b04f6 from /127.0.0.1:60024
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60110
2017-08-30 14:21:42,856 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:21:42,858 INFO  org.apache.flink.runtime.blob.BlobCache - Downloading 8638bdf78b0e540786de6c291f710a8db447a2b4 from localhost/127.0.0.1:43268
2017-08-30 14:21:42,859 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 8638bdf78b0e540786de6c291f710a8db447a2b4 from /127.0.0.1:60112
2017-08-30 14:26:11,242 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60295
2017-08-30 14:26:11,243 DEBUG org.apache.flink.runtime.blob.BlobServerConnection - Received PUT request for content addressable BLOB
2017-08-30 14:26:11,247 INFO  org.apache.flink.runtime.blob.BlobCache - Downloading 6d30c88539d511bb9acc13b53bb2a128614f5621 from localhost/127.0.0.1:43268
2017-08-30 14:26:11,247 DEBUG org.apache.flink.runtime.blob.BlobClient - GET content addressable BLOB 6d30c88539d511bb9acc13b53bb2a128614f5621 from /127.0.0.1:60297
2017-08-30 14:29:20,942 DEBUG org.apache.flink.runtime.blob.BlobClient - PUT content addressable BLOB stream to /127.0.0.1:60410
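
To put numbers on the pauses, the gaps between successive BlobClient PUT
entries can be computed from the log. The following is only a minimal
sketch in Python, separate from the Flink job itself: the function name
put_gaps is hypothetical, and it assumes every log entry begins with a
timestamp in the format shown in the excerpt (continuation lines of a
wrapped entry are joined back onto it).

```python
import re
from datetime import datetime, timedelta

# Leading timestamp of a log entry, e.g. "2017-08-30 14:17:21,321".
# Assumption: every entry starts with a timestamp in exactly this format.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})")


def put_gaps(log_text):
    """Return the gaps, in seconds, between consecutive BlobClient PUT entries."""
    lines = log_text.splitlines()
    times = []
    for i, line in enumerate(lines):
        match = TIMESTAMP.match(line)
        if not match:
            continue  # continuation line of a wrapped entry, or unrelated text
        # Join continuation lines (those without a timestamp) onto this entry.
        entry = line
        j = i + 1
        while j < len(lines) and not TIMESTAMP.match(lines[j]):
            entry += " " + lines[j].strip()
            j += 1
        if "BlobClient" in entry and "PUT content addressable BLOB" in entry:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            times.append(stamp + timedelta(milliseconds=int(match.group(2))))
    # Difference between each PUT and the one before it.
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]
```

Run on the excerpt above, this should report gaps ranging from roughly 52
seconds to roughly 268 seconds between PUT requests.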


My questions are: what is the jobmanager doing here? Why is it taking ages
to do this? How can I speed this up?

Thank you very much for your attention,

Federico D'Ambrosio
