Dear Flink users and developers,

I am testing how a Flink streaming application scales on a single node. Below I summarize my configuration and preliminary results. It would be really helpful if you could take some time to review my settings.

test application: flink-1.0.0/examples/streaming/WordCount.jar
input file: enwiki-20160305-pages-articles-multistream-index.txt (747,757,155 bytes)
(https://dumps.wikimedia.org/enwiki/20160305/enwiki-20160305-pages-articles-multistream-index.txt.bz2)

Running environment is as follows:

cpu: 4 * AMD Opteron 6378 (16 cores each), 2.4GHz
memory: 120 GB
os: CentOS 7.2
vm: Java 8u74
flink: flink-1.0.0

My Flink configuration (only the modified settings):

jobmanager.heap.mb: 1024
taskmanager.heap.mb: 2048
taskmanager.numberOfTaskSlots: 64
taskmanager.network.numberOfBuffers: 8192

Keeping the configuration above the same throughout the test, I changed only parallelism.default for each test case:

parallelism.default: 1
parallelism.default: 2
parallelism.default: 4
parallelism.default: 8
parallelism.default: 16
parallelism.default: 32
parallelism.default: 64

The results are at the end of this mail. The application seems to scale well up to parallelism 8, and ``Source -> Flat Map'' generally scales better than ``Keyed Aggregation -> Sink''. The results from the parallel subtasks of ``Keyed Aggregation -> Sink'' seem more consistent than those of ``Source -> Flat Map''.

Do you see anything that I might need to fix in the Flink configuration or the JVM configuration (which I did not touch for this experiment) to improve performance? In particular, the result I got with parallelism 64 does not look good to me. I would also really appreciate any suggestions for things that might be worth trying.

Thank you.

With best regards,
Shinhyung Yang

#==============================================================================
# parallelism.default: 1
#==============================================================================
Source: Read Text File Source -> Flat Map(1/1): 34m 57s
Keyed Aggregation -> Sink: Unnamed(1/1): 47m 05s

#==============================================================================
# parallelism.default: 2
#==============================================================================
Source: Read Text File Source -> Flat Map(1/2): 25m 56s
Source: Read Text File Source -> Flat Map(2/2): 26m 27s
Keyed Aggregation -> Sink: Unnamed(1/2): 34m 09s
Keyed Aggregation -> Sink: Unnamed(2/2): 34m 08s

#==============================================================================
# parallelism.default: 4
#==============================================================================
Source: Read Text File Source -> Flat Map(2/4): 21m 39s
Source: Read Text File Source -> Flat Map(1/4): 21m 39s
Source: Read Text File Source -> Flat Map(4/4): 22m 30s
Source: Read Text File Source -> Flat Map(3/4): 22m 31s
Keyed Aggregation -> Sink: Unnamed(3/4): 23m 20s
Keyed Aggregation -> Sink: Unnamed(1/4): 24m 13s
Keyed Aggregation -> Sink: Unnamed(4/4): 24m 57s
Keyed Aggregation -> Sink: Unnamed(2/4): 25m 35s

#==============================================================================
# parallelism.default: 8
#==============================================================================
Source: Read Text File Source -> Flat Map(1/8): 21m 06s
Source: Read Text File Source -> Flat Map(7/8): 21m 32s
Source: Read Text File Source -> Flat Map(2/8): 21m 45s
Source: Read Text File Source -> Flat Map(6/8): 21m 45s
Source: Read Text File Source -> Flat Map(4/8): 21m 45s
Source: Read Text File Source -> Flat Map(3/8): 21m 45s
Source: Read Text File Source -> Flat Map(8/8): 21m 55s
Source: Read Text File Source -> Flat Map(5/8): 21m 55s
Keyed Aggregation -> Sink: Unnamed(4/8): 22m 08s
Keyed Aggregation -> Sink: Unnamed(3/8): 22m 28s
Keyed Aggregation -> Sink: Unnamed(7/8): 22m 58s
Keyed Aggregation -> Sink: Unnamed(6/8): 22m 59s
Keyed Aggregation -> Sink: Unnamed(5/8): 22m 59s
Keyed Aggregation -> Sink: Unnamed(2/8): 23m 00s
Keyed Aggregation -> Sink: Unnamed(1/8): 23m 47s
Keyed Aggregation -> Sink: Unnamed(8/8): 23m 51s

#==============================================================================
# parallelism.default: 16
#==============================================================================
Source: Read Text File Source -> Flat Map(14/16): 08m 57s
Source: Read Text File Source -> Flat Map(5/16): 13m 00s
Source: Read Text File Source -> Flat Map(10/16): 16m 49s
Source: Read Text File Source -> Flat Map(15/16): 16m 49s
Source: Read Text File Source -> Flat Map(6/16): 17m 54s
Source: Read Text File Source -> Flat Map(2/16): 18m 40s
Source: Read Text File Source -> Flat Map(4/16): 19m 48s
Source: Read Text File Source -> Flat Map(8/16): 19m 48s
Source: Read Text File Source -> Flat Map(16/16): 20m 49s
Source: Read Text File Source -> Flat Map(9/16): 21m 06s
Source: Read Text File Source -> Flat Map(12/16): 21m 41s
Source: Read Text File Source -> Flat Map(3/16): 22m 08s
Source: Read Text File Source -> Flat Map(11/16): 22m 08s
Source: Read Text File Source -> Flat Map(1/16): 24m 13s
Source: Read Text File Source -> Flat Map(13/16): 24m 13s
Source: Read Text File Source -> Flat Map(7/16): 24m 14s
Keyed Aggregation -> Sink: Unnamed(2/16): 24m 23s
Keyed Aggregation -> Sink: Unnamed(12/16): 24m 23s
Keyed Aggregation -> Sink: Unnamed(6/16): 24m 23s
Keyed Aggregation -> Sink: Unnamed(8/16): 24m 23s
Keyed Aggregation -> Sink: Unnamed(14/16): 24m 23s
Keyed Aggregation -> Sink: Unnamed(3/16): 24m 23s
Keyed Aggregation -> Sink: Unnamed(15/16): 24m 23s
Keyed Aggregation -> Sink: Unnamed(4/16): 24m 23s
Keyed Aggregation -> Sink: Unnamed(9/16): 24m 23s
Keyed Aggregation -> Sink: Unnamed(7/16): 24m 24s
Keyed Aggregation -> Sink: Unnamed(11/16): 24m 24s
Keyed Aggregation -> Sink: Unnamed(13/16): 24m 24s
Keyed Aggregation -> Sink: Unnamed(1/16): 25m 07s
Keyed Aggregation -> Sink: Unnamed(5/16): 25m 08s
Keyed Aggregation -> Sink: Unnamed(10/16): 25m 19s
Keyed Aggregation -> Sink: Unnamed(16/16): 25m 21s

#==============================================================================
# parallelism.default: 32
#==============================================================================
Source: Read Text File Source -> Flat Map(14/32): 05m 25s
Source: Read Text File Source -> Flat Map(2/32): 05m 41s
Source: Read Text File Source -> Flat Map(26/32): 07m 24s
Source: Read Text File Source -> Flat Map(15/32): 09m 35s
Source: Read Text File Source -> Flat Map(9/32): 10m 23s
Source: Read Text File Source -> Flat Map(11/32): 10m 40s
Source: Read Text File Source -> Flat Map(31/32): 10m 40s
Source: Read Text File Source -> Flat Map(27/32): 10m 41s
Source: Read Text File Source -> Flat Map(20/32): 13m 25s
Source: Read Text File Source -> Flat Map(29/32): 15m 02s
Source: Read Text File Source -> Flat Map(5/32): 15m 43s
Source: Read Text File Source -> Flat Map(16/32): 16m 00s
Source: Read Text File Source -> Flat Map(21/32): 16m 18s
Source: Read Text File Source -> Flat Map(6/32): 17m 28s
Source: Read Text File Source -> Flat Map(10/32): 18m 37s
Source: Read Text File Source -> Flat Map(25/32): 18m 37s
Source: Read Text File Source -> Flat Map(19/32): 18m 37s
Source: Read Text File Source -> Flat Map(18/32): 19m 30s
Source: Read Text File Source -> Flat Map(28/32): 19m 48s
Source: Read Text File Source -> Flat Map(8/32): 20m 05s
Source: Read Text File Source -> Flat Map(7/32): 20m 05s
Source: Read Text File Source -> Flat Map(22/32): 20m 17s
Source: Read Text File Source -> Flat Map(1/32): 20m 34s
Source: Read Text File Source -> Flat Map(32/32): 21m 56s
Source: Read Text File Source -> Flat Map(13/32): 21m 56s
Source: Read Text File Source -> Flat Map(12/32): 21m 56s
Source: Read Text File Source -> Flat Map(3/32): 21m 56s
Source: Read Text File Source -> Flat Map(30/32): 21m 56s
Source: Read Text File Source -> Flat Map(24/32): 22m 25s
Source: Read Text File Source -> Flat Map(17/32): 22m 26s
Source: Read Text File Source -> Flat Map(23/32): 22m 38s
Source: Read Text File Source -> Flat Map(4/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(12/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(21/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(22/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(11/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(18/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(5/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(24/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(13/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(6/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(9/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(26/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(2/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(3/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(28/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(15/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(19/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(7/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(27/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(25/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(23/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(14/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(17/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(20/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(29/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(32/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(31/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(4/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(8/32): 22m 55s
Keyed Aggregation -> Sink: Unnamed(30/32): 23m 31s
Keyed Aggregation -> Sink: Unnamed(1/32): 23m 31s
Keyed Aggregation -> Sink: Unnamed(10/32): 23m 31s
Keyed Aggregation -> Sink: Unnamed(16/32): 23m 32s

#==============================================================================
# parallelism.default: 64
#==============================================================================
Source: Read Text File Source -> Flat Map(13/64): 12m 48s
Source: Read Text File Source -> Flat Map(38/64): 13m 29s
Source: Read Text File Source -> Flat Map(47/64): 14m 37s
Source: Read Text File Source -> Flat Map(24/64): 17m 56s
Source: Read Text File Source -> Flat Map(25/64): 20m 47s
Source: Read Text File Source -> Flat Map(4/64): 20m 58s
Source: Read Text File Source -> Flat Map(45/64): 21m 55s
Source: Read Text File Source -> Flat Map(14/64): 21m 55s
Source: Read Text File Source -> Flat Map(53/64): 22m 13s
Source: Read Text File Source -> Flat Map(48/64): 23m 50s
Source: Read Text File Source -> Flat Map(50/64): 23m 50s
Source: Read Text File Source -> Flat Map(16/64): 24m 00s
Source: Read Text File Source -> Flat Map(8/64): 24m 11s
Source: Read Text File Source -> Flat Map(3/64): 24m 11s
Source: Read Text File Source -> Flat Map(12/64): 24m 23s
Source: Read Text File Source -> Flat Map(59/64): 24m 23s
Source: Read Text File Source -> Flat Map(49/64): 24m 23s
Source: Read Text File Source -> Flat Map(35/64): 25m 15s
Source: Read Text File Source -> Flat Map(64/64): 26m 11s
Source: Read Text File Source -> Flat Map(40/64): 27m 09s
Source: Read Text File Source -> Flat Map(54/64): 27m 38s
Source: Read Text File Source -> Flat Map(5/64): 28m 56s
Source: Read Text File Source -> Flat Map(28/64): 30m 43s
Source: Read Text File Source -> Flat Map(17/64): 30m 43s
Source: Read Text File Source -> Flat Map(11/64): 30m 43s
Source: Read Text File Source -> Flat Map(7/64): 31m 47s
Source: Read Text File Source -> Flat Map(18/64): 31m 54s
Source: Read Text File Source -> Flat Map(23/64): 31m 54s
Source: Read Text File Source -> Flat Map(32/64): 32m 46s
Source: Read Text File Source -> Flat Map(60/64): 32m 46s
Source: Read Text File Source -> Flat Map(52/64): 32m 46s
Source: Read Text File Source -> Flat Map(19/64): 33m 33s
Source: Read Text File Source -> Flat Map(10/64): 33m 33s
Source: Read Text File Source -> Flat Map(61/64): 33m 59s
Source: Read Text File Source -> Flat Map(2/64): 34m 18s
Source: Read Text File Source -> Flat Map(22/64): 34m 18s
Source: Read Text File Source -> Flat Map(6/64): 34m 18s
Source: Read Text File Source -> Flat Map(26/64): 34m 58s
Source: Read Text File Source -> Flat Map(56/64): 36m 22s
Source: Read Text File Source -> Flat Map(36/64): 37m 07s
Source: Read Text File Source -> Flat Map(42/64): 37m 19s
Source: Read Text File Source -> Flat Map(20/64): 37m 50s
Source: Read Text File Source -> Flat Map(55/64): 38m 34s
Source: Read Text File Source -> Flat Map(63/64): 38m 47s
Source: Read Text File Source -> Flat Map(1/64): 38m 47s
Source: Read Text File Source -> Flat Map(43/64): 40m 19s
Source: Read Text File Source -> Flat Map(51/64): 40m 37s
Source: Read Text File Source -> Flat Map(46/64): 41m 58s
Source: Read Text File Source -> Flat Map(9/64): 42m 04s
Source: Read Text File Source -> Flat Map(57/64): 42m 41s
Source: Read Text File Source -> Flat Map(33/64): 43m 25s
Source: Read Text File Source -> Flat Map(30/64): 43m 25s
Source: Read Text File Source -> Flat Map(44/64): 45m 04s
Source: Read Text File Source -> Flat Map(34/64): 46m 17s
Source: Read Text File Source -> Flat Map(39/64): 46m 40s
Source: Read Text File Source -> Flat Map(58/64): 46m 49s
Source: Read Text File Source -> Flat Map(27/64): 47m 45s
Source: Read Text File Source -> Flat Map(29/64): 47m 45s
Source: Read Text File Source -> Flat Map(31/64): 48m 21s
Source: Read Text File Source -> Flat Map(21/64): 48m 38s
Source: Read Text File Source -> Flat Map(41/64): 49m 41s
Source: Read Text File Source -> Flat Map(15/64): 53m 41s
Source: Read Text File Source -> Flat Map(37/64): 54m 27s
Source: Read Text File Source -> Flat Map(62/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(36/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(61/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(64/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(25/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(23/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(2/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(15/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(18/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(34/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(9/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(27/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(40/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(54/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(24/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(19/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(56/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(8/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(46/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(6/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(58/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(59/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(10/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(55/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(38/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(7/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(20/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(5/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(49/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(12/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(28/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(31/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(29/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(50/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(60/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(26/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(43/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(17/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(47/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(37/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(3/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(13/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(14/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(48/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(30/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(63/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(45/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(35/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(51/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(22/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(44/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(32/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(42/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(52/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(41/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(39/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(4/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(21/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(62/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(57/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(33/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(1/64): 54m 56s
Keyed Aggregation -> Sink: Unnamed(53/64): 55m 24s
Keyed Aggregation -> Sink: Unnamed(11/64): 55m 24s
Keyed Aggregation -> Sink: Unnamed(16/64): 55m 25s
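
To make the scaling trend in these timings easier to read, here is a small Python sketch that converts the reported durations to seconds and computes the speedup relative to parallelism 1. It rests on one assumption that is not stated in the results: the slowest ``Keyed Aggregation -> Sink'' subtask is taken as the finishing time of the whole job. The names SLOWEST_SINK, to_seconds and speedups are made up for this illustration; the durations themselves are copied from the listings.

```python
import re

# Slowest "Keyed Aggregation -> Sink" subtask per run, copied from the
# results in this mail. Treating it as the job's finishing time is an
# assumption made for this sketch.
SLOWEST_SINK = {
    1: "47m 05s",
    2: "34m 09s",
    4: "25m 35s",
    8: "23m 51s",
    16: "25m 21s",
    32: "23m 32s",
    64: "55m 25s",
}


def to_seconds(duration: str) -> int:
    """Convert a duration such as '47m 05s' to a number of seconds."""
    match = re.fullmatch(r"(\d+)m (\d+)s", duration)
    if match is None:
        raise ValueError(f"unexpected duration format: {duration!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


def speedups(times: dict) -> dict:
    """Return the speedup of each run relative to the parallelism-1 run."""
    baseline = to_seconds(times[1])
    return {p: baseline / to_seconds(t) for p, t in times.items()}


if __name__ == "__main__":
    for parallelism, speedup in speedups(SLOWEST_SINK).items():
        efficiency = speedup / parallelism
        print(f"parallelism {parallelism:>2}: "
              f"speedup {speedup:.2f}, efficiency {efficiency:.1%}")
```

Read this way, the job never gets much beyond a 2x speedup, and the run with parallelism 64 is slower than the run with parallelism 1.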