We are trying to load and transform a large amount of data using the IgniteDataStreamer with a custom StreamReceiver. We'd like this to run a lot faster, but we cannot find anything that is close to saturated except the data-streamer threads and queues. This is Ignite 2.5, with Ignite persistence enabled and enough memory to fit all the data.
I was looking to turn some knob, like the size of a thread pool, to increase the throughput, but I can't find any bottleneck. If I turn up the demand, the throughput does not increase and the per-transaction latency increases, which would indicate a bottleneck somewhere.

The application has loaded about 900 million records of type A at this point, and now we would like to load 2.5B records of type B. Records of type A have a key and a unique ID. Records of type B have a different key type, plus a foreign field that is A's unique ID. The key we use in Ignite for record B is (B's key, A's key as affinity key). We also maintain caches to map A's ID back to its key, and something similar for B.

For each record, the stream receiver starts a pessimistic transaction. We end up with 1 local get, 2-3 gets with no affinity (i.e. 50% local on two nodes), and 2-4 puts before we commit the transaction. The caches are FULL_SYNC. There are several fields with indices.

I've simplified this down to two nodes, with 4 caches, each with one backup, all with WAL logging disabled. The two nodes have 256GB of memory, 32 CPUs, and local SSDs that are unmirrored (i3.8xlarge on AWS). The network is supposed to be 10 Gb. The dataset is basically in memory, and with the WAL disabled there is very little I/O. Disabling the WAL only pushed the transaction rate from about 1750 to about 2000 TPS.

The CPU doesn't get above 20%, the network bandwidth is only about 6MB/s from each node, and there are only about 1500 packets per second per node. The read wait time on the SSDs is only enough to lock up a single thread, and there are no writes except during checkpoints.

When I look at thread dumps, there is no obvious bottleneck except for the DataStreamer threads. Doubling the number of DataStreamer threads from the current 64 to 128 has no effect on throughput. Looking via MXBeans, where I have a fix for IGNITE-7616, the DataStreamer pool is saturated. The "StripedExecutor" is not.
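As a rough sanity check on these numbers, Little's law (in-flight work = throughput × latency) can be applied to the figures above. The sketch below is only arithmetic, not Ignite code. It assumes that each DataStreamer thread holds one transaction at a time, that both nodes run receivers, and that the ~2000 TPS figure is cluster-wide; none of these is confirmed here, and the class and method names are made up for illustration.

```java
/** Back-of-envelope check using Little's law: inFlight = throughput * latency. */
final class LittlesLawCheck {

    /** Average latency (ms) implied by a number of in-flight transactions and a rate. */
    static double impliedLatencyMs(int inFlight, double tps) {
        return inFlight * 1000.0 / tps;
    }

    /** Highest rate (TPS) reachable with a given concurrency if latency stays fixed. */
    static double throughputCeiling(int inFlight, double latencyMs) {
        return inFlight * 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        int nodes = 2;             // from the post
        double observedTps = 2000; // from the post; assumed to be cluster-wide

        // Assumption: every DataStreamer thread is busy with exactly one transaction.
        for (int threadsPerNode : new int[] {64, 128}) {
            int inFlight = threadsPerNode * nodes;
            System.out.printf("%d threads/node -> implied latency %.1f ms%n",
                    threadsPerNode, impliedLatencyMs(inFlight, observedTps));
        }
    }
}
```

Under those assumptions, holding throughput constant while doubling the threads simply doubles the implied per-transaction latency, which matches the observation that more demand raises latency without raising TPS. That pattern usually points at waiting (locks, round trips, or a serialized stage) rather than at an undersized pool.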
With the WAL enabled, the "StripedExecutor" shows some bursty load; when it is disabled, its active thread count and queue stay low. The work is distributed across the StripedExecutor threads. The non-DataStreamer thread pools all frequently go to 0 active threads, while the DataStreamer pool stays backed up.

With the WAL on and 64 DataStreamer threads, there tended to be about 53 "Owner transactions" on the node. A snapshot of the outstanding transactions follows.

Is there another place to look? The DS threads tend to be waiting on futures, and the other threads are consistent with the relatively ...

Thanks,
-DH

f0a49c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 104 33549c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal], dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 134 b0949c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 114 2ca49c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 104 96349c53561-00000000-08a9-7ea9-0000-000000000002=PREPARED, NEAR, DURATION: 134 9ca49c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 28f39c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 215 a2649c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 124 e7849c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e 
[ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 114 06849c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 114 89849c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 114 35549c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 134 f0449c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 134 eb849c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 114 85549c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 124 d8a49c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 90b49c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 94 2b649c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 124 ee949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal], dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 104 bd649c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 124 e3949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARED, NEAR, DURATION: 114 bf949c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 18749c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 114 fca49c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 
4e849c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 95449c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 134 b2b49c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 94 19949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal], 6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 104 efa49c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 94 37a49c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 104 ec649c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal], dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 124 6a949c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 bc749c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 d6849c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 f3649c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, 
ip-10-32-97-243.ec2.internal], dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 124 a4749c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 124 61a49c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 f4a49c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 104 30649c53561-00000000-08a9-7ea9-0000-000000000002=PREPARED, NEAR, DURATION: 124 0c049c53561-00000000-08a9-7ea9-0000-000000000002=PREPARED, NEAR, DURATION: 205 1c949c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 a0849c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal], 6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 114 e9649c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal], 6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 124 f7949c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 22649c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 124 60449c53561-00000000-08a9-7ea9-0000-000000000002=PREPARED, NEAR, DURATION: 134 36849c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 114 29949c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 25449c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 134 
67449c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 134 49849c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 97949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 104 8d449c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 134 74949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 114 26849c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 06949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 114 08e39c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 215 43a49c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 72949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal], dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 6fa49c53561-00000000-08a9-7ea9-0000-000000000002=PREPARED, NEAR, DURATION: 94 0b949c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 7b749c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal], 
6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 114 ec949c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 0fa49c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 104 42749c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 124 09349c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 134 bb649c53561-00000000-08a9-7ea9-0000-000000000002=PREPARED, NEAR, DURATION: 124 7f849c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 114 4f849c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 06549c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal], dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 124 9e749c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 a0949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 ff649c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 124 c9449c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, 
ip-10-32-98-209.ec2.internal]], DURATION: 134 18849c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 4d749c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal], 6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 114 62949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 e0749c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 124 3d949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 104 6d949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 104 dc049c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal], dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 205 6f949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 104 ed749c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal], dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 114 ad949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: 
[dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal]], DURATION: 104 25949c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 114 20549c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: [dae1a619-4886-4001-8ac5-6651339c67b7 [ip-172-17-0-1.ec2.internal, ip-10-32-98-209.ec2.internal], 6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, ip-10-32-97-243.ec2.internal]], DURATION: 134
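
A snapshot like the one above is easier to reason about once it is tallied by state. The following is a minimal sketch that counts entries per transaction state and averages the DURATION values, assuming every entry has the shape `<xid>=<STATE>, NEAR[, PRIMARY: [...]], DURATION: <n>` as in the dump. The class name and the regular expression are illustrative and are not part of Ignite; DURATION is used in whatever unit the dump prints.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Tallies a pasted transaction snapshot by state (illustrative helper, not an Ignite API). */
final class TxSnapshotTally {

    // Captures the state after '=' and the number after "DURATION: ".
    // DOTALL lets an entry wrap across line breaks, as it does in the mail.
    private static final Pattern ENTRY =
            Pattern.compile("=([A-Z_]+), NEAR.*?DURATION: (\\d+)", Pattern.DOTALL);

    /** Number of transactions seen in each state. */
    static Map<String, Integer> countByState(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = ENTRY.matcher(dump);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    /** Mean of all DURATION values, or 0 if the dump has no entries. */
    static double meanDuration(String dump) {
        long sum = 0;
        int n = 0;
        Matcher m = ENTRY.matcher(dump);
        while (m.find()) {
            sum += Long.parseLong(m.group(2));
            n++;
        }
        return n == 0 ? 0.0 : (double) sum / n;
    }

    public static void main(String[] args) {
        // Three entries copied from the snapshot; paste the full dump here instead.
        String dump =
            "b0949c53561-00000000-08a9-7ea9-0000-000000000002=ACTIVE, NEAR, DURATION: 114 "
          + "2ca49c53561-00000000-08a9-7ea9-0000-000000000002=PREPARING, NEAR, PRIMARY: "
          + "[6d3f06d6-3346-4ca7-8d5d-b5d8af2ad12e [ip-172-17-0-1.ec2.internal, "
          + "ip-10-32-97-243.ec2.internal]], DURATION: 104 "
          + "96349c53561-00000000-08a9-7ea9-0000-000000000002=PREPARED, NEAR, DURATION: 134";

        System.out.println(countByState(dump));
        System.out.println(meanDuration(dump));
    }
}
```

A large share of entries in PREPARING, each listing one or two remote primaries, would suggest that most of a transaction's lifetime is spent waiting on the prepare round trip rather than on local work. That reading is an inference from the snapshot, not something the tally proves.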
