A further note. I just looked at all 100 receivers this way. They all have the same inflection point, at 28000 messages. But ... it could also have been a particular point in time that triggered it -- because they all hit the 28000-message mark within 2.5 seconds of each other.
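(A minimal sketch of the kind of per-1000 logging described below -- not the actual test client. The on_message() hook is a hypothetical stand-in for wherever the client's receive callback lives. Logging absolute wall-clock time along with the count makes it possible to line the 100 receivers' logs up either by message count or by instant, which is exactly the distinction in question:)

  #include <stdio.h>
  #include <time.h>

  static long received = 0;

  /* Hypothetical receive callback -- called once per delivered message. */
  static void on_message(void)
  {
      received++;
      if (received % 1000 == 0) {
          struct timespec now;
          clock_gettime(CLOCK_REALTIME, &now);
          /* Absolute epoch time, so logs from all receivers can be
             compared by count or by wall-clock instant. */
          printf("received %ld messages at %ld.%09ld\n",
                 received, (long)now.tv_sec, (long)now.tv_nsec);
      }
  }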
On Fri, Feb 12, 2021 at 11:14 AM Michael Goulish <[email protected]> wrote:

> Sorry -- I didn't realize this list would remove the image of my graph.
>
> Can everyone see this?
> <https://www.dropbox.com/s/4t1xbp46y57mfgn/messages_vs_time.jpg?dl=0>
>
> On Fri, Feb 12, 2021 at 7:51 AM Chuck Rolke <[email protected]> wrote:
>
>> The mail list scrubs attachments. Maybe create a jira and add the image to that.
>>
>> ----- Original Message -----
>> > From: "Michael Goulish" <[email protected]>
>> > To: [email protected]
>> > Sent: Friday, February 12, 2021 2:43:40 AM
>> > Subject: Re: Dispatch Router: Changing buffer size in buffer.c blows up AMQP.
>> >
>> > *Can you explain how you are measuring AMQP throughput? What message sizes are you using? Credit windows? How many senders and receivers? Max frame size?*
>> >
>> > Oops! Good point. Describe the Test!
>> >
>> > 100 senders, 100 receivers, 100 unique addresses -- each sender sends to one receiver.
>> > Each sender is throttled to 100 messages per second (Apparently I Really Like the number 100).
>> > And message size is ... wait for it ... 100. (That's payload size, so really 139 or something like that.)
>> >
>> > Credit window is 1000.
>> >
>> > I can't find anything in my router config or in my C client code about max frame size. What do I get by default? Or, how can I check that?
>> >
>> > Here is how I measured throughput. First, I noticed that when I made the test go longer -- sending 20 million total messages instead of the original 1 million -- it was taking much longer than I expected. So I had each receiver log a message every time its total received-message count was divisible by 1000.
>> >
>> > What I saw was that the first thousand came after 11 seconds (just about as expected, given the sender throttle of 100/sec) but that later thousands became slower. By the time I stopped the test -- after more than 50,000 messages per receiver -- each thousand was taking ... well ... look at this very interesting graph that I made of one receiver's behavior.
>> >
>> > The graph is made by just noting the time (since the test started) at which each thousandth message arrives and plotting that -- so we expect an upward-sloping straight line whose slope is determined by how long it takes to receive each 1000 messages (should be close to 10 seconds).
>> >
>> > [image: messages_vs_time.jpg]
>> >
>> > I'm glad I graphed this! This inflection point was a total shock to me.
>> > NOTE TO SELF: always graph everything from now on forever.
>> >
>> > I guess Something Interesting happened at about 28 seconds!
>> >
>> > Maybe what I need ... is a reading from "qdstat -m" just before and after that inflection point!?
>> >
>> > On Thu, Feb 11, 2021 at 5:37 PM Ted Ross <[email protected]> wrote:
>> >
>> > > On Thu, Feb 11, 2021 at 2:08 PM Michael Goulish <[email protected]> wrote:
>> > >
>> > > > OK, so in the Dispatch Router file src/buffer.c I changed this:
>> > > >     size_t BUFFER_SIZE = 512;
>> > > > to this:
>> > > >     size_t BUFFER_SIZE = 4096;
>> > > >
>> > > > Gordon tells me that's like 8 times bigger.
>> > > >
>> > > > It makes a terrific difference in throughput in the TCP adapter, and if you limit the sender to the throughput that the receiver can accept, it can go Real Fast with no memory bloat. (Like 15 Gbit/sec.)
>> > > >
>> > > > But.
>> > > > AMQP throughput is Not Happy with this change.
>> > > >
>> > > > Some of the managed fields grow rapidly (although not enough to account for total memory growth) -- and throughput gradually drops to a crawl.
>> > > >
>> > > > Here are the fields that increase dramatically (like 10x or more -- marked with asterisks) and the ones that don't change much:
>> > > >
>> > > > qd_bitmask_t
>> > > > *qd_buffer_t*
>> > > > qd_composed_field_t
>> > > > qd_composite_t
>> > > > qd_connection_t
>> > > > qd_hash_handle_t
>> > > > qd_hash_item_t
>> > > > qd_iterator_t
>> > > > *qd_link_ref_t*
>> > > > qd_link_t
>> > > > qd_listener_t
>> > > > qd_log_entry_t
>> > > > qd_management_context_t
>> > > > *qd_message_content_t*
>> > > > *qd_message_t*
>> > > > qd_node_t
>> > > > qd_parse_node_t
>> > > > qd_parse_tree_t
>> > > > qd_parsed_field_t
>> > > > qd_session_t
>> > > > qd_timer_t
>> > > > *qdr_action_t*
>> > > > qdr_address_config_t
>> > > > qdr_address_t
>> > > > qdr_connection_info_t
>> > > > qdr_connection_t
>> > > > qdr_connection_work_t
>> > > > qdr_core_timer_t
>> > > > qdr_delivery_cleanup_t
>> > > > *qdr_delivery_ref_t*
>> > > > *qdr_delivery_t*
>> > > > qdr_field_t
>> > > > qdr_general_work_t
>> > > > qdr_link_ref_t
>> > > > qdr_link_t
>> > > > qdr_link_work_t
>> > > > qdr_query_t
>> > > > qdr_terminus_t
>> > > >
>> > > > Does anyone have a great idea about any experiment I could do, instrumentation I could add, whatever -- that might help to further diagnose what is going on?
>> > >
>> > > Can you explain how you are measuring AMQP throughput? What message sizes are you using? Credit windows? How many senders and receivers? Max frame size?
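P.S. On the "qdstat -m just before and after that inflection point" idea -- a rough sketch of one way to capture that, assuming qdstat is on the PATH and the router is on its default address (pass -b <host:port> otherwise). It just timestamps a "qdstat -m" snapshot once a second so the pool counts can be lined up against the receivers' per-1000 logs:

  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <unistd.h>

  int main(void)
  {
      /* Sample for five minutes -- enough to straddle the inflection. */
      for (int i = 0; i < 300; i++) {
          printf("=== qdstat -m at t=%ld ===\n", (long)time(NULL));
          fflush(stdout);
          if (system("qdstat -m") != 0) {
              fprintf(stderr, "qdstat failed -- is the router running?\n");
              return 1;
          }
          sleep(1);
      }
      return 0;
  }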
