Jira (PUP-2885) Periodic timeouts when reading from master
Kurt Wall assigned an issue to Unassigned
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Kurt Wall
Assignee: Kurt Wall

This message was sent by Atlassian JIRA (v6.1.4#6159-sha1:44eaede)
--
You received this message because you are subscribed to the Google Groups "Puppet Bugs" group. To unsubscribe from this group and stop receiving emails from it, send an email to puppet-bugs+unsubscr...@googlegroups.com. To post to this group, send email to puppet-bugs@googlegroups.com. Visit this group at http://groups.google.com/group/puppet-bugs. For more options, visit https://groups.google.com/d/optout.
Jira (PUP-2885) Periodic timeouts when reading from master
Kurt Wall assigned an issue to Kurt Wall
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Kurt Wall
Assignee: Kurt Wall
Jira (PUP-2885) Periodic timeouts when reading from master
Joshua Partlow commented on an issue
Re: Periodic timeouts when reading from master

Kurt Wall, I think you can FR this by creating a variable number of module lib files and setting very small agent configtimeout values.

Puppet / PUP-2885 Periodic timeouts when reading from master
Periodically, there are failures where the agent performs a pluginsync, a number of files are downloaded from the master, and eventually one of the files hangs, times out, and causes the pluginsync to abort.
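One way to set up the reproduction suggested above is to generate a throwaway module with many lib files, so that pluginsync has plenty to download. This is a minimal sketch only: the module name `pup2885_filler`, the file naming scheme, and the helper `generate_lib_files` are all hypothetical, and the demonstration writes to a temporary directory rather than a real modulepath.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: writes `count` trivial Ruby files under
# <modulepath>/<module_name>/lib/ so that pluginsync has `count` files to sync.
def generate_lib_files(modulepath, count, module_name = 'pup2885_filler')
  lib_dir = File.join(modulepath, module_name, 'lib')
  FileUtils.mkdir_p(lib_dir)
  (1..count).map do |i|
    path = File.join(lib_dir, format('filler_%04d.rb', i))
    File.write(path, "# filler file #{i} for exercising pluginsync\n")
    path
  end
end

# Demonstration against a temporary directory, not a real master.
Dir.mktmpdir do |dir|
  created = generate_lib_files(dir, 25)
  puts "created #{created.size} files under #{dir}"
end
```

Assuming Puppet 3.x, where settings can be passed as command-line options, running the agent with a small value such as `--configtimeout 1` against a master serving such a module should make the timeout easier to hit.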
Jira (PUP-2885) Periodic timeouts when reading from master
Joshua Partlow commented on an issue
Re: Periodic timeouts when reading from master

Merged to master in 30e1d28
Jira (PUP-2885) Periodic timeouts when reading from master
Kylo Ginsberg updated an issue
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Kylo Ginsberg
Sprint: Week 2014-6-25 to 2014-7-9, Week 2014-7-9 to 2014-7-23
Jira (PUP-2885) Periodic timeouts when reading from master
Henrik Lindberg commented on an issue
Re: Periodic timeouts when reading from master

Yes, I think the change in the PR just moved the point at which the failure we are trying to avoid occurs: it now happens when an individual read times out. The real fix seems to be to fail the operation if plugin sync times out. There is no way you can run with a partial result and get a defined result. While a read_timeout of 0 can potentially hang the agent completely (unless TCP keepalive is used), it may actually be preferable, since at least the result is defined :-D
Jira (PUP-2885) Periodic timeouts when reading from master
Joshua Partlow commented on an issue
Re: Periodic timeouts when reading from master

configtimeout is a confusing setting because of its multiple effects. Removing the Timeout.timeout call from Downloader#evaluate does not prevent a long-running catalog retrieval from timing out based on the configtimeout setting, because Connection#read_timeout is still set to configtimeout. And since a failed pluginsync never aborted an agent run, I believe the PR, as is, is backwards compatible while still removing the extra timeout constraint that previously would cut off a pluginsync part way through if the time for retrieving /all/ files within the pluginsync exceeded configtimeout.

We could create new read_timeout and open_timeout settings and default them to configtimeout, which we mark deprecated? And then reset their defaults in 4.0. Though Henrik Lindberg raised some concerns above about defaulting read_timeout to 0.
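The proposed defaulting could be sketched roughly as follows. This is an illustration only, not Puppet's settings code: the plain hash standing in for the settings, the helper `resolve_timeouts`, and the constant name are hypothetical, and `open_timeout` / `read_timeout` are the proposed setting names rather than existing ones.

```ruby
# The 2 minute configtimeout default discussed in this thread, in seconds.
DEFAULT_CONFIGTIMEOUT = 120

# Hypothetical helper: each new setting falls back to the (deprecated)
# configtimeout value when it has not been set explicitly.
def resolve_timeouts(settings)
  legacy = settings.fetch(:configtimeout, DEFAULT_CONFIGTIMEOUT)
  {
    open_timeout: settings.fetch(:open_timeout, legacy),
    read_timeout: settings.fetch(:read_timeout, legacy)
  }
end

p resolve_timeouts({})
p resolve_timeouts(configtimeout: 30, read_timeout: 300)
```

With this shape, existing configurations that only set configtimeout keep their current behavior, and the defaults of the two new settings can be changed independently later.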
Jira (PUP-2885) Periodic timeouts when reading from master
Joshua Partlow assigned an issue to Joshua Partlow
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Joshua Partlow
Assignee: Joshua Partlow
Jira (PUP-2885) Periodic timeouts when reading from master
Henrik Lindberg commented on an issue
Re: Periodic timeouts when reading from master

afaict, the Timeout can be removed since it can cause a timeout in the middle of productive work.

Changing the read timeout to 0 (indefinite) has a side effect that may not be wanted. I am assuming that an agent does a connect (which may time out) and then a read (that does not time out). Such a configuration may cause other agents to be locked out from connecting if the server runs out of connections. I am assuming that a read timeout would cause an agent to re-connect after a random time (or at its next scheduled time to run). OTOH, if it is expected that a server should be configured to handle a number of connections >= the number of agents, then it does not matter that much (only memory consumption per open connection on the server side).

Another problem with no read timeout is that the agent may wait forever on the connection. If something happens to the network while it is waiting, it does not really have a connection, but the loss of this connection goes undetected (if the socket does not use keepalive, a broken network goes undetected). If the connection is lost without the agent knowing it, it will never terminate the read and never get any new data. If we use keepalive on the sockets, then that will ensure that the read will fail if it does not get an ACK on the keepalive probe.

If we break up configtimeout into two settings, configtimeout_connect and configtimeout_read, what are the expectations for a backwards-compatible change? Can we simply set the read to 0 and the connect to 2m, and deprecate configtimeout? (Its description says "How long the client should wait for the configuration to be retrieved before considering it a failure. This can help reduce flapping if too many clients contact the server at one time. #{AS_DURATION}".)

That description does not match the actual behavior, since the value is used for both the connect and the read timeout but not for the entire transaction. If such a timeout is wanted, we should not remove the outer timeout. Instead, there should be three values: configtimeout, connect_timeout, and read_timeout. The configtimeout would then be set to something much longer.
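For reference, enabling keepalive on a TCP socket in Ruby can be sketched as below. This only shows the socket option itself; the helper name `keepalive_socket` is hypothetical, and how such a socket would be wired into the agent's HTTP connection is not something this thread specifies.

```ruby
require 'socket'

# Hypothetical helper: creates an unconnected TCP socket with SO_KEEPALIVE on,
# so that the kernel sends keepalive probes once the connection is idle.
def keepalive_socket
  sock = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, true)
  sock
end

sock = keepalive_socket
# Read the option back to confirm it was applied.
puts sock.getsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE).bool
sock.close
```

When the first probe is sent and how many unanswered probes are tolerated are operating system settings. On Linux the idle time before the first probe defaults to two hours, so keepalive alone may detect a dead connection much later than a read timeout would.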
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker assigned an issue to Henrik Lindberg
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Andy Parker
Assignee: Henrik Lindberg
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker commented on an issue
Re: Periodic timeouts when reading from master

I think the correct fix for this will involve:
- Audit that the timeout that Josh identified isn't needed
- Remove the timeout if it turns out that it truly is redundant
- Push the timeout down closer to where it is needed if it turns out there are other issues the timeout could have been covering.
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker updated an issue
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Andy Parker
Sprint: Week 2014-7-23 to 2014-8-6
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker updated an issue
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Andy Parker
Story Points: 1
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker updated an issue
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Andy Parker
Fix Version/s: 3.7.0
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker assigned an issue to Unassigned
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Andy Parker
Assignee: Andy Parker
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker updated an issue
Puppet / PUP-2885 Periodic timeouts when reading from master
Change By: Andy Parker
Sprint: Week 2014-6-25 to 2014-7-9 (was: Week 2014-7-23 to 2014-8-6)
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker commented on an issue
Re: Periodic timeouts when reading from master

I traced back the timeout for the entire pluginsync as far as I could and ended up at commit 25b28c, which is from 2008. It looks like that has always been there. I suspect that it was there first and then the other timeouts were added later. I agree with Josh that there should only be one or the other of these timeouts.
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker commented on an issue
Re: Periodic timeouts when reading from master

Oh! I hadn't noticed the 2 minute timeout around the entire pluginsync. That is a bad thing and possibly the thing that we are hitting. In the apache log that showed the problem we could see that there was about 5 seconds between each request, which means that it would time out if there were more than 12 files being synced. In PE this happens pretty often because it always has stdlib and the tests force complete resyncs pretty often.
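The effect of a single timeout around the whole sync, as opposed to one per file, can be illustrated with a scaled-down simulation. This is a sketch under stated assumptions, not Puppet code: `sleep` stands in for a file download, the method names are hypothetical, and the durations are fractions of a second rather than the real 2 minute budget.

```ruby
require 'timeout'

# Simulates syncing `file_count` files, each taking `per_file` seconds,
# with ONE timeout of `budget` seconds wrapped around the whole loop.
def sync_with_outer_timeout(file_count, per_file, budget)
  synced = 0
  Timeout.timeout(budget) do
    file_count.times do
      sleep(per_file) # stand-in for downloading one file
      synced += 1
    end
  end
  [:completed, synced]
rescue Timeout::Error
  [:timed_out, synced]
end

# Same work, but the `budget` applies to each file individually.
def sync_with_per_file_timeout(file_count, per_file, budget)
  synced = 0
  file_count.times do
    Timeout.timeout(budget) { sleep(per_file) }
    synced += 1
  end
  [:completed, synced]
rescue Timeout::Error
  [:timed_out, synced]
end

p sync_with_outer_timeout(10, 0.05, 0.2)
p sync_with_per_file_timeout(10, 0.05, 0.2)
```

In the first call every individual download is well within the budget, yet the sync is cut off part way through because the total exceeds it. In the second call the same downloads all complete.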
Jira (PUP-2885) Periodic timeouts when reading from master
Josh Cooper commented on an issue
Re: Periodic timeouts when reading from master

I think the agent is doing two things wrong when it comes to network communications. First, it sets socket connect and read timeout to Puppet[:configtimeout], which defaults to 2 minutes:

    @connection.read_timeout = Puppet[:configtimeout]
    @connection.open_timeout = Puppet[:configtimeout]

The connect timeout (open_timeout) is okay, though I would prefer that we expose these socket timeouts as separate settings. The read timeout should really be 0 (indefinite). This way the agent will block indefinitely while waiting for the master's response, and if the master closes the connection, then the client will receive a TCP RST and be interrupted.

The second issue is the pluginsync is performed in a timeout block with a default 2 minute timeout. This doesn't make any sense. The timeout will be triggered if the agent takes more than 2 minutes to complete the entire pluginsync operation, even if it is making progress downloading individual files:

    ::Timeout.timeout(Puppet[:configtimeout]) do
      catalog.apply do |trans|
        trans.changed?.find_all do |resource|
          yield resource if block_given?
          files << resource[:path]
        end
      end
    end

I think we should remove the Timeout block and change the read socket timeout to 0.
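Exposing the two socket timeouts separately might look roughly like the sketch below, assuming the underlying connection is a Ruby Net::HTTP object. The helper `build_connection` and the host name are hypothetical. The sketch passes `nil` rather than 0 to express "no read limit", which is an assumption about how the proposal would map onto Net::HTTP and not something stated in this thread.

```ruby
require 'net/http'

# Hypothetical helper: builds an HTTP connection object whose connect and
# read timeouts are configured independently. Net::HTTP.new does not open
# a connection, so nothing is sent over the network here.
def build_connection(host, port, open_timeout:, read_timeout:)
  http = Net::HTTP.new(host, port)
  http.open_timeout = open_timeout # seconds allowed for establishing the connection
  http.read_timeout = read_timeout # seconds allowed per read; nil means no limit
  http
end

# 2 minute connect timeout, no read timeout (hypothetical host name).
conn = build_connection('puppet.example.com', 8140, open_timeout: 120, read_timeout: nil)
puts "open_timeout=#{conn.open_timeout.inspect} read_timeout=#{conn.read_timeout.inspect}"
```

Keeping the two values as separate arguments makes it possible to bound connection setup while still letting a slow but live response run to completion.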
Jira (PUP-2885) Periodic timeouts when reading from master
Andy Parker created an issue
Puppet / PUP-2885 Periodic timeouts when reading from master
Issue Type: Bug
Affects Versions: 3.6.2
Assignee: Andy Parker
Components: Networking Services
Created: 02/Jul/14 3:02 PM
Environment: Rack, Passenger, Apache
Priority: Normal
Reporter: Andy Parker

Periodically, there are failures where the agent performs a pluginsync, a number of files are downloaded from the master, and eventually one of the files hangs, times out, and causes the pluginsync to abort.