On 05/07/12 18:56, Luke Kanies wrote:
> On Jul 5, 2012, at 9:49 AM, Andrew Parker wrote:
> 
>> As Deepak said, we are taking a look over Brice's patches right
>> now. Initially we'll target them at 3.0 and then we'll probably
>> move on to back porting and tuning a bit on 2.7 after we have 3.0
>> stabilized. At the same time we have been taking a look at the
>> catalog retrieval time problem. Based on the discussion that we had
>> a while ago on this list, I think the plan of attack is to remove
>> the YAML translation for the caching. I haven't seen any numbers
>> for the speed improvement that we get for that yet.
> 
> Peter Meier has shown that storing to yaml takes at least a minute
> and often 2-3 minutes on his systems, and that same catalog is
> transferred in json over the wire to his agents, which takes a
> negligible amount of time.  So, while we don't have a bunch of
> independent runs showing the specific wins, I think we have enough
> anecdotal data that we can be sure.

I remember we switched to ZAML back then, and that was better than the
default Ruby YAML.
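
To get comparable numbers for the YAML vs. JSON question, something like
the following minimal sketch could be used. It is only an illustration: the
fake_catalog helper and its resource layout are hypothetical stand-ins, not
a real Puppet catalog, and it uses Ruby's stock YAML, JSON, and Benchmark
libraries rather than Puppet's own serialization code.

```ruby
require 'benchmark'
require 'json'
require 'yaml'

# Hypothetical stand-in for a catalog: a hash holding many resource-like
# hashes. A real Puppet catalog is richer than this.
def fake_catalog(count)
  {
    'name' => 'node.example.com',
    'resources' => Array.new(count) do |i|
      {
        'type' => 'File',
        'title' => "/tmp/file#{i}",
        'parameters' => { 'ensure' => 'file', 'mode' => '0644' }
      }
    end
  }
end

catalog = fake_catalog(2_000)

yaml_text = nil
json_text = nil

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
yaml_time = Benchmark.realtime { yaml_text = YAML.dump(catalog) }
json_time = Benchmark.realtime { json_text = JSON.generate(catalog) }

puts format('YAML dump: %.3fs, JSON generate: %.3fs', yaml_time, json_time)
```

The absolute numbers will depend on the Ruby version and the YAML engine in
use, so this only gives a relative indication on a given machine.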

>> Daniel, you are working on the YAML caching, right? Have you had a
>> chance to profile before and after the patch with a realistic setup
>> to see what kind of improvement we might expect from that change?
>> 
>> On Jul 5, 2012, at 12:28 AM, DEGREMONT Aurelien wrote:
>> 
>>> On 04/07/2012 19:29, Brice Figureau wrote:
>>>> Fixing #2198 [1] would be a very good start, then parallelize 
>>>> non-dependent sub-trees. In a word, that's not easy.
>>> I know that would be the *true* way to speed up the puppet agent. But,
>>> as you said, I think this is far from trivial. I do not know
>>> enough of Puppet's internals to be sure about that.
> 
> I agree that parallelization might make a big difference, but it has
> the chance to be a world of hurt, too.  Threading in general is hard,
> threading in ruby is ridiculous, and threading around operations on
> the system whose interactions you can't predict (e.g., two package
> updates at a time, or a package update and a user change) might be
> completely insane.

Definitely. There are still some cases where parallel execution could be
achieved (e.g. file resources are usually independent except when they're
chained by requires).
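
To illustrate what "independent" could mean here, below is a rough sketch
that groups resources into waves, where every resource in a wave has all of
its requirements satisfied by earlier waves and could in principle run
concurrently. The execution_waves method and the resource names are
hypothetical; this is not how Puppet's transaction code is written, and it
ignores the unpredictable system-level interactions mentioned above.

```ruby
# requires maps each resource name to the names it requires.
# Assumption: every resource, including those without requirements,
# appears as a key in the hash.
def execution_waves(requires)
  remaining = requires.transform_values(&:dup)
  waves = []

  until remaining.empty?
    # Resources whose requirements have all been satisfied already.
    ready = remaining.select { |_name, deps| deps.empty? }.keys
    raise ArgumentError, 'dependency cycle detected' if ready.empty?

    waves << ready
    ready.each { |name| remaining.delete(name) }
    remaining.each_value { |deps| deps.reject! { |dep| ready.include?(dep) } }
  end

  waves
end

requires = {
  'File[/etc/app]'        => [],
  'File[/etc/app/a.conf]' => ['File[/etc/app]'],
  'File[/etc/app/b.conf]' => ['File[/etc/app]'],
  'File[/etc/motd]'       => []
}

waves = execution_waves(requires)
waves.each_with_index do |wave, index|
  puts "wave #{index + 1}: #{wave.join(', ')}"
end
# wave 1: File[/etc/app], File[/etc/motd]
# wave 2: File[/etc/app/a.conf], File[/etc/app/b.conf]
```

Here the two config files only wait on their parent directory, so they end
up in the same wave, while File[/etc/motd] is not chained to anything and
lands in the first one.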

> However, I think there's a lot of room for improvement that gets us
> close to but not quite at parallel operation, such as grouping
> package operations (which could be done now, albeit not trivially),
> and having better hooks in the RAL for when a resource is ready for
> operation, so there's less blocking.

Yes, I believe we all agree about this.

Also, do not forget the following disappointments for new users, which
might be worth fixing:
* recursive file operations
* (too) many file checksum computations

>>> But, as shown in the graph in the topic "Trying to isolate performance
>>> issues with config retrieval.", there is room for improvement
>>> just by doing some profiling/patching, I think. There is unjustified
>>> slowness in several places.
> 
> Absolutely agree.  Andy and team are working hard on exactly this.

That's really good. I'm going to try to profile the agent on one of my
largest nodes (in noop mode) to see what it gives. This might point to some
parts of the code that we could optimize (i.e. low-hanging fruit).

-- 
Brice Figureau
My Blog: http://www.masterzen.fr/
