Alex,

Thanks for the great information.  I'll check out the Jan email thread
and then follow up with more questions.
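Based on your description of the central map below, here's roughly how I
picture the lookup working on each node -- just a sketch on my part; the
"cluster_name gmond_aggregator gmetad_aggregator" format is from your
mail, but the hostnames, cache path, and awk one-liner are all my own
guesses at an implementation:

```shell
# Hypothetical sketch of the central-map lookup. The map format is the
# one described in the thread; hostnames and the cache path are made-up
# placeholders, not anything from the real framework.

MAP=${MAP:-./ganglia_map.cache}   # local cache of the central map file

# Fake a cached copy of the map for illustration.
cat > "$MAP" <<'EOF'
# cluster_name  gmond_aggregator     gmetad_aggregator
devel           gmond-dev.example    gmetad-a.example
course          gmond-crs.example    gmetad-a.example
EOF

cluster=devel   # in practice, derived from the node's logical group

# Pull this cluster's aggregators out of the cached map, skipping comments.
set -- $(awk -v c="$cluster" '$1 !~ /^#/ && $1 == c { print $2, $3 }' "$MAP")
gmond_agg=$1
gmetad_agg=$2

# A node would then regenerate gmond.conf to unicast to its aggregator.
echo "udp_send_channel { host = $gmond_agg }"
```

Caching the map locally (as you suggest) means this lookup never touches
the central filer, so it should scale to the full 10k hosts.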

BTW, the script statement did come across rather badly, sorry about that
(and after all that PC training I was required to take!)

Thanks
Chris

On Thu, 2006-02-23 at 10:36, Alex Balk wrote:
> Chris Croswhite wrote:
> 
> > Alex,
> >
> > Yeah, I already have a ton of questions and need some pointers on large-
> > scale deploys (best practices, do's, don'ts, etc.).
> >
> >   
> 
> Till I get the legal issues out of the way, I can't share the scripts...
> What I can do, however, is share the ideas I've implemented, as those
> were developed outside the customer environment and were just spin-offs
> of common concepts like orchestration, federation, etc.
> Here are a few things:
> 
>     * When unicasting, a tree hierarchy of nodes can provide useful
>       drill-down capabilities.
>     * Most organizations already have some form of logical grouping for
>       cluster nodes. For example: faculty, course, devel-group, etc.
>       Within those groups one might find additional logical
>       partitioning. For example: platform, project, developer, etc.
>       Using these as the basis for constructing your logical hierarchy
>       provides a real-world basis for information analysis, saves you the
>       trouble of deciding how to construct your grid tree and prevents
>       gmond aggregators from handling too many nodes (though I've found
>       that a single gmond can store information for 1k nodes without
>       noticeable impact on performance).
>     * Nodes will sometimes move between logical clusters. Hence,
>       whatever mechanism you have in place has to detect this and
>       regenerate its gmond.conf.
>     * Using a central map which stores "cluster_name gmond_aggregator
>       gmetad_aggregator" will save you the headache of figuring out who
>       reports where, who pulls info from where, etc. If you take this
>       approach, be sure to cache this file locally or put it on your
>       parallel FS (if you use one). You wouldn't want 10k hosts trying
>       to retrieve it from a single filer.
>     * The same map file approach can be used for gmetrics. This allows
>       anyone in your IT group to add custom metrics without having to be
>       familiar with gmetric and without having to handle crontabs. A
>       mechanism which reads (the cached version of) this file could
>       handle inserting/removing crontabs as needed.
> 
> Also, check out the ganglia-general thread from January 2006 called
> "Pointers on architecting a large scale ganglia setup".
> 
> 
> > I would love to get my hands on your shell scripts to figure out what
> > you are doing (the unicast idea is pretty good).
> >
> >   
> 
> Okay, that sounds almost obscene ;-)
> 
> 
> Cheers,
> Alex
> 
> > Chris
> >
> >
> > On Thu, 2006-02-23 at 09:35, Alex Balk wrote:
> >   
> >> Chris,
> >>
> >>
> >> Cool! Thanks!
> >>
> >> If you need any pointers on large-scale deployments, beyond the
> >> excellent thread that was discussed here last month, drop us a line. I'm
> >> managing Ganglia on a cluster of about the same size as yours, spanning
> >> multiple sites.
> >>
> >>
> >> I've developed a framework for automating the deployment of Ganglia in a
> >> federated mode (we use unicast). I'm currently negotiating the
> >> possibility of releasing this framework to the Ganglia community. It's
> >> not the prettiest piece of code, as it's written in bash and spans a few
> >> thousand lines of code (I didn't expect it to grow into something like
> >> that), but it provides some nice functionality like map-based logical
> >> clusters, automatic node migration between clusters, map-based gmetrics,
> >> and some other candies.
> >>
> >> If negotiations fail, I'll consider rewriting it from scratch in Perl
> >> in my own free time.
> >>
> >>
> >> btw, I think Martin was looking for a build on HP-UX 11...
> >>
> >>
> >> Cheers,
> >>
> >> Alex
> >>
> >>
> >> Chris Croswhite wrote:
> >>
> >>     
> >>>> This raises another issue, which I believe is significant to the
> >>>> development process of Ganglia. At the moment we don't seem to have
> >>>> (correct me if I'm wrong) official testers for various platforms.
> >>>> Maybe we could have some people volunteer to be official beta testers?
> >>>> We wouldn't have to have a release out the door without properly
> >>>> testing it under most OS/archs.
> >>>>     
> >>>>         
> >>> The company I work for is looking to deploy ganglia across all compute
> >>> farms, some ~10k systems.  I could help with beta testing on these
> >>> platforms:
> >>> HP-UX 11+11i
> >>> AIX51+53
> >>> slowlaris7-10
> >>> solaris10 x64
> >>> linux32/64 (SuSE and RH)
> >>>
> >>> Just let me know when you have a new candidate and I can push the client
> >>> onto some test systems.
> >>>
> >>> Chris
> >>>
> >>>   
> >>>       
> >
> >   

