On 10-08-10 07:16, Rob Lanphier wrote:
> At any rate, there are a couple of problems with the way that it works:
> 1.  Once we saturate the NIC on the logging machine, the quality of
> our sampling degrades pretty rapidly.  We've generally had a problem
> with that over the past few months.
>   

As already stated elsewhere, we didn't really saturate any NICs, just
some socket buffers. Because of the large number of configured log
pipes, the software (udp2log) could not empty the socket buffers fast
enough.
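
To make that failure mode concrete, here is a toy model (not udp2log's
actual code). It assumes, purely for illustration, that every packet
costs one write per configured pipe and that the reader has a fixed
write budget per tick. All names and numbers are hypothetical.

```python
def simulate_drops(arrivals_per_tick, writes_per_tick, num_pipes,
                   buffer_slots, ticks):
    """Count packets dropped by a full socket buffer in a toy model.

    Assumption: draining one packet costs `num_pipes` writes, so the
    reader can drain writes_per_tick // num_pipes packets per tick.
    """
    queued = 0
    dropped = 0
    drained_per_tick = writes_per_tick // num_pipes
    for _ in range(ticks):
        # The kernel only accepts what still fits in the buffer.
        free = buffer_slots - queued
        accepted = min(arrivals_per_tick, free)
        dropped += arrivals_per_tick - accepted
        queued += accepted
        # The reader then drains as much as its write budget allows.
        queued -= min(queued, drained_per_tick)
    return dropped


if __name__ == "__main__":
    for pipes in (5, 20):
        print(pipes, "pipes:", simulate_drops(100, 1000, pipes, 500, 100))
```

In this model the arrival rate never changes and the link is never
full; packets are lost only once the per-packet fan-out pushes the
drain rate below the arrival rate.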

> If this were your typical commercial operation, the answer would be
> "why aren't you just logging into Streambase?" (or some other data
> warehousing storage solution).  I'm not suggesting that we do that (or
> even look at any of the solutions that bill themselves as open source
> alternatives), since, while our needs are increasing, we still aren't
> planning to be anywhere near as sophisticated as a lot of data
> tracking orgs.  Still, it's worth asking questions about our existing
> setup.  Should we be looking optimize our existing single-box setup,
> extending our software to have multi-node collection, or looking at a
> whole new collection strategy?
>
>   

Besides the ideas currently being kicked around for improving or
rewriting the UDP log collection software, there's always the
short-term, easy option of sending a multicast UDP stream and having
multiple collectors, each set up with distinct log pipes. E.g. one
machine for the sampled logging, and another, independent machine for
all the special-purpose log streams. I do prefer more efficient software
solutions over throwing more iron at the problem, though. :)
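
As a rough sketch of what each collector would do to receive such a
stream: it joins the multicast group and reads datagrams independently
of the other collectors. The group address and port below are made-up
placeholders, not our actual configuration, and the function names are
hypothetical.

```python
import socket
import struct


def membership_request(group, iface="0.0.0.0"):
    """Build the 8-byte ip_mreq value: group address + local interface."""
    return struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton(iface))


def open_collector(group, port, iface="0.0.0.0"):
    """Open a UDP socket subscribed to a multicast group (IPv4 only)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                         socket.IPPROTO_UDP)
    # Allow several collectors on one host to bind the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Ask the kernel to join the group on the chosen interface.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, iface))
    return sock


if __name__ == "__main__":
    # Hypothetical group and port, for illustration only.
    collector = open_collector("239.1.2.3", 8420)
    while True:
        line, sender = collector.recvfrom(65535)
        print(sender, line[:80])
```

Each machine running this gets its own copy of the stream, so the
sampled logging and the special-purpose streams would no longer compete
for the same socket buffer.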

-- 
Mark Bergsma <m...@wikimedia.org>
Operations Engineer, Wikimedia Foundation


_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
