I also try to limit what goes at higher warning levels. One of my goals over the next few months is to improve our current logging. It sounds like this is a good time to make sure we're on the same page.
We're going to have to train users on something (esp. since our current logging is very noisy). The short version I like is "Info and more severe are for operators; info and less severe are for developers."

Here's what I usually use as a guideline (constrained to slf4j levels):

= ERROR
Something is wrong and an operator needs to do something, preferably very soon. In other words, if I was on call I'd expect to get paged.

= WARN
Something is amiss, but not of immediate concern. An operator who is on call but not busy at the moment might want to investigate some kind of underlying issue, but the system will continue to function within some reasonable bound.

= INFO
Summary information about normal operations that is safe to ignore. GC information, throughput stats, that kind of thing.

= DEBUG
Low level information that is not normally useful, but will help determine the cause of a system malfunction. Usually something a developer or tier 3 supporter would want when something was going wrong (e.g. stack traces).

= TRACE
Detailed low level information at a volume that probably can't be gathered in production.

Eric, do those all sound reasonable? I want to make sure we have a common basis before I get into the specifics of this case.

-Sean

On Fri, Apr 18, 2014 at 8:21 PM, Eric Newton <eric.new...@gmail.com> wrote:
> -1
>
> I would hesitate to put *any* message at WARN. It is normal for balancing
> to take a little while, especially for some of my users who have their own
> balancing algorithm.
>
> Users feel the need to fix the problem; after all, it's there in big scary
> yellow on the monitor page. I don't like training users to ignore scary
> yellow. Is it a problem, or not?
>
> Alternatively, put the balance info into the master status, and display it.
> Like GC collection time... hey, I've been migrating these tablets for a
> long time... turn yellow/red.
>
> -Eric
>
>
> On Fri, Apr 18, 2014 at 4:03 PM, Sean Busbey <bus...@cloudera.com> wrote:
>
> > At the moment all of our logs about problems balancing are at DEBUG.
> >
> > Given the impact to a cluster when this happens (skewing load onto few
> > servers, in some case severely), I'd like to raise it to WARN so that it
> > surfaces for operators in the Monitor and in the non-debug log.
> >
> > Thought I'd do a quick lazy consensus check before filing a jira and taking
> > care of it.
> >
> > --
> > Sean
> >
>

--
Sean
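
As a rough illustration of the level guideline at the top of this message, here is a minimal Java sketch of the "operators vs. developers" split and the paging rule. Everything in it (LevelGuide, Audience, audienceFor, shouldPage) is a hypothetical name made up for this example and is not part of slf4j or of our codebase; it only models the five slf4j levels as a plain enum.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: these names are illustrative only and are not
// part of slf4j or of any existing project code.
final class LevelGuide {

    // The five slf4j levels, ordered from most to least severe.
    enum Level { ERROR, WARN, INFO, DEBUG, TRACE }

    enum Audience { OPERATOR, DEVELOPER }

    // "Info and more severe are for operators; info and less severe are
    // for developers." INFO therefore belongs to both audiences.
    static Set<Audience> audienceFor(Level level) {
        EnumSet<Audience> audiences = EnumSet.noneOf(Audience.class);
        if (level.compareTo(Level.INFO) <= 0) {
            audiences.add(Audience.OPERATOR);
        }
        if (level.compareTo(Level.INFO) >= 0) {
            audiences.add(Audience.DEVELOPER);
        }
        return audiences;
    }

    // Under the guideline, only ERROR is worth paging someone on call.
    static boolean shouldPage(Level level) {
        return level == Level.ERROR;
    }

    public static void main(String[] args) {
        // Print the audience and paging decision for each level.
        for (Level level : Level.values()) {
            System.out.println(level + " -> " + audienceFor(level)
                    + ", page=" + shouldPage(level));
        }
    }
}
```

Under this reading, a message about balancing problems at WARN would reach operators without paging anyone, while the same message at DEBUG would reach developers only.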