Changing the logging level at runtime is good, but we should also consider
non-reproducible cases. Once a system becomes complicated enough and is
widely used, I believe it will hit such cases.
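
To make the runtime part concrete, here is a minimal sketch of changing one logger's level without a restart. It uses java.util.logging from the JDK only so that it is self-contained; HBase itself logs through log4j, and the class name, the setLevel helper and the logger name below are all made up for illustration, not anything that exists in HBase.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class RuntimeLogLevel {
    // Hypothetical logger name, chosen only for this illustration.
    // Holding it in a static field keeps the logger (and its level) alive.
    static final Logger LOG = Logger.getLogger("example.regionserver.Compactor");

    /**
     * Sets the level of the named logger and returns the previous level.
     * The previous level is null when the logger was inheriting from its parent.
     */
    static Level setLevel(String loggerName, Level newLevel) {
        Logger logger = Logger.getLogger(loggerName);
        Level previous = logger.getLevel();
        logger.setLevel(newLevel);
        return previous;
    }

    public static void main(String[] args) {
        String name = LOG.getName();
        System.out.println("FINE enabled before: " + LOG.isLoggable(Level.FINE));

        // Temporarily raise verbosity for the one afflicted component.
        Level previous = setLevel(name, Level.FINE);
        System.out.println("FINE enabled during: " + LOG.isLoggable(Level.FINE));

        // Restore the old setting (null means "inherit from parent" again).
        setLevel(name, previous);
        System.out.println("FINE enabled after:  " + LOG.isLoggable(Level.FINE));
    }
}
```

Note that this only changes the logger's own level. In java.util.logging the handlers have levels too, so the extra records only show up if a handler accepts them. And it does not help with the non-reproducible case: whatever was logged before the level was changed is already gone.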
I agree that stack traces should be logged only when they are really needed. It
looks to me like HBase is doing well here. E.g., in the HBase unit test logs I
don't see useless stack traces, at least in 0.98.5. (That is not the case for
ZooKeeper; sometimes I see quite a lot of stack traces logged at INFO or
WARN...) Have things changed in master?
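
For reference, the split proposed in this thread (a readable message plus a suggested action at WARN, the stack trace only at DEBUG) could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses java.util.logging, where FINE stands in for DEBUG, and the class, the helper methods, the logger name and the failure scenario are all hypothetical and not taken from the HBase code base.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class OperatorFriendlyLogging {
    // Hypothetical logger name, for illustration only.
    static final Logger LOG = Logger.getLogger("example.FlushHandler");

    /** Builds the single-line operator-facing text: what failed, why, and what to do. */
    static String operatorMessage(String what, Throwable cause, String action) {
        return what + ": " + cause.getMessage() + ". Suggested action: " + action;
    }

    /** Logs the operator message at WARNING and the stack trace only at FINE. */
    static void reportFailure(String what, Throwable cause, String action) {
        LOG.warning(operatorMessage(what, cause, action));
        if (LOG.isLoggable(Level.FINE)) {
            // Passing the Throwable attaches it to the record so handlers can print the trace.
            LOG.log(Level.FINE, "Stack trace for: " + what, cause);
        }
    }

    public static void main(String[] args) {
        // Made-up failure, just to exercise the two log calls.
        IOException e = new IOException("No space left on device");
        reportFailure("Failed to flush region", e,
            "free disk space on the data volume and retry");
    }
}
```

With this layout the trace is lost whenever FINE is off at the moment of failure, which is exactly the concern about problems that cannot be reproduced later.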
Thanks.

On 2014-10-26, at 3:42, Sean Busbey <bus...@cloudera.com> wrote:

> That sounds great!
> 
> Either a shell or hbase command would be my suggestion, preferably with
> some logic for mapping from the short logger name that ends up in the log
> to the actual logger to adjust.
> 
> The ref guide would need a section on how to use the proposed log changing
> tool based on a log message the operator has seen.
> 
> Would being able to change the logger with the failure at runtime so that it
> shows the stack trace address your concerns, Andrew and Qiang?
> 
> To be clear, I'm not saying we should *never* have stack traces above
> DEBUG, just that they should be rare and reserved for when we can't point
> an operator at something to do.
> 
> -- 
> Sean
> On Oct 25, 2014 1:50 PM, "Nick Dimiduk" <ndimi...@gmail.com> wrote:
> 
>> We have the ability to alter log levels at runtime. This would allow an
>> operator to temporarily increase log level for afflicted components, even
>> in production. Doing this on a server-by-server basis should have minimal
>> impact on overall cluster performance. Maybe this needs to be better
>> documented? Maybe we need a script that makes this easier, or could be
>> managed via a new shell command?
>> 
>> On Saturday, October 25, 2014, Andrew Purtell <apurt...@apache.org> wrote:
>> 
>>> On Sat, Oct 25, 2014 at 6:34 AM, Sean Busbey <bus...@cloudera.com> wrote:
>>> 
>>>> Even if debug is disabled in production, it could be enabled on a
>>>> non-production system for reproducing the problem, no?
>>>> 
>>> 
>>> In my experience, often enough, no.
>>> 
>>> I do hear the complaint that Hadoop ecosystem projects are quite operator
>>> unfriendly because error messages most often come in the form of a
>>> stacktrace. It's a totally valid point. I think we could certainly improve
>>> the exception message printed ahead of the stacktrace in a large number of
>>> cases.
>>> 
>>> 
>>> 
>>> On Sat, Oct 25, 2014 at 6:34 AM, Sean Busbey <bus...@cloudera.com> wrote:
>>> 
>>>> Even if debug is disabled in production, it could be enabled on a
>>>> non-production system for reproducing the problem, no?
>>>> 
>>>> --
>>>> Sean
>>>> On Oct 25, 2014 7:11 AM, "Qiang Tian" <tian...@gmail.com> wrote:
>>>> 
>>>>> Perhaps case by case is better. A stack trace is one of the most important
>>>>> problem determination methods. Debug is mostly disabled in production, so we
>>>>> may lose important clues.
>>>>> 
>>>>> 
>>>>> On Sat, Oct 25, 2014 at 1:14 PM, Sean Busbey <bus...@cloudera.com> wrote:
>>>>> 
>>>>>> Hi!
>>>>>> 
>>>>>> Right now we have many failure paths where we send stack traces to log
>>>>>> files at ERROR / WARN. In an effort to make things easier to operate, I'd
>>>>>> like to propose we move towards:
>>>>>> 
>>>>>> * INFO/WARN/ERROR : description of failure and if possible an action an
>>>>>> operator could take to fix/diagnose
>>>>>> * DEBUG : information needed to handle failures that require developer
>>>>>> action, i.e. stack traces
>>>>>> 
>>>>>> I figure this can go as one or more subtasks off of HBASE-12341, but wanted
>>>>>> to float things here before I get started.
>>>>>> 
>>>>>> --
>>>>>> Sean
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Best regards,
>>> 
>>>  - Andy
>>> 
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
>>> 
>> 
