Re: Debug logging enabled by default since 2.2

Mick Semb Wever Mon, 19 Mar 2018 16:17:10 -0700

> For this reason I'm more in favor of converting the debug.log into
async/verbose_system.log as suggested by Jeremiah and use a marker to
direct these logs (former DEBUG level logs) to that log instead.



+1

Thanks for the inclusive and considerate summary Paulo, and for the
suggestion that moves us forward Jeremiah.


> From the discussion here it seems that many would like to change how this
works.  Rather
than just turning off the debug.log I would propose that we switch to using
the SLF4J
MARKER[1] ability to move the messages back to INFO but tag them as
belonging to
the asynchronous_system.log rather than the normal system.log.


>  In a way the real issue might be that we don’t have nightly performance
runs that would make an accidentally introduced debug statement obvious.
> A log statement that runs once or more per read or write should be easy
to spot.


+1
Could we not put in an assertion into the dtests that fails if any
query-path DEBUG statements appear?
This would enforce the Logging Guidelines.

And those Logging Guidelines should be moved to the in-tree docs.

Mick


On 20 March 2018 at 05:51, Paulo Motta <pauloricard...@gmail.com> wrote:

> Thanks for the constructive input and feedback! From this discussion
> it seems like overloading the DEBUG level to signify
> async-verbose-INFO on CASSANDRA-10241 is leading to some confusion and
> we should fix this.
>
> However, we cannot simply turn debug.log off as during CASSANDRA-10241
> some verbose-but-useful-info-logs, such as flush information were
> changed from INFO to DEBUG, and since the patch has been in for nearly
> 3 years it's probably non-revertable. Furthermore, the practice of
> using the DEBUG level for logging non-debug stuff has been in our
> Logging Guidelines
> (https://wiki.apache.org/cassandra/LoggingGuidelines) since then, so
> there is probably useful DEBUG stuff that would need to be turned into
> INFO if we get rid of debug.log.
>
> For this reason I'm more in favor of converting the debug.log into
> async/verbose_system.log as suggested by Jeremiah and use a marker to
> direct these logs (former DEBUG level logs) to that log instead.
> Nevertheless, if the majority prefers to get back to a single
> system.log file and get rid of debug.log/verbose_system.log altogether
> then we would need to go through all log usages and readjust them to
> use the proper logging levels and update our logging guidelines to
> reflect whatever new policy is decided, not only disabling debug.log
> and call it a day.
>
> 2018-03-19 12:02 GMT-03:00 Jeremiah D Jordan <jerem...@datastax.com>:
> > People seem hung up on DEBUG here.  The goal of CASSANDRA-10241 was
> > to clean up the system.log so that it a very high “signal” in terms of
> what was logged
> > to it synchronously, but without reducing the ability of the logs to
> allow people to
> > solve problems and perform post mortem analysis of issues.  We have
> informational
> > log messages that are very useful to understanding the state of things,
> like compaction
> > status, repair status, flushing, or the state of gossip in the system
> that are very useful to
> > operators, but if they are all in the system.log make said log file
> harder to look over for
> > issues.  In 10241 the method chosen for how to keep these log messages
> around by
> > default, but get them out of the system.log was that these messages were
> changed from
> > INFO to DEBUG and the new debug.log was created.
> >
> > From the discussion here it seems that many would like to change how
> this works.  Rather
> > than just turning off the debug.log I would propose that we switch to
> using the SLF4J
> > MARKER[1] ability to move the messages back to INFO but tag them as
> belonging to
> > the asynchronous_system.log rather than the normal system.log.
> >
> > [1] https://logback.qos.ch/manual/layouts.html#marker <
> https://logback.qos.ch/manual/layouts.html#marker>
> > https://www.slf4j.org/faq.html#fatal <https://www.slf4j.org/faq.
> html#fatal>
> >
> >
> >> On Mar 19, 2018, at 9:01 AM, Stefan Podkowinski <s...@apache.org>
> wrote:
> >>
> >> I'd agree that INFO should be the default. Turning on the DEBUG logging
> >> can cause notable performance issues and I would not enable it on
> >> production systems unless I really have to. That's why I created
> >> CASSANDRA-12696 for 4.0, so you'll be able to at least only partially
> >> enable DEBUG based on what's relevant to look at, e.g. `nodetool
> >> setlogginglevel bootstrap DEBUG`.
> >>
> >> But small improvements like that won't change the fact that log files
> >> suck in general for more complex analysis, except for trivial tailing
> >> and grepping. You have to make sure that logging is enabled and old
> >> records you're interested in will not be rotated out. Then you have to
> >> gather log files from individual nodes somehow. Eventually I end up with
> >> a local tarball with logs in that situation and the fun starts creating
> >> hacky, regex loaded Python scripts to parse them. As each log message is
> >> limited to a single line of text, it's often missing out relevant
> >> details. You also got to create different parsers for different messages
> >> of course. It's just inefficient and too time consuming to gather
> >> information that way. Let alone implementing more advanced monitoring
> >> solutions on top of that.
> >>
> >> That's exactly why I started working on the "diagnostic events"
> >> (CASSANDRA-12944) idea more than a year ago. There's also support for
> >> persistency (CASSANDRA-13460) that would implement storing important but
> >> infrequent events as rich json objects in a local keyspace and allows
> >> retrieving them by using CQL. I still like the idea and think it's worth
> >> pursuing.
> >>
> >>
> >> On 19.03.18 09:53, Alain RODRIGUEZ wrote:
> >>> Hello,
> >>>
> >>> I am not developing Cassandra, but I am using it actively and helping
> >>> people to work with it. My perspective might be missing some code
> >>> considerations and history as I did not go through the ticket where
> this
> >>> 'debug' level was added by default. But here is a feedback after
> upgrading
> >>> a few clusters to Cassandra 2.2:
> >>>
> >>> When upgrading a cluster to Cassandra 2.2, 'disable the debug logs' is
> in
> >>> my runbook. I mean, very often, when some cluster is upgraded to
> Cassandra
> >>> 2.2 and has problems with performances, the 2 most frequent issues are:
> >>>
> >>> - DEBUG level being turned on
> >>> - and / or dynamic snitching being enabled
> >>>
> >>> This is especially true for high percentile (very clear on p99). Let's
> put
> >>> the dynamic snitch aside as it is not our topic here.
> >>>
> >>> From an operational perspective, I prefer to set the debug level to
> 'DEBUG'
> >>> when I need it than having, out of the box, something that is
> unexpected
> >>> and impact performances. Plus the debug level can be changed without
> >>> restarting the node, through 'JMX' or even using 'nodetool' now.
> >>>
> >>> Also in most cases, the 'INFO' level is good enough for me to detect
> most
> >>> of the issues. I was even able to recreate a detailed history of
> events for
> >>> a customer recently, 'INFO' logs are already very powerful and
> complete I
> >>> believe (nice work on this by the way). Then monitoring is helping a
> lot
> >>> too. I did not have to use debug logs for a long time. It might
> happen, but
> >>> I will find my way to enable them.
> >>>
> >>> Even though it feels great to be able to help people with that easily
> >>> because the cause is often the same and turning off the logs is a
> >>> low hanging fruit in C*2.2 clusters that have very nice results and is
> easy
> >>> to achieve, I would prefer people not to fall into these performances
> traps
> >>> in the first place. In my head, 'Debug' logs should be for debug
> purposes
> >>> (by opposition to 'always on'). It seems legit. I am surprised this
> brings
> >>> so many discussions I thought this was a common standard widely
> accepted,
> >>> and beyond Cassandra. That being said, it is good to see those
> exchanges
> >>> are happening, so the decision that will be taken will be a good one,
> I am
> >>> sure. I hope this comment will help, I have no other goal, for sure I
> am
> >>> not willing to feed a conflict but a talk and I hope no one felt
> offended
> >>> by this feedback. I believe this change was made aiming at
> >>> helping/improving things, but it turns out it is more of an annoyance
> than
> >>> truly helpful (my personal perspective).
> >>>
> >>> I would +1 on making 'INFO' default again, but if someone is missing
> >>> information that should be in 'INFO'. If some informations are missing
> at
> >>> the 'INFO' level, why not add informations that should be at the 'INFO'
> >>> level there directly and keep log levels meaningful? Making sure we do
> not
> >>> bring the logs degrading performances from 'Debug' to 'Info' as much
> as we
> >>> can.
> >>>
> >>> Hope this is useful,
> >>>
> >>> C*heers,
> >>>
> >>> -----------------------
> >>> Alain Rodriguez - @arodream - al...@thelastpickle.com
> >>> France / Spain
> >>>
> >>> The Last Pickle - Apache Cassandra Consulting
> >>> http://www.thelastpickle.com
> >>>
> >>> 2018-03-19 2:18 GMT+00:00 kurt greaves <k...@instaclustr.com>:
> >>>
> >>>> On the same page as Michael here. We disable debug logs in production
> due
> >>>> to the performance impact. Personally I think if debug logging is
> necessary
> >>>> for users to use the software we're doing something wrong. Also in my
> >>>> experience, if something doesn't reproduce it will not get fixed.
> Debug
> >>>> logging helps, but I've never seen a case where an actual bug simply
> >>>> *doesn't* reproduce eventually, and I'm sure if this issue appears
> debug
> >>>> logging could be enabled after the fact for the relevant classes and
> >>>> eventually it will reoccur and we could solve the problem. I've never
> seen
> >>>> a user say no to helping debug a problem with patched jars/changes to
> their
> >>>> system (like logging). I'd much prefer we pushed that harder rather
> than
> >>>> just saying "Everyone gets debug logging!". I'm also really not sold
> that
> >>>> debug logging saves the day. To me it mostly just speeds up the
> >>>> identification process, it generally doesn't expose magical
> information
> >>>> that wasn't available before, you just had to think about it a bit
> more.
> >>>>
> >>>>
> >>>> In a way the real issue might be that we don’t have nightly
> performance
> >>>>> runs that would make an accidentally introduced debug statement
> obvious.
> >>>> This is an issue, but I don't think it's the *real* issue. As already
> >>>> noted, debug logging is for debugging, which normal users shouldn't
> have to
> >>>> think about when they are just operating the software. We shouldn't
> risk
> >>>> performance regressions just for developer convenience.
> >>>>
> >>>> On 19 March 2018 at 00:55, Ariel Weisberg <adwei...@fastmail.fm>
> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> In a way the real issue might be that we don’t have nightly
> performance
> >>>>> runs that would make an accidentally introduced debug statement
> obvious.
> >>>>>
> >>>>> A log statement that runs once or more per read or write should be
> easy
> >>>> to
> >>>>> spot. I haven’t measured the impact though. And as a bonus by having
> this
> >>>>> we can spot a variety of performance issues introduced by all kinds
> of
> >>>>> changes.
> >>>>>
> >>>>> Ariel
> >>>>>
> >>>>>> On Mar 18, 2018, at 3:46 PM, Jeff Jirsa <jji...@gmail.com> wrote:
> >>>>>>
> >>>>>> In Cassandra-10241 I said I was torn on this whole ticket, since
> most
> >>>>> people would end up turning it off if it had a negative impact. You
> said:
> >>>>>> “I'd like to emphasize that we're not talking about turning debug or
> >>>>> trace on for client-generated request paths. There's way too much
> data
> >>>>> generated and it's unlikely to be useful.
> >>>>>> What we're proposing is enabling debug logging ONLY for cluster
> state
> >>>>> changes like gossip and schema, and infrequent activities like
> repair. “
> >>>>>> Clearly there’s a disconnect here - we’ve turned debug logging on
> for
> >>>>> everything and shuffled some stuff to trace, which is a one time
> action
> >>>> but
> >>>>> is hard to protect against regression. In fact, just looking at the
> read
> >>>>> callback shows two instances of debug log in the client request path
> >>>>> (exercise for the reader to “git blame”).
> >>>>>> Either we can go clean up all the surprises that leaked through, or
> we
> >>>>> can turn off debug and start backing out some of the changes in
> 10241.
> >>>>> Putting stuff like compaction in the same bucket as digest mismatch
> and
> >>>>> gossip state doesn’t make life materially better for most people.
> >>>>>> --
> >>>>>> Jeff Jirsa
> >>>>>>
> >>>>>>
> >>>>>>> On Mar 18, 2018, at 11:21 AM, Jonathan Ellis <jbel...@gmail.com>
> >>>> wrote:
> >>>>>>> That really depends on whether you're judicious in deciding what to
> >>>> log
> >>>>> at
> >>>>>>> debug, doesn't it?
> >>>>>>>
> >>>>>>> On Sun, Mar 18, 2018 at 12:57 PM, Michael Kjellman <
> >>>> kjell...@apple.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> +1. this is how it works.
> >>>>>>>>
> >>>>>>>> your computer doesn’t run at debug logging by default. your phone
> >>>>> doesn’t
> >>>>>>>> either. neither does your smart tv. your database can’t be
> running at
> >>>>> debug
> >>>>>>>> just because it makes our lives as engineers easier.
> >>>>>>>>
> >>>>>>>>> On Mar 18, 2018, at 5:14 AM, Alexander Dejanovski <
> >>>>>>>> a...@thelastpickle.com> wrote:
> >>>>>>>>> It's a tiny bit unusual to turn on debug logging for all users by
> >>>>> default
> >>>>>>>>> though, and there should be occasions to turn it on when facing
> >>>> issues
> >>>>>>>> that
> >>>>>>>>> you want to debug (if they can be easily reproduced).
> >>>>>>>
> >>>>>>> --
> >>>>>>> Jonathan Ellis
> >>>>>>> co-founder, http://www.datastax.com
> >>>>>>> @spyced
> >>>>> ------------------------------------------------------------
> ---------
> >>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>>>>
> >>>>>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


-- 
Mick Semb Wever
Australia

The Last Pickle
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: Debug logging enabled by default since 2.2

Reply via email to