> For this reason I'm more in favor of converting the debug.log into async/verbose_system.log as suggested by Jeremiah and use a marker to direct these logs (former DEBUG level logs) to that log instead.
+1 Thanks for the inclusive and considerate summary Paulo, and for the suggestion that moves us forward Jeremiah. > From the discussion here it seems that many would like to change how this works. Rather than just turning off the debug.log I would propose that we switch to using the SLF4J MARKER[1] ability to move the messages back to INFO but tag them as belonging to the asynchronous_system.log rather than the normal system.log. > In a way the real issue might be that we don’t have nightly performance runs that would make an accidentally introduced debug statement obvious. > A log statement that runs once or more per read or write should be easy to spot. +1 Could we not put in an assertion into the dtests that fails if any query-path DEBUG statements appear? This would enforce the Logging Guidelines. And those Logging Guidelines should be moved to the in-tree docs. Mick On 20 March 2018 at 05:51, Paulo Motta <pauloricard...@gmail.com> wrote: > Thanks for the constructive input and feedback! From this discussion > it seems like overloading the DEBUG level to signify > async-verbose-INFO on CASSANDRA-10241 is leading to some confusion and > we should fix this. > > However, we cannot simply turn debug.log off as during CASSANDRA-10241 > some verbose-but-useful-info-logs, such as flush information were > changed from INFO to DEBUG, and since the patch has been in for nearly > 3 years it's probably non-revertable. Furthermore, the practice of > using the DEBUG level for logging non-debug stuff has been in our > Logging Guidelines > (https://wiki.apache.org/cassandra/LoggingGuidelines) since then, so > there is probably useful DEBUG stuff that would need to be turned into > INFO if we get rid of debug.log. > > For this reason I'm more in favor of converting the debug.log into > async/verbose_system.log as suggested by Jeremiah and use a marker to > direct these logs (former DEBUG level logs) to that log instead. > Nevertheless, if the majority prefers to get back to a single > system.log file and get rid of debug.log/verbose_system.log altogether > then we would need to go through all log usages and readjust them to > use the proper logging levels and update our logging guidelines to > reflect whatever new policy is decided, not only disabling debug.log > and call it a day. > > 2018-03-19 12:02 GMT-03:00 Jeremiah D Jordan <jerem...@datastax.com>: > > People seem hung up on DEBUG here. The goal of CASSANDRA-10241 was > > to clean up the system.log so that it a very high “signal” in terms of > what was logged > > to it synchronously, but without reducing the ability of the logs to > allow people to > > solve problems and perform post mortem analysis of issues. We have > informational > > log messages that are very useful to understanding the state of things, > like compaction > > status, repair status, flushing, or the state of gossip in the system > that are very useful to > > operators, but if they are all in the system.log make said log file > harder to look over for > > issues. In 10241 the method chosen for how to keep these log messages > around by > > default, but get them out of the system.log was that these messages were > changed from > > INFO to DEBUG and the new debug.log was created. > > > > From the discussion here it seems that many would like to change how > this works. Rather > > than just turning off the debug.log I would propose that we switch to > using the SLF4J > > MARKER[1] ability to move the messages back to INFO but tag them as > belonging to > > the asynchronous_system.log rather than the normal system.log. > > > > [1] https://logback.qos.ch/manual/layouts.html#marker < > https://logback.qos.ch/manual/layouts.html#marker> > > https://www.slf4j.org/faq.html#fatal <https://www.slf4j.org/faq. > html#fatal> > > > > > >> On Mar 19, 2018, at 9:01 AM, Stefan Podkowinski <s...@apache.org> > wrote: > >> > >> I'd agree that INFO should be the default. Turning on the DEBUG logging > >> can cause notable performance issues and I would not enable it on > >> production systems unless I really have to. That's why I created > >> CASSANDRA-12696 for 4.0, so you'll be able to at least only partially > >> enable DEBUG based on what's relevant to look at, e.g. `nodetool > >> setlogginglevel bootstrap DEBUG`. > >> > >> But small improvements like that won't change the fact that log files > >> suck in general for more complex analysis, except for trivial tailing > >> and grepping. You have to make sure that logging is enabled and old > >> records you're interested in will not be rotated out. Then you have to > >> gather log files from individual nodes somehow. Eventually I end up with > >> a local tarball with logs in that situation and the fun starts creating > >> hacky, regex loaded Python scripts to parse them. As each log message is > >> limited to a single line of text, it's often missing out relevant > >> details. You also got to create different parsers for different messages > >> of course. It's just inefficient and too time consuming to gather > >> information that way. Let alone implementing more advanced monitoring > >> solutions on top of that. > >> > >> That's exactly why I started working on the "diagnostic events" > >> (CASSANDRA-12944) idea more than a year ago. There's also support for > >> persistency (CASSANDRA-13460) that would implement storing important but > >> infrequent events as rich json objects in a local keyspace and allows > >> retrieving them by using CQL. I still like the idea and think it's worth > >> pursuing. > >> > >> > >> On 19.03.18 09:53, Alain RODRIGUEZ wrote: > >>> Hello, > >>> > >>> I am not developing Cassandra, but I am using it actively and helping > >>> people to work with it. My perspective might be missing some code > >>> considerations and history as I did not go through the ticket where > this > >>> 'debug' level was added by default. But here is a feedback after > upgrading > >>> a few clusters to Cassandra 2.2: > >>> > >>> When upgrading a cluster to Cassandra 2.2, 'disable the debug logs' is > in > >>> my runbook. I mean, very often, when some cluster is upgraded to > Cassandra > >>> 2.2 and has problems with performances, the 2 most frequent issues are: > >>> > >>> - DEBUG level being turned on > >>> - and / or dynamic snitching being enabled > >>> > >>> This is especially true for high percentile (very clear on p99). Let's > put > >>> the dynamic snitch aside as it is not our topic here. > >>> > >>> From an operational perspective, I prefer to set the debug level to > 'DEBUG' > >>> when I need it than having, out of the box, something that is > unexpected > >>> and impact performances. Plus the debug level can be changed without > >>> restarting the node, through 'JMX' or even using 'nodetool' now. > >>> > >>> Also in most cases, the 'INFO' level is good enough for me to detect > most > >>> of the issues. I was even able to recreate a detailed history of > events for > >>> a customer recently, 'INFO' logs are already very powerful and > complete I > >>> believe (nice work on this by the way). Then monitoring is helping a > lot > >>> too. I did not have to use debug logs for a long time. It might > happen, but > >>> I will find my way to enable them. > >>> > >>> Even though it feels great to be able to help people with that easily > >>> because the cause is often the same and turning off the logs is a > >>> low hanging fruit in C*2.2 clusters that have very nice results and is > easy > >>> to achieve, I would prefer people not to fall into these performances > traps > >>> in the first place. In my head, 'Debug' logs should be for debug > purposes > >>> (by opposition to 'always on'). It seems legit. I am surprised this > brings > >>> so many discussions I thought this was a common standard widely > accepted, > >>> and beyond Cassandra. That being said, it is good to see those > exchanges > >>> are happening, so the decision that will be taken will be a good one, > I am > >>> sure. I hope this comment will help, I have no other goal, for sure I > am > >>> not willing to feed a conflict but a talk and I hope no one felt > offended > >>> by this feedback. I believe this change was made aiming at > >>> helping/improving things, but it turns out it is more of an annoyance > than > >>> truly helpful (my personal perspective). > >>> > >>> I would +1 on making 'INFO' default again, but if someone is missing > >>> information that should be in 'INFO'. If some informations are missing > at > >>> the 'INFO' level, why not add informations that should be at the 'INFO' > >>> level there directly and keep log levels meaningful? Making sure we do > not > >>> bring the logs degrading performances from 'Debug' to 'Info' as much > as we > >>> can. > >>> > >>> Hope this is useful, > >>> > >>> C*heers, > >>> > >>> ----------------------- > >>> Alain Rodriguez - @arodream - al...@thelastpickle.com > >>> France / Spain > >>> > >>> The Last Pickle - Apache Cassandra Consulting > >>> http://www.thelastpickle.com > >>> > >>> 2018-03-19 2:18 GMT+00:00 kurt greaves <k...@instaclustr.com>: > >>> > >>>> On the same page as Michael here. We disable debug logs in production > due > >>>> to the performance impact. Personally I think if debug logging is > necessary > >>>> for users to use the software we're doing something wrong. Also in my > >>>> experience, if something doesn't reproduce it will not get fixed. > Debug > >>>> logging helps, but I've never seen a case where an actual bug simply > >>>> *doesn't* reproduce eventually, and I'm sure if this issue appears > debug > >>>> logging could be enabled after the fact for the relevant classes and > >>>> eventually it will reoccur and we could solve the problem. I've never > seen > >>>> a user say no to helping debug a problem with patched jars/changes to > their > >>>> system (like logging). I'd much prefer we pushed that harder rather > than > >>>> just saying "Everyone gets debug logging!". I'm also really not sold > that > >>>> debug logging saves the day. To me it mostly just speeds up the > >>>> identification process, it generally doesn't expose magical > information > >>>> that wasn't available before, you just had to think about it a bit > more. > >>>> > >>>> > >>>> In a way the real issue might be that we don’t have nightly > performance > >>>>> runs that would make an accidentally introduced debug statement > obvious. > >>>> This is an issue, but I don't think it's the *real* issue. As already > >>>> noted, debug logging is for debugging, which normal users shouldn't > have to > >>>> think about when they are just operating the software. We shouldn't > risk > >>>> performance regressions just for developer convenience. > >>>> > >>>> On 19 March 2018 at 00:55, Ariel Weisberg <adwei...@fastmail.fm> > wrote: > >>>> > >>>>> Hi, > >>>>> > >>>>> In a way the real issue might be that we don’t have nightly > performance > >>>>> runs that would make an accidentally introduced debug statement > obvious. > >>>>> > >>>>> A log statement that runs once or more per read or write should be > easy > >>>> to > >>>>> spot. I haven’t measured the impact though. And as a bonus by having > this > >>>>> we can spot a variety of performance issues introduced by all kinds > of > >>>>> changes. > >>>>> > >>>>> Ariel > >>>>> > >>>>>> On Mar 18, 2018, at 3:46 PM, Jeff Jirsa <jji...@gmail.com> wrote: > >>>>>> > >>>>>> In Cassandra-10241 I said I was torn on this whole ticket, since > most > >>>>> people would end up turning it off if it had a negative impact. You > said: > >>>>>> “I'd like to emphasize that we're not talking about turning debug or > >>>>> trace on for client-generated request paths. There's way too much > data > >>>>> generated and it's unlikely to be useful. > >>>>>> What we're proposing is enabling debug logging ONLY for cluster > state > >>>>> changes like gossip and schema, and infrequent activities like > repair. “ > >>>>>> Clearly there’s a disconnect here - we’ve turned debug logging on > for > >>>>> everything and shuffled some stuff to trace, which is a one time > action > >>>> but > >>>>> is hard to protect against regression. In fact, just looking at the > read > >>>>> callback shows two instances of debug log in the client request path > >>>>> (exercise for the reader to “git blame”). > >>>>>> Either we can go clean up all the surprises that leaked through, or > we > >>>>> can turn off debug and start backing out some of the changes in > 10241. > >>>>> Putting stuff like compaction in the same bucket as digest mismatch > and > >>>>> gossip state doesn’t make life materially better for most people. > >>>>>> -- > >>>>>> Jeff Jirsa > >>>>>> > >>>>>> > >>>>>>> On Mar 18, 2018, at 11:21 AM, Jonathan Ellis <jbel...@gmail.com> > >>>> wrote: > >>>>>>> That really depends on whether you're judicious in deciding what to > >>>> log > >>>>> at > >>>>>>> debug, doesn't it? > >>>>>>> > >>>>>>> On Sun, Mar 18, 2018 at 12:57 PM, Michael Kjellman < > >>>> kjell...@apple.com> > >>>>>>> wrote: > >>>>>>> > >>>>>>>> +1. this is how it works. > >>>>>>>> > >>>>>>>> your computer doesn’t run at debug logging by default. your phone > >>>>> doesn’t > >>>>>>>> either. neither does your smart tv. your database can’t be > running at > >>>>> debug > >>>>>>>> just because it makes our lives as engineers easier. > >>>>>>>> > >>>>>>>>> On Mar 18, 2018, at 5:14 AM, Alexander Dejanovski < > >>>>>>>> a...@thelastpickle.com> wrote: > >>>>>>>>> It's a tiny bit unusual to turn on debug logging for all users by > >>>>> default > >>>>>>>>> though, and there should be occasions to turn it on when facing > >>>> issues > >>>>>>>> that > >>>>>>>>> you want to debug (if they can be easily reproduced). > >>>>>>> > >>>>>>> -- > >>>>>>> Jonathan Ellis > >>>>>>> co-founder, http://www.datastax.com > >>>>>>> @spyced > >>>>> ------------------------------------------------------------ > --------- > >>>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org > >>>>> For additional commands, e-mail: dev-h...@cassandra.apache.org > >>>>> > >>>>> > >> > >> > >> --------------------------------------------------------------------- > >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org > >> For additional commands, e-mail: dev-h...@cassandra.apache.org > >> > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org > For additional commands, e-mail: dev-h...@cassandra.apache.org > > -- Mick Semb Wever Australia The Last Pickle Apache Cassandra Consulting http://www.thelastpickle.com