It would be nice to model "Danger!" messages with "All Clear!" directly.
I'll make a ticket.

On Thu, Nov 6, 2014 at 3:47 PM, Josh Elser <[email protected]> wrote:

> > On Nov. 6, 2014, 5:47 p.m., kturner wrote:
> > > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java, line 250
> > > <https://reviews.apache.org/r/27654/diff/3/?file=751140#file751140line250>
> > >
> > >     The compaction code remembers when it logged an exception and does not do it again. It also logs a message if the compaction becomes unstuck. An advantage I thought of w/ repeatedly logging, is that you could see the stack trace changing (or not).
> > >
> > >     The stack trace is a possible trace. By the time logging happens, the assignment could have completed and the thread could have moved on to other things.
> >
> > Josh Elser wrote:
> >     Yeah, since these are running fairly regularly (order of seconds) a stuck assignment could get really spammy. Like you point out, there could be value gained from printing out the stack more than once. Maybe I could add some backoff which only warns so often?
> >
> >     bq. By the time logging happens, the assignment could have completed and the thread could have moved on to other things.
> >
> >     Do you think the message should be updated to be more clear about this? A "Maybe you should look into this" type message?
> >
> > kturner wrote:
> >     > a stuck assignment could get really spammy
> >
> >     I think that spam is probably ok as long as the default is high enough such that when it does happen, its something to be concerned about. Could make the timer check a little less frequently.
> >
> >     > Do you think the message should be updated to be more clear about this?
> >
> >     I think compaction code just says its a possible stack trace. I suppose a good solution would be to have error codes, then user can look up error code and get nitty gritty details. Can't really put too much info in log message.
> >
> > Josh Elser wrote:
> >     bq. Could make the timer check a little less frequently.
> >
> >     As long as we have a long threshold for warning about a stuck assignment, we can easily make a longer period on the timer. The timer period dictates the minimum stuck assignment time -- I can update the description with a clarification.
> >
> > kturner wrote:
> >     I was thinking that once an assignment is considered stuck, that each time the timer kicks a check (I think its either 5 secs or 1 sec, not sure) that it will cause a spam. Was thinking this could be increased to produce less spam. The period of the timer could be a function of tserver.assignment.duration.warning, like 1/4 or 1/2.
>
> bq. The period of the timer could be a function of tserver.assignment.duration.warning, like 1/4 or 1/2.
>
> That would work, unless the user changed the value of the duration warning. It would still fire at the old period (unless I'm much trickier about scheduling the task to run).
>
> Regardless need to think some more about preventing spam.
>
>
> - Josh
>
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27654/#review60185
> -----------------------------------------------------------
>
>
> On Nov. 6, 2014, 12:58 a.m., Josh Elser wrote:
> >
> > -----------------------------------------------------------
> > This is an automatically generated e-mail. To reply, visit:
> > https://reviews.apache.org/r/27654/
> > -----------------------------------------------------------
> >
> > (Updated Nov. 6, 2014, 12:58 a.m.)
> >
> >
> > Review request for accumulo.
> >
> >
> > Bugs: ACCUMULO-3304
> >     https://issues.apache.org/jira/browse/ACCUMULO-3304
> >
> >
> > Repository: accumulo
> >
> >
> > Description
> > -------
> >
> > Watches assignments and reports when an assignment is running for longer than a configured time.
> >
> >
> > Diffs
> > -----
> >
> >   core/src/main/java/org/apache/accumulo/core/conf/Property.java 56f3d9c
> >   server/tserver/src/main/java/org/apache/accumulo/tserver/ActiveAssignmentRunnable.java PRE-CREATION
> >   server/tserver/src/main/java/org/apache/accumulo/tserver/RunnableStartedAt.java PRE-CREATION
> >   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java 94be0bb
> >   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java 935ffeb
> >
> > Diff: https://reviews.apache.org/r/27654/diff/
> >
> >
> > Testing
> > -------
> >
> > Very minimal.
> >
> >
> > Thanks,
> >
> > Josh Elser
> >
> >
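
For the ticket, here is a rough sketch of what pairing "Danger!" with "All Clear!" plus a warning backoff might look like. This is not the code under review: the class name StuckAssignmentWarner, the Action enum, and both constructor parameters are hypothetical, and time is passed in as plain milliseconds so the policy can be exercised without a real timer.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not the patch from the review: decides when to log
 * about a long-running assignment. All names here are illustrative.
 */
class StuckAssignmentWarner {
  enum Action { NONE, WARN, ALL_CLEAR }

  // How long an assignment may run before the first warning.
  private final long warnAfterMillis;
  // Minimum gap between repeated warnings for the same assignment (the backoff).
  private final long minWarnIntervalMillis;
  // Assignment id -> time of the last warning logged for it.
  private final Map<String, Long> lastWarned = new HashMap<>();

  StuckAssignmentWarner(long warnAfterMillis, long minWarnIntervalMillis) {
    this.warnAfterMillis = warnAfterMillis;
    this.minWarnIntervalMillis = minWarnIntervalMillis;
  }

  /** Called on each timer tick for an assignment that is still running. */
  Action check(String id, long startedAtMillis, long nowMillis) {
    if (nowMillis - startedAtMillis < warnAfterMillis) {
      return Action.NONE; // not running long enough to be a concern
    }
    Long last = lastWarned.get(id);
    if (last != null && nowMillis - last < minWarnIntervalMillis) {
      return Action.NONE; // warned recently, stay quiet to avoid spam
    }
    lastWarned.put(id, nowMillis);
    return Action.WARN; // caller logs the "Danger!" message and a possible stack trace
  }

  /** Called when an assignment finishes, whether or not it was ever warned about. */
  Action completed(String id) {
    // Only emit "All Clear!" if a "Danger!" was logged earlier for this assignment.
    return lastWarned.remove(id) != null ? Action.ALL_CLEAR : Action.NONE;
  }
}
```

Because the caller supplies the current time on every tick, the timer period can change (for example, if the duration warning property is updated) without affecting how often a given assignment is reported.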
