https://issues.apache.org/jira/browse/ACCUMULO-3311
On Thu, Nov 6, 2014 at 7:29 PM, Eric Newton <eric.new...@gmail.com> wrote:

> It would be nice to model "Danger!" messages with "All Clear!" directly.
>
> I'll make a ticket.
>
> On Thu, Nov 6, 2014 at 3:47 PM, Josh Elser <josh.el...@gmail.com> wrote:
>
>> > On Nov. 6, 2014, 5:47 p.m., kturner wrote:
>> > > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java, line 250
>> > > <https://reviews.apache.org/r/27654/diff/3/?file=751140#file751140line250>
>> > >
>> > >     The compaction code remembers when it logged an exception and
>> > >     does not do it again. It also logs a message if the compaction
>> > >     becomes unstuck. An advantage I thought of w/ repeatedly logging,
>> > >     is that you could see the stack trace changing (or not).
>> > >
>> > >     The stack trace is a possible trace. By the time logging happens,
>> > >     the assignment could have completed and the thread could have
>> > >     moved on to other things.
>> >
>> > Josh Elser wrote:
>> >     Yeah, since these are running fairly regularly (order of seconds) a
>> >     stuck assignment could get really spammy. Like you point out, there
>> >     could be value gained from printing out the stack more than once.
>> >     Maybe I could add some backoff which only warns so often?
>> >
>> >     bq. By the time logging happens, the assignment could have
>> >     completed and the thread could have moved on to other things.
>> >
>> >     Do you think the message should be updated to be more clear about
>> >     this? A "Maybe you should look into this" type message?
>> >
>> > kturner wrote:
>> >     > a stuck assignment could get really spammy
>> >
>> >     I think that spam is probably ok as long as the default is high
>> >     enough such that when it does happen, it's something to be concerned
>> >     about. Could make the timer check a little less frequently.
>> >
>> >     > Do you think the message should be updated to be more clear about
>> >     > this?
>> >
>> >     I think compaction code just says it's a possible stack trace. I
>> >     suppose a good solution would be to have error codes, then user can
>> >     look up error code and get nitty gritty details. Can't really put
>> >     too much info in log message.
>> >
>> > Josh Elser wrote:
>> >     bq. Could make the timer check a little less frequently.
>> >
>> >     As long as we have a long threshold for warning about a stuck
>> >     assignment, we can easily make a longer period on the timer. The
>> >     timer period dictates the minimum stuck assignment time -- I can
>> >     update the description with a clarification.
>> >
>> > kturner wrote:
>> >     I was thinking that once an assignment is considered stuck, that
>> >     each time the timer kicks a check (I think it's either 5 secs or
>> >     1 sec, not sure) that it will cause a spam. Was thinking this could
>> >     be increased to produce less spam. The period of the timer could be
>> >     a function of tserver.assignment.duration.warning, like 1/4 or 1/2.
>>
>> bq. The period of the timer could be a function of
>> tserver.assignment.duration.warning, like 1/4 or 1/2.
>>
>> That would work, unless the user changed the value of the duration
>> warning. It would still fire at the old period (unless I'm much trickier
>> about scheduling the task to run).
>>
>> Regardless need to think some more about preventing spam.
>>
>> - Josh
>>
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> https://reviews.apache.org/r/27654/#review60185
>> -----------------------------------------------------------
>>
>> On Nov. 6, 2014, 12:58 a.m., Josh Elser wrote:
>> >
>> > -----------------------------------------------------------
>> > This is an automatically generated e-mail. To reply, visit:
>> > https://reviews.apache.org/r/27654/
>> > -----------------------------------------------------------
>> >
>> > (Updated Nov. 6, 2014, 12:58 a.m.)
>> >
>> > Review request for accumulo.
>> >
>> > Bugs: ACCUMULO-3304
>> >     https://issues.apache.org/jira/browse/ACCUMULO-3304
>> >
>> > Repository: accumulo
>> >
>> > Description
>> > -------
>> >
>> > Watches assignments and reports when an assignment is running for
>> > longer than a configured time.
>> >
>> > Diffs
>> > -----
>> >
>> >   core/src/main/java/org/apache/accumulo/core/conf/Property.java 56f3d9c
>> >   server/tserver/src/main/java/org/apache/accumulo/tserver/ActiveAssignmentRunnable.java PRE-CREATION
>> >   server/tserver/src/main/java/org/apache/accumulo/tserver/RunnableStartedAt.java PRE-CREATION
>> >   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java 94be0bb
>> >   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java 935ffeb
>> >
>> > Diff: https://reviews.apache.org/r/27654/diff/
>> >
>> > Testing
>> > -------
>> >
>> > Very minimal.
>> >
>> > Thanks,
>> >
>> > Josh Elser
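
For anyone following the thread, here is a rough sketch of how the ideas discussed above could fit together: warn about a long-running assignment at most once per threshold interval, emit an "all clear" message when a previously reported assignment finishes, and re-read the threshold on every run so that a changed setting also changes the timer period. This is only an illustration under stated assumptions, not the code from the review request. The class StuckAssignmentMonitor, its methods, the String tablet keys, and the use of stderr instead of a logger are all hypothetical; only the idea of deriving the period as 1/2 of tserver.assignment.duration.warning comes from the discussion.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a stuck-assignment watcher; not Accumulo's actual code. */
public class StuckAssignmentMonitor {
  // Start time (ms) of each in-flight assignment, keyed by a tablet name (assumed key type).
  private final Map<String, Long> started = new HashMap<>();
  // Time (ms) of the most recent warning for each assignment we have reported.
  private final Map<String, Long> lastWarned = new HashMap<>();
  // Supplies the current warning threshold; stands in for reading
  // tserver.assignment.duration.warning from the configuration on each call.
  private final LongSupplier warnAfterMillis;

  public StuckAssignmentMonitor(LongSupplier warnAfterMillis) {
    this.warnAfterMillis = warnAfterMillis;
  }

  public synchronized void assignmentStarted(String tablet, long now) {
    started.put(tablet, now);
  }

  /** Returns an "all clear" message only if a warning was issued for this assignment. */
  public synchronized List<String> assignmentFinished(String tablet, long now) {
    List<String> out = new ArrayList<>();
    Long start = started.remove(tablet);
    Long warned = lastWarned.remove(tablet);
    if (start != null && warned != null) {
      out.add("All clear: assignment of " + tablet + " finished after " + (now - start) + " ms");
    }
    return out;
  }

  /** Returns warnings for assignments past the threshold, at most one per threshold interval. */
  public synchronized List<String> check(long now) {
    long threshold = warnAfterMillis.getAsLong();
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, Long> e : started.entrySet()) {
      long running = now - e.getValue();
      if (running < threshold) {
        continue; // not yet considered stuck
      }
      Long last = lastWarned.get(e.getKey());
      if (last == null || now - last >= threshold) {
        lastWarned.put(e.getKey(), now);
        out.add("Assignment of " + e.getKey() + " has been running for " + running
            + " ms (possibly stuck)");
      }
    }
    return out;
  }

  /** Delay before the next check: half of the current threshold, at least 1 ms. */
  public long nextCheckDelayMillis() {
    return Math.max(1L, warnAfterMillis.getAsLong() / 2);
  }

  /** One-shot task that re-arms itself, so a changed threshold changes the next delay. */
  public void schedule(ScheduledExecutorService timer) {
    timer.schedule(() -> {
      check(System.currentTimeMillis()).forEach(System.err::println);
      schedule(timer);
    }, nextCheckDelayMillis(), TimeUnit.MILLISECONDS);
  }
}
```

Re-arming a one-shot task, rather than using a fixed-rate schedule, is one way to address the concern that the timer "would still fire at the old period" after the duration warning is changed. The trade-off is that a new value only takes effect after the currently pending delay has elapsed.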