From: "Adrian Crum" <adrian.c...@sandglass-software.com> > On 4/7/2013 10:30 AM, Jacques Le Roux wrote: >> From: "Adrian Crum" <adrian.c...@sandglass-software.com> >>> From my perspective, you are the one complicating things. If you don't >>> want to slow down the system with logging I/O, then turn logging off. >> When you are continously developping a system, removing all log info/timing >> from production can be problematic. >> It's sometimes the only way you have to understand why and how a random >> issue appears. > > > That is helpful. So, when you are troubleshooting a production > installation, you enable TIMING and INFO logging to see what's going on. > Understood. Now, what is the purpose of using a threshold?
To separate slow services and, for the 2nd threshold, possibly very slow ones. Sometimes, it depends not only on external system but also on a whole "ecosystem" (net connections, firewalls, etc.) We know things always changes, it's like a chaos sometimes. At this stage you are yet sure not what for looking for. So you need a 1st rough filter to better understand what's going on Putting thresholds on services durations can then help to separate cases. > You mentioned marking slow services, but a single threshold value for > all services will not do that - because every service is different and > each one will have a unique typical elapsed time. That was the reason > why I changed the text of the message - because services that were > completed in timely manner were being logged as "slow" - which was not true. There were slow and very slow, that why I used 2 properties, and set the 1st to 0, to have the behaviour you introduced and I agree with for OOTB setting (show all services duration) > Have you considered using the metrics feature? Ha forgot to ask about it, now I see and remember this thread and I will digg in http://markmail.org/message/x4lzvda66ju6gdg5 Of course, I'd not be against a brief briefing, or a link to explain. Thanks for the reminder Jacques > > >> >>> "We've always done it that way" is not a valid argument. >> I was just saying that personally I did not have problems whith how it was >> before your change. I was easy to adapt to my need. I have also no problems >> with your change, it's sill easy to adapt. >> >>> You still haven't given us any requirements that demonstrate the need to >>> put complicated decision making around a single log entry. >> My solution is not complicated at all 2 properties for 2 thresholds. Why do >> you feel it as complicated? >> Actually I'm not stuck with this solution. If you want, comment it out or >> even revert and the chapter is closed. >> >> Jacques >> >>> -Adrian >>> >>> On 4/7/2013 9:18 AM, Jacques Le Roux wrote: >>>> This seems over complicated to me. Don't arbitrarily slow OFBiz by storing >>>> (IOs) services durations at each service call. Let's keep it simple and >>>> easy tunable. >>>> >>>> The way it was before ( >50 => slow; > 200 => very slow) was there for >>>> years. I did not code it, and after a decade we want to change it, OK. >>>> >>>> Most of the time you don't care about that. When you do, you simply want >>>> to separate slow services from others. >>>> >>>> I believe OOTB 2 properties are enough. I set them to allow to see all >>>> services durations and to specifically mark slow services. I picked 1 sec, >>>> why? It would be terribly slow services internally, but externals ones can >>>> vary much more. >>>> If you need more thresholds/separations it's easy from what is coded now >>>> to add suiting your needs. >>>> >>>> From my experience, when looking at logs on a system you quickly know >>>> which are these values, when you are interested by them >>>> >>>> The most important point here is not slow the system and to follow >>>> KISS/YAGNI ways. >>>> >>>> Please don't complicate things, thanks! >>>> >>>> Jacques >>>> >>>> >>>> From: "Atul Vani" <atul.v...@hotwaxmedia.com> >>>>> Here's what I think, it's all raw though :) >>>>> * As suggested by Jacopo, we maintain stats in some kind of entity. Let's >>>>> say average running time. >>>>> * We use this average running time to decide if a timing log should be >>>>> printed. The thing is, not all services are same, some are complex and are >>>>> supposed to take longer than others, so a static number would never >>>>> suffice. We do not have any means to calculate the complexity of a service >>>>> (by looking at the mini-lang code) to decide this number, so either a user >>>>> will tell it or we can compute it on the basis of past run times. >>>>> * There can come cases when user would want to tell the number (all past >>>>> experiences were bad ones, or he just optimized), so we can add some >>>>> attribute to the service definition. There is something like metric >>>>> already present, not sure how that works. >>>>> * A single number for all services does not make sense to me, because (as >>>>> mentioned above) not all services are same. >>>>> >>>>> On Sun, 07 Apr 2013 04:14:58 +0530, Adrian Crum >>>>> <adrian.c...@sandglass-software.com> wrote: >>>>> >>>>>> No, a commit is NOT a reply to an email. >>>>>> >>>>>> Please, let's discuss this. You seem to be forcing your perception of >>>>>> how logging should be done on the rest of the community. I would prefer >>>>>> that we all participate in a discussion and come up with a design that >>>>>> works for everyone. >>>>>> >>>>>> -Adrian >>>>>> >>>>>> On 4/6/2013 5:32 PM, Jacques Le Roux wrote: >>>>>>> I thought the commit comment answered your question >>>>>>> >>>>>>> OOTB, there is only 1 change from what you committed: now services >>>>>>> longer than 1 sec will also show as slow in log >>>>>>> >>>>>>> Jacques >>>>>>> >>>>>>> From: "Adrian Crum" <adrian.c...@sandglass-software.com> >>>>>>>> Please don't change the timing logging - there should not be any >>>>>>>> conditions placed on it. >>>>>>>> >>>>>>>> You didn't answer my question. I was hoping we could avoid a commit war >>>>>>>> by discussing your requirements and designing a solution that makes >>>>>>>> everyone happy. >>>>>>>> >>>>>>>> -Adrian >>>>>>>> >>>>>>>> On 4/6/2013 12:00 PM, Jacques Le Roux wrote: >>>>>>>>> Hi Adrian, >>>>>>>>> >>>>>>>>> Thanks for asking, I committed and commented at revision: 1465223 >>>>>>>>> >>>>>>>>> Atul, >>>>>>>>> >>>>>>>>> It was not easy to read your patch in the email (cut at 80 chars). >>>>>>>>> Please open a Jira if you want to improve my commit. >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> Jacques >>>>>>>>> >>>>>>>>> >>>>>>>>> From: "Adrian Crum" <adrian.c...@sandglass-software.com> >>>>>>>>>> Jacques, >>>>>>>>>> >>>>>>>>>> What are your requirements? What are you looking for in the logs? >>>>>>>>>> >>>>>>>>>> -Adrian >>>>>>>>>> >>>>>>>>>> On 4/4/2013 10:54 PM, Jacques Le Roux wrote: >>>>>>>>>>> These numbers are from experience of hours and hours staring at >>>>>>>>>>> clusters logs, but yes it's arbitrary and depend on context (as I >>>>>>>>>>> guess were picked the initial numbers which are there for years >>>>>>>>>>> Then why not the obvious solution Jacopo proposed of properties, >>>>>>>>>>> easy to change even dynamically... >>>>>>>>>>> I can't see anything more flexible, at least at 23:43 after days of >>>>>>>>>>> works. >>>>>>>>>>> AS you said, once you spot one such line in log it's not a biggie >>>>>>>>>>> to get there (you have the class line in log) and adapt suiting >>>>>>>>>>> your needs. >>>>>>>>>>> So maybe "like you proposed" we could indeed put a very low value >>>>>>>>>>> (I mean 0) as property. >>>>>>>>>>> >>>>>>>>>>> For Atul's proposition, sorry not the courage to check tonight >>>>>>>>>>> (maybe a patch in a Jira would help to read) >>>>>>>>>>> >>>>>>>>>>> Jacques >>>>>>>>>>> PS: I guessed "fuss around", knew tinker, not fidget :D >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Adrian Crum wrote: >>>>>>>>>>>> Why not 20 or 30 or 40? >>>>>>>>>>>> >>>>>>>>>>>> That's the problem with arbitrary values - they don't mean >>>>>>>>>>>> anything. >>>>>>>>>>>> >>>>>>>>>>>> From my perspective, if anyone has timing enabled, then they >>>>>>>>>>>> want to >>>>>>>>>>>> see what's going on in the system. >>>>>>>>>>>> >>>>>>>>>>>> Feel free to change it. >>>>>>>>>>>> >>>>>>>>>>>> -Adrian >>>>>>>>>>>> >>>>>>>>>>>> On 4/3/2013 9:22 AM, Jacques Le Roux wrote: >>>>>>>>>>>>> Hi Adrian, All, >>>>>>>>>>>>> >>>>>>>>>>>>> Should we really show the timing for all services? >>>>>>>>>>>>> Maybe increasing from 50 to 75 or even 100 for the 1st case would >>>>>>>>>>>>> be enough? >>>>>>>>>>>>> >>>>>>>>>>>>> Jacques >>>>>>>>>>>>> >>>>>>>>>>>>> From: <adri...@apache.org> >>>>>>>>>>>>>> Author: adrianc >>>>>>>>>>>>>> Date: Wed Apr 3 07:57:24 2013 >>>>>>>>>>>>>> New Revision: 1463863 >>>>>>>>>>>>>> >>>>>>>>>>>>>> URL: http://svn.apache.org/r1463863 >>>>>>>>>>>>>> Log: >>>>>>>>>>>>>> Log message cleanup in ServiceDispatcher.java. Removed confusing >>>>>>>>>>>>>> text about services taking too long. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Modified: >>>>>>>>>>>>>> >>>>>>>>>>>>>> ofbiz/trunk/framework/service/src/org/ofbiz/service/ServiceDispatcher.java >>>>>>>>>>>>>> >>>>>>>>>>>>>> Modified: >>>>>>>>>>>>>> ofbiz/trunk/framework/service/src/org/ofbiz/service/ServiceDispatcher.java >>>>>>>>>>>>>> URL: >>>>>>>>>>>>>> http://svn.apache.org/viewvc/ofbiz/trunk/framework/service/src/org/ofbiz/service/ServiceDispatcher.java?rev=1463863&r1=1463862&r2=1463863&view=diff >>>>>>>>>>>>>> ============================================================================== >>>>>>>>>>>>>> --- >>>>>>>>>>>>>> ofbiz/trunk/framework/service/src/org/ofbiz/service/ServiceDispatcher.java >>>>>>>>>>>>>> (original) +++ >>>>>>>>>>>>>> ofbiz/trunk/framework/service/src/org/ofbiz/service/ServiceDispatcher.java >>>>>>>>>>>>>> Wed Apr 3 07:57:24 2013 @@ -571,10 +571,8 @@ public >>>>>>>>>>>>>> class ServiceDispatcher { rs.setEndStamp(); >>>>>>>>>>>>>> >>>>>>>>>>>>>> long timeToRun = System.currentTimeMillis() - >>>>>>>>>>>>>> serviceStartTime; >>>>>>>>>>>>>> - if (Debug.timingOn() && timeToRun > 50) { >>>>>>>>>>>>>> - Debug.logTiming("Slow sync service execution >>>>>>>>>>>>>> detected: service [" + localName + "/" + modelService.name + "] >>>>>>>>>>>>>> finished in [" + timeToRun + "] milliseconds", module); >>>>>>>>>>>>>> - } else if (Debug.infoOn() && timeToRun > 200) { >>>>>>>>>>>>>> - Debug.logInfo("Very slow sync service execution >>>>>>>>>>>>>> detected: service [" + localName + "/" + modelService.name + "] >>>>>>>>>>>>>> finished in [" + timeToRun + "] milliseconds", module); + >>>>>>>>>>>>>> if (Debug.timingOn()) { >>>>>>>>>>>>>> + Debug.logTiming("Sync service [" + localName + "/" >>>>>>>>>>>>>> + modelService.name + "] finished in [" + timeToRun + "] >>>>>>>>>>>>>> milliseconds", module); } >>>>>>>>>>>>>> if ((Debug.verboseOn() || modelService.debug) && >>>>>>>>>>>>>> timeToRun > 50 && !modelService.hideResultInLog) { >>>>>>>>>>>>>> // Sanity check - some service results can be >>>>>>>>>>>>>> multiple MB in size. Limit message size to 10K. >>>>> -- >>>>> Using Opera's revolutionary email client: http://www.opera.com/mail/ >