On 09/25/2015 03:06 PM, Phil Steitz wrote:
On 9/25/15 11:01 AM, Ole Ersoy wrote:

On 09/25/2015 11:34 AM, Phil Steitz wrote:
I disagree. Good tests, API contracts, exception management and
documentation can and should eliminate the need for cluttering
low-level library code with debug logging.
Logging could be viewed as clutter.  Constructed the right way,
the logging statements could also be viewed as comments.

I agree that good tests, API contracts, and in general keeping the
design as minimalistic and simple as possible should be the first
criteria to review before introducing logging.  When the code is
trivial I leave logging out, unless I need to put in a few
statements in order to track collaboration between components in
an algorithm.
Other times the method is as simple as it gets and I have to have
logging.  For example take the LevenbergMarquardtOptimizer
optimize() method.  It's atomic in the sense that what's in the
method belongs in the method.  There are several loops within the
main loop, etc. and tracking what's going on, even with an observer
being notified on each increment, would be far from elegant.
Why, exactly, do you need to "track what is going on?"
I hope that I don't.  Most of the time the code we write, especially the CM code, 
is rock solid.  When implementing complex algorithms we break it down to simple 
building blocks, reason about it, break it down some more, unit test it, and it 
looks fantastic.  Then we hit a snag, and start putting in println statements.  
Then we delete these after the fix, cut up 5 chickens, and pray.

If an issue occurs, then there is a low probability that it is a CM component.  
The traces can help prove that it is not the CM component.

If the code is now rock solid, then the log statements bother no one.  We put 
the code back in production, trace initially to make sure everything looks 
healthy, and then turn off logging.  On the flip side if we notice that 
something else seems off, we're back to putting in the println statements again.
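
A minimal sketch of that "trace first, then switch it off" workflow. It uses 
java.util.logging from the JDK only so that it runs without extra dependencies 
(the proposal in this thread is SLF4J); the class name TraceToggleDemo, the 
logger name and the message text are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo class; not part of Commons Math.
class TraceToggleDemo {
    static final Logger LOG = Logger.getLogger("demo.optimizer");

    // Counts how often a trace message was actually built.
    static int messagesBuilt = 0;

    // Keeps log messages in memory so we can inspect what was emitted.
    static final class RecordingHandler extends Handler {
        final List<String> messages = new ArrayList<>();
        @Override public void publish(LogRecord record) { messages.add(record.getMessage()); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Attaches a recording handler and detaches the default console output.
    static RecordingHandler install() {
        RecordingHandler handler = new RecordingHandler();
        LOG.setUseParentHandlers(false);
        LOG.addHandler(handler);
        return handler;
    }

    // Stand-in for one step of an algorithm.
    static void step(int iteration, double cost) {
        // The lambda is only evaluated when FINEST is enabled for this logger.
        LOG.finest(() -> {
            messagesBuilt++;
            return "iteration=" + iteration + " cost=" + cost;
        });
    }

    public static void main(String[] args) {
        RecordingHandler handler = install();
        LOG.setLevel(Level.FINEST);  // tracing on: watch the first runs
        step(1, 0.5);
        LOG.setLevel(Level.OFF);     // tracing off: statement stays in the code
        step(2, 0.25);
        System.out.println(handler.messages);
    }
}
```

With the level set to OFF the message is never built, so the statement that 
stays behind in the code costs little more than a level check.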

Watching traces of CM components in production is an additional insurance 
policy for guaranteeing quality.

Also if there is an issue, and someone needs to understand it fast, the best 
way is to watch data flow through the system.  For new contributors, this could 
be a good way to get up to speed on an algorithm.


   If you need
to do that as a user of the code, some kind of listener or API to
give you the information that you need is appropriate.
I agree, but as I showed with the LevenbergMarquardtOptimizer, attempting to 
inject code into the optimize() method is a nontrivial exercise.  Components 
should have a simple way to examine discrete steps that are being performed.

   Dumping text
to an external resource to "solve" this usually indicates smelliness
somewhere - either in the library API or the client code.
Or it is a precaution to ensure that no one forgot to flush.


For example perhaps we want to see what's going on with the
parameters in this small (5% of the method size) section of code:
What parameters and where did they come from?  If from the client,
the client can validate them.  If the library needs to validate or
confirm suitability, then it should do that in code or via tests.
                 // compute the scaled predicted reduction
                 // and the scaled directional derivative
                 for (int j = 0; j < solvedCols; ++j) {
                     int pj = permutation[j];
                     double dirJ = lmDir[pj];
                     work1[j] = 0;
                     for (int i = 0; i <= j; ++i) {
                         work1[i] += weightedJacobian[i][pj] * dirJ;
                     }
                 }

If there is something wrong with this in production, the shortest
path to figuring that out is reviewing a trace.  The longer path
is to stop the server.  Grab the data.  Open up a debugger.  Run
the data.
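
As a rough sketch, this is what a trace statement inside that loop could look
like.  It uses java.util.logging as a dependency-free stand-in for SLF4J, and
the class name ScaledReductionTrace, the method computeWork1 and the logger
name are hypothetical, not the actual CM source:

```java
import java.util.Arrays;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical wrapper around the loop quoted above; not the actual CM code.
class ScaledReductionTrace {
    private static final Logger LOG = Logger.getLogger("demo.lm");

    static double[] computeWork1(double[][] weightedJacobian, int[] permutation,
                                 double[] lmDir, int solvedCols) {
        double[] work1 = new double[solvedCols];
        for (int j = 0; j < solvedCols; ++j) {
            int pj = permutation[j];
            double dirJ = lmDir[pj];
            work1[j] = 0;
            for (int i = 0; i <= j; ++i) {
                work1[i] += weightedJacobian[i][pj] * dirJ;
            }
            // Guarded so the message is only built when tracing is enabled.
            if (LOG.isLoggable(Level.FINEST)) {
                LOG.finest("j=" + j + " pj=" + pj + " dirJ=" + dirJ
                        + " work1=" + Arrays.toString(work1));
            }
        }
        return work1;
    }

    public static void main(String[] args) {
        double[][] jacobian = {{1, 2}, {3, 4}};
        double[] work1 = computeWork1(jacobian, new int[] {1, 0},
                                      new double[] {10, 20}, 2);
        System.out.println(Arrays.toString(work1));
    }
}
```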

If we wanted to observe this section of code the observer would be
looking at sub loops of the optimize method.  So that's doable,
but it creates an interface design for the observer that's
cluttered with events corresponding to various low level algorithm
details.
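
To make the concern concrete, here is a sketch of what such an observer
might have to look like for just the loop above.  Every name in it
(LmLoopObserver, RecordingLmObserver, ObservedLoop) is invented for
illustration; nothing like this exists in CM:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical observer API: one callback per low-level detail of the loop.
interface LmLoopObserver {
    void columnSelected(int j, int pj, double dirJ);
    void work1Updated(int i, double value);
}

// Simple observer that records every event as text.
class RecordingLmObserver implements LmLoopObserver {
    final List<String> events = new ArrayList<>();

    @Override public void columnSelected(int j, int pj, double dirJ) {
        events.add("column j=" + j + " pj=" + pj + " dirJ=" + dirJ);
    }

    @Override public void work1Updated(int i, double value) {
        events.add("work1[" + i + "]=" + value);
    }
}

// The same loop as above, now interleaved with observer notifications.
class ObservedLoop {
    static double[] computeWork1(double[][] weightedJacobian, int[] permutation,
                                 double[] lmDir, int solvedCols,
                                 LmLoopObserver observer) {
        double[] work1 = new double[solvedCols];
        for (int j = 0; j < solvedCols; ++j) {
            int pj = permutation[j];
            double dirJ = lmDir[pj];
            observer.columnSelected(j, pj, dirJ);
            work1[j] = 0;
            for (int i = 0; i <= j; ++i) {
                work1[i] += weightedJacobian[i][pj] * dirJ;
                observer.work1Updated(i, work1[i]);
            }
        }
        return work1;
    }

    public static void main(String[] args) {
        RecordingLmObserver observer = new RecordingLmObserver();
        computeWork1(new double[][] {{1, 2}, {3, 4}}, new int[] {1, 0},
                     new double[] {10, 20}, 2, observer);
        observer.events.forEach(System.out::println);
    }
}
```

Two callbacks for roughly 5% of the method; covering the rest of optimize()
the same way would multiply the interface accordingly.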


Several times, I've been obliged to create a modified version of CM
to introduce "print" statements (poor man's logging!) in order to
figure out why my code did not do what it was supposed to.
It's pretty tragic that any one of us should have to do this.  It's
also wasteful, because if Gilles has to do this, then there's a
good chance that others have to do it too.  The reason Tomcat logs
at various levels is so that we can see what's going on in
production and track down bugs.
No.  Have a look at the Tomcat logging code.  It is mostly
initialization, shutdown and exceptions or warnings.

I agree, but I would argue that these should be far simpler to reason about 
than the life cycle of some of the CM algorithms.

   This is
necessary because tomcat is a container - there is no client
application to catch exceptions or get API results back.
The use case is the same.  If there is a production problem, it will show up in 
the logs.  This is what Gilles and I are asking for.  The ability to help 
diagnose component performance by reviewing a trace.  If the component is rock 
solid, then the additional tracing is free.

Part of this reminds me of a hesitation I had with using Markdown syntax for 
tables.  It's pretty tedious to line up all the pipes, etc. to make the tables 
look pretty.  Then I downloaded atom and a markdown plugin, and it 
automatically did it for me.

It should be pretty easy to hide logging statements in code via tooling. I just 
searched, and it does not look like it exists yet, but at least others are 
thinking about it:
http://stackoverflow.com/questions/10705814/how-to-write-eclipse-plugin-to-hide-logger-statements


   The analog
for the kind of instrumentation you are proposing inside [math]
would be like tomcat larding itself up with debug logging throughout
the request processing lifecycle, which it does not do.
I agree, but the request API publishes a great deal of detail about each 
request.  Also Tomcat is in production all over the place.  It goes through a 
lot of hammering.  Math components get far less exposure, probably on the 
order of 1/100,000, which I think would be a conservative estimate.


    It is a
bad analogy in any case, because Tomcat is a container, which is 2
big steps up the processing chain from a low-level library.
That's fair.  The CM developers are all very talented though.  I don't think 
Gilles is going to get all sloppy joe with his logging. He cares a great deal 
about the code quality, just like everyone else.  I think if we OK logging, 
then initially Gilles will probably be the only one using it, because he 
strongly feels he needs it. The rest of us seem fairly content with the state 
of the components, so I'm guessing the components will stay the way they are 
for a while.  If there is nothing wrong, then there is no need to introduce 
tracing.

When I perform the LevenbergMarquardtOptimizer refactoring I'm about to embark 
on, I'm going to introduce tracing just to learn more about the algorithm's 
dynamics.  
It would be a shame to have to delete the tracing when the refactoring is 
complete.

Introducing tracing is going to give future maintainers and contributors a 
great on-ramp.

Cheers,
- Ole




Phil

Let's become one with the logging and make it into something that
strengthens both code integrity and comprehensibility.  Everyone
take out your Feng Shui mat and do some deep breathing right
now.

Here again, tests, good design, code inspection are the way to go in
low-level components.  I have also spent a lot of time researching
bugs in [math], other Commons components and other complex systems.
My experience is a little different: excessive logging / debug code
for code that contains it often just tends to get in the way,
It's a very good point.  Sometimes we come across logging
statements that are just noise.  Given how rigorous we are about
reviewing everything though, I think a win win would be to limit
the noise during code review, and pledge not to "Call our mom" in
the middle of the code, etc.

especially after it has rotted a bit.  Rather than adding more code
to maintain so that you can have less conceptual control over the
functional code, it is better, IMNSHO, to focus on making the
functional code as simple as possible with clean well-documented,
test-validated API contracts.
Totally agree.  I think this should be the first priority, and
that logging should be used when there are no simple clean
alternatives.

It also makes code easier to debug while developing or modifying it
(without resorting to poor man's logging, then deleting the "print",
then reinstating them, then deleting them again, ad nauseam).
Pretty sure we have all been here.

Cheers,
- Ole


Gilles

[1] No quality or complexity judgment implied.

Phil
Gilles

Thomas


On Fri, Sep 25, 2015 at 3:17 PM, Ole Ersoy <ole.er...@gmail.com>
wrote:

Hello,

We have been discussing various ways to view what's happening internally
with algorithms, and the topic of including SLF4J has come up.  I know that
this was discussed earlier and it was decided that CM is a low level
dependency, therefore it should minimize the transitive dependencies that
it introduces.  The Java community has adopted many means of dealing with
potential logging conflicts, so I'm requesting that we use SLF4J for
logging.

I know that JBoss introduced its own logging system, and this made me a
bit nervous about this suggestion, so I looked up strategies for switching
their logger out with SLF4J:



http://stackoverflow.com/questions/14733369/force-jboss-logging-to-use-of-slf4j




The general process I go through when working with many dependencies that
might use commons-logging instead of SLF4J looks something like this:



http://stackoverflow.com/questions/8921382/maven-slf4j-version-conflict-when-using-two-different-dependencies-that-requi




With JDK9 individual modules can define their own isolated set of
dependencies.  At this point the fix should be permanent.  If someone has
a very intricate scenario that we have not yet seen, they could use
(and probably should use) OSGi to isolate dependencies.

WDYT?

Cheers,
- Ole
---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
