BTW, one thing I forgot to add: with infoStream, it's very difficult to
extend the level of output if one wants, for example, to add logging
messages to the search part (or other parts). The reason is that one would
need to pass infoStream down through too many classes. With Java logging,
by contrast, each class is responsible for its own logging (by obtaining a
Logger instance given the class name). You can later turn logging on/off per
package/class.
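
To make that concrete, here is a rough sketch of what I have in mind, using
only java.util.logging. HypotheticalSearcher and HypotheticalMerger are
made-up stand-ins for illustration, not actual Lucene classes, and the
handler setup is just one possible way to let FINE messages through:

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class PerClassLoggingSketch {

    // Hypothetical stand-in for a search-side class (not a real Lucene class).
    static class HypotheticalSearcher {
        // Each class obtains its own Logger, keyed by its class name.
        static final Logger LOG = Logger.getLogger(HypotheticalSearcher.class.getName());

        void search(String query) {
            // Guard so the message string is only built when FINE is enabled.
            if (LOG.isLoggable(Level.FINE)) {
                LOG.fine("running query: " + query);
            }
        }
    }

    // Hypothetical stand-in for a merge-side class (not a real Lucene class).
    static class HypotheticalMerger {
        static final Logger LOG = Logger.getLogger(HypotheticalMerger.class.getName());

        void merge(int segments) {
            if (LOG.isLoggable(Level.FINE)) {
                LOG.fine("merging " + segments + " segments");
            }
        }
    }

    public static void main(String[] args) {
        // The default console handler only prints INFO and above,
        // so attach a handler that lets FINE records through.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        HypotheticalSearcher.LOG.addHandler(handler);
        HypotheticalMerger.LOG.addHandler(handler);

        // Turn logging on for one class and off for the other.
        HypotheticalSearcher.LOG.setLevel(Level.FINE);
        HypotheticalMerger.LOG.setLevel(Level.OFF);

        new HypotheticalSearcher().search("title:lucene"); // logged
        new HypotheticalMerger().merge(3);                 // suppressed
    }
}
```

The same levels could also be set from a logging.properties file instead of
in code, so a customer could enable tracing without a new build.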

Perhaps instead of introducing Java logging then (if you're too against it),
we could introduce a static InfoStream class, with static message() and
isVerbose() methods. That way, any class that wishes to log a message can
use it, and it will be easier to add messages in the future from other
classes.
Even though it won't allow controlling which classes/packages output to
the log file, it will make Lucene logging easier to extend. Would that
make more sense?
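
Roughly, I'm thinking of something like the sketch below. Only message() and
isVerbose() come from the proposal above; the class name, setStream() and the
"component" argument are assumptions I made up for illustration, and nothing
like this exists in Lucene today:

```java
import java.io.PrintStream;

// Hypothetical sketch of a static InfoStream-like class.
public final class StaticInfoStreamSketch {

    // null means logging is disabled; volatile so all threads see changes.
    private static volatile PrintStream out;

    private StaticInfoStreamSketch() {
        // static-only utility, not meant to be instantiated
    }

    // Assumed setter: install the target stream, or pass null to disable output.
    public static void setStream(PrintStream stream) {
        out = stream;
    }

    // Lets callers skip building expensive messages when nothing is listening.
    public static boolean isVerbose() {
        return out != null;
    }

    // Writes one line prefixed with the (assumed) component name.
    public static void message(String component, String msg) {
        PrintStream s = out; // read once so a concurrent setStream(null) is safe
        if (s != null) {
            s.println(component + ": " + msg);
        }
    }

    public static void main(String[] args) {
        StaticInfoStreamSketch.setStream(System.out);
        if (StaticInfoStreamSketch.isVerbose()) {
            StaticInfoStreamSketch.message("search", "running query title:lucene");
        }
    }
}
```

Any class could then call message() without having infoStream handed to it,
but as noted it is all-or-nothing: there is no per-class or per-package
switch.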

I would still prefer to see Java logging embedded, but if that's
unacceptable to the community, then the above solution is better than
what we have today.

On Fri, Dec 5, 2008 at 9:38 PM, Shai Erera <[EMAIL PROTECTED]> wrote:

> Have you ever tried to debug your search application after it was shipped
> to a customer? When problems occur on the customer end, you cannot easily
> reproduce them, because customers don't like to give you access to their
> systems, and they are not always willing to share the index with you, let
> alone the documents that have been indexed.
>
> Logging is very common in products just for that purpose. Of course I can
> use debugging when something happens in my development environment. But
> that's not the case after the product has shipped.
>
> As for the logging framework, I'd think that Java logging creates no
> dependencies for Lucene. java.util.logging has existed since at least 1.4,
> so it's already in the JDK. You might argue that some applications that
> embed a search component over Lucene use a different logging system (such
> as Log4j), but in that case I think it'd be fair to say that Java logging
> is what Lucene uses.
>
> You already do it today - you say that you use infoStream which prints
> messages. Only the solution in Lucene today cannot be customized. I either
> turn on *logging* for the entire Lucene package (or actually just the
> indexing part) or not. I cannot, for example, turn on *logging* just for the
> merge part.
>
> The debugging on the customer side is mostly what I'm after. My experience
> with another search library (proprietary) with exactly the same *logging*
> capabilities as Lucene (you either turn on/off logging for everything),
> although it contained messages from other parts of the search library as
> well, shows that it's extremely difficult to debug what's going on during
> search on the customer side. Sometimes, all the application can log is that
> it adds a document with some attributes, but if you really want to
> understand what's going on inside Lucene, it's impossible. One useful
> piece of information might be the actual tokens that were added to the
> index. There's no way the application can tell you that, w/o running the
> Analyzer on the text. But then it needs to write code, which I think could
> have been written in Lucene.
> Another useful piece of information is the query that's actually being run.
> I guess that printing the QueryParser Query output object might be enough,
> but you never know.
> Maybe you'd like to know what indexes participated in the search, in case
> of a distributed indexing scenario.
>
> And the list can only grow ...
>
> Like I said in my first email - logging is an approach the community has to
> decide on, w/o necessarily going over all the existing code and adding
> messages. Those can be added over time, by many people who'd like to get
> detailed information from Lucene.
>
> I hope my intentions are clearer now.
>
>
> On Fri, Dec 5, 2008 at 9:06 PM, Michael McCandless <
> [EMAIL PROTECTED]> wrote:
>
>>
>> I also feel that the primary usage of the internal messaging in Lucene
>> today is debugging, and we don't need a logging framework for that.
>>
>> Mike
>>
>>
>> Doug Cutting wrote:
>>
>>> The infoStream stuff goes back to 1997, before there was log4j or any
>>> other Java logging framework.
>>>
>>> There's never been a big push to add logging to Lucene.  It would add a
>>> dependency, and Lucene's jar has always been standalone, which is nice.
>>>  Dependencies can conflict.  If Lucene requires one version of a dependency,
>>> then it may not work well with code that requires a different version of that
>>> dependency.
>>>
>>> And it hasn't been clear which framework to adopt.  Log4j is the
>>> granddaddy, then there's Java logging and commons logging.  Today the
>>> preferred framework is probably SLF4J.  Good thing we didn't choose the
>>> wrong one years ago!
>>>
>>> And how many log entries would folks really want to see per query or
>>> document indexed?  In production I don't think most folks want to see more
>>> than one entry per query or document indexed.  So finer-grained logging
>>> would be for debugging.  For that one can instead use a debugger.  Hence the
>>> traditional lack of demand for detailed logging in Lucene.
>>>
>>> That's the history as I recall it.  The future is less clear.
>>>
>>> Doug
>>>
>>> Grant Ingersoll wrote:
>>>
>>>> I think the main motivation has always been to have no dependencies in
>>>> the core so as to keep it as fast and lightweight as possible.  Then, of
>>>> course, there are always the usual religious wars around which logging
>>>> framework to use, not to mention the nightmare that is trying to manage
>>>> multiple logging frameworks across several projects that are being
>>>> integrated.  Then, of course, there is the question of how useful any core
>>>> Lucene logs would be to users writing search applications.  For the most
>>>> part, my experience has been that I want logging to tell me when a document
>>>> was added, when searches occur, etc. but I don't necessarily need to know
>>>> things like the fact that Lucene is now entering the analysis phase of
>>>> Document inversion.  And, for all these needs, I can just as well do that
>>>> logging in the application and not in Lucene.
>>>> All that is not to say we couldn't add in logging, I'm just suggesting
>>>> reasons I can think of for why it has not been added to date and why I am
>>>> not sure it needs to be there going forward.  I believe various other
>>>> people have contributed reasons in the past.  I seem to recall Doug
>>>> spelling some out, but don't have the thread handy.
>>>> -Grant
>>>> On Dec 5, 2008, at 1:17 PM, Shai Erera wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> I was wondering why the Lucene code doesn't use Java logging, instead
>>>>> of the infoStream set in IndexWriter? Today, if I want to enable tracing
>>>>> of Lucene code, the only thing I can do is set an infoStream, but then I
>>>>> get many, many messages. Moreover, those messages seem to cover indexing
>>>>> code only.
>>>>>
>>>>> I hope to get some opinions on the use of Java logging instead of
>>>>> infoStream, and hopefully to start adding logging messages in other places
>>>>> in the code (like during search, query parsing etc.)
>>>>>
>>>>> I feel that this is an approach the community has to decide on before
>>>>> we start adding messages to the code. Using Java logging can greatly
>>>>> benefit tracing of indexing applications that use Lucene. If the vote is
>>>>> +1 for using Java logging, we can start by deprecating infoStream (in
>>>>> 2.9, remove in 3.0) and use logging instead.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> Shai
>>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>>> For additional commands, e-mail: [EMAIL PROTECTED]
>>>>
>>>
>
