[ 
https://issues.apache.org/jira/browse/SOLR-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875045#comment-13875045
 ] 

Hoss Man commented on SOLR-5623:
--------------------------------

bq. In general whats happening here is not happening inside indexwriter, its 
happening in the analysis chain. I think solr or other applications is the 
right place to add additional debugging information (such as a unique ID for 
the document) because only it has that additional context to ensure its what is 
useful to get to the bottom.

Agreed ... but it would be nice if (in addition to application concepts like 
the uniqueKey of the doc) the exceptions could be annotated with information 
like what field name was associated with the runtime exception -- I don't think 
there's currently any way for code "above" IndexWriter to do that, is there?

The flip side though is that having this kind of logic in IndexWriter (or 
DocInverterPerField, or wherever under the covers) to wrap any arbitrary 
RuntimeException (maybe IllegalArgumentEx, maybe ArrayOutOfBounds, etc...) 
with some kind of generic LuceneAnalysisRuntimeException that contains a 
"getField" method seems like a really bad idea, since it would hide (via 
wrapping) the true underlying exception type.  We do this a lot in Solr since 
ultimately we're always going to need to propagate a SolrException with a 
status code to the remote client -- but I don't think anything else in Lucene 
Core wraps exceptions like this.
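
To make the type-hiding concern concrete, here is a minimal standalone sketch. Everything in it is hypothetical: the wrapper class does not exist in Lucene (the name is just the one floated above), and {{analyze}} is a stand-in for a misbehaving analysis component. It only shows that a caller catching the original exception type no longer matches once the exception has been wrapped.

```java
// Hypothetical wrapper: no such class exists in Lucene.
final class LuceneAnalysisRuntimeException extends RuntimeException {
  private final String field;

  LuceneAnalysisRuntimeException(String field, RuntimeException cause) {
    super("analysis failed for field '" + field + "'", cause);
    this.field = field;
  }

  String getField() {
    return field;
  }
}

final class WrapHidesTypeSketch {
  // Stand-in for an analysis component that throws at runtime.
  static void analyze(String field) {
    throw new IllegalArgumentException("bad token offsets");
  }

  // Same call, but wrapped to attach the field name.
  static void analyzeWrapped(String field) {
    try {
      analyze(field);
    } catch (RuntimeException e) {
      throw new LuceneAnalysisRuntimeException(field, e);
    }
  }

  // Reports which catch clause ended up handling the failure.
  static String whichHandler(boolean wrapped) {
    try {
      if (wrapped) {
        analyzeWrapped("title");
      } else {
        analyze("title");
      }
      return "none";
    } catch (IllegalArgumentException e) {
      return "IllegalArgumentException";
    } catch (RuntimeException e) {
      return e.getClass().getSimpleName();
    }
  }

  public static void main(String[] args) {
    System.out.println(whichHandler(false));
    System.out.println(whichHandler(true));
  }
}
```

The original exception is still reachable via {{getCause()}}, but any existing {{catch (IllegalArgumentException e)}} in calling code silently stops matching, which is the part that seems risky.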

I don't know of any sane way to deal with this kind of problem -- just pointing 
out that knowing the field name that caused the problem seems equally important 
to knowing the uniqueKey. (in case anybody else has any good ideas).

In any case, we can make progress on the fairly easy part: annotating with the 
uniqueKey in Solr...

Benson, comments on your current pull request:

* there are some cut/paste comments/javadocs in the test configs/classes that 
need to be corrected
* considering things like SOLR-4992, I don't think adding a "catch (Throwable 
t)" is a good idea ... I would constrain this to RuntimeException
* take a look at AddUpdateCommand.getPrintableId
* your try/catch/wrap block is only around one code path that calls 
IndexWriter.updateDocument\* ... there are others. The most 
straightforward/safe approach would probably be to refactor the entire 
{{addDoc(AddUpdateCommand)}} method along the lines of...{code}
  public int addDoc(AddUpdateCommand cmd) throws IOException {
    try {
      return addDocInternal(cmd);
    } catch (...) {
       ...
    }
  }
  // nocommit: javadocs as to purpose
  private int addDocInternal(AddUpdateCommand cmd) throws IOException {
    ...
  }
{code}
* this recipe is a bit cleaner for the type of assertion you are doing...{code}
  try {
    doSomethingThatShouldThrowAnException();
    fail("didn't get expected exception");
  } catch (ExpectedExceptionType e) {
    assertStuffAbout(e);
  }
{code}
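
Putting the last few bullets together, a self-contained sketch might look like the following. It deliberately does not use the real Solr classes: {{Cmd}} stands in for AddUpdateCommand (only a printable id is modeled), {{DocFailedException}} stands in for the SolrException that would really be thrown, and the message format is an assumption. The {{demo}} method follows the try/fail/catch recipe above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class AddDocSketch {
  // Stand-in for AddUpdateCommand; "poison" simulates a doc that breaks analysis.
  static final class Cmd {
    final String id;
    final boolean poison;

    Cmd(String id, boolean poison) {
      this.id = id;
      this.poison = poison;
    }

    String getPrintableId() {
      return id == null ? "(null)" : id;
    }
  }

  // Stand-in for the SolrException that Solr would propagate to the client.
  static final class DocFailedException extends RuntimeException {
    DocFailedException(String msg, RuntimeException cause) {
      super(msg, cause);
    }
  }

  // Single entry point: every code path goes through the same catch/wrap.
  public int addDoc(Cmd cmd) throws IOException {
    try {
      return addDocInternal(cmd);
    } catch (RuntimeException e) { // RuntimeException only, not Throwable
      throw new DocFailedException(
          "Exception writing document id " + cmd.getPrintableId(), e);
    }
  }

  // Holds what used to be the body of addDoc.
  private int addDocInternal(Cmd cmd) throws IOException {
    if (cmd.poison) {
      throw new IllegalStateException("analysis chain blew up");
    }
    return 1;
  }

  // Test recipe: call, fail if nothing was thrown, assert on the exception.
  static String demo() {
    AddDocSketch handler = new AddDocSketch();
    try {
      handler.addDoc(new Cmd("doc-42", true));
      throw new AssertionError("didn't get expected exception");
    } catch (DocFailedException e) {
      return e.getMessage() + " / cause=" + e.getCause().getClass().getSimpleName();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Because only RuntimeException is caught, IOExceptions and Errors (including the AssertionError used in place of {{fail()}} here) pass through untouched.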

> Better diagnosis of RuntimeExceptions in analysis
> -------------------------------------------------
>
>                 Key: SOLR-5623
>                 URL: https://issues.apache.org/jira/browse/SOLR-5623
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Benson Margulies
>
> If an analysis component (tokenizer, filter, etc) gets really into a hissy 
> fit and throws a RuntimeException, the resulting log traffic is less than 
> informative, lacking any pointer to the doc under discussion (in the doc 
> case). It would be more better if there was a catch/try shortstop that logged 
> this more informatively.


