[jira] [Commented] (OAK-4585) Text extraction: runtime status monitoring

Thomas Mueller (JIRA) Thu, 04 Aug 2016 01:14:19 -0700

    [ 
https://issues.apache.org/jira/browse/OAK-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407380#comment-15407380
 ]


Thomas Mueller commented on OAK-4585:
-------------------------------------

http://svn.apache.org/r1755145 (trunk)

Thanks Chetan. 

* Path: I now mainly log the path, because it is more useful than the content 
identity in the SegmentStore case. I still log the content identity in the 
FileDataStore case, so it should be easy to get hold of the binary, even if 
the repository is not available.
* Trace level: I still use debug level; I'm not sure why trace level would be 
better. I think people mainly use debug level when trying to find problems, and 
this doesn't seem to increase the log file size dramatically.
* Log time taken, source size, extracted text size: done
* By the way, there was a bug in the first patch: it used Long.parseLong 
instead of Long.getLong.
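
For readers unfamiliar with the difference between the two methods: 
Long.getLong looks up a system property by name and returns its numeric value 
(or a default), while Long.parseLong parses the given string itself as a 
number. The following is only a minimal sketch to illustrate that difference. 
The class name and the property name "example.smallBinary" are hypothetical 
and are not taken from the patch.

```java
public class LongLookupDemo {
    // Hypothetical property name, used only for this illustration.
    static final String PROP = "example.smallBinary";

    // Reads the system property named PROP; returns defaultValue if the
    // property is unset or its value is not a valid number.
    static long fromSystemProperty(long defaultValue) {
        return Long.getLong(PROP, defaultValue);
    }

    // Parses the text itself as a number; throws NumberFormatException
    // if the text is not numeric (for example, a property name).
    static long fromText(String text) {
        return Long.parseLong(text);
    }

    public static void main(String[] args) {
        System.setProperty(PROP, "16384");
        System.out.println("getLong: " + fromSystemProperty(1024));
        try {
            fromText(PROP); // passes the property *name*, not its value
        } catch (NumberFormatException e) {
            System.out.println("parseLong failed: " + e.getMessage());
        }
    }
}
```

Passing a property name to Long.parseLong therefore fails at runtime rather 
than reading the configured value.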


> Text extraction: runtime status monitoring
> ------------------------------------------
>
>                 Key: OAK-4585
>                 URL: https://issues.apache.org/jira/browse/OAK-4585
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>             Fix For: 1.4.6, 1.5.7
>
>
> Text extraction is sometimes slow, and, in case of a bug in the text 
> extraction library, can even get stuck in an endless loop.
> Right now, it is not easy to understand what is going on, even when looking 
> at full thread dumps. (Debug) log information about the current state of text 
> extraction would be nice as well.
> I suggest we add debug level logging for the binary currently being 
> extracted (content identity). For larger binaries, we can also temporarily 
> set the thread name (append "Extracting <contentIdentity>"). That way, it is 
> relatively easy to see if text extraction is stuck simply by looking at full 
> thread dumps, without having to change the log level and then reindex.
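
The thread-name idea from the description can be sketched roughly as follows. 
This is an illustration under stated assumptions, not the committed code: the 
class name, the method name, and the LARGE_BINARY threshold are hypothetical, 
and the actual extraction call is represented by a plain Supplier.

```java
import java.util.function.Supplier;

public class ThreadNameDemo {
    // Hypothetical threshold; the issue only says "larger binaries".
    static final long LARGE_BINARY = 16 * 1024;

    // Runs the extraction. For large binaries, the content identity is
    // appended to the current thread's name so that it shows up in thread
    // dumps; the original name is always restored afterwards.
    static <T> T extractWithThreadName(String contentIdentity, long size,
                                       Supplier<T> extraction) {
        if (size < LARGE_BINARY) {
            return extraction.get();
        }
        Thread current = Thread.currentThread();
        String oldName = current.getName();
        current.setName(oldName + ": Extracting " + contentIdentity);
        try {
            return extraction.get();
        } finally {
            // Restore the name even if extraction throws.
            current.setName(oldName);
        }
    }

    public static void main(String[] args) {
        String during = extractWithThreadName("abc123", 1024 * 1024,
                () -> Thread.currentThread().getName());
        System.out.println("during: " + during);
        System.out.println("after:  " + Thread.currentThread().getName());
    }
}
```

Restoring the name in a finally block matters here because the thread is 
typically reused for other work after extraction finishes or fails.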



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
