TokenStream.close() is called multiple times per TokenStream instance
---------------------------------------------------------------------
Key: LUCENE-2145
URL: https://issues.apache.org/jira/browse/LUCENE-2145
Project: Lucene - Java
Issue Type: Bug
Components: Index, QueryParser
Affects Versions: 3.0, 2.9.1, 2.9
Environment: Solr 1.4.0
Reporter: KuroSaka TeruHiko
I have a Tokenizer that uses an external resource. I wrote this Tokenizer so
that the external resource is released in its close() method.
This should work because close() is supposed to be called only when the caller
is done with the TokenStream (Tokenizer is a subclass of TokenStream).
TokenStream's API documentation
<http://lucene.apache.org/java/2_9_1/api/core/org/apache/lucene/analysis/TokenStream.html>
states:
{noformat}
6. The consumer calls close() to release any resource when finished using the
TokenStream.
{noformat}
When I used my Tokenizer from Solr 1.4.0, it did not work as expected. An
error analysis suggests that an instance of my Tokenizer is used again after
close() has been called and the external resource has been released. After
further analysis, it seems that it is not Solr but Lucene itself that is
breaking the contract. This happens in two places.
src/java/org/apache/lucene/queryParser/QueryParser.java:
  protected Query getFieldQuery(String field, String queryText)
      throws ParseException {
    // Use the analyzer to get all the tokens, and then build a TermQuery,
    // PhraseQuery, or nothing based on the term count
    TokenStream source;
    try {
      source = analyzer.reusableTokenStream(field, new StringReader(queryText));
      source.reset();
    .
    .
    .
    try {
      // rewind the buffer stream
      buffer.reset();
      // close original stream - all tokens buffered
      source.close(); // <---- HERE
    }
src/java/org/apache/lucene/index/DocInverterPerField.java:
  public void processFields(final Fieldable[] fields,
                            final int count) throws IOException {
    ...
    } finally {
      stream.close();
    }
Calling close() is fine if the TokenStream is not a reusable one. But when it
is reusable, it might be used again, so the resource associated with the
TokenStream instance should not be released. close() should be called
selectively, only when the caller knows the instance is not going to be reused.
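
To make the failure mode concrete, here is a minimal plain-Java sketch of the
lifecycle described above. It does not use the Lucene API: ExternalResource,
EagerTokenizer, LazyTokenizer and ReuseAfterCloseDemo are hypothetical names,
and reset(String)/next() only loosely stand in for the real TokenStream
methods. EagerTokenizer models my Tokenizer (resource released in close(), so
reuse fails). LazyTokenizer shows one possible workaround on the Tokenizer
side, assuming reset() is always called before each reuse; it is not
necessarily how this issue should be fixed in Lucene.

```java
/** Hypothetical stand-in for the external resource the tokenizer depends on. */
final class ExternalResource {
    private boolean open = true;

    String[] split(String text) {
        if (!open) {
            throw new IllegalStateException("external resource already released");
        }
        return text.trim().split("\\s+");
    }

    void release() {
        open = false;
    }
}

/** Acquires the resource once and releases it in close(); reuse after close() fails. */
final class EagerTokenizer {
    private final ExternalResource resource = new ExternalResource();
    private String[] tokens = new String[0];
    private int pos;

    void reset(String text) {
        tokens = resource.split(text);
        pos = 0;
    }

    String next() {
        return pos < tokens.length ? tokens[pos++] : null;
    }

    void close() {
        resource.release();
    }
}

/** Reacquires the resource in reset() if an earlier close() released it. */
final class LazyTokenizer {
    private ExternalResource resource;
    private String[] tokens = new String[0];
    private int pos;

    void reset(String text) {
        if (resource == null) {
            resource = new ExternalResource();
        }
        tokens = resource.split(text);
        pos = 0;
    }

    String next() {
        return pos < tokens.length ? tokens[pos++] : null;
    }

    void close() {
        if (resource != null) {
            resource.release();
            resource = null;
        }
    }
}

final class ReuseAfterCloseDemo {
    public static void main(String[] args) {
        EagerTokenizer eager = new EagerTokenizer();
        eager.reset("first field");
        eager.close(); // the consumer closes the stream after the first use
        try {
            eager.reset("second field"); // the same instance is handed out again
        } catch (IllegalStateException e) {
            System.out.println("eager tokenizer failed on reuse: " + e.getMessage());
        }

        LazyTokenizer lazy = new LazyTokenizer();
        lazy.reset("first field");
        lazy.close();
        lazy.reset("second field"); // resource is reacquired here
        System.out.println("lazy tokenizer after reuse: " + lazy.next());
    }
}
```

The lazy variant only hides the problem from the Tokenizer author: the
resource is released and reacquired on every use, which defeats part of the
purpose of reusing the stream.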