[
https://issues.apache.org/jira/browse/LUCENE-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2154:
----------------------------------
Attachment: LUCENE-2154.patch
Here is a first patch for cglib-generated proxy attributes.
On IRC yesterday we found out that the proposed idea of sharing the attributes
across all Multi*Enums would cause problems: a call to next() on any sub-enum
would overwrite the attribute contents of the previous sub-enum, which would
break TermsEnum. For example, TermsEnum looks ahead by calling next() on all
sub-enums and choosing the lowest term to return. After calling each enum's
next(), the attributes of the first enum cannot be restored without
captureState & co, because they were overwritten by the next() call on the
last enum.
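The overwrite problem described above can be illustrated with a small, self-contained sketch. It does not use Lucene classes: SharedTermAttr, SubEnum and SharedAttrDemo are hypothetical stand-ins, and the only assumption is that each sub-enum writes its current term into the attribute instance it was given.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for an attribute implementation; not Lucene's API.
final class SharedTermAttr {
    String term;
}

// Hypothetical sub-enum: every next() writes the new term into the attribute it was given.
final class SubEnum {
    private final Iterator<String> terms;
    private final SharedTermAttr attr;

    SubEnum(List<String> terms, SharedTermAttr attr) {
        this.terms = terms.iterator();
        this.attr = attr;
    }

    boolean next() {
        if (!terms.hasNext()) {
            return false;
        }
        attr.term = terms.next();
        return true;
    }
}

final class SharedAttrDemo {
    // Returns what the shared attribute holds after the merge step has
    // looked ahead on both sub-enums.
    static String termAfterLookahead() {
        SharedTermAttr shared = new SharedTermAttr();
        SubEnum first = new SubEnum(Arrays.asList("apple", "cherry"), shared);
        SubEnum second = new SubEnum(Arrays.asList("banana", "date"), shared);
        first.next();   // shared.term is now "apple"
        second.next();  // shared.term is now "banana"; "apple" is gone
        return shared.term;
    }

    public static void main(String[] args) {
        // The merge would want to return "apple" (the lowest term),
        // but the shared attribute no longer holds it.
        System.out.println(termAfterLookahead());
    }
}
```

In this sketch the merging enum would have to capture and restore state after every look-ahead call, which is exactly what per-sub-enum attributes plus a proxy view try to avoid.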
This patch needs cglib-nodep-2.2.jar put into the lib-folder of the checkout
[http://sourceforge.net/projects/cglib/files/cglib2/2.2/cglib-nodep-2.2.jar/download].
It contains a test that shows the usage. The central part is cglib's
Enhancer that creates a dynamic class extending ProxyAttributeImpl (which
defines the general AttributeImpl methods delegating to the delegate) and
implementing the requested Attribute interface using a MethodInterceptor.
Please note: This uses no reflection at call time (reflection is used only
during in-memory class file creation, which runs only once, when "loading" the
proxy class). The proxy implements MethodInterceptor and uses the fast
MethodProxy class (which cglib also generates for each proxied method) and can
invoke the delegated method directly (without reflection) on the delegate.
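To make the delegation idea concrete without requiring the cglib jar, here is a rough sketch that swaps in the JDK's java.lang.reflect.Proxy for cglib's Enhancer. This is a different technique from the patch: it calls the delegate reflectively, which is precisely what MethodProxy avoids. TermAttr, SimpleTermAttr, DelegatingHandler and ProxyDemo are hypothetical names, and the switchable delegate is an assumption about how such a proxy would be used.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical attribute interface, standing in for something like TermAttribute.
interface TermAttr {
    String term();
    void setTerm(String term);
}

// Hypothetical plain implementation owned by one sub-enum.
final class SimpleTermAttr implements TermAttr {
    private String term;
    public String term() { return term; }
    public void setTerm(String term) { this.term = term; }
}

// Forwards every interface call to whichever delegate is currently set.
final class DelegatingHandler implements InvocationHandler {
    private Object delegate;

    void setDelegate(Object delegate) { this.delegate = delegate; }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        try {
            // Reflective call; cglib's MethodProxy would call the delegate directly.
            return method.invoke(delegate, args);
        } catch (InvocationTargetException e) {
            throw e.getCause();
        }
    }
}

final class ProxyDemo {
    // Reads through one proxy instance while the delegate is switched underneath it.
    static String[] readThroughProxy() {
        SimpleTermAttr a = new SimpleTermAttr();
        a.setTerm("apple");
        SimpleTermAttr b = new SimpleTermAttr();
        b.setTerm("banana");

        DelegatingHandler handler = new DelegatingHandler();
        TermAttr view = (TermAttr) Proxy.newProxyInstance(
                TermAttr.class.getClassLoader(),
                new Class<?>[] { TermAttr.class },
                handler);

        handler.setDelegate(a);
        String first = view.term();
        handler.setDelegate(b);
        String second = view.term();
        return new String[] { first, second };
    }

    public static void main(String[] args) {
        for (String s : readThroughProxy()) {
            System.out.println(s);
        }
    }
}
```

The consumer holds a single attribute reference (view), while each sub-enum keeps its own attribute instance, so a look-ahead next() on one sub-enum no longer clobbers another's state.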
The test verifies that everything works and also compares speed by using a
TermAttribute natively and proxied. The proxied variant is slower (not because
of reflection, but because the MethodInterceptor creates an array of parameters
and boxes/unboxes primitive parameters into the Object[]), but for the test
case I have seen only about 50% more time needed.
The generated classes are cached and reused (like DEFAULT_ATTRIBUTE_FACTORY
does).
To get maximum speed and no external libraries, the code implemented by
Enhancer can be rewritten natively using the Apache Harmony
java.lang.reflect.Proxy implementation source code as basis. The hardest part
in generating bytecode is the ConstantPool in class files. But as the proxy
methods are simply delegating and no magic like boxing/unboxing is needed, the
generated bytecode is rather simple.
One other use case for these proxies is AppendingTokenStream, which has not
been possible since 3.0 without captureState (in the old TS API it was
possible, because you could reuse the same TokenInstance even over the appended
streams). In the new TS API, the appending stream must have a "view" on the
attributes of the sub-stream currently being consumed.
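A minimal sketch of such a "view", written by hand rather than generated, might look like the following. It is not the patch's code and does not use the Lucene TokenStream API: TextAttr, SubStream, AppendingStream and AppendingDemo are hypothetical, and each sub-stream is assumed to own its attribute. The text() method in AppendingStream is the kind of plain delegating method a generated proxy class would contain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical attribute interface.
interface TextAttr {
    String text();
}

// Hypothetical sub-stream that owns its attribute state.
final class SubStream implements TextAttr {
    private final Iterator<String> tokens;
    private String current;

    SubStream(String... tokens) {
        this.tokens = Arrays.asList(tokens).iterator();
    }

    boolean incrementToken() {
        if (!tokens.hasNext()) {
            return false;
        }
        current = tokens.next();
        return true;
    }

    public String text() { return current; }
}

// Appends sub-streams and exposes a view on the attribute of the current one.
final class AppendingStream implements TextAttr {
    private final Iterator<SubStream> streams;
    private SubStream current; // null until the first incrementToken() call

    AppendingStream(SubStream... streams) {
        this.streams = Arrays.asList(streams).iterator();
    }

    boolean incrementToken() {
        while (true) {
            if (current != null && current.incrementToken()) {
                return true;
            }
            if (!streams.hasNext()) {
                return false;
            }
            current = streams.next(); // switch the delegate to the next sub-stream
        }
    }

    // Plain delegation to the current sub-stream; no state is copied.
    public String text() { return current.text(); }
}

final class AppendingDemo {
    static List<String> consume() {
        AppendingStream appending =
                new AppendingStream(new SubStream("a", "b"), new SubStream("c"));
        TextAttr view = appending; // the consumer fetches the attribute once
        List<String> seen = new ArrayList<>();
        while (appending.incrementToken()) {
            seen.add(view.text());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(consume());
    }
}
```

The consumer never learns that the underlying sub-stream changed, and no captureState-style copying is involved.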
> Need a clean way for Dir/MultiReader to "merge" the AttributeSources of the
> sub-readers
> ---------------------------------------------------------------------------------------
>
> Key: LUCENE-2154
> URL: https://issues.apache.org/jira/browse/LUCENE-2154
> Project: Lucene - Java
> Issue Type: Bug
> Components: Index
> Affects Versions: Flex Branch
> Reporter: Michael McCandless
> Fix For: Flex Branch
>
> Attachments: LUCENE-2154.patch
>
>
> The flex API allows extensibility at the Fields/Terms/Docs/PositionsEnum
> levels, for a codec to set custom attrs.
> But, it's currently broken for Dir/MultiReader, which must somehow share
> attrs across all the sub-readers. Somehow we must make a single attr source,
> and tell each sub-reader's enum to use that instead of creating its own.
> Hopefully Uwe can work some magic here :)