[ 
https://issues.apache.org/jira/browse/LUCENE-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507088
 ] 

Doron Cohen commented on LUCENE-937:
------------------------------------

> Mark: I assumed that an AL would be faster as all of the data is guaranteed 
> contiguous

Only the pointers to the objects are contiguous, right? The tokens themselves 
are, well, wherever they are. With LinkedList, though, a new node object is 
created per element, holding the token and the pointers to the neighboring 
list members. So it is probably safe to say that if you can estimate the list 
size (avoiding array growth), AL is preferable as long as nothing is added or 
removed anywhere but at the end. 
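
To make the comparison concrete, here is a minimal sketch, assuming the token 
count can be estimated up front. The class name, the fill() helper and the 
expectedTokens value are made up for illustration; none of this comes from 
Lucene or the attached patch.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ListChoiceSketch {
    // Append only at the end, which is the access pattern discussed above.
    static List<String> fill(List<String> target, int count) {
        for (int i = 0; i < count; i++) {
            target.add("token" + i);
        }
        return target;
    }

    public static void main(String[] args) {
        int expectedTokens = 1000; // hypothetical size estimate

        // Pre-sized ArrayList: a single backing array of references,
        // so filling it up to the estimate never triggers a regrow.
        List<String> presized =
            fill(new ArrayList<String>(expectedTokens), expectedTokens);

        // LinkedList: every add() allocates an extra node object that
        // holds the element reference plus links to its neighbors.
        List<String> linked =
            fill(new LinkedList<String>(), expectedTokens);

        // Same contents either way; only the memory layout differs.
        System.out.println(presized.equals(linked));
    }
}
```

If the estimate is too low the ArrayList still works, it just falls back to 
growing its array.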

> Michael: (~)  LL iterator comparable to AL

That's a good point. I had the impression that AL is always simpler than LL 
and is preferable unless elements are added or removed somewhere other than 
at the end (that's why I excluded the NGramTokenFilters that use 
LL.removeFirst()). Now you're saying that with iteration (instead of direct 
access) LinkedList is supposed to be faster. Could be, since then there's no 
need to grow the array (however, you have more "pointers"). 

With this reasoning - 
  - CompoundFileWriter - using iterator, no direct access.
  - MultipleTermPositions - same.
  - DocumentWriter - same.
So I am no longer sure that these classes need to change.
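
The iterator versus direct access distinction can be sketched as below. This 
is only an illustration with made-up names (TraversalSketch and its two 
methods), not code taken from any of the classes listed above.

```java
import java.util.Iterator;
import java.util.List;

class TraversalSketch {
    // Direct access: get(i) is constant time on an ArrayList, but a
    // LinkedList has to walk its nodes to reach index i on every call.
    static int totalLengthByIndex(List<String> tokens) {
        int total = 0;
        for (int i = 0; i < tokens.size(); i++) {
            total += tokens.get(i).length();
        }
        return total;
    }

    // Iteration: each step just follows the next reference, so it is
    // constant time per element for both ArrayList and LinkedList.
    static int totalLengthByIterator(List<String> tokens) {
        int total = 0;
        for (Iterator<String> it = tokens.iterator(); it.hasNext();) {
            total += it.next().length();
        }
        return total;
    }
}
```

Both methods return the same result for any list; they differ only in how a 
LinkedList behaves under the indexed loop.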

---------------

In summary, since we can't assume the size can be estimated in advance, I 
think the best change would be, as Michael suggested, to use an Iterator in 
CachingTokenFilter. 
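
Roughly, the idea would look like the sketch below. It is a hypothetical 
stand-in that caches plain strings, assuming the tokens are read once and 
then replayed; it is not the real CachingTokenFilter and not the attached 
patch, and the names CachingReplaySketch, next() and reset() are only 
illustrative.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for a caching filter over a token source.
class CachingReplaySketch {
    private final Iterator<String> source;
    private List<String> cache;       // filled lazily on the first next()
    private Iterator<String> replay;  // walks the cache, no get(index)

    CachingReplaySketch(Iterator<String> source) {
        this.source = source;
    }

    // Returns the next cached token, or null once the cache is exhausted.
    String next() {
        if (cache == null) {
            // Size is unknown in advance, so just append to a LinkedList.
            cache = new LinkedList<String>();
            while (source.hasNext()) {
                cache.add(source.next());
            }
            replay = cache.iterator();
        }
        return replay.hasNext() ? replay.next() : null;
    }

    // Rewind so the cached tokens can be consumed again.
    void reset() {
        if (cache != null) {
            replay = cache.iterator();
        }
    }
}
```

Because the cache is only ever appended to and then walked with an Iterator, 
the list implementation could be swapped without touching the replay logic.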

> Make CachingTokenFilter faster
> ------------------------------
>
>                 Key: LUCENE-937
>                 URL: https://issues.apache.org/jira/browse/LUCENE-937
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: CachingTokenFilter.patch
>
>
> The wrong data structure was used for the CachingTokenFilter. It should be an 
> ArrayList rather than a LinkedList. There is a noticeable difference in speed.
