[ https://issues.apache.org/jira/browse/LUCENE-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892316#action_12892316 ]
Robert Muir commented on LUCENE-2564:
-------------------------------------
There are more problems with this loader: it uses FileReader, which
decodes files with the platform-dependent default encoding.
I think we should also break backwards compatibility here and make it
default to UTF-8.
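
To illustrate the encoding point, here is a minimal sketch of reading a word list with an explicit UTF-8 decoder rather than FileReader. It is not the WordListLoader code: the class name Utf8WordListSketch and the method loadWords are hypothetical, the one-word-per-line format is assumed, and it relies on java.nio.file and StandardCharsets (Java 7+), which are newer than the JDK these releases target.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
class Utf8WordListSketch {

    /** Reads one word per line, always decoding the file as UTF-8. */
    public static Set<String> loadWords(Path file) throws IOException {
        Set<String> words = new HashSet<>();
        // InputStreamReader with an explicit charset avoids the platform
        // default that FileReader would pick up.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {   // skip blank lines
                    words.add(word);
                }
            }
        }
        return words;
    }
}
```

With the charset fixed in code, the same file yields the same words regardless of the JVM's default encoding.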
> wordlistloader is inefficient
> -----------------------------
>
> Key: LUCENE-2564
> URL: https://issues.apache.org/jira/browse/LUCENE-2564
> Project: Lucene - Java
> Issue Type: Bug
> Components: contrib/analyzers
> Reporter: Robert Muir
> Assignee: Robert Muir
> Fix For: 3.1, 4.0
>
>
> WordListLoader is basically used for loading stopword lists, stem
> dictionaries, etc.
> Unfortunately the API returns Set<String>, and sometimes even HashSet<String>
> or HashMap<String,String>.
> I think we should break it and return CharArraySets and CharArrayMaps (but
> leave the declared return types as the generic Set and Map).
> If someone objects to breaking it in 3.1, then we can do this only in 4.0,
> but I think it would be good to fix it in both places.
> The reason is that if someone calls new FooAnalyzer() a lot (probably not
> uncommon), I think it's doing a bunch of useless copying.
> I think we should slap @lucene.internal on this API too, since that's mostly
> how it's being used.
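
The copying concern in the quoted description can be sketched with plain JDK types. This is an illustration under assumptions, not Lucene code: SpecializedSet is a hypothetical stand-in for a type like CharArraySet, and asSpecialized is a hypothetical consumer that copies only when it is handed some other kind of Set.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration, not part of Lucene.
class CopyAvoidanceSketch {

    /** Stand-in for a specialized set type such as CharArraySet. */
    static final class SpecializedSet extends HashSet<String> {
        SpecializedSet(Set<String> source) {
            super(source);
        }
    }

    /** Reuses the input if it is already specialized; otherwise copies it. */
    static SpecializedSet asSpecialized(Set<String> words) {
        if (words instanceof SpecializedSet) {
            return (SpecializedSet) words;   // no copy needed
        }
        return new SpecializedSet(words);    // O(n) copy on every call
    }
}
```

If the loader hands back the specialized type (still declared as Set), a consumer written this way takes the first branch every time an analyzer is constructed, instead of rebuilding the set.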