[jira] [Commented] (LUCENE-6212) Remove IndexWriter's per-document analyzer add/updateDocument APIs

Hoss Man (JIRA) Tue, 14 Jul 2015 15:27:37 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627163#comment-14627163 ]


Hoss Man commented on LUCENE-6212:
----------------------------------

bq. Since Lucene is a library for developers and it's not an "end user product" 
I would prefer it could give me a bit more flexibility.

Unless I'm misunderstanding the context of your concern, total flexibility 
in the terms indexed is still available, because you can index Documents 
containing IndexableFields that produce whatever TokenStream you want -- 
ignoring the Analyzer specified on the IndexWriter if you so choose.  

What this change did is make "the uncommon and easy to mess up case" (asking 
IndexWriter to analyze your text using a different analyzer for each doc) 
impossible -- but meanwhile "the simple common case" (same analyzer for all 
docs) and "the expert level case" (I want to produce an arbitrary set of terms 
for each field and each document) are both still possible and easy.
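
As a rough illustration of the "simple common case" and the "expert level 
case" side by side, here is a minimal sketch. It assumes Lucene 5.x with the 
analyzers-common module on the classpath; the class name, field names, sample 
text and the choice of StandardAnalyzer/WhitespaceAnalyzer are made up for the 
example and are not taken from the patch.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

// Hypothetical example class; assumes Lucene 5.x APIs.
public class PerFieldTokenStreamSketch {

    /** Indexes one document and returns the number of documents in the index. */
    public static int indexOneDocument() throws Exception {
        Directory dir = new RAMDirectory();
        // Analyzer configured once on the IndexWriter (the common case).
        Analyzer writerAnalyzer = new StandardAnalyzer();
        // A different analyzer, used only to build a TokenStream by hand.
        Analyzer bodyAnalyzer = new WhitespaceAnalyzer();
        try {
            try (IndexWriter writer =
                     new IndexWriter(dir, new IndexWriterConfig(writerAnalyzer))) {
                Document doc = new Document();

                // Common case: plain text, analyzed by the writer's analyzer.
                doc.add(new TextField("title", "Hello Lucene", Field.Store.NO));

                // Expert case: the field carries its own TokenStream, so the
                // writer's analyzer is not consulted for this field.
                TokenStream bodyTokens =
                    bodyAnalyzer.tokenStream("body", "Foo-Bar BAZ");
                doc.add(new TextField("body", bodyTokens));

                // IndexWriter consumes (resets, iterates, closes) the stream.
                writer.addDocument(doc);
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                return reader.numDocs();
            }
        } finally {
            bodyAnalyzer.close();
            writerAnalyzer.close();
            dir.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("docs indexed: " + indexOneDocument());
    }
}
```

The caveat from the issue description still applies to the hand-built field: 
whatever produces the "body" tokens at index time has to be matched by the 
analysis you apply to "body" queries at search time.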

----

In any event -- trying to have a discussion about this in the comments of a 
Jira that's been closed for several months is a really bad idea. If you have 
questions/concerns about how to use the API, or how to upgrade your existing 
code, please address those to the java-user@lucene list, where the entire 
community can help you (not just the handful of devs watching every Jira issue).

> Remove IndexWriter's per-document analyzer add/updateDocument APIs
> ------------------------------------------------------------------
>
>                 Key: LUCENE-6212
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6212
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.0, 5.1, Trunk
>
>         Attachments: LUCENE-6212.patch
>
>
> IndexWriter already takes an analyzer up-front (via
> IndexWriterConfig), but it also allows you to specify a different one
> for each add/updateDocument.
> I think this is quite dangerous/trappy since it means you can easily
> index tokens for that document that don't match at search-time based
> on the search-time analyzer.
> I think we should remove this trap in 5.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
