[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-14 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455856#comment-13455856
 ] 

Jack Krupansky commented on LUCENE-4369:


At this stage of the discussion, is there any intention that the replacement 
for "string" fields will permit analysis, at least CharFilter analysis? After 
all, that is one of the main reasons people get confused. I'm okay with 
"ExactTextField" except that character filtering would be really nice and avoid 
confusion for Solr users. Of course, it would be ironic to call it "exact text" 
when it needs to be filtered.

OTOH, at the Lucene level, especially the Lucene query parser, I can see why 
you would want the "string" field to prevent analysis - because there is no 
field-specific analysis, just one analyzer for all fields. Hmmm... maybe that 
should be proposed for the Lucene query parser to sidestep that particular 
rationale for wanting strings to be unanalyzed - provide a map of 
field-specific analyzers.
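
A rough sketch of what "a map of field-specific analyzers" could look like, 
as a toy model in plain Java. Nothing here is a Lucene API: the class name 
PerFieldAnalysisSketch, the two stand-in analyzers, and the field names "id" 
and "body" are all hypothetical, chosen only to illustrate the idea. (Stefan's 
comment in this thread mentions PerFieldAnalyzerWrapper on the IndexWriter 
side.)

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Toy model only: none of these names are Lucene classes.
final class PerFieldAnalysisSketch {
    // Stand-in for a tokenizing analyzer: lowercase, then split on whitespace.
    static List<String> wordAnalyzer(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stand-in for "keyword" handling: the whole value is a single token.
    static List<String> keywordAnalyzer(String text) {
        return List.of(text);
    }

    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();
    private final Function<String, List<String>> fallback;

    PerFieldAnalysisSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Registers an analyzer for one field; returns this for chaining.
    PerFieldAnalysisSketch register(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
        return this;
    }

    // Uses the field's own analyzer if one is registered, else the fallback.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, fallback).apply(text);
    }

    public static void main(String[] args) {
        PerFieldAnalysisSketch analysis =
            new PerFieldAnalysisSketch(PerFieldAnalysisSketch::wordAnalyzer)
                .register("id", PerFieldAnalysisSketch::keywordAnalyzer);
        System.out.println(analysis.analyze("id", "Doc 42"));   // [Doc 42]
        System.out.println(analysis.analyze("body", "Doc 42")); // [doc, 42]
    }
}
```

With a lookup like this on the query side, a "string"-style field would no 
longer need to bypass analysis; it would simply be mapped to a keyword-style 
analyzer.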

At the Solr schema level, we could simply have "string" be a TextField with 
only KeywordTokenizer and then users can copy and/or customize as they wish.

This raises the question of how or whether the Solr schema side of the house will 
rename the "string" field type, or keep it as string and simply change the 
StrField class name.


> StringFields name is unintuitive and not helpful
> 
>
> Key: LUCENE-4369
> URL: https://issues.apache.org/jira/browse/LUCENE-4369
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Robert Muir
> Attachments: LUCENE-4369.patch
>
>
> There's a huge difference between TextField and StringField, StringField 
> screws up scoring and bypasses your Analyzer.
> (see java-user thread "Custom Analyzer Not Called When Indexing" as an 
> example.)
> The name we use here is vital, otherwise people will get bad results.
> I think we should rename StringField to MatchOnlyField.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-14 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455825#comment-13455825
 ] 

Michael McCandless commented on LUCENE-4369:


I prefer ExactTextField over Untokenized (or UnAnalyzed) Field,
because that name matches the typical use-case of this field: you want
to index exactly the text value so you can later retrieve by that
value.

Yes, the field is untokenized, but this is something of an
implementation detail: that's just how it achieves exact matching.
And it's only one of the things it does (it also turns off norms, sets
DOCS_ONLY).

In general I think we should name things according to how they are
most likely to be used, not according to how they are implemented.

The goal here isn't to find a name that everybody loves ... only to
find one that nobody hates ... and I think ExactTextField is a big
improvement over StringField.





[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-13 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455149#comment-13455149
 ] 

Robert Muir commented on LUCENE-4369:
-

Stefan, I don't really like that, because we want to make it easy for users who
use QueryParser (like some huge % of users) to analyze the same way at query-time.

The way people have always done this with Lucene is to pass the same Analyzer at
index-time and query-time.

If they use StringField, it breaks that! That's why users get confused and that's
why I opened this issue.
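
To make the mismatch concrete, here is a toy model in plain Java. It is not 
Lucene code and does not reproduce Lucene's real matching rules; the class 
AnalyzerMismatchSketch and its methods are hypothetical stand-ins. It assumes 
an analyzer that lowercases and splits on whitespace, and a query side that 
always analyzes the query text.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model only: these are not Lucene classes or Lucene's matching rules.
final class AnalyzerMismatchSketch {
    // Stand-in for the single Analyzer passed at index time and query time.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // TextField-like: the indexed terms are the analyzed tokens.
    static Set<String> indexAnalyzed(String value) {
        return new HashSet<>(analyze(value));
    }

    // StringField-like: the analyzer is bypassed; the whole value is one term.
    static Set<String> indexVerbatim(String value) {
        Set<String> terms = new HashSet<>();
        terms.add(value);
        return terms;
    }

    // Query-side stand-in: analyze the query text, then require every
    // resulting token to be present among the indexed terms.
    static boolean matches(Set<String> indexedTerms, String queryText) {
        List<String> tokens = analyze(queryText);
        return !tokens.isEmpty() && indexedTerms.containsAll(tokens);
    }

    public static void main(String[] args) {
        // Same analysis on both sides: the document is found.
        System.out.println(matches(indexAnalyzed("New York"), "New York")); // true
        // Analyzer bypassed at index time only: the same query finds nothing.
        System.out.println(matches(indexVerbatim("New York"), "New York")); // false
    }
}
```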




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-13 Thread Stefan Trcek (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455134#comment-13455134
 ] 

Stefan Trcek commented on LUCENE-4369:
--

This is not just a rename, and I don't know if it is viable:

The idea is: you start thinking about analysis when adding fields
for some purpose, not when creating an IndexWriter. And the way it is done 
is tied to the field.

How about dropping the Analyzer from IndexWriter/Config
and adding all analysis information to Field, something like

new TextField(...) // as keyword
new TextField(..., Analyzer, AnalyzingMode) // analyzed

or better

new TextField(..., AnalyzingMode.AS_IS) // as keyword
new TextField(..., new AnalyzingMode(Analyzer, ...)) // analyzed
new TextField(..., AnalyzingMode.STANDARD) // sugar

Then in the public API for IndexWriter there may be no need to use
- PerFieldAnalyzerWrapper
- Field.Index.NO
- KeywordAnalyzer

This also answers the not-so-easy question of why and how to construct a
(field-aware) analyzer as a parameter for IndexWriter/Config.





[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454254#comment-13454254
 ] 

Hoss Man commented on LUCENE-4369:
--

bq. the mailing list thread referenced from there is in my opinion unrelated.

Did you read the whole thread?  It's littered with comments about confusion 
over how UN_TOKENIZED related to the Analyzer configured on the 
IndexWriter -- some people thought it meant the *tokenizer* in the Analyzer 
wouldn't be used, but the rest of their analyzer would.  It's very 
representative of lots of other threads I'd seen over the years.

bq. I disagree when we're talking about Solr users who are just using the 
schema.xml file

I don't think anyone is talking about changing solr.StrField and solr.TextField 
-- this issue is about the convenience subclasses of oal.document.Field...

https://lucene.apache.org/core/4_0_0-BETA/core/org/apache/lucene/document/Field.html






[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454252#comment-13454252
 ] 

Steven Rowe commented on LUCENE-4369:
-

bq. I never understood the difference and why this was renamed in 2.4. For me 
the issue explains nothing and the mailing list thread referenced from there is 
in my opinion unrelated.

Yeah, no.  Totally related, see e.g. 





[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454242#comment-13454242
 ] 

Erick Erickson commented on LUCENE-4369:


Shai:

bq: ...I don't think we should underestimate Lucene users to the point that 
they don't understand what an Analyzer...

I absolutely agree with you about _Lucene_ users, but I disagree when we're 
talking about _Solr_ users who are just using the schema.xml file. I flat 
guarantee that they don't always look under the covers. I've seen way more than 
one site with "solr rocks" as the first/newSearchers.

But all that said, I'm not doing the work so whatever gets chosen is fine with 
me.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454236#comment-13454236
 ] 

Uwe Schindler commented on LUCENE-4369:
---

I never understood the difference and why this was renamed in 2.4. For me the 
issue explains nothing and the mailing list thread referenced from there is in 
my opinion unrelated.

I am also fine with replacing tokenized with analyzed.

Inert question: why is it called Tokenizer and not Analyzerator?




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454205#comment-13454205
 ] 

Shai Erera commented on LUCENE-4369:


Great, then do we have a winner? :)




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454204#comment-13454204
 ] 

Hoss Man commented on LUCENE-4369:
--

Didn't we specifically get rid of the enums called TOKENIZED and UN_TOKENIZED 
because they conflated the concept of tokenization with analysis?  Weren't 
there users who wanted "keyword" tokenization combined with other token filters 
who thought UN_TOKENIZED was what they wanted?

Perhaps TextField should be renamed AnalyzedTextField and StringField should be 
NonAnalyzedTextField ?
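
The distinction between "keyword tokenization plus filters" and "not analyzed 
at all" can be sketched with a toy model in plain Java. The class 
KeywordVsUnanalyzedSketch and its methods are hypothetical and are not Lucene 
APIs; lowercasing stands in for whatever token filters a user might chain 
after a keyword-style tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model only: these are not Lucene's tokenizer or filter classes.
final class KeywordVsUnanalyzedSketch {
    // Keyword-style tokenization: the entire value becomes one token.
    static List<String> keywordTokenize(String value) {
        return List.of(value);
    }

    // A stand-in token filter that lowercases every token it receives.
    static List<String> lowercaseFilter(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // What those users wanted: one token, but still passed through filters.
    static List<String> keywordThenLowercase(String value) {
        return lowercaseFilter(keywordTokenize(value));
    }

    // What an un-analyzed field gives: the analysis chain never runs.
    static List<String> unanalyzed(String value) {
        return List.of(value);
    }

    public static void main(String[] args) {
        System.out.println(keywordThenLowercase("Foo Bar")); // [foo bar]
        System.out.println(unanalyzed("Foo Bar"));           // [Foo Bar]
    }
}
```

Both paths produce a single term, which is why the two are easy to confuse; 
only the first one applies the rest of the chain.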




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454200#comment-13454200
 ] 

Robert Muir commented on LUCENE-4369:
-

I am +1 for UntokenizedField too. This is much more intuitive than StringField!




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454194#comment-13454194
 ] 

Shai Erera commented on LUCENE-4369:


bq. I would like UntokenizedField

+1 for that. I don't think we should underestimate Lucene users to the point 
that they don't understand what an Analyzer is, or tokenization means. When 
they create IWC, they need to specify an Analyzer. I think, seriously, that 
Analyzer is as basic as Document.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454186#comment-13454186
 ] 

Steven Rowe commented on LUCENE-4369:
-

Some more choices: AsIsTextField, IntactTextField, UnSoiledTextField, 
HalfCaffLatteField




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454175#comment-13454175
 ] 

Uwe Schindler commented on LUCENE-4369:
---

WholeTextField sounds like Starbucks...

I would like UntokenizedField.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454172#comment-13454172
 ] 

Robert Muir commented on LUCENE-4369:
-

OK, just a few downsides of 'whole': 
* it seems similar to full, like full-text field. But StringField is not that.
* then what is TextField, only partial?

Guys, I realistically don't think we are going to come up with a perfect name 
here that everyone likes.

But I think enough people agree that StringField is bad.

I seriously propose ASDFGHIJField in the interim, we gotta make some 
incremental progress.





[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-12 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454167#comment-13454167
 ] 

Robert Muir commented on LUCENE-4369:
-

How about WholeTextField? That's fine with me. Does anyone object?




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453317#comment-13453317
 ] 

Steven Rowe commented on LUCENE-4369:
-

Serious suggestion: WholeTextField

(Following the raw/cooked food metaphor used in various computational contexts 
- "whole food" means unprocessed.)

I like ExactTextField too, but it's missing the beginning and end anchors: the 
intent is "exactly this search string", but it doesn't necessarily imply "and 
nothing else".  E.g. would a user armed only with the name assume that an 
ExactTextField query string "two three" would not match an indexed string "one 
two three four"?
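
The "anchors" point can be illustrated with a small toy model in plain Java. 
WholeValueMatchSketch is a hypothetical name and this is not how Lucene 
evaluates queries; it only contrasts matching the whole stored value with 
matching a run of tokens inside it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: not Lucene's query evaluation.
final class WholeValueMatchSketch {
    // Un-analyzed field behavior: the query must equal the entire value,
    // as if anchored at both the beginning and the end.
    static boolean wholeValueMatch(String indexedValue, String query) {
        return indexedValue.equals(query);
    }

    // Tokenized behavior, roughly phrase-like: the query's tokens must
    // appear as a contiguous run somewhere inside the value's tokens.
    static boolean phraseMatch(String indexedValue, String query) {
        List<String> indexed = Arrays.asList(indexedValue.split(" "));
        List<String> wanted = Arrays.asList(query.split(" "));
        return Collections.indexOfSubList(indexed, wanted) >= 0;
    }

    public static void main(String[] args) {
        String indexed = "one two three four";
        System.out.println(wholeValueMatch(indexed, "two three"));          // false
        System.out.println(phraseMatch(indexed, "two three"));              // true
        System.out.println(wholeValueMatch(indexed, "one two three four")); // true
    }
}
```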




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453174#comment-13453174
 ] 

Michael McCandless commented on LUCENE-4369:


I think it's useful to have a dedicated sugar field for things like primary 
keys, URLs, enumerated fields ("country", "state", "zip code"), entitlements 
fields (ACLs), tags, etc., and when users do this directly today I suspect they 
often forget to disable norms and index with docs-only.

But I agree the name is trappy now.

+1 for ExactTextField.  I don't really like "raw": it sounds too ... low level. 
 Like it's not even gonna be indexed or something.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453159#comment-13453159
 ] 

Robert Muir commented on LUCENE-4369:
-

{quote}
OK, an idea out of left field, why do we have a "string" as a type anyway? Does 
it make any sense to just remove it and have people use KeywordTokenizer when 
they want this behavior? I'm ready for this idea to be shot down in flames 

{quote}

I've said the same thing before, but I figure I won't get consensus for that. 

I'm happy to just get the name to be anything but String for now :)

It's still screwed up that there are things like setBoost() at all on StringField 
when it omits norms etc.,
and screwed up that it bypasses the Analyzer (the classic NOT_ANALYZED 
problem), but
fixing the name would at least help.





[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453151#comment-13453151
 ] 

Erick Erickson commented on LUCENE-4369:


Anything with "Raw" is good. The problem with "Keyword" or "Untokenized" or 
"Unanalyzed" in the name is that it rather assumes that the user is familiar 
with what those terms mean in Lucene. If they're experienced enough to 
understand _that_, they're less likely to fall into this error in the first 
place.

We could do something that removes it from consideration unless people dig. I 
understand it's a general field, but how about something like "Identifier" (I'm 
not too keen on that name actually). I'm reaching for something that is 
"naturally" thought of as a type suitable for  fields but requires 
one to dig a bit before using it for other fields.

OK, an idea out of left field, why do we have a "string" as a type anyway? Does 
it make any sense to just remove it and have people use KeywordTokenizer when 
they want this behavior? I'm ready for _this_ idea to be shot down in flames 


I suppose in the Solr world, we could just remove the "string" type from 
schema.xml and provide an example  that was only KeywordTokenized and 
avoid a world of confusion for many users.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453066#comment-13453066
 ] 

Steven Rowe commented on LUCENE-4369:
-

AuNaturelTextField




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453036#comment-13453036
 ] 

Robert Muir commented on LUCENE-4369:
-

{quote}
"Raw" is a good term, too.
{quote}

+1, let's think about that.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453022#comment-13453022
 ] 

Uwe Schindler commented on LUCENE-4369:
---

Thanks Jack, that is exactly my opinion too: we just need good names. I like 
yours, too. "Raw" is a good term as well. The MatchOnly or ExactMatch terms are 
in my opinion not very good, sorry.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452998#comment-13452998
 ] 

Jack Krupansky commented on LUCENE-4369:


I would suggest "RawTextField". Or, "ExactTextField". Or, 
"UnanalyzedTextField". I mean, text is text to an average user. Generally, 
people should use TextField for text, but use StringField when they need the 
"exact", "raw" text "as is" and without being tokenized or otherwise changed.

"KeywordTokenizer" is confusing since it really is "NoTokenizer" or 
"ExactTextTokenizer" or "RawTextTokenizer".

Is there currently a wiki page that describes the distinction between "match" 
and "search"? I would not expect an average user to know the distinction.







[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452960#comment-13452960
 ] 

Robert Muir commented on LUCENE-4369:
-

{quote}
The names "ExactMatchField" or "MatchOnlyField" both have the problem, that 
they only refer to the indexing side.
{quote}

I don't know, I actually like ExactMatchField the best because it specifies 
exactly what I want it to specify.

MatchOnly is not as good because you can actually do things like sort (the 
javadocs mention this as one reason you would use this field type), but 
ExactMatch just refers to the search behavior, which is what I am really 
concerned about. It doesn't imply you cannot store it; it just tells you how 
search behaves.





[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452957#comment-13452957
 ] 

Robert Muir commented on LUCENE-4369:
-

{quote}
I am against this, we should change this before Lucene 4.0. We have seen 
already on user list that many people understand it wrong, so for me this issue 
is a "Blocker" for 4.0.
{quote}

I disagree with this. I've watched NOT_ANALYZED pop up on the user list for 
older releases time after time. It's frustrating, but this problem is nothing 
new. It's not introduced with 4.0: I opened this issue because I thought it was 
useful feedback from someone testing the Lucene 4.0 BETA, and it's really 
trivial to fix once we settle on a name.

I don't think we should try to block releases when nobody can even agree on a 
good name yet.

We should instead focus on picking a good name: we can implement this for 4.1 
or 5.0 or whatever.





[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452956#comment-13452956
 ] 

Uwe Schindler commented on LUCENE-4369:
---

The names "ExactMatchField" and "MatchOnlyField" both have the problem that 
they only refer to the indexing side. I would be fine with such a name if the 
field were "unstored" by default, so that you have to turn on storing 
explicitly. If it is automatically stored, people will complain that their 
index contains too much useless garbage, because they expected an 
ExactMatchField to be used only for "matching", so "storing" is wrong.

I would prefer: UntokenizedField or UntokenizedStringField
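
For readers unfamiliar with the distinction being argued here, the following is 
a minimal plain-Java sketch of "indexed" and "stored" as two independent 
switches. It does not use the Lucene API: the class StoredVsIndexedSketch and 
its methods add, match and retrieve are hypothetical names made up for this 
illustration, and the maps only stand in for a real index.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these names come from Lucene.
final class StoredVsIndexedSketch {
    // term -> ids of documents whose field value is exactly that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // doc id -> original value, kept only when storing was requested
    private final Map<Integer, String> storedValues = new HashMap<>();

    void add(int docId, String value, boolean indexed, boolean stored) {
        if (indexed) {
            postings.computeIfAbsent(value, k -> new TreeSet<>()).add(docId);
        }
        if (stored) {
            storedValues.put(docId, value);
        }
    }

    // Exact match against the untokenized value.
    Set<Integer> match(String value) {
        return postings.getOrDefault(value, Set.of());
    }

    // The original value can only be returned if it was stored.
    Optional<String> retrieve(int docId) {
        return Optional.ofNullable(storedValues.get(docId));
    }

    public static void main(String[] args) {
        StoredVsIndexedSketch field = new StoredVsIndexedSketch();
        field.add(0, "LUCENE-4369", true, false); // match only
        field.add(1, "LUCENE-4369", true, true);  // match and store
        System.out.println(field.match("LUCENE-4369"));          // [0, 1]
        System.out.println(field.retrieve(0).isPresent());       // false
        System.out.println(field.retrieve(1).orElse("<none>"));  // LUCENE-4369
    }
}
```

In this toy model both documents match, but only the one added with stored=true 
takes up space for its original value, which is the cost the comment above is 
worried about if a "match" field were stored by default.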




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452947#comment-13452947
 ] 

Robert Muir commented on LUCENE-4369:
-

{quote}
The problem with ExactMatch field is: If it is also stored, the name is 
misleading again, so KeywordField is better.
{quote}

I don't understand how storing is related. Storing is always the same.

{quote}
If we would 100% differentiate between stored and indexed fields while indexing 
(requiring that the field is also added 2 times, one time as indexed and one 
time as stored), I would be fine with "MatchOnlyField" and "StoredStringField".
{quote}

In my opinion the only thing worse we could do to our .document API than 
StringField would be to require the user to add the field twice.





[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452945#comment-13452945
 ] 

Uwe Schindler commented on LUCENE-4369:
---

bq. We don't need to rush it, I think its fairly contained to change, we don't 
even have to deal with this for 
4.0 if we aren't happy: we can deprecate StringField just have it extend 
XXXField in a future 4.x release too.

I am against this; we should change this before Lucene 4.0. We have already 
seen on the user list that many people misunderstand it, so for me this issue 
is a "Blocker" for 4.0.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452941#comment-13452941
 ] 

Uwe Schindler commented on LUCENE-4369:
---

Here is the good old Lucene 1.9.1 API: 
http://memex.dsic.upv.es/pbs/Practicas/Lucene/api-1.9.1/org/apache/lucene/document/Field.html
 (see Field.Keyword, Field.Text, Field.Unstored)




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452933#comment-13452933
 ] 

Uwe Schindler commented on LUCENE-4369:
---

ExactMatchField sounds OK, but I don't really like it. On the other hand, we 
already had the Field.KEYWORD(...) static factory in Lucene 1.x (maybe also 
early 2.x), and that was always fine to me. The term Keyword is only misleading 
to me because of my German library background ("Schlagwörter" in German), so I 
would like to have a good term that tells the user "this is a field that's 
taken as-is". In general I also don't really like the names KeywordTokenizer or 
KeywordAnalyzer, but they have been around for a long time. So coming from that 
name, KeywordTokenizer -> KeywordField might be a good idea (like 
NumericTokenStream -> NumericField), but

The problem with ExactMatch field is: If it is also stored, the name is 
misleading again, so KeywordField is better. If we would 100% differentiate 
between stored and indexed fields while indexing (requiring that the field is 
also added 2 times, one time as indexed and one time as stored), I would be 
fine with "MatchOnlyField" and "StoredStringField".




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452924#comment-13452924
 ] 

Chris Male commented on LUCENE-4369:


I like ExactMatchField, good suggestion.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452922#comment-13452922
 ] 

Robert Muir commented on LUCENE-4369:
-

I like ExactMatchField too.

I thought about Keyword too, but my concern is that this would get confused 
with 'search keywords' such as the type used in META section of html documents. 
We could argue about the best field type for that :) but I don't think this is 
it.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452920#comment-13452920
 ] 

Shai Erera commented on LUCENE-4369:


bq. So how about "ExactMatchField"?

+1 for that. I was actually going to propose "MatchExactField", but I don't 
mind the order of the words.

Also, since a way to search for these terms/fields using the regular query 
syntax would be through a PerFieldAnalyzerWrapper and assigning KeywordAnalyzer 
to that field (are there other ways?), we can also call it KeywordField.

I don't like MatchOnlyField .. i.e. TextField also matches *only* the words 
that are indexed in that field.
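
The per-field idea mentioned above (one analyzer for most fields, a 
keyword-style analyzer for the exact-match field) can be sketched roughly as 
follows. This is plain Java and deliberately not the Lucene API: 
PerFieldAnalysisSketch, keyword and lowercaseWords are hypothetical names, and 
the "analyzers" are just functions from text to a list of terms. Lucene's real 
PerFieldAnalyzerWrapper and KeywordAnalyzer have their own constructors and 
types, which are not reproduced here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Toy model only: an "analyzer" here is a function from text to terms.
final class PerFieldAnalysisSketch {
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField;

    PerFieldAnalysisSketch(Function<String, List<String>> defaultAnalyzer,
                           Map<String, Function<String, List<String>>> perField) {
        this.defaultAnalyzer = defaultAnalyzer;
        this.perField = new HashMap<>(perField);
    }

    // Use the analyzer registered for this field, else fall back to the default.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    // Keyword-style analysis: the whole input is one unmodified term.
    static List<String> keyword(String text) {
        return List.of(text);
    }

    // Stand-in for a text analyzer: lowercase, split on non-alphanumerics.
    static List<String> lowercaseWords(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Map<String, Function<String, List<String>>> perField = new HashMap<>();
        perField.put("id", PerFieldAnalysisSketch::keyword);
        PerFieldAnalysisSketch analysis =
            new PerFieldAnalysisSketch(PerFieldAnalysisSketch::lowercaseWords, perField);

        System.out.println(analysis.analyze("id", "ABC-123 Rev2"));   // [ABC-123 Rev2]
        System.out.println(analysis.analyze("body", "ABC-123 Rev2")); // [abc, 123, rev2]
    }
}
```

The point of the sketch is that query text for the exact-match field has to be 
passed through untouched at search time too, otherwise it can never equal the 
single term that was indexed.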




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Mark Harwood (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452914#comment-13452914
 ] 

Mark Harwood commented on LUCENE-4369:
--

Agreed on the need for a change - names are important.

I have a problem with using "match" on its own because the word is often 
associated with partial matching e.g. "best match" or "fuzzy match".
A quick google suggests "match" has more connotations with fuzziness than 
exactness - there are 162m results for "best match" vs only 45m results for 
"exact match".

So how about "ExactMatchField"?







[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452908#comment-13452908
 ] 

Robert Muir commented on LUCENE-4369:
-

Mark: I don't have strong feelings one way or the other. 

We don't need to rush it. I think it's a fairly contained change, and we don't 
even have to deal with this for 4.0 if we aren't happy: we can deprecate 
StringField and just have it extend XXXField in a future 4.x release too.

But I think the name StringField is not really good at all, so it's good to get 
all the ideas out here.





[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Mark Harwood (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452900#comment-13452900
 ] 

Mark Harwood commented on LUCENE-4369:
--

SingleTermField ?

Not sure "matching vs searching" is a commonly understood differentiation.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452884#comment-13452884
 ] 

Chris Male commented on LUCENE-4369:


As I say, I totally support renaming this field to something.  I think calling 
it anything else will help with distinguishing it from TextField so I'm +1 for 
MatchOnly.  Perhaps that'll encourage people to read the docs about it not 
being analyzed.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-11 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452874#comment-13452874
 ] 

Robert Muir commented on LUCENE-4369:
-

Chris: well there is a lot more to convey than the old Field.Index.NOT_ANALYZED:

# text is treated as if it went through KeywordAnalyzer
# term frequencies and positions are omitted
# length normalization and index-time boosts are disabled

The idea of "MatchOnly" is to describe that the field is really only useful for 
matching, not searching. The other two things this Field does with respect to 
scoring and index options become important when someone adds multiple instances 
under the same name: I think it's important to convey that it's still only 
'matching' and they won't have real scoring here.

The problem I see with "StringField" as a name is that it doesn't hint at any of
this. The current name can lead you to believe you should use it because you
happen to have your content as a Java String.





Re: [jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-09 Thread Erick Erickson
+1 I've lost count of the number of times people on the user's list
have used the string type and wondered why searches on terms
in the field didn't work.

Erick
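
A minimal plain-Java sketch of why such searches come back empty. It does not
use Lucene: StringVsTextSketch and its methods are hypothetical stand-ins, and
the "analysis" here is assumed to be nothing more than lowercasing and
splitting on whitespace. The unanalyzed field keeps the whole value as a single
term, so only a query for the exact, complete value can match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustration only: hypothetical names, not Lucene classes.
class StringVsTextSketch {

    // Stand-in for an unanalyzed field: the whole value becomes one term.
    static Set<String> indexAsSingleTerm(String value) {
        Set<String> terms = new HashSet<>();
        terms.add(value);
        return terms;
    }

    // Stand-in for an analyzed field. Assumption: analysis is just
    // lowercasing and splitting on whitespace.
    static Set<String> indexAsAnalyzedText(String value) {
        String[] tokens = value.toLowerCase(Locale.ROOT).trim().split("\\s+");
        return new HashSet<>(Arrays.asList(tokens));
    }

    // A term query matches only if that exact term was indexed.
    static boolean termMatches(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        String value = "Quick Brown Fox";
        Set<String> exact = indexAsSingleTerm(value);
        Set<String> analyzed = indexAsAnalyzedText(value);

        System.out.println(termMatches(exact, "fox"));             // false
        System.out.println(termMatches(exact, "Quick Brown Fox")); // true
        System.out.println(termMatches(analyzed, "fox"));          // true
    }
}
```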

On Sun, Sep 9, 2012 at 6:46 PM, Chris Male (JIRA)  wrote:
>
> [ 
> https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451730#comment-13451730
>  ]
>
> Chris Male commented on LUCENE-4369:
> 
>
> I'm +1 for renaming this field (and even considering its long-term future).
> I'm just not sure how MatchOnlyField conveys the fact that it bypasses analysis.



[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-09 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451730#comment-13451730
 ] 

Chris Male commented on LUCENE-4369:


I'm +1 for renaming this field (and even considering its long-term future). I'm
just not sure how MatchOnlyField conveys the fact that it bypasses analysis.




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-09 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451710#comment-13451710
 ] 

Adrien Grand commented on LUCENE-4369:
--

+1




[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful

2012-09-09 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451668#comment-13451668
 ] 

Michael McCandless commented on LUCENE-4369:


+1 for MatchOnlyField.
