[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12870384#action_12870384
]
Shai Erera commented on LUCENE-2458:
bq. There will be tons of different opinions to
I can't tell if you are being obnoxious or seriously believe what you say.
You understand that cjkanalyzer is broke with this? You understand that
ngrams themselves capture information about position and it even works
nicely with scoring, and helps.
This hack doesn't help english. If you think
Obnoxiousness has certainly been in the air regarding this issue, I'll
give you that.
On Sunday, May 23, 2010, Robert Muir rcm...@gmail.com wrote:
I can't tell if you are being obnoxious or seriously believe what you say.
You understand that cjkanalyzer is broke with this? You understand that
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12870410#action_12870410
]
Uwe Schindler commented on LUCENE-2458:
---
Hi Robert,
I also agree with Mark (as you
Robert - is the effect on scoring also on English and other European
languages? Or is it mostly for ngram-based languages, and especially CJK?
I want to stress that not all ngram-based languages are affected by this
behavior, especially those for which we do ngram just because of a lack of
good
Subject: Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate
phrasequeries based on term count
Robert - is the effect on scoring also on English and other European
languages? Or is it mostly for ngram-based languages, and especially CJK?
I want to stress that not all ngram-based
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
From: Shai Erera [mailto:ser...@gmail.com]
Sent: Sunday, May 23, 2010 6:34 PM
To: dev@lucene.apache.org
Subject: Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate
On Sun, May 23, 2010 at 12:34 PM, Shai Erera ser...@gmail.com wrote:
I want to stress that not all ngram-based languages are affected by this
behavior, especially those for which we do ngram just because of a lack of
good tokenizer.
They are also affected! Do you understand how the
These comments lead me to believe you don't understand the issue.
Do you understand that *ALL* CJK queries are made into phrase queries,
regardless of tokenizer
The QP should work like that:
(1) It parses the query, creating fragments
(2) It does some out-of-the-box handling of those fragments
People should be able to override that handling of fragments. But people
should not touch (1).
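The two-stage split described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Lucene's QueryParser: `Fragment`, `parse_fragments`, `FragmentHandler` and `BigramHandler` are hypothetical names, the regex covers only `field:text` and `field:"text"`, and the tuples stand in for real Query objects. The point is that stage (1) records whether the user typed quotes, and stage (2) is the only part a subclass overrides.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Fragment:
    field: Optional[str]  # e.g. "f" in f:abcd, None for the default field
    text: str             # raw text, surrounding quotes removed
    quoted: bool          # True only when the user wrote explicit quotes


# Optional field prefix, then either a quoted string or a run of non-space chars.
_FRAGMENT_RE = re.compile(r'(?:(\w+):)?(?:"([^"]*)"|(\S+))')


def parse_fragments(query: str) -> List[Fragment]:
    """Stage (1): syntax only. Not meant to be overridden."""
    fragments = []
    for match in _FRAGMENT_RE.finditer(query):
        field, quoted_text, bare_text = match.groups()
        if quoted_text is not None:
            fragments.append(Fragment(field, quoted_text, True))
        else:
            fragments.append(Fragment(field, bare_text, False))
    return fragments


class FragmentHandler:
    """Stage (2): out-of-the-box handling, open to overriding."""

    def analyze(self, text: str) -> List[str]:
        # Stand-in for an analyzer: lowercase and split on whitespace.
        return text.lower().split()

    def handle(self, fragment: Fragment) -> Tuple:
        terms = self.analyze(fragment.text)
        if len(terms) == 1:
            return ("term", fragment.field, terms[0])
        # The decision rests on the syntax (quoted or not), not on term count.
        kind = "phrase" if fragment.quoted else "boolean"
        return (kind, fragment.field, terms)


class BigramHandler(FragmentHandler):
    """Hypothetical override: an n-gram style analyzer producing bigrams."""

    def analyze(self, text: str) -> List[str]:
        lowered = text.lower()
        return [lowered[i:i + 2] for i in range(len(lowered) - 1)] or [lowered]


if __name__ == "__main__":
    handler = BigramHandler()
    for fragment in parse_fragments('f:abcd f:"abcd"'):
        print(fragment, "->", handler.handle(fragment))
```

With the bigram handler, the unquoted fragment becomes a boolean combination of its bigrams and the quoted one becomes a phrase, because the `quoted` flag survives stage (1) instead of being thrown away.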
In fact QP should work like that:
(1) Tokenizer parses the
On Sun, May 23, 2010 at 1:00 PM, Uwe Schindler u...@thetaphi.de wrote:
I just want to make the feature accessible and documented without Version.
I think it is just a bug (a shoddy implementation that does not use
the syntax, whether it was quoted or not, since this has been thrown
away). In
So ... after a long IRC chat on this, I think this has just been worded
incorrectly (the issue). As I understand, there are two issues here:
1) QP loses phrase info for fields -- the queries f:abcd and f:"abcd" are
parsed the same, or handled the same. There is no way for the one extending
QP to
+1, this is what the patch does. I agree I did a crappy job explaining
the issue.
On Sun, May 23, 2010 at 2:25 PM, Shai Erera ser...@gmail.com wrote:
So ... after a long IRC chat on this, I think this has just been worded
incorrectly (the issue). As I understand, there are two issues here:
1)
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12870317#action_12870317
]
Mark Miller commented on LUCENE-2458:
-
I still don't think this falls under bug
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12870353#action_12870353
]
Shai Erera commented on LUCENE-2458:
FWIW, I agree w/ Mark. I don't think it's a bug,
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12869280#action_12869280
]
Michael McCandless commented on LUCENE-2458:
OK mulling some more on this
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867112#action_12867112
]
Robert Muir commented on LUCENE-2458:
-
{quote}
This is why I like the token attr based
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867117#action_12867117
]
Uwe Schindler commented on LUCENE-2458:
---
Sorry for intervening,
I am in the same
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867147#action_12867147
]
Yonik Seeley commented on LUCENE-2458:
--
bq. This is why I like the token attr based
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867151#action_12867151
]
Robert Muir commented on LUCENE-2458:
-
{quote}
An attribute that says these tokens go
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866528#action_12866528
]
Michael McCandless commented on LUCENE-2458:
This is sneaky behavior on
The QueryParser also fails to correctly parse Hebrew acronyms; although not
being an integral part of the current discussion, I thought this would be
the best place to bring that up.
Hebrew acronyms are assembled of letters with a single double-quote char
within, example: MNK"L (Hebrew for CEO).
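To make the failure mode concrete, here is a toy scanner, not Lucene's grammar. It assumes the acronym is written with the quote before its last letter (transliterated here as MNK"L, which is my assumption about the example), and both function names are hypothetical. The first scanner treats every double quote as a phrase delimiter; the second treats a quote as a delimiter only when it sits at a word boundary.

```python
import re
from typing import List, Tuple


def naive_scan(query: str) -> List[Tuple[str, str]]:
    """Treat every double quote as a phrase delimiter."""
    tokens = []
    for index, chunk in enumerate(query.split('"')):
        chunk = chunk.strip()
        if chunk:
            # Odd-numbered chunks sit after an opening quote.
            kind = "phrase" if index % 2 == 1 else "bare"
            tokens.append((kind, chunk))
    return tokens


# A quote opens or closes a phrase only when no word character touches it
# from the outside; anything else is kept as part of a bare token.
_TOKEN_RE = re.compile(r'(?<!\w)"([^"]*)"(?!\w)|(\S+)')


def boundary_aware_scan(query: str) -> List[Tuple[str, str]]:
    """Keep a quote that is embedded inside a word as part of that word."""
    tokens = []
    for match in _TOKEN_RE.finditer(query):
        if match.group(1) is not None:
            tokens.append(("phrase", match.group(1)))
        else:
            tokens.append(("bare", match.group(2)))
    return tokens


if __name__ == "__main__":
    print(naive_scan('MNK"L'))
    print(boundary_aware_scan('MNK"L'))
    print(boundary_aware_scan('"big apple" MNK"L'))
```

The naive scanner splits the acronym in two and opens a phrase that never closes, while the boundary-aware one keeps it as a single bare token and still recognises an ordinary quoted phrase next to it.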
On Wed, May 12, 2010 at 6:05 AM, Itamar Syn-Hershko ita...@code972.com wrote:
The QueryParser also fails to correctly parse Hebrew acronyms; although not
being an integral part of the current discussion, I thought this would be
the best place to bring that up.
Just as I don't think Analysis
On 5/12/10 9:25 AM, Robert Muir wrote:
(and, contrary to what you would believe from the
documentation, the choice of whether or not to make a PhraseQuery is
not based on syntax one bit!)
That's a major exaggeration - quoting text plays a large role in whether
or not you will get a phrase
On Wed, May 12, 2010 at 11:16 AM, Mark Miller markrmil...@gmail.com wrote:
That's a major exaggeration - quoting text plays a large role in whether or
not you will get a phrase query.
No, it has nothing to do with it in the implementation. It only
escapes the whitespace, but is discarded. This
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866595#action_12866595
]
Marvin Humphrey commented on LUCENE-2458:
-
I have mixed feelings about this for
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866603#action_12866603
]
Marvin Humphrey commented on LUCENE-2458:
-
Because they show its 10x better to
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1285#action_1285
]
Ivan Provalov commented on LUCENE-2458:
---
Robert has asked me to post our test
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866693#action_12866693
]
Marvin Humphrey commented on LUCENE-2458:
-
I'm honestly having a tough time
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866696#action_12866696
]
Hoss Man commented on LUCENE-2458:
--
bq. Instead the queryparser should only form
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866695#action_12866695
]
Robert Muir commented on LUCENE-2458:
-
{quote}
Change the initial split on whitespace
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866698#action_12866698
]
Robert Muir commented on LUCENE-2458:
-
bq. but all other things being equal lets keep
@lucene.apache.org
Subject: Re: [jira] Commented: (LUCENE-2458) queryparser shouldn't generate
phrasequeries based on term count
On Wed, May 12, 2010 at 6:30 PM, Itamar Syn-Hershko ita...@code972.com
wrote:
Never did I request the QP to do Analysis. I simply mentioned this bug
- what this definitely
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866954#action_12866954
]
DM Smith commented on LUCENE-2458:
--
As I see it there are two issues:
1) Backward
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866341#action_12866341
]
Hoss Man commented on LUCENE-2458:
--
Robert: do you have a specific suggestion for what
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866353#action_12866353
]
Robert Muir commented on LUCENE-2458:
-
bq. ...what should the resulting Query object
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866363#action_12866363
]
Hoss Man commented on LUCENE-2458:
--
bq. a Boolean Query formed with the default operator.
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866368#action_12866368
]
Robert Muir commented on LUCENE-2458:
-
bq. That seems like equally bad default
[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866374#action_12866374
]
Robert Muir commented on LUCENE-2458:
-
by the way hoss man you said it best yourself: