Re: setAllowLeadingWildcard and PythonMultiFieldQueryParser

2010-08-19 Thread Aric Coady
On Aug 18, 2010, at 10:13 PM, Andi Vajda wrote:
 On Wed, 18 Aug 2010, Aric Coady wrote:
 #query = queryParser.parse(queryString)
 query = queryParser.parse(Version.LUCENE_CURRENT, queryString, fields,
 [BooleanClause.Occur.SHOULD, 
 BooleanClause.Occur.SHOULD],
 analyzer)
 
 Whenever there is a name conflict between a static and non-static method 
 detected by JCC, the static method wrapper is renamed to be suffixed with a 
 '_' and a warning is emitted by JCC.
 
 Does changing the code to use a parse_() method instead solve the problem ?
 (it's late here and I haven't tried it myself)

Ah, so there are a couple of different things going on here.  MultiFieldQueryParser 
has only static parse methods, except that it also inherits QueryParser.parse.  
Perhaps that's why JCC isn't supplying a parse_ method.

>>> lucene.MultiFieldQueryParser.parse
<built-in method parse of type object at 0x10171d800>
>>> lucene.MultiFieldQueryParser.parse_
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'MultiFieldQueryParser' has no attribute 'parse_'
>>> lucene.QueryParser.parse
<method 'parse' of 'QueryParser' objects>

This gotcha has come up before:  
http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/201007.mbox/%3caanlktinkhxsiqp7jljz1q0cy6cv03y5umyzvg8a5d...@mail.gmail.com%3e.
  But as known limitations go, it's an easy workaround.  Just call 
QueryParser.parse with the parser object as the first argument.

As for the wildcard issue, I was trying to point out that I don't think it's a 
PyLucene problem at all.  The example given was calling the static 
MultiFieldQueryParser.parse with a parser object, incorrectly expecting 
settings on the parser object to have an effect.  The fact that calling 
queryParser.parse(queryString) raises a TypeError is technically unrelated, 
although it probably adds to the confusion.
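
The two effects described above can be reproduced in plain Python, without PyLucene. The sketch below is only an analogy: the class names mirror the thread, but the bodies, the allow_leading_wildcard attribute and the field_query helper are hypothetical stand-ins, not the JCC-generated wrappers. It shows a static parse that shadows the inherited instance method, ignores settings made on a parser object, and can be bypassed by calling the base class method with the parser as the first argument.

```python
class QueryParser:
    """Stand-in base class with an instance-level parse method."""

    def __init__(self, field):
        self.field = field
        self.allow_leading_wildcard = False  # per-instance setting

    def field_query(self, term):
        return f"{self.field}:{term}"

    def parse(self, text):
        if text.startswith("*") and not self.allow_leading_wildcard:
            raise ValueError("leading wildcard not allowed")
        return self.field_query(text)


class MultiFieldQueryParser(QueryParser):
    """Stand-in subclass whose static parse shadows the inherited one."""

    def __init__(self, fields):
        super().__init__(fields[0])
        self.fields = fields

    def field_query(self, term):
        return " ".join(f"{f}:{term}" for f in self.fields)

    @staticmethod
    def parse(text, fields):
        # Builds fresh parsers with default settings, so nothing
        # configured on an existing parser object is consulted.
        return " ".join(QueryParser.parse(QueryParser(f), text) for f in fields)


parser = MultiFieldQueryParser(["title", "body"])
parser.allow_leading_wildcard = True

# The instance call resolves to the static method and fails on arity.
try:
    parser.parse("*ene")
except TypeError as exc:
    print("TypeError:", exc)

# The static call ignores the setting made on `parser`.
try:
    MultiFieldQueryParser.parse("*ene", ["title", "body"])
except ValueError as exc:
    print("ValueError:", exc)

# Workaround: call the base class method with the parser as first argument.
print(QueryParser.parse(parser, "*ene"))
```

Only the last call honours the setting on the parser object, which is the behaviour the workaround relies on.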



RE: Re: [Hudson] New Hudson master now running

2010-08-19 Thread Uwe Schindler
Hi all,

has anybody access to the mailing list configuration? All
build-failed/success mails do come through dev@lucene.apache.org, as the
*new* Hudson email address hud...@hudson.apache.org is not on the mailing
list.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Uwe Schindler [mailto:u...@thetaphi.de]
 Sent: Monday, August 16, 2010 10:57 PM
 To: dev@lucene.apache.org
 Subject: FW: Re: [Hudson] New Hudson master now running
 
 Hi,
 
 can anybody add hud...@hudson.apache.org to the d...@lao mailing list, so it
 can post the Hudson status reports. Hudson changed its address for build
 notification emails (new master).
 
 There should already be some messages from the builds today in the
 moderation requests.
 
 Thanks,
 Uwe
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-236) Field collapsing

2010-08-19 Thread Evgeniy Serykh (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900225#action_12900225
 ] 

Evgeniy Serykh commented on SOLR-236:
-

I've patched the Solr 1.4.1 release.  When I execute a query with collapsing, 
the 'numFound' value always equals 10 when the 'rows' param is not specified. 

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: Next

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, DocSetScoreCollector.java, 
 field-collapse-3.patch, field-collapse-4-with-solrj.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 NonAdjacentDocumentCollapser.java, NonAdjacentDocumentCollapserTest.java, 
 quasidistributed.additional.patch, SOLR-236-1_4_1.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, 
 SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, 
 SOLR-236_collapsing.patch, SOLR-236_collapsing.patch


 This patch includes a new feature called field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 - collapse.field: the field used to group results
 - collapse.type: normal (default value) or adjacent
 - collapse.max: how many continuous results are allowed before collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)
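
As an illustration of the three parameters listed in the description, a request URL could be assembled as sketched below. Only the collapse.* names come from the issue text; the base URL, the q parameter and the example values are assumptions for illustration, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def collapse_query_url(base_url, query, field, collapse_type="normal", collapse_max=None):
    """Build a select URL carrying the collapse.* parameters from the issue."""
    if collapse_type not in ("normal", "adjacent"):
        raise ValueError("collapse.type must be 'normal' or 'adjacent'")
    params = {
        "q": query,                      # assumed standard query parameter
        "collapse.field": field,         # field used to group results
        "collapse.type": collapse_type,  # normal (default) or adjacent
    }
    if collapse_max is not None:
        params["collapse.max"] = collapse_max  # results allowed before collapsing
    return f"{base_url}?{urlencode(params)}"


# Hypothetical host and field names.
print(collapse_query_url("http://localhost:8983/solr/select", "ipod", "site", "adjacent", 1))
```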

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.





Re: Re: [Hudson] New Hudson master now running

2010-08-19 Thread Michael McCandless
Uwe are you able to see incoming emails that arrive to
hud...@hudson.apache.org's account?

If so you can subscribe this address the normal way...

Mike




RE: Re: [Hudson] New Hudson master now running

2010-08-19 Thread Uwe Schindler
I don't see any emails. The problem with subscribing the normal way is that
the subscription response cannot be processed (I have no access to the
mailbox, if it exists, of this user). So it must be manually added to the
list, ideally as a non-member that is allowed to send emails unmoderated.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: Re: [Hudson] New Hudson master now running

2010-08-19 Thread Michael McCandless
Have you tried less /var/spool/mail/hudson (when logged in as the
hudson user)?

Or does mail to hudson@ get forwarded elsewhere...?

Mike




RE: Re: [Hudson] New Hudson master now running

2010-08-19 Thread Uwe Schindler
Nobody has access to the new master machine anymore. Should I open an infra
issue to add the new address to the list of allowed posters?

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: Re: [Hudson] New Hudson master now running

2010-08-19 Thread Michael McCandless
Ahhh OK.  Oh well.

Who is the moderator of the dev@ list?  I think whoever that is can
add a new subscription?  (I.e., this shouldn't require infra's help.)

Mike




RE: Re: [Hudson] New Hudson master now running

2010-08-19 Thread Uwe Schindler
That was my question :-) Who is the moderator?

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





[jira] Commented: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality

2010-08-19 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900246#action_12900246
 ] 

Grant Ingersoll commented on SOLR-2010:
---

{quote}Adds support for shards. I originally implemented this by passing the 
SearchHandler to the SpellCheckComponent and then using an overloaded version 
of SearchHandler.handleRequestBody() to do the re-queries. I found this was 
unnecessary as we get the same results by calling the QueryComponent directly. 
{quote}

I haven't taken a look at the patch yet, but by the sounds of it, I still think 
the cleaner way to go is to make Solr have an option to specifically pass in 
which component to run and turn off all others.  This would be useful for other 
things, too.  Then you could just use the existing mechanisms.

 Improvements to SpellCheckComponent Collate functionality
 -

 Key: SOLR-2010
 URL: https://issues.apache.org/jira/browse/SOLR-2010
 Project: Solr
  Issue Type: New Feature
  Components: clients - java, spellchecker
Affects Versions: 1.4.1
 Environment: Tested against trunk revision 966633
Reporter: James Dyer
Assignee: Grant Ingersoll
Priority: Minor
 Attachments: SOLR-2010.patch, SOLR-2010.patch, SOLR-2010.txt


 Improvements to SpellCheckComponent Collate functionality
 Our project requires a better Spell Check Collator.  I'm contributing this as 
 a patch to get suggestions for improvements and in case there is a broader 
 need for these features.
 1. Only return collations that are guaranteed to result in hits if re-queried 
 (applying original fq params also).  This is especially helpful when there is 
 more than one correction per query.  The 1.4 behavior does not verify that a 
 particular combination will actually return hits.
 2. Provide the option to get multiple collation suggestions
 3. Provide extended collation results including the # of hits re-querying 
 will return and a breakdown of each misspelled word and its correction.
 This patch is similar to what is described in SOLR-507 item #1.  Also, this 
 patch provides a viable workaround for the problem discussed in SOLR-1074.  A 
 dictionary could be created that combines the terms from the multiple fields. 
  The collator then would prune out any spurious suggestions this would cause.
 This patch adds the following spellcheck parameters:
 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try 
 before giving up.  Lower values ensure better performance.  Higher values may 
 be necessary to find a collation that can return results.  Default is 0, 
 which maintains backwards-compatible behavior (do not check collations).
 2. spellcheck.maxCollations - maximum # of collations to return.  Default is 
 1, which maintains backwards-compatible behavior.
 3. spellcheck.collateExtendedResult - if true, returns an expanded response 
 format detailing collations found.  default is false, which maintains 
 backwards-compatible behavior.  When true, output is like this (in context):
 <lst name="spellcheck">
   <lst name="suggestions">
   <lst name="hopq">
   <int name="numFound">94</int>
   <int name="startOffset">7</int>
   <int name="endOffset">11</int>
   <arr name="suggestion">
   <str>hope</str>
   <str>how</str>
   <str>hope</str>
   <str>chops</str>
   <str>hoped</str>
   etc
   </arr>
   <lst name="faill">
   <int name="numFound">100</int>
   <int name="startOffset">16</int>
   <int name="endOffset">21</int>
   <arr name="suggestion">
   <str>fall</str>
   <str>fails</str>
   <str>fail</str>
   <str>fill</str>
   <str>faith</str>
   <str>all</str>
   etc
   </arr>
   </lst>
   <lst name="collation">
   <str name="collationQuery">Title:(how AND fails)</str>
   <int name="hits">2</int>
   <lst name="misspellingsAndCorrections">
   <str name="hopq">how</str>
   <str name="faill">fails</str>
   </lst>
   </lst>
   <lst name="collation">
   <str name="collationQuery">Title:(hope AND faith)</str>
   <int name="hits">2</int>
   <lst name="misspellingsAndCorrections">
   <str name="hopq">hope</str>
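
The collation behaviour described in the issue (try candidate combinations up to spellcheck.maxCollationTries, keep only those that return hits, stop at spellcheck.maxCollations) can be sketched in a few lines. This is not the patch's implementation: the function name, the count_hits callback and the enumeration order of candidates are assumptions made for illustration.

```python
from itertools import islice, product


def collate(terms, suggestions, count_hits, max_tries=0, max_collations=1):
    """Return collations that are verified to produce hits.

    terms: query terms in order; suggestions: term -> corrections, best first;
    count_hits: hypothetical callback that re-queries and returns a hit count.
    """
    candidates = product(*[suggestions.get(t, [t]) for t in terms])
    if max_tries <= 0:
        # Default of 0: do not check collations, return the first unverified.
        first = next(candidates)
        return [{"collationQuery": " AND ".join(first), "hits": None}]
    results = []
    for combo in islice(candidates, max_tries):
        hits = count_hits(combo)
        if hits > 0:
            results.append({
                "collationQuery": " AND ".join(combo),
                "hits": hits,
                "misspellingsAndCorrections": dict(zip(terms, combo)),
            })
            if len(results) == max_collations:
                break
    return results


# Toy index: only two combinations return documents.
known_hits = {("how", "fails"): 2, ("hope", "faith"): 2}
suggestions = {"hopq": ["hope", "how"], "faill": ["fall", "fails", "faith"]}

for collation in collate(["hopq", "faill"], suggestions,
                         lambda combo: known_hits.get(combo, 0),
                         max_tries=10, max_collations=2):
    print(collation["collationQuery"], collation["hits"])
```

Lower max_tries values bound the number of re-queries, which is the performance trade-off the parameter description mentions.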

[jira] Resolved: (LUCENE-2606) optimize contrib/regex for flex

2010-08-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-2606.
-

Resolution: Fixed

Committed revision 987129.

 optimize contrib/regex for flex
 ---

 Key: LUCENE-2606
 URL: https://issues.apache.org/jira/browse/LUCENE-2606
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/*
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-2606.patch, LUCENE-2606.patch


 * changes RegexCapabilities match(String) to match(BytesRef)
 * the jakarta and jdk impls use CharacterIterator/CharSequence matching 
 against the utf16result instead.
 * i also reuse the matcher for jdk, i don't see why we didn't do this before 
 but it makes sense esp since we reuse the CSQ




[jira] Updated: (LUCENE-961) RegexCapabilities is not Serializable

2010-08-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-961:
---

Fix Version/s: 4.0

 RegexCapabilities is not Serializable
 -

 Key: LUCENE-961
 URL: https://issues.apache.org/jira/browse/LUCENE-961
 Project: Lucene - Java
  Issue Type: Bug
  Components: QueryParser
Affects Versions: 2.2
Reporter: Konrad Rokicki
Assignee: Erik Hatcher
Priority: Minor
 Fix For: 4.0


 The class RegexQuery is marked Serializable by its super class, but it 
 contains a RegexCapabilities which is not Serializable. Thus attempting to 
 serialize the query results in an exception. 
 Making RegexCapabilities serializable should be no problem since its 
 subclasses contain only serializable classes (java.util.regex.Pattern and 
 org.apache.regexp.RE).




[jira] Resolved: (LUCENE-961) RegexCapabilities is not Serializable

2010-08-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-961.


Resolution: Fixed

fixed with LUCENE-2606




[jira] Created: (SOLR-2058) Adds optional slop to pf2, pf3 and pf parameters

2010-08-19 Thread Ron Mayer (JIRA)
Adds optional slop to pf2, pf3 and pf parameters 
-

 Key: SOLR-2058
 URL: https://issues.apache.org/jira/browse/SOLR-2058
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 3.1, 4.0
 Environment: n/a
Reporter: Ron Mayer
Priority: Minor


http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3e
{quote}
From Ron Mayer r...@0ape.com
 my results might
 be even better if I had a couple of different pf2s with different ps's
 at the same time.

 In particular, one with ps=0 to put a high boost on the ones that have
 the right ordering of words.  For example, ensuring that:
  red hat black jacket
 boosts only red hats and not black hats.

 And another pf2 with a more modest boost with ps=5 or so, so that
 the query above also boosts docs with "red baseball hat".
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3e]
{quote}
From Yonik Seeley yo...@lucidimagination.com

Perhaps fold it into the pf/pf2 syntax?

pf=text^2    // current syntax... makes phrases with a boost of 2
pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
             // a boost of 2

That actually seems pretty natural given the lucene query syntax - an
actual boosted sloppy phrase query already looks like
text:"foo bar"~1^2

-Yonik
http
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3e]
{quote}
From Chris Hostetter hossman_luc...@fucit.org

Big +1 to this idea ... the existing ps param can stick around as the 
default for any field that doesn't specify its own slop in the pf/pf2/pf3 
fields using the ~ syntax.

-Hoss
{quote}
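
As a rough illustration of the field~slop^boost form discussed in the quotes above, here is a small parsing sketch. It is not taken from any patch: the class and method names are made up, and the default boost of 1.0 and the fallback to ps when no ~slop is given are assumptions based on the comments quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative parser for the proposed field~slop^boost form.
// This is NOT the code from any patch; all names are hypothetical.
final class PhraseFieldSketch {

    /** One parsed entry; slop is null when the entry has no "~slop" part. */
    static final class PhraseField {
        final String field;
        final Integer slop;
        final float boost;

        PhraseField(String field, Integer slop, float boost) {
            this.field = field;
            this.slop = slop;
            this.boost = boost;
        }
    }

    /** Parses a single entry such as "text", "text^2", "text~1" or "text~1^2". */
    static PhraseField parseOne(String spec) {
        String rest = spec.trim();
        float boost = 1.0f; // assumed default when no "^boost" is given
        int caret = rest.indexOf('^');
        if (caret >= 0) {
            boost = Float.parseFloat(rest.substring(caret + 1));
            rest = rest.substring(0, caret);
        }
        Integer slop = null;
        int tilde = rest.indexOf('~');
        if (tilde >= 0) {
            slop = Integer.valueOf(rest.substring(tilde + 1));
            rest = rest.substring(0, tilde);
        }
        return new PhraseField(rest, slop, boost);
    }

    /** Parses a whitespace-separated parameter value such as "name^12 name~10". */
    static List<PhraseField> parse(String paramValue) {
        List<PhraseField> out = new ArrayList<>();
        for (String spec : paramValue.trim().split("\\s+")) {
            if (!spec.isEmpty()) {
                out.add(parseOne(spec));
            }
        }
        return out;
    }

    /** Falls back to the request-wide ps value when the entry has no slop of its own. */
    static int effectiveSlop(PhraseField f, int defaultPs) {
        return f.slop != null ? f.slop : defaultPs;
    }

    public static void main(String[] args) {
        for (PhraseField f : parse("name^12 name~10")) {
            System.out.println(f.field + " slop=" + effectiveSlop(f, 5) + " boost=" + f.boost);
        }
    }
}
```

Keeping the slop nullable is what lets the existing ps parameter stay meaningful as a default, as suggested in the last quote.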




[jira] Updated: (SOLR-2058) Adds optional slop to pf2, pf3 and pf parameters

2010-08-19 Thread Ron Mayer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Mayer updated SOLR-2058:


Attachment: pf2_with_slop.patch

This patch is my first draft at implementing this feature.

Any feedback would be appreciated.

It seems to happily turn a query like
[http://localhost:8983/solr/select?defType=edismax&fl=id,text,score&q=enterprise+search+foobar&ps=5&qf=text&debugQuery=true&pf2=name~0^&ps=7&pf2=name^12+name~10]

into what I believe is the desired parsed query:

+((text:enterpris) (text:search) (text:foobar)) ((name:"enterprise 
search"~5^12.0) (name:"search foobar"~5^12.0)) ((name:"enterprise 
search"^.0) (name:"search foobar"^.0)) ((name:"enterprise search"~10) 
(name:"search foobar"~10))

which looks like it should give a high boost to docs where both words appear 
right next to each other, but still substantial boosts to docs where the pairs 
of words are a few words apart.

 Adds optional slop to pf2, pf3 and pf parameters 
 -

 Key: SOLR-2058
 URL: https://issues.apache.org/jira/browse/SOLR-2058
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 3.1, 4.0
 Environment: n/a
Reporter: Ron Mayer
Priority: Minor
 Attachments: pf2_with_slop.patch

   Original Estimate: 168h
  Remaining Estimate: 168h





[jira] Issue Comment Edited: (SOLR-2058) Adds optional slop to pf2, pf3 and pf parameters

2010-08-19 Thread Ron Mayer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12900260#action_12900260
 ] 

Ron Mayer edited comment on SOLR-2058 at 8/19/10 8:04 AM:
--

This patch is my first draft at implementing this feature.

Any feedback would be appreciated.

It seems to happily turn a query like
[http://localhost:8983/solr/select?defType=edismax&fl=id,text,score&q=enterprise+search+foobar&ps=5&qf=text&debugQuery=true&pf2=name~0^&pf2=name^12+name~10]

into what I believe is the desired parsed query:

+((text:enterpris) (text:search) (text:foobar)) ((name:"enterprise 
search"~5^12.0) (name:"search foobar"~5^12.0)) ((name:"enterprise 
search"^.0) (name:"search foobar"^.0)) ((name:"enterprise search"~10) 
(name:"search foobar"~10))

which looks like it should give a high boost to docs where both words appear 
right next to each other, but still substantial boosts to docs where the pairs 
of words are a few words apart.

  was (Author: ramayer):
This patch is my first draft at implementing this feature.

Any feedback would be appreciated.

It seems to happily turn a query like
[http://localhost:8983/solr/select?defType=edismax&fl=id,text,score&q=enterprise+search+foobar&ps=5&qf=text&debugQuery=true&pf2=name~0^&ps=7&pf2=name^12+name~10]

into what I believe is the desired parsed query:

+((text:enterpris) (text:search) (text:foobar)) ((name:"enterprise 
search"~5^12.0) (name:"search foobar"~5^12.0)) ((name:"enterprise 
search"^.0) (name:"search foobar"^.0)) ((name:"enterprise search"~10) 
(name:"search foobar"~10))

which looks like it should give a high boost to docs where both words appear 
right next to each other, but still substantial boosts to docs where the pairs 
of words are a few words apart.
  
 Adds optional slop to pf2, pf3 and pf parameters 
 -

 Key: SOLR-2058
 URL: https://issues.apache.org/jira/browse/SOLR-2058
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 3.1, 4.0
 Environment: n/a
Reporter: Ron Mayer
Priority: Minor
 Attachments: pf2_with_slop.patch

   Original Estimate: 168h
  Remaining Estimate: 168h





[jira] Commented: (SOLR-1316) Create autosuggest component

2010-08-19 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12900275#action_12900275
 ] 

Andrzej Bialecki  commented on SOLR-1316:
-

Robert,
bq. But I have trouble understanding what the point of all the complexity in 
this method is... if it's just as it's documented, it seems it could be much 
simpler: (e.g. b-a). So I feel like I am missing something.

I don't get it either... it's borrowed code after all ;) Anyway, I replaced 
this method with the following:

{code}
  private static int compareCharsAlphabetically(char cCompare2, char cRef) {
    return Character.toLowerCase(cCompare2) - Character.toLowerCase(cRef);
  }
{code}

and all tests pass, including those that test for correctness of returned 
suggestions and consistency between Jaspell and TST. I also ran testBenchmark() 
and differences in timings are negligible.
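
For readers who want to see what the simplified comparison does, here is a small standalone sketch. Only the method body comes from the snippet above; the wrapper class CompareSketch and the sample inputs are made up for illustration.

```java
// Standalone wrapper around the simplified comparison; the class name is hypothetical.
final class CompareSketch {

    /** Case-insensitive ordering: negative, zero or positive, like a comparator. */
    static int compareCharsAlphabetically(char cCompare2, char cRef) {
        // Both chars are promoted to int, so the subtraction cannot overflow.
        return Character.toLowerCase(cCompare2) - Character.toLowerCase(cRef);
    }

    public static void main(String[] args) {
        System.out.println(compareCharsAlphabetically('A', 'a')); // same letter, different case
        System.out.println(compareCharsAlphabetically('a', 'b')); // 'a' sorts before 'b'
        System.out.println(compareCharsAlphabetically('Z', 'a')); // 'z' sorts after 'a'
    }
}
```

The result is only meaningful by its sign, so callers should compare it against zero rather than against specific values.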

Grant,
bq. I think we should at least open an issue for it and link to this one when 
this one gets committed, as it takes a while to build.
Yes, I'll open an issue when this gets committed.

If there are no further objections I'd like to commit this.


 Create autosuggest component
 

 Key: SOLR-1316
 URL: https://issues.apache.org/jira/browse/SOLR-1316
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4
Reporter: Jason Rutherglen
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: Next

 Attachments: SOLR-1316.patch, SOLR-1316.patch, SOLR-1316.patch, 
 SOLR-1316.patch, SOLR-1316.patch, SOLR-1316.patch, SOLR-1316.patch, 
 SOLR-1316.patch, suggest.patch, suggest.patch, suggest.patch, TST.zip

   Original Estimate: 96h
  Remaining Estimate: 96h

 Autosuggest is a common search function that can be integrated
 into Solr as a SearchComponent. Our first implementation will
 use the TernaryTree found in Lucene contrib. 
 * Enable creation of the dictionary from the index or via Solr's
 RPC mechanism
 * What types of parameters and settings are desirable?
 * Hopefully in the future we can include user click through
 rates to boost those terms/phrases higher




[jira] Commented: (LUCENE-1486) Wildcards, ORs etc inside Phrase queries

2010-08-19 Thread Terje Eggestad (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12900278#action_12900278
 ] 

Terje Eggestad commented on LUCENE-1486:


Hi 

I'm about to begin using the ComplexPhraseQueryParser with 3.0.2, as we need 
wildcards with phrases and proximity.

Our customers have a habit of including '-' in phrases, which seems to trigger a 
bug:

If you add the following tests to the TestComplexPhraseQueryParser class:

checkMatches("\"joe john nosuchword\"", "");
checkMatches("\"joe-john-nosuchword\"", "");
checkMatches("\"john-nosuchword smith\"", "");

AND add a rewrite() in checkMatches() just after the parse:
Query q = qp.parse(qString);
IndexReader reader = searcher.getIndexReader();  // needed for rewrite
q = q.rewrite(reader);


The first two are OK and are rewritten to:

spanNear([name:joe, name:john, name:nosuchword], 0, true)
name:"joe john nosuchword"


The third bombs out with:

java.lang.IllegalArgumentException: Unknown query type 
"org.apache.lucene.search.PhraseQuery" found in phrase query string 
"john-nosuchword smith"
at 
org.apache.lucene.queryParser.ComplexPhraseQueryParser$ComplexPhraseQuery.rewrite(ComplexPhraseQueryParser.java:281)
at 
org.apache.lucene.queryParser.TestComplexPhraseQuery.checkMatches(TestComplexPhraseQuery.java:120)
.
.
.


I made a fix that *seems* to fix it, but I feel I'm on very shaky ground here.
I've made so many debugging hacks that I can't make a proper patch, but 
I added this fix to ComplexPhraseQueryParser::rewrite()
just before the place where the exception is thrown:

   } else {
     if (qc instanceof TermQuery) {
       TermQuery tq = (TermQuery) qc;
       allSpanClauses[i] = new SpanTermQuery(tq.getTerm());

     // START FIX for A-B C phrases
     } else if (qc instanceof PhraseQuery) {
       // The clause is itself a PhraseQuery (seen with hyphenated words):
       // turn its terms into an exact, in-order span.
       PhraseQuery pq = (PhraseQuery) qc;
       Term[] subterms = pq.getTerms();

       SpanQuery[] clauses = new SpanQuery[subterms.length];
       for (int j = 0; j < subterms.length; j++) {
         clauses[j] = new SpanTermQuery(subterms[j]);
       }
       allSpanClauses[i] = new SpanNearQuery(clauses, 0, true);
     // END FIX
     } else {
       throw new IllegalArgumentException("Unknown query type \""
           + qc.getClass().getName()
           + "\" found in phrase query string \""
           + phrasedQueryStringContents + "\"");
     }






 Wildcards, ORs etc inside Phrase queries
 

 Key: LUCENE-1486
 URL: https://issues.apache.org/jira/browse/LUCENE-1486
 Project: Lucene - Java
  Issue Type: Improvement
  Components: QueryParser
Affects Versions: 2.4
Reporter: Mark Harwood
Priority: Minor
 Fix For: 4.0

 Attachments: ComplexPhraseQueryParser.java, 
 junit_complex_phrase_qp_07_21_2009.patch, 
 junit_complex_phrase_qp_07_22_2009.patch, Lucene-1486 non default 
 field.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, 
 LUCENE-1486.patch, LUCENE-1486.patch, TestComplexPhraseQuery.java


 An extension to the default QueryParser that overrides the parsing of 
 PhraseQueries to allow more complex syntax e.g. wildcards in phrase queries.
 The implementation feels a little hacky - this is arguably better handled in 
 QueryParser itself. This works as a proof of concept for much of the query 
 parser syntax. Examples from the Junit test include:
   checkMatches("\"j*   smyth~\"", "1,2"); //wildcards and fuzzies are OK in phrases
   checkMatches("\"(jo* -john)  smith\"", "2"); // boolean logic works
   checkMatches("\"jo*  smith\"~2", "1,2,3"); // position logic works.
   
   checkBadQuery("\"jo*  id:1 smith\""); //mixing fields in a phrase is bad
   checkBadQuery("\"jo* \"smith\" \""); //phrases inside phrases is bad
   checkBadQuery("\"jo* [sma TO smZ]\" \""); //range queries inside phrases not supported
 Code plus Junit test to follow...




Re: Does expungeDeletes need calling during an optimize?

2010-08-19 Thread Eric Pugh
It is deprecated!  

Mostly I was trying to better understand the specifics of what expungeDeletes 
means, and so was looking at the codebase.  In terms of the actual impact of 
doing expungeDeletes, it seems like it's a mini optimize that you would use 
in an environment where you are indexing and deleting documents rapidly, and 
don't want file size to grow too quickly, yet can't afford the cost of frequent 
optimize operations?

Eric



On Aug 19, 2010, at 1:17 AM, Lance Norskog wrote:

 Isn't DUH deprecated by now? Should anyone use it? Does it still work?
 
 On Wed, Aug 18, 2010 at 2:15 PM, Eric Pugh
 ep...@opensourceconnections.com wrote:
 Okay, I tried to clarify wiki page.
 
 Eric
 
 On Aug 18, 2010, at 4:56 PM, Yonik Seeley wrote:
 
 On Wed, Aug 18, 2010 at 4:46 PM, Eric Pugh
 ep...@opensourceconnections.com wrote:
 So would it make sense to update the wiki page to say the expungeDeletes 
 only makes sense as a commit parameter, not an optimize parameter?
 
 Perhaps a better way of putting it is that expungeDeletes on an
 optimize call is redundant - an optimize always expunges deletes.
 
 -Yonik
 http://www.lucidimagination.com
 
 
 
 -
 Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | 
 http://www.opensourceconnections.com
 Co-Author: Solr 1.4 Enterprise Search Server available from 
 http://www.packtpub.com/solr-1-4-enterprise-search-server
 Free/Busy: http://tinyurl.com/eric-cal
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 -- 
 Lance Norskog
 goks...@gmail.com
 
 

-
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com
Co-Author: Solr 1.4 Enterprise Search Server available from 
http://www.packtpub.com/solr-1-4-enterprise-search-server
Free/Busy: http://tinyurl.com/eric-cal












[jira] Updated: (SOLR-1316) Create autosuggest component

2010-08-19 Thread Andrzej Bialecki (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated SOLR-1316:


Attachment: SOLR-1316.patch

Updated patch.

 Create autosuggest component
 

 Key: SOLR-1316
 URL: https://issues.apache.org/jira/browse/SOLR-1316
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4
Reporter: Jason Rutherglen
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: Next

 Attachments: SOLR-1316.patch, SOLR-1316.patch, SOLR-1316.patch, 
 SOLR-1316.patch, SOLR-1316.patch, SOLR-1316.patch, SOLR-1316.patch, 
 SOLR-1316.patch, SOLR-1316.patch, suggest.patch, suggest.patch, 
 suggest.patch, TST.zip

   Original Estimate: 96h
  Remaining Estimate: 96h





[jira] Created: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.

2010-08-19 Thread Robert Muir (JIRA)
Allow customizing how WordDelimiterFilter tokenizes text.
-

 Key: SOLR-2059
 URL: https://issues.apache.org/jira/browse/SOLR-2059
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Robert Muir
Priority: Minor
 Fix For: 3.1, 4.0
 Attachments: SOLR-2059.patch

By default, WordDelimiterFilter assigns 'types' to each character (computed 
from Unicode Properties).
Based on these types and the options provided, it splits and concatenates text.

In some circumstances, you might need to tweak this behavior.
The filter seems to have been designed with this in mind, since you can pass in a 
custom byte[] type table.
But it's not exposed in the factory.

I think you should be able to customize the defaults with a configuration file:
{noformat}
# A customized type mapping for WordDelimiterFilterFactory
# the allowable types are: LOWER, UPPER, ALPHA, DIGIT, ALPHANUM, SUBWORD_DELIM
# 
# the default for any character without a mapping is always computed from 
# Unicode character properties

# Map the $, %, '.', and ',' characters to DIGIT 
# This might be useful for financial data.
$ = DIGIT
% = DIGIT
. = DIGIT
\u002C = DIGIT
{noformat}
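
To make the proposed file format concrete, here is a rough sketch of how such a mapping could be read. This is not the attached patch: the class TypeMappingSketch, its enum and its methods are hypothetical, and it assumes that every line starting with # is a comment and that a character is written either literally or as a backslash-u escape with four hex digits.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reader for the mapping format shown above; names are illustrative only.
final class TypeMappingSketch {

    /** The type names listed in the sample file's header comment. */
    enum CharType { LOWER, UPPER, ALPHA, DIGIT, ALPHANUM, SUBWORD_DELIM }

    /** Parses "char = TYPE" lines, skipping blank lines and lines that start with '#'. */
    static Map<Character, CharType> parse(Iterable<String> lines) {
        Map<Character, CharType> table = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // assumption: '#' always introduces a comment
            }
            int eq = line.lastIndexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Expected 'char = TYPE': " + raw);
            }
            String lhs = line.substring(0, eq).trim();
            String rhs = line.substring(eq + 1).trim();
            table.put(parseChar(lhs), CharType.valueOf(rhs));
        }
        return table;
    }

    /** Accepts a single literal character or a six-character escape with four hex digits. */
    static char parseChar(String s) {
        if (s.length() == 1) {
            return s.charAt(0);
        }
        if (s.length() == 6 && s.startsWith("\\u")) {
            return (char) Integer.parseInt(s.substring(2), 16);
        }
        throw new IllegalArgumentException("Not a single character: " + s);
    }

    public static void main(String[] args) {
        Map<Character, CharType> table = parse(java.util.Arrays.asList(
                "# Map the $ and ',' characters to DIGIT",
                "$ = DIGIT",
                "\\u002C = DIGIT"));
        System.out.println(table);
    }
}
```

Characters without an entry are simply absent from the map, which matches the stated rule that unmapped characters keep the type computed from Unicode properties.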





[jira] Updated: (SOLR-2059) Allow customizing how WordDelimiterFilter tokenizes text.

2010-08-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-2059:
--

Attachment: SOLR-2059.patch

 Allow customizing how WordDelimiterFilter tokenizes text.
 -

 Key: SOLR-2059
 URL: https://issues.apache.org/jira/browse/SOLR-2059
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Robert Muir
Priority: Minor
 Fix For: 3.1, 4.0

 Attachments: SOLR-2059.patch






[jira] Commented: (SOLR-1316) Create autosuggest component

2010-08-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12900282#action_12900282
 ] 

Robert Muir commented on SOLR-1316:
---

{quote}
and all tests pass, including those that test for correctness of returned 
suggestions and consistency between Jaspell and TST. I also ran testBenchmark() 
and differences in timings are negligible.
{quote}

Thanks Andrzej!


 Create autosuggest component
 

 Key: SOLR-1316
 URL: https://issues.apache.org/jira/browse/SOLR-1316
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4
Reporter: Jason Rutherglen
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: Next

 Attachments: SOLR-1316.patch, SOLR-1316.patch, SOLR-1316.patch, 
 SOLR-1316.patch, SOLR-1316.patch, SOLR-1316.patch, SOLR-1316.patch, 
 SOLR-1316.patch, SOLR-1316.patch, suggest.patch, suggest.patch, 
 suggest.patch, TST.zip

   Original Estimate: 96h
  Remaining Estimate: 96h





[jira] Updated: (LUCENE-2608) Allow for specification of spell checker accuracy when calling suggestSimilar

2010-08-19 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated LUCENE-2608:


Attachment: LUCENE-2608.patch

Adds the accuracy functionality to the Lucene spell checker and also adds 
support to Solr.  For Lucene, the change is backward compatible.  For Solr, it 
is not backward compatible for those who implement their own SolrSpellChecker, 
as I introduced a more future-proof parameter passing capability.  I also added 
a means for per-implementation parameters to be passed in.  Interpretation of 
those entries is entirely up to the implementation.
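
The design point, passing accuracy per call instead of keeping it as shared state, can be sketched as follows. This is a toy stand-in and not the spell checker code from the patch: the class AccuracySketch, its similarity measure and its method signatures are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in showing accuracy as a per-call argument; NOT the Lucene SpellChecker.
final class AccuracySketch {

    /** Toy similarity: share of positions where both strings agree, over the longer length. */
    static float similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        if (max == 0) {
            return 1.0f;
        }
        int same = 0;
        for (int i = 0; i < Math.min(a.length(), b.length()); i++) {
            if (a.charAt(i) == b.charAt(i)) {
                same++;
            }
        }
        return (float) same / max;
    }

    /** Returns dictionary entries whose similarity to word is at least the given accuracy. */
    static List<String> suggestSimilar(String word, List<String> dictionary, float accuracy) {
        List<String> out = new ArrayList<>();
        for (String candidate : dictionary) {
            if (similarity(word, candidate) >= accuracy) {
                out.add(candidate);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dict = java.util.Arrays.asList("lucene", "lucent", "solr");
        // Two callers can use different thresholds without touching shared state.
        System.out.println(suggestSimilar("lucene", dict, 0.8f));
        System.out.println(suggestSimilar("lucene", dict, 0.9f));
    }
}
```

Because the threshold travels with the call, concurrent requests with different accuracy settings do not interfere with each other.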

 Allow for specification of spell checker accuracy when calling suggestSimilar
 -

 Key: LUCENE-2608
 URL: https://issues.apache.org/jira/browse/LUCENE-2608
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/spellchecker
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Attachments: LUCENE-2608.patch


 There is really no need for accuracy to be a class variable in the 
 Spellchecker




RE: Combined Lucene/Solr Issues

2010-08-19 Thread Uwe Schindler
For me it does not matter, but when I open new issues, I do it against the 
project where the “bug” is visible. If there is also code committed to Solr, 
but the main task is in Lucene, this is fine.

 

If we have a new functionality that affects both projects, you can create a 
secondary issue in the other project and link them.

 

Uwe

 

-

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: u...@thetaphi.de

 

From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Wednesday, August 18, 2010 10:03 PM
To: dev@lucene.apache.org; simon.willna...@gmail.com
Cc: yo...@lucidimagination.com
Subject: Re: Combined Lucene/Solr Issues

 

 

On Wed, Aug 18, 2010 at 3:55 PM, Simon Willnauer 
simon.willna...@googlemail.com wrote:

 

I would appreciate creating two issues and use one only for reference
and link it by the one which contains patches and discussion if the
changes are large. Using SOLR- vs. LUCENE- I'd decide on a case by
case basis depending which project / codebase might undergo the
most significant changes. Generally,  referencing the issues in
CHANGES.TXT sounds like a good idea.

 

I don't think this is realistic. Often a patch needs to change Lucene and Solr 
code in one commit.

 

Personally, I don't waste any time thinking about whether the issue is SOLR or 
LUCENE, and I think two JIRAs is actually confusing.


-- 
Robert Muir
rcm...@gmail.com



[jira] Updated: (SOLR-2058) Adds optional phrase slop to pf2, pf3 and pf parameters with field~slop^boost syntax

2010-08-19 Thread Ron Mayer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Mayer updated SOLR-2058:


Summary: Adds optional phrase slop to pf2, pf3 and pf parameters 
with field~slop^boost syntax  (was: Adds optional slop to pf2, pf3 and 
pf parameters )

 Adds optional phrase slop to pf2, pf3 and pf parameters with 
 field~slop^boost syntax
 

 Key: SOLR-2058
 URL: https://issues.apache.org/jira/browse/SOLR-2058
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 3.1, 4.0
 Environment: n/a
Reporter: Ron Mayer
Priority: Minor
 Attachments: pf2_with_slop.patch

   Original Estimate: 168h
  Remaining Estimate: 168h





[jira] Updated: (SOLR-2058) Adds optional phrase slop to pf2, pf3 and pf parameters with field~slop^boost syntax

2010-08-19 Thread Ron Mayer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Mayer updated SOLR-2058:


 Original Estimate: (was: 168h)
Remaining Estimate: (was: 168h)

 Adds optional phrase slop to pf2, pf3 and pf parameters with 
 field~slop^boost syntax
 

 Key: SOLR-2058
 URL: https://issues.apache.org/jira/browse/SOLR-2058
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 3.1, 4.0
 Environment: n/a
Reporter: Ron Mayer
Priority: Minor
 Attachments: pf2_with_slop.patch


 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3e
 {quote}
 From  Ron Mayer r...@0ape.com
  my results might
  be even better if I had a couple different pf2s with different ps's
  at the same time.
  In particular.   One with ps=0 to put a high boost on ones the have
  the right ordering of words.  For example insuring that:
   red hat black jacket
  boosts only red hats and not black hats.
  And another pf2 with a more modest boost with ps=5 or so to handle
  the query above also boosting docs with red baseball hat.
 {quote}
 [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3e]
 {quote}
 From  Yonik Seeley yo...@lucidimagination.com
 Perhaps fold it into the pf/pf2 syntax?
 pf=text^2// current syntax... makes phrases with a boost of 2
 pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
 a boost of 2
 That actually seems pretty natural given the lucene query syntax - an
 actual boosted sloppy phrase query already looks like
 text:foo bar~1^2
 -Yonik
 http
 {quote}
 [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3e]
 {quote}
 From  Chris Hostetter hossman_luc...@fucit.org
 Big +1 to this idea ... the existing ps param can stick around as the 
 default for any field that doesn't specify its own slop in the pf/pf2/pf3 
 fields using the ~ syntax.
 -Hoss
 {quote}
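
To make the proposed field~slop^boost syntax concrete, here is a minimal sketch of how one pf/pf2/pf3 value might be split into field, slop and boost, with the existing ps value as the fallback slop. It is not taken from pf2_with_slop.patch; the names PhraseField and parse_pf and the default boost of 1.0 are assumptions made only for illustration.

```python
import re
from typing import List, NamedTuple


class PhraseField(NamedTuple):
    """One entry of a pf/pf2/pf3 parameter (hypothetical helper type)."""
    field: str
    slop: int
    boost: float


# field, then optional ~slop, then optional ^boost, e.g. "text~1^2"
_PF_ENTRY = re.compile(
    r"^(?P<field>[^~^\s]+)(?:~(?P<slop>\d+))?(?:\^(?P<boost>\d+(?:\.\d+)?))?$"
)


def parse_pf(value: str, default_slop: int = 0) -> List[PhraseField]:
    """Parse a whitespace-separated pf value.

    default_slop plays the role of the existing ps parameter: it is used
    for any field that does not carry its own ~slop.
    """
    fields = []
    for entry in value.split():
        match = _PF_ENTRY.match(entry)
        if match is None:
            raise ValueError("cannot parse pf entry: %r" % entry)
        slop = match.group("slop")
        boost = match.group("boost")
        fields.append(
            PhraseField(
                field=match.group("field"),
                slop=int(slop) if slop is not None else default_slop,
                # Assumption: an entry without ^boost gets a neutral boost.
                boost=float(boost) if boost is not None else 1.0,
            )
        )
    return fields
```

With this sketch, parse_pf("text~1^2 title", default_slop=3) gives text a slop of 1 and a boost of 2, while title falls back to the ps-style default of 3.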

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2608) Allow for specification of spell checker accuracy when calling suggestSimilar

2010-08-19 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900302#action_12900302
 ] 

Grant Ingersoll commented on LUCENE-2608:
-

Committed revision 987179.  (Trunk)

 Allow for specification of spell checker accuracy when calling suggestSimilar
 -

 Key: LUCENE-2608
 URL: https://issues.apache.org/jira/browse/LUCENE-2608
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/spellchecker
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Attachments: LUCENE-2608.patch


 There is really no need for accuracy to be a class variable in the 
 Spellchecker
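
The idea of passing accuracy per call rather than keeping it as shared state can be sketched as below. This is only an illustration under stated assumptions: it is not the SpellChecker API from LUCENE-2608.patch, the function name and parameters are hypothetical, and Python's difflib ratio stands in for whatever string distance the real spell checker uses.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def suggest_similar(word: str,
                    dictionary: Iterable[str],
                    num_suggestions: int = 5,
                    accuracy: float = 0.5) -> List[str]:
    """Return up to num_suggestions dictionary words scoring >= accuracy.

    accuracy is an argument of this one call, so two callers can use
    different thresholds without touching shared state.
    """
    scored = []
    for candidate in dictionary:
        if candidate == word:
            continue  # the word itself is not a suggestion
        # Stand-in similarity measure in [0, 1]; an assumption, not Lucene's.
        score = SequenceMatcher(None, word, candidate).ratio()
        if score >= accuracy:
            scored.append((score, candidate))
    # Best score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for _, candidate in scored[:num_suggestions]]
```

A caller that wants only close matches passes a high accuracy; another caller can pass a lower one in the same process without any setter call in between.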




[jira] Updated: (SOLR-2010) Improvements to SpellCheckComponent Collate functionality

2010-08-19 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-2010:
-

Attachment: SOLR-2010.patch

Third version (with .patch extension.  I had used .txt extension with 2nd 
version).  Works with trunk rev#986945.

This time SpellCheckCollator calls the SearchHandler instead of calling the 
QueryComponent.  This required exposing a reference to the SearchHandler on the 
ResponseBuilder.  Also a new overloaded method in 
SearchHandler.processRequestBody() lets you override the list of components to 
run.  In this case we just have it run QueryComponent.

This revision has 2 potential benefits: 
 
(1) the overloaded method in SearchHandler may prove useful to other components 
in the future.  

(2) there may be a way to get SearchHandler to requery all the shards at once 
and then there would be no need to reintegrate the Collations in 
SearchHandler.finishStage().  However, see my comment in SpellCheckCollator 
lines 56-57.  Likely I am calling SpellCheckCollator during the wrong stage 
of the distributed request, but I need to find out more specifically how 
shards work to determine how to further improve this here.  As time allows I 
will do my own investigating but anyone's advice would be greatly appreciated.

Finally, this version corrects a bug that would have caused one of the test 
scenarios in DistributedSpellCheckComponentTest to fail.  Unfortunately in the 
2nd version, I had left some scenarios commented-out and did not catch this 
until now.


 Improvements to SpellCheckComponent Collate functionality
 -

 Key: SOLR-2010
 URL: https://issues.apache.org/jira/browse/SOLR-2010
 Project: Solr
  Issue Type: New Feature
  Components: clients - java, spellchecker
Affects Versions: 1.4.1
 Environment: Tested against trunk revision 966633
Reporter: James Dyer
Assignee: Grant Ingersoll
Priority: Minor
 Attachments: SOLR-2010.patch, SOLR-2010.patch, SOLR-2010.patch, 
 SOLR-2010.txt


 Improvements to SpellCheckComponent Collate functionality
 Our project requires a better Spell Check Collator.  I'm contributing this as 
 a patch to get suggestions for improvements and in case there is a broader 
 need for these features.
 1. Only return collations that are guaranteed to result in hits if re-queried 
 (applying original fq params also).  This is especially helpful when there is 
 more than one correction per query.  The 1.4 behavior does not verify that a 
 particular combination will actually return hits.
 2. Provide the option to get multiple collation suggestions
 3. Provide extended collation results including the # of hits re-querying 
 will return and a breakdown of each misspelled word and its correction.
 This patch is similar to what is described in SOLR-507 item #1.  Also, this 
 patch provides a viable workaround for the problem discussed in SOLR-1074.  A 
 dictionary could be created that combines the terms from the multiple fields. 
  The collator then would prune out any spurious suggestions this would cause.
 This patch adds the following spellcheck parameters:
 1. spellcheck.maxCollationTries - maximum # of collation possibilities to try 
 before giving up.  Lower values ensure better performance.  Higher values may 
 be necessary to find a collation that can return results.  Default is 0, 
 which maintains backwards-compatible behavior (do not check collations).
 2. spellcheck.maxCollations - maximum # of collations to return.  Default is 
 1, which maintains backwards-compatible behavior.
 3. spellcheck.collateExtendedResult - if true, returns an expanded response 
 format detailing collations found.  default is false, which maintains 
 backwards-compatible behavior.  When true, output is like this (in context):
 <lst name="spellcheck">
   <lst name="suggestions">
   <lst name="hopq">
   <int name="numFound">94</int>
   <int name="startOffset">7</int>
   <int name="endOffset">11</int>
   <arr name="suggestion">
   <str>hope</str>
   <str>how</str>
   <str>hope</str>
   <str>chops</str>
   <str>hoped</str>
   etc
   </arr>
   <lst name="faill">
   <int name="numFound">100</int>
   <int name="startOffset">16</int>
   <int name="endOffset">21</int>
   <arr name="suggestion">
   <str>fall</str>
   <str>fails</str>
   <str>fail</str>
   <str>fill</str>
   <str>faith</str>
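
The interplay of spellcheck.maxCollationTries and spellcheck.maxCollations described in the issue can be sketched as a small loop. This is an illustration only, not the SpellCheckCollator code from the patch: the function name, the count_hits callback and the (query, hits) result shape are assumptions.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def verified_collations(candidates: Iterable[str],
                        count_hits: Callable[[str], int],
                        max_tries: int = 0,
                        max_collations: int = 1
                        ) -> List[Tuple[str, Optional[int]]]:
    """Pick collations, optionally verifying that each one returns hits.

    max_tries == 0 mirrors the backwards-compatible default: candidates
    are returned unchecked and the hit count is unknown (None).
    """
    results: List[Tuple[str, Optional[int]]] = []
    if max_tries <= 0:
        for query in candidates:
            if len(results) >= max_collations:
                break
            results.append((query, None))
        return results

    tries = 0
    for query in candidates:
        if tries >= max_tries or len(results) >= max_collations:
            break
        tries += 1
        # Hypothetical callback that re-runs the query (with the original
        # fq params) and reports how many documents it would match.
        hits = count_hits(query)
        if hits > 0:
            results.append((query, hits))
    return results
```

Lower max_tries values bound the number of extra queries, which is the performance trade-off the parameter description mentions.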
  

DocFrequencyValueSource

2010-08-19 Thread Ryan McKinley
Is there any general interest in a DocFrequencyValueSource?
https://issues.apache.org/jira/browse/SOLR-1694

Point is to let you sort on the frequency of a term

It is pretty simple and i think general

ryan




Re: DocFrequencyValueSource

2010-08-19 Thread Yonik Seeley
On Thu, Aug 19, 2010 at 12:01 PM, Ryan McKinley ryan...@gmail.com wrote:
 Is there any general interest in a DocFrequencyValueSource?
 https://issues.apache.org/jira/browse/SOLR-1694

 Point is to let you sort on the frequency of a term

The frequency of a term within a field?  That would be term frequency?
Anyway, the following issue has already been committed and might meet
your needs?
https://issues.apache.org/jira/browse/SOLR-1932
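
For readers following the terminology question, the difference between term frequency and document frequency can be shown on a toy corpus. This is a plain-Python sketch, not the ValueSource code from SOLR-1694 or SOLR-1932; the corpus and function names are made up for illustration.

```python
from typing import Dict, List

# Toy corpus: document id -> tokens of one field (made-up data).
EXAMPLE_DOCS: Dict[int, List[str]] = {
    1: ["solr", "lucene", "solr"],
    2: ["lucene"],
    3: ["solr"],
}


def term_freq(tokens: List[str], term: str) -> int:
    """How often the term occurs in ONE document's field."""
    return tokens.count(term)


def doc_freq(corpus: Dict[int, List[str]], term: str) -> int:
    """How many documents contain the term at least once."""
    return sum(1 for tokens in corpus.values() if term in tokens)


def sort_by_term_freq(corpus: Dict[int, List[str]], term: str) -> List[int]:
    """Document ids ordered by descending term frequency."""
    return sorted(corpus, key=lambda doc_id: term_freq(corpus[doc_id], term),
                  reverse=True)
```

Document frequency is one number per term for the whole index, so only term frequency varies per document and is useful as a sort key.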

-Yonik
http://www.lucidimagination.com




[jira] Resolved: (SOLR-1694) DocFrequencyValueSource

2010-08-19 Thread Ryan McKinley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-1694.
-

Resolution: Duplicate

included within SOLR-1932

 DocFrequencyValueSource
 ---

 Key: SOLR-1694
 URL: https://issues.apache.org/jira/browse/SOLR-1694
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Ryan McKinley
Priority: Minor
 Attachments: SOLR-1694-DocFrequencyValueSource.patch


 A ValueSource to expose the document frequency for a given field.




Re: DocFrequencyValueSource

2010-08-19 Thread Ryan McKinley
ah yes -- i'll take a stab at porting SOLR-1932 to 3.1.x


On Thu, Aug 19, 2010 at 12:24 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
 On Thu, Aug 19, 2010 at 12:01 PM, Ryan McKinley ryan...@gmail.com wrote:
 Is there any general interest in a DocFrequencyValueSource?
 https://issues.apache.org/jira/browse/SOLR-1694

 Point is to let you sort on the frequency of a term

 The frequency of a term within a field?  That would be term frequency?
 Anyway, the following issue has already been committed and might meet
 your needs?
 https://issues.apache.org/jira/browse/SOLR-1932

 -Yonik
 http://www.lucidimagination.com




RE: Re: [Hudson] New Hudson master now running

2010-08-19 Thread Chris Hostetter

: That was my question :-) Who is the moderator?

Per joe on #asfinfra it's ehatcher and husted.

(We should probably file a request to have some more moderators added .. 
volunteers?)


-Hoss





RE: Combined Lucene/Solr Issues

2010-08-19 Thread Chris Hostetter
: For me it does not matter, but when I open new issues, I do it against 
: the project where the “bug” is visible. If there is also code committed 
: to Solr, but the main task is Lucene this is fine.

Right ... i think it's handy to still have the SOLR bug queue for people 
to file bugs against Solr, if they wind up requiring fixes further down 
the tree then so be it.

: Personally, i don't waste any time thinking about whether the issue is 
: SOLR or LUCENE, and I think two JIRAs is actually confusing.

If you know from the outset when you create an issue (ie: tracking an 
improvement, or a new feature) that it requires updating the whole tree 
then it should definitely be a LUCENE issue.  Even if you aren't sure, it 
makes sense to start using LUCENE, but having SOLR around for Solr users 
to file bugs is handy.

Worst case scenario: if it starts out as a SOLR issue and then the scope 
gets bigger, creating a new LUCENE issue to track it (and linking the two) 
seems trivial to me.

As far as referencing LUCENE-* issues directly in Solr's CHANGES.txt -- 
sure, why not?



-Hoss



Hudson build is back to normal : Solr-trunk #1224

2010-08-19 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Solr-trunk/1224/






/trunk modules ant dist-maven issues

2010-08-19 Thread Ryan McKinley
Is anyone building maven artifacts on trunk?

I'm unable to run the modules component, from:
/trunk/modules/analys, if i run:

$ ant dist-maven

I eventually get:

[artifact:deploy]
[artifact:deploy] Error deploying artifact
'org.apache.lucene:lucene-analyzers:jar': Error deploying artifact:
File 
C:\workspace\apache\lucene\modules\analysis\build\common\lucene-analyzers-4.0-dev.jar
does not exist
[artifact:deploy]

it is looking for 'lucene-analyzers-4.0-dev.jar' however the file that
does exist is 'lucene-analyzers-common-4.0-dev.jar'

Am I doing something wrong, or is anyone building modules for maven?

Thanks
ryan




Re: [Hudson] New Hudson master now running

2010-08-19 Thread Erik Hatcher
Apologies for the delays, ping me in real-time on #lucene in IRC if  
you need me, all list mail to me is a huge big black hole.


I've auth'd in the hudson mails now, so should be coming through from  
now on.  Lemme know (via IRC please) if there's still an issue I can  
help with.


Erik

On Aug 19, 2010, at 2:07 PM, Chris Hostetter wrote:



: That was my question :-) Who is the moderator?

Per joe on #asfinfra it's ehatcher and husted.

(We should probably file a request to have some more moderators  
added ..

volunteers?)


-Hoss





Re: /trunk modules ant dist-maven issues

2010-08-19 Thread Robert Muir
On Thu, Aug 19, 2010 at 2:46 PM, Ryan McKinley ryan...@gmail.com wrote:

 Is anyone building maven artifacts on trunk?

 I'm unable to run the modules component, from:
 /trunk/modules/analys, if i run:

 $ ant dist-maven

 I eventually get:

 [artifact:deploy]
 [artifact:deploy] Error deploying artifact
 'org.apache.lucene:lucene-analyzers:jar': Error deploying artifact:
 File
 C:\workspace\apache\lucene\modules\analysis\build\common\lucene-analyzers-4.0-dev.jar
 does not exist
 [artifact:deploy]

 it is looking for 'lucene-analyzers-4.0-dev.jar' however the file that
 does exist is 'lucene-analyzers-common-4.0-dev.jar'

 Am I doing something wrong, or is anyone building modules for maven?


I doubt maven works at all with modules... was probably broken in shuffling
things around. if you want and know how to fix it please don't hesitate

-- 
Robert Muir
rcm...@gmail.com


Re: /trunk modules ant dist-maven issues

2010-08-19 Thread Ryan McKinley
thanks --- i almost have it working, and will post shortly


On Thu, Aug 19, 2010 at 2:53 PM, Robert Muir rcm...@gmail.com wrote:


 On Thu, Aug 19, 2010 at 2:46 PM, Ryan McKinley ryan...@gmail.com wrote:

 Is anyone building maven artifacts on trunk?

 I'm unable to run the modules component, from:
 /trunk/modules/analys, if i run:

 $ ant dist-maven

 I eventually get:

 [artifact:deploy]
 [artifact:deploy] Error deploying artifact
 'org.apache.lucene:lucene-analyzers:jar': Error deploying artifact:
 File
 C:\workspace\apache\lucene\modules\analysis\build\common\lucene-analyzers-4.0-dev.jar
 does not exist
 [artifact:deploy]

 it is looking for 'lucene-analyzers-4.0-dev.jar' however the file that
 does exist is 'lucene-analyzers-common-4.0-dev.jar'

 Am I doing something wrong, or is anyone building modules for maven?


 I doubt maven works at all with modules... was probably broken in shuffling
 things around. if you want and know how to fix it please don't hesitate

 --
 Robert Muir
 rcm...@gmail.com





[jira] Updated: (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax

2010-08-19 Thread Ron Mayer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Mayer updated SOLR-2058:


Summary: Adds optional phrase slop to edismax pf2, pf3 and pf 
parameters with field~slop^boost syntax  (was: Adds optional phrase slop to 
pf2, pf3 and pf parameters with field~slop^boost syntax)

 Adds optional phrase slop to edismax pf2, pf3 and pf parameters with 
 field~slop^boost syntax
 

 Key: SOLR-2058
 URL: https://issues.apache.org/jira/browse/SOLR-2058
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 3.1, 4.0
 Environment: n/a
Reporter: Ron Mayer
Priority: Minor
 Attachments: pf2_with_slop.patch


 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3e
 {quote}
 From  Ron Mayer r...@0ape.com
  my results might
  be even better if I had a couple different pf2s with different ps's
  at the same time.
  In particular, one with ps=0 to put a high boost on ones that have
  the right ordering of words.  For example, ensuring that:
   red hat black jacket
  boosts only red hats and not black hats.
  And another pf2 with a more modest boost with ps=5 or so to handle
  the query above also boosting docs with red baseball hat.
 {quote}
 [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3e]
 {quote}
 From  Yonik Seeley yo...@lucidimagination.com
 Perhaps fold it into the pf/pf2 syntax?
 pf=text^2// current syntax... makes phrases with a boost of 2
 pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
 a boost of 2
 That actually seems pretty natural given the lucene query syntax - an
 actual boosted sloppy phrase query already looks like
 text:foo bar~1^2
 -Yonik
 http
 {quote}
 [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3e]
 {quote}
 From  Chris Hostetter hossman_luc...@fucit.org
 Big +1 to this idea ... the existing ps param can stick around as the 
 default for any field that doesn't specify its own slop in the pf/pf2/pf3 
 fields using the ~ syntax.
 -Hoss
 {quote}




[jira] Commented: (SOLR-1553) extended dismax query parser

2010-08-19 Thread Ron Mayer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900412#action_12900412
 ] 

Ron Mayer commented on SOLR-1553:
-

I very much like edismax - The pf2 parameter in particular is doing wonders 
for getting my most relevant documents to the very top of the list in one of my 
apps.


 extended dismax query parser
 

 Key: SOLR-1553
 URL: https://issues.apache.org/jira/browse/SOLR-1553
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 1.5, 3.1, 4.0

 Attachments: edismax.unescapedcolon.bug.test.patch, 
 edismax.userFields.patch, SOLR-1553.patch, SOLR-1553.pf-refactor.patch


 An improved user-facing query parser based on dismax




[jira] Updated: (SOLR-2058) Adds optional phrase slop to edismax pf2, pf3 and pf parameters with field~slop^boost syntax

2010-08-19 Thread Ron Mayer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ron Mayer updated SOLR-2058:


Description: 
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3e
{quote}
FromRon Mayer r...@0ape.com
... my results might  be even better if I had a couple different pf2s with 
different ps's at the same time.  In particular, one with ps=0 to put a 
high boost on ones that have the right ordering of words.  For example, ensuring 
that [the query]:
  red hat black jacket
 boosts only documents with red hats and not black hats.   And another pf2 
with a more modest boost with ps=5 or so to handle the query above also 
boosting docs with 
  red baseball hat.
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3e]
{quote}
FromYonik Seeley yo...@lucidimagination.com
Perhaps fold it into the pf/pf2 syntax?

pf=text^2// current syntax... makes phrases with a boost of 2
pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
a boost of 2

That actually seems pretty natural given the lucene query syntax - an
actual boosted sloppy phrase query already looks like
{{text:foo bar~1^2}}

-Yonik
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3e]
{quote}
FromChris Hostetter hossman_luc...@fucit.org

Big +1 to this idea ... the existing ps param can stick around as the 
default for any field that doesn't specify its own slop in the pf/pf2/pf3 
fields using the ~ syntax.

-Hoss
{quote}

  was:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3e
{quote}
FromRon Mayer r...@0ape.com
 my results might
 be even better if I had a couple different pf2s with different ps's
 at the same time.

 In particular, one with ps=0 to put a high boost on ones that have
 the right ordering of words.  For example, ensuring that:
  red hat black jacket
 boosts only red hats and not black hats.

 And another pf2 with a more modest boost with ps=5 or so to handle
 the query above also boosting docs with red baseball hat.
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3e]
{quote}
FromYonik Seeley yo...@lucidimagination.com

Perhaps fold it into the pf/pf2 syntax?

pf=text^2// current syntax... makes phrases with a boost of 2
pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
a boost of 2

That actually seems pretty natural given the lucene query syntax - an
actual boosted sloppy phrase query already looks like
text:foo bar~1^2

-Yonik
http
{quote}

[http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3calpine.deb.1.10.1008161300510.6...@radix.cryptio.net%3e]
{quote}
FromChris Hostetter hossman_luc...@fucit.org

Big +1 to this idea ... the existing ps param can stick around as the 
default for any field that doesn't specify its own slop in the pf/pf2/pf3 
fields using the ~ syntax.

-Hoss
{quote}


 Adds optional phrase slop to edismax pf2, pf3 and pf parameters with 
 field~slop^boost syntax
 

 Key: SOLR-2058
 URL: https://issues.apache.org/jira/browse/SOLR-2058
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Affects Versions: 3.1, 4.0
 Environment: n/a
Reporter: Ron Mayer
Priority: Minor
 Attachments: pf2_with_slop.patch


 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3c4c659119.2010...@0ape.com%3e
 {quote}
 From  Ron Mayer r...@0ape.com
 ... my results might  be even better if I had a couple different pf2s with 
 different ps's at the same time.  In particular, one with ps=0 to put a 
 high boost on ones that have the right ordering of words.  For example, 
 ensuring that [the query]:
   red hat black jacket
  boosts only documents with red hats and not black hats.   And another 
 pf2 with a more modest boost with ps=5 or so to handle the query above also 
 boosting docs with 
   red baseball hat.
 {quote}
 [http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3caanlktimd+v3g6d_mnhp+jykkd+dej8fvmvf_1lqoi...@mail.gmail.com%3e]
 {quote}
 From  Yonik Seeley yo...@lucidimagination.com
 Perhaps fold it into the pf/pf2 syntax?
 pf=text^2// current syntax... makes phrases with a boost of 2
 pf=text~1^2  // proposed syntax... makes phrases with a slop of 1 and
 a boost of 2
 That actually seems pretty natural given the lucene query syntax - an
 actual boosted sloppy phrase query already looks like
 {{text:foo bar~1^2}}
 

[jira] Updated: (SOLR-1566) Allow components to add fields to outgoing documents

2010-08-19 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated SOLR-1566:
--

Attachment: SOLR-1566-gsi.patch

Here's a half-baked patch that I'm putting up b/c Ryan indicated to me he might 
have some time to work on it.  It doesn't have the call back capabilities I 
think we need (that can then hook into Functions) but it does have:

1. Per request storage (although I think we should just re-use the 
SolrQueryRequest.context Map)
2. ResponseWriter integration (i.e. no SolrIndexSearcher integration)



 Allow components to add fields to outgoing documents
 

 Key: SOLR-1566
 URL: https://issues.apache.org/jira/browse/SOLR-1566
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Noble Paul
Assignee: Grant Ingersoll
 Fix For: Next

 Attachments: SOLR-1566-gsi.patch, SOLR-1566.patch, SOLR-1566.patch, 
 SOLR-1566.patch, SOLR-1566.patch


 Currently it is not possible for components to add fields to outgoing 
 documents which are not in the stored fields of the document.  This makes 
 it cumbersome to add computed fields/metadata.




[jira] Commented: (SOLR-2034) javabin should use UTF-8, not modified UTF-8

2010-08-19 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900423#action_12900423
 ] 

Hoss Man commented on SOLR-2034:


bq. I don't think adding many hoops for back compatibility is worth the 
trouble. Note that that does not mean people can not use solrj to talk across 
different versions - they may have to use xml though

Agreed, my chief concern is what happens when someone tries to use SolrJ 1.4 to 
talk to Solr 3.1 w/javabin (or vice versa).

A) If they get an error: great, i'm totally fine with that -- we just document 
that they should use XML in this case.

B) If the commands succeed, but the string data is _always_ corrupted, that's 
not ideal -- but not totally horrible, since the problem should be immediately 
obvious and they should have read the documentation and known not to do that.

C) if the commands succeed, but the string data is _sometimes_ corrupted (as i 
recall, not every character is different in UTF8 vs Java's  modified UTF8, 
correct?) then that seems really bad ... people may start using javabin to 
update their index and not notice for quite some time that big hard to identify 
chunks of their data are corrupted.

as long as someone sanity checks that the situation is either #A or #B before 
committing, i'm totally cool with it ... but #C scares the bejesus out of me.

(i'll try to run some tests myself in the next few days if no one else gets a 
chance)
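
On the question in case C of which characters actually differ: Java's modified UTF-8 encodes U+0000 as the two bytes C0 80 and encodes each supplementary character (above U+FFFF) as a UTF-16 surrogate pair, three bytes per surrogate, while every other character is byte-for-byte the same as standard UTF-8. The sketch below reproduces that encoding in Python so the two can be compared; the function name is made up, and it says nothing about what the javabin codec itself does with such data.

```python
def modified_utf8(text: str) -> bytes:
    """Encode text the way Java's modified UTF-8 does (illustrative sketch)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # NUL is written as an overlong two-byte sequence.
            out += b"\xc0\x80"
        elif cp >= 0x10000:
            # Supplementary character: split into a UTF-16 surrogate pair
            # and write each surrogate as its own three-byte sequence.
            cp -= 0x10000
            for surrogate in (0xD800 | (cp >> 10), 0xDC00 | (cp & 0x3FF)):
                out += bytes([
                    0xE0 | (surrogate >> 12),
                    0x80 | ((surrogate >> 6) & 0x3F),
                    0x80 | (surrogate & 0x3F),
                ])
        else:
            # Everything else matches standard UTF-8 exactly.
            out += ch.encode("utf-8", errors="surrogatepass")
    return bytes(out)


def differs_from_utf8(text: str) -> bool:
    """True if the two encodings disagree for this particular string."""
    return modified_utf8(text) != text.encode("utf-8", errors="surrogatepass")
```

So a string only round-trips differently if it contains NUL or a character outside the Basic Multilingual Plane, which is why corruption from a mismatch would show up in some documents and not others.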


 javabin should use UTF-8, not modified UTF-8
 

 Key: SOLR-2034
 URL: https://issues.apache.org/jira/browse/SOLR-2034
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: SOLR-2034.patch, SOLR-2034.patch


 for better interoperability, javabin should use standard UTF-8 instead of 
 modified UTF-8 (http://www.unicode.org/reports/tr26/)




[jira] Updated: (LUCENE-2608) Allow for specification of spell checker accuracy when calling suggestSimilar

2010-08-19 Thread Grant Ingersoll (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll updated LUCENE-2608:


Attachment: LUCENE-2608-3x.patch

Here's a 3.x version of the patch

 Allow for specification of spell checker accuracy when calling suggestSimilar
 -

 Key: LUCENE-2608
 URL: https://issues.apache.org/jira/browse/LUCENE-2608
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/spellchecker
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Attachments: LUCENE-2608-3x.patch, LUCENE-2608.patch


 There is really no need for accuracy to be a class variable in the 
 Spellchecker




[jira] Commented: (SOLR-2034) javabin should use UTF-8, not modified UTF-8

2010-08-19 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900442#action_12900442
 ] 

Robert Muir commented on SOLR-2034:
---

Hoss man, I hear your concerns but i don't understand how we can address any of 
this.

This is really one of the problems of modified-UTF8, and really my big concern 
with using it (that clients will be wrongly written, see my example above). It's 
not really possible or reasonable to address it at the javabin layer: it needs 
to be done at a higher protocol level.

of course, if we can figure this out, then maybe it would be easy to provide 
back compat too, but i didn't see any obvious places in the code where any 
versioning is written over the wire.


 javabin should use UTF-8, not modified UTF-8
 

 Key: SOLR-2034
 URL: https://issues.apache.org/jira/browse/SOLR-2034
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: SOLR-2034.patch, SOLR-2034.patch


 for better interoperability, javabin should use standard UTF-8 instead of 
 modified UTF-8 (http://www.unicode.org/reports/tr26/)




[jira] Updated: (SOLR-2034) javabin should use UTF-8, not modified UTF-8

2010-08-19 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-2034:
--

Attachment: SOLR-2034.patch

OK, i bumped the byte version as Yonik suggested, and tried to use an old 
client.

Here's the exception:

{noformat}
java.lang.RuntimeException: Invalid version or the data in not in 'javabin' format
    at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
    at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:477)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
{noformat}
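
The behaviour shown in the trace, where an old client fails fast on a bumped version byte instead of silently misreading strings, can be sketched as a one-byte header check. This is an illustration of the general technique only: the constant values and function names below are hypothetical and are not taken from JavaBinCodec.

```python
import io

# Hypothetical version numbers, chosen only for this sketch.
OLD_VERSION = 1
NEW_VERSION = 2


def marshal(payload: bytes, version: int = NEW_VERSION) -> bytes:
    """Prefix the payload with a single version byte."""
    return bytes([version]) + payload


def unmarshal(stream: io.BufferedIOBase, expected_version: int) -> bytes:
    """Read the version byte and refuse anything the reader does not expect."""
    header = stream.read(1)
    if len(header) != 1 or header[0] != expected_version:
        raise RuntimeError(
            "Invalid version or the data is not in the expected format"
        )
    return stream.read()
```

A reader built for OLD_VERSION that receives data written with NEW_VERSION raises immediately, which corresponds to outcome A in the earlier comment: an error rather than corrupted data.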


 javabin should use UTF-8, not modified UTF-8
 

 Key: SOLR-2034
 URL: https://issues.apache.org/jira/browse/SOLR-2034
 Project: Solr
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: SOLR-2034.patch, SOLR-2034.patch, SOLR-2034.patch


 for better interoperability, javabin should use standard UTF-8 instead of 
 modified UTF-8 (http://www.unicode.org/reports/tr26/)




RE: [Lucene-java Wiki] Update of CommittersResources by HossMan

2010-08-19 Thread Uwe Schindler
Thanks for adding this link!

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Apache Wiki [mailto:wikidi...@apache.org]
 Sent: Thursday, August 19, 2010 8:28 PM
 To: Apache Wiki
 Subject: [Lucene-java Wiki] Update of CommittersResources by HossMan
 
 Dear Wiki user,
 
 You have subscribed to a wiki page or wiki category on Lucene-java Wiki for
 change notification.
 
 The CommittersResources page has been changed by HossMan.
 http://wiki.apache.org/lucene-java/CommittersResources?action=diff&rev1=14&rev2=15
 
 --
 
   Information for and by committers to make our lives easier
 
   == Issues, Bugs, JIRA ==
 +
 -   PatchCheckList
 + PatchCheckList
 
   == Releases ==
 
 @@ -27, +28 @@
 
 
   AboutThisWiki
 
 + [[https://svn.apache.org/repos/private/committers/docs/mailinglists.txt|info
 on mailinglist tools, including how to look up the subscription and moderator
 info for mailing lists]]
 +
   == Adding Committers ==
 
   How to add, and things to do, when adding a new Committer...





LUCENE-1910

2010-08-19 Thread Thomas D'Silva
I would like to contribute a class based on the MoreLikeThis class
in contrib/queries that generates a query based on the tags associated
with a document.


I created a patch, LUCENE-1910.patch, that demonstrates this class. Based on
feedback, I moved the information gain calculation code to a separate class
and refactored the MoreLikeThisUsingTags class to use more descriptive
variable names and to include comments and the ASL.

I would appreciate any more feedback or comments.

https://issues.apache.org/jira/browse/LUCENE-1910

Thanks,
Thomas


How to change this site?

2010-08-19 Thread Uwe Schindler
I found this on Apache's website:

http://projects.apache.org/projects/lucene_java.html

How to change the doap and where is it?

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de







RE: How to change this site?

2010-08-19 Thread Uwe Schindler
Problem solved, missed that in the release todo.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Uwe Schindler [mailto:u...@thetaphi.de]
 Sent: Thursday, August 19, 2010 11:45 PM
 To: dev@lucene.apache.org
 Subject: How to change this site?
 
 I found this on Apache's website:
 
 http://projects.apache.org/projects/lucene_java.html
 
 How to change the doap and where is it?
 






RE: svn commit: r987324 - /lucene/java/site/docs/doap.rdf

2010-08-19 Thread Uwe Schindler
 Uwe, you have several copy/paste typos - several of the revision values
 don't match the name values, e.g.:
 
   <Version>
 +   <name>Lucene 3.0.2</name>
 +   <created>2010-06-18</created>
 +   <revision>2.9.3</revision>

Thanks, I fixed this and the other ones. My fault, dumb copy-paste.
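
A small check along these lines could catch this kind of mismatch before
commit. It is only a sketch: it assumes each revision is supposed to equal the
version number in the release name (which is how I read Steve's comment), the
function name mismatched_versions is made up, and the pairs are typed in by
hand from the diff below rather than parsed from doap.rdf.

```python
import re


def mismatched_versions(entries):
    """Return the (name, revision) pairs whose revision differs from the
    version number embedded in the name, e.g. "Lucene 3.0.2" vs "2.9.3".

    Assumption: the name has the form "Lucene <dotted version>".
    """
    bad = []
    for name, revision in entries:
        match = re.fullmatch(r"Lucene (\d+(?:\.\d+)*)", name)
        if match is None or match.group(1) != revision:
            bad.append((name, revision))
    return bad


# (name, revision) pairs as they appear in r987324
entries = [
    ("Lucene 3.0.2", "2.9.3"),
    ("Lucene 2.9.3", "2.9.3"),
    ("Lucene 3.0.1", "2.9.3"),
    ("Lucene 2.9.2", "2.9.2"),
    ("Lucene 3.0.0", "2.9.3"),
    ("Lucene 2.9.1", "2.9.1"),
]

for name, revision in mismatched_versions(entries):
    print(f"{name}: revision is {revision}")
```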

 Steve
 
  -Original Message-
  From: uschind...@apache.org [mailto:uschind...@apache.org]
  Sent: Thursday, August 19, 2010 5:49 PM
  To: java-comm...@lucene.apache.org
  Subject: svn commit: r987324 - /lucene/java/site/docs/doap.rdf
 
  Author: uschindler
  Date: Thu Aug 19 21:48:53 2010
  New Revision: 987324
 
  URL: http://svn.apache.org/viewvc?rev=987324&view=rev
  Log:
  Missed to change that file. Sorry
 
  Modified:
  lucene/java/site/docs/doap.rdf
 
  Modified: lucene/java/site/docs/doap.rdf
  URL: http://svn.apache.org/viewvc/lucene/java/site/docs/doap.rdf?rev=987324&r1=987323&r2=987324&view=diff
 
 
 ==
  
  --- lucene/java/site/docs/doap.rdf (original)
  +++ lucene/java/site/docs/doap.rdf Thu Aug 19 21:48:53 2010
  @@ -28,6 +28,31 @@
     <category rdf:resource="http://projects.apache.org/category/database" />
     <release>
       <Version>
  +      <name>Lucene 3.0.2</name>
  +      <created>2010-06-18</created>
  +      <revision>2.9.3</revision>
  +    </Version>
  +    <Version>
  +      <name>Lucene 2.9.3</name>
  +      <created>2010-06-18</created>
  +      <revision>2.9.3</revision>
  +    </Version>
  +    <Version>
  +      <name>Lucene 3.0.1</name>
  +      <created>2010-02-26</created>
  +      <revision>2.9.3</revision>
  +    </Version>
  +    <Version>
  +      <name>Lucene 2.9.2</name>
  +      <created>2010-02-26</created>
  +      <revision>2.9.2</revision>
  +    </Version>
  +    <Version>
  +      <name>Lucene 3.0.0</name>
  +      <created>2009-11-25</created>
  +      <revision>2.9.3</revision>
  +    </Version>
  +    <Version>
         <name>Lucene 2.9.1</name>
         <created>2009-11-06</created>
         <revision>2.9.1</revision>
 






[jira] Updated: (SOLR-1682) Implement CollapseComponent

2010-08-19 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1682:
---

Attachment: SOLR-1682.patch

Here's an updated patch that merges in Martijn's change and implements some
more tests (using the new JSON test method).  I also went with the name
"doclist" for now.

 Implement CollapseComponent
 ---

 Key: SOLR-1682
 URL: https://issues.apache.org/jira/browse/SOLR-1682
 Project: Solr
  Issue Type: Sub-task
  Components: search
Reporter: Martijn van Groningen
Assignee: Shalin Shekhar Mangar
 Fix For: Next

 Attachments: field-collapsing.patch, SOLR-1682.patch, 
 SOLR-1682.patch, SOLR-1682.patch, SOLR-1682.patch, SOLR-1682.patch, 
 SOLR-1682_prototype.patch, SOLR-1682_prototype.patch, 
 SOLR-1682_prototype.patch, SOLR-236.patch


 Child issue of SOLR-236. This issue is dedicated to field collapsing in 
 general and all its code (CollapseComponent, DocumentCollapsers and 
 CollapseCollectors). The main goal is to finalize the request parameters and 
 response format.






Re: Combined Lucene/Solr Issues

2010-08-19 Thread Simon Willnauer
 Worst case scenario: if it starts out as a SOLR issue and then the scope
 gets bigger, creating a new LUCENE issue to track it (and linking the two)
 seems trivial to me.

Thanks, Hoss, for expressing what I tried to do :) That all makes perfect sense!

simon

On 8/19/10, Grant Ingersoll gsing...@apache.org wrote:

 On Aug 19, 2010, at 2:14 PM, Chris Hostetter wrote:

 : For me it does not matter, but when I open new issues, I do it against
 : the project where the “bug” is visible. If there is also code committed
 : to Solr, but the main task is Lucene this is fine.

 Right ... I think it's handy to still have the SOLR bug queue for people
 to file bugs against Solr, if they wind up requiring fixes further down
 the tree then so be it.

 +1


 : Personally, i don't waste any time thinking about whether the issue is
 : SOLR or LUCENE, and I think two JIRAs is actually confusing.

 If you know from the outset when you create an issue (i.e. tracking an
 improvement, or a new feature) that it requires updating the whole tree
 then it should definitely be a LUCENE issue.  Even if you aren't sure, it
 makes sense to start using LUCENE, but having SOLR around for Solr users
 to file bugs is handy.

 This is what I did for LUCENE-2608.


 Worst case scenario: if it starts out as a SOLR issue and then the scope
 gets bigger, creating a new LUCENE issue to track it (and linking the two)
 seems trivial to me.

 As far as referencing LUCENE-* issues directly in Solr's CHANGES.txt --
 sure, why not?

 Again, I did.

 -Grant






Re: Combined Lucene/Solr Issues

2010-08-19 Thread Yonik Seeley
On Thu, Aug 19, 2010 at 6:25 PM, Simon Willnauer
simon.willna...@googlemail.com wrote:
 Worst case scenario: if it starts out as a SOLR issue and then the scope
 gets bigger, creating a new LUCENE issue to track it (and linking the two)
 seems trivial to me.

 Thanks, Hoss, for expressing what I tried to do :) That all makes perfect sense!

This should not be mandated of course - it can still often make sense
for a single issue to cover changes to lucene and solr (and
lucene/solr modules).

-Yonik
http://www.lucidimagination.com




Hudson build is back to normal : Lucene-trunk #1264

2010-08-19 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Lucene-trunk/1264/changes






[jira] Updated: (SOLR-1682) Implement CollapseComponent

2010-08-19 Thread Yonik Seeley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-1682:
---

Attachment: SOLR-1682.patch

Here's a patch that adds support for retrieving scores also.
I think we're getting close to something committable!

 Implement CollapseComponent
 ---

 Key: SOLR-1682
 URL: https://issues.apache.org/jira/browse/SOLR-1682
 Project: Solr
  Issue Type: Sub-task
  Components: search
Reporter: Martijn van Groningen
Assignee: Shalin Shekhar Mangar
 Fix For: Next

 Attachments: field-collapsing.patch, SOLR-1682.patch, 
 SOLR-1682.patch, SOLR-1682.patch, SOLR-1682.patch, SOLR-1682.patch, 
 SOLR-1682.patch, SOLR-1682_prototype.patch, SOLR-1682_prototype.patch, 
 SOLR-1682_prototype.patch, SOLR-236.patch


 Child issue of SOLR-236. This issue is dedicated to field collapsing in 
 general and all its code (CollapseComponent, DocumentCollapsers and 
 CollapseCollectors). The main goal is to finalize the request parameters and 
 response format.






Hudson build is back to normal : Lucene-3.x #92

2010-08-19 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Lucene-3.x/92/changes


