[jira] Commented: (LUCENE-2167) Implement StandardTokenizer with the UAX#29 Standard

2010-05-27 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872124#action_12872124
 ] 

Uwe Schindler commented on LUCENE-2167:
---

Hi Steven,

Looks cool! I have some suggestions:
- Must it be a Maven plugin? From what I see, the same code could be done as a 
simple Java class with a main() like Robert's ICU converter. The external 
dependency on httpclient can be replaced by plain java.net.HttpURLConnection 
and the URL itself (you can even set the no-cache directives). It's much easier 
from Ant to invoke a Java method as a build step. So why not refactor a little 
bit to use a main() method that accepts the target directory?
- You use the HTML root zone database from IANA. The format of this file is 
hard to parse and may change suddenly. BIND administrators know that the root 
zone file is also available for BIND in the standardized named format at 
[http://www.internic.net/zones/root.zone] (ASCII only, as DNS is ASCII only). 
You just have to use all rows that are not comments and contain "NS" as the 
second token. The nameservers behind it are not used; just use the DNS name 
before it. This should be much easier to do. A Python script may also work 
well (see the sketch below).
- You can also write the Last-Modified header of the HTTP response 
(HttpURLConnection.getLastModified()) into the generated file.
- The database only contains the Punycode-encoded DNS names. But users use the 
non-encoded variants, so you should decode the Punycode, too [we need ICU for 
that :( ] and create patterns for those, too.
- About changes in analyzer behavior because of that: this should not be a 
problem, as IANA only *adds* new zones to the file and very seldom removes 
some (like the old Yugoslavian zones). As e-mails and web addresses should *not* 
appear in tokenized text *before* their TLDs are in the zone file, it's no 
problem that they are suddenly marked as "URL/eMail" later (as they could not 
appear before). So in my opinion we can update the zone database even in minor 
Lucene releases without breaking analyzers.


Fine idea!
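
A minimal sketch of that suggestion, using only plain JDK classes; the URL is 
the one mentioned above, while the class name and output format are illustrative 
rather than anything attached to this issue:

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Date;
import java.util.TreeSet;

// Sketch: fetch root.zone with plain HttpURLConnection (no httpclient dependency),
// keep every non-comment row whose record type is NS, and collect the owner name
// (the TLD). The Last-Modified header is kept so it can be written into the
// generated file.
public class TldListGenerator {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://www.internic.net/zones/root.zone");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setUseCaches(false);                              // no-cache, as suggested
    conn.setRequestProperty("Cache-Control", "no-cache");
    long lastModified = conn.getLastModified();

    TreeSet<String> tlds = new TreeSet<String>();
    BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), "US-ASCII"));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        if (line.length() == 0 || line.charAt(0) == ';') continue; // skip comments
        // a row looks roughly like: "com.  172800  IN  NS  a.gtld-servers.net."
        String[] cols = line.split("\\s+");
        boolean isNsRecord = false;
        for (int i = 1; i < cols.length - 1; i++) {
          if ("NS".equals(cols[i])) { isNsRecord = true; break; }
        }
        if (isNsRecord && cols[0].length() > 0 && !".".equals(cols[0])) {
          tlds.add(cols[0].replaceAll("\\.$", "").toLowerCase());
        }
      }
    } finally {
      in.close();
    }
    System.out.println("; generated from root.zone, Last-Modified: "
        + new Date(lastModified));
    System.out.println(tlds);
  }
}
{code}

A real generator would of course write a JFlex macro (or whatever the grammar 
needs) instead of printing the set.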

> Implement StandardTokenizer with the UAX#29 Standard
> 
>
> Key: LUCENE-2167
> URL: https://issues.apache.org/jira/browse/LUCENE-2167
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/analyzers
>Affects Versions: 3.1
>Reporter: Shyamal Prasad
>Assignee: Steven Rowe
>Priority: Minor
> Attachments: LUCENE-2167-lucene-buildhelper-maven-plugin.patch, 
> LUCENE-2167.benchmark.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> It would be really nice for StandardTokenizer to adhere straight to the 
> standard as much as we can with jflex. Then its name would actually make 
> sense.
> Such a transition would involve renaming the old StandardTokenizer to 
> EuropeanTokenizer, as its javadoc claims:
> bq. This should be a good tokenizer for most European-language documents
> The new StandardTokenizer could then say
> bq. This should be a good tokenizer for most languages.
> All the english/euro-centric stuff like the acronym/company/apostrophe stuff 
> can stay with that EuropeanTokenizer, and it could be used by the european 
> analyzers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Issue Comment Edited: (LUCENE-2167) Implement StandardTokenizer with the UAX#29 Standard

2010-05-27 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872124#action_12872124
 ] 

Uwe Schindler edited comment on LUCENE-2167 at 5/27/10 3:21 AM:


Hi Steven,

Looks cool! I have some suggestions:
- Must it be a Maven plugin? From what I see, the same code could be done as a 
simple Java class with a main() like Robert's ICU converter. The external 
dependency on httpclient can be replaced by plain java.net.HttpURLConnection 
and the URL itself (you can even set the no-cache directives). It's much easier 
from Ant to invoke a Java method as a build step. So why not refactor a little 
bit to use a main() method that accepts the target directory?
- You use the HTML root zone database from IANA. The format of this file is 
hard to parse and may change suddenly. BIND administrators know that the root 
zone file is also available for BIND in the standardized named format at 
[http://www.internic.net/zones/root.zone] (ASCII only, as DNS is ASCII only). 
You just have to use all rows that are not comments and contain "NS" as the 
second token. The nameservers behind it are not used; just use the DNS name 
before it. This should be much easier to do. A Python script may also work well.
- You can also write the Last-Modified header of the HTTP response 
(HttpURLConnection.getLastModified()) into the generated file.
- The database only contains the Punycode-encoded DNS names. But users use the 
non-encoded variants, so you should decode the Punycode, too [we need ICU for 
that :( ] and create patterns for those, too.
- About changes in analyzer behavior because of regeneration: this should not be 
a problem, as IANA only *adds* new zones to the file and very seldom removes 
some (like the old Yugoslavian zones). As e-mails and web addresses should *not* 
appear in tokenized text *before* their TLDs are in the zone file, it's no 
problem that they are suddenly marked as "URL/eMail" later (as they could not 
appear before). So in my opinion we can update the zone database even in minor 
Lucene releases without breaking analyzers.


Fine idea!

  was (Author: thetaphi):
Hi Steven,

Looks cool! I have some suggestions:
- Must it be a Maven plugin? From what I see, the same code could be done as a 
simple Java class with a main() like Robert's ICU converter. The external 
dependency on httpclient can be replaced by plain java.net.HttpURLConnection 
and the URL itself (you can even set the no-cache directives). It's much easier 
from Ant to invoke a Java method as a build step. So why not refactor a little 
bit to use a main() method that accepts the target directory?
- You use the HTML root zone database from IANA. The format of this file is 
hard to parse and may change suddenly. BIND administrators know that the root 
zone file is also available for BIND in the standardized named format at 
[http://www.internic.net/zones/root.zone] (ASCII only, as DNS is ASCII only). 
You just have to use all rows that are not comments and contain "NS" as the 
second token. The nameservers behind it are not used; just use the DNS name 
before it. This should be much easier to do. A Python script may also work well.
- You can also write the Last-Modified header of the HTTP response 
(HttpURLConnection.getLastModified()) into the generated file.
- The database only contains the Punycode-encoded DNS names. But users use the 
non-encoded variants, so you should decode the Punycode, too [we need ICU for 
that :( ] and create patterns for those, too.
- About changes in analyzer behavior because of that: this should not be a 
problem, as IANA only *adds* new zones to the file and very seldom removes 
some (like the old Yugoslavian zones). As e-mails and web addresses should *not* 
appear in tokenized text *before* their TLDs are in the zone file, it's no 
problem that they are suddenly marked as "URL/eMail" later (as they could not 
appear before). So in my opinion we can update the zone database even in minor 
Lucene releases without breaking analyzers.


Fine idea!
  
> Implement StandardTokenizer with the UAX#29 Standard
> 
>
> Key: LUCENE-2167
> URL: https://issues.apache.org/jira/browse/LUCENE-2167
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/analyzers
>Affects Versions: 3.1
>Reporter: Shyamal Prasad
>Assignee: Steven Rowe
>Priority: Minor
> Attachments: LUCENE-2167-lucene-buildhelper-maven-plugin.patch, 
> LUCENE-2167.benchmark.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> It would be really nice for StandardTokenizer to adhere straigh

[jira] Updated: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-27 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-2455:
---

Attachment: LUCENE-2455_trunk.patch

Like the 3x patch, only this one changes IndexFileNames.segmentFileName to take 
another parameter for custom names, and also updates some jdocs to match flex 
(Codecs). I think this is ready to go in.

> Some house cleaning in addIndexes*
> --
>
> Key: LUCENE-2455
> URL: https://issues.apache.org/jira/browse/LUCENE-2455
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Trivial
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_trunk.patch
>
>
> Today, the use of addIndexes and addIndexesNoOptimize is confusing - 
> especially on when to invoke each. Also, addIndexes calls optimize() in 
> the beginning, but only on the target index. It also includes the 
> following jdoc statement, which from how I understand the code, is 
> wrong: _After this completes, the index is optimized._ -- optimize() is 
> called in the beginning and not in the end. 
> On the other hand, addIndexesNoOptimize does not call optimize(), and 
> relies on the MergeScheduler and MergePolicy to handle the merges. 
> After a short discussion about that on the list (Thanks Mike for the 
> clarifications!) I understand that there are really two core differences 
> between the two: 
> * addIndexes supports IndexReader extensions
> * addIndexesNoOptimize performs better
> This issue proposes the following:
> # Clear up the documentation of each, spelling out the pros/cons of 
>   calling them clearly in the javadocs.
> # Rename addIndexesNoOptimize to addIndexes
> # Remove optimize() call from addIndexes(IndexReader...)
> # Document that clearly in both, w/ a recommendation to call optimize() 
>   before on any of the Directories/Indexes if it's a concern. 
> That way, we maintain all the flexibility in the API - 
> addIndexes(IndexReader...) allows for using IR extensions, 
> addIndexes(Directory...) is considered more efficient, by allowing the 
> merges to happen concurrently (depending on MS) and also factors in the 
> MP. So unless you have an IR extension, addDirectories is really the one 
> you should be using. And you have the freedom to call optimize() before 
> each if you care about it, or don't if you don't care. Either way, 
> incurring the cost of optimize() is entirely in the user's hands. 
> BTW, addIndexes(IndexReader...) uses neither the MergeScheduler 
> nor the MergePolicy, but rather calls SegmentMerger directly. This might be 
> another place for improvement. I'll look into it, and if it's not too 
> complicated, I may cover it by this issue as well. If you have any hints 
> that can give me a good head start on that, please don't be shy :). 
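
For illustration, the calling pattern implied by the proposal would look roughly 
like this (assuming the rename lands; a sketch, not code from the attached 
patches):

{code}
import java.io.IOException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;

// Under the proposal, addIndexes(Directory...) (the renamed addIndexesNoOptimize)
// never optimizes implicitly; whether and when to optimize is the caller's choice.
public class AddIndexesSketch {
  public static void mergeInto(IndexWriter target, Directory... sources)
      throws IOException {
    // Optionally optimize each source index first, with its own writer, if the
    // segment count matters to you -- the cost is incurred only if you ask for it.
    target.addIndexes(sources);
    // No implicit optimize() of the target happens here.
  }
}
{code}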

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Thoughts on CMS and SMS

2010-05-27 Thread Shai Erera
Hi

I've been thinking recently about why these two are named the way they are ... 
with an MS we're basically asking two questions: (1) should it block other merges
from happening (or the app thread from continuing), and (2) should it do its
merges concurrently?

SMS answers 'true' to (1) and 'false' to (2), while CMS answers the
opposite.

BUT, there's really no reason why these two are coupled. E.g. someone who
wants to block other merges from running, or hold the app thread until
merges are finished, does not necessarily want the merges to run in
sequence. Those are two different decisions. Even if you want to block the
application thread and other merges, you can still benefit from having the
merges run concurrently.

So, I was thinking that what we really want is a BlockingMS and
NonBlockingMS that can be used according to the answer you look for in (1),
and then we can have variants for both that execute the merges concurrently
or not. I think that serial merging should be supported w/ BlockingMS only,
but am interested in other opinions. One of the scenarios for serial merging
is if the application wants to ensure no additional threads are spawned
other than what it decided to spawn, and therefore it can only be used w/
the BlockingMS. Another scenario is to control IO, but in this case a
NonBlockingSerialMS may fit as well (depending on whether you think other merges
may start while this one is running).
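
A rough sketch of that decomposition (class and field names here are purely 
illustrative, not existing Lucene API):

{code}
// Illustrative only: the point is that "block the caller / other merges" and
// "run merges concurrently" are independent choices, giving four combinations
// instead of the two that SMS and CMS hard-wire today.
public class MergeSchedulingOptions {
  public final boolean blockCaller;       // decision (1): wait until merges finish?
  public final boolean concurrentMerges;  // decision (2): run merges on background threads?

  public MergeSchedulingOptions(boolean blockCaller, boolean concurrentMerges) {
    this.blockCaller = blockCaller;
    this.concurrentMerges = concurrentMerges;
  }

  // Today's SerialMergeScheduler     ~ (blockCaller=true,  concurrentMerges=false)
  // Today's ConcurrentMergeScheduler ~ (blockCaller=false, concurrentMerges=true)
  // The other two combinations are the ones this mail argues are also legitimate.
}
{code}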

In fact, w/o changing much, we can have CMS optionally block other merges /
app thread by waiting until all merges are finished. We may even stick w/
the current SMS/CMS names, just documenting that CMS can also be used to block
threads; only the merges themselves will run concurrently.

What do you think?

Shai


[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-27 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872140#action_12872140
 ] 

Uwe Schindler commented on LUCENE-2455:
---

Should we not add a 3.1 index (created with the HEAD of the 3.x branch) to 
TestBackwardsCompatibility, so we can verify that preflex indexes with the new 
CFS header also work?

> Some house cleaning in addIndexes*
> --
>
> Key: LUCENE-2455
> URL: https://issues.apache.org/jira/browse/LUCENE-2455
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Trivial
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_trunk.patch
>
>
> Today, the use of addIndexes and addIndexesNoOptimize is confusing - 
> especially on when to invoke each. Also, addIndexes calls optimize() in 
> the beginning, but only on the target index. It also includes the 
> following jdoc statement, which from how I understand the code, is 
> wrong: _After this completes, the index is optimized._ -- optimize() is 
> called in the beginning and not in the end. 
> On the other hand, addIndexesNoOptimize does not call optimize(), and 
> relies on the MergeScheduler and MergePolicy to handle the merges. 
> After a short discussion about that on the list (Thanks Mike for the 
> clarifications!) I understand that there are really two core differences 
> between the two: 
> * addIndexes supports IndexReader extensions
> * addIndexesNoOptimize performs better
> This issue proposes the following:
> # Clear up the documentation of each, spelling out the pros/cons of 
>   calling them clearly in the javadocs.
> # Rename addIndexesNoOptimize to addIndexes
> # Remove optimize() call from addIndexes(IndexReader...)
> # Document that clearly in both, w/ a recommendation to call optimize() 
>   before on any of the Directories/Indexes if it's a concern. 
> That way, we maintain all the flexibility in the API - 
> addIndexes(IndexReader...) allows for using IR extensions, 
> addIndexes(Directory...) is considered more efficient, by allowing the 
> merges to happen concurrently (depending on MS) and also factors in the 
> MP. So unless you have an IR extension, addDirectories is really the one 
> you should be using. And you have the freedom to call optimize() before 
> each if you care about it, or don't if you don't care. Either way, 
> incurring the cost of optimize() is entirely in the user's hands. 
> BTW, addIndexes(IndexReader...) uses neither the MergeScheduler 
> nor the MergePolicy, but rather calls SegmentMerger directly. This might be 
> another place for improvement. I'll look into it, and if it's not too 
> complicated, I may cover it by this issue as well. If you have any hints 
> that can give me a good head start on that, please don't be shy :). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-27 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872149#action_12872149
 ] 

Shai Erera commented on LUCENE-2455:


Yes! I'll add them and update the tests. Will post a patch after I get more 
comments.

> Some house cleaning in addIndexes*
> --
>
> Key: LUCENE-2455
> URL: https://issues.apache.org/jira/browse/LUCENE-2455
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Trivial
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_trunk.patch
>
>
> Today, the use of addIndexes and addIndexesNoOptimize is confusing - 
> especially on when to invoke each. Also, addIndexes calls optimize() in 
> the beginning, but only on the target index. It also includes the 
> following jdoc statement, which from how I understand the code, is 
> wrong: _After this completes, the index is optimized._ -- optimize() is 
> called in the beginning and not in the end. 
> On the other hand, addIndexesNoOptimize does not call optimize(), and 
> relies on the MergeScheduler and MergePolicy to handle the merges. 
> After a short discussion about that on the list (Thanks Mike for the 
> clarifications!) I understand that there are really two core differences 
> between the two: 
> * addIndexes supports IndexReader extensions
> * addIndexesNoOptimize performs better
> This issue proposes the following:
> # Clear up the documentation of each, spelling out the pros/cons of 
>   calling them clearly in the javadocs.
> # Rename addIndexesNoOptimize to addIndexes
> # Remove optimize() call from addIndexes(IndexReader...)
> # Document that clearly in both, w/ a recommendation to call optimize() 
>   before on any of the Directories/Indexes if it's a concern. 
> That way, we maintain all the flexibility in the API - 
> addIndexes(IndexReader...) allows for using IR extensions, 
> addIndexes(Directory...) is considered more efficient, by allowing the 
> merges to happen concurrently (depending on MS) and also factors in the 
> MP. So unless you have an IR extension, addDirectories is really the one 
> you should be using. And you have the freedom to call optimize() before 
> each if you care about it, or don't if you don't care. Either way, 
> incurring the cost of optimize() is entirely in the user's hands. 
> BTW, addIndexes(IndexReader...) uses neither the MergeScheduler 
> nor the MergePolicy, but rather calls SegmentMerger directly. This might be 
> another place for improvement. I'll look into it, and if it's not too 
> complicated, I may cover it by this issue as well. If you have any hints 
> that can give me a good head start on that, please don't be shy :). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-27 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872153#action_12872153
 ] 

Shai Erera commented on LUCENE-2455:


Hmm ... I've created the indexes using the 3x branch, copied them to trunk and 
updated TestBackwardsCompatibility to refer to them. All tests pass except for 
testNumericFields. It fails on both CFS and non-CFS indexes, and so I'm not 
sure it's related to this issue at all. The failure is this:

{code}
junit.framework.AssertionFailedError: wrong number of hits expected:<1> but 
was:<0>
at 
org.apache.lucene.index.TestBackwardsCompatibility.testNumericFields(TestBackwardsCompatibility.java:773)
{code}

Can you try to run it on your checkout?

> Some house cleaning in addIndexes*
> --
>
> Key: LUCENE-2455
> URL: https://issues.apache.org/jira/browse/LUCENE-2455
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Trivial
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_trunk.patch
>
>
> Today, the use of addIndexes and addIndexesNoOptimize is confusing - 
> especially on when to invoke each. Also, addIndexes calls optimize() in 
> the beginning, but only on the target index. It also includes the 
> following jdoc statement, which from how I understand the code, is 
> wrong: _After this completes, the index is optimized._ -- optimize() is 
> called in the beginning and not in the end. 
> On the other hand, addIndexesNoOptimize does not call optimize(), and 
> relies on the MergeScheduler and MergePolicy to handle the merges. 
> After a short discussion about that on the list (Thanks Mike for the 
> clarifications!) I understand that there are really two core differences 
> between the two: 
> * addIndexes supports IndexReader extensions
> * addIndexesNoOptimize performs better
> This issue proposes the following:
> # Clear up the documentation of each, spelling out the pros/cons of 
>   calling them clearly in the javadocs.
> # Rename addIndexesNoOptimize to addIndexes
> # Remove optimize() call from addIndexes(IndexReader...)
> # Document that clearly in both, w/ a recommendation to call optimize() 
>   before on any of the Directories/Indexes if it's a concern. 
> That way, we maintain all the flexibility in the API - 
> addIndexes(IndexReader...) allows for using IR extensions, 
> addIndexes(Directory...) is considered more efficient, by allowing the 
> merges to happen concurrently (depending on MS) and also factors in the 
> MP. So unless you have an IR extension, addDirectories is really the one 
> you should be using. And you have the freedom to call optimize() before 
> each if you care about it, or don't if you don't care. Either way, 
> incurring the cost of optimize() is entirely in the user's hands. 
> BTW, addIndexes(IndexReader...) uses neither the MergeScheduler 
> nor the MergePolicy, but rather calls SegmentMerger directly. This might be 
> another place for improvement. I'll look into it, and if it's not too 
> complicated, I may cover it by this issue as well. If you have any hints 
> that can give me a good head start on that, please don't be shy :). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-27 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2455:
--

Attachment: index.31.cfs.zip
index.31.nocfs.zip

For me it passes.

Are you sure that you used the *latest* checkout of 3x? I added the index 
generation code yesterday, after your last 3x commit. This code had not been 
merged from trunk to 3x, as it was added post-flex. It has only been in since 
yesterday:
{noformat}
Author: uschindler
Date: Wed May 26 13:13:10 2010
New Revision: 948420

URL: http://svn.apache.org/viewvc?rev=948420&view=rev
Log:
Merge the 3.0 index backwards tests from trunk (numeric field support). This 
makes it consistent across all branches.

Modified:
lucene/dev/branches/branch_3x/lucene/src/test/org/apache/lucene/index/   
(props changed)

lucene/dev/branches/branch_3x/lucene/src/test/org/apache/lucene/index/TestBackwardsCompatibility.java
   (contents, props changed)
{noformat}

I attached the generated ZIP files from my 3x checkout.

> Some house cleaning in addIndexes*
> --
>
> Key: LUCENE-2455
> URL: https://issues.apache.org/jira/browse/LUCENE-2455
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Trivial
> Fix For: 3.1, 4.0
>
> Attachments: index.31.cfs.zip, index.31.nocfs.zip, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_trunk.patch
>
>
> Today, the use of addIndexes and addIndexesNoOptimize is confusing - 
> especially on when to invoke each. Also, addIndexes calls optimize() in 
> the beginning, but only on the target index. It also includes the 
> following jdoc statement, which from how I understand the code, is 
> wrong: _After this completes, the index is optimized._ -- optimize() is 
> called in the beginning and not in the end. 
> On the other hand, addIndexesNoOptimize does not call optimize(), and 
> relies on the MergeScheduler and MergePolicy to handle the merges. 
> After a short discussion about that on the list (Thanks Mike for the 
> clarifications!) I understand that there are really two core differences 
> between the two: 
> * addIndexes supports IndexReader extensions
> * addIndexesNoOptimize performs better
> This issue proposes the following:
> # Clear up the documentation of each, spelling out the pros/cons of 
>   calling them clearly in the javadocs.
> # Rename addIndexesNoOptimize to addIndexes
> # Remove optimize() call from addIndexes(IndexReader...)
> # Document that clearly in both, w/ a recommendation to call optimize() 
>   before on any of the Directories/Indexes if it's a concern. 
> That way, we maintain all the flexibility in the API - 
> addIndexes(IndexReader...) allows for using IR extensions, 
> addIndexes(Directory...) is considered more efficient, by allowing the 
> merges to happen concurrently (depending on MS) and also factors in the 
> MP. So unless you have an IR extension, addDirectories is really the one 
> you should be using. And you have the freedom to call optimize() before 
> each if you care about it, or don't if you don't care. Either way, 
> incurring the cost of optimize() is entirely in the user's hands. 
> BTW, addIndexes(IndexReader...) uses neither the MergeScheduler 
> nor the MergePolicy, but rather calls SegmentMerger directly. This might be 
> another place for improvement. I'll look into it, and if it's not too 
> complicated, I may cover it by this issue as well. If you have any hints 
> that can give me a good head start on that, please don't be shy :). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-27 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872182#action_12872182
 ] 

Shai Erera commented on LUCENE-2455:


Yes - after I updated my checkout and re-created the indexes, the test passes. 
So I will include them with this patch as well.

> Some house cleaning in addIndexes*
> --
>
> Key: LUCENE-2455
> URL: https://issues.apache.org/jira/browse/LUCENE-2455
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Trivial
> Fix For: 3.1, 4.0
>
> Attachments: index.31.cfs.zip, index.31.nocfs.zip, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_trunk.patch
>
>
> Today, the use of addIndexes and addIndexesNoOptimize is confusing - 
> especially on when to invoke each. Also, addIndexes calls optimize() in 
> the beginning, but only on the target index. It also includes the 
> following jdoc statement, which from how I understand the code, is 
> wrong: _After this completes, the index is optimized._ -- optimize() is 
> called in the beginning and not in the end. 
> On the other hand, addIndexesNoOptimize does not call optimize(), and 
> relies on the MergeScheduler and MergePolicy to handle the merges. 
> After a short discussion about that on the list (Thanks Mike for the 
> clarifications!) I understand that there are really two core differences 
> between the two: 
> * addIndexes supports IndexReader extensions
> * addIndexesNoOptimize performs better
> This issue proposes the following:
> # Clear up the documentation of each, spelling out the pros/cons of 
>   calling them clearly in the javadocs.
> # Rename addIndexesNoOptimize to addIndexes
> # Remove optimize() call from addIndexes(IndexReader...)
> # Document that clearly in both, w/ a recommendation to call optimize() 
>   before on any of the Directories/Indexes if it's a concern. 
> That way, we maintain all the flexibility in the API - 
> addIndexes(IndexReader...) allows for using IR extensions, 
> addIndexes(Directory...) is considered more efficient, by allowing the 
> merges to happen concurrently (depending on MS) and also factors in the 
> MP. So unless you have an IR extension, addDirectories is really the one 
> you should be using. And you have the freedom to call optimize() before 
> each if you care about it, or don't if you don't care. Either way, 
> incurring the cost of optimize() is entirely in the user's hands. 
> BTW, addIndexes(IndexReader...) uses neither the MergeScheduler 
> nor the MergePolicy, but rather calls SegmentMerger directly. This might be 
> another place for improvement. I'll look into it, and if it's not too 
> complicated, I may cover it by this issue as well. If you have any hints 
> that can give me a good head start on that, please don't be shy :). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Adding CLucene as a Lucene subproject

2010-05-27 Thread Grant Ingersoll
Hi Itamar,

The PMC is discussing this internally. It is pretty clear at this point that 
CLucene needs to go through Incubation, so by all means start the process there 
(see http://incubator.apache.org). Also, keep in mind that many things have 
changed for Lucene with regard to subprojects in recent months, so the fact 
that we have other "ports" under Lucene right now for legacy reasons doesn't 
imply that adding another would be good for this particular TLP. That isn't to 
say that CLucene doesn't have a place at the ASF. Generally speaking, the ASF 
does not want "umbrella" projects covering disjoint communities b/c it often 
leads to unnecessary bureaucratic problems at the PMC level related to things 
like voting, releases, project direction, etc. 

At any rate, we should be wrapping up our PMC discussion by tomorrow, so I will 
update this thread at that time.

Thanks,
Grant

On May 26, 2010, at 7:36 PM, Itamar Syn-Hershko wrote:

> Earwin,
> 
> Considering the fact that Lucy is not planning on complementing Lucene's
> API, while CLucene's goal is to be a one-by-one port, I would say no
> relationship. Also, CLucene is written in C++ and not C.
> 
> Re. Sphinx - I'm not familiar enough with it to really comment on this, but
> I'd assume this is more of a Lucene vs Sphinx question. Do the pros and
> cons, throw in some benchmarks, and then give CLucene a couple of extra
> points for being cross-platform and 5-10 times faster than the equivalent
> Lucene version. I had a quick look, and it seems like Lucene is much more
> scalable (esp. considering the latest developments) and some even claim that
> "Lucene performance is unmatched".
> 
> Also, bear in mind that Sphinx is released under the GPL, which may be a
> show stopper if a restrictive license is a problem for you.
> 
> Itamar. 
> 
> -Original Message-
> From: Earwin Burrfoot [mailto:ear...@gmail.com] 
> Sent: Thursday, May 27, 2010 2:06 AM
> To: dev@lucene.apache.org
> Subject: Re: Adding CLucene as a Lucene subproject
> 
> I wonder, what's going to be the relationship between this and Lucy?
> Also, how do both of them compare to Sphinx?
> 
> 2010/5/27 Itamar Syn-Hershko :
>> Ryan, thanks. I understand, and obviously if the PMC will think the 
>> same this is what we'll be doing.
>> 
>> Unfortunately, I haven't heard from the PMC yet, and I'm not sure 
>> where this is going exactly. If the proposal is what keeping this from 
>> being discussed, do let me know. Otherwise, I'm hoping someone with 
>> good knowledge of this process could respond and help us move this 
>> forward. I can be contacted privately if needed.
>> 
>> Itamar.
>> 
>> -Original Message-
>> From: Ryan McKinley [mailto:ryan...@gmail.com]
>> Sent: Thursday, May 27, 2010 1:31 AM
>> To: dev@lucene.apache.org
>> Subject: Re: Adding CLucene as a Lucene subproject
>> 
>> Thanks Itamar-
>> 
>> Apologies since the last email is not very clear...   Not speaking as 
>> the PMC, my feeling is that for this an incubation process will be 
>> needed.  With Apache, the projects are more about the community than 
>> the code -- since there exists a CLucene community with its own 
>> culture etc., I think the incubation process makes sense (that is the 
>> whole point of the incubator, in my opinion).  For incubation, CLucene 
>> would need a champion
>> 
>> again, just throwing it out there, and *not* speaking as the PMC.
>> 
>> ryan
>> 
>> 
>> On Wed, May 26, 2010 at 4:56 PM, Itamar Syn-Hershko 
>> 
>> wrote:
>>> Ryan,
>>> 
>>> I'm not familiar with the Apache way of doing things. It is my
>> understanding
>>> that if the invitation is initiated by the PMC itself, no incubation
>> process
>>> nor a champion are required. Considering CLucene's age and proven
>> stability,
>>> I was hoping we could go that route. If we need a PMC member as a
>> champion,
>>> may this be a call for one.
>>> 
>>> Considering CLucene is targeting a very different user base than 
>>> Lucene is, I don't see how it can possibly be a distraction. On the 
>>> contrary - many optimizations done in CLucene back in the old days 
>>> were later adapted by Lucene itself, and I'm sure this will continue. 
>>> Also, I believe CLucene has a great part in promoting Lucene, 
>>> especially among non-Java
>> developers.
>>> So, I don't see how CLucene is any more of a moot point than, for
>>> example, Lucene.Net has ever been.
>>> 
>>> I would love to be working with anyone to get this process properly
>> defined
>>> and started. A proposal is being worked on.
>>> 
>>> Itamar.
>>> 
>>> -Original Message-
>>> From: Ryan McKinley [mailto:ryan...@gmail.com]
>>> Sent: Wednesday, May 26, 2010 10:51 PM
>>> To: dev@lucene.apache.org
>>> Subject: Re: Adding CLucene as a Lucene subproject
>>> 
>>> Having skimmed most of the thread...
>>> 
>>> It seems the question of whether CLucene should be a sub-project may be
>> premature
>>> considering that it would really need someone in the PMC to champion 
>>> it -- do the real work to make i

Re: Contribution for Research

2010-05-27 Thread Grant Ingersoll


On May 26, 2010, at 11:30 AM, Shardul Bhatt wrote:

> Hi All,
> 
> I am Shardul Bhatt, a Software Developer from India.
> 
> I have used Lucene on a project and am keen to contribute to Lucene.
> 
> I know it takes much more than just the desire to be able to contribute to 
> Open Source. At this point in time I am trying to figure out how to go about 
> it. Apparently the most widely accepted method is to use it and debug it, 
> using Eclipse, to understand how it all gels together. This method is 
> certainly good but the initial effort is huge and needs a lot of motivation 
> to hang on.

http://wiki.apache.org/lucene-java/HowToContribute and 
http://wiki.apache.org/solr/HowToContribute describe most of the things 
necessary to get started.  
> 
>   
> 
> On Wed, May 26, 2010 at 8:31 PM, Alberto Bacchelli  
> wrote:
> Dear Lucene developers,
> 
>  I'm Alberto Bacchelli, a Ph.D. student in software engineering.
> 
> We want to help new developers who join a new software system, and
> we believe that a good first impression would attract more contributors.
> 
> Imagine a new developer joining Lucene:
> As a first step, he needs a high-level view of the system.
> Then, and this is what we want to address, he needs to know
> what the most important classes of the system are --the hotspots.
> 
> 
> We'd like to find *automated* methods to suggest to a newbie
> which classes he should start studying/understanding.
> 
> 
> To find the best recommendation method, we must know
> the important classes of the system, and you,
> as the system developers, are the only ones who can
> answer this question.
> 
> If you agree to do so (and I really hope so :) )
> we will create a small questionnaire for you,
> that will take less than 15 minutes to be completed.

Can you just send the questionnaire to the list?
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter

2010-05-27 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2295:
--

Attachment: LUCENE-2295-trunk.patch

Updated patch for trunk.

> Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the 
> same functionality as MaxFieldLength provided on IndexWriter
> ---
>
> Key: LUCENE-2295
> URL: https://issues.apache.org/jira/browse/LUCENE-2295
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Shai Erera
>Assignee: Uwe Schindler
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2295-trunk.patch, LUCENE-2295.patch
>
>
> A spinoff from LUCENE-2294. Instead of asking the user to specify on 
> IndexWriter his requested MFL limit, we can get rid of this setting entirely 
> by providing an Analyzer which will wrap any other Analyzer and its 
> TokenStream with a TokenFilter that keeps track of the number of tokens 
> produced and stops when the limit has been reached.
> This will remove any count tracking in IW's indexing, which is done even if I 
> specified UNLIMITED for MFL.
> Let's try to do it for 3.1.
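
A minimal sketch of the kind of filter the description is talking about (not the 
patch attached here; the class name is made up):

{code}
import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;

// Wraps any TokenStream and stops producing tokens once maxTokenCount is reached,
// which is the behavior MaxFieldLength currently enforces inside IndexWriter.
public final class MaxTokenCountFilter extends TokenFilter {
  private final int maxTokenCount;
  private int tokenCount;

  public MaxTokenCountFilter(TokenStream in, int maxTokenCount) {
    super(in);
    this.maxTokenCount = maxTokenCount;
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (tokenCount >= maxTokenCount) {
      return false;                 // limit reached: report the stream as exhausted
    }
    if (input.incrementToken()) {
      tokenCount++;
      return true;
    }
    return false;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    tokenCount = 0;                 // so the filter is reusable across documents
  }
}
{code}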

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2295) Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the same functionality as MaxFieldLength provided on IndexWriter

2010-05-27 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2295:
--

Fix Version/s: 3.1

> Create a MaxFieldLengthAnalyzer to wrap any other Analyzer and provide the 
> same functionality as MaxFieldLength provided on IndexWriter
> ---
>
> Key: LUCENE-2295
> URL: https://issues.apache.org/jira/browse/LUCENE-2295
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/analyzers
>Reporter: Shai Erera
>Assignee: Uwe Schindler
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2295-trunk.patch, LUCENE-2295.patch
>
>
> A spinoff from LUCENE-2294. Instead of asking the user to specify on 
> IndexWriter his requested MFL limit, we can get rid of this setting entirely 
> by providing an Analyzer which will wrap any other Analyzer and its 
> TokenStream with a TokenFilter that keeps track of the number of tokens 
> produced and stops when the limit has been reached.
> This will remove any count tracking in IW's indexing, which is done even if I 
> specified UNLIMITED for MFL.
> Let's try to do it for 3.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (SOLR-1930) remove solr deprecations

2010-05-27 Thread Yonik Seeley (JIRA)
remove solr deprecations


 Key: SOLR-1930
 URL: https://issues.apache.org/jira/browse/SOLR-1930
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
 Fix For: 4.0


Remove deprecations and unused classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1930) remove solr deprecations

2010-05-27 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872207#action_12872207
 ] 

Yonik Seeley commented on SOLR-1930:


I've targeted this toward 4.0... not being too aggressive about changes for 3.1 
(which may not be too far away) should make it easier for users to upgrade.

> remove solr deprecations
> 
>
> Key: SOLR-1930
> URL: https://issues.apache.org/jira/browse/SOLR-1930
> Project: Solr
>  Issue Type: Improvement
>Reporter: Yonik Seeley
> Fix For: 4.0
>
>
> Remove deprecations and unused classes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2167) Implement StandardTokenizer with the UAX#29 Standard

2010-05-27 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872221#action_12872221
 ] 

Steven Rowe commented on LUCENE-2167:
-

bq. Must it be a Maven plugin? [...] It's much easier from Ant to invoke a Java 
method as a build step.

Lucene's build could be converted to Maven, though, and this could be a place 
for build-related stuff.

Maven Ant Tasks allows for Ant to call full Maven builds without a Maven 
installation: http://maven.apache.org/ant-tasks/examples/mvn.html

bq. From what I see, the same code could be done as a simple Java class with a 
main() like Robert's ICU converter. [snip]

I hadn't seen Robert's ICU converter - I'll take a look.

bq. A python script may also work well.

Perl is my scripting language of choice, not Python, but yes, a script would 
likely do the trick, assuming there are no external (Java) dependencies.  (And 
as you pointed out, HttpComponents, the only dependency of the Maven plugin, 
does not need to be a dependency.)

bq. You use the HTML root zone database from IANA. The format of this file is 
hard to parse and may change suddenly. BIND administrators know that the root 
zone file is also available for BIND in the standardized named format at 
http://www.internic.net/zones/root.zone (ASCII only, as DNS is ASCII only).

I think I'll stick with the HTML version for now - there are no decoded 
versions of the internationalized TLDs and no descriptive information in the 
named-format version.  I agree the HTML format is not ideal, but it took me 
just a little while to put together the regexes to parse it; when the format 
changes, the effort to fix will likely be similarly small.

bq. You can also write the Last-Modified header of the HTTP response 
(HttpURLConnection.getLastModified()) into the generated file.

Excellent idea; I searched the HTML page source for this kind of information, 
but it wasn't there.

bq. The database only contains the Punycode-encoded DNS names. But users use 
the non-encoded variants, so you should decode the Punycode, too [we need ICU 
for that :( ] and create patterns for those, too.

I agree.  However, I looked into what's required to do internationalized domain 
names properly, and it's quite complicated.  I plan on doing what you suggest 
eventually, both for TLDs and all other domain labels, but I'd rather finish 
the ASCII implementation and deal with IRIs in a separate follow-on issue.
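
As a small aside, a sketch of the decoding step itself (java.net.IDN needs 
Java 6, so ICU would be the option if the generator has to stay on Java 5; the 
example TLD is just an arbitrary Punycode label):

{code}
import java.net.IDN;

// Decode an ACE/Punycode label back to its Unicode form. Full IDNA handling
// (mappings, bidi rules, etc.) is the complicated part mentioned above and is
// deliberately not addressed here.
public class DecodeTldSketch {
  public static void main(String[] args) {
    String aceTld = "xn--p1ai";                 // a Punycode-encoded TLD
    String unicodeTld = IDN.toUnicode(aceTld);  // its decoded Unicode form
    System.out.println(aceTld + " -> " + unicodeTld);
  }
}
{code}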

bq. About changes in analyzer behavior because of regeneration: this should not 
be a problem, as IANA only adds new zones to the file and very seldom removes 
some (like the old Yugoslavian zones). As e-mails and web addresses should not 
appear in tokenized text before their TLDs are in the zone file, it's no problem 
that they are suddenly marked as "URL/eMail" later (as they could not appear 
before). So in my opinion we can update the zone database even in minor Lucene 
releases without breaking analyzers.

+1


> Implement StandardTokenizer with the UAX#29 Standard
> 
>
> Key: LUCENE-2167
> URL: https://issues.apache.org/jira/browse/LUCENE-2167
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/analyzers
>Affects Versions: 3.1
>Reporter: Shyamal Prasad
>Assignee: Steven Rowe
>Priority: Minor
> Attachments: LUCENE-2167-lucene-buildhelper-maven-plugin.patch, 
> LUCENE-2167.benchmark.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> It would be really nice for StandardTokenizer to adhere straight to the 
> standard as much as we can with jflex. Then its name would actually make 
> sense.
> Such a transition would involve renaming the old StandardTokenizer to 
> EuropeanTokenizer, as its javadoc claims:
> bq. This should be a good tokenizer for most European-language documents
> The new StandardTokenizer could then say
> bq. This should be a good tokenizer for most languages.
> All the english/euro-centric stuff like the acronym/company/apostrophe stuff 
> can stay with that EuropeanTokenizer, and it could be used by the european 
> analyzers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1316) Create autosuggest component

2010-05-27 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872225#action_12872225
 ] 

Grant Ingersoll commented on SOLR-1316:
---

And what are the units for the time? I assume ms. Do you have any tests for 
the "average" case, i.e. a lookup of the top 5-10 suggestions for a given prefix?

> Create autosuggest component
> 
>
> Key: SOLR-1316
> URL: https://issues.apache.org/jira/browse/SOLR-1316
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1316.patch, suggest.patch, suggest.patch, 
> suggest.patch, TST.zip
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Autosuggest is a common search function that can be integrated
> into Solr as a SearchComponent. Our first implementation will
> use the TernaryTree found in Lucene contrib. 
> * Enable creation of the dictionary from the index or via Solr's
> RPC mechanism
> * What types of parameters and settings are desirable?
> * Hopefully in the future we can include user click-through
> rates to boost those terms/phrases higher
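
For reference, the prefix lookup itself is conceptually simple; a toy 
illustration using a sorted map (the attached patches use the contrib 
TernaryTree, not this):

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy autosuggest dictionary: terms kept sorted in a TreeMap, prefix lookup via subMap.
public class PrefixSuggestSketch {
  private final TreeMap<String, Long> termFreqs = new TreeMap<String, Long>();

  public void add(String term, long weight) {
    termFreqs.put(term, weight);
  }

  // Returns up to maxResults dictionary terms that start with the given prefix.
  public List<String> suggest(String prefix, int maxResults) {
    List<String> results = new ArrayList<String>();
    // All keys >= prefix and < prefix + '\uffff' share the prefix.
    SortedMap<String, Long> matches = termFreqs.subMap(prefix, prefix + '\uffff');
    for (String term : matches.keySet()) {
      if (results.size() >= maxResults) {
        break;
      }
      results.add(term);
    }
    return results;
  }
}
{code}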

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: TestBackwardsCompatibility

2010-05-27 Thread Michael McCandless
I think there is no code to remove, on stopping support for indices <=
1.9.x.  At least on a quick look, I can't find any such code...

The index did change in 2.1 (and many changes came after that), but we
still support 2.0, so we can't remove that code.

Come 4.0 we should have more fun :)

Mike

On Mon, May 24, 2010 at 12:34 PM, Shai Erera  wrote:
> So do we want to just remove the 1x indexes from :z and 2x from trunk?
> Or do we also want to remove the live migration code? How can one
> start with that for example? Are there constants to look for for
> example?
>
> Shai
>
> On Monday, May 24, 2010, Mark Miller  wrote:
>> On 5/24/10 11:25 AM, Michael McCandless wrote:
>>
>> Yes, I think we can remove support for 1.9 indexes as of 3.0:
>>
>>      http://wiki.apache.org/lucene-java/BackwardsCompatibility
>>
>> So starting with 3.0 the oldest index we must support are those written by 
>> 2.0.
>>
>> Mike
>>
>> On Sun, May 23, 2010 at 12:56 AM, Shai Erera  wrote:
>>
>> Hi
>>
>> I'm working on adding support for addIndexes* in TestBackwardsCompatibility,
>> and I've noticed it still reads 1.9 indexes. Is that intentional? Shouldn't
>> 3x stop supporting 1.9?
>>
>> Shai
>>
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>>
>> We really need to update that wiki page - mucho changes.
>>
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: TestBackwardsCompatibility

2010-05-27 Thread Shai Erera
Well ... we could remove the 1.9 indexes (though that's not so valuable)

But in trunk, we can remove all 2.x related code already no?

Shai

On Thu, May 27, 2010 at 6:00 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> I think there is no code to remove, on stopping support for indices <=
> 1.9.x.  At least on a quick look, I can't find any such code...
>
> The index did change in 2.1 (and many changes came after that), but we
> still support 2.0, so we can't remove that code.
>
> Come 4.0 we should have more fun :)
>
> Mike
>
> On Mon, May 24, 2010 at 12:34 PM, Shai Erera  wrote:
> > So do we want to just remove the 1x indexes from :z and 2x from trunk?
> > Or do we also want to remove the live migration code? How can one
> > start with that for example? Are there constants to look for for
> > example?
> >
> > Shai
> >
> > On Monday, May 24, 2010, Mark Miller  wrote:
> >> On 5/24/10 11:25 AM, Michael McCandless wrote:
> >>
> >> Yes, I think we can remove support for 1.9 indexes as of 3.0:
> >>
> >>  http://wiki.apache.org/lucene-java/BackwardsCompatibility
> >>
> >> So starting with 3.0 the oldest index we must support are those written
> by 2.0.
> >>
> >> Mike
> >>
> >> On Sun, May 23, 2010 at 12:56 AM, Shai Erera  wrote:
> >>
> >> Hi
> >>
> >> I'm working on adding support for addIndexes* in
> TestBackwardsCompatibility,
> >> and I've noticed it still reads 1.9 indexes. Is that intentional?
> Shouldn't
> >> 3x stop supporting 1.9?
> >>
> >> Shai
> >>
> >>
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: dev-h...@lucene.apache.org
> >>
> >>
> >>
> >> We really need to update that wiki page - mucho changes.
> >>
> >> --
> >> - Mark
> >>
> >> http://www.lucidimagination.com
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: dev-h...@lucene.apache.org
> >>
> >>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: TestBackwardsCompatibility

2010-05-27 Thread Michael McCandless
On Thu, May 27, 2010 at 11:08 AM, Shai Erera  wrote:
> Well ... we could remove the 1.9 indexes (though that's not so valuable)

We may as well?

> But in trunk, we can remove all 2.x related code already no?

Yes, you're right!

Mike

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: TestBackwardsCompatibility

2010-05-27 Thread Shai Erera
Ok ... that was rather fast and short !

So regarding trunk, is SegmentInfos the only place to look in? Can you give
me more pointers? I'd like to create an issue for that (not sure I'll do it
myself though), so capturing all the information there would be good.

Shai

On Thu, May 27, 2010 at 6:10 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> On Thu, May 27, 2010 at 11:08 AM, Shai Erera  wrote:
> > Well ... we could remove the 1.9 indexes (though that's not so valuable)
>
> We may as well?
>
> > But in trunk, we can remove all 2.x related code already no?
>
> Yes, you're right!
>
> Mike
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] Updated: (LUCENE-2167) Implement StandardTokenizer with the UAX#29 Standard

2010-05-27 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2167:
--

Attachment: LUCENE-2167-jflex-tld-macro-gen.patch

Here is my patch with the TLD-macro generator:
- Uses the zone database from DNS (downloaded)
- Outputs correct platform-dependent newlines, otherwise commits with SVN fail
- Has no comments :(
- Is included in build.xml. Run "ant gen-tlds" in modules/analysis/common

The resulting macro is almost identical; 4 TLDs are missing, but the file on 
internic.net is current (see its last-modified date). The comments are not 
available, of course.

> Implement StandardTokenizer with the UAX#29 Standard
> 
>
> Key: LUCENE-2167
> URL: https://issues.apache.org/jira/browse/LUCENE-2167
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/analyzers
>Affects Versions: 3.1
>Reporter: Shyamal Prasad
>Assignee: Steven Rowe
>Priority: Minor
> Attachments: LUCENE-2167-jflex-tld-macro-gen.patch, 
> LUCENE-2167-lucene-buildhelper-maven-plugin.patch, 
> LUCENE-2167.benchmark.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> It would be really nice for StandardTokenizer to adhere straight to the 
> standard as much as we can with jflex. Then its name would actually make 
> sense.
> Such a transition would involve renaming the old StandardTokenizer to 
> EuropeanTokenizer, as its javadoc claims:
> bq. This should be a good tokenizer for most European-language documents
> The new StandardTokenizer could then say
> bq. This should be a good tokenizer for most languages.
> All the english/euro-centric stuff like the acronym/company/apostrophe stuff 
> can stay with that EuropeanTokenizer, and it could be used by the european 
> analyzers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: TestBackwardsCompatibility

2010-05-27 Thread Michael McCandless
SegmentInfo and SegmentInfos (separately).  You can compare the
version headers at the top of SegmentInfos, in 2.9.x vs 3.0.x, to see
which ones can go.

FieldInfos can lose its FORMAT_PRE I think, and TermVectorsReader can
lose most of its old formats.

Mike

On Thu, May 27, 2010 at 11:13 AM, Shai Erera  wrote:
> Ok ... that was rather fast and short !
>
> So regarding trunk, is SegmentInfos the only place to look in? Can you give
> me more pointers? I'd like to create an issue for that (not sure I'll do it
> myself though), so capturing all the information there would be good.
>
> Shai
>
> On Thu, May 27, 2010 at 6:10 PM, Michael McCandless
>  wrote:
>>
>> On Thu, May 27, 2010 at 11:08 AM, Shai Erera  wrote:
>> > Well ... we could remove the 1.9 indexes (though that's not so valuable)
>>
>> We may as well?
>>
>> > But in trunk, we can remove all 2.x related code already no?
>>
>> Yes, you're right!
>>
>> Mike
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Contribution for Research

2010-05-27 Thread Alberto Bacchelli

Hi Grant,

On 5/27/10 2:32 PM, Grant Ingersoll wrote:
> [...]
>> If you agree to do so (and I really hope so :) )
>> we will create a small questionnaire for you,
>> that will take less than 15 minutes to be completed.
>
> Can you just send the questionnaire to the list?


I am working on the questionnaire these days.
We are conducting a pilot survey to find the most appropriate
questions and format.

I sent the e-mail before the actual questionnaire because
I wanted to get some feedback from you, the developers, on the topic,
in order to produce a better survey :)

Anyway, the form will be ready in the next few days; it will be
"automatic web interface stuff", plus a couple of open questions.

I really hope that you will contribute, too.

Thank you,
 Alberto

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: TestBackwardsCompatibility

2010-05-27 Thread Earwin Burrfoot
I wanted to do this for some time, so let's open an issue!

On Thu, May 27, 2010 at 19:13, Shai Erera  wrote:
> Ok ... that was rather fast and short !
>
> So regarding trunk, is SegmentInfos the only place to look in? Can you give
> me more pointers? I'd like to create an issue for that (not sure I'll do it
> myself though), so capturing all the information there would be good.
>
> Shai
>
> On Thu, May 27, 2010 at 6:10 PM, Michael McCandless
>  wrote:
>>
>> On Thu, May 27, 2010 at 11:08 AM, Shai Erera  wrote:
>> > Well ... we could remove the 1.9 indexes (though that's not so valuable)
>>
>> We may as well?
>>
>> > But in trunk, we can remove all 2.x related code already no?
>>
>> Yes, you're right!
>>
>> Mike
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Phone: +7 (495) 683-567-4
ICQ: 104465785

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-2455) Some house cleaning in addIndexes*

2010-05-27 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-2455.


Lucene Fields: [New, Patch Available]  (was: [New])
   Resolution: Fixed

Committed revision 948861 (trunk).

> Some house cleaning in addIndexes*
> --
>
> Key: LUCENE-2455
> URL: https://issues.apache.org/jira/browse/LUCENE-2455
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Trivial
> Fix For: 3.1, 4.0
>
> Attachments: index.31.cfs.zip, index.31.nocfs.zip, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, 
> LUCENE-2455_3x.patch, LUCENE-2455_3x.patch, LUCENE-2455_trunk.patch
>
>
> Today, the use of addIndexes and addIndexesNoOptimize is confusing - 
> especially on when to invoke each. Also, addIndexes calls optimize() in 
> the beginning, but only on the target index. It also includes the 
> following jdoc statement, which from how I understand the code, is 
> wrong: _After this completes, the index is optimized._ -- optimize() is 
> called in the beginning and not in the end. 
> On the other hand, addIndexesNoOptimize does not call optimize(), and 
> relies on the MergeScheduler and MergePolicy to handle the merges. 
> After a short discussion about that on the list (Thanks Mike for the 
> clarifications!) I understand that there are really two core differences 
> between the two: 
> * addIndexes supports IndexReader extensions
> * addIndexesNoOptimize performs better
> This issue proposes the following:
> # Clear up the documentation of each, spelling out the pros/cons of 
>   calling them clearly in the javadocs.
> # Rename addIndexesNoOptimize to addIndexes
> # Remove optimize() call from addIndexes(IndexReader...)
> # Document that clearly in both, w/ a recommendation to call optimize() 
>   before on any of the Directories/Indexes if it's a concern. 
> That way, we maintain all the flexibility in the API - 
> addIndexes(IndexReader...) allows for using IR extensions, 
> addIndexes(Directory...) is considered more efficient, by allowing the 
> merges to happen concurrently (depending on MS) and also factors in the 
> MP. So unless you have an IR extension, addDirectories is really the one 
> you should be using. And you have the freedom to call optimize() before 
> each if you care about it, or don't if you don't care. Either way, 
> incurring the cost of optimize() is entirely in the user's hands. 
> BTW, addIndexes(IndexReader...) does not use neither the MergeScheduler 
> nor MergePolicy, but rather call SegmentMerger directly. This might be 
> another place for improvement. I'll look into it, and if it's not too 
> complicated, I may cover it by this issue as well. If you have any hints 
> that can give me a good head start on that, please don't be shy :). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: TestBackwardsCompatibility

2010-05-27 Thread Shai Erera
Opened LUCENE-2480.

Shai

On Thu, May 27, 2010 at 6:37 PM, Earwin Burrfoot  wrote:

> I wanted to do this for some time, so let's open an issue!
>
> On Thu, May 27, 2010 at 19:13, Shai Erera  wrote:
> > Ok ... that was rather fast and short !
> >
> > So regarding trunk, is SegmentInfos the only place to look in? Can you
> give
> > me more pointers? I'd like to create an issue for that (not sure I'll do
> it
> > myself though), so capturing all the information there would be good.
> >
> > Shai
> >
> > On Thu, May 27, 2010 at 6:10 PM, Michael McCandless
> >  wrote:
> >>
> >> On Thu, May 27, 2010 at 11:08 AM, Shai Erera  wrote:
> >> > Well ... we could remove the 1.9 indexes (though that's not so
> valuable)
> >>
> >> We may as well?
> >>
> >> > But in trunk, we can remove all 2.x related code already no?
> >>
> >> Yes, you're right!
> >>
> >> Mike
> >>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: dev-h...@lucene.apache.org
> >>
> >
> >
>
>
>
> --
> Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
> Phone: +7 (495) 683-567-4
> ICQ: 104465785
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] Created: (LUCENE-2480) Remove support for pre-3.0 indexes

2010-05-27 Thread Shai Erera (JIRA)
Remove support for pre-3.0 indexes
--

 Key: LUCENE-2480
 URL: https://issues.apache.org/jira/browse/LUCENE-2480
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Priority: Minor
 Fix For: 4.0


We should remove support for 2.x (and 1.9) indexes in 4.0. It seems that 
nothing can be done in 3x because there is no special code which handles 1.9, 
so we'll leave it there. This issue should cover:
# Remove the .zip indexes
# Remove the unnecessary code from SegmentInfo and SegmentInfos. Mike suggests 
we compare the version headers at the top of SegmentInfos, in 2.9.x vs 3.0.x, 
to see which ones can go.
# Remove FORMAT_PRE from FieldInfos
# Remove old format from TermVectorsReader

If you know of other places where code can be removed, then please post a 
comment here.

I don't know when I'll have time to handle it, definitely not in the next few 
days. So if someone wants to take a stab at it, be my guest.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2167) Implement StandardTokenizer with the UAX#29 Standard

2010-05-27 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-2167:
--

Attachment: LUCENE-2167-jflex-tld-macro-gen.patch

Small update (don't output the lastMod date if internic.net did not provide one)

> Implement StandardTokenizer with the UAX#29 Standard
> 
>
> Key: LUCENE-2167
> URL: https://issues.apache.org/jira/browse/LUCENE-2167
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/analyzers
>Affects Versions: 3.1
>Reporter: Shyamal Prasad
>Assignee: Steven Rowe
>Priority: Minor
> Attachments: LUCENE-2167-jflex-tld-macro-gen.patch, 
> LUCENE-2167-jflex-tld-macro-gen.patch, 
> LUCENE-2167-lucene-buildhelper-maven-plugin.patch, 
> LUCENE-2167.benchmark.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, 
> LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch, LUCENE-2167.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> It would be really nice for StandardTokenizer to adhere straight to the 
> standard as much as we can with jflex. Then its name would actually make 
> sense.
> Such a transition would involve renaming the old StandardTokenizer to 
> EuropeanTokenizer, as its javadoc claims:
> bq. This should be a good tokenizer for most European-language documents
> The new StandardTokenizer could then say
> bq. This should be a good tokenizer for most languages.
> All the english/euro-centric stuff like the acronym/company/apostrophe stuff 
> can stay with that EuropeanTokenizer, and it could be used by the european 
> analyzers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (SOLR-1316) Create autosuggest component

2010-05-27 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872263#action_12872263
 ] 

Andrzej Bialecki  commented on SOLR-1316:
-

Yes, sorry, it's [ms] of elapsed time for adding 100k strings or looking up 
100k strings, though I admit I was lazy and used the same strings for lookup as 
I did for the build phase ... I'll change this so that it properly looks up 
prefixes and I'll post new results.

> Create autosuggest component
> 
>
> Key: SOLR-1316
> URL: https://issues.apache.org/jira/browse/SOLR-1316
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1316.patch, suggest.patch, suggest.patch, 
> suggest.patch, TST.zip
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Autosuggest is a common search function that can be integrated
> into Solr as a SearchComponent. Our first implementation will
> use the TernaryTree found in Lucene contrib. 
> * Enable creation of the dictionary from the index or via Solr's
> RPC mechanism
> * What types of parameters and settings are desirable?
> * Hopefully in the future we can include user click through
> rates to boost those terms/phrases higher

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (LUCENE-2481) Enhance SnapshotDeletionPolicy to allow taking multiple snapshots

2010-05-27 Thread Shai Erera (JIRA)
Enhance SnapshotDeletionPolicy to allow taking multiple snapshots
-

 Key: LUCENE-2481
 URL: https://issues.apache.org/jira/browse/LUCENE-2481
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Reporter: Shai Erera
Assignee: Shai Erera
Priority: Minor
 Fix For: 3.1, 4.0


A spin off from here: 
http://www.gossamer-threads.com/lists/lucene/java-dev/99161?do=post_view_threaded#99161

I will:
# Replace snapshot() with snapshot(String), so that one can name/identify the 
snapshot
# Add some supporting methods, like release(String), getSnapshots() etc.
# Some unit tests of course.

This is mostly written already - I want to contribute it. I've also written a 
PersistentSDP, which persists the snapshots on stable storage (a Lucene index 
in this case) to support opening an IW with existing snapshots already, so they 
don't get deleted. If it's interesting, I can contribute it as well.

Porting my patch to the new API. Should post it soon.
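
A hedged usage sketch of the proposed API: the named snapshot()/release() calls below 
follow the description above, but their exact signatures, and the (pre-IndexWriterConfig) 
writer setup around them, are assumptions until the patch is posted.

{code:java}
import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy;
import org.apache.lucene.index.SnapshotDeletionPolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class NamedSnapshotsSketch {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(new File(args[0]));
    SnapshotDeletionPolicy sdp =
        new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());
    IndexWriter writer = new IndexWriter(dir,
        new StandardAnalyzer(Version.LUCENE_30), sdp,
        IndexWriter.MaxFieldLength.UNLIMITED);

    // Two independently named snapshots; neither commit point can be deleted while held.
    IndexCommit nightly = sdp.snapshot("nightly-backup");
    IndexCommit weekly  = sdp.snapshot("weekly-backup");

    for (String file : nightly.getFileNames()) {
      // ... copy 'file' out of 'dir' to backup storage ...
    }

    // Release each snapshot once its backup is done; the commits become deletable again.
    sdp.release("nightly-backup");
    sdp.release("weekly-backup");
    writer.close();
  }
}
{code}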

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Contribution for Research

2010-05-27 Thread Shardul Bhatt
Thanks for you reply Grant.

I would start with working on the documentation

Regards,
Shardul.

On Thu, May 27, 2010 at 6:02 PM, Grant Ingersoll wrote:

>
>
> On May 26, 2010, at 11:30 AM, Shardul Bhatt wrote:
>
> > Hi All,
> >
> > I am Shardul Bhatt, a Software Developer from India.
> >
> > I have used Lucene on a project and am keen to contribute to Lucene.
> >
> > I know it takes much more than just the desire to be able to contribute
> to Open Source. At this point in time I am trying to figure out how to go
> about it. Apparently the most widely accepted method is to use it and debug
> it, using Eclipse, to understand how it all gels together. This method is
> certainly good but the initial effort is huge and needs a lot of motivation
> to hang on.
>
> http://wiki.apache.org/lucene-java/HowToContribute and
> http://wiki.apache.org/solr/HowToContribute describe most of the things
> necessary to get started.
> >
> >
> >
> > On Wed, May 26, 2010 at 8:31 PM, Alberto Bacchelli <
> alberto.bacche...@usi.ch> wrote:
> > Dear Lucene developers,
> >
> >  I'm Alberto Bacchelli, a Ph.D. student in software engineering.
> >
> > We want to help new developers who join a new software system, and
> > we believe that a good first impression would attract more contributors.
> >
> > Imagine a new developer joining Lucene:
> > As a first step, he needs a high-level view of the system.
> > Then, and this is what we want to address, he needs to know
> > what the most important classes of the system are --the hotspots.
> >
> >
> > We'd like to find *automated* methods to suggest a newbie
> > which classes he should start to study/understand.
> >
> >
> > To find the best recommendation method, we must know
> > the important classes of the system, and you,
> > as the system developers, are the only ones who can
> > answer this question.
> >
> > If you agree to do so (and I really hope so :) )
> > we will create a small questionnaire for you,
> > that will take less than 15 minutes to be completed.
>
> Can you just send the questionnaire to the list?
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] Updated: (LUCENE-2481) Enhance SnapshotDeletionPolicy to allow taking multiple snapshots

2010-05-27 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-2481:
---

Attachment: LUCENE-2481-3x.patch

Enhancements to SnapshotDeletionPolicy + tests.

Also, I've added a PersistentSDP, which persists the snapshot information in a 
Lucene Directory. In case the JVM crashes, the info from that Directory can be 
used to open an IndexWriter on the other Directory with the already snapshotted 
commits (which prevents their deletion).

> Enhance SnapshotDeletionPolicy to allow taking multiple snapshots
> -
>
> Key: LUCENE-2481
> URL: https://issues.apache.org/jira/browse/LUCENE-2481
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Shai Erera
>Assignee: Shai Erera
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2481-3x.patch
>
>
> A spin off from here: 
> http://www.gossamer-threads.com/lists/lucene/java-dev/99161?do=post_view_threaded#99161
> I will:
> # Replace snapshot() with snapshot(String), so that one can name/identify the 
> snapshot
> # Add some supporting methods, like release(String), getSnapshots() etc.
> # Some unit tests of course.
> This is mostly written already - I want to contribute it. I've also written a 
> PersistentSDP, which persists the snapshots on stable storage (a Lucene index 
> in this case) to support opening an IW with existing snapshots already, so 
> they don't get deleted. If it's interesting, I can contribute it as well.
> Porting my patch to the new API. Should post it soon.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Created: (LUCENE-2482) Index sorter

2010-05-27 Thread Andrzej Bialecki (JIRA)
Index sorter


 Key: LUCENE-2482
 URL: https://issues.apache.org/jira/browse/LUCENE-2482
 Project: Lucene - Java
  Issue Type: New Feature
  Components: contrib/*
Affects Versions: 3.1
Reporter: Andrzej Bialecki 
 Fix For: 3.1
 Attachments: indexSorter.patch

A tool to sort an index according to a float document weight. Documents with high 
weight are given low document numbers, which means that they will be evaluated 
first. When using a strategy of "early termination" of queries (see 
TimeLimitedCollector), such sorting significantly improves the quality of 
partial results.

(Originally this tool was created by Doug Cutting in Nutch, and used norms as 
document weights - thus the ordering was limited by the limited resolution of 
norms. This is a pure Lucene version of the tool, and it uses arbitrary floats 
from a specified stored field).
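
A hedged sketch of the core idea, not the attached indexSorter.patch: read a stored 
float field, sort documents by descending weight, and derive an old-to-new doc ID 
mapping that a rewriting step could then apply. The field handling (string-encoded 
floats, deletions ignored) is an assumption for brevity.

{code:java}
import java.util.Arrays;
import java.util.Comparator;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;

public class WeightPermutationSketch {

  /** Returns newDocId[oldDocId]: documents sorted by descending weight get the lowest IDs. */
  public static int[] compute(IndexReader reader, String weightField) throws Exception {
    final int maxDoc = reader.maxDoc();
    final float[] weights = new float[maxDoc];
    Integer[] byWeight = new Integer[maxDoc];
    for (int doc = 0; doc < maxDoc; doc++) {
      byWeight[doc] = doc;
      Document d = reader.document(doc);              // deletions ignored for brevity
      String value = d.get(weightField);
      weights[doc] = (value == null) ? 0f : Float.parseFloat(value);
    }
    Arrays.sort(byWeight, new Comparator<Integer>() {
      public int compare(Integer a, Integer b) {
        return Float.compare(weights[b], weights[a]);  // descending weight
      }
    });
    int[] newDocId = new int[maxDoc];
    for (int newId = 0; newId < maxDoc; newId++) {
      newDocId[byWeight[newId]] = newId;              // doc placed at position newId
    }
    return newDocId;
  }
}
{code}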

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2482) Index sorter

2010-05-27 Thread Andrzej Bialecki (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated LUCENE-2482:
--

Attachment: indexSorter.patch

Patch with the tool and a unit test.

> Index sorter
> 
>
> Key: LUCENE-2482
> URL: https://issues.apache.org/jira/browse/LUCENE-2482
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 3.1
>Reporter: Andrzej Bialecki 
> Fix For: 3.1
>
> Attachments: indexSorter.patch
>
>
> A tool to sort index according to a float document weight. Documents with 
> high weight are given low document numbers, which means that they will be 
> first evaluated. When using a strategy of "early termination" of queries (see 
> TimeLimitedCollector) such sorting significantly improves the quality of 
> partial results.
> (Originally this tool was created by Doug Cutting in Nutch, and used norms as 
> document weights - thus the ordering was limited by the limited resolution of 
> norms. This is a pure Lucene version of the tool, and it uses arbitrary 
> floats from a specified stored field).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2482) Index sorter

2010-05-27 Thread Eks Dev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872357#action_12872357
 ] 

Eks Dev commented on LUCENE-2482:
-

Nice! 
There is another interesting use case for sorting an index: performance and 
index size!

We use a couple of fields with low cardinality (zip code, user group and the 
like). Having the index sorted on these makes RLE compression of postings really 
effective, making it possible to load all values into a couple of megabytes of RAM.
At the moment we just sort the collection before indexing.

Would it be possible to use a combination of stored fields and to specify a 
comparator? Even comparing them as byte[] would do the trick for this business 
case, as it is only important to keep equal values together; the order itself is 
irrelevant. Of course, having a decoder to decode the byte[] before comparing 
would be useful (e.g. for composite fields), but it would work in many cases 
without one.

This works fine even with a moderate update rate, as you can re-sort 
periodically. It does not have to be totally sorted; everything still works, just 
slightly more memory is needed for the filters.

With flex, having postings that use RLE compression is quite possible ... this 
tool could become the "optimizeHard()" tool for some indexes :)

> Index sorter
> 
>
> Key: LUCENE-2482
> URL: https://issues.apache.org/jira/browse/LUCENE-2482
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 3.1
>Reporter: Andrzej Bialecki 
> Fix For: 3.1
>
> Attachments: indexSorter.patch
>
>
> A tool to sort index according to a float document weight. Documents with 
> high weight are given low document numbers, which means that they will be 
> first evaluated. When using a strategy of "early termination" of queries (see 
> TimeLimitedCollector) such sorting significantly improves the quality of 
> partial results.
> (Originally this tool was created by Doug Cutting in Nutch, and used norms as 
> document weights - thus the ordering was limited by the limited resolution of 
> norms. This is a pure Lucene version of the tool, and it uses arbitrary 
> floats from a specified stored field).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2482) Index sorter

2010-05-27 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872366#action_12872366
 ] 

Andrzej Bialecki  commented on LUCENE-2482:
---

Re: combination of fields + a comparator: sure, why not, take a look at the 
implementation of the DocScore inner class - you can stuff whatever you want 
there.

I'm not sure I follow your use case, though ... please remember that this 
re-sorting is applied in exactly the same way to all postings, so savings on one 
list may cause bloat on another.

> Index sorter
> 
>
> Key: LUCENE-2482
> URL: https://issues.apache.org/jira/browse/LUCENE-2482
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 3.1
>Reporter: Andrzej Bialecki 
> Fix For: 3.1
>
> Attachments: indexSorter.patch
>
>
> A tool to sort index according to a float document weight. Documents with 
> high weight are given low document numbers, which means that they will be 
> first evaluated. When using a strategy of "early termination" of queries (see 
> TimeLimitedCollector) such sorting significantly improves the quality of 
> partial results.
> (Originally this tool was created by Doug Cutting in Nutch, and used norms as 
> document weights - thus the ordering was limited by the limited resolution of 
> norms. This is a pure Lucene version of the tool, and it uses arbitrary 
> floats from a specified stored field).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr version housekeeping in Jira/Wiki (1.5, 1.6, 3.x, 4.x, etc...)

2010-05-27 Thread Chris Hostetter

FYI: I'm going to start working on this today, but i don't expect i'll 
finish it all in one go, so i'll reply to this message with incremental 
updates as i finish things and use the thread as an audit log.

: Date: Tue, 25 May 2010 11:34:25 -0700 (PDT)
: From: Chris Hostetter 
: Reply-To: dev@lucene.apache.org
: To: Lucene Dev 
: Subject: Solr version housekeeping in Jira/Wiki (1.5, 1.6, 3.x, 4.x, etc...)
: 
: 
: A while back, after the trunk merge (but before the 3x branch fork) yonik and
: i spear-headed a healthy debate on the list about whether the next version of
: Solr should have a lock-step version number with Lucene ... while i've
: generally come around to yonik's way of thinking, that's *not* what this
: thread is about (i say that up front in the hopes of preventing this thread
: from devolving into a continued debate about internal vs marketing version
: numbers)
: 
: Independent of the questions of what branch the next version of Solr should be
: released on, or what version number "label" it should be called, is the issue
: of keeping straight what bug fixes and features have been added to what
: branches.  Several issues in Jira were marked as "Fixed" in 1.5, prior to the
: trunk merge but, with the ambiguity about how the versioning was going to
: evolve, were never bulk updated to indicate that they were actually going to be
: fixed in 3.1 (or 4.0).  Now that we may (or may not) ever have a 1.5 release,
: it can be hard to look at a Jira issue and make sense of where the changes
: were actually committed.  This has been compounded by some committers (i take
: responsibility for being the majority of the problem) continuing to mark
: issues they commit as being fixed in "1.5" even though they committed to the
: "trunk" (after the lucene/solr trunk merge)
: 
: Likewise for the way we annotate information in the Solr wiki.  Several bits
: of documentation are annotated as being in 1.5, but nothing is marked as 3.1 or
: 4.1
: 
: What i'd like to propose is that we focus on making sure the "Fix Version" in
: Jira and the annotations on the wiki correctly reflect the "next" version of
: the *branches* where changes have been committed. Even if (in the unlikely
: event) the final version numbers that we release are ultimately different, we
: can at least be reasonably confident that a simple batch replace will work.
: 
: In concrete terms, these are the steps i'm planning to take in a few days
: unless someone objects, or suggests a simpler path...
: 
: 1) create a new Jira version for Solr called "next" as a way to track
: unresolved issues that people generally feel should be fixed in the "next"
: feature release.
: 
: 2) bulk change any Solr issue currently UNRESOLVED with a "Fix Version" of
: 1.5, 1.6, 3.1, or 4.0 so that its new Fix Version is "next"
: 
: 3) Compute three diffs, one for each of each of these three CHANGES.txt
: files...
: 
: http://svn.apache.org/viewvc/lucene/solr/branches/branch-1.5-dev/CHANGES.txt
: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/solr/CHANGES.txt
: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/CHANGES.txt
:   ...against the official 1.4 CHANGES.txt...
: http://svn.apache.org/viewvc/lucene/solr/tags/release-1.4.0/CHANGES.txt
: 
: 4) merge the diffs from step#3 into a 4 column report, listing every issue
: mentioned in any of those three CHANGES.txt files and which "branches" it has
: been committed to.
: 
: 5) using the report for step#4, manually update every individual issue so that
: the Fix Version accurately reflects the list of *possible* versions that issue will be
: fixed in, if there is a release off of those respective branches (ie: some
: subset of (1.5, 3.1, 4.0))
: 
: 6) delete "1.6" as a Solr version in Jira.
: 
: 7) Update the Solr1.5 wiki page to link to the 1.5 branch in SVN, and add a
: note that such a release may never actually happen...
: http://wiki.apache.org/solr/Solr1.5
: 
: 8) Create new wiki pages for Solr3.1 and Solr4.0, model them after the Solr1.5
: page with pointers to what branch of SVN development is taking place on and
: where to track issues fixed on those branches.  (we can also add verbiage here
: about the merged lucene/solr dev model, and why the 3x branch was created, but
: we can worry about that later)
: 
: 9) Audit every link to the Solr1.5 page, and add links to the new Solr3.1 and
: Solr4.0 pages as needed...
: 
http://wiki.apache.org/solr/Solr1.5?action=fullsearch&context=180&value=linkto%3A%22Solr1.5%22
: 
: 
: ...I'm not particularly looking forward to step #5, but it's the only safe way
: i can think of to make sure everything is correct.  I'm open to other suggestions.
: 
: 
: -Hoss
: 
: 
: -
: To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
: For additional commands, e-mail: dev-h...@lucene.apache.org
: 



-Hoss


-
To unsubscribe, e-mail: dev-unsubscr...@lucen

[jira] Commented: (LUCENE-2482) Index sorter

2010-05-27 Thread Eks Dev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872386#action_12872386
 ] 

Eks Dev commented on LUCENE-2482:
-

Re: I'm not sure if I follow your use case though

Simple case: you have 100 million docs with 2 fields, CITY and TEXT.

Sorting on CITY turns each CITY term's postings into a single contiguous run of 
doc IDs, e.g.
Orlando:  docs 0 .. N
New York: docs N+1 .. M
which is perfectly compressible, 

without really affecting the distribution (compressibility) of terms from the TEXT 
field.

If CITY remained in unsorted order (e.g. uniform distribution), you would deal 
with very large postings for all terms coming from this field.

Sorting on many fields often helps, e.g. if you have hierarchical compositions 
like one CITY with many ZIP_CODEs ... philosophically, sorting always increases 
compressibility and improves locality of reference ... but sure, you need to 
know what you want.
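
To make the compressibility point concrete, here is a tiny, hedged sketch (not part of 
any patch) of run-length encoding a posting list: with a sorted index the CITY postings 
collapse to a handful of (start, length) pairs, while an unsorted index leaves roughly 
one run per document.

{code:java}
import java.util.ArrayList;
import java.util.List;

public class RlePostingsSketch {

  /** Encode an ascending doc-ID posting list as (start, runLength) pairs. */
  public static List<int[]> encode(int[] docIds) {
    List<int[]> runs = new ArrayList<int[]>();
    int i = 0;
    while (i < docIds.length) {
      int start = docIds[i];
      int len = 1;
      while (i + len < docIds.length && docIds[i + len] == start + len) {
        len++;                                   // contiguous doc IDs extend the run
      }
      runs.add(new int[] { start, len });
      i += len;
    }
    return runs;
  }

  public static void main(String[] args) {
    // Sorted index: all "Orlando" docs are adjacent -> a single run.
    System.out.println(encode(new int[] { 0, 1, 2, 3, 4 }).size());      // prints 1
    // Unsorted index: the same docs are scattered -> one run per doc.
    System.out.println(encode(new int[] { 3, 17, 42, 90, 512 }).size()); // prints 5
  }
}
{code}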

> Index sorter
> 
>
> Key: LUCENE-2482
> URL: https://issues.apache.org/jira/browse/LUCENE-2482
> Project: Lucene - Java
>  Issue Type: New Feature
>  Components: contrib/*
>Affects Versions: 3.1
>Reporter: Andrzej Bialecki 
> Fix For: 3.1
>
> Attachments: indexSorter.patch
>
>
> A tool to sort index according to a float document weight. Documents with 
> high weight are given low document numbers, which means that they will be 
> first evaluated. When using a strategy of "early termination" of queries (see 
> TimeLimitedCollector) such sorting significantly improves the quality of 
> partial results.
> (Originally this tool was created by Doug Cutting in Nutch, and used norms as 
> document weights - thus the ordering was limited by the limited resolution of 
> norms. This is a pure Lucene version of the tool, and it uses arbitrary 
> floats from a specified stored field).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1316) Create autosuggest component

2010-05-27 Thread Andrzej Bialecki (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  updated SOLR-1316:


Attachment: SOLR-1316.patch

Updated patch - the previous version had a missing "else" in TSTLookup.

Improved benchmark - the earlier results are useless because the running time was 
too short. Also, this time the lookup keys are strict prefixes between 1/3 and 1/2 
of the key length.  Result counts are cross-validated between the JaSpell and TST 
implementations.

Here are the new timings (times in [ms] for 100k items):

{code}
JaspellLookup:  buildTime=448   lookupTime=1316
JaspellLookup:  buildTime=379   lookupTime=1073
JaspellLookup:  buildTime=399   lookupTime=709
JaspellLookup:  buildTime=405   lookupTime=698
JaspellLookup:  buildTime=454   lookupTime=758
JaspellLookup:  buildTime=451   lookupTime=746
JaspellLookup:  buildTime=436   lookupTime=886
JaspellLookup:  buildTime=424   lookupTime=696
JaspellLookup:  buildTime=402   lookupTime=697
JaspellLookup:  buildTime=415   lookupTime=1156
JaspellLookup:  buildTime=413   lookupTime=693
JaspellLookup:  buildTime=429   lookupTime=698
JaspellLookup:  buildTime=411   lookupTime=885
JaspellLookup:  buildTime=402   lookupTime=688
JaspellLookup:  buildTime=398   lookupTime=691
JaspellLookup:  buildTime=405   lookupTime=1152
JaspellLookup:  buildTime=405   lookupTime=695
JaspellLookup:  buildTime=410   lookupTime=1009
JaspellLookup:  buildTime=409   lookupTime=891
JaspellLookup:  buildTime=400   lookupTime=685
TSTLookup:  buildTime=185   lookupTime=289
TSTLookup:  buildTime=161   lookupTime=427
TSTLookup:  buildTime=173   lookupTime=311
TSTLookup:  buildTime=183   lookupTime=304
TSTLookup:  buildTime=177   lookupTime=311
TSTLookup:  buildTime=175   lookupTime=287
TSTLookup:  buildTime=173   lookupTime=431
TSTLookup:  buildTime=161   lookupTime=278
TSTLookup:  buildTime=161   lookupTime=282
TSTLookup:  buildTime=177   lookupTime=453
TSTLookup:  buildTime=157   lookupTime=286
TSTLookup:  buildTime=160   lookupTime=432
TSTLookup:  buildTime=161   lookupTime=281
TSTLookup:  buildTime=160   lookupTime=275
TSTLookup:  buildTime=160   lookupTime=454
TSTLookup:  buildTime=178   lookupTime=298
TSTLookup:  buildTime=181   lookupTime=289
TSTLookup:  buildTime=159   lookupTime=432
TSTLookup:  buildTime=164   lookupTime=285
TSTLookup:  buildTime=159   lookupTime=480
{code}

> Create autosuggest component
> 
>
> Key: SOLR-1316
> URL: https://issues.apache.org/jira/browse/SOLR-1316
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1316.patch, SOLR-1316.patch, suggest.patch, 
> suggest.patch, suggest.patch, TST.zip
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Autosuggest is a common search function that can be integrated
> into Solr as a SearchComponent. Our first implementation will
> use the TernaryTree found in Lucene contrib. 
> * Enable creation of the dictionary from the index or via Solr's
> RPC mechanism
> * What types of parameters and settings are desirable?
> * Hopefully in the future we can include user click through
> rates to boost those terms/phrases higher

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr version housekeeping in Jira/Wiki (1.5, 1.6, 3.x, 4.x, etc...)

2010-05-27 Thread Chris Hostetter

Steps #1, #2, and #6 have been completed.  

The full list of issues affected by Step #2 can be located using a Jira 
search for "hossversioncleanup20100527" in comments.

: 1) create a new Jira version for Solr called "next" as a way to track
: unresolved issues that people generally feel should be fixed in the "next"
: feature release.

: 2) bulk change any Solr issue currently UNRESOLVED with a "Fix Version" of
: 1.5, 1.6, 3.1, or 4.0 so that its new Fix Version is "next"

: 6) delete "1.6" as a Solr version in Jira.



-Hoss


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr version housekeeping in Jira/Wiki (1.5, 1.6, 3.x, 4.x, etc...)

2010-05-27 Thread Chris Hostetter

: 3) Compute three diffs, one for each of each of these three CHANGES.txt
: files...
...
: 4) merge the diffs from step#3 into a 4 column report, listing every issue
: mentioned in any of those three CHANGES.txt files and which "branches" it has
: been committed to.

Below is the report generated from those diffs, the sequence of commands 
was...

curl http://svn.apache.org/repos/asf/lucene/solr/tags/release-1.4.0/CHANGES.txt > 1.4.txt
curl http://svn.apache.org/repos/asf/lucene/solr/branches/branch-1.5-dev/CHANGES.txt > 1.5.txt
curl http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/solr/CHANGES.txt > 3.1.txt
curl http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/CHANGES.txt > 4.0.txt

diff 1.4.txt 1.5.txt > 1.5.diff
diff 1.4.txt 3.1.txt > 3.1.diff
diff 1.4.txt 4.0.txt > 4.0.diff

perl -nle 'print "https://issues.apache.org/jira/browse/$1\t1.5" while /(SOLR-\d+)/g' 1.5.diff | sort -u > 1.5.fixed.txt
perl -nle 'print "https://issues.apache.org/jira/browse/$1\t3.1" while /(SOLR-\d+)/g' 3.1.diff | sort -u > 3.1.fixed.txt
perl -nle 'print "https://issues.apache.org/jira/browse/$1\t4.0" while /(SOLR-\d+)/g' 4.0.diff | sort -u > 4.0.fixed.txt

join -t $'\t' -a 1 -a 2 1.5.fixed.txt 3.1.fixed.txt > tmp.txt
join -t $'\t' -a 1 -a 2 tmp.txt 4.0.fixed.txt > all.fixed.txt

...and here's the final contents of all.fixed.txt...

https://issues.apache.org/jira/browse/SOLR-1131 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1139 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1177 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1268 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1297 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1302 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1357 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1379 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1432 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1516 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1522 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1532 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1538 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1553 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1558 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1561 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1563 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1569 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1570 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1571 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1572 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1574 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1577 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1579 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1580 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1582 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1584 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1586 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1587 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1588 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1590 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1592 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1593 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1595 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1596 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1601 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1608 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1610 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1611 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1615 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1621 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1624 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1625 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1628 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1635 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1637 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1651 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1653 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1657 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1660 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1661 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1662 1.5 3.1 4.0
https://issues.apache.org/jira/browse/SOLR-1667 1.5 3.1 4.0
https://issues.apache.org/jira/b

[jira] Updated: (SOLR-1139) SolrJ TermsComponent Query and Response Support

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1139:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> SolrJ TermsComponent Query and Response Support
> ---
>
> Key: SOLR-1139
> URL: https://issues.apache.org/jira/browse/SOLR-1139
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 1.4
>Reporter: Matt Weber
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1139-WITH_SORT_SUPPORT.patch, SOLR-1139.patch, 
> SOLR-1139.patch, SOLR-1139.patch, SOLR-1139.patch, SOLR-1139.patch, 
> SOLR-1139.patch, SOLR-1139.patch
>
>
> SolrJ should support the new TermsComponent that was introduced in Solr 1.4.  
> It should be able to:
> - set TermsComponent query parameters via SolrQuery
> - parse the TermsComponent response

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1131:
---

Fix Version/s: 3.1
   4.0

Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> Allow a single field type to index multiple fields
> --
>
> Key: SOLR-1131
> URL: https://issues.apache.org/jira/browse/SOLR-1131
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Reporter: Ryan McKinley
>Assignee: Grant Ingersoll
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: diff.patch, SOLR-1131-IndexMultipleFields.patch, 
> SOLR-1131.Mattmann.121009.patch.txt, SOLR-1131.Mattmann.121109.patch.txt, 
> SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, 
> SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, 
> SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch, 
> SOLR-1131.patch, SOLR-1131.patch, SOLR-1131.patch
>
>
> In a few special cases, it makes sense for a single "field" (the concept) to 
> be indexed as a set of Fields (lucene Field).  Consider SOLR-773.  The 
> concept "point" may be best indexed in a variety of ways:
>  * geohash (single lucene field)
>  * lat field, lon field (two double fields)
>  * cartesian tiers (a series of fields with tokens to say if it exists within 
> that region)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1177) Distributed TermsComponent

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1177:
---

Fix Version/s: 3.1
   4.0

Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Distributed TermsComponent
> --
>
> Key: SOLR-1177
> URL: https://issues.apache.org/jira/browse/SOLR-1177
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4
>Reporter: Matt Weber
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1177.patch, SOLR-1177.patch, SOLR-1177.patch, 
> SOLR-1177.patch, SOLR-1177.patch, SOLR-1177.patch, TermsComponent.java, 
> TermsComponent.patch
>
>
> TermsComponent should be distributed

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1302) Fun with Distances - Add Distance functions for a variety of things

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1302:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> Fun with Distances - Add Distance functions for a variety of things
> ---
>
> Key: SOLR-1302
> URL: https://issues.apache.org/jira/browse/SOLR-1302
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1302.patch, SOLR-1302.patch, SOLR-1302.patch
>
>
> There are many distance functions that are useful to have:
> 1. Great Circle (lat/lon) and other geo distances
> 2. Euclidean (Vector)
> 3. Manhattan (Vector)
> 4. Cosine (Vector)
> For the vector ones, the idea is that the fields on a document can be used to 
> determine the vector.  
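
As a concrete illustration of item 1 above, a hedged sketch of great-circle distance via 
the haversine formula; the class, the method name and the Earth-radius constant are 
illustrative, not Solr's actual function-query API.

{code:java}
public final class GreatCircleSketch {

  private static final double EARTH_RADIUS_KM = 6371.0;  // mean Earth radius, illustrative

  /** Distance in kilometers between two lat/lon points given in degrees. */
  public static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
    double dLat = Math.toRadians(lat2 - lat1);
    double dLon = Math.toRadians(lon2 - lon1);
    double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
             + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
             * Math.sin(dLon / 2) * Math.sin(dLon / 2);
    return 2 * EARTH_RADIUS_KM * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
  }

  public static void main(String[] args) {
    // Roughly the New York -> London distance, for a quick sanity check.
    System.out.println(haversineKm(40.7128, -74.0060, 51.5074, -0.1278));
  }
}
{code}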

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1268) Incorporate Lucene's FastVectorHighlighter

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1268:
---

Fix Version/s: 3.1
   4.0
  Description: 
Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Incorporate Lucene's FastVectorHighlighter
> --
>
> Key: SOLR-1268
> URL: https://issues.apache.org/jira/browse/SOLR-1268
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1268-0_fragsize.patch, SOLR-1268-0_fragsize.patch, 
> SOLR-1268.patch, SOLR-1268.patch, SOLR-1268.patch
>
>
> Correcting Fix Version based on CHANGES.txt, see this thread for more 
> details...
> http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1297) Enable sorting by Function Query

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1297:
---

Fix Version/s: 1.5
   3.1
   4.0
   (was: Next)

Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Enable sorting by Function Query
> 
>
> Key: SOLR-1297
> URL: https://issues.apache.org/jira/browse/SOLR-1297
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1297-2.patch, SOLR-1297.patch
>
>
> It would be nice if one could sort by FunctionQuery.  See also SOLR-773, 
> where this was first mentioned by Yonik as part of the generic solution to 
> geo-search

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1357) SolrInputDocument cannot process dynamic fields

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1357:
---

Fix Version/s: 1.5
   3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> SolrInputDocument cannot process dynamic fields
> ---
>
> Key: SOLR-1357
> URL: https://issues.apache.org/jira/browse/SOLR-1357
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Reporter: Avlesh Singh
>Assignee: Noble Paul
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1357.patch, SOLR-1357.patch
>
>
> Adding data via {{SolrInputDocument}} is normally done by calling the 
> {{addField}} method with a field name, field value and an optional boost.  In 
> case of dynamic fields, if field names are known upfront, then the caller of 
> this method just passes in the right name and it automatically works.
> This does not go well with users who use {{@Field}} annotations 
> for automatic binding. 
> As of SOLR-1129, users can annotate a {{Map}} property with an 
> {{@Field("field_*")}} kind of annotation to bind dynamic field data to. 
> {{SolrInputDocument}} should exhibit the same behavior.  The field {{value}} 
> types currently supported are primitive, array, collection or an instance of 
> Iterable. It can also take a {{Map}} as value. If the field for which 
> {{addField}} is called is of dynamicField type (which can be derived 
> from the field name), then the keys of the {{Map}} passed as value should 
> be used to "compose" the correct field name.
> This should be supported:
> {code:java}
> // This code sample should populate the dynamic fields "brands_Nokia" and 
> // "brands_Samsung"
> public class MyBean {
>   @Field("brands_*")
>   Map<String, Integer> brands;
>   
>   ...
> }
> 
> Map<String, Integer> brands = new HashMap<String, Integer>();
> brands.put("Nokia", 1000);
> brands.put("Samsung", 100);
> MyBean myBean = new MyBean();
> myBean.setBrands(brands);
> solrServer.addBean(myBean);
> {code}
> We could also think of supporting this ...
> {code:java}
> // This code sample should populate the dynamic fields "brands_Nokia" and 
> // "brands_Samsung"
> Map<String, Integer> brands = new HashMap<String, Integer>();
> brands.put("Nokia", 1000);
> brands.put("Samsung", 100);
> SolrInputDocument doc = new SolrInputDocument();
> doc.addField("brands_*", brands);
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1379) Add RAMDirectoryFactory

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1379:
---

Fix Version/s: 4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Add RAMDirectoryFactory
> ---
>
> Key: SOLR-1379
> URL: https://issues.apache.org/jira/browse/SOLR-1379
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 1.3
>Reporter: Alex Baranov
>Priority: Minor
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-1379.patch, SOLR-1379.patch
>
>
> Implement a RAMDirectoryFactory class to make it possible to use RAMDirectory by 
> adding the following configuration in solrconfig.xml:
> {code}<directoryFactory class="org.apache.solr.core.RAMDirectoryFactory"/>{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1432) FunctionQueries aren't correctly weighted

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1432:
---

Fix Version/s: 1.5
   3.1
   4.0
   (was: 1.4)


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


(Note: this issue is "special" ... it was originally marked fixed in 1.4 
because some changes were made for 1.4 -- but those changes were broken so i'm 
removing 1.4 from the Fix list)

> FunctionQueries aren't correctly weighted
> -
>
> Key: SOLR-1432
> URL: https://issues.apache.org/jira/browse/SOLR-1432
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1432.patch, SOLR-1432.patch
>
>
> Nested queries in function queries aren't weighted correctly with the proper 
> Searcher, and this is now even more serious with per-segment searching in 
> Lucene/Solr.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1538) Solr possible deadlock source (FindBugs report)

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1538:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> Solr possible deadlock source (FindBugs report)
> ---
>
> Key: SOLR-1538
> URL: https://issues.apache.org/jira/browse/SOLR-1538
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
> Environment: platform independent
>Reporter: gabriele renzi
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1538.patch
>
>   Original Estimate: 0.17h
>  Remaining Estimate: 0.17h
>
> The code to get the latest accessed items in ConcurrentLRUCache looks like
> {code:title=ConcurrentLRUCache.java|}
>  public Map getOldestAccessedItems(int n) {
> markAndSweepLock.lock();
> Map result = new LinkedHashMap();
> TreeSet tree = new TreeSet();
> try {
>...
> } finally {
>   markAndSweepLock.unlock();
> }
> {code}
> (this method is apparently unused though) and in 
> {code}
>public Map getLatestAccessedItems(int n) {
>  // we need to grab the lock since we are changing lastAccessedCopy
>  markAndSweepLock.lock();
>  Map result = new LinkedHashMap();
>  TreeSet tree = new TreeSet();
>  try {
> ...
> {code}
> The impression is that if an OOM situation occurs on the allocation of the 
> local LinkedHashMap and TreeSet, the lock would never be unlocked. 
> The quick fix would be to move the lock() call after the allocations, and 
> this does not appear to cause any problems. 
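
A minimal sketch of that suggested fix, with the allocations hoisted above the lock() call and the unlock kept in a finally block. This is an illustrative stand-in, not the actual ConcurrentLRUCache code (which uses its own CacheEntry type).

{code}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantLock;

class LruSnapshotSketch {
  private final ReentrantLock markAndSweepLock = new ReentrantLock();

  public Map<Object, Object> getLatestAccessedItems(int n) {
    // Allocate before taking the lock: if these allocations fail (e.g. OOM),
    // the lock has not been acquired yet and so cannot be left dangling.
    Map<Object, Object> result = new LinkedHashMap<Object, Object>();
    TreeSet<Long> tree = new TreeSet<Long>();
    if (n <= 0) {
      return result;
    }

    markAndSweepLock.lock();
    try {
      // ... walk the cache entries, filling 'tree' and 'result' ...
    } finally {
      markAndSweepLock.unlock();  // always released, even if the walk throws
    }
    return result;
  }
}
{code}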

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1522) DIH:Show proper message if

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1522:
---

Fix Version/s: 1.5
   3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> DIH:Show proper message if 

[jira] Updated: (SOLR-1516) DocumentList and Document QueryResponseWriter

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1516:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> DocumentList and Document QueryResponseWriter
> -
>
> Key: SOLR-1516
> URL: https://issues.apache.org/jira/browse/SOLR-1516
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
> Environment: My MacBook Pro laptop.
>Reporter: Chris A. Mattmann
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1516.Mattmann.101809.patch.txt, 
> SOLR-1516.Mattmann.112409.patch.txt, SOLR-1516.patch, SOLR-1516.patch, 
> SOLR-1516.patch
>
>
> I tried to implement a custom QueryResponseWriter the other day and was 
> amazed at the amount of unmarshalling and weeding through objects that was 
> necessary just to format the output list of o.a.l.Document objects. As a user, 
> I wanted to be able to implement either of 2 functions:
> * process a document at a time, and format it (for speed/efficiency)
> * process all the documents at once, and format them (in case an aggregate 
> calculation is necessary for outputting)
> So, I've decided to contribute 2 simple classes that I think are sufficiently 
> generic and reusable. The first is o.a.s.request.DocumentResponseWriter -- it 
> handles the first bullet above. The second is 
> o.a.s.request.DocumentListResponseWriter. Both are abstract base classes and 
> require the user to implement either an #emitDoc function (in the case of 
> bullet 1), or an #emitDocList function (in the case of bullet 2). Both 
> classes provide an #emitHeader and #emitFooter function set that handles 
> formatting and output before the Document list is processed.
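
To make the intended usage concrete, a hypothetical subclass might look like the sketch below. The base class and the emit* hooks exist only in the attached patch, so every signature here is an assumption taken from the description above, not shipped Solr API.

{code}
import java.io.IOException;
import java.io.Writer;

import org.apache.lucene.document.Document;

// Hypothetical sketch: assumes the patch's abstract DocumentResponseWriter
// with emitHeader/emitDoc/emitFooter hooks as described in the issue.
public class CsvDocumentWriter extends DocumentResponseWriter {

  protected void emitHeader(Writer w) throws IOException {
    w.write("id,score\n");                        // preamble before any documents
  }

  protected void emitDoc(Writer w, Document doc, float score) throws IOException {
    w.write(doc.get("id") + "," + score + "\n");  // one document at a time
  }

  protected void emitFooter(Writer w) throws IOException {
    // nothing to close for CSV output
  }
}
{code}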

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1532) allow StreamingUpdateSolrServer to use a provided HttpClient

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1532:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> allow StreamingUpdateSolrServer to use a provided HttpClient
> 
>
> Key: SOLR-1532
> URL: https://issues.apache.org/jira/browse/SOLR-1532
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java
>Affects Versions: 1.4
>Reporter: gabriele renzi
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1532.patch, SOLR-1532.patch
>
>
> As of r830319 StreamingUpdateSolrServer does not allow calling code to 
> provide an HttpClient, which means client code cannot reuse an existing 
> connection manager. The patch adds a new constructor and refactors the old 
> one to use it.
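
Usage would presumably look something like the sketch below; the exact parameter order of the added constructor is an assumption based on the description, while the rest is stock SolrJ and commons-httpclient.

{code}
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;

public class SharedHttpClientSketch {
  public static void main(String[] args) throws Exception {
    // One connection manager / HttpClient reused across many server instances.
    MultiThreadedHttpConnectionManager cm = new MultiThreadedHttpConnectionManager();
    HttpClient sharedClient = new HttpClient(cm);

    // Assumed shape of the new constructor: (url, client, queueSize, threadCount).
    SolrServer server = new StreamingUpdateSolrServer(
        "http://localhost:8983/solr", sharedClient, 20, 4);
    server.ping();
  }
}
{code}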

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1558) QueryElevationComponent should fail to init if uniqueKey field isn't string

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1558:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> QueryElevationComponent should fail to init if uniqueKey field isn't string
> ---
>
> Key: SOLR-1558
> URL: https://issues.apache.org/jira/browse/SOLR-1558
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.3, 1.4
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 1.5, 3.1, 4.0
>
>
> QueryElevationComponent fails in confusing ways if you use it on a schema 
> where the uniqueKey fieldtype is not a StrField.  This is easy to assert at 
> init, so we should do that.
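
A minimal sketch of the kind of init-time assertion being proposed (illustrative only, not the committed change), assuming it runs somewhere with access to the SolrCore such as inform():

{code}
import org.apache.solr.common.SolrException;
import org.apache.solr.core.SolrCore;
import org.apache.solr.schema.SchemaField;
import org.apache.solr.schema.StrField;

public class ElevationInitCheckSketch {
  static void assertStringUniqueKey(SolrCore core) {
    SchemaField uniqueKey = core.getSchema().getUniqueKeyField();
    if (uniqueKey == null || !(uniqueKey.getType() instanceof StrField)) {
      // Fail fast at init instead of failing in confusing ways at query time.
      throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
          "QueryElevationComponent requires the schema to declare a string (StrField) uniqueKey field");
    }
  }
}
{code}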

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1553) extended dismax query parser

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1553:
---

Fix Version/s: 1.5
   3.1
   4.0
   (was: Next)


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> extended dismax query parser
> 
>
> Key: SOLR-1553
> URL: https://issues.apache.org/jira/browse/SOLR-1553
> Project: Solr
>  Issue Type: New Feature
>Reporter: Yonik Seeley
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: edismax.unescapedcolon.bug.test.patch, 
> edismax.userFields.patch, SOLR-1553.patch, SOLR-1553.pf-refactor.patch
>
>
> An improved user-facing query parser based on dismax

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1561) Import Lucene 2.9.1 Geospatial JAR

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1561:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> Import Lucene 2.9.1 Geospatial JAR
> --
>
> Key: SOLR-1561
> URL: https://issues.apache.org/jira/browse/SOLR-1561
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
>
> Bring in the spatial contrib jar so that we can use its utilities, etc. where 
> appropriate.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1563) binary fields caused a null pointer exception in the luke request handler

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1563:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> binary fields caused a null pointer exception in the luke request handler
> -
>
> Key: SOLR-1563
> URL: https://issues.apache.org/jira/browse/SOLR-1563
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Hoss Man
>Priority: Critical
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1563.patch
>
>
> Multiple reports of NPEs when using Solr 1.4 - so far these all seem to 
> relate to getting a null returned by Fieldable.stringValue when it isn't 
> expected or accounted for.  Thread where this was initially discussed...
> http://old.nabble.com/NPE-when-trying-to-view-a-specific-document-via-Luke-to26330237.html
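
A hedged sketch of the defensive pattern involved (not the attached patch): anything that walks stored fields has to expect a null from stringValue() when the field holds binary data.

{code}
import org.apache.lucene.document.Fieldable;

public class BinaryFieldGuardSketch {
  // Illustrative helper showing what a handler might do per stored field.
  static String displayValue(Fieldable f) {
    String v = f.stringValue();
    if (v == null) {
      // Binary fields have no string value; report that instead of NPE-ing.
      byte[] bytes = f.getBinaryValue();
      return "(binary, " + (bytes == null ? 0 : bytes.length) + " bytes)";
    }
    return v;
  }
}
{code}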

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1571) unicode collation support

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1571:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> unicode collation support
> -
>
> Key: SOLR-1571
> URL: https://issues.apache.org/jira/browse/SOLR-1571
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Reporter: Robert Muir
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1571.patch
>
>
> This patch adds support for unicode collation (searching and sorting).
> Unicode collation is helpful in a search engine; for many languages you want 
> things to match or sort differently.
> You might even want to use copyfield and support different sort 
> orders/matching schemes if you need to support multiple languages.
> This is simply a factory for lucene's CollationKeyFilter, which indexes 
> binary collation keys in a special format that preserves binary sort order.
> I've added support for creating a Collator in two ways:
> * system collator from a Locale spec (language + country + variant)
> * tailored collator from custom rules in a text file
> There is deliberately no option to use the "default" locale of the JVM (I 
> consider this a bit dangerous); in this patch it is mandatory to define the 
> locale explicitly for a system collator.
> The required lucene-collation-2.9.1.jar is only 12KB.
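
For readers unfamiliar with the JDK side of this, the two construction modes listed above map onto java.text roughly as follows; the factory's own attribute names live in the patch, so this only shows the underlying API that CollationKeyFilter wraps.

{code}
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

public class CollatorModesSketch {
  public static void main(String[] args) throws Exception {
    // 1) System collator from an explicit locale spec (language + country [+ variant]).
    Collator german = Collator.getInstance(new Locale("de", "DE"));
    german.setStrength(Collator.PRIMARY);  // primary strength: ignore case/accent differences

    // 2) Tailored collator built from custom rules (in Solr these would come from a text file).
    Collator tailored = new RuleBasedCollator("< a < b < c");

    System.out.println(german.compare("Müller", "Mueller"));
    System.out.println(tailored.compare("a", "c"));
  }
}
{code}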

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1569) Allow literal Strings in functions

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1569:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Allow literal Strings in functions
> --
>
> Key: SOLR-1569
> URL: https://issues.apache.org/jira/browse/SOLR-1569
> Project: Solr
>  Issue Type: Bug
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1569.patch
>
>
> Some functions (for instance, those that take a geohash) may need to pass 
> literal strings.  This patch modifies the FunctionQParser to allow for quoted 
> strings in functions (either single quote or double quote) to be passed 
> through as a LiteralValueSource.  It also adds the LiteralValueSource.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1570) Complain loudly if uniqueKey field is defined but not stored=true,multiValued=false

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1570:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Complain loudly if uniqueKey field is defined but not 
> stored=true,multiValued=false
> 
>
> Key: SOLR-1570
> URL: https://issues.apache.org/jira/browse/SOLR-1570
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1570.patch
>
>
> When loading a new schema, Solr should log some "SEVERE" warnings if the 
> schema uses a uniqueKey field, but that field/type doesn't match what most 
> functionality expects of a uniqueKey field (stored=true, 
> multiValued=false) ... that way people won't (have any reason to) be surprised 
> when things break later.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1572) FastLRUCache doesn't correctly implement LRU after 2B accesses

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1572:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> FastLRUCache doesn't correctly implement LRU after 2B accesses
> --
>
> Key: SOLR-1572
> URL: https://issues.apache.org/jira/browse/SOLR-1572
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 1.5, 3.1, 4.0
>
>
> FastLRUCache doesn't correctly implement LRU after 2B accesses due to 
> Integer.MAX_VALUE being used internally instead of Long.MAX_VALUE
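
The failure mode is ordinary int overflow; a toy illustration (not the actual cache internals):

{code}
public class CounterOverflowSketch {
  public static void main(String[] args) {
    // Hypothetical access counter: after ~2.1 billion increments an int wraps
    // negative, so a newer access suddenly compares as "older" than older ones.
    int intCounter = Integer.MAX_VALUE;
    intCounter++;
    System.out.println(intCounter);      // -2147483648
    System.out.println(intCounter < 1);  // true -> LRU ordering is now wrong

    // A long counter keeps increasing for any realistic access rate.
    long longCounter = Integer.MAX_VALUE;
    longCounter++;
    System.out.println(longCounter);     // 2147483648
  }
}
{code}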

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1577) undesirable dataDir default in example config

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1577:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> undesirable dataDir default in example config
> -
>
> Key: SOLR-1577
> URL: https://issues.apache.org/jira/browse/SOLR-1577
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1577.patch
>
>
> dataDir in the example solrconfig.xml defaults to ./solr/data (as opposed to 
> the solr home)
> http://search.lucidimagination.com/search/document/7759f05f576d6727
> http://search.lucidimagination.com/search/document/c5ae6fa490d0f59a

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1574) simpler builtin functions

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1574:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> simpler builtin functions
> -
>
> Key: SOLR-1574
> URL: https://issues.apache.org/jira/browse/SOLR-1574
> Project: Solr
>  Issue Type: New Feature
>Reporter: Yonik Seeley
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1574.patch, SOLR-1574.patch
>
>
> Make it easier and less error prone to add simple functions.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1579) CLONE -stats.jsp XML escaping

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1579:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> CLONE -stats.jsp XML escaping
> -
>
> Key: SOLR-1579
> URL: https://issues.apache.org/jira/browse/SOLR-1579
> Project: Solr
>  Issue Type: Bug
>  Components: web gui
>Affects Versions: 1.4
>Reporter: David Bowen
>Assignee: Hoss Man
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1579.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The fix to SOLR-1008 was wrong.  It used chardata escaping for a value that 
> is an attribute value.
> I.e. instead of XML.escapeCharData it should call XML.escapeAttributeValue.
> Otherwise, any query used as a key in the filter cache whose printed 
> representation contains a double-quote character causes invalid XML to be 
> generated.
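
To see why the distinction matters: character-data escaping leaves double quotes alone, which is fatal inside name="...". A plain-string illustration (not the actual XML utility methods):

{code}
public class AttributeEscapingSketch {
  public static void main(String[] args) {
    // e.g. a filterCache key whose printed form contains double quotes
    String key = "title:\"some phrase\"";

    // char-data style escaping: & and < are handled, but " is left alone
    String charData = key.replace("&", "&amp;").replace("<", "&lt;");

    // attribute-value escaping additionally turns " into &quot;
    String attr = charData.replace("\"", "&quot;");

    System.out.println("<entry name=\"" + charData + "\"/>");  // invalid XML
    System.out.println("<entry name=\"" + attr + "\"/>");      // well-formed
  }
}
{code}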

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1582) DocumentBuilder does not properly handle binary field copy fields

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1582:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> DocumentBuilder does not properly handle binary field copy fields
> -
>
> Key: SOLR-1582
> URL: https://issues.apache.org/jira/browse/SOLR-1582
> Project: Solr
>  Issue Type: Bug
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Trivial
> Fix For: 1.5, 3.1, 4.0
>
>
> In DocumentBuilder, around lines 267, the BinaryField is created, but it is 
> never assigned to the field that is added to the output.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1580) Solr Configuration ignores 'mergeFactor' parameter, always uses Lucene default

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1580:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Solr Configuration ignores 'mergeFactor' parameter, always uses Lucene default
> --
>
> Key: SOLR-1580
> URL: https://issues.apache.org/jira/browse/SOLR-1580
> Project: Solr
>  Issue Type: Bug
>  Components: update
>Affects Versions: 1.4
>Reporter: Lance Norskog
>Assignee: Mark Miller
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1580.patch, SOLR-1580.patch
>
>
> The 'mergeFactor' parameter in solrconfig.xml  is parsed by SolrIndexConfig 
> but is not consulted by SolrIndexWriter. This parameter controls the number 
> of segments that are merged at once and also controls the total number of 
> segments allowed to accumulate in the index.
> [IndexWriter.mergeFactor|http://lucene.apache.org/java/2_9_1/api/all/org/apache/lucene/index/IndexWriter.html#getMergeFactor()]

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1584) setIncludeScore is added to the "FL" field instead of being concatenated

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1584:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> setIncludeScore is added to the "FL" field instead of being concatenated
> 
>
> Key: SOLR-1584
> URL: https://issues.apache.org/jira/browse/SOLR-1584
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.4
>Reporter: Asaf Ary
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1584.patch
>
>
> The current implementation of setIncludeScore(boolean) *adds* the value 
> "score" to the FL parameter.
> This causes a problem when using the setFields followed by include score.
> If I do this:
> setFields("*");
> setIncludeScore(true);
> I would expect the outcome to be "fl=*,score"
> Instead the outcome is: "fl=* &fl=score" which fails to use the score field 
> as FL is not a multi-valued field.
> The current implementation in the SolrJ SolrQuery object is:
> add("fl", "score")
> instead it should be:
> set("fl", get("fl") + ",score")
> obviously not as simplistic as that, but you catch my drift...
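
Expressed against the generic parameter getters/setters that setFields/setIncludeScore sit on top of, the suggested behaviour would be roughly the sketch below (illustrative, not the actual SolrJ change):

{code}
import org.apache.solr.client.solrj.SolrQuery;

public class FlConcatSketch {
  public static void main(String[] args) {
    SolrQuery q = new SolrQuery("*:*");
    q.set("fl", "*");  // roughly what setFields("*") boils down to

    // current behaviour per the report: q.add("fl", "score") -> fl=*&fl=score
    // suggested behaviour: append to the existing value instead
    String fl = q.get("fl");
    q.set("fl", fl == null ? "score" : fl + ",score");

    System.out.println(q);  // the single fl parameter now carries "*,score"
  }
}
{code}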

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1586) Create Spatial Point FieldTypes

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1586:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Create Spatial Point FieldTypes
> ---
>
> Key: SOLR-1586
> URL: https://issues.apache.org/jira/browse/SOLR-1586
> Project: Solr
>  Issue Type: Improvement
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: examplegeopointdoc.patch.txt, SOLR-1586-geohash.patch, 
> SOLR-1586.Mattmann.112209.geopointonly.patch.txt, 
> SOLR-1586.Mattmann.112209.geopointonly.patch.txt, 
> SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, 
> SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt, 
> SOLR-1586.Mattmann.112509.geopointandgeohash.patch.txt, 
> SOLR-1586.Mattmann.120709.geohashonly.patch.txt, 
> SOLR-1586.Mattmann.121209.geohash.outarr.patch.txt, 
> SOLR-1586.Mattmann.121209.geohash.outstr.patch.txt, 
> SOLR-1586.Mattmann.122609.patch.txt, SOLR-1586.patch, SOLR-1586.patch
>
>
> Per SOLR-773, create field types that hid the details of creating tiers, 
> geohash and lat/lon fields.
> Fields should take in lat/lon points in a single form, as in:
> lat lon

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1587) Propagating fl=*,score to shards

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1587:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Propagating fl=*,score to shards
> 
>
> Key: SOLR-1587
> URL: https://issues.apache.org/jira/browse/SOLR-1587
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.4
> Environment: any http solr server
>Reporter: Asaf Ary
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1587.patch
>
>
> When doing an HTTP request to a Solr Server using the shards parameter 
> ("_shards_") the behavior of the response varies.
> The following requests cause the entire document (all fields) to return in 
> the response:
> {quote}
> http://localhost:8180/solr/cpaCore/select/?q=*:*
> http://localhost:8180/solr/cpaCore/select/?q=*:*&fl=score
> 
> http://localhost:8180/solr/cpaCore/select/?q=*:*&shards=shardLocation/solr/cpaCore
> {quote}
> The following request causes only the fields "id" and "score" to return in 
> the response:
> {quote}
> 
> http://localhost:8180/solr/cpaCore/select/?q=*:*&fl=score&shards=shardLocation/solr/cpaCore
> {quote}
> I don't know if this is by design but it does provide for some inconsistent 
> behavior, as shard requests behave differently than regular requests.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1588) FieldProperties contains large commented code block

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1588:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> FieldProperties contains large commented code block
> ---
>
> Key: SOLR-1588
> URL: https://issues.apache.org/jira/browse/SOLR-1588
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
> Environment: My local MacBook pro laptop.
>Reporter: Chris A. Mattmann
>Assignee: Hoss Man
>Priority: Trivial
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1588.Mattmann.112209.patch.txt
>
>
> Lines 102-133 in o.a.solr.schema.FieldProperties contain a large commented 
> out block of code. It's generally a good thing to not commit this type of 
> code -- if it's needed it can be looked up by the CM system for a prior 
> revision. Trivial patch attached that removes the code block.
> (note I found this while trying to understand the FieldType hierarchy with 
> the goal of working on SOLR-1586 to help out...)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1590) Some basic javadoc for XMLWriter#startTag

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1590:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Some basic javadoc for XMLWriter#startTag
> -
>
> Key: SOLR-1590
> URL: https://issues.apache.org/jira/browse/SOLR-1590
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4
> Environment: My MacBook pro laptop.
>Reporter: Chris A. Mattmann
>Assignee: Hoss Man
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1590.Mattmann.112209.patch.txt
>
>
> Here's some javadoc for the XMLWriter#startTag method. It was unclear to me 
> what this method actually did (while writing a FieldType), so I looked up the 
> code, and decided I'd document what I found.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1593) reverse wildcard filter doesn't work for chars outside the BMP

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1593:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> reverse wildcard filter doesn't work for chars outside the BMP
> --
>
> Key: SOLR-1593
> URL: https://issues.apache.org/jira/browse/SOLR-1593
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1593.patch
>
>
> reverse wildcard filter doesn't work for chars outside the BMP.  reversing 
> characters that take up more than one Java char creates unpaired surrogates, 
> which get replaced with the replacement character at index time. See 
> https://issues.apache.org/jira/browse/LUCENE-2068
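
An illustration of the underlying Unicode problem (plain JDK code, not the Solr/Lucene filter):

{code}
public class SurrogateReverseSketch {
  public static void main(String[] args) {
    // 'a' + U+1D11E MUSICAL SYMBOL G CLEF (one code point, two Java chars) + 'b'
    String s = "a\uD834\uDD1Eb";

    // Naive char-by-char reversal splits the surrogate pair, leaving unpaired
    // surrogates that an indexer may replace with U+FFFD.
    StringBuilder bad = new StringBuilder();
    for (int i = s.length() - 1; i >= 0; i--) {
      bad.append(s.charAt(i));
    }

    // Code-point-aware reversal keeps the pair intact.
    StringBuilder good = new StringBuilder(s.length());
    for (int i = s.length(); i > 0; ) {
      int cp = s.codePointBefore(i);
      good.appendCodePoint(cp);
      i -= Character.charCount(cp);
    }

    System.out.println(bad.codePointCount(0, bad.length()));    // 4: the pair was broken
    System.out.println(good.codePointCount(0, good.length()));  // 3: still one code point
  }
}
{code}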

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1592) Refactor XMLWriter startTag to allow arbitrary attributes to be written

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1592:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Refactor XMLWriter startTag to allow arbitrary attributes to be written
> ---
>
> Key: SOLR-1592
> URL: https://issues.apache.org/jira/browse/SOLR-1592
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.4
> Environment: My MacBook laptop.
>Reporter: Chris A. Mattmann
>Assignee: Noble Paul
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1592.Mattmann.112209.patch.txt, 
> SOLR-1592.Mattmann.112209_02.patch.txt, SOLR-1592.patch, SOLR-1592.patch
>
>
> There are certain cases in which a user would like to write arbitrary 
> attributes as part of the XML output for a field tag. Case in point: I'd like 
> to declare tags in the SOLR output that are e.g., georss namespace, like 
> georss:point. Other users may want to declare myns:mytag tags, which should 
> be perfectly legal as SOLR goes. This isn't currently possible with the 
> XMLWriter implementation, which curiously only allows the attribute "name" to 
> be included in the XML tags. 
> Coincidentally, users of XMLWriter aren't allowed to modify the  outer XML tag to include those arbitrary namespaces (which was my original 
> thought as a workaround for this). This wouldn't matter anyways, because by 
> the time the user got to the FieldType#writeXML method, the header for the 
> XML would have been written anyways.
> I've developed a workaround, and in doing so, allowed something that should 
> have probably been allowed in the first place: allow a user to write 
> arbitrary attributes (including xmlns:myns="myuri") as part of the 
> XMLWriter#startTag function. I've kept the existing #startTag, but replaced 
> its innards with versions of startTag that include startTagWithNamespaces, 
> and startTagNoAttrs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1595) StreamingUpdateSolrServer doesn't specify UTF-8 when creating OutputStreamWriter

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1595:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> StreamingUpdateSolrServer doesn't specify UTF-8 when creating 
> OutputStreamWriter
> ---
>
> Key: SOLR-1595
> URL: https://issues.apache.org/jira/browse/SOLR-1595
> Project: Solr
>  Issue Type: Bug
>  Components: clients - java
>Affects Versions: 1.3, 1.4
>Reporter: Hoss Man
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1595.patch
>
>
> StreamingUpdateSolrServer constructs an OutputStreamWriter without specifying 
> that it should use UTF-8 ... as a result the JVM's default encoding is used 
> even though the request includes a Content-Type header of "text/xml; 
> charset=utf-8" via ClientUtils.TEXT_XML.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1601) Schema browser does not indicate presence of charFilter

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1601:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> Schema browser does not indicate presence of charFilter
> ---
>
> Key: SOLR-1601
> URL: https://issues.apache.org/jira/browse/SOLR-1601
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Jake Brownell
>Assignee: Koji Sekiguchi
>Priority: Trivial
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1601.patch
>
>
> My schema has a field defined as:
> {noformat}
> <fieldType name="..." class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
>             catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
>             catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
> {noformat}
> and when I view the field in the schema browser, I see:
> {noformat}
> Tokenized:  true
> Class Name:  org.apache.solr.schema.TextField
> Index Analyzer: org.apache.solr.analysis.TokenizerChain 
> Tokenizer Class:  org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:  
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt 
> ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: 
> 1 generateNumberParts: 1 catenateWords: 1 generateWordParts: 1 catenateAll: 0 
> catenateNumbers: 1 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: 
> protwords.txt }
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
> Query Analyzer: org.apache.solr.analysis.TokenizerChain 
> Tokenizer Class:  org.apache.solr.analysis.WhitespaceTokenizerFactory
> Filters:  
> org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt 
> expand: true ignoreCase: true }
> org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt 
> ignoreCase: true enablePositionIncrements: true }
> org.apache.solr.analysis.WordDelimiterFilterFactory args:{splitOnCaseChange: 
> 1 generateNumberParts: 1 catenateWords: 0 generateWordParts: 1 catenateAll: 0 
> catenateNumbers: 0 }
> org.apache.solr.analysis.LowerCaseFilterFactory args:{}
> org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected: 
> protwords.txt }
> org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
> {noformat}
> It's not a big deal, but I expected to see some indication of the charFilter 
> that is in place.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1596) rollback may be resulting in a SolrIndexWriter that doesn't get closed properly

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1596:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> rollback may be resulting in a SolrIndexWriter that doesn't get closed properly
> -
>
> Key: SOLR-1596
> URL: https://issues.apache.org/jira/browse/SOLR-1596
> Project: Solr
>  Issue Type: Bug
>  Components: update
>Reporter: Hoss Man
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1596.patch
>
>
> anecdotal evidence of "SEVERE: SolrIndexWriter was not closed prior to 
> finalize" messages seen in the wild.  May be related to using  the "rollback" 
> command...
> http://old.nabble.com/SEVERE%3A-SolrIndexWriter-was-not-closed-prior-to-finalize-to26217896.html#a26217896

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1610) Add generics to SolrCache

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1610:
---

Fix Version/s: 3.1
   4.0

> Add generics to SolrCache
> -
>
> Key: SOLR-1610
> URL: https://issues.apache.org/jira/browse/SOLR-1610
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1610.patch
>
>
> Seems fairly simple for SolrCache to have generics.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1608) Make it easy to write distributed search test cases

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1608:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Make it easy to write distributed search test cases
> ---
>
> Key: SOLR-1608
> URL: https://issues.apache.org/jira/browse/SOLR-1608
> Project: Solr
>  Issue Type: Improvement
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1608.patch, SOLR-1608.patch, SOLR-1608.patch
>
>
> Extract base class from TestDistributedSearch to make it easier for people to 
> write test cases for distributed components.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1611) Import Lucene 2.9.1 Collation jar

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1611:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Import Lucene 2.9.1 Collation jar
> -
>
> Key: SOLR-1611
> URL: https://issues.apache.org/jira/browse/SOLR-1611
> Project: Solr
>  Issue Type: New Feature
>Reporter: Shalin Shekhar Mangar
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
>
> Bring in the collation contrib jar so that we can use CollationKeyFilter and 
> ICUCollationKeyFilter wherever appropriate.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr version housekeeping in Jira/Wiki (1.5, 1.6, 3.x, 4.x, etc...)

2010-05-27 Thread Chris Hostetter
:   ...and here's the final contents of all.fixed.txt...

Checkpoint: going sequentially down the list, I've manually fixed 
everything up to and including SOLR-1611.  (only 66 left, woot!) ... I'll 
finish up the rest after dinner.

: 
: https://issues.apache.org/jira/browse/SOLR-1131   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1139   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1177   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1268   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1297   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1302   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1357   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1379   3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1432   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1516   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1522   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1532   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1538   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1553   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1558   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1561   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1563   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1569   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1570   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1571   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1572   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1574   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1577   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1579   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1580   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1582   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1584   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1586   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1587   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1588   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1590   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1592   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1593   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1595   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1596   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1601   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1608   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1610   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1611   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1615   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1621   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1624   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1625   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1628   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1635   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1637   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1651   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1653   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1657   3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1660   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1661   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1662   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1667   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1674   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1677   3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1679   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1695   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1696   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1697   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1704   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1706   3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1711   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1722   1.5 3.1 4.0
: h

Build failed in Hudson: Lucene-trunk #1201

2010-05-27 Thread Apache Hudson Server
See 

Changes:

[shaie] LUCENE-2455: Some house cleaning in addIndexes* (trunk)

--
[...truncated 7459 lines...]
  [javadoc] Building tree for all the packages and classes...
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...
  [javadoc] Generating 

  [javadoc] Note: Custom tags that were not seen:  @lucene.experimental, 
@lucene.internal
  [jar] Building jar: 

 [echo] Building misc...

javadocs:
[mkdir] Created dir: 

  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] Loading source files for package org.apache.lucene.index...
  [javadoc] Loading source files for package org.apache.lucene.misc...
  [javadoc] Constructing Javadoc information...
  [javadoc] Standard Doclet version 1.5.0_22
  [javadoc] Building tree for all the packages and classes...
  [javadoc] 
:43:
 warning - Tag @link: reference not found: IndexWriter#addIndexes(IndexReader[])
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...
  [javadoc] Generating 

  [javadoc] Note: Custom tags that were not seen:  @lucene.internal
  [javadoc] 1 warning
  [jar] Building jar: 

 [echo] Building queries...

javadocs:
[mkdir] Created dir: 

  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] Loading source files for package org.apache.lucene.search...
  [javadoc] Loading source files for package org.apache.lucene.search.regex...
  [javadoc] Loading source files for package org.apache.lucene.search.similar...
  [javadoc] Constructing Javadoc information...
  [javadoc] Standard Doclet version 1.5.0_22
  [javadoc] Building tree for all the packages and classes...
  [javadoc] 
:525:
 warning - Tag @see: reference not found: 
org.apache.lucene.analysis.StopFilter#makeStopSet StopFilter.makeStopSet()
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...
  [javadoc] Generating 

  [javadoc] Note: Custom tags that were not seen:  @lucene.experimental, 
@lucene.internal
  [javadoc] 1 warning
  [jar] Building jar: 

 [echo] Building queryparser...

javadocs:
[mkdir] Created dir: 

  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.analyzing...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.complexPhrase...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.core...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.core.builders...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.core.config...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.core.messages...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.core.nodes...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.core.parser...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.core.processors...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.core.util...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.ext...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.precedence...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.standard...
  [javadoc] Loading source files for package 
org.apache.lucene.queryParser.standard.builders...
  [javadoc] Loading source fi

Re: Solr version housekeeping in Jira/Wiki (1.5, 1.6, 3.x, 4.x, etc...)

2010-05-27 Thread Mattmann, Chris A (388J)
Good job, Hoss!

Cheers,
Chris


On 5/27/10 4:28 PM, "Chris Hostetter"  wrote:

:   ...and here's the final contents of all.fixed.txt...

Checkpoint: going sequentially down the list, I've manually fixed
everything up to and including SOLR-1611.  (only 66 left, woot!) ... I'll
finish up the rest after dinner.

:
: https://issues.apache.org/jira/browse/SOLR-1131   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1139   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1177   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1268   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1297   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1302   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1357   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1379   3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1432   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1516   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1522   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1532   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1538   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1553   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1558   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1561   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1563   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1569   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1570   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1571   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1572   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1574   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1577   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1579   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1580   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1582   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1584   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1586   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1587   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1588   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1590   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1592   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1593   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1595   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1596   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1601   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1608   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1610   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1611   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1615   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1621   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1624   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1625   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1628   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1635   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1637   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1651   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1653   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1657   3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1660   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1661   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1662   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1667   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1674   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1677   3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1679   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1695   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1696   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1697   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1704   1.5 3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1706   3.1 4.0
: https://issues.apache.org/jira/browse/SOLR-1711   1.5 3.1 4.0
: 

[jira] Updated: (SOLR-1615) backslash escaping bug

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1615:
---

Fix Version/s: 1.5
   3.1
   4.0
   (was: Next)


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> backslash escaping bug
> --
>
> Key: SOLR-1615
> URL: https://issues.apache.org/jira/browse/SOLR-1615
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.3
>Reporter: Yonik Seeley
>Assignee: Yonik Seeley
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1615.patch
>
>
> Backslash escaping isn't done properly in quoted strings for the StrParser, 
> which is used for function queries and for local params.
> http://search.lucidimagination.com/search/document/7e4c934a3168e53e/character_escape_in_queryparsing_strparser_getquotedstring
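
For context, a minimal sketch of the kind of backslash handling a quoted-string scanner needs. This is illustrative only, not the StrParser code; the class and method names are made up.

{code}
// Illustrative quoted-string reader that honors backslash escapes.
public final class QuotedStringReaderSketch {
  /** Reads a quoted string starting at start; a backslash escapes the next char. */
  public static String readQuoted(String s, int start) {
    char quote = s.charAt(start);                 // opening ' or "
    StringBuilder out = new StringBuilder();
    for (int i = start + 1; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '\\') {                            // escape: take the following char literally
        if (++i >= s.length()) throw new IllegalArgumentException("dangling backslash");
        out.append(s.charAt(i));
      } else if (c == quote) {                    // unescaped closing quote ends the string
        return out.toString();
      } else {
        out.append(c);
      }
    }
    throw new IllegalArgumentException("unterminated quoted string");
  }
}
{code}

With such handling, an input like 'a\'b' yields a'b instead of terminating at the escaped quote.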

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1621) Allow current single core deployments to be specified by solr.xml

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1621:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> Allow current single core deployments to be specified by solr.xml
> -
>
> Key: SOLR-1621
> URL: https://issues.apache.org/jira/browse/SOLR-1621
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 1.5
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, 
> SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch, 
> SOLR-1621.patch, SOLR-1621.patch, SOLR-1621.patch
>
>
> Supporting two different modes of deployment is turning out to be hard. This 
> leads to duplication of code. Moreover, there is a lot of confusion about 
> where to put common configuration. See the mail thread 
> http://markmail.org/message/3m3rqvp2ckausjnf

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1628) log contains incorrect number of adds and deletes

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1628:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> log contains incorrect number of adds and deletes
> -
>
> Key: SOLR-1628
> URL: https://issues.apache.org/jira/browse/SOLR-1628
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 1.4
>Reporter: Yonik Seeley
> Fix For: 1.5, 3.1, 4.0
>
>
> LogUpdateProcessorFactory logs the wrong number of deletes/adds when there 
> are more than 8.
> http://search.lucidimagination.com/search/document/f75c6a5a58e205a4/minor_nit
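
A hedged sketch of the usual fix pattern for this kind of bug (not the actual LogUpdateProcessor code; the cap of 8 and the names are assumptions): count every document, and cap only the list of ids kept for the log line.

{code}
import java.util.ArrayList;
import java.util.List;

// Sketch: count all adds, but remember only the first few ids for logging.
class AddLogSketch {
  private static final int MAX_LOGGED_IDS = 8;   // assumed cap, for illustration
  private final List<String> firstIds = new ArrayList<String>();
  private long numAdds = 0;

  void onAdd(String id) {
    numAdds++;                                   // the real counter always increments
    if (firstIds.size() < MAX_LOGGED_IDS) {
      firstIds.add(id);                          // only a sample of ids is kept
    }
  }

  String summary() {
    // report numAdds, not firstIds.size(), e.g. "add=[1, 2, ...] (42 adds)"
    return "add=" + firstIds + " (" + numAdds + " adds)";
  }
}
{code}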

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1624) Highlighter bug with MultiValued field + TermPositions optimization

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1624:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Highlighter bug with MultiValued field + TermPositions optimization
> ---
>
> Key: SOLR-1624
> URL: https://issues.apache.org/jira/browse/SOLR-1624
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 1.4
>Reporter: Chris Harris
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1624.patch
>
>
> When TermPositions are stored, 
> DefaultSolrHighlighter.doHighlighting(DocList docs, Query query, 
> SolrQueryRequest req, String[] defaultFields) currently initializes tstream 
> only for the first value of a multi-valued field. (On subsequent passes 
> through the loop, reinitialization is preempted by tots being non-null.) This 
> means 
> that the 2nd/3rd/etc. values are not considered for highlighting purposes, 
> resulting in missed highlights.
> I'm attaching a patch with a test case to demonstrate the problem 
> (testTermVecMultiValuedHighlight2), as well as a proposed fix. All 
> highlighter tests pass with this applied. The patch should apply cleanly 
> against the latest trunk.
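
Schematically, the fix amounts to building a fresh token stream for every value of the multi-valued field instead of only the first one. The sketch below is illustrative, not the DefaultSolrHighlighter code; the two abstract hooks stand in for the real highlighter plumbing.

{code}
import java.util.List;
import org.apache.lucene.analysis.TokenStream;

// Sketch: every value of a multi-valued field gets its own TokenStream.
abstract class PerValueHighlighterSketch {
  // Hypothetical hooks standing in for the real token-stream and highlighting calls.
  abstract TokenStream tokenStream(String fieldValue);
  abstract void highlight(String fieldValue, TokenStream ts);

  void highlightAllValues(List<String> fieldValues) {
    for (String value : fieldValues) {
      // re-initialize for *every* value, not only when some cached object is null
      highlight(value, tokenStream(value));
    }
  }
}
{code}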

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1625) Add regexp support for TermsComponent

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1625:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Add regexp support for TermsComponent
> -
>
> Key: SOLR-1625
> URL: https://issues.apache.org/jira/browse/SOLR-1625
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Uri Boness
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1625.patch, SOLR-1625.patch, SOLR-1625.patch
>
>
> At the moment the only way to filter the returned terms is by prefix. It 
> would be nice if the filtering could also be done by regular expression.
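
Conceptually, the requested feature boils down to filtering the candidate terms through java.util.regex, roughly as sketched below. Illustrative only; the class name and how it would be wired into TermsComponent are assumptions.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Sketch: keep only terms matching a compiled regex, on top of prefix filtering.
final class RegexTermFilterSketch {
  private final Pattern pattern;

  RegexTermFilterSketch(String regex, int flags) {
    this.pattern = Pattern.compile(regex, flags);  // flags could carry CASE_INSENSITIVE etc.
  }

  List<String> filter(Iterable<String> terms) {
    List<String> kept = new ArrayList<String>();
    for (String term : terms) {
      if (pattern.matcher(term).matches()) {
        kept.add(term);
      }
    }
    return kept;
  }
}
{code}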

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1635) DOMUtils doesn't wrap NumberFormatExceptions with useful errors

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1635:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> DOMUtils doesn't wrap NumberFormatExceptions with useful errors
> ---
>
> Key: SOLR-1635
> URL: https://issues.apache.org/jira/browse/SOLR-1635
> Project: Solr
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1635.patch
>
>
> When parsing NamedList style XML, DOMUtils does a really crappy job of 
> reporting errors when it can't parse numeric types (ie: <int>, <float>, 
> etc...)
> http://old.nabble.com/java.lang.NumberFormatException%3A-For-input-string%3A-%22%22-to26631247.html
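
The kind of wrapping being asked for looks roughly like this sketch (not the actual DOMUtils code): keep the NumberFormatException as the cause, but say which element and which text failed.

{code}
import org.w3c.dom.Node;

// Sketch: wrap the NumberFormatException with the node name and the offending text.
final class NumericNodeParseSketch {
  static int parseIntValue(Node node) {
    String text = node.getTextContent();
    try {
      return Integer.parseInt(text.trim());
    } catch (NumberFormatException nfe) {
      throw new RuntimeException(
          "Value of <" + node.getNodeName() + "> is not a valid int: '" + text + "'", nfe);
    }
  }
}
{code}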

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1637) Remove ALIAS command

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1637:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> Remove ALIAS command
> 
>
> Key: SOLR-1637
> URL: https://issues.apache.org/jira/browse/SOLR-1637
> Project: Solr
>  Issue Type: Sub-task
>  Components: multicore
>Affects Versions: 1.5
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 1.5, 3.1, 4.0
>
>
> Multicore makes the CoreContainer code more complex. We should remove it for 
> now and revisit it later for a simpler, cleaner implementation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1657) convert the rest of solr to use the new tokenstream API

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1657:
---

Fix Version/s: 4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> convert the rest of solr to use the new tokenstream API
> ---
>
> Key: SOLR-1657
> URL: https://issues.apache.org/jira/browse/SOLR-1657
> Project: Solr
>  Issue Type: Task
>Reporter: Robert Muir
>Assignee: Mark Miller
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-1657.patch, SOLR-1657.patch, SOLR-1657.patch, 
> SOLR-1657.patch, SOLR-1657_part2.patch, 
> SOLR-1657_synonyms_ugly_slightly_less_slow.patch, 
> SOLR-1657_synonyms_ugly_slow.patch
>
>
> org.apache.solr.analysis:
> -BufferedTokenStream-
>  -> -CommonGramsFilter-
>  -> -CommonGramsQueryFilter-
>  -> -RemoveDuplicatesTokenFilter-
> -CapitalizationFilterFactory-
> -HyphenatedWordsFilter-
> -LengthFilter (deprecated, remove)-
> SynonymFilter
> SynonymFilterFactory
> -WordDelimiterFilter-
> -org.apache.solr.handler:-
> -AnalysisRequestHandler-
> -AnalysisRequestHandlerBase-
> -org.apache.solr.handler.component:-
> -QueryElevationComponent-
> -SpellCheckComponent-
> -org.apache.solr.highlight:-
> -DefaultSolrHighlighter-
> -org.apache.solr.spelling:-
> -SpellingQueryConverter-
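
For readers unfamiliar with the conversion, the target style is the attribute-based TokenStream API: no Token objects, just incrementToken() plus attributes fetched once in the constructor. A minimal sketch, not taken from any of the classes listed above; the filter itself is made up.

{code}
import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;

// Sketch of a filter written against the new attribute-based API.
final class LowerCaseSketchFilter extends TokenFilter {
  private final TermAttribute termAtt;

  LowerCaseSketchFilter(TokenStream input) {
    super(input);
    termAtt = (TermAttribute) addAttribute(TermAttribute.class);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false;                        // no more tokens upstream
    }
    char[] buffer = termAtt.termBuffer();  // mutate the term in place
    int len = termAtt.termLength();
    for (int i = 0; i < len; i++) {
      buffer[i] = Character.toLowerCase(buffer[i]);  // simplified: ignores supplementary chars
    }
    return true;
  }
}
{code}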

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1651) Incorrect dataimport handler package name in SolrResourceLoader

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1651:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Incorrect dataimport handler package name in SolrResourceLoader
> ---
>
> Key: SOLR-1651
> URL: https://issues.apache.org/jira/browse/SOLR-1651
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
>Reporter: Akshay K. Ukey
>Assignee: Shalin Shekhar Mangar
>Priority: Trivial
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1651.patch
>
>
> "packages" String array used by findClass method in SolrResourceLoader has 
> value for dataimport handler package as "handler.dataimport", must be 
> "handler.dataimport."

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1653) add PatternReplaceCharFilter

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1653:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> add PatternReplaceCharFilter
> 
>
> Key: SOLR-1653
> URL: https://issues.apache.org/jira/browse/SOLR-1653
> Project: Solr
>  Issue Type: New Feature
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Koji Sekiguchi
>Assignee: Koji Sekiguchi
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1653.patch, SOLR-1653.patch
>
>
> Add a new CharFilter that uses a regular expression to match the target of 
> the replacement in the character stream.
> Usage:
> {code:title=schema.xml}
> <fieldType name="..." class="solr.TextField" positionIncrementGap="100" >
>   <analyzer>
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 groupedPattern="([nN][oO]\.)\s*(\d+)"
>                 replaceGroups="1,2" blockDelimiters=":;"/>
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="mapping-ISOLatin1Accent.txt"/>
>     <tokenizer class="..."/>
>   </analyzer>
> </fieldType>
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1660) capitalizationfilter crashes if you use the maxWordCountOption

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1660:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> capitalizationfilter crashes if you use the maxWordCountOption
> --
>
> Key: SOLR-1660
> URL: https://issues.apache.org/jira/browse/SOLR-1660
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Robert Muir
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1660.patch
>
>
> Because it arraycopies into null.
> If you want a testcase I can yank it out of the in-progress patch from 
> SOLR-1657, but I think it's obvious.
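
The defensive pattern behind the fix, sketched for illustration (this is not the CapitalizationFilter code): make sure the destination buffer exists and is large enough before System.arraycopy, instead of copying into null.

{code}
// Sketch: (re)allocate the destination buffer before copying into it.
final class BufferCopySketch {
  private char[] output = new char[0];

  void set(char[] src, int len) {
    if (output.length < len) {
      output = new char[len];              // grow (or create) before the copy
    }
    System.arraycopy(src, 0, output, 0, len);
  }
}
{code}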

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1661) Remove adminCore from CoreContainer

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1661:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> Remove adminCore from CoreContainer
> ---
>
> Key: SOLR-1661
> URL: https://issues.apache.org/jira/browse/SOLR-1661
> Project: Solr
>  Issue Type: Task
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Minor
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1661.patch, SOLR-1661.patch
>
>
> We have deprecated the admin core concept as part of SOLR-1121. It can be 
> removed completely now.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1662) BufferedTokenStream incorrect cloning

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1662:
---

Fix Version/s: 1.5
   3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> BufferedTokenStream incorrect cloning
> -
>
> Key: SOLR-1662
> URL: https://issues.apache.org/jira/browse/SOLR-1662
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Robert Muir
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1662.patch
>
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes 
> (BaseTokenTestCase) to use the new TokenStream API, but also with some 
> additional safety.
> {code}
>  public static String tsToString(TokenStream in) throws IOException {
> StringBuilder out = new StringBuilder();
> TermAttribute termAtt = (TermAttribute) 
> in.addAttribute(TermAttribute.class);
> // extra safety to enforce, that the state is not preserved and also
> // assign bogus values
> in.clearAttributes();
> termAtt.setTermBuffer("bogusTerm");
> while (in.incrementToken()) {
>   if (out.length() > 0)
> out.append(' ');
>   out.append(termAtt.term());
>   in.clearAttributes();
>   termAtt.setTermBuffer("bogusTerm");
> }
> in.close();
> return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do 
> not clear or clone properly. In this case there is a problem with the 
> tokenstream AB_AAB_Stream in TestBufferedTokenStream: it converts A B -> A A 
> B but does not clone, so the values get overwritten.
> This can be fixed in two ways: 
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one should it be?
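
A sketch of the first option ("BufferedTokenStream does the cloning"): anything written into the lookahead buffer is defensively copied, so later tokens cannot overwrite it. Illustrative only; not the actual BufferedTokenStream code.

{code}
import java.util.LinkedList;
import org.apache.lucene.analysis.Token;

// Sketch: clone tokens on the way into the buffer.
final class TokenBufferSketch {
  private final LinkedList<Token> buffer = new LinkedList<Token>();

  void write(Token t) {
    buffer.add((Token) t.clone());   // defensive copy before buffering
  }

  Token read() {
    return buffer.isEmpty() ? null : buffer.removeFirst();
  }
}
{code}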

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1674) improve analysis tests, cut over to new API

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1674:
---

Fix Version/s: 1.5
   3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e

> improve analysis tests, cut over to new API
> ---
>
> Key: SOLR-1674
> URL: https://issues.apache.org/jira/browse/SOLR-1674
> Project: Solr
>  Issue Type: Test
>  Components: Schema and Analysis
>Reporter: Robert Muir
>Assignee: Mark Miller
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1674.patch, SOLR-1674.patch, SOLR-1674_speedup.patch
>
>
> This patch
> * converts all analysis tests to use the new tokenstream api
> * converts most tests to use the more stringent assertion mechanisms from 
> lucene
> * adds new tests to improve coverage
> Most bugs found by more stringent testing have been fixed, with the exception 
> of SynonymFilter.
> The problems with this filter are more serious; the previous tests were 
> essentially a no-op.
> The new tests for SynonymFilter test the current behavior, but carry FIXME 
> comments noting what I think the old tests intended to expect.
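
As an illustration of the "more stringent" style, the Lucene test base class checks terms, offsets, and position increments together rather than a joined term string. Sketch only; the exact assertAnalyzesTo overloads differ slightly between Lucene versions.

{code}
import org.apache.lucene.analysis.BaseTokenStreamTestCase;
import org.apache.lucene.analysis.WhitespaceAnalyzer;

// Sketch: assert terms, offsets, and position increments in one call.
public class WhitespaceSketchTest extends BaseTokenStreamTestCase {
  public void testSimple() throws Exception {
    assertAnalyzesTo(new WhitespaceAnalyzer(), "foo bar",
        new String[] { "foo", "bar" },   // expected terms
        new int[] { 0, 4 },              // expected start offsets
        new int[] { 3, 7 },              // expected end offsets
        new int[] { 1, 1 });             // expected position increments
  }
}
{code}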

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1667) PatternTokenizer does not clearAttributes()

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1667:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> PatternTokenizer does not clearAttributes()
> ---
>
> Key: SOLR-1667
> URL: https://issues.apache.org/jira/browse/SOLR-1667
> Project: Solr
>  Issue Type: Bug
>  Components: Schema and Analysis
>Affects Versions: 1.4
>Reporter: Robert Muir
>Assignee: Shalin Shekhar Mangar
> Fix For: 1.5, 3.1, 4.0
>
> Attachments: SOLR-1667.patch
>
>
> PatternTokenizer creates tokens but never calls clearAttributes().
> Because of this, things like positionIncrementGap are never reset to their 
> default values.
> Trivial patch.
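
The general rule the patch enforces, as a sketch (not the PatternTokenizer code; nextMatch() is a made-up stand-in for the regex matching): call clearAttributes() before populating the attributes of each emitted token.

{code}
import java.io.IOException;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.tokenattributes.TermAttribute;

// Sketch: clearAttributes() before setting per-token state.
abstract class ClearingTokenizerSketch extends Tokenizer {
  private final TermAttribute termAtt = (TermAttribute) addAttribute(TermAttribute.class);

  // Hypothetical source of the next token's text (null when exhausted).
  protected abstract String nextMatch() throws IOException;

  @Override
  public boolean incrementToken() throws IOException {
    String text = nextMatch();
    if (text == null) {
      return false;
    }
    clearAttributes();              // the missing call this issue is about
    termAtt.setTermBuffer(text);
    return true;
  }
}
{code}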

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Updated: (SOLR-1677) Add support for o.a.lucene.util.Version for BaseTokenizerFactory and BaseTokenFilterFactory

2010-05-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-1677:
---

Fix Version/s: 3.1
   4.0


Correcting Fix Version based on CHANGES.txt, see this thread for more details...

http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3calpine.deb.1.10.1005251052040.24...@radix.cryptio.net%3e


> Add support for o.a.lucene.util.Version for BaseTokenizerFactory and 
> BaseTokenFilterFactory
> ---
>
> Key: SOLR-1677
> URL: https://issues.apache.org/jira/browse/SOLR-1677
> Project: Solr
>  Issue Type: Sub-task
>  Components: Schema and Analysis
>Reporter: Uwe Schindler
> Fix For: 3.1, 4.0
>
> Attachments: SOLR-1677-lucenetrunk-branch-2.patch, 
> SOLR-1677-lucenetrunk-branch-3.patch, SOLR-1677-lucenetrunk-branch.patch, 
> SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch
>
>
> Since Lucene 2.9, a lot of analyzers use a Version constant to keep backwards 
> compatibility with old indexes created using older versions of Lucene. The 
> most important example is StandardTokenizer, which changed its behaviour with 
> posIncr and incorrect host token types in 2.4 and also in 2.9.
> In Lucene 3.0 this matchVersion ctor parameter is mandatory and in 3.1, with 
> much more Unicode support, almost every Tokenizer/TokenFilter needs this 
> Version parameter. In 2.9, the deprecated old ctors without Version take 
> LUCENE_24 as default to mimic the old behaviour, e.g. in StandardTokenizer.
> This patch adds basic support for the Lucene Version property to the base 
> factories. Subclasses then can use the luceneMatchVersion decoded enum (in 
> 3.0) / Parameter (in 2.9) for constructing Tokenstreams. The code currently 
> contains a helper map to decode the version strings, but in 3.0 it can be 
> replaced by Version.valueOf(String), as Version is a Java 5 enum. The default 
> value is Version.LUCENE_24 (as this is the default for the 
> no-version ctors in Lucene).
> This patch also removes unneeded conversions to CharArraySet from 
> StopFilterFactory (now done by Lucene since 2.9). The generics are also fixed 
> to match Lucene 3.0.
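
The decoding step described above, as a hedged sketch (not the actual factory code): read the luceneMatchVersion argument and fall back to LUCENE_24, the default of the old no-Version constructors. With Lucene 3.0, Version is an enum, so Version.valueOf(String) can do the decoding.

{code}
import java.util.Map;
import org.apache.lucene.util.Version;

// Sketch: decode the luceneMatchVersion factory argument.
final class MatchVersionSketch {
  static Version parseMatchVersion(Map<String, String> args) {
    String s = args.get("luceneMatchVersion");
    return s == null ? Version.LUCENE_24 : Version.valueOf(s.toUpperCase());
  }
}
{code}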

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


