Hudson build is back to normal: Solr-trunk #446

2008-05-20 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Solr-trunk/446/changes




Release of SOLR 1.3

2008-05-20 Thread Andrew Savory
Hi,

(discussion moved from -user to -dev)

2008/5/19 Chris Hostetter <[EMAIL PROTECTED]>:

> If people are particularly eager to see a 1.3 release, the best thing to
> do is subscribe to solr-dev and start a dialog there about what issues
> people think are "show stoppers" for 1.3 and what assistance the various
> people working on those issues can use.

So, what are the show stoppers, how can we help, what can we reassign
to a future release?

Taking a look through the list there's quite a few issues with patches
attached that aren't applied yet. Clearing these out would cut the
open bug count by almost half:

SOLR-515
SOLR-438
SOLR-351 (applied?)
SOLR-281 (applied?)
SOLR-424
SOLR-243 (stuck in review hell?)
SOLR-433
SOLR-510
SOLR-139
SOLR-521 (applied, waiting to be closed)
SOLR-284
SOLR-560
SOLR-469
SOLR-572
SOLR-565

It's a little weird to see patch 'development' going on in JIRA
(sometimes for over a year), rather than getting the patches into svn
and then working there... I'd worry that some valuable code history is
getting lost along the way? Yes, it's a tough call between adding
'bad' code and waiting for the perfect patch, but bad code creates
healthy communities and is better than no code :-)


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


[jira] Updated: (SOLR-303) Distributed Search over HTTP

2008-05-20 Thread Lars Kotthoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Kotthoff updated SOLR-303:
---

Attachment: solr-dist-faceting-non-ascii-all.patch

I've had a couple of issues with the current version. First, the facet queries 
which are sent to the other shards are posted in the URL but aren't URL 
encoded, i.e. during the refine stage anything non-ASCII results in facet 
counts for "new" values (i.e. the garbled versions) coming back and causing 
NPEs when trying to update the counts.

Furthermore, facet.limit= isn't working as expected, i.e. 
instead of all facets it returns none. Also, facet.sort is not automatically 
enabled for negative values.

I've attached "solr-dist-faceting-non-ascii-all.patch" which fixes the above 
issues. Somebody who understands what everything is supposed to do should have 
a look over it though :)
For example, I've found two linked hash maps in FacetInfo, topFacets and 
listFacets, which seem to serve the same purpose, so I replaced them with a 
single hash map. It seems to work just fine this way.
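As a minimal illustration of the underlying point of the fix (not the patch 
itself): any facet value appended to a shard request URL has to be URL encoded, 
otherwise non-ASCII values arrive garbled at the other shards. The parameter 
name facet.value below is hypothetical; the real distributed-faceting 
parameters are handled inside the attached patch.

    import java.net.URLEncoder;

    // Sketch: URL encode a facet value before appending it to a shard request.
    public class FacetValueEncoding {
        public static String appendFacetValue(String shardUrl, String value) throws Exception {
            return shardUrl + "&facet.value=" + URLEncoder.encode(value, "UTF-8");
        }

        public static void main(String[] args) throws Exception {
            // A non-ASCII value such as "café" only survives the round trip if encoded.
            System.out.println(appendFacetValue("http://shard1:8983/solr/select?q=*:*", "café"));
        }
    }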

> Distributed Search over HTTP
> 
>
> Key: SOLR-303
> URL: https://issues.apache.org/jira/browse/SOLR-303
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sharad Agarwal
>Assignee: Yonik Seeley
> Attachments: distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed_add_tests_for_intended_behavior.patch, 
> distributed_facet_count_bugfix.patch, distributed_pjaol.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.stu.patch, 
> fedsearch.stu.patch, shards_qt.patch, solr-dist-faceting-non-ascii-all.patch
>
>
> Searching over multiple shards and aggregating results.
> Motivated by http://wiki.apache.org/solr/DistributedSearch

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Release of SOLR 1.3

2008-05-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
+1
The code has changed radically between Solr 1.2 and Solr 1.3. Because 1.3 is 
not released, most of us have to stick to 1.2. So anything we build must work 
on 1.2, and if I wish to contribute back to Solr it has to be 1.3 compatible. 
SOLR-469 is a good example where I had to hack my code hard to ensure that the 
version-specific dependencies were contained in one file.

This is a good starting point. Let us get the list of issues which can be 
easily fixed, apply the patches, and push out a release.
--Noble


On Tue, May 20, 2008 at 2:23 PM, Andrew Savory <[EMAIL PROTECTED]> wrote:
> Hi,
>
> (discussion moved from -user to -dev)
>
> 2008/5/19 Chris Hostetter <[EMAIL PROTECTED]>:
>
>> If people are particularly eager to see a 1.3 release, the best thing to
>> do is subscribe to solr-dev and start a dialog there about what issues
>> people think are "show stoppers" for 1.3 and what assistance the various
>> people working on those issues can use.
>
> So, what are the show stoppers, how can we help, what can we reassign
> to a future release?
>
> Taking a look through the list there's quite a few issues with patches
> attached that aren't applied yet. Clearing these out would cut the
> open bug count by almost half:
>
> SOLR-515
> SOLR-438
> SOLR-351 (applied?)
> SOLR-281 (applied?)
> SOLR-424
> SOLR-243 (stuck in review hell?)
> SOLR-433
> SOLR-510
> SOLR-139
> SOLR-521 (applied, waiting to be closed)
> SOLR-284
> SOLR-560
> SOLR-469
> SOLR-572
> SOLR-565
>
> It's a little weird to see patch 'development' going on in JIRA
> (sometimes for over a year), rather than getting the patches into svn
> and then working there... I'd worry that some valuable code history is
> getting lost along the way? Yes, it's a tough call between adding
> 'bad' code and waiting for the perfect patch, but bad code creates
> healthy communities and is better than no code :-)
>
>
> Andrew.
> --
> [EMAIL PROTECTED] / [EMAIL PROTECTED]
> http://www.andrewsavory.com/
>



-- 
--Noble Paul


how to add a new parameter to solr request

2008-05-20 Thread khirb7

Hello everybody,

I want to modify the behaviour of Solr a little and I would like to know if 
it is possible. Here is my problem:
I give Solr documents to index whose unique key field is based on the URL and 
the time at which the crawler downloaded it, so the unique key is a digest 
obtained as MyAlgo(Url + Time). The problem occurs at search time: Solr 
returns results containing duplicates, meaning for example that the first 10 
results correspond to the same web page with the same content, because it is 
in fact the same URL. I want to remove this duplication, so I want to add a 
parameter to the Solr request, for example permitdupp, which takes the values 
true or false. If permitdupp=true I keep the default Solr behaviour, but if 
permitdupp=false I want to remove all the duplicate documents and keep only 
the most recently indexed one (to find the most recent one, my documents 
contain a date field).
So I want to know the easiest way to do this:
maybe there are Solr parameters I can use (faceting???), or
programmatically: in that case, which classes do I have to modify or inherit 
from to develop this solution?
Any suggestion is welcome, and thank you in advance.
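A rough client-side sketch of the deduplication step described above, assuming 
SolrJ and the field names DocUrl and date from this thread; the permitdupp 
parameter itself would still need a custom handler or component on the server 
side, so this is only an illustration, not an existing Solr feature.

    import java.util.LinkedHashMap;
    import java.util.Map;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.SolrDocumentList;

    // Collapse duplicates by DocUrl, keeping the document with the latest date.
    // Field names and the date handling are assumptions based on the example
    // given later in this thread (dd/MM/yyyy strings).
    public class DedupByUrl {
        public static SolrDocumentList collapse(SolrDocumentList results) {
            Map<String, SolrDocument> newest = new LinkedHashMap<String, SolrDocument>();
            for (SolrDocument doc : results) {
                String url = String.valueOf(doc.getFieldValue("DocUrl"));
                String date = String.valueOf(doc.getFieldValue("date"));
                SolrDocument seen = newest.get(url);
                if (seen == null || isLater(date, String.valueOf(seen.getFieldValue("date")))) {
                    newest.put(url, doc);
                }
            }
            SolrDocumentList collapsed = new SolrDocumentList();
            collapsed.addAll(newest.values());
            return collapsed;
        }

        // Compares dd/MM/yyyy strings; a real implementation would parse real dates.
        private static boolean isLater(String a, String b) {
            String[] pa = a.split("/"), pb = b.split("/");
            return (pa[2] + pa[1] + pa[0]).compareTo(pb[2] + pb[1] + pb[0]) > 0;
        }
    }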







-- 
View this message in context: 
http://www.nabble.com/how-to-add-a-new-parameter-to-solr-request-tp17338190p17338190.html
Sent from the Solr - Dev mailing list archive at Nabble.com.



[jira] Created: (SOLR-579) Extend SimplePost with RecurseDirectories, threads, document encoding , number of docs per commit

2008-05-20 Thread Patrick Debois (JIRA)
Extend SimplePost with RecurseDirectories, threads, document encoding , number 
of docs per commit
-

 Key: SOLR-579
 URL: https://issues.apache.org/jira/browse/SOLR-579
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.3
 Environment: Applies to all platforms
Reporter: Patrick Debois
Priority: Minor
 Fix For: 1.3


-When specifying a directory, simplepost should also read the contents of the 
directory

New options for the command line (some only useful in DATAMODE=files):
-RECURSEDIRS
Recursive read of directories as an option; this is useful for 
directories with a lot of files where the command-line expansion fails and xargs 
is too slow
-DOCENCODING (default = system encoding or UTF-8) 
For non-UTF-8 clients, simplepost should include a way to set the 
encoding of the documents posted
-THREADSIZE (default = 1) 
For large volume posts, a thread pool makes sense, using the JDK 1.5 
thread pool model
-DOCSPERCOMMIT (default = 1)
Number of documents after which a commit is done, instead of only at 
the end

Note: the existing behaviour of the SimplePost tool should not be broken, since 
post.sh might be used in scripts 
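A minimal sketch of what the proposed RECURSEDIRS option could amount to, using 
plain java.io; SimplePostTool does not currently read directories, so this is 
an illustration of the proposal only, not its API.

    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;

    // Collects all regular files under a directory, the way a RECURSEDIRS
    // option might, before handing them to the poster.
    public class RecurseDirs {
        public static List<File> collect(File root) {
            List<File> files = new ArrayList<File>();
            File[] entries = root.listFiles();
            if (entries == null) {
                return files;                     // not a directory, or not readable
            }
            for (File entry : entries) {
                if (entry.isDirectory()) {
                    files.addAll(collect(entry)); // recurse into subdirectories
                } else {
                    files.add(entry);             // candidate document to post
                }
            }
            return files;
        }
    }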


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-580) Filte Query: Retrieve all docs with facets missing

2008-05-20 Thread Patrick Debois (JIRA)
Filte Query: Retrieve  all docs with facets missing
---

 Key: SOLR-580
 URL: https://issues.apache.org/jira/browse/SOLR-580
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Patrick Debois
Priority: Minor


Consider this list

facetA - 10
facetB - 20
facets missing  - 30

For facetA and facetB it is easy to select the correct fq=FACET:value. But to 
be able to see the documents that have missing facets, one needs to specify a 
NOT fq= for every value in the facet.
Therefore a kind of shorthand would be useful to select all documents that 
have a facet missing. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-581) Typo fillInteristingTermsFromMLTQuery

2008-05-20 Thread Patrick Debois (JIRA)
Typo fillInteristingTermsFromMLTQuery
-

 Key: SOLR-581
 URL: https://issues.apache.org/jira/browse/SOLR-581
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Patrick Debois
Priority: Trivial


There is a typo in  MoreLikeThisHandler.java

fillInteristingTermsFromMLTQuery

should read

fillInterestingTermsFromMLTQuery



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-582) Field Aliasing

2008-05-20 Thread Patrick Debois (JIRA)
Field Aliasing
--

 Key: SOLR-582
 URL: https://issues.apache.org/jira/browse/SOLR-582
 Project: Solr
  Issue Type: New Feature
Reporter: Patrick Debois
Priority: Minor


XML documents that are indexed often use meaningful, full-blown names for their 
fields.

For power searching, shorthand for these terms would come in handy. 
This would also help for hard-to-remember values, where one could specify 
multiple names for the same field.
It would also be interesting for multilingual queries.

I guess there should be a config file that is read by the query parser, 
substituting terms for their canonical values.
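A rough sketch of the kind of substitution the description imagines, assuming a 
simple in-memory alias map; the alias names, canonical field names, and the 
place where this would hook into query parsing are all assumptions, not an 
existing Solr feature.

    import java.util.HashMap;
    import java.util.Map;

    // Rewrites short alias field names to their canonical (full-blown) names
    // before the query string reaches the query parser.  Purely illustrative.
    public class FieldAliases {
        private final Map<String, String> aliases = new HashMap<String, String>();

        public FieldAliases() {
            // In practice these mappings would come from a config file.
            aliases.put("title", "document_title_text");
            aliases.put("desc", "long_description_field");
        }

        public String rewrite(String query) {
            String result = query;
            for (Map.Entry<String, String> e : aliases.entrySet()) {
                // Replace "alias:" prefixes with the canonical field name.
                result = result.replace(e.getKey() + ":", e.getValue() + ":");
            }
            return result;
        }
    }

    // Example: new FieldAliases().rewrite("title:solr AND desc:search")
    //          -> "document_title_text:solr AND long_description_field:search"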

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-564) Realtime search in Solr

2008-05-20 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598308#action_12598308
 ] 

Jason Rutherglen commented on SOLR-564:
---

After review, function queries ValueSource should be fine because the values 
are not loaded from the field cache until the query is executed in the 
sub-searcher.  Sort is not mentioned because it is handled in the sub-searcher. 
 LUCENE-831 is a good step, however, if used for the top level reader in a 
realtime system, large arrays will constantly be created for the top level 
reader after every transaction.  This is why the work, meaning the query and 
the results, should be performed in the sub-searcher and then merged.  

SimpleFacets.getFieldCacheCounts should be placed in the SolrIndexSearcher in 
order for it to be operable; that change will be made in SOLR-567.

The Ocean Solr code patch will be attached to this issue.  

> Realtime search in Solr
> ---
>
> Key: SOLR-564
> URL: https://issues.apache.org/jira/browse/SOLR-564
> Project: Solr
>  Issue Type: New Feature
>  Components: replication, search
>Affects Versions: 1.3
>Reporter: Jason Rutherglen
>
> Before when I looked at this, the changes required to make Solr realtime 
> would seem to break the rest of Solr.  Is this still the case?  In project 
> Ocean http://code.google.com/p/oceansearch/ there is a realtime core however 
> integrating into Solr has looked like a redesign of the guts of Solr.  
> - Support for replication per update to transaction log
> - Custom realtime index creation
> - Filter and facet merging
> - Custom IndexSearcher that ties into realtime subsystem
> - Custom SolrCore that ties into realtime subsystem
> Is there a way to plug into these low level Solr functions without a massive 
> redesign?  A key area of concern is the doclist caching which is not used in 
> realtime search because after every update the doclists are no longer valid.  
> The doclist caching and handling is default in SolrCore.  That Ocean relies on 
> a custom threaded MultiSearcher rather than a single IndexSearcher is a 
> difficulty.  That DirectUpdateHandler2 works directly on IndexWriter is 
> problematic.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-567) SolrCore Pluggable

2008-05-20 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen updated SOLR-567:
--

Attachment: solr-567.patch

solr-567.patch

Moved SimpleFacets.getFieldCacheCounts to SolrIndexSearcher to allow an 
alternate SolrCore to use a different implementation due to direct top level 
field cache access.

> SolrCore Pluggable
> --
>
> Key: SOLR-567
> URL: https://issues.apache.org/jira/browse/SOLR-567
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>Reporter: Jason Rutherglen
> Attachments: solr-567.patch, solr-567.patch
>
>
> SolrCore needs to be an abstract class with the existing functionality in a 
> subclass.  SolrIndexSearcher the same.  It seems that most of the Searcher 
> methods in SolrIndexSearcher are not used.  The new abstract class need only 
> have the methods used by the other Solr classes.  This will allow other 
> indexing and search implementations to reuse the other parts of Solr.  Any 
> other classes that have functionality specific to the Solr implementation of 
> indexing and replication such as SolrConfig can be made abstract.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-580) Filte Query: Retrieve all docs with facets missing

2008-05-20 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher resolved SOLR-580.
---

Resolution: Invalid

You can actually constrain an fq on all documents that do _not_ have a value in 
a particular field using &fq=-field:[* TO *]
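For example, assuming the field in question is named facetField and the 
standard /select handler on a local instance (both placeholders), a request 
such as

    http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=facetField&facet.missing=true&fq=-facetField:[* TO *]

(with the brackets and space URL-encoded in practice) returns exactly the 
documents counted under facet.missing.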

> Filte Query: Retrieve  all docs with facets missing
> ---
>
> Key: SOLR-580
> URL: https://issues.apache.org/jira/browse/SOLR-580
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Patrick Debois
>Priority: Minor
>
> Consider this list
> facetA - 10
> facetB - 20
> facets missing  - 30
> For facetA and facetB it is easy to select the correct fq=FACET:value . But 
> to be able to see the document that have missing facets one needs to specifiy 
> a NOT fq= for every value in the facet.
> Therefore a kind of short hand would be usefull to select all documents that 
> have a facet missing. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: how to add a new parameter to solr request

2008-05-20 Thread khirb7

hello everybody,
I just want to add this example to be clearer. I get a result like this from 
Solr:

1  |  http://www.sarkozy.fr  |  01/01/2008
2  |  http://www.sarkozy.fr  |  31/01/2008
3  |  http://www.sarkozy.fr  |  15/01/2008
 .
 .
 .

Note that it is the same field DocUrl (http://www.sarkozy.fr) for the three 
documents shown above. I want to get something like this in the result instead:

2  |  http://www.sarkozy.fr  |  31/01/2008
 .
 .
 .

i.e. keep only the most recent one.

How do I deal with that? Thank you in advance.




-- 
View this message in context: 
http://www.nabble.com/how-to-add-a-new-parameter-to-solr-request-tp17338190p17344135.html
Sent from the Solr - Dev mailing list archive at Nabble.com.



[jira] Updated: (SOLR-556) Highlighting of multi-valued fields returns snippets which span multiple different values

2008-05-20 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas updated SOLR-556:


Fix Version/s: 1.3

> Highlighting of multi-valued fields returns snippets which span multiple 
> different values
> -
>
> Key: SOLR-556
> URL: https://issues.apache.org/jira/browse/SOLR-556
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Affects Versions: 1.3
> Environment: Tomcat 5.5
>Reporter: Lars Kotthoff
>Assignee: Mike Klaas
>Priority: Minor
> Fix For: 1.3
>
> Attachments: solr-highlight-multivalued-example.xml, 
> solr-highlight-multivalued.patch
>
>
> When highlighting multi-valued fields, the highlighter sometimes returns 
> snippets which span multiple values, e.g. with values "foo" and "bar" and 
> search term "ba" the highlighter will create the snippet "foobar". 
> Furthermore it sometimes returns smaller snippets than it should, e.g. with 
> value "foobar" and search term "oo" it will create the snippet "oo" 
> regardless of hl.fragsize.
> I have been unable to determine the real cause for this, or indeed what 
> actually goes on at all. To reproduce the problem, I've used the following 
> steps:
> * create an index with multi-valued fields, one document should have at least 
> 3 values for these fields (in my case strings of length between 5 and 15 
> Japanese characters -- as far as I can tell plain old ASCII should produce 
> the same effect though)
> * search for part of a value in such a field with highlighting enabled, the 
> additional parameters I use are hl.fragsize=70, hl.requireFieldMatch=true, 
> hl.mergeContiguous=true (changing the parameters does not seem to have any 
> effect on the result though)
> * highlighted snippets should show effects described above

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-536) Automatic binding of results to Beans (for solrj)

2008-05-20 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas updated SOLR-536:


Fix Version/s: (was: 1.3)

> Automatic binding of results to Beans (for solrj)
> -
>
> Key: SOLR-536
> URL: https://issues.apache.org/jira/browse/SOLR-536
> Project: Solr
>  Issue Type: New Feature
>  Components: clients - java
>Affects Versions: 1.3
>Reporter: Noble Paul
>Priority: Minor
> Attachments: SOLR-536.patch
>
>
> As we are using Java 5, we can use annotations to bind SolrDocument to Java 
> beans directly.
> This can make the usage of solrj a bit simpler.
> The QueryResponse class in solrj can have an extra method as follows:
> public <T> List<T> getResultBeans(Class<T> klass)
> and the bean can have annotations like:
> class MyBean {
>   @Field("id") // name is optional
>   String id;
>   @Field("category")
>   List<String> categories;
> }
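A sketch of how the proposed method might be used, purely to illustrate the 
suggestion above; getResultBeans and the @Field annotation are the issue's 
proposal, not an existing SolrJ API, and server stands for any SolrJ 
SolrServer instance.

    // Hypothetical usage of the binding proposed in this issue.
    SolrQuery query = new SolrQuery("category:books");
    QueryResponse response = server.query(query);
    List<MyBean> beans = response.getResultBeans(MyBean.class);
    for (MyBean bean : beans) {
        System.out.println(bean.id + " -> " + bean.categories);
    }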

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-579) Extend SimplePost with RecurseDirectories, threads, document encoding , number of docs per commit

2008-05-20 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas updated SOLR-579:


Fix Version/s: (was: 1.3)

> Extend SimplePost with RecurseDirectories, threads, document encoding , 
> number of docs per commit
> -
>
> Key: SOLR-579
> URL: https://issues.apache.org/jira/browse/SOLR-579
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 1.3
> Environment: Applies to all platforms
>Reporter: Patrick Debois
>Priority: Minor
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> -When specifying a directory, simplepost should read also the contents of a  
> directory
> New options for the commandline (some only usefull in DATAMODE= files)
> -RECURSEDIRS
> Recursive read of directories as an option, this is usefull for 
> directories with a lot of files where the commandline expansion fails and 
> xargs is too slow
> -DOCENCODING (default = system encoding or UTF-8) 
> For non utf-8 clients , simplepost should include a way to set the 
> encoding of the documents posted
> -THREADSIZE (default =1 ) 
> For large volume posts, a threading pool makes sense , using JDK 1.5 
> Threadpool model
> -DOCSPERCOMMIT (default = 1)
> Number of documents after which a commit is done, instead of only at 
> the end
> Note: not to break the existing behaviour of the existing SimplePost tool 
> (post.sh) might be used in scripts 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-383) Add support for globalization/culture management

2008-05-20 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas resolved SOLR-383.
-

   Resolution: Fixed
Fix Version/s: (was: 1.3)

> Add support for globalization/culture management
> 
>
> Key: SOLR-383
> URL: https://issues.apache.org/jira/browse/SOLR-383
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - C#
>Affects Versions: 1.3
>Reporter: Jeff Rodenburg
>Assignee: Jeff Rodenburg
>Priority: Minor
>
> SolrSharp should supply configuration and/or programmatic control over 
> windows culture settings.  This is important for working with data being 
> saved to indexes that carry certain formatting expectations for various types 
> of fields, both in SolrSharp as well as the solr field counterparts on the 
> server side.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-563) Contrib area for Solr

2008-05-20 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas updated SOLR-563:


Fix Version/s: (was: 1.3)

> Contrib area for Solr
> -
>
> Key: SOLR-563
> URL: https://issues.apache.org/jira/browse/SOLR-563
> Project: Solr
>  Issue Type: Task
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
> Attachments: SOLR-563.patch
>
>
> Add a contrib area for Solr and modify existing build.xml to build, package 
> and distribute contrib projects also.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-565) Component to abstract shards from clients

2008-05-20 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas updated SOLR-565:


Fix Version/s: (was: 1.3)

> Component to abstract shards from clients
> -
>
> Key: SOLR-565
> URL: https://issues.apache.org/jira/browse/SOLR-565
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: patrick o'leary
>Priority: Minor
> Attachments: distributor_component.patch
>
>
> A component that will remove the need for calling clients to provide the 
> shards parameter for
> a distributed search. 
> As systems grow, it's better to manage shards with in solr, rather than 
> managing each client.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-551) SOlr replication should include the schema also

2008-05-20 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas updated SOLR-551:


Fix Version/s: (was: 1.3)

> SOlr replication should include the schema also
> ---
>
> Key: SOLR-551
> URL: https://issues.apache.org/jira/browse/SOLR-551
> Project: Solr
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 1.3
>Reporter: Noble Paul
>
> The current Solr replication just copy the data directory . So if the
> schema changes and I do a re-index it will blissfully copy the index
> and the slaves will fail because of incompatible schema.
> So the steps we follow are
>  * Stop rsync on slaves
>  * Update the master with new schema
>  * re-index data
>  * forEach slave
>  ** Kill the slave
>  ** clean the data directory
>  ** install the new schema
>  ** restart
>  ** do a manual snappull
> The amount of work the admin needs to do is quite significant
> (depending on the no:of slaves). These are manual steps and very error
> prone
> The solution :
> Make the replication mechanism handle the schema replication also. So
> all I need to do is to just change the master and the slaves synch
> automatically
> What is a good way to implement this?
> We have an idea along the following lines
> This should involve changes to the snapshooter and snappuller scripts
> and the snapinstaller components
> Everytime the snapshooter takes a snapshot it must keep the timestamps
> of schema.xml and elevate.xml (all the files which might affect the
> runtime behavior in slaves)
> For subsequent snapshots if the timestamps of any of them is changed
> it must copy the all of them also for replication.
> The snappuller copies the new directory as usual
> The snapinstaller checks if these config files are present ,
> if yes,
>  * It can create a temporary core
>  * install the changed index and configuration
>  * load it completely and swap it out with the original core

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-561) Solr replication by Solr (for windows also)

2008-05-20 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas updated SOLR-561:


Fix Version/s: (was: 1.3)

> Solr replication by Solr (for windows also)
> ---
>
> Key: SOLR-561
> URL: https://issues.apache.org/jira/browse/SOLR-561
> Project: Solr
>  Issue Type: New Feature
>  Components: replication
>Affects Versions: 1.3
> Environment: All
>Reporter: Noble Paul
>
> The current replication strategy in solr involves shell scripts . The 
> following are the drawbacks with the approach
> *  It does not work with windows
> * Replication works as a separate piece not integrated with solr.
> * Cannot control replication from solr admin/JMX
> * Each operation requires manual telnet to the host
> Doing the replication in java has the following advantages
> * Platform independence
> * Manual steps can be completely eliminated. Everything can be driven from 
> solrconfig.xml .
> ** Adding the url of the master in the slaves should be good enough to enable 
> replication. Other things like frequency of
> snapshoot/snappull can also be configured . All other information can be 
> automatically obtained.
> * Start/stop can be triggered from solr/admin or JMX
> * Can get the status/progress while replication is going on. It can also 
> abort an ongoing replication
> * No need to have a login into the machine 
> This issue can track the implementation of solr replication in java

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-506) Enabling HTTP Cache headers should be configurable on a per-handler basis

2008-05-20 Thread Mike Klaas (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Klaas updated SOLR-506:


Fix Version/s: (was: 1.3)

> Enabling HTTP Cache headers should be configurable on a per-handler basis
> -
>
> Key: SOLR-506
> URL: https://issues.apache.org/jira/browse/SOLR-506
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
>
> HTTP cache headers are needed only for select handler's response and it does 
> not make much sense to enable it globally for all Solr responses.
> Therefore, enabling/disabling cache headers should be configurable on a 
> per-handler basis. It should be enabled by default on the select request 
> handler and disabled by default on all others. It should be possible to 
> override these defaults through configuration as well as through API.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-582) Field Aliasing

2008-05-20 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598403#action_12598403
 ] 

Hoss Man commented on SOLR-582:
---

this sounds like a subset of the brainstorming in this wiki page...

http://wiki.apache.org/solr/FieldAliasesAndGlobsInParams

> Field Aliasing
> --
>
> Key: SOLR-582
> URL: https://issues.apache.org/jira/browse/SOLR-582
> Project: Solr
>  Issue Type: New Feature
>Reporter: Patrick Debois
>Priority: Minor
>
> XML that are indexed are often using meaningfull, fullblown names for their 
> fields.
> For powersearching shorthand for these terms would come in handy. 
> This would also help for hard to remember values where one could specify 
> multiple names for the same field.
> Also for multi lingual queries this would be interesting.
> Is guess there should be a config file that is read by the queryparser, 
> substibuting terms to their canonical values.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-579) Extend SimplePost with RecurseDirectories, threads, document encoding , number of docs per commit

2008-05-20 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598407#action_12598407
 ] 

Hoss Man commented on SOLR-579:
---

FWIW: SimplePostTool isn't intended to really have ... "features". It exists 
purely to provide a cross-platform way for people to index the data necessary 
for the tutorial.

i'm -1 on enhancing it in ways that could encourage people to think of it as a 
general purpose reusable tool.

> Extend SimplePost with RecurseDirectories, threads, document encoding , 
> number of docs per commit
> -
>
> Key: SOLR-579
> URL: https://issues.apache.org/jira/browse/SOLR-579
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 1.3
> Environment: Applies to all platforms
>Reporter: Patrick Debois
>Priority: Minor
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> -When specifying a directory, simplepost should read also the contents of a  
> directory
> New options for the commandline (some only usefull in DATAMODE= files)
> -RECURSEDIRS
> Recursive read of directories as an option, this is usefull for 
> directories with a lot of files where the commandline expansion fails and 
> xargs is too slow
> -DOCENCODING (default = system encoding or UTF-8) 
> For non utf-8 clients , simplepost should include a way to set the 
> encoding of the documents posted
> -THREADSIZE (default =1 ) 
> For large volume posts, a threading pool makes sense , using JDK 1.5 
> Threadpool model
> -DOCSPERCOMMIT (default = 1)
> Number of documents after which a commit is done, instead of only at 
> the end
> Note: not to break the existing behaviour of the existing SimplePost tool 
> (post.sh) might be used in scripts 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-580) Filte Query: Retrieve all docs with facets missing

2008-05-20 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598411#action_12598411
 ] 

Hoss Man commented on SOLR-580:
---

i'm confused ... assuming these facet counts are for field "facetField" then 
can't all the docs counted by facet.missing be retrieved using: 
{{fq=-facetField:[* TO *]}} ?

> Filte Query: Retrieve  all docs with facets missing
> ---
>
> Key: SOLR-580
> URL: https://issues.apache.org/jira/browse/SOLR-580
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Patrick Debois
>Priority: Minor
>
> Consider this list
> facetA - 10
> facetB - 20
> facets missing  - 30
> For facetA and facetB it is easy to select the correct fq=FACET:value . But 
> to be able to see the document that have missing facets one needs to specifiy 
> a NOT fq= for every value in the facet.
> Therefore a kind of short hand would be usefull to select all documents that 
> have a facet missing. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-580) Filte Query: Retrieve all docs with facets missing

2008-05-20 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598413#action_12598413
 ] 

Hoss Man commented on SOLR-580:
---

whoops ... really, REALLY, stale comment collision.

> Filte Query: Retrieve  all docs with facets missing
> ---
>
> Key: SOLR-580
> URL: https://issues.apache.org/jira/browse/SOLR-580
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: Patrick Debois
>Priority: Minor
>
> Consider this list
> facetA - 10
> facetB - 20
> facets missing  - 30
> For facetA and facetB it is easy to select the correct fq=FACET:value . But 
> to be able to see the document that have missing facets one needs to specifiy 
> a NOT fq= for every value in the facet.
> Therefore a kind of short hand would be usefull to select all documents that 
> have a facet missing. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Release of SOLR 1.3

2008-05-20 Thread Mike Klaas


On 20-May-08, at 1:53 AM, Andrew Savory wrote:

2008/5/19 Chris Hostetter <[EMAIL PROTECTED]>:

If people are particularly eager to see a 1.3 release, the best thing to
do is subscribe to solr-dev and start a dialog there about what issues
people think are "show stoppers" for 1.3 and what assistance the various
people working on those issues can use.


So, what are the show stoppers, how can we help, what can we reassign
to a future release?


I've gone and reassigned a bunch of issues that were labeled "1.3" by  
the original submitter, if the submitter is not a committer (perhaps  
this field shouldn't be editable by everyone).  That still leaves many  
issues, several of which I don't think are critical for 1.3.


I propose that we follow an "ownership" process for getting this  
release out the door: we give committers a week to fill in the  
"assigned to" field in JIRA for the 1.3 issues.  Any issue that isn't  
assigned after one week gets moved to a future release.  Then we can  
each evaluate the issues we are responsible for.


Any non-1.3-marked issues should be added at this time too.


Taking a look through the list there's quite a few issues with patches
attached that aren't applied yet. Clearing these out would cut the
open bug count by almost half:


But then we'd have to open bug reports for each one that says "make  
sure this actually works and that it is the correct direction for  
Solr" :)



It's a little weird to see patch 'development' going on in JIRA
(sometimes for over a year), rather than getting the patches into svn
and then working there... I'd worry that some valuable code history is
getting lost along the way? Yes, it's a tough call between adding
'bad' code and waiting for the perfect patch, but bad code creates
healthy communities and is better than no code :-)


Committing the code to trunk creates a path dependence and  
responsibility for maintaining the code.  There would also be a high  
probability of trunk never being in a releasable state, given the  
chance of there being a half-baked idea in trunk that we don't want to  
be bound to for the rest of Solr's lifetime.


(incidentally, this is the same philosophy we apply at my company,  
except that development is usually done in branches rather than  
patches.)


-Mike


Re: Release of SOLR 1.3

2008-05-20 Thread Shalin Shekhar Mangar
+1 for your suggestions Mike.

I'd like to see a few of the smaller issues get committed in 1.3 such as
SOLR-256 (JMX), SOLR-536 (binding for SolrJ), SOLR-430 (SpellChecker support
in SolrJ) etc. Also, SOLR-561 (replication by Solr) would be really cool to
have in the next release. Noble and I are working on it and plan to give a
patch soon.

Mike -- you removed SOLR-563 (Contrib area for Solr) from 1.3 but it is a
dependency for SOLR-469 (DataImportHandler), as it was decided to have
DataImportHandler as a contrib project. It would also be good to have a
rough release roadmap to work against. Could a fixed release cycle (say every 6
months) work for Solr?

On Wed, May 21, 2008 at 12:45 AM, Mike Klaas <[EMAIL PROTECTED]> wrote:

>
> On 20-May-08, at 1:53 AM, Andrew Savory wrote:
>
>> 2008/5/19 Chris Hostetter <[EMAIL PROTECTED]>:
>>
>>  If people are particularly eager to see a 1.3 release, the best thing to
>>> do is subscribe to solr-dev and start a dialog there about what issues
>>> people think are "show stoppers" for 1.3 and what assistance the various
>>> people working on those issues can use.
>>>
>>
>> So, what are the show stoppers, how can we help, what can we reassign
>> to a future release?
>>
>
> I've gone and reassigned a bunch of issues that were labeled "1.3" by the
> original submitter, if the submitter is not a committer (perhaps this field
> shouldn't be editable by everyone).  That still leaves many issues, several
> of which I don't think are critical for 1.3.
>
> I propose that we follow an "ownership" process for getting this release
> out the door: we give committers a week to fill in the "assigned to" field
> in JIRA for the 1.3 issues.  Any issue that isn't assigned after one week
> gets moved to a future release.  Then we can each evaluate the issues we are
> responsible for.
>
> Any non-1.3-marked issues should be added at this time too.
>
>  Taking a look through the list there's quite a few issues with patches
>> attached that aren't applied yet. Clearing these out would cut the
>> open bug count by almost half:
>>
>
> But then we'd have to open bug reports for each one that says "make sure
> this actually works and that it is the correct direction for Solr" :)
>
>  It's a little weird to see patch 'development' going on in JIRA
>> (sometimes for over a year), rather than getting the patches into svn
>> and then working there... I'd worry that some valuable code history is
>> getting lost along the way? Yes, it's a tough call between adding
>> 'bad' code and waiting for the perfect patch, but bad code creates
>> healthy communities and is better than no code :-)
>>
>
> Committing the code to trunk creates a path dependence and responsibility
> for maintaining the code.  There would also be a high probability of trunk
> never being in a releasable state, given the chance of there being a
> half-baked idea in trunk that we don't want to be bound to for the rest of
> Solr's lifetime.
>
> (incidentally, this is the same philosophy we apply at my company, except
> that development is usually done in branches rather than patches.)
>
> -Mike
>



-- 
Regards,
Shalin Shekhar Mangar.


Re: Release of SOLR 1.3

2008-05-20 Thread Andrew Savory
Hi Mike,

On 20/05/2008, Mike Klaas <[EMAIL PROTECTED]> wrote:

>  I've gone and reassigned a bunch of issues that were labeled "1.3" by the
> original submitter, if the submitter is not a committer (perhaps this field
> shouldn't be editable by everyone).  That still leaves many issues, several
> of which I don't think are critical for 1.3.

Cool, thanks for that. Indeed, assigning issues to releases should
only be possible by committers.

> > Taking a look through the list there's quite a few issues with patches
> > attached that aren't applied yet. Clearing these out would cut the
> > open bug count by almost half:
> >
>
>  But then we'd have to open bug reports for each one that says "make sure
> this actually works and that it is the correct direction for Solr" :)

Heh. Thankfully many of the patches look well-tested and extremely
well discussed already, so I'd hope they wouldn't require too many
followup issues!

> > It's a little weird to see patch 'development' going on in JIRA
> > (sometimes for over a year), rather than getting the patches into svn
> > and then working there... I'd worry that some valuable code history is
> > getting lost along the way? Yes, it's a tough call between adding
> > 'bad' code and waiting for the perfect patch, but bad code creates
> > healthy communities and is better than no code :-)
>
>  Committing the code to trunk creates a path dependence and responsibility
> for maintaining the code.  There would also be a high probability of trunk
> never being in a releasable state, given the chance of there being a
> half-baked idea in trunk that we don't want to be bound to for the rest of
> Solr's lifetime.

I'd tend to disagree: committing the patches to trunk allows
widespread testing and the chance for wider review of the code to see
if it does what it should. Only when the code is part of a release is
there any obligation to a proper lifecycle (ongoing support,
deprecation, then finally removal).

Of course, being concerned for the state of trunk is a good thing
overall, but it seems from my casual observation that some
contributions that are far from half-baked are not making it into
trunk: this is even worse as it might lead to an unnaturally short
lifetime for Solr.

>  (incidentally, this is the same philosophy we apply at my company, except
> that development is usually done in branches rather than patches.)

Sure, I'm currently working in a branch-per-feature environment, and
it has some advantages for a corporate environment with no community
concerns. But here we're talking about consensus-driven open
development, for which a more open approach may be appropriate. True,
it may seem chaotic and perhaps a bit risky - but with enough eyes on
the code we can mitigate that risk.

And hey, if some contributions are really controversial, there's
always the option to do more branches (or even set up a scratchpad).

Just my €0.02!


Andrew.
--
[EMAIL PROTECTED] / [EMAIL PROTECTED]
http://www.andrewsavory.com/


[jira] Commented: (SOLR-565) Component to abstract shards from clients

2008-05-20 Thread Jayson Minard (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598473#action_12598473
 ] 

Jayson Minard commented on SOLR-565:


Another item to consider:

Sometimes you want to control which shards participate in any given query.  
This is an important optimization for large scale deployments that need to 
quickly subset what is queried so that they do not waste CPU on irrelevant 
shards.  

> Component to abstract shards from clients
> -
>
> Key: SOLR-565
> URL: https://issues.apache.org/jira/browse/SOLR-565
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: patrick o'leary
>Priority: Minor
> Attachments: distributor_component.patch
>
>
> A component that will remove the need for calling clients to provide the 
> shards parameter for
> a distributed search. 
> As systems grow, it's better to manage shards with in solr, rather than 
> managing each client.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-551) SOlr replication should include the schema also

2008-05-20 Thread Jayson Minard (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598476#action_12598476
 ] 

Jayson Minard commented on SOLR-551:


Why is the schema stored outside of the index?  Is another possible option to 
store it in a magic record within the index?  That would allow anyone who wants 
to see the schema to retrieve it; for example the UI might want to know the 
static fields quickly and could use the schema to determine that information.  

Basically, can some meta-data about the index be stored in the index itself, 
which solves the replication problem and makes it more easily accessible to the 
outside world?

> SOlr replication should include the schema also
> ---
>
> Key: SOLR-551
> URL: https://issues.apache.org/jira/browse/SOLR-551
> Project: Solr
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 1.3
>Reporter: Noble Paul
>
> The current Solr replication just copy the data directory . So if the
> schema changes and I do a re-index it will blissfully copy the index
> and the slaves will fail because of incompatible schema.
> So the steps we follow are
>  * Stop rsync on slaves
>  * Update the master with new schema
>  * re-index data
>  * forEach slave
>  ** Kill the slave
>  ** clean the data directory
>  ** install the new schema
>  ** restart
>  ** do a manual snappull
> The amount of work the admin needs to do is quite significant
> (depending on the no:of slaves). These are manual steps and very error
> prone
> The solution :
> Make the replication mechanism handle the schema replication also. So
> all I need to do is to just change the master and the slaves synch
> automatically
> What is a good way to implement this?
> We have an idea along the following lines
> This should involve changes to the snapshooter and snappuller scripts
> and the snapinstaller components
> Everytime the snapshooter takes a snapshot it must keep the timestamps
> of schema.xml and elevate.xml (all the files which might affect the
> runtime behavior in slaves)
> For subsequent snapshots if the timestamps of any of them is changed
> it must copy the all of them also for replication.
> The snappuller copies the new directory as usual
> The snapinstaller checks if these config files are present ,
> if yes,
>  * It can create a temporary core
>  * install the changed index and configuration
>  * load it completely and swap it out with the original core

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-579) Extend SimplePost with RecurseDirectories, threads, document encoding , number of docs per commit

2008-05-20 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598493#action_12598493
 ] 

Otis Gospodnetic commented on SOLR-579:
---

Same here, -1.  That would create the same situation that we sometimes see over 
in Lucene land where people use the Lucene demo and think *that* is Lucene, or 
they take the demo and want it to run as an out-of-the-box application for them.


> Extend SimplePost with RecurseDirectories, threads, document encoding , 
> number of docs per commit
> -
>
> Key: SOLR-579
> URL: https://issues.apache.org/jira/browse/SOLR-579
> Project: Solr
>  Issue Type: New Feature
>Affects Versions: 1.3
> Environment: Applies to all platforms
>Reporter: Patrick Debois
>Priority: Minor
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> -When specifying a directory, simplepost should read also the contents of a  
> directory
> New options for the commandline (some only usefull in DATAMODE= files)
> -RECURSEDIRS
> Recursive read of directories as an option, this is usefull for 
> directories with a lot of files where the commandline expansion fails and 
> xargs is too slow
> -DOCENCODING (default = system encoding or UTF-8) 
> For non utf-8 clients , simplepost should include a way to set the 
> encoding of the documents posted
> -THREADSIZE (default =1 ) 
> For large volume posts, a threading pool makes sense , using JDK 1.5 
> Threadpool model
> -DOCSPERCOMMIT (default = 1)
> Number of documents after which a commit is done, instead of only at 
> the end
> Note: not to break the existing behaviour of the existing SimplePost tool 
> (post.sh) might be used in scripts 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-565) Component to abstract shards from clients

2008-05-20 Thread patrick o'leary (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598495#action_12598495
 ] 

patrick o'leary commented on SOLR-565:
--

That's a different aspect, where you either have a map reduce / ontology / hash 
based system to focus your queries on certain farms of servers. 

This component could act as an example of how to accomplish that, but there are 
so many possible implementations that it's not practical to cover them all 
within its scope.


> Component to abstract shards from clients
> -
>
> Key: SOLR-565
> URL: https://issues.apache.org/jira/browse/SOLR-565
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: patrick o'leary
>Priority: Minor
> Attachments: distributor_component.patch
>
>
> A component that will remove the need for calling clients to provide the 
> shards parameter for
> a distributed search. 
> As systems grow, it's better to manage shards with in solr, rather than 
> managing each client.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-553) Highlighter does not match phrase queries correctly

2008-05-20 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598494#action_12598494
 ] 

Mark Miller commented on SOLR-553:
--

> Probably best to create a new ticket (if necessary) about the ax bx
> instead of ax bx problem. That highlights have incorrect matches is far
> worse. I'll adjust the problem description.

If I remember correctly, this was an ease of implementation issue. Part of it 
was fitting into the current Highlighter framework (individual tokens are 
scored and highlighted) and part of it was ease in general, I think. I am not 
sure that it would be too easy to alter.

It's very easy to do with the new Highlighter I have been working on, the 
LargeDocHighlighter. It breaks from the current API, and makes this type of 
highlight markup quite easy. It may never see the light of day though... to do 
what I want, all parts of the query need to be located with the MemoryIndex, 
and the time this takes on non-position-sensitive query clauses is almost 
equal to the savings I get from not iterating through and scoring each token in 
a TokenStream. I do still have hopes I can pull something off though, and it 
may end up being useful for something else.

For now though, highlighting each token seems a small inconvenience to 
retain all the old Highlighter's tests, corner cases, and speed in 
non-position-sensitive scoring. That's not to say there will not be a way if 
you take a look at the code though.

> Highlighter does not match phrase queries correctly
> ---
>
> Key: SOLR-553
> URL: https://issues.apache.org/jira/browse/SOLR-553
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Affects Versions: 1.2
> Environment: all
>Reporter: Brian Whitman
>Assignee: Otis Gospodnetic
> Attachments: highlighttest.xml, Solr-553.patch, Solr-553.patch, 
> Solr-553.patch
>
>
> http://www.nabble.com/highlighting-pt2%3A-returning-tokens-out-of-order-from-PhraseQuery-to16156718.html
> Say we search for the band "I Love You But I've Chosen Darkness"
> .../selectrows=100&q=%22I%20Love%20You%20But%20I\'ve%20Chosen%20Darkness%22&fq=type:html&hl=true&hl.fl=content&hl.fragsize=500&hl.snippets=5&hl.simple.pre=%3Cspan%3E&hl.simple.post=%3C/span%3E
> The highlight returns a snippet that does have the name altogether:
> Lights (Live) : I Love You But 
> I've Chosen Darkness :
> But also returns unrelated snips from the same page:
> Black Francis Shop "I Think I Love 
> You"
> A correct highlighter should not return snippets that do not match the phrase 
> exactly.
> LUCENE-794 (not yet committed, but seems to be ready) fixes up the problem 
> from the Lucene end. Solr should get it too.
> Related: SOLR-575 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Release of SOLR 1.3

2008-05-20 Thread Otis Gospodnetic
Hi,

Half-baked things getting into trunk probably won't happen.  Lots of people use 
Solr nightlies (cause they are often stable enough).  If we were a bunch paid 
to work on Solr, then we'd be more organized/structured and have more regular 
release cycles.  Solr is also not likely to have a very short lifetime -- too 
many people use it, develop for it, and depend on it.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: Andrew Savory <[EMAIL PROTECTED]>
> To: solr-dev@lucene.apache.org
> Sent: Tuesday, May 20, 2008 3:51:50 PM
> Subject: Re: Release of SOLR 1.3
> 
> Hi Mike,
> 
> On 20/05/2008, Mike Klaas wrote:
> 
> >  I've gone and reassigned a bunch of issues that were labeled "1.3" by the
> > original submitter, if the submitter is not a committer (perhaps this field
> > shouldn't be editable by everyone).  That still leaves many issues, several
> > of which I don't think are critical for 1.3.
> 
> Cool, thanks for that. Indeed, assigning issues to releases should
> only be possible by committers.
> 
> > > Taking a look through the list there's quite a few issues with patches
> > > attached that aren't applied yet. Clearing these out would cut the
> > > open bug count by almost half:
> > >
> >
> >  But then we'd have to open bug reports for each one that says "make sure
> > this actually works and that it is the correct direction for Solr" :)
> 
> Heh. Thankfully many of the patches look well-tested and extremely
> well discussed already, so I'd hope they wouldn't require too many
> followup issues!
> 
> > > It's a little weird to see patch 'development' going on in JIRA
> > > (sometimes for over a year), rather than getting the patches into svn
> > > and then working there... I'd worry that some valuable code history is
> > > getting lost along the way? Yes, it's a tough call between adding
> > > 'bad' code and waiting for the perfect patch, but bad code creates
> > > healthy communities and is better than no code :-)
> >
> >  Committing the code to trunk creates a path dependence and responsibility
> > for maintaining the code.  There would also be a high probability of trunk
> > never being in a releasable state, given the chance of there being a
> > half-baked idea in trunk that we don't want to be bound to for the rest of
> > Solr's lifetime.
> 
> I'd tend to disagree: committing the patches to trunk allows
> widespread testing and the chance for wider review of the code to see
> if it does what it should. Only when the code is part of a release is
> there any obligation to a proper lifecycle (ongoing support,
> deprecation, then finally removal).
> 
> Of course, being concerned for the state of trunk is a good thing
> overall, but it seems from my casual observation that some
> contributions that are far from half-baked are not making it into
> trunk: this is even worse as it might lead to an unnaturally short
> lifetime for Solr.
> 
> >  (incidentally, this is the same philosophy we apply at my company, except
> > that development is usually done in branches rather than patches.)
> 
> Sure, I'm currently working in a branch-per-feature environment, and
> it has some advantages for a corporate environment with no community
> concerns. But here we're talking about consensus-driven open
> development, for which a more open approach may be appropriate. True,
> it may seem chaotic and perhaps a bit risky - but with enough eyes on
> the code we can mitigate that risk.
> 
> And hey, if some contributions are really controversial, there's
> always the option to do more branches (or even set up a scratchpad).
> 
> Just my €0.02!
> 
> 
> Andrew.
> --
> [EMAIL PROTECTED] / [EMAIL PROTECTED]
> http://www.andrewsavory.com/



Re: Release of SOLR 1.3

2008-05-20 Thread Otis Gospodnetic
I'll take the contrib/ issue if nobody else does.  I would want to see that one 
in 1.3, so we can get DataImportHandler in.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


- Original Message 
> From: Shalin Shekhar Mangar <[EMAIL PROTECTED]>
> To: solr-dev@lucene.apache.org
> Sent: Tuesday, May 20, 2008 3:32:21 PM
> Subject: Re: Release of SOLR 1.3
> 
> +1 for your suggestions Mike.
> 
> I'd like to see a few of the smaller issues get committed in 1.3 such as
> SOLR-256 (JMX), SOLR-536 (binding for SolrJ), SOLR-430 (SpellChecker support
> in SolrJ) etc. Also, SOLR-561 (replication by Solr) would be really cool to
> have in the next release. Noble and I are working on it and plan to give a
> patch soon.
> 
> Mike -- you removed SOLR-563 (Contrib area for Solr) from 1.3 but it is a
> dependency for SOLR-469 (DataImportHandler) as it was decided to have
> DataImportHandler as a contrib project. It would also be good to have a
> rough release roadmap to work against. Could a fixed release cycle (say every 6
> months) work for Solr?
> 
> On Wed, May 21, 2008 at 12:45 AM, Mike Klaas wrote:
> 
> >
> > On 20-May-08, at 1:53 AM, Andrew Savory wrote:
> >
> >> 2008/5/19 Chris Hostetter :
> >>
> >>  If people are particularly eager to see a 1.3 release, the best thing to
> >>> do is subscribe to solr-dev and start a dialog there about what issues
> >>> people thing are "show stopers" for 1.3 and what assistance the various
> >>> people working on those issues can use.
> >>>
> >>
> >> So, what are the show stoppers, how can we help, what can we reassign
> >> to a future release?
> >>
> >
> > I've gone and reassigned a bunch of issues that were labeled "1.3" by the
> > original submitter, if the submitter is not a committer (perhaps this field
> > shouldn't be editable by everyone).  That still leaves many issues, several
> > of which I don't think are critical for 1.3.
> >
> > I propose that we follow an "ownership" process for getting this release
> > out the door: we give committers a week to fill in the "assigned to" field
> > in JIRA for the 1.3 issues.  Any issue that isn't assigned after one week
> > gets moved to a future release.  Then we can each evaluate the issues we are
> > responsible for.
> >
> > Any non-1.3-marked issues should be added at this time too.
> >
> >  Taking a look through the list there's quite a few issues with patches
> >> attached that aren't applied yet. Clearing these out would cut the
> >> open bug count by almost half:
> >>
> >
> > But then we'd have to open bug reports for each one that says "make sure
> > this actually works and that it is the correct direction for Solr" :)
> >
> >  It's a little weird to see patch 'development' going on in JIRA
> >> (sometimes for over a year), rather than getting the patches into svn
> >> and then working there... I'd worry that some valuable code history is
> >> getting lost along the way? Yes, it's a tough call between adding
> >> 'bad' code and waiting for the perfect patch, but bad code creates
> >> healthy communities and is better than no code :-)
> >>
> >
> > Committing the code to trunk creates a path dependence and responsibility
> > for maintaining the code.  There would also be a high probability of trunk
> > never being in a releasable state, given the chance of there being a
> > half-baked idea in trunk that we don't want to be bound to for the rest of
> > Solr's lifetime.
> >
> > (incidentally, this is the same philosophy we apply at my company, except
> > that development is usually done in branches rather than patches.)
> >
> > -Mike
> >
> 
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.



[jira] Commented: (SOLR-551) SOlr replication should include the schema also

2008-05-20 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598498#action_12598498
 ] 

Otis Gospodnetic commented on SOLR-551:
---

The Hadoop -user group has a recent thread about synchronizing config 
distribution and it looks like people really like the idea of retrieving the 
configs from a well known URL.  Perhaps that's the thing to do here, too (a la 
admin pages).
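
Just to make the idea concrete, a minimal sketch of a slave pulling its schema from 
some agreed-upon URL on the master (the URL and target path below are purely 
illustrative, not an existing Solr endpoint):

    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.URL;

    public class ConfigFetcher {
        public static void main(String[] args) throws Exception {
            // hypothetical "well known" location on the master -- whatever URL
            // the master chooses to expose its configs at
            URL master = new URL("http://master:8983/solr/conf/schema.xml");
            InputStream in = master.openStream();
            OutputStream out = new FileOutputStream("solr/conf/schema.xml");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            out.close();
            in.close();
        }
    }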


> SOlr replication should include the schema also
> ---
>
> Key: SOLR-551
> URL: https://issues.apache.org/jira/browse/SOLR-551
> Project: Solr
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 1.3
>Reporter: Noble Paul
>
> The current Solr replication just copies the data directory. So if the
> schema changes and I re-index, it will blissfully copy the index
> and the slaves will fail because of an incompatible schema.
> So the steps we follow are
>  * Stop rsync on slaves
>  * Update the master with new schema
>  * re-index data
>  * forEach slave
>  ** Kill the slave
>  ** clean the data directory
>  ** install the new schema
>  ** restart
>  ** do a manual snappull
> The amount of work the admin needs to do is quite significant
> (depending on the number of slaves). These are manual steps and they are
> very error prone.
> The solution:
> Make the replication mechanism handle schema replication as well, so
> all I need to do is change the master and the slaves sync
> automatically.
> What is a good way to implement this?
> We have an idea along the following lines.
> This should involve changes to the snapshooter and snappuller scripts
> and to the snapinstaller component.
> Every time the snapshooter takes a snapshot it must keep the timestamps
> of schema.xml and elevate.xml (all the files which might affect the
> runtime behavior on the slaves).
> For subsequent snapshots, if the timestamp of any of them has changed,
> it must copy all of them for replication as well.
> The snappuller copies the new directory as usual.
> The snapinstaller checks whether these config files are present;
> if yes,
>  * it can create a temporary core
>  * install the changed index and configuration
>  * load it completely and swap it out with the original core
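
A rough sketch of the timestamp check described above, in Java rather than the actual 
shell scripts (file names follow the description; the snapshot directory layout is an 
assumption):

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    public class ConfigSnapshot {
        // files that might affect runtime behavior on the slaves, per the description
        private static final String[] CONFIG_FILES = { "schema.xml", "elevate.xml" };

        // copy the config files into the new snapshot if any of them changed
        // since the previous snapshot was taken
        static void copyConfigsIfChanged(File confDir, File prevSnapshot, File newSnapshot)
                throws IOException {
            boolean changed = false;
            for (String name : CONFIG_FILES) {
                File current = new File(confDir, name);
                File previous = new File(prevSnapshot, name);
                if (!previous.exists() || current.lastModified() > previous.lastModified()) {
                    changed = true;
                    break;
                }
            }
            if (!changed) {
                return;
            }
            for (String name : CONFIG_FILES) {
                copy(new File(confDir, name), new File(newSnapshot, name));
            }
        }

        private static void copy(File src, File dst) throws IOException {
            InputStream in = new FileInputStream(src);
            OutputStream out = new FileOutputStream(dst);
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            out.close();
            in.close();
        }
    }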

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-565) Component to abstract shards from clients

2008-05-20 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598499#action_12598499
 ] 

Otis Gospodnetic commented on SOLR-565:
---

I agree.  Let's get this in and then worry about getting fancy.  This should go 
in 1.3 and I'll take it if nobody else does.

> Component to abstract shards from clients
> -
>
> Key: SOLR-565
> URL: https://issues.apache.org/jira/browse/SOLR-565
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: patrick o'leary
>Priority: Minor
> Attachments: distributor_component.patch
>
>
> A component that will remove the need for calling clients to provide the 
> shards parameter for
> a distributed search. 
> As systems grow, it's better to manage shards with in solr, rather than 
> managing each client.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-565) Component to abstract shards from clients

2008-05-20 Thread Jayson Minard (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598506#action_12598506
 ] 

Jayson Minard commented on SOLR-565:


Selecting shards by sets is not overly fancy.  You basically allow shards to be 
specified by location, then you allow shard sets to be specified that include 
those shards.  You reference the set (by default there is an "All" set) during 
the query and you are off to the races.

Shard selection by sets covers a lot of ground in terms of bringing in more use 
cases without adding that much more complexity.  Really, not much complexity, 
just a bit more code.
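
For illustration only, a tiny sketch of the kind of set lookup this implies (class, 
method, and set names are made up, not anything in the attached patch):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class ShardSets {
        // individual shards, keyed by a logical name, holding their location (host:port/path)
        private final Map<String, String> shards = new HashMap<String, String>();
        // named sets of shard names; "all" is implicit and covers every registered shard
        private final Map<String, List<String>> sets = new HashMap<String, List<String>>();

        public void addShard(String name, String location) {
            shards.put(name, location);
        }

        public void defineSet(String setName, String... shardNames) {
            sets.put(setName, Arrays.asList(shardNames));
        }

        // resolve a set name into the comma-separated value a "shards" parameter would carry
        public String resolve(String setName) {
            List<String> members = "all".equals(setName)
                    ? new ArrayList<String>(shards.keySet())
                    : sets.get(setName);
            if (members == null) {
                return "";   // unknown set name
            }
            StringBuilder sb = new StringBuilder();
            for (String name : members) {
                if (sb.length() > 0) sb.append(',');
                sb.append(shards.get(name));
            }
            return sb.toString();
        }
    }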



> Component to abstract shards from clients
> -
>
> Key: SOLR-565
> URL: https://issues.apache.org/jira/browse/SOLR-565
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 1.3
>Reporter: patrick o'leary
>Priority: Minor
> Attachments: distributor_component.patch
>
>
> A component that will remove the need for calling clients to provide the 
> shards parameter for
> a distributed search. 
> As systems grow, it's better to manage shards with in solr, rather than 
> managing each client.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-20 Thread Oleg Gnatovskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598512#action_12598512
 ] 

Oleg Gnatovskiy commented on SOLR-572:
--

Hey guys, I created a dictionary index from an XML file containing three documents 
with id/word pairs 10/pizza, 11/club, and 12/bar.

My config sets up a spellcheck dictionary named "default", built from the index on 
the "word" field, and "word" is defined as a field in schema.xml.

When I run a query with the following URL:
http://localhost:8983/solr/select/?q=barr&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10
I get back a spellcheck response containing a single suggestion, "bar",
which is what I expect.
However with this URL:
http://wil1devsch1.cs.tmcs:8983/solr/select/?q=bar&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10
where bar is correctly spelled, I get a single suggestion, "barr".

Could you please tell me where the word "barr" is coming from, and why it is 
being suggested? 

Thanks!
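
For reference, a bare-bones way to fire the same query from Java and dump the raw 
response (plain HTTP, nothing Solr-specific):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class SpellcheckQuery {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8983/solr/select/?q=barr"
                    + "&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
            in.close();
        }
    }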

> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Accessing IndexReader during core initialization hangs init

2008-05-20 Thread Chris Hostetter

: While working on SOLR-572, I found that if I try to access the
: IndexReader using SolrCore.getSearcher().get().getReader() within the
: SolrCoreAware.inform method, the initialization process hangs.

I haven't really thought about it before, but it seems logical that 
SolrCore.getSearcher() should work during the "inform" stage ... there may 
be some chicken and egg problems there though -- I'm guessing if it 
doesn't work right now it might be related to the issues with needing to 
inform all plugins before triggering the "firstSearcher" events (since 
handlers are likely used by those events) -- but it seems like the searcher 
could be created first, then the plugins "informed", then the 
firstSearcher events triggered.

: IndexReader in this way? I needed access to the IndexReader so that I
: can create the spell check index during core initialization. For now,
: I've moved the index creation to the first query coming into
: SpellCheckComponent (note to myself: review thread-safety in the init
: code).

As I mentioned in some spelling-related issue recently (although apparently 
not SOLR-572), the straightforward way to do this is to initialize things 
like this when a request with very specific initialization params occurs, 
and then document that the "recommended" way to use your handler is to 
configure a request with those params as part of the firstSearcher event.

(Having initialization work done like this is also necessary to 
allow rebuilding a spelling index after the dictionary has changed.)
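
A very rough sketch of that pattern, with made-up parameter and method names (not the 
actual SpellCheckComponent or SearchComponent API):

    import java.util.Map;

    public class LazySpellIndex {
        private volatile boolean built = false;

        // called for every request this component handles
        public void handleRequest(Map<String, String> params) {
            // "spellcheck.build" here is an illustrative init param, not a documented one;
            // the recommended setup would fire such a request from a firstSearcher event
            boolean forceBuild = "true".equals(params.get("spellcheck.build"));
            if (forceBuild || !built) {
                build(forceBuild);
            }
            // ... answer the spelling request using the built index ...
        }

        private synchronized void build(boolean force) {
            if (built && !force) {
                return; // another thread built it while we were waiting on the lock
            }
            // (re)build the dictionary here, e.g. from the source field's IndexReader
            built = true;
        }
    }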


-Hoss



[jira] Issue Comment Edited: (SOLR-572) Spell Checker as a Search Component

2008-05-20 Thread Oleg Gnatovskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598512#action_12598512
 ] 

oleg_gnatovskiy edited comment on SOLR-572 at 5/20/08 3:39 PM:
---

Hey guys, I created a dictionary index from an XML file containing three documents 
with id/word pairs 10/pizza, 11/club, and 12/bar.

My config sets up a spellcheck dictionary named "default", built from the index on 
the "word" field, and "word" is defined as a field in schema.xml.

When I run a query with the following URL:
http://localhost:8983/solr/select/?q=barr&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10
I get back a spellcheck response containing a single suggestion, "bar",
which is what I expect.
However with this URL:
http://localhost:8983/solr/select/?q=bar&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10
where bar is correctly spelled, I get a single suggestion, "barr".

Could you please tell me where the word "barr" is coming from, and why it is 
being suggested? 

Thanks!

  was (Author: oleg_gnatovskiy):
Hey guys, I created a dictionary index from an XML file containing three documents 
with id/word pairs 10/pizza, 11/club, and 12/bar.

My config sets up a spellcheck dictionary named "default", built from the index on 
the "word" field, and "word" is defined as a field in schema.xml.

When I run a query with the following URL:
http://localhost:8983/solr/select/?q=barr&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10
I get back a spellcheck response containing a single suggestion, "bar",
which is what I expect.
However with this URL:
http://wil1devsch1.cs.tmcs:8983/solr/select/?q=bar&spellcheck=true&spellcheck.dictionary=default&spellcheck.count=10
where bar is correctly spelled, I get a single suggestion, "barr".

Could you please tell me where the word "barr" is coming from, and why it is 
being suggested? 

Thanks!
  
> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-578) Binary stream response for request

2008-05-20 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598532#action_12598532
 ] 

Hoss Man commented on SOLR-578:
---

bq. If your Handler can write out data only using a specific writer, you have 
the flexibility of overriding the 'wt' in the handler. Register your own writer 
in solrconfig.xml.

Correct.  

(a handler can even go so far as to "fail" in the inform(SolrCore) method if 
the writer it expects is not present)

The ShowFileRequestHandler and RawResponseWriter are good examples of this 
model (although it would probably make sense to change RawResponseWriter to 
implement BinaryQueryResponseWriter at some point)

bq. It is incongruous to have SolrQueryRequest.getContentStreams() but nothing 
similar for SolrQueryResponse.

Only if you are used to thinking of things in terms of the servlet API : )

generally speaking, the majority of Request Handlers shouldn't be dealing with 
raw character or binary streams ... they should be dealing with simple objects 
and deferring rendering of those objects to the QueryResponseWriter to decide 
how to render them based on the wishes of the client ... there are exceptions 
to every rule however, hence the approach described here where the Handler 
"forces" a particular response writer.

> Binary stream response for request
> --
>
> Key: SOLR-578
> URL: https://issues.apache.org/jira/browse/SOLR-578
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>Reporter: Jason Rutherglen
>
> Allow sending binary response back from request.  This is not the same as 
> encoding in binary such as BinaryQueryResponseWriter.  Simply need access to 
> servlet response stream for sending something like a Lucene segment.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-572) Spell Checker as a Search Component

2008-05-20 Thread Oleg Gnatovskiy (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Gnatovskiy updated SOLR-572:
-

Comment: was deleted

> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-20 Thread Oleg Gnatovskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598533#action_12598533
 ] 

Oleg Gnatovskiy commented on SOLR-572:
--

Hey guys please disregard my last comment, I had a configuration issue that 
caused the problem. I was just wondering if there is a way to get the 
suggestions not to echo the query if there are no suggestions. For example a 
query where q=food probably should return a suggestion of "food".

> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Closed: (SOLR-578) Binary stream response for request

2008-05-20 Thread Jason Rutherglen (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Rutherglen closed SOLR-578.
-

   Resolution: Won't Fix
Fix Version/s: 1.3

Ok

> Binary stream response for request
> --
>
> Key: SOLR-578
> URL: https://issues.apache.org/jira/browse/SOLR-578
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 1.3
>Reporter: Jason Rutherglen
> Fix For: 1.3
>
>
> Allow sending binary response back from request.  This is not the same as 
> encoding in binary such as BinaryQueryResponseWriter.  Simply need access to 
> servlet response stream for sending something like a Lucene segment.  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-303) Distributed Search over HTTP

2008-05-20 Thread Lars Kotthoff (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598548#action_12598548
 ] 

Lars Kotthoff commented on SOLR-303:


On closer inspection of the code, are the fields "sort" and "prefix" of 
FieldFacet used anywhere at all? They don't seem to be referenced anywhere in 
the code and just removing them doesn't seem to have any obvious effect.

> Distributed Search over HTTP
> 
>
> Key: SOLR-303
> URL: https://issues.apache.org/jira/browse/SOLR-303
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sharad Agarwal
>Assignee: Yonik Seeley
> Attachments: distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed_add_tests_for_intended_behavior.patch, 
> distributed_facet_count_bugfix.patch, distributed_pjaol.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.stu.patch, 
> fedsearch.stu.patch, shards_qt.patch, solr-dist-faceting-non-ascii-all.patch
>
>
> Searching over multiple shards and aggregating results.
> Motivated by http://wiki.apache.org/solr/DistributedSearch

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-572) Spell Checker as a Search Component

2008-05-20 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598549#action_12598549
 ] 

Shalin Shekhar Mangar commented on SOLR-572:


Oleg -- Thanks for trying out the patch. No, currently it does not signal if 
suggestions are not found, it just returns the query terms themselves. I'll add 
that feature.
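
The fix could be as small as filtering out echoes before writing the response; a 
trivial illustration (not the actual patch code):

    import java.util.ArrayList;
    import java.util.List;

    public class SuggestionFilter {
        // drop "suggestions" that merely repeat the original query term
        public static List<String> dropEchoes(String queryTerm, List<String> suggestions) {
            List<String> kept = new ArrayList<String>();
            for (String s : suggestions) {
                if (!s.equalsIgnoreCase(queryTerm)) {
                    kept.add(s);
                }
            }
            return kept;
        }
    }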

> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-303) Distributed Search over HTTP

2008-05-20 Thread Gunnar Wagenknecht (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598551#action_12598551
 ] 

Gunnar Wagenknecht commented on SOLR-303:
-

Hi / Hallo,

Thanks for your mail. Unfortunately, I won't be able to answer it
soon. I'm on vacation till June 2nd without access to my mails.



Thank you for your email. Unfortunately I will not reply right away.
I am on vacation until June 2nd, without access to my mailbox.

-Gunnar

-- 
Gunnar Wagenknecht
[EMAIL PROTECTED]
http://wagenknecht.org/


> Distributed Search over HTTP
> 
>
> Key: SOLR-303
> URL: https://issues.apache.org/jira/browse/SOLR-303
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sharad Agarwal
>Assignee: Yonik Seeley
> Attachments: distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed_add_tests_for_intended_behavior.patch, 
> distributed_facet_count_bugfix.patch, distributed_pjaol.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.stu.patch, 
> fedsearch.stu.patch, shards_qt.patch, solr-dist-faceting-non-ascii-all.patch
>
>
> Searching over multiple shards and aggregating results.
> Motivated by http://wiki.apache.org/solr/DistributedSearch

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Issue Comment Edited: (SOLR-572) Spell Checker as a Search Component

2008-05-20 Thread Oleg Gnatovskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598533#action_12598533
 ] 

oleg_gnatovskiy edited comment on SOLR-572 at 5/20/08 9:23 PM:
---

Hey guys I was just wondering if there is a way to get the suggestions not to 
echo the query if there are no suggestions available. For example, a query 
where q=food probably should not return a suggestion of "food".

  was (Author: oleg_gnatovskiy):
Hey guys please disregard my last comment, I had a configuration issue that 
caused the problem. I was just wondering if there is a way to get the 
suggestions not to echo the query if there are no suggestions. For example a 
query where q=food probably should return a suggestion of "food".
  
> Spell Checker as a Search Component
> ---
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Affects Versions: 1.3
>Reporter: Shalin Shekhar Mangar
> Fix For: 1.3
>
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-303) Distributed Search over HTTP

2008-05-20 Thread Otis Gospodnetic (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Otis Gospodnetic updated SOLR-303:
--

Comment: was deleted

> Distributed Search over HTTP
> 
>
> Key: SOLR-303
> URL: https://issues.apache.org/jira/browse/SOLR-303
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sharad Agarwal
>Assignee: Yonik Seeley
> Attachments: distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed_add_tests_for_intended_behavior.patch, 
> distributed_facet_count_bugfix.patch, distributed_pjaol.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.stu.patch, 
> fedsearch.stu.patch, shards_qt.patch, solr-dist-faceting-non-ascii-all.patch
>
>
> Searching over multiple shards and aggregating results.
> Motivated by http://wiki.apache.org/solr/DistributedSearch

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.