date:20080318

[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches

2008-03-18 Thread Thomas Peuss (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579761#action_12579761
 ] 

Thomas Peuss commented on SOLR-127:
---

{quote}
It seems there is no way to disable caching on a per-handler basis.
{quote}
True. We should work toward making this configurable per handler.
{quote}
I've read through the comments on this issue but I'm still not convinced as to 
why we need to enable HTTP caching by default. The way I see it, using an 
HTTP caching proxy in front of Solr is a very rare use case, and people using it 
in their deployments can always go and enable caching in solrconfig. The 
downside of enabling this by default is that there is no way right now to 
disable it on a per-handler basis, and even if there was a way, everyone would 
have to explicitly do it in their configuration, which is something that we 
would have to educate users about unnecessarily.
{quote}
I have no problem with disabling caching headers by default. We might need 
functionality that lets a back-end module veto the emission of cache headers, or 
tell the cache header code to emit headers that prevent caching of the 
response. This is not too hard to implement. I will look into this tonight. 
We can simply add two methods to the SolrQueryResponse class (like _void 
setAvoidHTTPCaching(boolean)_ and _boolean isAvoidHTTPCaching()_ - the default 
for the value would be _false_). The update request handlers should always set 
this to _true_. The partial response stuff can set this to _true_ as well.
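
A minimal sketch of the veto flag proposed above. It assumes a stand-in class rather than the real SolrQueryResponse; the class names and the Cache-Control values are illustrative assumptions, not the contents of any patch:

```java
// Hypothetical stand-in for SolrQueryResponse; only the proposed flag is modelled.
class SketchQueryResponse {
    private boolean avoidHttpCaching = false; // default suggested in the comment above

    void setAvoidHTTPCaching(boolean avoid) {
        this.avoidHttpCaching = avoid;
    }

    boolean isAvoidHTTPCaching() {
        return avoidHttpCaching;
    }
}

class CacheHeaderSketch {
    /**
     * Returns the Cache-Control value that a (hypothetical) header-emitting
     * component could send, honouring the veto set by a request handler.
     */
    static String cacheControlFor(SketchQueryResponse rsp, long maxAgeSeconds) {
        if (rsp.isAvoidHTTPCaching()) {
            return "no-cache, no-store"; // tell caches not to keep the response
        }
        return "max-age=" + maxAgeSeconds + ", public";
    }
}
```

An update handler would call setAvoidHTTPCaching(true) on its response before the headers are written; a handler that never touches the flag keeps the default.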

Another way to avoid cache headers on a _per request_ basis is to use POST 
requests. For POST requests we emit neither cache-related headers nor 
_Not Modified_ responses, following the W3C specs here.

While thinking about that, I realized that we also need to extend the tests to 
make sure that we never emit cache-related headers in case of errors.
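
The two rules above (no cache headers for POST, none for error responses) could be checked in one place. This is only a sketch under assumptions: the class and method names are made up, HEAD is assumed to be treated like GET, and "error" is assumed to mean any non-2xx status:

```java
import java.util.Locale;

class CacheHeaderPolicySketch {
    /**
     * Decides whether cache-related headers may be emitted at all.
     * Assumptions: GET and HEAD are the only cacheable methods here, and
     * any status outside 200-299 counts as an error.
     */
    static boolean mayEmitCacheHeaders(String httpMethod, int statusCode) {
        String method = httpMethod.toUpperCase(Locale.ROOT);
        boolean cacheableMethod = method.equals("GET") || method.equals("HEAD");
        boolean success = statusCode >= 200 && statusCode < 300;
        return cacheableMethod && success;
    }
}
```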

You can also already disable the cache-header functionality by adding
{noformat}
   
{noformat}
to your solrconfig.xml.

bq. I appreciate the work you all have put into this issue and all I'm trying 
to say is that a feature used very rarely should not be enabled by default. I'd 
like to vote to go back to Solr 1.2 compatibility by default.

In my world caching proxies and load balancers are the default. This might 
influence my view on this. ;-)

> Make Solr more friendly to external HTTP caches
> ---
>
> Key: SOLR-127
> URL: https://issues.apache.org/jira/browse/SOLR-127
> Project: Solr
> Issue Type: Wish
> Reporter: Hoss Man
> Assignee: Hoss Man
> Fix For: 1.3
>
> Attachments: CacheUnitTest.patch, CacheUnitTest.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch
>
>
> An offhand comment I saw recently reminded me of something that really bugged 
> me about the search solution I used *before* Solr -- it didn't play nicely 
> with HTTP caches that might be sitting in front of it.
> At the moment, Solr doesn't put particularly useful info in the HTTP 
> Response headers to aid in caching (e.g. Last-Modified), responds to all HEAD 
> requests with a 400, and doesn't do anything special with If-Modified-Since.
> At the very least, we can set a Last-Modified based on when the current 
> IndexReader was opened (if not the Date on the IndexReader) and use the same 
> info to determine how to respond to If-Modified-Since requests.
> (For the record, I think the reason this hasn't occurred to me in the 2+ years 
> I've been using Solr is that, with the internal caching, I've yet to need 
> to put a proxy cache in front of Solr.)
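
A rough sketch of the Last-Modified / If-Modified-Since idea from the issue description. The class and method names are assumptions; times are epoch milliseconds, and -1 stands for a missing If-Modified-Since header (the value HttpServletRequest.getDateHeader reports in that case):

```java
class ConditionalGetSketch {
    /** HTTP dates have one-second resolution, so drop the millisecond part. */
    static long lastModified(long readerOpenedMillis) {
        return (readerOpenedMillis / 1000L) * 1000L;
    }

    /**
     * Returns 304 when the client's copy is at least as new as the time the
     * current index reader was opened, and 200 otherwise.
     */
    static int statusFor(long readerOpenedMillis, long ifModifiedSinceMillis) {
        boolean headerPresent = ifModifiedSinceMillis >= 0;
        if (headerPresent && lastModified(readerOpenedMillis) <= ifModifiedSinceMillis) {
            return 304; // Not Modified: no body needs to be sent
        }
        return 200;
    }
}
```

Truncating to whole seconds matters because a client echoes back the header value it received, which cannot carry milliseconds.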

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Hudson build is back to normal: Solr-trunk #380

2008-03-18 Thread Apache Hudson Server

See http://hudson.zones.apache.org/hudson/job/Solr-trunk/380/changes

[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches

2008-03-18 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579767#action_12579767
 ] 

Noble Paul commented on SOLR-127:
-

If we look at the problem that this feature is trying to solve, only the 
'select' handler should need this. So making it 'enabled' by default for all 
handlers does not serve any purpose.

This is indeed a useful feature for those who use a caching proxy in front. But 
those users are educated enough to configure it in solrconfig.xml if they need 
it. (BTW, we use Solr extensively and we have no caching in front of Solr.)


In an ideal situation the 'select' handler must have it enabled by default. 
For all other handlers keep it off by default and provide an option to enable 
it (if needed).




[jira] Created: (SOLR-505) Give RequestHandlers the possiblity to suppress the generation of HTTP caching headers

2008-03-18 Thread Thomas Peuss (JIRA)

Give RequestHandlers the possiblity to suppress the generation of HTTP caching 
headers
--

 Key: SOLR-505
 URL: https://issues.apache.org/jira/browse/SOLR-505
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.3
Reporter: Thomas Peuss


The code from SOLR-127 emits HTTP cache headers for all handlers if configured. 
We should not emit cache-related headers for update request handlers. Partial 
responses (coming from the Timeout request stuff) should not be cached either.

To solve this problem we can simply add two methods to the SolrQueryResponse 
class (like void setAvoidHTTPCaching(boolean) and boolean isAvoidHTTPCaching() 
- the default for the value would be false). The update request handlers should 
set this to true all the time. The partial response stuff can set this to true 
as well.


[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches

2008-03-18 Thread Thomas Peuss (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579808#action_12579808
 ] 

Thomas Peuss commented on SOLR-127:
---

bq. This is indeed a useful feature for those who use a caching proxy in front. 
But those users are educated enough to configure it in solrconfig.xml if they 
need it. (BTW, we use Solr extensively and we have no caching in front of 
Solr.)
True. We should disable the cache header stuff by default. Please open a new 
JIRA issue for that.

bq. In an ideal situation the 'select' handler must have it enabled by default.
For all other handlers keep it off by default and provide an option to enable 
it (if needed)
Exactly. We need to get a bit more specific here. I opened SOLR-505 for that.


[jira] Issue Comment Edited: (SOLR-127) Make Solr more friendly to external HTTP caches

2008-03-18 Thread Thomas Peuss (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579808#action_12579808
 ] 

tpeuss edited comment on SOLR-127 at 3/18/08 4:23 AM:


bq. This is indeed a useful feature for those who use a caching proxy in front. 
But those users are educated enough to configure it in solrconfig.xml if they 
need it. (BTW, we use Solr extensively and we have no caching in front of 
Solr.)
True. We should disable the cache header stuff by default. Please open a new 
JIRA issue for that.

bq. In an ideal situation the 'select' handler must have it enabled by default. 
For all other handlers keep it off by default and provide an option to enable 
it (if needed)
Exactly. We need to get a bit more specific here. I have opened SOLR-505 for 
that.


[jira] Created: (SOLR-506) Enabling HTTP Cache headers should be configurable on a per-handler basis

2008-03-18 Thread Shalin Shekhar Mangar (JIRA)

Enabling HTTP Cache headers should be configurable on a per-handler basis
-

 Key: SOLR-506
 URL: https://issues.apache.org/jira/browse/SOLR-506
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.3
Reporter: Shalin Shekhar Mangar
 Fix For: 1.3


HTTP cache headers are needed only for the select handler's responses, and it 
does not make much sense to enable them globally for all Solr responses.

Therefore, enabling/disabling cache headers should be configurable on a 
per-handler basis. They should be enabled by default on the select request 
handler and disabled by default on all others. It should be possible to 
override these defaults through configuration as well as through the API.
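
One way to picture the per-handler defaults and overrides described above. This is a sketch only: the class name, the "/select" and "/update" handler names, and the map-based override store are assumptions, not a proposed Solr API:

```java
import java.util.HashMap;
import java.util.Map;

class PerHandlerCachingSketch {
    // Explicit overrides, e.g. loaded from configuration or set through an API.
    private final Map<String, Boolean> overrides = new HashMap<>();

    void setCacheHeadersEnabled(String handlerName, boolean enabled) {
        overrides.put(handlerName, enabled);
    }

    /** Select is on by default, every other handler is off unless overridden. */
    boolean isCacheHeadersEnabled(String handlerName) {
        Boolean override = overrides.get(handlerName);
        if (override != null) {
            return override;
        }
        return "/select".equals(handlerName);
    }
}
```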


[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches

2008-03-18 Thread Shalin Shekhar Mangar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579812#action_12579812
 ] 

Shalin Shekhar Mangar commented on SOLR-127:


I've opened SOLR-506 to make this feature configurable on a per-handler basis.

Thanks, Thomas, for starting SOLR-505. Together, these two issues should lead to 
an 'ideal' solution :)


[jira] Resolved: (SOLR-497) SolrJ QueryResponse does not support date faceting results

2008-03-18 Thread Grant Ingersoll (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved SOLR-497.
--

Resolution: Fixed

Committed revision 638357.  Thanks Shalin!

> SolrJ QueryResponse does not support date faceting results
> --
>
> Key: SOLR-497
> URL: https://issues.apache.org/jira/browse/SOLR-497
> Project: Solr
>  Issue Type: Improvement
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-497.patch
>
>
> The QueryResponse provides getFacetFields for drilling down into facets.  It 
> would also be handy to have similar info for facet dates.


[jira] Updated: (SOLR-505) Give RequestHandlers the possiblity to suppress the generation of HTTP caching headers

2008-03-18 Thread Thomas Peuss (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Peuss updated SOLR-505:
--

Attachment: SOLR-505.patch

Draft version of a patch.

> Give RequestHandlers the possiblity to suppress the generation of HTTP 
> caching headers
> --
>
> Key: SOLR-505
> URL: https://issues.apache.org/jira/browse/SOLR-505
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.3
>Reporter: Thomas Peuss
> Attachments: SOLR-505.patch
>
>
> The code from SOLR-127 emits HTTP cache headers for all handlers if 
> configured. We should not emit cache related headers for update request 
> handlers. Partial responses (coming from the Timeout request stuff) should 
> not be cached as well.
> To solve this problem we can simply add two methods to the SolrQueryResponse 
> class (like void setAvoidHTTPCaching(boolean) and boolean 
> isAvoidHTTPCaching() - the default for the value would be false). The update 
> request handlers should set this to true all the time. The partial response 
> stuff can set this to true as well.


[jira] Commented: (SOLR-505) Give RequestHandlers the possiblity to suppress the generation of HTTP caching headers

2008-03-18 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579867#action_12579867
 ] 

Yonik Seeley commented on SOLR-505:
---

+1 on the general approach.
Distributed search will probably need to avoid caching as well.


[jira] Updated: (SOLR-469) DB Import RequestHandler

2008-03-18 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-469:


Attachment: SOLR-469.patch

This is the biggest feature release of the patch so far. It contains almost 
all the planned features for DataImportHandler, including:

* Support for XML/HTTP datasources
* JavaScript for transformers (requires Java 6)
* Numerous performance enhancements and bug fixes
* Better logging and error handling
* An improved command interface
* A command to reload the config
* Statistics integrated with Solr statistics
* Can access request parameters
* Extra configurable parameters can be passed from solrconfig.xml
* Multiple transformers possible (chaining)
* Can put in the handler without a data-config.xml and datasource
* Can make an arbitrary entity a root entity

More documentation is in the wiki.

> DB Import RequestHandler
> 
>
> Key: SOLR-469
> URL: https://issues.apache.org/jira/browse/SOLR-469
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Affects Versions: 1.3
>Reporter: Noble Paul
>Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
> SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other dataSources 
> into the Solr index. Think of it as an advanced form of the SqlUpload Plugin 
> (SOLR-103).
> The way it works is as follows.
> * Provide a configuration file (XML) to the handler which takes in the 
> necessary SQL queries and mappings to a Solr schema
>   - It also takes in a properties file for the data source 
> configuration
> * Given the configuration it can also generate the Solr schema.xml
> * It is registered as a RequestHandler which can take two commands: 
> do-full-import, do-delta-import
>   - do-full-import - dumps all the data from the database into the 
> index (based on the SQL query in the configuration)
>   - do-delta-import - dumps all the data that has changed since the last 
> import. (We assume a modified-timestamp column in tables)
> * It provides an admin page
>   - where we can schedule it to be run automatically at regular 
> intervals
>   - It shows the status of the handler (idle, full-import, 
> delta-import)
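
To illustrate the difference between the two commands in the description, here is a small in-memory sketch. It does not use JDBC or the actual handler; the row type, the method names and the strict "modified after last import" comparison are all assumptions:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical row with the modified-timestamp column the description assumes.
class SketchRow {
    final String id;
    final long modifiedMillis;

    SketchRow(String id, long modifiedMillis) {
        this.id = id;
        this.modifiedMillis = modifiedMillis;
    }
}

class ImportSketch {
    /** Full import: every row is sent to the index. */
    static List<String> fullImport(List<SketchRow> table) {
        List<String> ids = new ArrayList<>();
        for (SketchRow row : table) {
            ids.add(row.id);
        }
        return ids;
    }

    /** Delta import: only rows modified after the last import are sent. */
    static List<String> deltaImport(List<SketchRow> table, long lastImportMillis) {
        List<String> ids = new ArrayList<>();
        for (SketchRow row : table) {
            if (row.modifiedMillis > lastImportMillis) {
                ids.add(row.id);
            }
        }
        return ids;
    }
}
```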


[jira] Updated: (SOLR-486) Support binary formats for QueryresponseWriter

2008-03-18 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-486:


Attachment: SOLR-486.patch

No changes. Just synchronizing with other code changes.

This is a very useful option for users who wish to implement a binary format 
to improve performance. (I plan to contribute one as soon as this is 
committed.)

Currently the Java clients go through the XML response format, which can eat up 
some time in unmarshalling. It can be quite significant if the document size 
is large (take the case of facet requests).

I have an XML response which took around 30 ms for unmarshalling. A binary 
format would have taken less than 5 ms.

> Support binary formats for QueryresponseWriter
> --
>
> Key: SOLR-486
> URL: https://issues.apache.org/jira/browse/SOLR-486
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java, search
>Reporter: Noble Paul
>Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-486.patch, SOLR-486.patch
>
>
> QueryResponseWriter only allows text data to be written,
> so it is not possible to implement a binary protocol. Create another 
> interface which has a method 
> write(OutputStream os, SolrQueryRequest request, SolrQueryResponse response)
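
A sketch of what such an interface could look like. Only the write(...) signature comes from the description; the interface name, the stand-in request/response types and the toy writer that emits a single hit count are assumptions for illustration:

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-ins for SolrQueryRequest and SolrQueryResponse.
class SketchQueryRequest {}

class SketchQueryResult {
    final int numFound;

    SketchQueryResult(int numFound) {
        this.numFound = numFound;
    }
}

// Assumed name; the issue only specifies the method signature.
interface BinaryResponseWriterSketch {
    void write(OutputStream os, SketchQueryRequest request, SketchQueryResult response)
            throws IOException;
}

// Toy implementation: writes the hit count as a 4-byte big-endian integer.
class HitCountWriter implements BinaryResponseWriterSketch {
    @Override
    public void write(OutputStream os, SketchQueryRequest request, SketchQueryResult response)
            throws IOException {
        DataOutputStream out = new DataOutputStream(os);
        out.writeInt(response.numFound);
        out.flush();
    }
}
```

Taking an OutputStream instead of a Writer is what makes raw bytes possible; a text writer can still be layered on top of the stream when needed.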


[jira] Commented: (SOLR-505) Give RequestHandlers the possiblity to suppress the generation of HTTP caching headers

2008-03-18 Thread Thomas Peuss (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579918#action_12579918
 ] 

Thomas Peuss commented on SOLR-505:
---

BTW: The patch uses _avoidHttpCaching=true_ as the default, as this is the safer 
approach. The patch only sets this to _false_ for the search handlers.
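
The safer default described here could look roughly as follows. This is not the patch itself; the response and handler types are hypothetical stand-ins, and only the default value and the search-handler opt-in follow the comment above:

```java
// Hypothetical stand-in for SolrQueryResponse with the safer default.
class SaferDefaultResponse {
    private boolean avoidHttpCaching = true; // caching is suppressed unless a handler opts in

    void setAvoidHTTPCaching(boolean avoid) {
        avoidHttpCaching = avoid;
    }

    boolean isAvoidHTTPCaching() {
        return avoidHttpCaching;
    }
}

// Hypothetical handler types; the real request handlers have a richer interface.
interface SketchHandler {
    void handle(SaferDefaultResponse rsp);
}

class SketchSearchHandler implements SketchHandler {
    @Override
    public void handle(SaferDefaultResponse rsp) {
        rsp.setAvoidHTTPCaching(false); // search results may be cached
    }
}

class SketchUpdateHandler implements SketchHandler {
    @Override
    public void handle(SaferDefaultResponse rsp) {
        // Nothing to do: the default already suppresses cache headers.
    }
}
```

With this default, a new handler that forgets about caching fails safe: its responses are simply never cached.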


[jira] Updated: (SOLR-469) DB Import RequestHandler

2008-03-18 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-469:


Attachment: SOLR-469.patch

The last patch started from the wrong root. This one applies properly.


[jira] Updated: (SOLR-469) DB Import RequestHandler

2008-03-18 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-469:


Attachment: (was: SOLR-469.patch)


[jira] Updated: (SOLR-469) DB Import RequestHandler

2008-03-18 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-469:


Attachment: (was: SOLR-469.patch)

> DB Import RequestHandler
> 
>
> Key: SOLR-469
> URL: https://issues.apache.org/jira/browse/SOLR-469
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Affects Versions: 1.3
>Reporter: Noble Paul
>Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
> SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other data sources 
> into the Solr index. Think of it as an advanced form of the SqlUpload Plugin 
> (SOLR-103).
> The way it works is as follows.
> * Provide a configuration file (xml) to the Handler which takes in the 
> necessary SQL queries and mappings to a solr schema
>   - It also takes in a properties file for the data source 
> configuration
> * Given the configuration it can also generate the solr schema.xml
> * It is registered as a RequestHandler which can take two commands: 
> do-full-import, do-delta-import
>   - do-full-import - dumps all the data from the Database into the 
> index (based on the SQL query in configuration)
>   - do-delta-import - dumps all the data that has changed since the last 
> import. (We assume a modified-timestamp column in tables)
> * It provides an admin page
>   - where we can schedule it to be run automatically at regular 
> intervals
>   - It shows the status of the Handler (idle, full-import, 
> delta-import)
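
The difference between the two commands described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the attached patch: the class `DeltaImportSketch`, the `Row` type and the in-memory "table" are hypothetical stand-ins for a real JDBC result set, and `modifiedMillis` plays the role of the assumed modified-timestamp column.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; rows are modeled in memory rather than read over JDBC.
class DeltaImportSketch {

    static final class Row {
        final String id;
        final long modifiedMillis; // stands in for the assumed modified-timestamp column

        Row(String id, long modifiedMillis) {
            this.id = id;
            this.modifiedMillis = modifiedMillis;
        }
    }

    // do-full-import: every row is handed to the indexer.
    static List<Row> fullImport(List<Row> table) {
        return new ArrayList<>(table);
    }

    // do-delta-import: only rows modified after the previous import time.
    static List<Row> deltaImport(List<Row> table, long lastImportMillis) {
        List<Row> changed = new ArrayList<>();
        for (Row r : table) {
            if (r.modifiedMillis > lastImportMillis) {
                changed.add(r);
            }
        }
        return changed;
    }
}
```

A real handler would presumably push the timestamp comparison into the SQL query instead of filtering in Java; the in-memory filter is used here only to keep the sketch self-contained.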


[jira] Updated: (SOLR-469) DB Import RequestHandler

2008-03-18 Thread Noble Paul (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-469:


Attachment: SOLR-469.patch


Problem with Facetting !

2008-03-18 Thread Tejaswi_Haramurali


Hi , 

 I am Tejaswi , a newbie with solr. 
I have tried to add new facets using solrj and I have also made the
appropriate changes in schema.xml and solrconfig.xml.

My aim is to index Marc-xml files (Marc-XML is a variant of xml used
predominantly with library content)
In this situation, I am facing a problem. 

Now, for instance if I want to facet X,Y,Z fields,

1) I make the changes in solr-config.xml
2) changes in schema.xml

However, some of my XML data is variable: sometimes
one XML file lacks the 'X' field and
another lacks the 'Y' field. In this case, Solr throws an exception.

One aspect is that I am not facing this problem when all my XML records are
uniform (i.e have X,Y and Z fields)

Can anyone shed light on this problem ?
-- 
View this message in context: 
http://www.nabble.com/Problem-with-Facetting-%21-tp16125687p16125687.html
Sent from the Solr - Dev mailing list archive at Nabble.com.

Re: Problem with Facetting !

2008-03-18 Thread Yonik Seeley

On Tue, Mar 18, 2008 at 12:58 PM, Tejaswi_Haramurali
<[EMAIL PROTECTED]> wrote:
>   I am Tejaswi , a newbie with solr.
>  I have tried to add new facets using solrj and I have also made the
>  appropriate changes in schema.xml and solrconfig.xml.
>
>  My aim is to index Marc-xml files (Marc-XML is a variant of xml used
>  predominantly with library content)
>  In this situation, I am facing a problem.
>
>  Now, for instance if I want to facet X,Y,Z fields,
>
>  1) I make the changes in solr-config.xml
>  2) changes in schema.xml
>
>  However, some of my XML data is variable: sometimes
>  one XML file lacks the 'X' field and
>  another lacks the 'Y' field. In this case, Solr throws an exception.

Do you get the exception when indexing or when searching?
What's the full exception stack trace?

-Yonik

[jira] Updated: (SOLR-303) Distributed Search over HTTP

2008-03-18 Thread Jayson Minard (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayson Minard updated SOLR-303:
---

Attachment: distributed_facet_count_bugfix.patch

Attached patch to fix issue with distributed search.  If you specified a 
facet.field that was valid for the schema but not contained in a shard, an 
unintentional exception (array index out of bounds) would be thrown instead of 
returning the facet as empty.
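
A minimal sketch of the intended behavior, not the code from the attached patch: when per-shard facet counts are merged, a shard that has no entry for a requested facet.field contributes nothing, and the field still comes back as an empty facet. The class and method names (`FacetMergeSketch`, `mergeFacetCounts`) and the map-based shard responses are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the actual distributed search implementation.
class FacetMergeSketch {

    // Each shard response maps facet field -> (term -> count).
    // A shard may omit a field entirely if none of its documents contain it.
    static Map<String, Map<String, Long>> mergeFacetCounts(
            List<String> facetFields,
            List<Map<String, Map<String, Long>>> shardResponses) {
        Map<String, Map<String, Long>> merged = new LinkedHashMap<>();
        for (String field : facetFields) {
            // Always create an entry, so a field missing on every shard yields an empty facet.
            Map<String, Long> totals = new LinkedHashMap<>();
            merged.put(field, totals);
            for (Map<String, Map<String, Long>> shard : shardResponses) {
                Map<String, Long> counts = shard.get(field);
                if (counts == null) {
                    continue; // field absent on this shard: skip it instead of failing
                }
                for (Map.Entry<String, Long> e : counts.entrySet()) {
                    totals.merge(e.getKey(), e.getValue(), Long::sum);
                }
            }
        }
        return merged;
    }
}
```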

> Distributed Search over HTTP
> 
>
> Key: SOLR-303
> URL: https://issues.apache.org/jira/browse/SOLR-303
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sharad Agarwal
>Assignee: Yonik Seeley
> Attachments: distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed_facet_count_bugfix.patch, 
> distributed_pjaol.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, 
> fedsearch.stu.patch, fedsearch.stu.patch
>
>
> Searching over multiple shards and aggregating results.
> Motivated by http://wiki.apache.org/solr/DistributedSearch


Re: Problem with Facetting !

2008-03-18 Thread Tejaswi_Haramurali


Hi Yonik,
The error is 

org.apache.solr.client.exception.SolrClientException: should be lst, not: str
    at org.apache.solr.client.impl.ResultsParser.processFacetInfo(ResultsParser.java:338)
    at org.apache.solr.client.impl.ResultsParser.process(ResultsParser.java:121)
    at org.apache.solr.client.impl.SolrClientImpl.query(SolrClientImpl.java:396)
    at solarqueryservlet.processRequest(solarqueryservlet.java:139)
    at solarqueryservlet.doPost(solarqueryservlet.java:43)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.netbeans.modules.web.monitor.server.MonitorFilter.doFilter(MonitorFilter.java:390)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:619)



-- 
View this message in context: 
http://www.nabble.com/Problem-with-Facetting-%21-tp16125687p16126128.html
Sent from the Solr - Dev mailing list archive at Nabble.com.

[jira] Commented: (SOLR-303) Distributed Search over HTTP

2008-03-18 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579931#action_12579931
 ] 

Yonik Seeley commented on SOLR-303:
---

I just committed this bugfix... thanks Jayson!


[jira] Commented: (SOLR-486) Support binary formats for QueryresponseWriter

2008-03-18 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579943#action_12579943
 ] 

Yonik Seeley commented on SOLR-486:
---

This patch is small enough, perhaps just combine it with the patch that 
implements a specific binary format.

> Support binary formats for QueryresponseWriter
> --
>
> Key: SOLR-486
> URL: https://issues.apache.org/jira/browse/SOLR-486
> Project: Solr
>  Issue Type: Improvement
>  Components: clients - java, search
>Reporter: Noble Paul
>Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-486.patch, SOLR-486.patch
>
>
> QueryResponse writer only allows text data to be written.
> So it is not possible to implement a binary protocol . Create another 
> interface which has a method 
> write(OutputStream os, SolrQueryRequest request, SolrQueryResponse response)
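
The proposed method shape can be sketched as below. Only the `write(OutputStream, request, response)` signature comes from the issue text; the interface name `BinaryWriterSketch`, the stand-in types `SketchRequest` and `SketchResponse` (used so the example compiles without Solr on the classpath) and the length-prefixed format are assumptions for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Stand-ins for SolrQueryRequest / SolrQueryResponse so the sketch is self-contained.
class SketchRequest { }

class SketchResponse {
    final String payload;

    SketchResponse(String payload) {
        this.payload = payload;
    }
}

// Hypothetical interface name; a byte-oriented counterpart to a Writer-based response writer.
interface BinaryWriterSketch {
    void write(OutputStream os, SketchRequest request, SketchResponse response) throws IOException;
}

// Example implementation: a 4-byte big-endian length followed by the UTF-8 bytes.
class LengthPrefixedWriter implements BinaryWriterSketch {
    @Override
    public void write(OutputStream os, SketchRequest request, SketchResponse response)
            throws IOException {
        byte[] body = response.payload.getBytes(StandardCharsets.UTF_8);
        int n = body.length;
        os.write(new byte[] { (byte) (n >>> 24), (byte) (n >>> 16), (byte) (n >>> 8), (byte) n });
        os.write(body);
    }
}
```

Because the method receives an OutputStream rather than a character Writer, an implementation is free to emit arbitrary bytes, which is what a binary protocol needs.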


[jira] Commented: (SOLR-127) Make Solr more friendly to external HTTP caches

2008-03-18 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579953#action_12579953
 ] 

Hoss Man commented on SOLR-127:
---

For the record: most of this discussion should have happened on the solr-dev 
list, not in the issue comments ... but i would like to address some points, so 
I'll do it here since this is where the discussion is.

1) It's true, there is no way to configure caching on a per request handler 
basis -- if you look at the history of the issue we looked into that but 
because of the necessary API changes we scaled back the scope of the patch -- 
it can be done, it just needs more thought into how to do it and people 
interested in working on it.

2) there is no doubt in my mind that having the cache awareness code on by 
default is the right approach moving forward.  These options don't cause Solr 
to do any caching, or to force any external caches to cache the pages -- they 
only result in Solr behaving correctly according to the HTTP spec sections 
relating to cache headers:  
   * *if* a request is made to Solr via an HTTP cache that cache will receive 
headers it can use to decide if/how-long to cache the response
   * *if* Solr receives a request with cache validation information then it 
responds with a 304
if you don't want that behavior then either don't access Solr via a cache, or 
explicitly set the  option; but the default 
behavior for people who are upgrading from 1.2 should be for Solr to emit 
correct headers and to respect validation requests.  Requiring Solr users to 
explicitly turn on an option to get Solr to emit correct Caching headers would 
be like requiring them to explicitly set an option to get well formed XML 
instead of invalid XML -- the default should be the one that behaves the most 
correctly.
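
The validation behavior described in the second bullet can be sketched roughly as follows. This is a simplified illustration, not Solr's actual cache header code: the class `ConditionalGetSketch` and its method are hypothetical, and only If-Modified-Since is considered (no ETag handling).

```java
// Hypothetical helper; a simplified model of If-Modified-Since validation.
class ConditionalGetSketch {

    static final int OK = 200;
    static final int NOT_MODIFIED = 304;

    // lastModifiedMillis: when the index (the "resource") last changed.
    // ifModifiedSinceMillis: parsed If-Modified-Since request header,
    // or -1 if the client sent no validation information.
    static int statusFor(String method, long lastModifiedMillis, long ifModifiedSinceMillis) {
        boolean validatable = "GET".equals(method) || "HEAD".equals(method);
        if (!validatable || ifModifiedSinceMillis < 0) {
            return OK; // POST requests and unconditional requests get a full response
        }
        // HTTP dates have one-second resolution, so compare whole seconds.
        long lastModifiedSec = lastModifiedMillis / 1000;
        long sinceSec = ifModifiedSinceMillis / 1000;
        return lastModifiedSec <= sinceSec ? NOT_MODIFIED : OK;
    }
}
```

A client that never sends validation headers always takes the `OK` branch, which matches the point above that this only changes behavior for requests that arrive with cache validation information.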

I admit however: this is a notable enough change that it should be mentioned in 
the "Upgrading from 1.2" section of CHANGES.txt -- I will add that.

3) if other pending patches attached to other issues have poor behavior as a 
result of the caching code, the appropriate place to discuss that is in those 
issues -- the solution may be to mark those issues dependent on a new issue to 
add the API hooks for request handlers to suppress caching (that's a good idea 
in general) but it's also possible that there are better/safer/more-logical 
solutions specific to those patches ... if the DataImportHandler is having 
problems because of the caching code, i'm guessing it's because people use it to 
trigger updates using an HTTP GET -- that violates the semantics of GET and 
making workarounds in the HttpCaching code to allow for that is a bad idea.

4) saying only the "/select" handler should get its responses cached is 
misleading -- under Solr 1.3 there won't be anything special about /select ... 
any handler name can be used for queries, and any handler name can be used for 
updates ... if you are issuing a request that modifies the index, you should be 
sending a POST and no caching headers (or validation) will be done by Solr 
regardless of configuration.

As I said, discussion about the general topic of HTTP Caching, Solr, and what 
the defaults should be should really happen on the solr-dev list ... if there 
are any further comments let's please conduct them there and then open/update 
whatever issues we need to once a consensus has been reached.


> Make Solr more friendly to external HTTP caches
> ---
>
> Key: SOLR-127
> URL: https://issues.apache.org/jira/browse/SOLR-127
> Project: Solr
>  Issue Type: Wish
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 1.3
>
> Attachments: CacheUnitTest.patch, CacheUnitTest.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, HTTPCaching.patch, 
> HTTPCaching.patch
>
>
> an offhand comment I saw recently reminded me of something that really bugged 
> me about the search solution i used *before* Solr -- it didn't play nicely 
> with HTTP caches that might be sitting in front of it.
> at the moment, Solr doesn't put in particularly useful info in the HTTP 
> Response headers to aid in caching (ie: Last-Modified), responds to all HEAD 
> requests with a 400, and doesn't do anything special with If-Modified-Since.
> at the very least, we can set a Last-Modified based on when the current 
> IndexReader was open (if not the Date on the IndexReader) and use the same 
> info to dete

[jira] Updated: (SOLR-303) Distributed Search over HTTP

2008-03-18 Thread Jayson Minard (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayson Minard updated SOLR-303:
---

Attachment: distributed_add_tests_for_intended_behavior.patch

A few more tests to show intended behavior when facets differ between shards 
which is likely in the wild (missing from all but valid in schema, missing from 
some, and invalid field not in schema).  The last test  is just to ensure error 
behavior matches non-distributed searches.

> Distributed Search over HTTP
> 
>
> Key: SOLR-303
> URL: https://issues.apache.org/jira/browse/SOLR-303
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sharad Agarwal
>Assignee: Yonik Seeley
> Attachments: distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed_add_tests_for_intended_behavior.patch, 
> distributed_facet_count_bugfix.patch, distributed_pjaol.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.stu.patch, 
> fedsearch.stu.patch
>
>
> Searching over multiple shards and aggregating results.
> Motivated by http://wiki.apache.org/solr/DistributedSearch


[jira] Commented: (SOLR-486) Support binary formats for QueryresponseWriter

2008-03-18 Thread Noble Paul (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12579977#action_12579977
 ] 

Noble Paul commented on SOLR-486:
-

I can do that.
I have a Java class that can serialize and deserialize a NamedList; I used it
to send the response and deserialize it back.
I can post it as well.
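
For readers unfamiliar with the idea, a round trip of ordered name/value pairs might look roughly like the sketch below. This is not the class mentioned above, which is not shown in this thread: `PairListCodecSketch` is a hypothetical name, values are restricted to strings, and a plain list of map entries stands in for NamedList.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical codec for an ordered list of name/value string pairs.
class PairListCodecSketch {

    static byte[] serialize(List<Map.Entry<String, String>> pairs) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(pairs.size());       // entry count first
            for (Map.Entry<String, String> p : pairs) {
                out.writeUTF(p.getKey());     // names may repeat; order is preserved
                out.writeUTF(p.getValue());
            }
        }
        return bytes.toByteArray();
    }

    static List<Map.Entry<String, String>> deserialize(byte[] data) throws IOException {
        List<Map.Entry<String, String>> pairs = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int n = in.readInt();
            for (int i = 0; i < n; i++) {
                String name = in.readUTF();
                String value = in.readUTF();
                pairs.add(new SimpleEntry<>(name, value));
            }
        }
        return pairs;
    }
}
```
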
--Noble



Re: Newbie question

2008-03-18 Thread Chris Hostetter


http://people.apache.org/~hossman/#solr-dev

Please Use "[EMAIL PROTECTED]" Not "[EMAIL PROTECTED]"

Your question is better suited for the [EMAIL PROTECTED] mailing list ...
not the [EMAIL PROTECTED] list.  solr-dev is for discussing development of
the internals of the Solr application ... it is *not* the appropriate
place to ask questions about how to use Solr (or write Solr plugins) 
when developing your own applications.  Please resend your message to
the solr-user mailing list, where you are likely to get more/better
responses since that list also has a larger number of subscribers.



-Hoss

Re: svn commit: r638484 - /lucene/solr/trunk/CHANGES.txt

2008-03-18 Thread Yonik Seeley

On Tue, Mar 18, 2008 at 2:03 PM,  <[EMAIL PROTECTED]> wrote:
>  +Solr now recognizes HTTP Request headers related to HTTP Caching (see
>  +RFC 2616 sec13) and will by default respond with "304 Not Modified"
>  +when appropriate.  This should only affect users who access Solr via
>  +an HTTP Cache,

It affects browsers too.  I noticed that no new request was being
executed when I hit refresh in firefox.
I worked around it by adding a random arg like x=1234 to get it to re-execute.
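
A tiny sketch of that workaround in client code, for anyone who wants to script it. The class `CacheBusterSketch` is hypothetical, the parameter name `x` simply follows the example above, and it is assumed that the handler ignores unknown parameters.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: makes each request URL unique so a cached copy is not reused.
class CacheBusterSketch {

    // Appends a throwaway parameter, using "?" or "&" depending on the URL.
    static String withRandomArg(String url, int value) {
        String separator = url.contains("?") ? "&" : "?";
        return url + separator + "x=" + value;
    }

    // Convenience overload that picks the value at random.
    static String withRandomArg(String url) {
        return withRandomArg(url, ThreadLocalRandom.current().nextInt(1_000_000));
    }
}
```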

-Yonik

Re: Problem with Facetting !

2008-03-18 Thread Tejaswi_Haramurali


I have managed to solve the problem myself. It was due to an error in
schema.xml.
Thanks anyway, Yonik!


-- 
View this message in context: 
http://www.nabble.com/Problem-with-Facetting-%21-tp16125687p16128404.html
Sent from the Solr - Dev mailing list archive at Nabble.com.

[jira] Commented: (SOLR-486) Support binary formats for QueryresponseWriter

2008-03-18 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580007#action_12580007
 ] 

Yonik Seeley commented on SOLR-486:
---

Great! If we can get it to handle everything that the current XML handler can 
handle, then we could use it by default for distributed search.



NEW Facetting Problem

2008-03-18 Thread Tejaswi_Haramurali


Hi ,

I am facing a problem using solrj. I am using Java (solrj) to index as
well as search data in the Solr search engine.
This is some of the code:


exer.setField("name","DOC"+identity);
exer.setField("features","The Mellon Foundation");
exer.setField("language",langmap.get("008lang"));
exer.setField("date",datemap.get("008date"));
exer.setField("format",formatmap.get("formats"));


The problem is that when I search for 'Mellon' or any other word in
the 'features' field, I get results. However, when I search on
any of the other fields, I don't get any results. I have ensured that
indexed=true is set in schema.xml for all these fields,
and I have also tried displaying the values I am indexing. I don't know what
mistake I am making.

I would be glad if someone could help me on this.

Tejaswi
-- 
View this message in context: 
http://www.nabble.com/NEW-Facetting-Problem-tp16128691p16128691.html
Sent from the Solr - Dev mailing list archive at Nabble.com.

[jira] Commented: (SOLR-303) Distributed Search over HTTP

2008-03-18 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580024#action_12580024
 ] 

Yonik Seeley commented on SOLR-303:
---

committed additional tests... thanks!


[jira] Commented: (SOLR-486) Support binary formats for QueryresponseWriter

2008-03-18 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580026#action_12580026
 ] 

Yonik Seeley commented on SOLR-486:
---

I accidentally committed this in conjunction with another patch.
I think I'll just leave it committed though, since it looks fine.
We can continue to use this JIRA issue for the actual binary protocol 
implementation though.


[jira] Commented: (SOLR-505) Give RequestHandlers the possiblity to suppress the generation of HTTP caching headers

2008-03-18 Thread Sean Timm (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580070#action_12580070
 ] 

Sean Timm commented on SOLR-505:


Should errors be cached?  Certainly some are transient and shouldn't be.

In the SolrDispatchFilter,
{code}
if( solrRsp.getException() != null ) {
  sendError( (HttpServletResponse)response, solrRsp.getException() );
}
{code}
the Expires and Cache-Control headers could be reset as in 
HttpCacheHeaderUtil.checkAvoidHttpCaching(...).

> Give RequestHandlers the possiblity to suppress the generation of HTTP 
> caching headers
> --
>
> Key: SOLR-505
> URL: https://issues.apache.org/jira/browse/SOLR-505
> Project: Solr
>  Issue Type: Bug
>  Components: search
>Affects Versions: 1.3
>Reporter: Thomas Peuss
> Attachments: SOLR-505.patch
>
>
> The code from SOLR-127 emits HTTP cache headers for all handlers if 
> configured. We should not emit cache related headers for update request 
> handlers. Partial responses (coming from the Timeout request stuff) should 
> not be cached as well.
> To solve this problem we can simply add two methods to the SolrQueryResponse 
> class (like void setAvoidHTTPCaching(boolean) and boolean 
> isAvoidHTTPCaching() - the default for the value would be false). The update 
> request handlers should set this to true all the time. The partial response 
> stuff can set this to true as well.
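
The two proposed methods could look roughly like the sketch below. The method names and the default of false come from the description; everything else is an assumption for illustration: `ResponseSketch` stands in for SolrQueryResponse, `CacheHeaderSketch` is a hypothetical helper, and the concrete header values are examples rather than what the attached patch emits.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for SolrQueryResponse carrying only the proposed flag.
class ResponseSketch {
    private boolean avoidHttpCaching = false; // proposed default

    void setAvoidHTTPCaching(boolean avoid) {
        this.avoidHttpCaching = avoid;
    }

    boolean isAvoidHTTPCaching() {
        return avoidHttpCaching;
    }
}

// Hypothetical helper that picks cache-related headers based on the flag.
class CacheHeaderSketch {

    static Map<String, String> headersFor(ResponseSketch rsp, long maxAgeSeconds) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (rsp.isAvoidHTTPCaching()) {
            // e.g. update handlers or partial (timed-out) responses
            headers.put("Cache-Control", "no-cache, no-store");
            headers.put("Pragma", "no-cache");
        } else {
            headers.put("Cache-Control", "max-age=" + maxAgeSeconds + ", public");
        }
        return headers;
    }
}
```

An update request handler would call `setAvoidHTTPCaching(true)` before returning, and the code that writes headers would consult the flag once the handler has finished.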


Re: svn commit: r638484 - /lucene/solr/trunk/CHANGES.txt

2008-03-18 Thread Erik Hatcher



On Mar 18, 2008, at 3:04 PM, Yonik Seeley wrote:
> On Tue, Mar 18, 2008 at 2:03 PM,  <[EMAIL PROTECTED]> wrote:
>>  +Solr now recognizes HTTP Request headers related to HTTP Caching (see
>>  +RFC 2616 sec13) and will by default respond with "304 Not Modified"
>>  +when appropriate.  This should only affect users who access Solr via
>>  +an HTTP Cache,
>
> It affects browsers too.  I noticed that no new request was being
> executed when I hit refresh in firefox.
> I worked around it by adding a random arg like x=1234 to get it to
> re-execute.

I experienced the same thing, but instead of a random argument you  
can hold down shift while clicking the reload button to have it make  
a fresh request.


Erik

[jira] Updated: (SOLR-502) Add search time out support

2008-03-18 Thread Sean Timm (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Timm updated SOLR-502:
---

Attachment: solrTimeout.patch

This patch includes Shalin's SolrJ patch and includes the SOLR-505 patch.  HTTP 
cache headers are now suppressed on a timeout.

> Add search time out support
> ---
>
> Key: SOLR-502
> URL: https://issues.apache.org/jira/browse/SOLR-502
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Reporter: Sean Timm
>Priority: Minor
> Attachments: SOLR-502-solrj.patch, solrTimeout.patch, 
> solrTimeout.patch, solrTimeout.patch, solrTimeout.patch
>
>
> Uses LUCENE-997 to add time out support to Solr.  


[jira] Created: (SOLR-507) Spell Checking Improvements

2008-03-18 Thread Jayson Minard (JIRA)

Spell Checking Improvements
---

 Key: SOLR-507
 URL: https://issues.apache.org/jira/browse/SOLR-507
 Project: Solr
  Issue Type: New Feature
  Components: spellchecker
Reporter: Jayson Minard


Creating a placeholder issue to track Spell Checking Improvements.  Individual 
issues can later be created and linked for each area of separable concern when 
they are determined.  

Areas to discuss include:

# spell checking only within the current result set so that suggestions are 
always valid
** need to merge the spell checking index structure into fields within the 
actual documents within the main index rather than using a parallel dictionary 
index (change to Lucene, or place in Solr?)
** need to add spell checking as query component and make available to various 
query handlers
** spell checking to be field specific to support responding correctly with 
dismax queries
# spell checking in a distributed search (SOLR-303)

What are other typical areas of concern, or suggestions for improvements for 
spell checking that can be tracked?  

I am willing to look at driving a patch for this area, especially for spell 
checking working within the current result set, and across  distributed search. 
 



Proposing updates to spell checking including support for Distributed Search (SOLR-303)

2008-03-18 Thread Jayson Minard

I just added SOLR-507 (https://issues.apache.org/jira/browse/SOLR-507)
to start a discussion on improving spell checking/suggestion support.

The details are provided in the issue with the main areas of focus
being the merging of spell suggestions into other types of requests,
spell suggestions with distributed search, and spell suggestions that
are limited by the current result set of the query (minus the terms
being spell corrected) and filter.

I am willing to pursue working on patches in this area.

-- Jayson Minard
   MindHeap Technology
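
One way to picture "spell suggestions limited by the current result set" is sketched below. It is a toy model under stated assumptions, not the proposed patch: term_to_docs stands in for an inverted index, valid_docs for the set of documents matching the rest of the query and the filter, and the function name is hypothetical.

```python
def restricted_suggestions(candidates, term_to_docs, valid_docs):
    """Keep only candidate corrections that occur in a currently valid doc.

    candidates:   suggested terms, already ranked by the spell checker
    term_to_docs: assumed mapping of term -> set of doc ids containing it
    valid_docs:   doc ids matching the query (minus the misspelled term)
                  and the filter
    """
    return [c for c in candidates if term_to_docs.get(c, set()) & valid_docs]


index = {"solar": {1, 2}, "solr": {3}, "sonar": {9}}
print(restricted_suggestions(["solar", "solr", "sonar"], index, {2, 3}))
```

A suggestion that survives this check is guaranteed to return at least one hit when substituted into the query, which is the "always valid" property the issue asks for.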

[jira] Updated: (SOLR-507) Spell Checking Improvements

2008-03-18 Thread Jayson Minard (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayson Minard updated SOLR-507:
---

Description: 
Creating a placeholder issue to track Spell Checking Improvements.  Individual 
issues can later be created and linked for each area of separable concern when 
they are determined.  

Areas to discuss include:

# spell suggestions from within the current query (minus terms being corrected) 
and filter so that suggestions are always valid
** need to merge the spell checking index structure into fields within the 
actual documents within the main index rather than using a parallel dictionary 
index (change to Lucene, or place in Solr?)
** need to add spell checking as query component and make available to various 
query handlers
** spell checking to be field specific to support responding correctly with 
dismax queries
# spell suggestions from a distributed search (SOLR-303)
# spell suggestions as a search component to augment other queries

What are other typical areas of concern, or suggestions for improvements for 
spell checking that can be tracked?  

I am willing to look at driving a patch for this area, especially for spell 
checking working within the current result set, and across  distributed search. 
 


  was:
Creating a placeholder issue to track Spell Checking Improvements.  Individual 
issues can later be created and linked for each area of separable concern when 
they are determined.  

Areas to discuss include:

# spell checking only within the current result set so that suggestions are 
always valid
** need to merge the spell checking index structure into fields within the 
actual documents within the main index rather than using a parallel dictionary 
index (change to Lucene, or place in Solr?)
** need to add spell checking as query component and make available to various 
query handlers
** spell checking to be field specific to support responding correctly with 
dismax queries
# spell checking in a distributed search (SOLR-303)

What are other typical areas of concern, or suggestions for improvements for 
spell checking that can be tracked?  

I am willing to look at driving a patch for this area, especially for spell 
checking working within the current result set, and across  distributed search. 
 




[jira] Updated: (SOLR-507) Spell Checking Improvements

2008-03-18 Thread Jayson Minard (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayson Minard updated SOLR-507:
---

Description: 
Creating a placeholder issue to track Spell Checking Improvements.  Individual 
issues can later be created and linked for each area of separable concern when 
they are determined.  

Areas to discuss include:

# spell suggestions from within the current query (minus terms being corrected) 
and filter so that suggestions are always valid
** need approaches to merging the spelling list with the current mask of valid 
records.  Also, is this a better change to Lucene first, or something that 
belongs in Solr?
** need to add spell checking as query component and make available to various 
query handlers
** spell checking to be field specific to support responding correctly with 
dismax queries
# spell suggestions from a distributed search (SOLR-303)
# spell suggestions as a search component to augment other queries

What are other typical areas of concern, or suggestions for improvements for 
spell checking that can be tracked?  

I am willing to look at driving a patch for this area, especially for spell 
checking working within the current result set, and across  distributed search. 
 


  was:
Creating a placeholder issue to track Spell Checking Improvements.  Individual 
issues can later be created and linked for each area of separable concern when 
they are determined.  

Areas to discuss include:

# spell suggestions from within the current query (minus terms being corrected) 
and filter so that suggestions are always valid
** need to merge the spell checking index structure into fields within the 
actual documents within the main index rather than using a parallel dictionary 
index (change to Lucene, or place in Solr?)
** need to add spell checking as query component and make available to various 
query handlers
** spell checking to be field specific to support responding correctly with 
dismax queries
# spell suggestions from a distributed search (SOLR-303)
# spell suggestions as a search component to augment other queries

What are other typical areas of concern, or suggestions for improvements for 
spell checking that can be tracked?  

I am willing to look at driving a patch for this area, especially for spell 
checking working within the current result set, and across  distributed search. 
 




[jira] Commented: (SOLR-507) Spell Checking Improvements

2008-03-18 Thread Jayson Minard (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580148#action_12580148
 ] 

Jayson Minard commented on SOLR-507:


A related item from the Lucene project:

* LUCENE-626 "Extended spell checker with phrase support and adaptive user 
session analysis" provides phrase-level spell suggestions.


[jira] Issue Comment Edited: (SOLR-507) Spell Checking Improvements

2008-03-18 Thread Jayson Minard (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580148#action_12580148
 ] 

jayson.minard edited comment on SOLR-507 at 3/18/08 3:42 PM:
-

A related item from the Lucene project:

* LUCENE-626 "Extended spell checker with phrase support and adaptive user 
session analysis" provides phrase-level spell suggestions.

Also tracking write-ups on spell suggestion algorithms, in case this comes up:

* [Spelling Checker using Lucene|http://sujitpal.blogspot.com/2007/12/spelling-checker-with-lucene.html]



  was (Author: jayson.minard):
A related item from Lucene project...

* LUCENE-626 "Extended spell checker with phrase support and adaptive user 
session analysis" provides phrase-level spell suggestions.
  

[jira] Updated: (SOLR-507) Spell Checking Improvements

2008-03-18 Thread Jayson Minard (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jayson Minard updated SOLR-507:
---

Description: 
Creating a placeholder issue to track Spell Checking Improvements.  Individual 
issues can later be created and linked for each area of separable concern when 
they are determined.  

Areas to discuss include:

# spell suggestions from within the current query (minus terms being corrected) 
and filter so that suggestions are always valid
** need approaches to merging the spelling list with the current mask of valid 
records.  Also, is this a better change to Lucene first, or something that 
belongs in Solr?
** need to add spell checking as query component and make available to various 
query handlers
** spell checking to be field specific to support responding correctly with 
dismax queries
# spell suggestions from a distributed search (SOLR-303) by augmenting the 
response, or alternatively just provide a federating of Spell Checker requests 
on their own and let the application decide when to use each.
# spell suggestions as a search component to augment other queries

What are other typical areas of concern, or suggestions for improvements for 
spell checking that can be tracked?  

I am willing to look at driving a patch for this area, especially for spell 
checking working within the current result set, and across  distributed search. 
 


  was:
Creating a placeholder issue to track Spell Checking Improvements.  Individual 
issues can later be created and linked for each area of separable concern when 
they are determined.  

Areas to discuss include:

# spell suggestions from within the current query (minus terms being corrected) 
and filter so that suggestions are always valid
** need approaches to merging the spelling list with the current mask of valid 
records.  Also, is this a better change to Lucene first, or something that 
belongs in Solr?
** need to add spell checking as query component and make available to various 
query handlers
** spell checking to be field specific to support responding correctly with 
dismax queries
# spell suggestions from a distributed search (SOLR-303)
# spell suggestions as a search component to augment other queries

What are other typical areas of concern, or suggestions for improvements for 
spell checking that can be tracked?  

I am willing to look at driving a patch for this area, especially for spell 
checking working within the current result set, and across  distributed search. 
 



Updated description to provide alternatives for distributed search.


Re: svn commit: r638484 - /lucene/solr/trunk/CHANGES.txt

2008-03-18 Thread Chris Hostetter


: >  +Solr now recognizes HTTP Request headers related to HTTP Caching (see
: >  +RFC 2616 sec13) and will by default respond with "304 Not Modified"
: >  +when appropriate.  This should only affect users who access Solr via
: >  +an HTTP Cache,
: 
: It affects browsers too.  I noticed that no new request was being
: executed when I hit refresh in firefox.

Well, I would argue that paragraph is technically correct since you are 
accessing Solr via an HTTP Cache ... your browser cache.

But I'll update it to make special note of that.

-Hoss
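
The "304 Not Modified" behavior discussed above can be sketched roughly as below. This is a simplified model of an HTTP conditional GET as described in RFC 2616 sec 13, not Solr's actual code: the function name and the ETag values are hypothetical, and real handling also covers If-Modified-Since, lists of validators, and weak validators.

```python
def conditional_get(request_headers, current_etag):
    """Return (status, headers) for a GET, honoring If-None-Match.

    Simplified: compares a single validator by exact string match.
    """
    headers = {"ETag": current_etag}
    if request_headers.get("If-None-Match") == current_etag:
        # The client's cached copy is still current: no body is sent.
        return 304, headers
    return 200, headers


# A browser replaying the validator it cached earlier gets a 304.
print(conditional_get({"If-None-Match": '"v1"'}, '"v1"'))
# A first request, or one made after the index changed, gets a full 200.
print(conditional_get({}, '"v1"'))
```

This is why a plain refresh in a browser may not trigger a new search: the browser revalidates, and the server answers that nothing has changed.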

[jira] Commented: (SOLR-507) Spell Checking Improvements

2008-03-18 Thread Yonik Seeley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580176#action_12580176
 ] 

Yonik Seeley commented on SOLR-507:
---

Spell checking is not an area I've personally looked at, but your list of 
discussion items looks spot on.
IMO, since integrating spelling suggestions with general query results (search, 
facet, highlight) hasn't been done before in Solr, the response format is wide 
open (go crazy!)


[jira] Updated: (SOLR-261) Search query with any stop words can invalidate whole query

2008-03-18 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-261:
--

Affects Version/s: 1.2

> Search query with any  stop words can invalidate whole query
> 
>
> Key: SOLR-261
> URL: https://issues.apache.org/jira/browse/SOLR-261
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 1.1.0, 1.2
> Environment: Centos 4.7, apache-tomcat-6.0.13, Java 1.6.0_01-b06
>   SOLR Nightly solr-2007-06-06
> Reporter: Nickolas Golubev
> Fix For: 1.3
>
>
> org.apache.solr.request.StandardRequestHandler may parse the query string 
> incorrectly when "stop words" like "and" "of" etc... are used.
> We have this query:
> Collection:0  
> AND (Publisher:"Survey"^1 OR Creator:"Survey"^1 OR DocText:"Survey"^3 OR 
> Description:"Survey"^4 OR Title:"Survey"^6) 
> AND (Publisher:"of"^1 OR Creator:"of"^1 OR DocText:"of"^3 OR 
> Description:"of"^4 OR Title:"of"^6) 
> AND (Publisher:"Military"^1 OR Creator:"Military"^1 OR DocText:"Military"^3 
> OR Description:"Military"^4 OR Title:"Military"^6) 
> AND (Publisher:"Planning"^1 OR Creator:"Planning"^1 OR DocText:"Planning"^3 
> OR Description:"Planning"^4 OR Title:"Planning"^6) 
> AND (Publisher:"Systems"^1 OR Creator:"Systems"^1 OR DocText:"Systems"^3 OR 
> Description:"Systems"^4 OR Title:"Systems"^6) 
> Which got parsed into this query:
> +Collection:0 
> +(Publisher:survey Creator:survey DocText:survey^3.0 Description:survey^4.0 
> Title:survey^6.0) 
> +() 
> +(Publisher:militari Creator:militari DocText:militari^3.0 
> Description:militari^4.0 Title:militari^6.0) 
> +(Publisher:plan Creator:plan DocText:plan^3.0 Description:plan^4.0 
> Title:plan^6.0) 
> +(Publisher:system Creator:system DocText:system^3.0 Description:system^4.0 
> Title:system^6.0)
> The +() clause makes the query stop working. I think this is a bug: if all 
> the terms inside the "(" ")" are removed, the "(" ")" should be removed as 
> well.


[jira] Resolved: (SOLR-261) Search query with any stop words can invalidate whole query

2008-03-18 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-261.
---

   Resolution: Fixed
Fix Version/s: 1.3

Since the trunk was upgraded to use Lucene 2.3, this problem has been resolved.

It can be verified against the trunk by running the following query against 
the example configs/docs:

+(name:of name:the) +name:solr
http://localhost:8983/solr/select/?debugQuery=true&q=%2B(name%3Aof+name%3Athe)+%2Bname%3Asolr&fl=name

Note that in the debug output the parsed query string is:   +name:solr

...whereas in Solr 1.2 the parsed query string was:  +() +name:solr
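
The difference between the two parsed forms can be illustrated with a toy model. This is not the Lucene query parser: the stop list, the analyze stand-in, and the prune_empty switch are assumptions made for the example, which only shows why a required clause that loses all of its terms has to be dropped.

```python
STOP_WORDS = {"of", "the", "and"}  # assumed stop list, for illustration only


def analyze(term):
    """Stand-in for an analyzer: lowercase, and return None for stop words."""
    lowered = term.lower()
    return None if lowered in STOP_WORDS else lowered


def build_required_clauses(groups, prune_empty=True):
    """Render each group of (field, term) pairs as a required (+) clause."""
    clauses = []
    for group in groups:
        kept = [
            f"{field}:{analyze(term)}"
            for field, term in group
            if analyze(term) is not None
        ]
        if not kept and prune_empty:
            continue  # every term was a stop word: drop the whole clause
        body = kept[0] if len(kept) == 1 else "(" + " ".join(kept) + ")"
        clauses.append("+" + body)
    return " ".join(clauses)


query = [[("name", "of"), ("name", "the")], [("name", "solr")]]
print(build_required_clauses(query, prune_empty=False))  # +() +name:solr
print(build_required_clauses(query, prune_empty=True))   # +name:solr
```

With pruning off, the empty required clause +() can never match, so the whole query returns nothing, which is the behavior reported against Solr 1.2.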


[jira] Commented: (SOLR-303) Distributed Search over HTTP

2008-03-18 Thread Jayson Minard (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580188#action_12580188
 ] 

Jayson Minard commented on SOLR-303:


Would others be interested in an extended response format for distributed 
queries that returns a numbered list of shards and tags each element of the 
response with the shards that contributed to it?  For example, which shard was 
the source of a document, or which shards had a given facet value present.

With very high shard counts, it is more efficient to trim follow-on queries 
and pivots to only the shards that matter.  This information would help that 
effort.

Regardless, it is useful for debugging.
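
A rough sketch of what such a response could look like is given below. The structure, the function names, and the shard URLs are all hypothetical, and the merge ignores ranking and duplicate documents; it only shows a numbered shard list with each document and facet value tagged by contributing shard index.

```python
def merge_with_provenance(shard_responses):
    """Merge per-shard results, recording which shards contributed what.

    shard_responses: list of (shard_url, {"docs": [ids], "facets": {value: count}})
    """
    shards = [url for url, _ in shard_responses]
    docs, facets = [], {}
    for idx, (_, resp) in enumerate(shard_responses):
        for doc_id in resp["docs"]:
            docs.append({"id": doc_id, "shards": [idx]})
        for value, count in resp["facets"].items():
            entry = facets.setdefault(value, {"count": 0, "shards": []})
            entry["count"] += count
            entry["shards"].append(idx)
    return {"shards": shards, "docs": docs, "facets": facets}


def shards_for_facet(merged, value):
    """Shard URLs worth querying in a follow-on request on this facet value."""
    entry = merged["facets"].get(value)
    return [] if entry is None else [merged["shards"][i] for i in entry["shards"]]


merged = merge_with_provenance([
    ("http://shard1:8983/solr", {"docs": ["a", "b"], "facets": {"red": 2}}),
    ("http://shard2:8983/solr", {"docs": ["c"], "facets": {"red": 1, "blue": 1}}),
])
print(shards_for_facet(merged, "blue"))
```

A follow-on query that pivots on "blue" would then only need to be sent to the second shard.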

> Distributed Search over HTTP
> 
>
> Key: SOLR-303
> URL: https://issues.apache.org/jira/browse/SOLR-303
> Project: Solr
> Issue Type: New Feature
> Components: search
> Reporter: Sharad Agarwal
> Assignee: Yonik Seeley
> Attachments: distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed.patch, distributed.patch, distributed.patch, 
> distributed.patch, distributed_add_tests_for_intended_behavior.patch, 
> distributed_facet_count_bugfix.patch, distributed_pjaol.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, 
> fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.stu.patch, 
> fedsearch.stu.patch
>
>
> Searching over multiple shards and aggregating results.
> Motivated by http://wiki.apache.org/solr/DistributedSearch


Re: [jira] Commented: (SOLR-486) Support binary formats for QueryresponseWriter

2008-03-18 Thread Noble Paul നോബിള്‍ नोब्ळ्

I shall cut a patch for this as soon as I have tested it properly.


On Wed, Mar 19, 2008 at 1:13 AM, Yonik Seeley (JIRA) <[EMAIL PROTECTED]> wrote:
>
>
> [ 
> https://issues.apache.org/jira/browse/SOLR-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12580026#action_12580026
>  ]
>
>  Yonik Seeley commented on SOLR-486:
>  ---
>
>  I accidentally committed this in conjunction with another patch.
>  I think I'll just leave it committed though, since it looks fine.
>  We can continue to use this JIRA issue for the actual binary protocol 
> implementation though.
>
>  > Support binary formats for QueryresponseWriter
>  > --
>  >
>  > Key: SOLR-486
>  > URL: https://issues.apache.org/jira/browse/SOLR-486
>  > Project: Solr
>  >  Issue Type: Improvement
>  >  Components: clients - java, search
>  >Reporter: Noble Paul
>  >Priority: Minor
>  > Fix For: 1.3
>  >
>  > Attachments: SOLR-486.patch, SOLR-486.patch
>  >
>  >
>  > QueryResponseWriter only allows text data to be written, so it is not 
>  > possible to implement a binary protocol. Create another interface which 
>  > has a method
>  > write(OutputStream os, SolrQueryRequest request, SolrQueryResponse response)
>
>
>



-- 
--Noble Paul
