[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2010-04-07 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854648#action_12854648
 ] 

Uri Boness commented on SOLR-1163:
--

Yeah... I'm leaning toward that option as well. First, it's less intrusive, but 
also, using a proxy servlet I won't need cross-site scripting (JSONP) for the 
communication, which opens up all the XML-only APIs for me.
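
The trade-off can be sketched roughly as follows. This is a minimal sketch, assuming Solr's JSON response writer and its `json.wrf` wrapper-function parameter for the JSONP case; the `/solr-proxy` path, the host name and the function names are hypothetical and not taken from the patch.

```javascript
// Build the query string shared by both request styles.
function toQueryString(params) {
  return Object.keys(params)
    .map(function (key) {
      return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
    })
    .join('&');
}

// JSONP style: the page loads this URL through a <script> tag, so the
// response has to be JSON wrapped in a callback (wt=json plus json.wrf).
function jsonpUrl(solrBase, query, callbackName) {
  return solrBase + '/select?' + toQueryString({
    q: query,
    wt: 'json',
    'json.wrf': callbackName
  });
}

// Proxy style: the page calls a same-origin path (hypothetical here) and a
// servlet forwards the request to Solr, so any response format can be used.
function proxiedUrl(proxyPath, query, format) {
  return proxyPath + '/select?' + toQueryString({ q: query, wt: format });
}

console.log(jsonpUrl('http://localhost:8983/solr', 'title:solr', 'handleResults'));
console.log(proxiedUrl('/solr-proxy', 'title:solr', 'xml'));
```

With the proxy, the browser only ever talks to its own origin, which is why the XML-only handlers become reachable.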

 Solr Explorer - A generic GWT client for Solr
 -

 Key: SOLR-1163
 URL: https://issues.apache.org/jira/browse/SOLR-1163
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness
 Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch


 The attached patch is a GWT generic client for solr. It is currently 
 standalone, meaning that once built, one can open the generated HTML file in 
 a browser and communicate with any deployed solr. It is configured with its 
 own configuration file, where one can configure the solr instance/core to 
 connect to. Since it's currently standalone and completely client-side based, 
 it uses JSON with padding (cross-site scripting) to connect to remote solr 
 servers. Some of the supported features:
 - Simple query search
 - Sorting - one can dynamically define new sort criteria
 - Search results are rendered very much like Google search results are 
 rendered. It is also possible to view all stored field values for every hit. 
 - Custom hit rendering - It is possible to show thumbnails (images) per hit 
 and also customize a view for a hit based on html templates
 - Faceting - one can dynamically define field and query facets via the UI. It 
 is also possible to pre-configure these facets in the configuration file.
 - Highlighting - you can dynamically configure highlighting. It can also be 
 pre-configured in the configuration file.
 - Spellchecking - you can dynamically configure spell checking. Can also be 
 done in the configuration file. Supports collation. It is also possible to 
 send build and reload commands.
 - Data import handler - if used, it is possible to send a full-import and 
 status command (delta-import is not implemented yet, but it's easy to add)
 - Console - For development time, there's a small console which can help to 
 better understand what's going on behind the scenes. One can use it to:
 ** view the client logs
 ** browse the solr schema
 ** View a break down of the current search context
 ** View a break down of the query URL that is sent to solr
 ** View the raw JSON response returning from Solr
 This client is actually a platform that can be greatly extended for more 
 things. The goal is to have a client where the explorer part is just one view 
 of it. Other future views include: Monitoring, Administration, Query Builder, 
 DataImportHandler configuration, and more...
 To get a better view of what's currently possible, we've set up a public 
 version of this client at: http://search.jteam.nl/explorer. This client is 
 configured with one solr instance where crawled YouTube movies were indexed. 
 You can also check out a screencast for this deployed client: 
 http://search.jteam.nl/help
 The patch creates a new folder in the contrib directory. Since the patch 
 doesn't contain binaries, an additional zip file is provided that needs to be 
 extracted to add all the required graphics. This module is maven2 based and is 
 configured in such a way that all GWT related tools/libraries are 
 automatically downloaded when the module is compiled. One of the artifacts 
 of the build is a war file which can be deployed in any servlet container.
 NOTE: this client works best on WebKit based browsers (for performance 
 reasons) but also works on Firefox and IE 7+. That said, it should be taken 
 into account that it is still under development.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2010-04-07 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854664#action_12854664
 ] 

Uri Boness commented on SOLR-1163:
--

The only downside to this is that it requires extra setup. A lot of people 
(incl. myself) like to use the bundled jetty instance for development and only 
deploy solr in a different servlet container in production. In that sense, it 
would be nice to have something ready out of the box with the solr distribution 
(or at least something that is easy to set up with the examples directory). 




[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2010-04-06 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854155#action_12854155
 ] 

Uri Boness commented on SOLR-1163:
--

I'm working on a new, improved patch for the explorer, but I'm in a bit of a 
dilemma regarding exactly how it should integrate with Solr. There are 
basically 3 options:

1. Tight integration, where the explorer will be bound to each core and there 
will be a dedicated URL for it (say /corename/explorer). This is nice as the 
user gets this functionality out of the box, but on the other hand, I'm not 
sure users want it to be there out of the box (most of the time, if not always, 
the explorer will not be used as the final UI, but more of a temporary one, 
just to have something up and running... in production I can imagine users will 
not need it). This tight integration also means quite a lot of changes to the 
current configuration: first, the dispatch filter will need to change a bit, 
and a default request handler will also need to be defined for all cores.

2. The other option is to keep the explorer as an external tool. The idea is to 
have it as a separate war file which can be deployed in the same servlet 
container as solr. I'm working on removing the current xml configuration and 
making it more dynamic. So when the user enters the application, she can 
configure a core by following a wizard-like process... this wizard will create 
a configuration which will be saved on the server for future logins. 

3. Well, the third option is just to leave things as they are now. That is, 
there is one configuration file which defines all the solr cores the explorer 
can communicate with. This configuration file is loaded when the web page is 
loaded. Like option 2, this is also a standalone mode.

any comments?


[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr

2010-04-02 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852813#action_12852813
 ] 

Uri Boness commented on SOLR-773:
-

Grant, I started looking at SOLR-1298 yesterday. The idea is to somehow merge 
all the related issues (there are currently two open issues for the same 
purpose with two different patches). But this should be done in a somewhat 
collaborative manner so that everybody is on the same page, also 
regarding the discussion about the different approaches (inline the pseudo 
fields or have them nested in a separate meta element). Is there some way to 
merge the issues, or perhaps mark one of them as a duplicate, so the discussion 
will be centralized?

 Incorporate Local Lucene/Solr
 -

 Key: SOLR-773
 URL: https://issues.apache.org/jira/browse/SOLR-773
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.5

 Attachments: exampleSpatial.zip, lucene-spatial-2.9-dev.jar, 
 lucene.tar.gz, screenshot-1.jpg, SOLR-773-local-lucene.patch, 
 SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
 SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
 SOLR-773-spatial_solr.patch, SOLR-773.patch, SOLR-773.patch, 
 solrGeoQuery.tar, spatial-solr.tar.gz


 Local Lucene has been donated to the Lucene project.  It has some Solr 
 components, but we should evaluate how best to incorporate it into Solr.
 See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene




[jira] Issue Comment Edited: (SOLR-773) Incorporate Local Lucene/Solr

2010-04-02 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852813#action_12852813
 ] 

Uri Boness edited comment on SOLR-773 at 4/2/10 1:23 PM:
-

Grant, I started looking at SOLR-1298 yesterday. The idea is to somehow merge 
all the related issues (there are currently two open issues for the same 
purpose with two different patches). But this should be done in a somewhat 
collaborative manner so that everybody is on the same page, also 
regarding the discussion about the different approaches (inline the pseudo 
fields or have them nested in a separate meta element). Is there some way to 
merge the issues, or perhaps mark one of them as a duplicate, so the discussion 
will be centralized?

btw, the other duplicate issue is SOLR-1566

  



[jira] Created: (SOLR-1829) Cleaned up analysis.jsp - removed all token API scriptlets

2010-03-17 Thread Uri Boness (JIRA)
Cleaned up analysis.jsp - removed all token API scriptlets
--

 Key: SOLR-1829
 URL: https://issues.apache.org/jira/browse/SOLR-1829
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Affects Versions: 1.4
Reporter: Uri Boness
 Fix For: 1.5, 1.6, 3.1


The analysis.jsp was polluted with the old token stream API in scriptlets all 
over the place. Since the introduction of the FieldAnalysisRequestHandler, 
there's no need to keep this mess. Instead, the page can just call the analysis 
request handler with parameters generated by the form and display the xml 
response the same way as it is displayed at the moment. Moreover, it will save 
some work when updating the code base to the new token stream API.




[jira] Updated: (SOLR-1829) Cleaned up analysis.jsp - removed all token API scriptlets

2010-03-17 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1829:
-

Attachment: SOLR-1829.patch

This patch uses jQuery to generate the proper requests to the field analysis 
request handler, then applies an XSL transformation on the response to render 
it appropriately. It also updates the jQuery version to 1.4.2. 

The UI is *slightly* different for simplicity, but also a bit enhanced. Now 
when choosing to analyze by field type/name, a drop down with all possible 
types/names will be populated.

In order to support all functionality of the analysis.jsp, the 
FieldAnalysisRequestHandler & co. had to be enhanced. They now accept an 
analysis.verbose parameter which dumps more information about the 
tokenizer/filter. The verbose format differs a bit from the non-verbose one, 
but that will not break BWC: when this parameter is not used, the old format is 
returned.
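
A request of the kind the page would generate might look roughly like this. It is only a sketch: the `/analysis/field` handler path and the `"true"` value for `analysis.verbose` are assumptions, the helper and option names are hypothetical, and the other `analysis.*` parameters are the ones the field analysis request handler is understood to accept.

```javascript
// Build a request URL for the field analysis request handler.
// handlerPath defaults to /analysis/field, which is an assumption about
// how the handler is mapped in solrconfig.xml.
function fieldAnalysisUrl(solrBase, options) {
  var params = [];
  function add(name, value) {
    if (value !== undefined && value !== null) {
      params.push(encodeURIComponent(name) + '=' + encodeURIComponent(String(value)));
    }
  }
  add('analysis.fieldtype', options.fieldType);
  add('analysis.fieldname', options.fieldName);
  add('analysis.fieldvalue', options.fieldValue);
  add('analysis.query', options.query);
  // Parameter introduced by this patch; the "true" value format is assumed.
  if (options.verbose) {
    add('analysis.verbose', 'true');
  }
  return solrBase + (options.handlerPath || '/analysis/field') + '?' + params.join('&');
}

console.log(fieldAnalysisUrl('http://localhost:8983/solr', {
  fieldType: 'text',
  fieldValue: 'Hello World',
  verbose: true
}));
```

Leaving `verbose` unset produces a request without the new parameter, which is the case where the old response format is returned.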




[jira] Commented: (SOLR-1716) Add logging support for ScriptTransformer

2010-03-05 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841776#action_12841776
 ] 

Uri Boness commented on SOLR-1716:
--

bq. ScriptTransformer supports Rhino, Groovy, Scala, etc. Someday even Erjang. 
They probably all have such things, but a consistent debugging allows better 
debugging tools.

True. Using the logger will print to the same Solr log files. Indeed it's 
great for debugging, but when you have fancy complex logic in the scripts, 
general-purpose logging (e.g. INFO, ERROR, TRACE) should also be considered.

 Add logging support for ScriptTransformer
 -

 Key: SOLR-1716
 URL: https://issues.apache.org/jira/browse/SOLR-1716
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Reporter: Uri Boness
 Fix For: 1.5

 Attachments: SOLR-1716.patch, SOLR-1716.patch


 Currently it's very hard to debug the logic embedded in the script run by the 
 ScriptTransformer. There should be a possibility to add a logger to the 
 function signature, which can be used for logging.
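
 What such a script function could look like is sketched below. This is an illustration only: the function follows the usual ScriptTransformer style of receiving and returning a row, but the second `logger` argument, the function name and the column names are hypothetical, and the stand-in objects exist only so the sketch runs outside Solr.

```javascript
// Transformer function in the style of DIH's ScriptTransformer: it receives
// the current row (a map of column name to value) and returns it.
// The second `logger` argument is hypothetical and illustrates the proposal.
function normalizeTitle(row, logger) {
  var title = row.get('title');
  if (title === null || title === undefined) {
    logger.warn('row ' + row.get('id') + ' has no title, leaving it unchanged');
    return row;
  }
  // Strip leading and trailing whitespace.
  row.put('title', String(title).replace(/^\s+|\s+$/g, ''));
  logger.info('normalized title for row ' + row.get('id'));
  return row;
}

// Minimal stand-ins for the row map and the logger.
function makeRow(values) {
  return {
    get: function (key) { return values.hasOwnProperty(key) ? values[key] : null; },
    put: function (key, value) { values[key] = value; }
  };
}
var messages = [];
var logger = {
  info: function (msg) { messages.push('INFO ' + msg); },
  warn: function (msg) { messages.push('WARN ' + msg); }
};

normalizeTitle(makeRow({ id: '1', title: '  Solr in Action ' }), logger);
normalizeTitle(makeRow({ id: '2' }), logger);
console.log(messages.join('\n'));
```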




[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2010-01-27 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805601#action_12805601
 ] 

Uri Boness commented on SOLR-1163:
--

Actually I've been working on a new version for the explorer which I plan to 
put soon as a patch here.




[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-27 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805612#action_12805612
 ] 

Uri Boness commented on SOLR-1725:
--

{quote}
Performance:

It looks like scripts are read from the resource loader and parsed again (eval) 
for every update request. This can be pretty expensive, esp for those scripting 
languages that generate java class files instead of using an interpreter. One 
way to combat this would be to cache and reuse them.
{quote}
Yes, indeed the scripts are evaluated per request, but for a reason. One of the 
goals here is to keep the scripts as close as possible to the update processor 
interface, so the functions in the scripts have the same signature as the 
methods in the processor. But in order for the scripts to be flexible I decided 
to introduce some globally scoped variables which are accessible in the 
functions (currently the current solr request, the response and a logger are 
there). The problem is that the API only defines 3 scopes where you can 
register variables and the lowest one is the engine itself. Since the 
evaluation of a script is done on the engine level as well, when using this API 
together with the global variables I don't think you can escape the need for 
creating an engine per request (thus, also evaluating the scripts).

But I agree with you that if there is a way around it, caching the 
evaluated/compiled scripts will definitely boost things up. I'll need to 
investigate this further and come up with alternatives (I already have some 
ideas using ThreadLocals).
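
To illustrate the general idea (not the Java scripting API itself), here is a rough analogy in plain JavaScript: the script source is parsed once and cached, and the per-request objects are passed in at call time instead of being registered as engine-wide globals. All names here are hypothetical, and `new Function` merely stands in for whatever compile step the real engine would offer.

```javascript
// Cache of compiled scripts, keyed by script name.
var compiledScripts = {};
var compileCount = 0;

// Parse the script source once. The per-request objects are parameters of
// the compiled wrapper rather than shared globals, so one compiled script
// can serve many requests.
function getCompiled(name, source) {
  if (!compiledScripts.hasOwnProperty(name)) {
    compileCount++;
    compiledScripts[name] = new Function(
      'req', 'rsp', 'logger',
      source + '\nreturn typeof processAdd === "function" ? processAdd : null;'
    );
  }
  return compiledScripts[name];
}

// Run the processAdd function of a cached script for one request.
function runProcessAdd(name, source, req, rsp, logger, cmd) {
  var processAdd = getCompiled(name, source)(req, rsp, logger);
  if (processAdd) {
    processAdd(cmd);
  }
}

var source = 'function processAdd(cmd) { logger.info(req.id + ": " + cmd.docId); }';
var lines = [];
var logger = { info: function (msg) { lines.push(msg); } };
runProcessAdd('a.js', source, { id: 'req-1' }, {}, logger, { docId: 'doc-1' });
runProcessAdd('a.js', source, { id: 'req-2' }, {}, logger, { docId: 'doc-2' });
console.log(lines.join(', '), '| compiled', compileCount, 'time(s)');
```

Each call still sees its own request object, yet the source is only parsed on the first call.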

bq. Should we have a way to specify a script in-line (in solrconfig.xml)?

Personally I prefer keeping the solrconfig.xml as clean as possible. I do 
however think that a standardization of Solr scripting support in general can 
be great. (for example, have a scripts folder under _solr.solr.home_ where all 
the scripts are placed, or come up with a standard configuration structure for 
the scripts... perhaps something in the direction Hoss suggested above).

bq. This seems to raise the visibility of the UpdateCommand classes, directly 
exposing them to users w/o plugins. We should perhaps consider interface 
cleanups on these classes at the same time as this issue.
+1

bq. Examples! Using javascript (since it's both fast and included in JDK6), 
let's see what the scripts are for some common usecases. This both helps 
improve the design as well as lets other people give feedback w/o having to 
read through code.
Yep.. that would probably be very helpful. Basically I think anyone who's ever 
written an update processor can perhaps try to convert it to a script and see 
how it works. The usual use case for me is to just add a few fields which are 
derived from the other fields, but perhaps there are some other more 
interesting use cases out there. I guess these examples should be put in the 
Wiki, right?





 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch, SOLR-1725.patch


 A script based UpdateRequestProcessorFactory (Uses JDK6 script engine 
 support). The main goal of this plugin is to be able to configure/write 
 update processors without the need to write and package Java code.
 The update request processor factory enables writing update processors in 
 scripts located in {{solr.solr.home}} directory. The factory accepts one 
 (mandatory) configuration parameter named {{scripts}} which accepts a 
 comma-separated list of file names. It will look for these files under the 
 {{conf}} directory in solr home. When multiple scripts are defined, their 
 execution order is defined by the lexicographical order of the script file 
 name (so {{scriptA.js}} will be executed before {{scriptB.js}}).
 The script language is resolved based on the script file extension (that is, 
 *.js files will be treated as JavaScript scripts), therefore an extension 
 is mandatory.
 Each script file is expected to have one or more methods with the same 
 signature as the methods in the {{UpdateRequestProcessor}} interface. It is 
 *not* required to define all methods, only those that are required by the 
 processing logic.
 The following variables are defined as global variables for each script:
  * {{req}} - The SolrQueryRequest
  * {{rsp}} - The SolrQueryResponse
  * {{logger}} - A logger that can be used for logging purposes in the script
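
 A script following this description might look roughly like the sketch below. It assumes the add command exposes its document as {{cmd.solrDoc}} and that the document offers {{getFieldValue}} and {{addField}}; the field names are made up, and the stand-in objects at the bottom exist only so the sketch can run outside Solr, where {{logger}} would be provided as a global.

```javascript
// Update processor script sketch. Function names mirror the methods of
// UpdateRequestProcessor; only the ones the logic needs are defined.
// Reading the document through cmd.solrDoc is an assumption about the
// add command in this Solr version.
function processAdd(cmd) {
  var doc = cmd.solrDoc;
  var first = doc.getFieldValue('first_name');
  var last = doc.getFieldValue('last_name');
  if (first !== null && last !== null) {
    // Derive a new field from existing ones.
    doc.addField('full_name', first + ' ' + last);
    logger.info('added full_name to document ' + doc.getFieldValue('id'));
  }
}

function processCommit(cmd) {
  logger.info('commit received');
}

// Stand-ins for the Solr objects so the sketch runs outside Solr.
var logLines = [];
var logger = { info: function (msg) { logLines.push(msg); } };
function makeDoc(fields) {
  return {
    getFieldValue: function (name) {
      return fields.hasOwnProperty(name) ? fields[name] : null;
    },
    addField: function (name, value) { fields[name] = value; }
  };
}

var doc = makeDoc({ id: '42', first_name: 'Ada', last_name: 'Lovelace' });
processAdd({ solrDoc: doc });
console.log(doc.getFieldValue('full_name'));
```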




[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-27 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805672#action_12805672
 ] 

Uri Boness commented on SOLR-1725:
--

Been looking more into it and I think there's a nice way in which we can cache 
the evaluated scripts. But... (and there's always a but) to make it work 
cleanly we need to be able to extend the scripting support, which means we need 
to be able to compile the code in Java 6.

And this brings us back to Mark's comment above on how we want to do that.

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch, SOLR-1725.patch


 A script based UpdateRequestProcessorFactory (uses JDK 6 script engine 
 support). The main goal of this plugin is to be able to configure/write 
 update processors without the need to write and package Java code.
 The update request processor factory enables writing update processors in 
 scripts located in the {{solr.solr.home}} directory. The factory accepts one 
 (mandatory) configuration parameter named {{scripts}}, which accepts a 
 comma-separated list of file names. It will look for these files under the 
 {{conf}} directory in Solr home. When multiple scripts are defined, their 
 execution order is defined by the lexicographical order of the script file 
 names (so {{scriptA.js}} will be executed before {{scriptB.js}}).
 The script language is resolved based on the script file extension (that is, 
 a *.js file will be treated as a JavaScript script), therefore an extension 
 is mandatory.
 Each script file is expected to have one or more methods with the same 
 signature as the methods in the {{UpdateRequestProcessor}} interface. It is 
 *not* required to define all methods, only those that are required by the 
 processing logic.
 The following variables are defined as global variables for each script:
  * {{req}} - The SolrQueryRequest
  * {{rsp}} - The SolrQueryResponse
  * {{logger}} - A logger that can be used for logging purposes in the script




[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-27 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805691#action_12805691
 ] 

Uri Boness commented on SOLR-1725:
--

Well then... I just hope others will not shed tears as well and we can make 
Solr 1.5 Java 6 compiled :-)

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch, SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-26 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805235#action_12805235
 ] 

Uri Boness commented on SOLR-1725:
--

Lance, I lost you a bit as well.

bq. Uri, I'd prefer if the manner of configuration was as similar as possible, 
i.e. if we could get rid of the <lst name="params"> part, and instead pass all 
top-level params directly to the script (except the scripts param itself).

Hmm... personally I prefer configurations that clearly indicate their purpose. 
Leaving out the _params_ list will make things a bit confusing - some 
parameters are available to the scripts, others are not... it's not really 
clear.

bq. manner of configuration was as similar as possible

The configurations are similar. All elements in solrconfig.xml have one standard 
way of configuration, which can be anything from a _lst_, _bool_, _str_, etc. 
Tomorrow a new processor will pop up which will also require a _lst_ 
configuration... and that's fine. 

bq. Even better if the definition of a processor was in a separate xml section 
and then referred to by name only in each chain, but that is a bigger change 
outside the scope of this patch.

Well, indeed that's a bigger change. Like everything, this kind of 
configuration has its pros and cons.

I guess it's best if people will just state their preferences regarding how 
they would like to see this processor configured and based on that I'll adjust 
the patch.

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch, SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-26 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805254#action_12805254
 ] 

Uri Boness commented on SOLR-1725:
--

bq. 1) what is the value add in making ScriptUpdateProcessorFactory support 
multiple scripts? ... wouldn't it be simpler to require that users declare 
multiple instances of ScriptUpdateProcessorFactory (that the processor chain 
already executes in sequence) than to add sequential processing to the 
ScriptUpdateProcessor?

Well... to my taste it makes the configuration cleaner (no need to define 
several script processors). The thing is, you have the choice here - either 
specify several scripts (comma separated) or split them to several processors.

bq. 2) The NamedList init args can be as deep of a data structure as you want, 
so something like this would be totally feasible (if desired) ...

That's definitely another option.

The only thing is that you'd probably want some way to define shared parameters 
(shared between the scripts that is) and not be forced to specify them several 
times for each script. I guess you can do something like this:

{code}
<processor class="solr.ScriptUpdateProcessorFactory">
  <lst name="sharedParams">
    <bool name="paramName">true</bool>
  </lst>
  <lst name="scripts">
    <lst name="updateProcessor1.js">
      <bool name="someParamName">true</bool>
      <int name="someOtherParamName">3</int>
    </lst>
    <lst name="updateProcessor2.js">
      <bool name="fooParam">true</bool>
      <str name="barParam">3</str>
    </lst>
  </lst>
  <lst name="otherProcessorOPtionsIfNeeded">
    ...
  </lst>
</processor>
{code}
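
To illustrate how such a layout might be consumed, here is a minimal sketch. It assumes that the shared parameters apply to every script and that a script's own parameters win on a name clash; the comment does not specify the precedence. It uses plain {{java.util.Map}} rather than Solr's {{NamedList}}, and {{ScriptParams.effectiveParams}} is a hypothetical helper, not something from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not from the patch): merges shared and per-script parameters. */
final class ScriptParams {
    private ScriptParams() {}

    /**
     * Returns the parameters one script would see: all shared entries first,
     * then the script's own entries, which replace shared ones of the same name.
     */
    static Map<String, Object> effectiveParams(Map<String, Object> shared,
                                               Map<String, Object> perScript) {
        Map<String, Object> merged = new LinkedHashMap<String, Object>(shared);
        merged.putAll(perScript);
        return merged;
    }
}
```

With the configuration above, {{updateProcessor1.js}} would see {{paramName}}, {{someParamName}} and {{someOtherParamName}} under these assumptions.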

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch, SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-23 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12804092#action_12804092
 ] 

Uri Boness commented on SOLR-1725:
--

If we move this to contrib, we'll need to extract the script engine 
abstraction into a separate contrib utils library (so the DIH will be able to 
utilize it). I believe this can create a bit of a mess just for this small 
(though useful) functionality. Is there a real reason for not keeping it in 
the core? 

I think either way, if people will want to use it they'll need to read 
somewhere how... I think it'd be nice to save them the extra effort of putting 
an extra jar file in the lib directory - the configuration (writing the script 
and configuring the update processors) they'll need to adjust anyway. The only 
thing that we must stress in the documentation (both in the schema and in the 
wiki) is that they can only use this feature in Java 6.

Two additional things to note:
1. JDK 5 has reached the end of service life (EOSL) already and is not actively 
supported by Sun (/Oracle).
2. The general recommendation is to run Solr on Java 6 anyways (due to some 
threading issues in Java 5).

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-23 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12804094#action_12804094
 ] 

Uri Boness commented on SOLR-1725:
--

bq. Would it make more sense to execute the scripts in the order they are named 
in the scripts param? If I have two pipelines/chains, that need to use the same 
scripts but in different orders, I'm in trouble.

Absolutely! The reason why it is currently lexicographically ordered is due to 
an initial (different) implementation that I had. I'll change it and add a 
patch for it.
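
The difference between the two orderings can be sketched as follows. This is an illustration only: {{ScriptOrder}} and its methods are hypothetical names, and the parsing (split on commas, trim, skip empty entries) is an assumption about how the {{scripts}} parameter would be read.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not from the patch): two ways to order the configured scripts. */
final class ScriptOrder {
    private ScriptOrder() {}

    /** Keeps the scripts in the order they are named in the scripts parameter. */
    static List<String> declared(String scriptsParam) {
        List<String> names = new ArrayList<String>();
        for (String part : scriptsParam.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Sorts the script names, which is what the issue description currently describes. */
    static List<String> lexicographic(String scriptsParam) {
        List<String> names = declared(scriptsParam);
        Collections.sort(names);
        return names;
    }
}
```

For {{scriptB.js, scriptA.js}}, {{declared}} keeps {{scriptB.js}} first while {{lexicographic}} moves {{scriptA.js}} to the front, so only the declared order lets two chains run the same scripts in different sequences.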

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-23 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12804114#action_12804114
 ] 

Uri Boness commented on SOLR-1725:
--

bq. oh and one other thing - I really like this patch, Uri! I'm looking to 
integrate it into a data processing project here at JPL. Great idea!
Thanks :-)

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch, SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-19 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802496#action_12802496
 ] 

Uri Boness commented on SOLR-1725:
--

The DIH ScriptTransformer can really be cleaned up using this patch as well. I 
didn't add it to this patch as I didn't know whether it was a good idea to put 
too much into one patch. 

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, 
 SOLR-1725.patch





[jira] Created: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)
Script based UpdateRequestProcessorFactory
--

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness


A script based UpdateRequestProcessorFactory (Uses JDK6 script engine support). 
The main goal of this plugin is to be able to configure/write update processors 
without the need to write and package Java code.

The update request processor factory enables writing update processors in 
scripts located in the {{solr.solr.home}} directory. The factory accepts one 
(mandatory) configuration parameter named {{scripts}}, which accepts a 
comma-separated list of file names. It will look for these files under the 
{{conf}} directory in Solr home. When multiple scripts are defined, their 
execution order is defined by the lexicographical order of the script file names 
(so {{scriptA.js}} will be executed before {{scriptB.js}}).

The script language is resolved based on the script file extension (that is, a 
*.js file will be treated as a JavaScript script), therefore an extension is 
mandatory.
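
As a minimal sketch of the extension rule, assuming the extension is whatever follows the last dot in the file name: {{ScriptExtensions.extensionOf}} below is a hypothetical helper, not code from the patch. The returned extension is the kind of value that {{javax.script.ScriptEngineManager.getEngineByExtension(String)}} accepts.

```java
/** Hypothetical helper (not from the patch): derives the script language key from a file name. */
final class ScriptExtensions {
    private ScriptExtensions() {}

    /**
     * Returns the text after the last dot, e.g. "js" for "scriptA.js".
     * Rejects names without an extension, since the language could not be resolved.
     */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException(
                "Script file name needs an extension: " + fileName);
        }
        return fileName.substring(dot + 1);
    }
}
```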

Each script file is expected to have one or more methods with the same 
signature as the methods in the {{UpdateRequestProcessor}} interface. It is 
*not* required to define all methods, only those that are required by the 
processing logic.

The following variables are defined as global variables for each script:
 * {{req}} - The SolrQueryRequest
 * {{rsp}} - The SolrQueryResponse
 * {{logger}} - A logger that can be used for logging purposes in the script




[jira] Updated: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1725:
-

Attachment: SOLR-1725.patch

Initial implementation. Includes a simple test (probably more tests are 
required). Builds a script engine per script file - each file has its own scope.

This patch also introduces a new interface - {{SolrResourceLoaderAware}} - which 
can be used by any plugin loaded by SolrCore. (Any plugin implementing this 
interface will be injected with the resource loader of the SolrCore.) The 
ScriptUpdateRequestProcessorFactory uses the resource loader to load the 
scripts from solr home conf directory.
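
The injection idea can be sketched roughly as below. All names here are stand-ins chosen for illustration ({{AwareInjection}}, {{Loader}}, {{LoaderAware}}, {{setLoader}}); the actual method on {{SolrResourceLoaderAware}} is not shown in this message, so the callback signature is an assumption.

```java
import java.util.List;

/** Illustrative stand-ins only; none of these types come from the patch. */
final class AwareInjection {
    private AwareInjection() {}

    /** Stand-in for the resource loader that resolves files under the conf directory. */
    interface Loader {
        String describe(String resourceName);
    }

    /** Stand-in for the "aware" interface: a plugin implements it to receive the loader. */
    interface LoaderAware {
        void setLoader(Loader loader);
    }

    /** Hands the loader to every plugin that asks for it and returns how many did. */
    static int inject(Loader loader, List<?> plugins) {
        int injected = 0;
        for (Object plugin : plugins) {
            if (plugin instanceof LoaderAware) {
                ((LoaderAware) plugin).setLoader(loader);
                injected++;
            }
        }
        return injected;
    }
}
```

Plugins that do not implement the interface are simply skipped, which is what makes the interface opt-in.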

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801983#action_12801983
 ] 

Uri Boness commented on SOLR-1725:
--

bq. What about the existing ResourceLoaderAware?

Woops... missed that one out :-)... I'll check it out and update the patch

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801988#action_12801988
 ] 

Uri Boness commented on SOLR-1725:
--

Is there any reason for currently limiting the classes that can be 
ResourceLoaderAware? This limitation is explicit in SolrResourceLoader (line: 
584):

{code}
awareCompatibility.put(
  ResourceLoaderAware.class, new Class[] {
    CharFilterFactory.class,
    TokenFilterFactory.class,
    TokenizerFactory.class,
    FieldType.class
  }
);
{code}

If the type is not one of these classes, an exception is thrown.
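
A check of this kind could look roughly like the sketch below. It is not Solr's code: {{AwareCheck}}, the stub types and the exception type are assumptions made for illustration, and only the idea of a map from an "aware" interface to its permitted plugin types is taken from the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch, not Solr's implementation: whitelist of types allowed to be "aware". */
final class AwareCheck {

    /** Stand-ins for ResourceLoaderAware and two of the permitted plugin base types. */
    interface LoaderAware {}
    static class TokenizerFactoryStub {}
    static class FieldTypeStub {}

    private final Map<Class<?>, Class<?>[]> awareCompatibility =
        new HashMap<Class<?>, Class<?>[]>();

    AwareCheck() {
        awareCompatibility.put(LoaderAware.class, new Class<?>[] {
            TokenizerFactoryStub.class,
            FieldTypeStub.class
        });
    }

    /** Throws unless the plugin is an instance of a type permitted for the aware interface. */
    void assertAwareCompatibility(Class<?> aware, Object plugin) {
        Class<?>[] allowed = awareCompatibility.get(aware);
        if (allowed != null) {
            for (Class<?> type : allowed) {
                if (type.isInstance(plugin)) {
                    return;
                }
            }
        }
        throw new IllegalArgumentException(
            plugin.getClass().getName() + " may not implement " + aware.getName());
    }
}
```

Under this scheme, letting an update processor factory be loader-aware means adding its base type to the array, which is the list extension discussed in the follow-up comments.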

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801994#action_12801994
 ] 

Uri Boness commented on SOLR-1725:
--

Right... ok... I'll add another class to this list (I just don't understand why 
you would want to limit the types that can be *Aware)

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch





[jira] Issue Comment Edited: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801994#action_12801994
 ] 

Uri Boness edited comment on SOLR-1725 at 1/18/10 11:52 PM:


Right... ok... I'll add another class to this list (I just don't understand why 
you would want to limit the types that can be *Aware - in a way it defeats the 
whole idea of the *Aware abstraction).

  was (Author: uboness):
Right... ok... I'll add another class to this list (I just don't understand 
why would you want to limit the types that can be *Aware)
  
 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch





[jira] Updated: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1725:
-

Attachment: SOLR-1725.patch

A new patch; this time it leverages the already existing ResourceLoaderAware 
interface.

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802001#action_12802001
 ] 

Uri Boness commented on SOLR-1725:
--

Thanks for the reference

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802022#action_12802022
 ] 

Uri Boness commented on SOLR-1725:
--

Yes, it depends on Java 6. I guess the concern is mainly for the unit tests? 
(at runtime it shouldn't really matter)


 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch





[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802024#action_12802024
 ] 

Uri Boness commented on SOLR-1725:
--

Sorry... of course it matters for the build :-)

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch





[jira] Updated: (SOLR-1725) Script based UpdateRequestProcessorFactory

2010-01-18 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1725:
-

Attachment: SOLR-1725.patch

Third try :-), this time Java 5 compatible

 Script based UpdateRequestProcessorFactory
 --

 Key: SOLR-1725
 URL: https://issues.apache.org/jira/browse/SOLR-1725
 Project: Solr
  Issue Type: New Feature
  Components: update
Affects Versions: 1.4
Reporter: Uri Boness
 Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch





[jira] Updated: (SOLR-1716) Add logging support for ScriptTransformer

2010-01-12 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1716:
-

Attachment: SOLR-1716.patch

This patch puts the logger in the global scope, so you don't have to specify 
the logger as part of the function signature. It also cleans up the code of the 
related classes a bit.
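
As a rough picture of what "logger in the global scope" means for script 
authors, the sketch below evaluates a script whose function never declares 
logger as a parameter. It is plain JavaScript, not the DataImportHandler code: 
new Function stands in for the script engine bindings, and the addFlag function, 
the row shape and the logger stub are hypothetical.

```javascript
// Script text as a user might write it: `logger` is not a parameter of the function.
const scriptSource = `
  function addFlag(row) {
    logger.info("saw row " + row.id);
    row.flag = true;
    return row;
  }
`;

const messages = [];
const logger = { info: (msg) => messages.push(msg) };

// Stand-in for engine bindings: `logger` is bound once when the script is
// evaluated, so every function defined in the script can see it.
const addFlag = new Function("logger", scriptSource + "\nreturn addFlag;")(logger);

const out = addFlag({ id: 7 });
```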

 Add logging support for ScriptTransformer
 -

 Key: SOLR-1716
 URL: https://issues.apache.org/jira/browse/SOLR-1716
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Reporter: Uri Boness
 Fix For: 1.5

 Attachments: SOLR-1716.patch, SOLR-1716.patch


 Currently it's very hard to debug the logic embedded in the script run by the 
 ScriptTransformer. There should be a possibility to add a logger to the 
 function signature, which can be used for logging.




[jira] Commented: (SOLR-1716) Add logging support for ScriptTransformer

2010-01-12 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799186#action_12799186
 ] 

Uri Boness commented on SOLR-1716:
--

There is still one thing to improve here. Right now a ScriptTransformer is 
created for each function in the script, which means an engine is created for 
each function. This can be optimized by creating one script engine per 
EntityProcessor which each ScriptTransformer will use to execute a dedicated 
function.
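
The optimization described above can be sketched as follows. This is an 
illustration in plain JavaScript rather than the DataImportHandler code: 
createSharedEngine and createTransformer are hypothetical names, new Function 
stands in for a script engine, and the evaluations counter is only there to 
show that the script source is evaluated once no matter how many functions it 
defines.

```javascript
const scriptSource = `
  function trimName(row) { row.name = row.name.trim(); return row; }
  function addLength(row) { row.length = row.name.length; return row; }
`;

let evaluations = 0;

// One "engine" per entity: the script source is evaluated a single time and
// the named functions are handed back for later lookup.
function createSharedEngine(source, functionNames) {
  evaluations += 1;
  const exportsList = functionNames.map((n) => n + ": " + n).join(", ");
  return new Function(source + "\nreturn { " + exportsList + " };")();
}

// Each transformer only remembers which function to call in the shared engine.
function createTransformer(engine, functionName) {
  return (row) => engine[functionName](row);
}

const names = ["trimName", "addLength"];
const engine = createSharedEngine(scriptSource, names);
const transformers = names.map((n) => createTransformer(engine, n));
const result = transformers.reduce((row, t) => t(row), { name: "  solr " });
```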

 Add logging support for ScriptTransformer
 -

 Key: SOLR-1716
 URL: https://issues.apache.org/jira/browse/SOLR-1716
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Reporter: Uri Boness
 Fix For: 1.5

 Attachments: SOLR-1716.patch, SOLR-1716.patch





[jira] Created: (SOLR-1716) Add logging support for ScriptTransformer

2010-01-11 Thread Uri Boness (JIRA)
Add logging support for ScriptTransformer
-

 Key: SOLR-1716
 URL: https://issues.apache.org/jira/browse/SOLR-1716
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Reporter: Uri Boness
 Fix For: 1.5


Currently it's very hard to debug the logic embedded in the script run by the 
ScriptTransformer. There should be a possibility to add a logger to the 
function signature, which can be used for logging.




[jira] Commented: (SOLR-1698) load balanced distributed search

2010-01-11 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798851#action_12798851
 ] 

Uri Boness commented on SOLR-1698:
--

I think the patch doesn't work. I just checked out the trunk, and applying the 
patch fails with a conflict in LBHttpSolrServer.java.

 load balanced distributed search
 

 Key: SOLR-1698
 URL: https://issues.apache.org/jira/browse/SOLR-1698
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
 Attachments: SOLR-1698.patch, SOLR-1698.patch, SOLR-1698.patch, 
 SOLR-1698.patch


 Provide syntax and implementation of load-balancing across shard replicas.




[jira] Commented: (SOLR-1698) load balanced distributed search

2010-01-11 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798870#action_12798870
 ] 

Uri Boness commented on SOLR-1698:
--

yep.. that works

 load balanced distributed search
 

 Key: SOLR-1698
 URL: https://issues.apache.org/jira/browse/SOLR-1698
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
 Attachments: SOLR-1698.patch, SOLR-1698.patch, SOLR-1698.patch, 
 SOLR-1698.patch, SOLR-1698.patch





[jira] Commented: (SOLR-1716) Add logging support for ScriptTransformer

2010-01-11 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798884#action_12798884
 ] 

Uri Boness commented on SOLR-1716:
--

yeah, I thought about the global context as well, but this was just something 
that I implemented anyway (as I needed it myself) and it works. You don't have 
to supply the logger, but if you do you need to specify the full method 
signature, that is:
{code}
function(row, context, logger) {

}
{code}

but the following will work as well:
{code}
function(row, context) {

}
{code}
and
{code}
function(row) {

}
{code}
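
One reason all three forms can coexist is that JavaScript ignores surplus 
arguments, so a caller can always pass (row, context, logger) regardless of how 
many parameters the function declares. The sketch below shows only that 
language behaviour; the function names, the logger stub and the empty context 
object are placeholders, and it does not claim to reflect how the patch invokes 
the functions.

```javascript
const messages = [];
const logger = { info: (msg) => messages.push(msg) };
const context = {}; // stub, stands in for the transformer context

// Full signature: the logger is available as the third parameter.
function withLogger(row, context, logger) {
  logger.info("row " + row.id);
  return row;
}

// Shorter signatures: the extra arguments are silently dropped.
function withContext(row, context) { return row; }
function rowOnly(row) { return row; }

const row = { id: 1 };
const results = [withLogger, withContext, rowOnly].map((fn) => fn(row, context, logger));

// Function.length reports the number of declared parameters.
const arities = [withLogger.length, withContext.length, rowOnly.length];
```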


 Add logging support for ScriptTransformer
 -

 Key: SOLR-1716
 URL: https://issues.apache.org/jira/browse/SOLR-1716
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Reporter: Uri Boness
 Fix For: 1.5

 Attachments: SOLR-1716.patch





[jira] Commented: (SOLR-1716) Add logging support for ScriptTransformer

2010-01-11 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798896#action_12798896
 ] 

Uri Boness commented on SOLR-1716:
--

working on a new patch to put the logging in a global context (and cleaning up 
the code a bit)

 Add logging support for ScriptTransformer
 -

 Key: SOLR-1716
 URL: https://issues.apache.org/jira/browse/SOLR-1716
 Project: Solr
  Issue Type: Improvement
  Components: contrib - DataImportHandler
Reporter: Uri Boness
 Fix For: 1.5

 Attachments: SOLR-1716.patch





[jira] Commented: (SOLR-1705) Move QueryConvertor into SpellCheckComponent configuration

2010-01-06 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12797049#action_12797049
 ] 

Uri Boness commented on SOLR-1705:
--

Wouldn't QueryTokenizer be a more appropriate name for this class?

 Move QueryConvertor into SpellCheckComponent configuration
 --

 Key: SOLR-1705
 URL: https://issues.apache.org/jira/browse/SOLR-1705
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 1.5


 QueryConvertor is a top level XML tag in solrconfig.xml but it is used by 
 SpellCheckComponent only. Deprecate the current queryConvertor configuration 
 and move it inside SpellCheckComponent configuration.




[jira] Commented: (SOLR-1602) Refactor SOLR package structure to include o.a.solr.response and move QueryResponseWriters in there

2010-01-02 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795920#action_12795920
 ] 

Uri Boness commented on SOLR-1602:
--

I think it is very important to understand all sides here. 

I fully and totally support Chris's attempts to clean up the code base which 
rightfully involves moving classes from one package to another. I think in some 
cases such cleanups need to come at the cost of user comfort as eventually 
they, as users, also gain from it as the system as a whole becomes more robust, 
extensible and maintainable. The good thing is that besides the deprecation 
issue I believe there is a consensus about the required changes. So thumbs up 
Chris!!!

To deprecate or not to deprecate, that is the question. In a widely used 
library/framework/system with a large (or fast growing) install/user base such 
as the Solr community, the common practice is *not* to just break BWC without 
giving the users some grace period in which they can adjust their deployments 
to the new changes. Sometimes it's absolutely necessary (such as in the case of 
bug fixes), but when it's not, it can in general have the opposite effect to the 
one you want with the community: instead of appreciating your improvements and 
seeing Solr as a product that improves with time, they come to see Solr as an 
inconsistent and sometimes even unreliable product. So from my 
experience with delivering goods for the users, especially in the open source 
world (and I do have quite a bit of experience in that respect with the Spring 
Framework) you always need to strive for 100% BWC in theory and ~95% BWC in 
practice (never less than 90% though). If you stick to that, I believe changes 
will be widely accepted as improvements rather than harassments. 

But there's a catch here. In order to stick to these numbers, you have to 
adhere to two important conditions:

1. You need to have a rather solid architecture and code base to start with. If 
you don't, then naturally in the beginning you can expect many more 
extreme/major changes which lead to quite a few BWC breaks (it will gradually 
be reduced as the architecture/codebase improves). Whether Solr answers this 
condition is open for debate... there are a lot of solid parts in Solr and 
quite a few parts where a complete rewrite is appropriate.

2. You need to have steady and short release cycles. This is one thing that 
Solr lacks... big time. In a 1 year release cycle, deprecating code means that 
for the next year (in some cases two years), the code base will be messy with 
deprecated classes all over the place. In that respect, I can definitely 
understand Chris's objection to deprecation, as the cleanup tasks that he's 
implementing may end up creating more mess (at least for a long while) than you 
had before the cleanup altogether. I believe that moving to shorter release 
cycles (including bug-fix releases) will greatly help promote deprecation in 
general.

(NOTE: just a small note about the first condition. One thing to take into 
account is that *every* piece of software reaches a point in time where it 
needs to be completely re-written or at least go through a *major* 
refactoring/re-architecturing phase. This can be caused by many factors, be it 
new technologies that are introduced or simply limitations of the 
architecture that were not foreseen. It's very important to understand and 
admit this fact - even from the user point of view it's acceptable. What is 
not acceptable is for it to happen too many times and too frequently.) 

Bottom line, it's always a conflict between the user point of view and the 
developer point of view. And there needs to be a balance and understanding of 
both sides. Each side needs to understand and give in to some extent to create 
this balance. But to make it happen, the culture, environment and well defined 
policies need to be in place. Arguing endlessly about who's right here will never 
lead to a good outcome, simply because both sides are right and wrong at the 
same time. If you treat it as a black or white issue you'll end up losing 
something - either the user trust or better software. How about creating a 
proper release plan for the upcoming year, say a release every two months? 
Chris, if you have such a release schedule, will you feel more comfortable with 
deprecation?

 Refactor SOLR package structure to include o.a.solr.response and move 
 QueryResponseWriters in there
 ---

 Key: SOLR-1602
 URL: https://issues.apache.org/jira/browse/SOLR-1602
 Project: Solr
  Issue Type: Improvement
  Components: Response Writers
Affects Versions: 1.2, 1.3, 1.4
 Environment: independent of environment (code structure)
Reporter: Chris A. 

[jira] Commented: (SOLR-1298) FunctionQuery results as pseudo-fields

2009-12-23 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12794250#action_12794250
 ] 

Uri Boness commented on SOLR-1298:
--

{quote}I think they should be inline, as they are just values associated with a 
document. I think putting it in some other list is sticking too literally to 
what Lucene calls a field, which I don't think Solr has to do that. One could 
easily imagine a Solr component that brought in a database or other storage 
repository for supplementary fields and it should all be seamless to the 
client.{quote}

I definitely agree that one shouldn't see a field in Solr as a field in Lucene. 
That said, I think we do have a tendency to see a field in Solr as somehow bound 
to the Solr schema. 

One thing to notice is that eventually we end up with the same discussion 
regarding this feature in the context of different issues, be it 
highlighting or field collapsing. In some cases it feels just right to return 
the data as a field in a document, in other places it feels right to have it as 
something else. It is true that when you interact with Solr directly (especially 
if you do it manually) you certainly know what queries you send, what functions 
you request and what you should expect in the result. But from experience, a 
lot of times you try to automate things a bit, and creating well-structured 
and descriptive protocols is the safe way to enable that. 

{quote}I don't want to have to go look it up in some other list while I am 
iterating over my results when all the other values I'm displaying/using are 
right there associated with the document.{quote}

Having a sub-section under each document still associates it with the 
document. The way I see it, it's like OOP... you can have a Person class that 
holds all the information of the person as primitive fields, or you can 
group related data, like address info, in a separate Address class. 

{quote}That being said, it could be useful to add an attribute that indicates 
it is a generated name{quote}

That's one way to group fields together, but if you're already doing that, then 
why not go all the way? If you need to distinguish between generated and 
non-generated names, why not make it simpler and just separate the two in a 
different list? (To continue the analogies line I started above :-)) it's like 
XML, you can have a single level hierarchy where each element defines attributes 
to relate it to other elements, but a more suitable solution would just be to 
group all related elements under one parent element.
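
To make the two shapes under discussion concrete, here is a sketch of a single 
hit rendered both ways. The field names (id, title, dist), the generated marker 
and the functions sub-section are all hypothetical; neither shape is the format 
produced by any of the attached patches.

```javascript
// Option 1: the pseudo-field sits inline and is marked with an attribute.
const inlineDoc = {
  fields: [
    { name: "id", value: "doc1" },
    { name: "title", value: "Solr" },
    { name: "dist", value: 2.5, generated: true },
  ],
};

// Option 2: generated values are grouped under their own sub-section,
// which is still part of the same document.
function groupGenerated(doc) {
  const grouped = { fields: {}, functions: {} };
  for (const f of doc.fields) {
    const target = f.generated ? grouped.functions : grouped.fields;
    target[f.name] = f.value;
  }
  return grouped;
}

const groupedDoc = groupGenerated(inlineDoc);
```

With the grouped form a client can iterate over the stored fields without 
checking a marker on every entry, which is the automation argument made above.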

{quote}I'd even argue that highlighter results should be inline, too, but that 
is a different issue and a bigger can of worms since it has a well used API 
already.{quote}
In some cases it might be (well it just is) more appropriate to have the 
highlighting inlined. In other cases it might not be possible, especially with 
some of the latest requests to have highlighting functionality available for 
arbitrary text loaded from anywhere (which I believe will lead to a 
highlighting component/requestHandler that will be independent of the query 
component).

{quote}Not saying this is right or wrong, but I think it would be useful to 
document here the rationale about why not to do it. Is it just b/c that method 
is expected to do, more or less, what the Lucene IndexSearcher does?{quote}
I guess so... I guess SolrIndexSearcher is in fact a Lucene IndexSearcher which 
is the source for this association. In some ways I think it also relates a 
bit to the response structure (not directly though, but conceptually)... if the 
IndexSearcher represents Lucene and the document contains fields coming from 
other sources as well, perhaps this functionality of gathering all these fields 
(/metadata ;-)) should be done at a higher level where SolrIndexSearcher just 
serves as one field source. The main reason why Chris's patch puts this 
functionality in the doc() method of the SolrIndexSearcher is simply because 
it's the easiest and the simplest solution right now... and I don't think 
there's anything wrong with that... simple is good! Even with this solution as 
it is, the field sources are still abstracted away in the form of a 
FieldValues or DocumentMutator, so architecture-wise I don't see how leaving it 
as is will compromise anything.

 FunctionQuery results as pseudo-fields
 --

 Key: SOLR-1298
 URL: https://issues.apache.org/jira/browse/SOLR-1298
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1298-FieldValues.patch, SOLR-1298.patch


 It would be helpful if the results of FunctionQueries could be added as 
 fields to a document. 
 Couple of options here:
 1. Run FunctionQuery 

[jira] Commented: (SOLR-236) Field collapsing

2009-12-23 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12794252#action_12794252
 ] 

Uri Boness commented on SOLR-236:
-

{quote}If we are returning a number of documents (as opposed to a number of 
groups) to the user, how do they avoid splitting on a page in the middle of the 
group?{quote}

As far as I know (Martijn, correct me if I'm wrong), Martijn's patch returns 
the number of groups *and* documents, where each group is actually represented 
as a document. So in that sense, the total count applies to the result set as 
is (groups count as documents) and therefore pagination just works. 
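
A small sketch of why pagination "just works" when each group is represented as 
one document. This is not Martijn's algorithm: the collapse and page helpers, 
the site field and the collapsedCount property are hypothetical, and numFound 
is used here simply as the name for the total of the collapsed result set.

```javascript
// Collapse hits on a field, keeping the first hit per group plus a count of
// the hits that were folded into it.
function collapse(hits, field) {
  const groups = new Map();
  for (const hit of hits) {
    const key = hit[field];
    if (!groups.has(key)) {
      groups.set(key, { ...hit, collapsedCount: 0 });
    } else {
      groups.get(key).collapsedCount += 1;
    }
  }
  return [...groups.values()];
}

// Paginate over the collapsed list, so a page can never split a group.
function page(hits, field, start, rows) {
  const collapsed = collapse(hits, field);
  return { numFound: collapsed.length, docs: collapsed.slice(start, start + rows) };
}

const hits = [
  { id: 1, site: "a" }, { id: 2, site: "a" }, { id: 3, site: "b" },
  { id: 4, site: "c" }, { id: 5, site: "b" },
];
const secondPage = page(hits, "site", 2, 2);
```

Note that this sketch collapses the whole hit list before slicing, which is 
exactly what makes the total available; an algorithm that stops early would 
not know the final count.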

{quote}The only thing this algorithm can't do (related to pagination) is give 
the total number of documents after collapsing (and hence can't calculate the 
exact number of pages). This can be fine in many circumstances as long as the 
gui handles it (people don't seem to mind google doing it... I just tried it. 
Google didn't show the result count right unless displaying the last 
page).{quote}

First of all, I must admit that I never noticed that in Google, so I guess 
you're right :-). But when you think about it, with Google, how many times do 
you get a low hit count that only fits in 2-3 pages? Well, I hardly ever get 
it, and when I do I don't even bother to check the results; I just try to improve 
my search. With Solr, a lot of times it's different, especially when all these 
discovery features and faceting are so often used to narrow the search 
extensively... I'm not saying not having a perfect pagination mechanism is a 
problem... not at all, I'm just saying that it *might* be an issue for specific 
use cases or specific domains but that's just an assumption (or a gut 
feeling) :-)

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
 solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch


 This patch includes a new feature called Field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also Duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-236) Field collapsing

2009-12-22 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793554#action_12793554
 ] 

Uri Boness commented on SOLR-236:
-

bq. Why is it wrong. it is about adding meta-info to the docs. This is what we 
plan to do with SOLR-1566

This is exactly the point, it's not really meta-data over the document, but on 
the group the document belongs to. And you also need a more obvious way to mark 
this document as a group representation (to distinguish it from other normal 
documents).

bq. Even when we collapse what we are expecting is simple search results. So a 
drastic deviation from the standard format is not a good idea.
I definitely agree that BWC should be kept, especially here when we're dealing 
with a query component. But extending the current doc element doesn't mean 
we break BWC. Adding a collapse-info (or collapse-meta-data) sub-element to 
it will certainly not break anything, especially when we still don't have a 
formal xsd for the responses (I know we're working on it, but it's still not 
out there so it's safe).
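
To make the BWC argument concrete, here is a minimal Java sketch, not code from the patch. It assumes a simplified doc fragment and a hypothetical collapse-info sub-element with made-up attributes (field, count); none of those names are fixed anywhere in this thread. The reader below stands in for an existing client that only looks at the str children it already knows about.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class CollapseInfoBwcSketch {

    // Simplified <doc> fragment. The <collapse-info> element and its
    // attributes are hypothetical; nothing in the patch fixes these names.
    static final String DOC_XML =
          "<doc>"
        + "<str name=\"id\">42</str>"
        + "<str name=\"title\">Some title</str>"
        + "<collapse-info field=\"site\" count=\"7\"/>"
        + "</doc>";

    // Stand-in for an existing client: it reads the <str> children it
    // knows about and silently skips every other child element.
    static Map<String, String> readStringFields(String xml) {
        try {
            Element doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList children = doc.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child instanceof Element && "str".equals(child.getNodeName())) {
                    Element field = (Element) child;
                    fields.put(field.getAttribute("name"), field.getTextContent());
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse response fragment", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readStringFields(DOC_XML));
    }
}
```

A client written this way sees the same fields with or without the extra element. A client that fails on unknown children would of course break, which is why a formal xsd would change the picture.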






[jira] Commented: (SOLR-236) Field collapsing

2009-12-22 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793565#action_12793565
 ] 

Uri Boness commented on SOLR-236:
-

@Yonik

As far as I understand from your collapse algorithm proposal, in order to save 
memory you'd like to restrict the group creation to only those that belong in 
the requested results page. Beyond losing the faceting support over the 
collapsed DocSet, I think there might be a problem with pagination as well. For 
every page you'll end up with a different total count and therefore a different 
number of pages. This can be very confusing from the user's perspective - imagine 
going to the first page and calculating (and displaying) that you have 3 pages 
of results, then when the user asks for the second page, s/he gets a response 
with 2 pages and different total count. 
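
The arithmetic behind that concern can be sketched in a few lines of Java. This is only an illustration: the totals (25 and 18) and the page size are made-up numbers chosen to reproduce the "3 pages, then 2 pages" scenario, and pageCount is a hypothetical helper, not something from Yonik's proposal or the patch.

```java
final class PageCountSketch {

    // Pages needed to show totalCount entries at rowsPerPage entries per page.
    static int pageCount(int totalCount, int rowsPerPage) {
        if (rowsPerPage <= 0) {
            throw new IllegalArgumentException("rowsPerPage must be positive");
        }
        return (totalCount + rowsPerPage - 1) / rowsPerPage; // ceiling division
    }

    public static void main(String[] args) {
        int rows = 10;
        // Hypothetical totals reported for the same query while serving
        // page 1 and page 2, when groups are only built per requested page.
        int totalReportedOnPage1 = 25;
        int totalReportedOnPage2 = 18;
        System.out.println("page 1 response implies "
            + pageCount(totalReportedOnPage1, rows) + " pages");
        System.out.println("page 2 response implies "
            + pageCount(totalReportedOnPage2, rows) + " pages");
    }
}
```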




[jira] Commented: (SOLR-236) Field collapsing

2009-12-21 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793411#action_12793411
 ] 

Uri Boness commented on SOLR-236:
-

@Shalin

I think mixing the collapse information with document fields is wrong. The 
collapse fields don't really belong to the document, but to the group the 
document represents, while the other fields do belong to it. The response format 
should somehow indicate this difference.




[jira] Commented: (SOLR-1668) Declarative configuration meta-data for Solr plugins

2009-12-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792454#action_12792454
 ] 

Uri Boness commented on SOLR-1668:
--

@Erik

{quote}
Also note that Ant's configuration mechanism isn't just with setters. A java 
task for example can take any number of sysproperty sub-elements, and they 
get injected via addSysproperty(Environment.Variable sysp). 
{quote}

System properties can be supported in 2 ways:
1. On the configuration level using an expression language (a la Spring... 
yes.. Spring supports it :-)). This means that in the schema you'll be able to 
configure properties like: stopWordFile=${conf.dir}/stopwords.txt. the 
conf.dir parameter can be replaced either from system properties, properties 
file, or other source. Eventually these properties  
2. Using another annotation (say, @SystemProperty) which indicates the value 
should first be taken from the system properties and then converted to the 
required data type
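
A minimal sketch of the first option (placeholder substitution), written in plain Java rather than against any Solr or Spring API. The class name, the resolve method, and the lookup order (explicit properties first, then JVM system properties) are all assumptions made for illustration; the patch does not define them.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlaceholderSketch {

    // Matches ${name} and captures the name.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} with a value from 'props', falling back to JVM
    // system properties. Unknown placeholders are left as-is so that a
    // later validation step can report them.
    static String resolve(String raw, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String value = props.containsKey(name)
                ? props.get(name)
                : System.getProperty(name);
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("conf.dir", "/opt/solr/conf"); // example value only
        System.out.println(resolve("${conf.dir}/stopwords.txt", props));
    }
}
```

Running the substitution as a separate phase before any setter is called keeps the plugin classes unaware of where the values came from.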

bq.It's the specifying of the converter class in the annotation that I don't 
like. It can be more implicit than that, like magic

The @Converter annotation is mainly aimed for user extensions. Indeed all the 
out-of-the-box plugins don't need to have it as default converters can be 
pre-registered to handle all the data types we need at the moment. For users 
who want to provide their own plugins, we need to provide them a simple 
mechanism to register converters and I found the @Converter annotation to be 
the simplest one.
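
To show how pre-registered defaults and an explicit @Converter could coexist, here is a small self-contained Java sketch. It is not the code from the attached patch: the ValueConverter interface, the shape of the @Converter annotation, the DEFAULTS registry, and ExamplePlugin are all hypothetical and only illustrate the idea.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class ConverterSketch {

    // Hypothetical converter contract: text from the config in, typed value out.
    interface ValueConverter { Object convert(String text); }

    // Hypothetical annotation naming the converter to use for one setter.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Converter { Class<? extends ValueConverter> value(); }

    // Defaults pre-registered by parameter type, so built-in plugins need no annotation.
    static final Map<Class<?>, ValueConverter> DEFAULTS = new HashMap<>();
    static {
        DEFAULTS.put(String.class, text -> text);
        DEFAULTS.put(int.class, Integer::valueOf);
        DEFAULTS.put(boolean.class, Boolean::valueOf);
    }

    // A user-supplied converter for a type that has no default.
    static final class PatternConverter implements ValueConverter {
        public Object convert(String text) { return Pattern.compile(text); }
    }

    // Example POJO plugin: typed setters, no parsing logic inside.
    static final class ExamplePlugin {
        int group;
        Pattern pattern;
        public void setGroup(int group) { this.group = group; }
        @Converter(PatternConverter.class)
        public void setPattern(Pattern pattern) { this.pattern = pattern; }
    }

    // An explicit @Converter wins; otherwise fall back to the default for the type.
    static ValueConverter converterFor(Method setter) {
        Converter explicit = setter.getAnnotation(Converter.class);
        if (explicit != null) {
            try {
                return explicit.value().getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        Class<?> type = setter.getParameterTypes()[0];
        ValueConverter byType = DEFAULTS.get(type);
        if (byType == null) {
            throw new IllegalArgumentException("No converter for " + type);
        }
        return byType;
    }

    // Finds the single-argument setter for 'property' and calls it with the converted text.
    static void set(Object target, String property, String text) {
        String setterName = "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(setterName) && m.getParameterCount() == 1) {
                try {
                    m.invoke(target, converterFor(m).convert(text));
                    return;
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("No setter for " + property);
    }

    public static void main(String[] args) {
        ExamplePlugin plugin = new ExamplePlugin();
        set(plugin, "group", "2");
        set(plugin, "pattern", "\\s+");
        System.out.println(plugin.group + " " + plugin.pattern);
    }
}
```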

bq. We'd have setStopWordList(SolrFile f), and we'd only that setter after the 
system properties were in the mix. 
As you said, I believe once we have system properties supported this will be a 
no brainer and indeed I believe this belongs to an earlier properties 
substitution phase (as mentioned above).

@Noble

bq. is there anyone building it?
Oh yes :-), but beyond that, this will open up opportunities to develop plugins 
to IDE's/TextEditors for Solr... even just for better support in writing the 
schema files with auto-completion, validation, etc...

bq. Why do we need this magic in String- Object conversion at all? 
Well, my obvious response is because of the nature of Solr configuration which 
is text based while at runtime you're dealing with other data types. Of course 
you can just create String setters and do the conversion yourself, but why do 
that if you can have it done automatically and keep your classes clean? Just to be 
clear, the magic is not really magic: we can be very clear about what 
converters are supported out of the box and (as I mentioned above) with the 
@Converter annotation users can be more explicit in how they want the 
conversion to take place. Bottom line, at the end of the day you want to be 
able to focus and write the plugins as POJOs using properties of the correct 
data types and focus on the plugin's logic rather than also focusing on 
configuration logic. 




 Declarative configuration meta-data for Solr plugins
 

 Key: SOLR-1668
 URL: https://issues.apache.org/jira/browse/SOLR-1668
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Uri Boness
Priority: Minor
 Fix For: 1.5

 Attachments: commons-beanutils-1.8.2.jar, SOLR-1668.patch


 The idea here is for plugins in Solr to carry more meta data over their 
 configuration. This can be very useful for building tools around Solr where 
 this meta data can be used to assist users in configuring solr. One common 
 mechanism to provide this meta data is by using standard Java Beans for the 
 different configuration constructs where the properties define the 
 configurable attributes and annotations are used to provide extra information 
 about them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-236) Field collapsing

2009-12-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792458#action_12792458
 ] 

Uri Boness commented on SOLR-236:
-

bq. I'm curious as to whether anyone has just thought of using the Clustering 
component for this? If your collapse field was a single token, I wonder if 
you would get the results you're looking for.

The main difference between the two components is that while the clustering 
works more as a function where the input is the doclist/docset and the output 
is a separate data structure representing the groups, the collapse component 
operates directly on the docset & doclist, modifies them, and incorporates the 
groups within the final search result.

In all occurrences where we found the need for the collapse component, we 
needed to incorporate the grouping within the search result, and adjust the 
sorting and the pagination accordingly. As far as I know you cannot do that 
with the clustering component. This tight integration with the result is also 
the reason why the collapse component right now is actually a replacement to 
the query component.
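
As a rough illustration of what "incorporating the groups within the search result" means, here is a simplified Java sketch. It is not the patch's algorithm and does not use DocSet/DocList: Doc, Group, and collapse are hypothetical names, and the logic covers only the simplest case (keep the first, best-ranked document per field value and count the rest).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CollapseSketch {

    // Hypothetical stand-in for a search hit; 'site' plays the role of the collapse field.
    static final class Doc {
        final String id;
        final String site;
        Doc(String id, String site) { this.id = id; this.site = site; }
    }

    // One entry of the collapsed result: the representative doc plus group info.
    static final class Group {
        final Doc head;
        int collapsedCount; // docs hidden behind the head
        Group(Doc head) { this.head = head; }
    }

    // Takes hits in rank order and returns the collapsed list, still in rank
    // order. The result list itself changes, so sorting and paging operate on groups.
    static List<Group> collapse(List<Doc> ranked) {
        Map<String, Group> bySite = new LinkedHashMap<>();
        for (Doc doc : ranked) {
            Group group = bySite.get(doc.site);
            if (group == null) {
                bySite.put(doc.site, new Group(doc));
            } else {
                group.collapsedCount++;
            }
        }
        return new ArrayList<>(bySite.values());
    }

    public static void main(String[] args) {
        List<Doc> ranked = Arrays.asList(
            new Doc("a", "x.org"), new Doc("b", "y.org"),
            new Doc("c", "x.org"), new Doc("d", "x.org"));
        for (Group g : collapse(ranked)) {
            System.out.println(g.head.id + " (+" + g.collapsedCount + " more from " + g.head.site + ")");
        }
    }
}
```

A clustering-style function would instead leave the ranked list untouched and return the groups as a separate structure next to it.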




[jira] Commented: (SOLR-236) Field collapsing

2009-12-18 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792686#action_12792686
 ] 

Uri Boness commented on SOLR-236:
-

Essentially it boils down to two options:

# Keep it out of the trunk, in which case users that will need this 
functionality will only get it by working with a patched Solr version of their 
own, or use a branch (in both cases, most likely they will miss the continuous 
work done on the trunk unless they keep on merging the changes)
# Keep it in the trunk with some caveats, in which case users have a chance 
to use this functionality out of the box

In both cases, the user has a choice to make:
- be satisfied by the performance of this feature
- look for an alternative solution (other products)
- give up this functionality altogether (if their business requirements allow 
that)

So the main difference here I would say is in how easy you'd like to provide 
this functionality to the users. On the Solr development part, indeed once this 
is committed to the trunk there's much more responsibility on the committers to 
make it work (enhance performance and fix bugs)... but this is a *good* thing 
as there is a high demand for this feature and as a community driven project 
this demand should be satisfied. And I *do* think that the number of users 
using this patch already is a good indicator that it is good enough for quite a 
lot of use cases.

I do agree though that before committing anything, the public API should be 
re-evaluated to minimize chances for BWC issues later on. BTW, regarding the 
response, Solr already has a few places where the response format is still 
marked as experimental and as subject to changes in the future (but it doesn't 
stop people from using this functionality as they take the responsibility to 
adapt to any such future changes when they come).

Now... writing this, it suddenly occurred to me that there might be another 
solution to this whole discussion which is in a way a combination of many of the 
suggestions in this thread. What if this patch were split in two: the 
changes to the core and the component itself. Now, if the changes to the core 
are not that drastic and make sense (or at least everyone can live with them) 
then perhaps they can be committed to the trunk. As for the rest of the patch 
(which consists of the search components and its other supporting classes), 
this can be put in SVN as a separate branch for contrib. The good thing about 
this solution is that the work done on this functionality will be in SVN so you 
benefit from it as David mentioned above. The other benefit is that with this 
layout you can actually build the branched code base separately and distribute 
this functionality as a separate jar which can be deployed in Solr 1.5x 
distribution. Again, a bit of work left to the users (too much for my taste) but 
at least they're not forced to use a patched version of Solr. Would that be a 
possible solution?


[jira] Updated: (SOLR-1668) Declarative configuration meta-data for Solr plugins

2009-12-18 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1668:
-

Attachment: SOLR-1668.patch

In this patch I removed the need for the @InitProperty annotation. Instead any 
setter in the class will be considered as an initialization property. You can 
use the @Required annotation to mark properties as mandatory and the 
@ArgumentName to customize the name of the argument used to initialize it.
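
The convention described above can be sketched with plain reflection. This is an illustration under assumptions, not the patch itself: the annotation definitions here are my own minimal versions of @Required and @ArgumentName, ExampleFactory and init are hypothetical, and only String setters are handled to keep it short (the patch relies on commons-beanutils for the real work).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

final class SetterInitSketch {

    // Minimal stand-ins for the annotations named in the comment above.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Required {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ArgumentName { String value(); }

    // Hypothetical plugin: every setter is an initialization property.
    static final class ExampleFactory {
        String pattern;
        String group = "0";
        @Required
        public void setPattern(String pattern) { this.pattern = pattern; }
        @ArgumentName("grp")
        public void setGroup(String group) { this.group = group; }
    }

    // Feeds the args map into the String setters of 'plugin'.
    static void init(Object plugin, Map<String, String> args) {
        for (Method m : plugin.getClass().getMethods()) {
            String methodName = m.getName();
            boolean isStringSetter = methodName.startsWith("set")
                && methodName.length() > 3
                && m.getParameterCount() == 1
                && m.getParameterTypes()[0] == String.class;
            if (!isStringSetter) {
                continue;
            }
            // Default argument name: setter name without "set", first letter lowercased.
            ArgumentName custom = m.getAnnotation(ArgumentName.class);
            String argName = (custom != null)
                ? custom.value()
                : Character.toLowerCase(methodName.charAt(3)) + methodName.substring(4);
            String value = args.get(argName);
            if (value == null) {
                if (m.isAnnotationPresent(Required.class)) {
                    throw new IllegalArgumentException("Missing required argument: " + argName);
                }
                continue; // optional property keeps its default
            }
            try {
                m.invoke(plugin, value);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("pattern", "\\s+");
        config.put("grp", "1");
        ExampleFactory factory = new ExampleFactory();
        init(factory, config);
        System.out.println(factory.pattern + " " + factory.group);
    }
}
```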




[jira] Commented: (SOLR-236) Field collapsing

2009-12-17 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792189#action_12792189
 ] 

Uri Boness commented on SOLR-236:
-

{quote}
Grant, this patch may not be perfect but I think we all agree that it is a 
great start. This is stable, used by many and has been well supported by the 
community. This is also a large patch and as I have known from my 
DataImportHandler experience, maintaining a large patch is quite a pain (and 
DataImportHandler didn't even touch the core). How about we commit this (after 
some review, of course), mark this as experimental (no guarantees of any sort) 
and then start improving it one issue at a time? Alternately, if you are not 
comfortable adding it to trunk, we can commit this on a branch and merge into 
trunk later.
{quote}

I think managing a separate branch will be just as hard as managing a patch. I 
do however agree that it's about time this patch is committed to the 
trunk. Even though the current solution is not scalable in terms of distributed 
search (and I agree that the current solution for that is not really a viable 
solution), many are already using it and it is the most wanted feature in JIRA 
after all. One thing you can do is apply the changes to the core (which are 
not really many) and commit the rest of the patch as a contrib (along with all 
the disclaimers Shalin mentioned above). 




[jira] Updated: (SOLR-1668) Declarative configuration meta-data for Solr plugins

2009-12-17 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1668:
-

Attachment: commons-beanutils-1.8.2.jar
SOLR-1668.patch

This patch provides Java Bean configuration for all MapInitializedPlugins. To 
showcase this functionality, I changed the TokenizerFactory to implement the 
MapInitializedPlugin interface and changed the PatternTokenizerFactory to use 
the new Java Bean configuration. This implementation depends on the 
commons-beanutils library which should be added to the lib directory.
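
To illustrate the setter-based idea, here is a minimal sketch. It is not the 
code from SOLR-1668.patch: it uses plain reflection instead of 
commons-beanutils, and BeanConfigurer, ExampleTokenizerConfig and the property 
names are hypothetical. It only shows how a map of init args could be copied 
onto bean setters.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

final class BeanConfigurer {

    // Hypothetical configuration bean; the properties are made up for this sketch.
    public static class ExampleTokenizerConfig {
        private String pattern;
        private int group = -1;

        public String getPattern() { return pattern; }
        public void setPattern(String pattern) { this.pattern = pattern; }

        public int getGroup() { return group; }
        public void setGroup(int group) { this.group = group; }
    }

    // Copies each entry of initArgs onto the matching public setter of bean.
    static void configure(Object bean, Map<String, String> initArgs) {
        for (Method m : bean.getClass().getMethods()) {
            String name = m.getName();
            if (!name.startsWith("set") || name.length() <= 3 || m.getParameterCount() != 1) {
                continue;
            }
            // setPattern -> pattern
            String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
            String raw = initArgs.get(property);
            if (raw == null) {
                continue; // not configured, keep the bean's default
            }
            try {
                m.invoke(bean, convert(raw, m.getParameterTypes()[0]));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot set property " + property, e);
            }
        }
    }

    // Only the few types needed by this sketch are converted.
    private static Object convert(String raw, Class<?> type) {
        if (type == int.class || type == Integer.class) return Integer.valueOf(raw);
        if (type == boolean.class || type == Boolean.class) return Boolean.valueOf(raw);
        return raw;
    }

    public static void main(String[] unused) {
        Map<String, String> initArgs = new HashMap<>();
        initArgs.put("pattern", "\\s+");
        initArgs.put("group", "2");
        ExampleTokenizerConfig config = new ExampleTokenizerConfig();
        configure(config, initArgs);
        System.out.println(config.getPattern() + " / " + config.getGroup());
    }
}
```

Because the configurable attributes are ordinary bean properties, a tool can 
discover them by inspecting the class without instantiating the plugin.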

 Declarative configuration meta-data for Solr plugins
 

 Key: SOLR-1668
 URL: https://issues.apache.org/jira/browse/SOLR-1668
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Uri Boness
Priority: Minor
 Fix For: 1.5

 Attachments: commons-beanutils-1.8.2.jar, SOLR-1668.patch


 The idea here is for plugins in Solr to carry more meta data over their 
 configuration. This can be very useful for building tools around Solr where 
 this meta data can be used to assist users in configuring solr. One common 
 mechanism to provide this meta data is by using standard Java Beans for the 
 different configuration constructs where the properties define the 
 configurable attributes and annotations are used to provide extra information 
 about them.




[jira] Commented: (SOLR-1668) Declarative configuration meta-data for Solr plugins

2009-12-17 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792312#action_12792312
 ] 

Uri Boness commented on SOLR-1668:
--

Thanks! Well... no, it's not Ant or Spring yet, but it's a start that can 
already help with Tokenizers & Filters. The current patch is actually based on 
setters, but adding annotations on top of them can add even more meta data: for 
example, marking a property as required, or associating a different 
configuration name, perhaps to differentiate user-friendly naming from 
code-friendly naming (how does Ant deal with this stuff?).
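
A rough sketch of what annotations on top of setters could look like. The 
@ConfigProperty annotation, its attributes and ExampleFilterConfig are 
hypothetical names invented for this example; nothing like this exists in the 
patch yet.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AnnotatedConfigDemo {

    // Hypothetical annotation, not part of Solr or of the attached patch.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ConfigProperty {
        String name() default "";         // user-friendly name; empty means "derive from setter"
        boolean required() default false; // whether the property must be configured
    }

    // Hypothetical filter configuration bean.
    public static class ExampleFilterConfig {
        private int minLength;
        private boolean lowerCase;

        @ConfigProperty(name = "min-length", required = true)
        public void setMinLength(int minLength) { this.minLength = minLength; }

        @ConfigProperty
        public void setLowerCase(boolean lowerCase) { this.lowerCase = lowerCase; }
    }

    // Resolves the configuration name of an annotated setter.
    static String configName(Method setter) {
        ConfigProperty meta = setter.getAnnotation(ConfigProperty.class);
        if (!meta.name().isEmpty()) return meta.name();
        String n = setter.getName(); // setLowerCase -> lowerCase
        return Character.toLowerCase(n.charAt(3)) + n.substring(4);
    }

    // Lists the required configuration names that are absent from args, sorted.
    static List<String> missingRequired(Class<?> type, Map<String, String> args) {
        List<String> missing = new ArrayList<>();
        for (Method m : type.getMethods()) {
            ConfigProperty meta = m.getAnnotation(ConfigProperty.class);
            if (meta != null && meta.required() && !args.containsKey(configName(m))) {
                missing.add(configName(m));
            }
        }
        Collections.sort(missing);
        return missing;
    }

    public static void main(String[] unused) {
        System.out.println(missingRequired(ExampleFilterConfig.class, new HashMap<String, String>()));
    }
}
```

A configuration tool could call something like missingRequired before creating 
the plugin and report the user-friendly names back to the user.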

 Declarative configuration meta-data for Solr plugins
 

 Key: SOLR-1668
 URL: https://issues.apache.org/jira/browse/SOLR-1668
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Uri Boness
Priority: Minor
 Fix For: 1.5

 Attachments: commons-beanutils-1.8.2.jar, SOLR-1668.patch


 The idea here is for plugins in Solr to carry more meta data over their 
 configuration. This can be very useful for building tools around Solr where 
 this meta data can be used to assist users in configuring solr. One common 
 mechanism to provide this meta data is by using standard Java Beans for the 
 different configuration constructs where the properties define the 
 configurable attributes and annotations are used to provide extra information 
 about them.




[jira] Commented: (SOLR-17) XSD for solr requests/responses

2009-12-15 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791023#action_12791023
 ] 

Uri Boness commented on SOLR-17:


Having well defined XSD's for public services can be *extremely* helpful in 
many aspects... together with proper version management they define the 
contract between the users and the service. Some of the use cases that Chris 
listed above are definitely valid and realistic. Moreover, XSD provides a 
natural and proper documentation for the supported formats which any decent xml 
editor can make use of and provide you with hints for writing the 
solrconfig.xml and the schema.xml (for example). 

That said... most of the xml formats in Solr are too generic to benefit from 
XSD's. The only format where it makes sense is the schema.xml, as it has an 
expressive domain-driven structure. Unfortunately this is something you cannot 
say for the response formats and the solrconfig.xml, where the 
expressiveness lies within the *values* of the elements/attributes rather than 
in the element/attribute *names* themselves. XSD doesn't handle 
element/attribute values very well.



 XSD for solr requests/responses
 ---

 Key: SOLR-17
 URL: https://issues.apache.org/jira/browse/SOLR-17
 Project: Solr
  Issue Type: Improvement
Reporter: Mike Baranczak
Priority: Minor
 Attachments: solr-complex.xml, solr-rev2.xsd, solr.xsd, 
 UselessRequestHandler.java


 Attaching an XML schema definition for the responses and the update requests. 
 I needed to do this for myself anyway, so I might as well contribute it to 
 the project.
 At the moment, I have no plans to write an XSD for the config documents, but 
 it wouldn't be a bad idea.
 TODO: change the schema URL. I'm guessing that Apache already has some sort 
 of naming convention for these?




[jira] Commented: (SOLR-17) XSD for solr requests/responses

2009-12-15 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791126#action_12791126
 ] 

Uri Boness commented on SOLR-17:


{quote}
However, as a start, I think contributing and committing the SOLR XML response 
writer output XSD (and a DTD, which I'll attach) is something that adds value, 
doesn't take anything away, or touch other parts of the code, etc., and is 
worthwhile to do.
{quote}

Fair enough. I guess it can always serve as a reference for better understanding 
what to expect from a Solr response (instead of trying to figure things out 
from the code). The good thing about this generic format is that it's unlikely to 
change that frequently, so the XSD's will probably not change that often 
either.

 XSD for solr requests/responses
 ---

 Key: SOLR-17
 URL: https://issues.apache.org/jira/browse/SOLR-17
 Project: Solr
  Issue Type: Improvement
Reporter: Mike Baranczak
Priority: Minor
 Attachments: solr-complex.xml, solr-rev2.xsd, solr.xsd, 
 UselessRequestHandler.java


 Attaching an XML schema definition for the responses and the update requests. 
 I needed to do this for myself anyway, so I might as well contribute it to 
 the project.
 At the moment, I have no plans to write an XSD for the config documents, but 
 it wouldn't be a bad idea.
 TODO: change the schema URL. I'm guessing that Apache already has some sort 
 of naming convention for these?




[jira] Commented: (SOLR-1649) Refactor all ResponseWriters to be more extension-friendly

2009-12-13 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789884#action_12789884
 ] 

Uri Boness commented on SOLR-1649:
--

I think the class hierarchy needs to change. see: 
https://issues.apache.org/jira/browse/SOLR-1123?focusedCommentId=12711133&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12711133

 Refactor all ResponseWriters to be more extension-friendly
 --

 Key: SOLR-1649
 URL: https://issues.apache.org/jira/browse/SOLR-1649
 Project: Solr
  Issue Type: Improvement
  Components: Response Writers
Affects Versions: 1.4
 Environment: My local MacBook pro over the Christmas break.
Reporter: Chris A. Mattmann
 Fix For: 1.5


 I'd like to refactor all the ResponseWriters to be a bit less brittle. 
 ResponseWriters should follow a standard interface with more existing methods 
 than is currently present in the interface, and with lots of refactored 
 utility code and more concrete control/data flow. I'll take a hard look at 
 the existing response writers and try to generalize.
 See this thread for background:
 http://www.lucidimagination.com/search/document/e8bb6cac84c1f520/namespaces_in_response_solr_1586#cc50ba9e9d8fe2dc




[jira] Commented: (SOLR-1298) FunctionQuery results as pseudo-fields

2009-12-13 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789889#action_12789889
 ] 

Uri Boness commented on SOLR-1298:
--

{quote}
I certainly can. I hadn't thought about having a function as an fl parameter 
value, but that makes a lot of sense and I can support that through my work as 
well. I'll work on extracting the code today and will get a patch here ASAP.
{quote}

As far as I recall, specifying the functions in the fl parameter 
should still work with the FieldValueSource as it is at the moment. The 
registry enables you to register any value for any string key; in this case the 
string key is the function.

 FunctionQuery results as pseudo-fields
 --

 Key: SOLR-1298
 URL: https://issues.apache.org/jira/browse/SOLR-1298
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.5


 It would be helpful if the results of FunctionQueries could be added as 
 fields to a document. 
 Couple of options here:
 1. Run FunctionQuery as part of relevance score and add that piece to the 
 document
 2. Run the function (not really a query) during Document/Field retrieval




[jira] Commented: (SOLR-1298) FunctionQuery results as pseudo-fields

2009-12-13 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789890#action_12789890
 ] 

Uri Boness commented on SOLR-1298:
--

Chris, another thing. You might want to update the FieldValueSource solution to 
work with SOLR-1644 (instead of the request context)

 FunctionQuery results as pseudo-fields
 --

 Key: SOLR-1298
 URL: https://issues.apache.org/jira/browse/SOLR-1298
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.5


 It would be helpful if the results of FunctionQueries could be added as 
 fields to a document. 
 Couple of options here:
 1. Run FunctionQuery as part of relevance score and add that piece to the 
 document
 2. Run the function (not really a query) during Document/Field retrieval




[jira] Commented: (SOLR-1298) FunctionQuery results as pseudo-fields

2009-12-13 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789908#action_12789908
 ] 

Uri Boness commented on SOLR-1298:
--

I like the idea of giving the providers a broader context (document, request, 
response). This will also allow them to operate on multiple documents in the 
response (whether it's the docset or the doclist).

One thing to take into consideration here is that once you introduce dependencies 
between the fields, there must be a way to determine the ordering of the 
providers (as one provider might depend on fields generated by another 
provider).
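
One possible way to determine such an ordering is a topological sort over the 
declared dependencies. The sketch below is only an illustration: the provider 
names and the idea that each provider declares what it depends on are 
assumptions, not something that exists in any of the attached patches.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ProviderOrdering {

    // dependsOn maps a provider name to the providers whose fields it needs.
    // Returns an order in which every provider runs after its dependencies.
    static List<String> order(Map<String, List<String>> dependsOn) {
        Map<String, Integer> unmet = new TreeMap<>();            // provider -> unmet dependency count
        Map<String, List<String>> dependents = new HashMap<>();  // provider -> providers waiting on it
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            unmet.putIfAbsent(e.getKey(), 0);
            for (String dep : e.getValue()) {
                unmet.putIfAbsent(dep, 0);
                unmet.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : unmet.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> ordered = new ArrayList<>();
        while (!ready.isEmpty()) {
            String provider = ready.poll();
            ordered.add(provider);
            for (String waiting : dependents.getOrDefault(provider, Collections.<String>emptyList())) {
                if (unmet.merge(waiting, -1, Integer::sum) == 0) ready.add(waiting);
            }
        }
        if (ordered.size() != unmet.size()) {
            throw new IllegalStateException("Cyclic dependency between providers");
        }
        return ordered;
    }

    public static void main(String[] unused) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("distance", Collections.<String>emptyList());
        deps.put("boost", Collections.singletonList("distance"));
        System.out.println(order(deps));
    }
}
```

A cycle between providers cannot be ordered at all, so the sketch rejects it 
instead of picking an arbitrary order.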

As for the field AS alias syntax, I think this should be consistent with 
the work in SOLR-1351, which is currently based on localparams. Perhaps there 
should be a common approach to handling aliases in requests.

I think that the proper approach is to separate the stored fields from other 
fields, perhaps even putting them in a separate meta-data section under the 
document. But once you do that, again, for the sake of consistency, it would 
also be wise *not* to include these fields/functions in the fl parameter. So 
the fl parameter will refer to fields, and another parameter, meta, will 
refer to meta-data values.

bq. fl={!func}foo
+1 or even func:foo. Then you can have things like url:url or file:file 
path or even db:db alias + field

 FunctionQuery results as pseudo-fields
 --

 Key: SOLR-1298
 URL: https://issues.apache.org/jira/browse/SOLR-1298
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1298-FieldValues.patch, SOLR-1298.patch


 It would be helpful if the results of FunctionQueries could be added as 
 fields to a document. 
 Couple of options here:
 1. Run FunctionQuery as part of relevance score and add that piece to the 
 document
 2. Run the function (not really a query) during Document/Field retrieval




[jira] Commented: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder

2009-12-13 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789920#action_12789920
 ] 

Uri Boness commented on SOLR-1644:
--

bq. if (rb.store.get(HighlightingComponent.DO_HIGHLIGHTING) == Boolean.TRUE)

This is verbose... too verbose for my taste. I believe a Store interface can 
help here, which provides access to data by key and also provides helper 
methods to keep the code clean (a MapStore can be a simple implementation 
that wraps a Map<String, Object> instance):

{code}
public interface Context {

    Boolean getBoolean(String key);
    boolean getBoolean(String key, boolean defaultValue);

    Integer getInt(String key);
    int getInt(String key, int defaultValue);

    // other methods for all primitive types and dates.
}
{code}

so now you have:

{code}
if (rb.store.getBoolean(HighlightingComponent.DO_HIGHLIGHTING, false))
{code}

which is cleaner and is NPE-safe.

bq. I believe the public API's should have no dependency on components .
I agree. Basically avoid having circular dependencies. You don't want to change 
the platform API every time you introduce a new component.

 Provide a clean way to keep flags and helper objects in ResponseBuilder
 ---

 Key: SOLR-1644
 URL: https://issues.apache.org/jira/browse/SOLR-1644
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1644.patch


 Many components such as StatsComponent, FacetComponent etc keep flags and 
 helper objects in ResponseBuilder. Having to modify the ResponseBuilder for 
 such things is a very kludgy solution.
 Let us provide a clean way for components to keep arbitrary objects for the 
 duration of a (distributed) search request.




[jira] Issue Comment Edited: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder

2009-12-13 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789920#action_12789920
 ] 

Uri Boness edited comment on SOLR-1644 at 12/13/09 6:13 PM:


bq. if (rb.store.get(HighlightingComponent.DO_HIGHLIGHTING) == Boolean.TRUE)

This is verbose... too verbose for my taste. I believe a Store interface can 
help here, which provides access to data by key and also provides helper 
methods to keep the code clean (a MapStore can be a simple implementation 
that wraps a Map<String, Object> instance):

{code}
public interface Store {

    Boolean getBoolean(String key);
    boolean getBoolean(String key, boolean defaultValue);

    Integer getInt(String key);
    int getInt(String key, int defaultValue);

    // other methods for all primitive types and dates.
}
{code}

so now you have:

{code}
if (rb.store.getBoolean(HighlightingComponent.DO_HIGHLIGHTING, false))
{code}

which is cleaner and is NPE-safe.
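
For illustration, a MapStore along these lines could look roughly as follows. 
This is a sketch under assumptions, not code from SOLR-1644.patch: the get/put 
methods and the decision to treat a value of the wrong type as missing are my 
own additions.

```java
import java.util.HashMap;
import java.util.Map;

final class MapStoreDemo {

    // The Store interface from the comment, plus assumed raw get/put accessors.
    interface Store {
        Object get(String key);
        void put(String key, Object value);

        Boolean getBoolean(String key);
        boolean getBoolean(String key, boolean defaultValue);

        Integer getInt(String key);
        int getInt(String key, int defaultValue);
    }

    // Simple implementation that wraps a Map<String, Object>.
    static final class MapStore implements Store {
        private final Map<String, Object> values = new HashMap<>();

        public Object get(String key) { return values.get(key); }
        public void put(String key, Object value) { values.put(key, value); }

        public Boolean getBoolean(String key) {
            Object v = values.get(key);
            return v instanceof Boolean ? (Boolean) v : null; // wrong type is treated as missing
        }

        public boolean getBoolean(String key, boolean defaultValue) {
            Boolean v = getBoolean(key);
            return v != null ? v : defaultValue;
        }

        public Integer getInt(String key) {
            Object v = values.get(key);
            return v instanceof Integer ? (Integer) v : null;
        }

        public int getInt(String key, int defaultValue) {
            Integer v = getInt(key);
            return v != null ? v : defaultValue;
        }
    }

    public static void main(String[] unused) {
        Store store = new MapStore();
        // A missing key falls back to the default instead of risking an NPE.
        System.out.println(store.getBoolean("doHighlighting", false));
        store.put("doHighlighting", Boolean.TRUE);
        System.out.println(store.getBoolean("doHighlighting", false));
    }
}
```

The defaulting overloads are what make the call site NPE-safe: the caller never 
unboxes a null.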

bq. I believe the public API's should have no dependency on components .
I agree. Basically avoid having circular dependencies. You don't want to change 
the platform API every time you introduce a new component.

  was (Author: uboness):
bq. if (rb.store.get(HighlightingComponent.DO_HIGHLIGHTING) == Boolean.TRUE)

This is verbose... too verbose for my taste. I believe a Store interface can 
help here, which provides access to data by key and also provides helper 
methods to keep the code clean (a MapStore can be a simple implementation 
that wraps a Map<String, Object> instance):

{code}
public interface Context {

    Boolean getBoolean(String key);
    boolean getBoolean(String key, boolean defaultValue);

    Integer getInt(String key);
    int getInt(String key, int defaultValue);

    // other methods for all primitive types and dates.
}
{code}

so now you have:

{code}
if (rb.store.getBoolean(HighlightingComponent.DO_HIGHLIGHTING, false))
{code}

which is cleaner and is NPE-safe.

bq. I believe the public API's should have no dependency on components .
I agree. Basically avoid having circular dependencies. You don't want to change 
the platform API every time you introduce a new component.
  
 Provide a clean way to keep flags and helper objects in ResponseBuilder
 ---

 Key: SOLR-1644
 URL: https://issues.apache.org/jira/browse/SOLR-1644
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1644.patch


 Many components such as StatsComponent, FacetComponent etc keep flags and 
 helper objects in ResponseBuilder. Having to modify the ResponseBuilder for 
 such things is a very kludgy solution.
 Let us provide a clean way for components to keep arbitrary objects for the 
 duration of a (distributed) search request.




[jira] Commented: (SOLR-1123) Change the JSONResponseWriter content type

2009-12-13 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789922#action_12789922
 ] 

Uri Boness commented on SOLR-1123:
--

I think the main issue with the inheritance right now is that the 
QueryResponseWriter interface is dealing with a Writer rather than with an 
OutputStream. This accounts for the hacky GenericBinaryResponseWriter. 

Looking at SOLR-1516 I'm a bit confused. I always had the impression that the 
main idea behind the response writers is that all they need to know is how to 
marshal a NamedList (so they don't need explicit knowledge of documents, 
highlighting, etc...). But now the GenericTextResponseWriter knows about 
documents (via the SingleResponseWriter). But perhaps I just got it wrong.
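
To make the Writer vs. OutputStream point concrete, here is a sketch of a 
stream-based contract where text formats adapt by wrapping the stream. All 
type names below are hypothetical and the response is simplified to a Map; 
this is not how the current Solr writers are structured.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class StreamFirstWriterDemo {

    // Hypothetical stream-based contract: binary formats implement it directly.
    interface StreamResponseWriter {
        void write(OutputStream out, Map<String, Object> response) throws IOException;
    }

    // Text formats adapt to the stream contract by wrapping the stream once.
    abstract static class TextResponseWriter implements StreamResponseWriter {
        public final void write(OutputStream out, Map<String, Object> response) throws IOException {
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            writeText(writer, response);
            writer.flush();
        }

        protected abstract void writeText(Writer writer, Map<String, Object> response) throws IOException;
    }

    // Trivial text writer that only knows how to marshal key/value pairs.
    static final class KeyValueWriter extends TextResponseWriter {
        protected void writeText(Writer writer, Map<String, Object> response) throws IOException {
            for (Map.Entry<String, Object> e : response.entrySet()) {
                writer.write(e.getKey() + "=" + e.getValue() + "\n");
            }
        }
    }

    // Helper for the example: renders a response into a byte array.
    static byte[] render(StreamResponseWriter writer, Map<String, Object> response) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            writer.write(out, response);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] unused) {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("numFound", 3);
        System.out.print(new String(render(new KeyValueWriter(), response), StandardCharsets.UTF_8));
    }
}
```

Going from a stream to a Writer is a one-line wrap, while going from a Writer 
back to raw bytes is not possible in general, which is why the stream seems the 
better base contract.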

 Change the JSONResponseWriter content type
 --

 Key: SOLR-1123
 URL: https://issues.apache.org/jira/browse/SOLR-1123
 Project: Solr
  Issue Type: Improvement
Reporter: Uri Boness
 Fix For: 1.5

 Attachments: JSON_contentType_incl_tests.patch


 Currently the JSON content type is not used. Instead the text/plain content 
 type is used. The reason for this, as I understand it, is to enable viewing the 
 JSON response as text in the browser. While this is a valid argument, I do 
 believe that there should at least be an option to configure this writer to 
 use the JSON content type. According to 
 [RFC4627|http://www.ietf.org/rfc/rfc4627.txt] the json content type needs to 
 be application/json (and not text/x-json). The reason this can be very 
 helpful is that today you have plugins for browsers (e.g. 
 [JSONView|http://brh.numbera.com/software/jsonview]) that can render any page 
 with application/json content type in a user friendly manner (just like xml 
 is supported).




[jira] Commented: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder

2009-12-11 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789236#action_12789236
 ] 

Uri Boness commented on SOLR-1644:
--

Don't you have it already with SolrQueryRequest.getContext()?

 Provide a clean way to keep flags and helper objects in ResponseBuilder
 ---

 Key: SOLR-1644
 URL: https://issues.apache.org/jira/browse/SOLR-1644
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5


 Many components such as StatsComponent, FacetComponent etc keep flags and 
 helper objects in ResponseBuilder. Having to modify the ResponseBuilder for 
 such things is a very kludgy solution.
 Let us provide a clean way for components to keep arbitrary objects for the 
 duration of a (distributed) search request.




[jira] Commented: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder

2009-12-11 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789259#action_12789259
 ] 

Uri Boness commented on SOLR-1644:
--

It should also be possible to share objects between components to eliminate 
duplicate computations (if one component can re-use a computation that was 
already done in another component). I guess this can be supported by publishing 
the key as a public static field.

 Provide a clean way to keep flags and helper objects in ResponseBuilder
 ---

 Key: SOLR-1644
 URL: https://issues.apache.org/jira/browse/SOLR-1644
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1644.patch


 Many components such as StatsComponent, FacetComponent etc keep flags and 
 helper objects in ResponseBuilder. Having to modify the ResponseBuilder for 
 such things is a very kludgy solution.
 Let us provide a clean way for components to keep arbitrary objects for the 
 duration of a (distributed) search request.




[jira] Commented: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder

2009-12-11 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789290#action_12789290
 ] 

Uri Boness commented on SOLR-1644:
--

bq. We should not keep it static. public final should be good enough

Is there a special reason for this? Is the plan to have a KEY per component 
instance? If so, how would it be possible to refer to the key from other 
components?

This is what I had in mind - Assuming Component1 computed something and 
registered it in the store using KEY. Then Component2 can reuse this 
computation by accessing it as follows: 

{code}
Object someValue = rb.store.get(Component1.KEY);
// do something with someValue
{code}
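
A slightly fuller sketch of the same idea, with a plain Map standing in for 
rb.store. The component classes, the key string and the stored value are all 
hypothetical. The point is only that a public static key on the producing 
component lets any other component find the value.

```java
import java.util.HashMap;
import java.util.Map;

final class SharedKeyDemo {

    // Hypothetical producer: computes a value once and publishes it under KEY.
    static final class Component1 {
        public static final String KEY = "component1.termCount";

        static void process(Map<String, Object> store) {
            store.put(KEY, 42); // stands in for an expensive computation
        }
    }

    // Hypothetical consumer: reuses the producer's value via the producer's key.
    static final class Component2 {
        static int process(Map<String, Object> store) {
            Object shared = store.get(Component1.KEY);
            if (!(shared instanceof Integer)) {
                throw new IllegalStateException("Component1 has not run yet");
            }
            return ((Integer) shared) * 2;
        }
    }

    public static void main(String[] unused) {
        Map<String, Object> store = new HashMap<>();
        Component1.process(store);
        System.out.println(Component2.process(store));
    }
}
```

With a per-instance key (public final but not static), Component2 would first 
need a reference to the Component1 instance, which is the difficulty raised 
above.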

 Provide a clean way to keep flags and helper objects in ResponseBuilder
 ---

 Key: SOLR-1644
 URL: https://issues.apache.org/jira/browse/SOLR-1644
 Project: Solr
  Issue Type: Improvement
  Components: search
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 1.5

 Attachments: SOLR-1644.patch


 Many components such as StatsComponent, FacetComponent etc keep flags and 
 helper objects in ResponseBuilder. Having to modify the ResponseBuilder for 
 such things is a very kludgy solution.
 Let us provide a clean way for components to keep arbitrary objects for the 
 duration of a (distributed) search request.




[jira] Commented: (SOLR-1625) Add regexp support for TermsComponent

2009-12-09 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788047#action_12788047
 ] 

Uri Boness commented on SOLR-1625:
--

regexp vs. regex - I really don't know. I always use/d regexp, but I guess we 
need to come up with something that is consistent with Solr. The first thing 
that comes to mind with a regular expression configuration in Solr is the 
highlighting component and indeed it uses regex, so it's best to stick to 
that.

bq. have explicit strings like regex.flag=case_sensitive&regex.flag=multiline 
Yeah... I had this feeling as well, but I thought it might be too many extra 
parameters just for the regular expression support. If you think that's best I 
can add it.

I'll make the changes tonight and submit a new patch.

 Add regexp support for TermsComponent
 -

 Key: SOLR-1625
 URL: https://issues.apache.org/jira/browse/SOLR-1625
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Uri Boness
Assignee: Noble Paul
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1625.patch, SOLR-1625.patch


 At the moment the only way to filter the returned terms is by a prefix. It 
 would be nice if the filter could also be done by regular expression.




[jira] Updated: (SOLR-1625) Add regexp support for TermsComponent

2009-12-09 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1625:
-

Attachment: SOLR-1625.patch

Updated the patch to support the following changes (as discussed above):

- using terms.regex param (instead of terms.regexp)
- using more explicit names for the regex flags
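
As an illustration of explicit flag names, the sketch below maps names onto 
the java.util.regex.Pattern constants. The names used here (case_insensitive, 
multiline, dotall, comments) are assumptions for the example; the names 
actually accepted are whatever the attached patch defines.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

final class RegexFlagNames {

    // Assumed flag names mapped onto the Pattern constants.
    private static final Map<String, Integer> FLAGS = new HashMap<>();
    static {
        FLAGS.put("case_insensitive", Pattern.CASE_INSENSITIVE);
        FLAGS.put("multiline", Pattern.MULTILINE);
        FLAGS.put("dotall", Pattern.DOTALL);
        FLAGS.put("comments", Pattern.COMMENTS);
    }

    // Combines the named flags into the int expected by Pattern.compile.
    static int resolve(String... names) {
        int flags = 0;
        for (String name : names) {
            Integer flag = FLAGS.get(name.toLowerCase(Locale.ROOT));
            if (flag == null) {
                throw new IllegalArgumentException("Unknown regex flag: " + name);
            }
            flags |= flag;
        }
        return flags;
    }

    public static void main(String[] unused) {
        Pattern p = Pattern.compile("solr.*", resolve("case_insensitive"));
        System.out.println(p.matcher("SOLR-1625").matches());
    }
}
```

Rejecting unknown names early gives the user a clearer error than silently 
ignoring a misspelled flag.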

 Add regexp support for TermsComponent
 -

 Key: SOLR-1625
 URL: https://issues.apache.org/jira/browse/SOLR-1625
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Uri Boness
Assignee: Noble Paul
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1625.patch, SOLR-1625.patch, SOLR-1625.patch


 At the moment the only way to filter the returned terms is by a prefix. It 
 would be nice if the filter could also be done by regular expression.




[jira] Updated: (SOLR-1625) Add regexp support for TermsComponent

2009-12-08 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1625:
-

Attachment: SOLR-1625.patch

Added support for regexp hints based on the different constants in the Pattern 
class. The terms.regexp.hints parameter accepts an int value corresponding to 
the value passed to the Pattern.compile(String expression, int hints) factory 
method. 

Using hints it is now possible to support case insensitive patterns.
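
A small sketch of how an int hints value can reach Pattern.compile. The class 
and method names are hypothetical, and matching the whole term with matches() 
is an assumption made for the example rather than a statement about the patch.

```java
import java.util.regex.Pattern;

final class RegexHintsDemo {

    // hints is the raw parameter value, e.g. the value of terms.regexp.hints.
    static Pattern compile(String regex, String hints) {
        int flags = (hints == null || hints.isEmpty()) ? 0 : Integer.parseInt(hints);
        return Pattern.compile(regex, flags);
    }

    // Assumption for this sketch: a term is kept only if the whole term matches.
    static boolean accepts(String term, String regex, String hints) {
        return compile(regex, hints).matcher(term).matches();
    }

    public static void main(String[] unused) {
        String caseInsensitive = Integer.toString(Pattern.CASE_INSENSITIVE);
        System.out.println(accepts("SOLR-1625", "solr.*", null));
        System.out.println(accepts("SOLR-1625", "solr.*", caseInsensitive));
    }
}
```

Several hints can be combined by OR-ing the Pattern constants before passing 
the resulting int.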

 Add regexp support for TermsComponent
 -

 Key: SOLR-1625
 URL: https://issues.apache.org/jira/browse/SOLR-1625
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Uri Boness
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1625.patch, SOLR-1625.patch


 At the moment the only way to filter the returned terms is by a prefix. It 
 would be nice if the filter could also be done by regular expression.




[jira] Updated: (SOLR-343) Constraining date facets by facet.mincount

2009-12-05 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-343:


Attachment: SOLR-343.patch

Updated this patch to work with the current trunk and added tests

 Constraining date facets by facet.mincount
 --

 Key: SOLR-343
 URL: https://issues.apache.org/jira/browse/SOLR-343
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.2
 Environment: Solr 1.2+
Reporter: Raiko Eckstein
Priority: Minor
 Attachments: DateFacetsMincountPatch.patch, SOLR-343.patch


 It would be helpful to allow the facet.mincount parameter to work with date 
 facets, i.e. constraining the results so that it would be possible to filter 
 out date ranges in the results where no documents occur from the server-side. 




[jira] Created: (SOLR-1625) Add regexp support for TermsComponent

2009-12-05 Thread Uri Boness (JIRA)
Add regexp support for TermsComponent
-

 Key: SOLR-1625
 URL: https://issues.apache.org/jira/browse/SOLR-1625
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Uri Boness
Priority: Minor
 Fix For: 1.5


At the moment the only way to filter the returned terms is by a prefix. It 
would be nice if the filter could also be done by regular expression.




[jira] Updated: (SOLR-1625) Add regexp support for TermsComponent

2009-12-05 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1625:
-

Attachment: SOLR-1625.patch

 Add regexp support for TermsComponent
 -

 Key: SOLR-1625
 URL: https://issues.apache.org/jira/browse/SOLR-1625
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Uri Boness
Priority: Minor
 Fix For: 1.5

 Attachments: SOLR-1625.patch


 At the moment the only way to filter the returned terms is by a prefix. It 
 would be nice if the filter could also be done by regular expression.




[jira] Updated: (SOLR-1351) facet on same field different ways

2009-09-13 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1351:
-

Attachment: SOLR-1351.patch

Took the approach as described above. The only difference is that instead of 
the id parameter I reused the key parameter already supported by this 
component. The idea is that now, when the key local param is specified, all 
the specific facet params need to use the key instead of the field name.

{code}
q=*:*&facet=true&facet.field={!key=cat1}cat&f.cat1.facet.sort=true&f.cat1.facet.limit=20
&f.cat1.facet.mincount=1&facet.field={!key=cat2}cat&f.cat2.facet.sort=false&f.cat2.facet.count=0
{code}

This applies not only to simple field facets but also to date facets:

{code}
q=*:*&facet=true&facet.date={!key=foo}bday&f.foo.facet.date.start=1976-07-01T00:00:00.000Z
&f.foo.facet.date.end=1976-07-01T00:00:00.000Z+1MONTH&f.foo.facet.date.gap=+1DAY
&f.foo.facet.date.other=all&facet.date={!key=bar}bday
&f.bar.facet.date.end=1976-07-01T00:00:00.000Z+7DAY&f.bar.facet.date.gap=+1DAY
{code}
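
To show the lookup rule in code: the sketch below splits a facet.field value 
into key and field name, then resolves a per-facet parameter by key. It is a 
simplification written for this comment, not the code in SOLR-1351.patch. It 
handles only a single {!key=...} local param and assumes a fallback to the 
global parameter when no keyed one is given.

```java
import java.util.HashMap;
import java.util.Map;

final class KeyedFacetParams {

    // Returns {key, fieldName}; without a key local param the field name is the key.
    static String[] parse(String facetField) {
        String prefix = "{!key=";
        if (facetField.startsWith(prefix)) {
            int end = facetField.indexOf('}');
            if (end > 0) {
                return new String[] {
                    facetField.substring(prefix.length(), end),
                    facetField.substring(end + 1)
                };
            }
        }
        return new String[] { facetField, facetField };
    }

    // Looks up f.<key>.<name>, then the global <name>, then the default.
    static String param(Map<String, String> params, String key, String name, String defaultValue) {
        String value = params.get("f." + key + "." + name);
        if (value == null) {
            value = params.get(name);
        }
        return value != null ? value : defaultValue;
    }

    public static void main(String[] unused) {
        Map<String, String> params = new HashMap<>();
        params.put("f.cat1.facet.limit", "20");
        String[] keyAndField = parse("{!key=cat1}cat");
        System.out.println(keyAndField[0] + " -> " + keyAndField[1] + ", limit="
                + param(params, keyAndField[0], "facet.limit", "100"));
    }
}
```

Since the key defaults to the field name, requests that never use the key 
local param resolve their parameters exactly as before.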

 facet on same field different ways
 --

 Key: SOLR-1351
 URL: https://issues.apache.org/jira/browse/SOLR-1351
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
 Fix For: 1.5

 Attachments: SOLR-1351.patch


 There is a general need to facet on the same field in different ways 
 (different prefixes, different filters).  We need a way to express this.




[jira] Commented: (SOLR-1351) facet on same field different ways

2009-09-12 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12754506#action_12754506
 ] 

Uri Boness commented on SOLR-1351:
--

This is something that I've done in the distant past (Solr 1.2), and the way I see 
it, facets should be identified by a unique id rather than by the field name, 
and the facet results will then be grouped by these ids. I think this can be 
done by just adding one extra parameter of the form: 

{code}
f.fieldName.facet.id
{code}

This parameter will effectively mean that all other field-specific facet 
parameters will need to use this id instead of the field name. For example:

Assume we have a field called cat that represents a category. Right now 
(without an id) we can do:

{code}
q=*:*&facet=true&facet.field=cat&f.cat.facet.sort=true&f.cat.facet.limit=20&f.cat.facet.mincount=1
{code}

After introducing the id:

{code}
q=*:*&facet=true&facet.field=cat&f.cat.facet.id=category&f.category.facet.sort=true&f.category.facet.limit=20&f.category.facet.mincount=1
{code}

Now to support multiple configurations:

{code}
q=*:*&facet=true&facet.field=cat&f.cat.facet.id=cat1&f.cat1.facet.sort=true&f.cat1.facet.limit=20&f.cat1.facet.mincount=1&f.cat.facet.id=cat2&f.cat2.facet.sort=false&f.cat2.facet.count=0
{code}

Note that even after introducing the id param, backward compatibility can 
easily be maintained - we just determine that when the id param is not 
specified, the field name is the default id.

From experience, I can tell you that adding this feature not only will enable 
multiple facets on the same field, but IMO will also make it much easier to 
develop search clients and tools on top of Solr.

If this solution sounds reasonable, I can start working on a patch for it.
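
To make the backward-compatibility rule concrete, here is a small Java sketch of the lookup. It is only an illustration under stated assumptions: the class FacetIdSketch and its methods are hypothetical, the request parameters are modelled as a plain single-valued map, and the multi-configuration case (several ids for one field) is not covered.

```java
import java.util.HashMap;
import java.util.Map;

class FacetIdSketch {
    // Returns the id for a faceted field: the explicit f.<field>.facet.id if present,
    // otherwise the field name itself (the backward-compatible default).
    static String resolveId(Map<String, String> params, String field) {
        String explicit = params.get("f." + field + ".facet.id");
        return explicit != null ? explicit : field;
    }

    // Looks up a per-facet option such as "limit" under the resolved id.
    static String option(Map<String, String> params, String field, String name) {
        return params.get("f." + resolveId(params, field) + ".facet." + name);
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("f.cat.facet.limit", "20");
        // No id given: the option is found under the field name.
        System.out.println(option(params, "cat", "limit"));

        params.put("f.cat.facet.id", "category");
        params.put("f.category.facet.limit", "5");
        // Id given: the option is found under the id instead.
        System.out.println(option(params, "cat", "limit"));
    }
}
```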

 facet on same field different ways
 --

 Key: SOLR-1351
 URL: https://issues.apache.org/jira/browse/SOLR-1351
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
 Fix For: 1.5





[jira] Commented: (SOLR-1311) pseudo-field-collapsing

2009-09-12 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12754509#action_12754509
 ] 

Uri Boness commented on SOLR-1311:
--

Wouldn't it be an idea to try and merge this code with the original field 
collapsing patch? Quite a bit of work was done recently on that patch to make 
it more extensible. So for example, you now have a _Collapser_ interface that 
encapsulates the actual collapsing algorithm, and my guess is that your 
algorithm can probably fit there. Indeed when the corpus is large, adjacent 
field collapsing can turn into a performance issue, and having this pseudo 
algorithm seems to make a lot of sense. So for example, using the original 
field collapsing patch, it would be nice if we could just define another 
parameter called collapse.type which will hold one of three values: adjacent, 
pseudo-adjacent, and non-adjacent.

BTW, I haven't looked at your patch yet, so I don't know how well it works with 
faceting. But integrating it with the original patch will give you that 
support (i.e. before/after collapse facet counts) automatically.

 pseudo-field-collapsing
 ---

 Key: SOLR-1311
 URL: https://issues.apache.org/jira/browse/SOLR-1311
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4
Reporter: Marc Sturlese
 Fix For: 1.5

 Attachments: SOLR-1311-pseudo-field-collapsing.patch


 I am trying to develop a new way of doing field collapsing based on the 
 adjacent field collapsing algorithm. I have started developing it because I 
 am experiencing performance problems with the field collapsing patch with a big 
 index (8G).
 The algorithm does adjacent-pseudo-field collapsing. It does collapsing on the 
 first X documents. Instead of making the collapsed docs disappear, the 
 algorithm will send them to a given position of the relevance results list.
 The reason I just do collapsing in the first X documents is that if I have 
 for example 60 results and I am showing 10 results per page, I really 
 don't need to do collapsing on page 3, let alone on page 3000. Doing 
 this I am noticing dramatically better performance. The problem is I couldn't 
 find a way to plug the algorithm in as a component and keep good performance. I 
 had to hack a few classes in SolrIndexSearcher.java.
 This patch is just experimental and for testing purposes. In case someone 
 finds it interesting, it would be good to find a way to integrate it in a better 
 way than it is at the moment.
 Advice is more than welcome.
   
 Functionality:
 In solrconfig.xml we specify the pseudo-collapsing parameters:
  <str name="plus.considerMoreDocs">true</str>
  <str name="plus.considerHowMany">3000</str>
  <str name="plus.considerField">name</str>
 (at the moment there's no threshold or the other parameters that exist in the 
 current collapse-field patch)
 plus.considerMoreDocs enables pseudo-collapsing
 plus.considerHowMany sets the number of result documents to which we want 
 to apply the algorithm
 plus.considerField is the field to do pseudo-collapsing on
 If the number of results is lower than plus.considerHowMany, the algorithm 
 will be applied to all the results.
 Let's say there is a query with 60 results and we've set considerHowMany 
 to 3000 (and we already have the docs sorted by relevance). 
 What adjacent-pseudo-collapse does is, if the 2nd doc has to be collapsed, it 
 will be sent to position 2999 of the relevance results array. If the 3rd has 
 to be collapsed too, it will go to position 2998, and so on.
 The algorithm is not applied when a sortspec is set or plus.considerMoreDocs 
 is set to false. Neither is it applied when using MoreLikeThisRequestHandler.
 Example with a query of 9 results:
 Results sorted by relevance without pseudo-collapse-algorithm:
 doc1 - collapse_field_value 3
 doc2 - collapse_field_value 3
 doc3 - collapse_field_value 4
 doc4 - collapse_field_value 7
 doc5 - collapse_field_value 6
 doc6 - collapse_field_value 6
 doc7 - collapse_field_value 5
 doc8 - collapse_field_value 1
 doc9 - collapse_field_value 2
 Results pseudo-collapsed with plus.considerHowMany = 5
 doc1 - collapse_field_value 3
 doc3 - collapse_field_value 4
 doc4 - collapse_field_value 7
 doc5 - collapse_field_value 6
 doc2 - collapse_field_value 3*
 doc6 - collapse_field_value 6
 doc7 - collapse_field_value 5
 doc8 - collapse_field_value 1
 doc9 - collapse_field_value 2
 Results pseudo-collapsed with plus.considerHowMany = 9
 doc1 - collapse_field_value 3
 doc3 - collapse_field_value 4
 doc4 - collapse_field_value 7
 doc5 - collapse_field_value 6
 doc7 - collapse_field_value 5
 doc8 - collapse_field_value 1
 doc9 - collapse_field_value 2
 doc6 - collapse_field_value 6*
 doc2 - collapse_field_value 3*
 *pseudo-collapsed documents
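
 The reordering in the example above can be reproduced with a short Java sketch. This is not the code from the attached patch: the class PseudoCollapseSketch is hypothetical, documents are represented only by their 1-based rank, and it assumes that a document is collapsed when its field value equals that of the document directly before it in the original relevance order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PseudoCollapseSketch {
    // fieldValues[i] is the collapse field value of the (i+1)-th document in relevance order.
    // Returns the 1-based document numbers in their new order.
    static List<Integer> pseudoCollapse(int[] fieldValues, int considerHowMany) {
        // Only the first considerHowMany documents (or all, if there are fewer) are examined.
        int window = Math.min(considerHowMany, fieldValues.length);
        List<Integer> kept = new ArrayList<>();
        List<Integer> collapsed = new ArrayList<>();
        for (int i = 0; i < window; i++) {
            boolean sameAsPrevious = i > 0 && fieldValues[i] == fieldValues[i - 1];
            (sameAsPrevious ? collapsed : kept).add(i + 1);
        }
        // The first collapsed document goes to the last slot of the window,
        // the second to the slot before it, and so on.
        Collections.reverse(collapsed);
        List<Integer> result = new ArrayList<>(kept);
        result.addAll(collapsed);
        // Documents beyond the window keep their original order.
        for (int i = window; i < fieldValues.length; i++) {
            result.add(i + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] values = {3, 3, 4, 7, 6, 6, 5, 1, 2};
        System.out.println(pseudoCollapse(values, 5));
        System.out.println(pseudoCollapse(values, 9));
    }
}
```

 With the nine field values from the example, a window of 5 moves only doc2 (to position 5), and a window of 9 moves doc6 and doc2 to positions 8 and 9.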

-- 

[jira] Commented: (SOLR-236) Field collapsing

2009-09-12 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12754523#action_12754523
 ] 

Uri Boness commented on SOLR-236:
-

Martijn, I think a more appropriate way to fix the threading issue is to bind 
the collapseRequest to the request context and drop the class field 
altogether. So:

{code}
public void prepare(ResponseBuilder rb) throws IOException {
  super.prepare(rb);
  // Store the per-request state in the request context rather than in a field.
  rb.req.getContext().put("collapseRequest", resolveCollapseRequest(rb));
}
{code}

and 

{code}
public void process(ResponseBuilder rb) throws IOException {
  // Fetch and clear the state that prepare() stored for this request.
  CollapseRequest collapseRequest =
      (CollapseRequest) rb.req.getContext().remove("collapseRequest");
  if (collapseRequest == null) {
    super.process(rb);
    return;
  }
  doProcess(rb, collapseRequest);
}
{code}
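
The same idea can be shown without any Solr classes. In the sketch below, which is only an illustration, Request and CollapseComponent are hypothetical stand-ins for the real request and component types; the point is that the component keeps no mutable fields, so a single shared instance can serve concurrent requests without their state leaking into each other.

```java
import java.util.HashMap;
import java.util.Map;

class RequestContextSketch {
    // Hypothetical stand-in for a request that carries its own context map.
    static class Request {
        final Map<Object, Object> context = new HashMap<>();
        final String collapseField; // null when collapsing was not requested

        Request(String collapseField) {
            this.collapseField = collapseField;
        }
    }

    // The component has no mutable fields; all per-request state lives in the request.
    static class CollapseComponent {
        static final String KEY = "collapseRequest";

        void prepare(Request req) {
            if (req.collapseField != null) {
                req.context.put(KEY, req.collapseField);
            }
        }

        String process(Request req) {
            // remove() both reads the state and cleans it up.
            Object state = req.context.remove(KEY);
            return state == null ? "default processing" : "collapsing on " + state;
        }
    }

    public static void main(String[] args) {
        CollapseComponent shared = new CollapseComponent();
        Request a = new Request("site");
        Request b = new Request(null);
        shared.prepare(a);
        shared.prepare(b);
        System.out.println(shared.process(a));
        System.out.println(shared.process(b));
    }
}
```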

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
 Fix For: 1.5

 Attachments: collapsing-patch-to-1.3.0-dieter.patch, 
 collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch, 
 SOLR-236_collapsing.patch


 This patch includes a new feature called Field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also Duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 collapse.field to choose the field used to group results
 collapse.type normal (default value) or adjacent
 collapse.max to select how many continuous results are allowed before 
 collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling correction are welcome ;-)




[jira] Commented: (SOLR-1351) facet on same field different ways

2009-09-12 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12754542#action_12754542
 ] 

Uri Boness commented on SOLR-1351:
--

Another option is to define the id as a local param:

{code}
q=*:*&facet=true&facet.field={!id=category}cat&f.category.facet.sort=true&f.category.facet.limit=20&f.category.facet.mincount=1
{code}

and for multiple configurations:

{code}
q=*:*&facet=true&facet.field={!id=cat1}cat&f.cat1.facet.sort=true&f.cat1.facet.limit=20&f.cat1.facet.mincount=1&facet.field={!id=cat2}cat&f.cat2.facet.sort=false&f.cat2.facet.count=0
{code}

I guess it plays nicer with the new functionality in 1.4
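
As a sketch of how such a value could be split into its id and field name, consider the following Java snippet. It is purely illustrative: LocalParamSketch is a hypothetical class, it handles only the single {!id=...} form shown above, and it is not how Solr itself parses local params.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LocalParamSketch {
    // Matches values of the form {!id=NAME}FIELD; only this single-parameter form is handled.
    private static final Pattern WITH_ID = Pattern.compile("\\{!id=([^}\\s]+)\\}(.+)");

    // Returns {id, field}; the id defaults to the field name when no local param is given.
    static String[] parse(String facetField) {
        Matcher m = WITH_ID.matcher(facetField);
        if (m.matches()) {
            return new String[] { m.group(1), m.group(2) };
        }
        return new String[] { facetField, facetField };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("{!id=cat1}cat")));
        System.out.println(Arrays.toString(parse("cat")));
    }
}
```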



 facet on same field different ways
 --

 Key: SOLR-1351
 URL: https://issues.apache.org/jira/browse/SOLR-1351
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
 Fix For: 1.5





[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1

2009-09-08 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752417#action_12752417
 ] 

Uri Boness commented on SOLR-1071:
--

Looks good! 

As for the naming, I really like your suggestion (in one of the comments above) 
to replace "suggestion" with "alternatives". So the client code can look 
something like:

{code}
response.getSuggestions().get("hell").getAlternatives().get(0);
{code}

One more thing - I think it would be more intuitive to use a SimpleOrderedMap 
instead of a NamedList for the suggestions node. For the XML response it 
won't make much difference I guess, but for the JSON one it will be more 
intuitive and easier to work with. So to take your example above, you'd get 
something like:

{code}
"spellcheck": {
  "suggestions": {
    "hell": {
      "numFound": 2,
      "startOffset": 0,
      "endOffset": 4,
      "origFreq": 0,
      "alternatives": [
        {
          "word": "dell",
          "freq": 4
        },
        {
          "word": "all",
          "freq": 4
        }
      ]
    },
    "correctlySpelled": false}}}
{code}



 spellcheck.extendedResults returns an invalid JSON response when count > 1
 --

 Key: SOLR-1071
 URL: https://issues.apache.org/jira/browse/SOLR-1071
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Yonik Seeley
 Fix For: 1.4

 Attachments: SOLR-1071.patch, SpellCheckComponent_fix.patch, 
 SpellCheckComponent_new_structure.patch, 
 SpellCheckComponent_new_structure_incl_test.patch


 When: wt=json & spellcheck.extendedResults=true & spellcheck.count > 1, the 
 suggestions are returned in the following format:
 "suggestions":[
    "amsterdm",{
     "numFound":5,
     "startOffset":0,
     "endOffset":8,
     "origFreq":0,
     "suggestion":{
      "frequency":8498,
      "word":"amsterdam"},
     "suggestion":{
      "frequency":1,
      "word":"amsterd"},
     "suggestion":{
      "frequency":8,
      "word":"amsterdams"},
     "suggestion":{
      "frequency":1,
      "word":"amstedam"},
     "suggestion":{
      "frequency":22,
      "word":"amsterdamse"}},
    "beak",{
     "numFound":5,
     "startOffset":9,
     "endOffset":13,
     "origFreq":0,
     "suggestion":{
      "frequency":379,
      "word":"beek"},
     "suggestion":{
      "frequency":26,
      "word":"beau"},
     "suggestion":{
      "frequency":26,
      "word":"baak"},
     "suggestion":{
      "frequency":15,
      "word":"teak"},
     "suggestion":{
      "frequency":11,
      "word":"beuk"}},
    "correctlySpelled",false,
    "collation","amsterdam beek"]}}
 This is invalid JSON, as each term is associated with a JSON object which 
 holds multiple "suggestion" attributes. When working with a JSON library, only 
 the last "suggestion" attribute is picked up.
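
 The effect can be simulated in a few lines of Java. This sketch does not use a real JSON parser: the class DuplicateKeySketch is hypothetical and simply models what happens when a library reads an object into a map (a repeated key overwrites the earlier value) compared with collecting the same entries into an array.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateKeySketch {
    // Models reading a JSON object into a map: a repeated key replaces the previous value.
    static Map<String, String> readAsObject(String[][] pairs) {
        Map<String, String> object = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            object.put(pair[0], pair[1]);
        }
        return object;
    }

    // Models an array-valued attribute: every value for the key is preserved, in order.
    static List<String> readAsArray(String[][] pairs, String key) {
        List<String> values = new ArrayList<>();
        for (String[] pair : pairs) {
            if (pair[0].equals(key)) {
                values.add(pair[1]);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"suggestion", "amsterdam"},
            {"suggestion", "amsterd"},
            {"suggestion", "amsterdamse"}
        };
        System.out.println(readAsObject(pairs));
        System.out.println(readAsArray(pairs, "suggestion"));
    }
}
```

 Only the array form keeps all the suggestions, which is why an array-valued attribute is the safer structure for clients.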




[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1

2009-09-08 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752448#action_12752448
 ] 

Uri Boness commented on SOLR-1071:
--

bq. It already does... it's just the client code that checks for NamedList (the 
parent of SimpleOrderedMap)

No... sorry, I mean the topmost suggestions node: line 182 in the patched 
class.

 spellcheck.extendedResults returns an invalid JSON response when count > 1
 --

 Key: SOLR-1071
 URL: https://issues.apache.org/jira/browse/SOLR-1071
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Yonik Seeley
 Fix For: 1.4

 Attachments: SOLR-1071.patch, SpellCheckComponent_fix.patch, 
 SpellCheckComponent_new_structure.patch, 
 SpellCheckComponent_new_structure_incl_test.patch





[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1

2009-09-08 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752472#action_12752472
 ] 

Uri Boness commented on SOLR-1071:
--

bq. I guess it depends on how important order is in the top-level suggestions 
list?

I guess the order is not that important; it's just that using a 
SimpleOrderedMap will produce JSON that is more intuitive to work with, IMO.

bq. It would break back compat for the non-extended results too (for JSON and 
friends).

True... I didn't think about that one. hmm... well... I guess you can keep it 
as is then. I mean, it's not like you cannot work with the current format after 
all :-)

 spellcheck.extendedResults returns an invalid JSON response when count > 1
 --

 Key: SOLR-1071
 URL: https://issues.apache.org/jira/browse/SOLR-1071
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Yonik Seeley
 Fix For: 1.4

 Attachments: SOLR-1071.patch, SpellCheckComponent_fix.patch, 
 SpellCheckComponent_new_structure.patch, 
 SpellCheckComponent_new_structure_incl_test.patch





[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1

2009-09-08 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752841#action_12752841
 ] 

Uri Boness commented on SOLR-1071:
--

cool! thanks for the effort Yonik. I've updated the wiki so you can focus on 
the release ;-)

 spellcheck.extendedResults returns an invalid JSON response when count > 1
 --

 Key: SOLR-1071
 URL: https://issues.apache.org/jira/browse/SOLR-1071
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Yonik Seeley
 Fix For: 1.4

 Attachments: SOLR-1071.patch, SpellCheckComponent_fix.patch, 
 SpellCheckComponent_new_structure.patch, 
 SpellCheckComponent_new_structure_incl_test.patch





[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1

2009-09-02 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12750659#action_12750659
 ] 

Uri Boness commented on SOLR-1071:
--

Because there are issues with the current format in JSON (and perhaps also in 
other formats)... see comments above.

 spellcheck.extendedResults returns an invalid JSON response when count > 1
 --

 Key: SOLR-1071
 URL: https://issues.apache.org/jira/browse/SOLR-1071
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Grant Ingersoll
 Fix For: 1.4

 Attachments: SpellCheckComponent_fix.patch, 
 SpellCheckComponent_new_structure.patch, 
 SpellCheckComponent_new_structure_incl_test.patch





[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2009-08-31 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12749693#action_12749693
 ] 

Uri Boness commented on SOLR-1163:
--

Hi Lance,

Great feedback, thanks!

bq.You did not mention the console button at the lower right corner. This is 
very very useful!
(well, you always have to leave some room for surprises ;-))

Obviously the two issues are bugs. I'll try to find some time this week to fix 
them and upload a new patch.

Cheers,
Uri



 Solr Explorer - A generic GWT client for Solr
 -

 Key: SOLR-1163
 URL: https://issues.apache.org/jira/browse/SOLR-1163
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness
 Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch


 The attached patch is a generic GWT client for Solr. It is currently 
 standalone, meaning that once built, one can open the generated HTML file in 
 a browser and communicate with any deployed Solr. It is configured with its 
 own configuration file, where one can configure the Solr instance/core to 
 connect to. Since it's currently standalone and completely client-side based, 
 it uses JSON with padding (cross-site scripting) to connect to remote Solr 
 servers. Some of the supported features:
 - Simple query search
 - Sorting - one can dynamically define new sort criteria
 - Search results are rendered very much like Google search results are 
 rendered. It is also possible to view all stored field values for every hit. 
 - Custom hit rendering - It is possible to show thumbnails (images) per hit 
 and also customize a view for a hit based on HTML templates
 - Faceting - one can dynamically define field and query facets via the UI. It 
 is also possible to pre-configure these facets in the configuration file.
 - Highlighting - you can dynamically configure highlighting. It can also be 
 pre-configured in the configuration file
 - Spellchecking - you can dynamically configure spell checking. Can also be 
 done in the configuration file. Supports collation. It is also possible to 
 send build and reload commands.
 - Data import handler - if used, it is possible to send a full-import and 
 status command (delta-import is not implemented yet, but it's easy to add)
 - Console - For development time, there's a small console which can help to 
 better understand what's going on behind the scenes. One can use it to:
 ** View the client logs
 ** Browse the Solr schema
 ** View a breakdown of the current search context
 ** View a breakdown of the query URL that is sent to Solr
 ** View the raw JSON response returning from Solr
 This client is actually a platform that can be greatly extended for more 
 things. The goal is to have a client where the explorer part is just one view 
 of it. Other future views include: Monitoring, Administration, Query Builder, 
 DataImportHandler configuration, and more...
 To get a better view of what's currently possible, we've set up a public 
 version of this client at: http://search.jteam.nl/explorer. This client is 
 configured with one Solr instance where crawled YouTube movies were indexed. 
 You can also check out a screencast for this deployed client: 
 http://search.jteam.nl/help
 The patch creates a new folder in the contrib directory. Since the patch 
 doesn't contain binaries, an additional zip file is provided that needs to be 
 extracted to add all the required graphics. This module is maven2 based and is 
 configured in such a way that all GWT related tools/libraries are 
 automatically downloaded when the module is compiled. One of the artifacts 
 of the build is a war file which can be deployed in any servlet container.
 NOTE: this client works best on WebKit-based browsers (for performance 
 reasons) but also works on Firefox and IE 7+. That said, it should be taken 
 into account that it is still under development.




[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2009-08-21 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12746395#action_12746395
 ] 

Uri Boness commented on SOLR-1163:
--

bq. Does a GWT client application have a clean license?
If having a pure Apache 2 license is considered to be clean, then yes.

bq. Are there any other GWT apps in the Apache project? 
Not as far as I know. But you do have 
[LucidGaze|http://www.lucidimagination.com/Downloads/Certified-Distributions#lucidgaze]
 which is a Solr monitoring tool and I think it's also a GWT application.

bq. +1. This is great.
Thanks, you can also vote for it ;-)

bq. The Simile project has some nice data explorer UIs. The Simile-Widget 
gallery displays them.
Thanks for the suggestion. I know this project, but from my experience some of 
their widgets don't perform really well. Personally, when it comes to data 
visualization I think Flash is the best technology we have at the moment, and 
it's quite easy to interact with it via JavaScript and GWT (that's how Google 
does it for most of their applications/services: Analytics, Finance, etc.)

 Solr Explorer - A generic GWT client for Solr
 -

 Key: SOLR-1163
 URL: https://issues.apache.org/jira/browse/SOLR-1163
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness
 Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch





[jira] Commented: (SOLR-1099) FieldAnalysisRequestHandler

2009-08-10 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12741302#action_12741302
 ] 

Uri Boness commented on SOLR-1099:
--

bq. there are a number of oddities (things like using the complete text of a 
field as the key or name in a map value, listing the value twice, requiring a 
uniqueKey)

Yes, I know... I wasn't happy with it either, but that's how the original 
analysis handler worked, so I just followed that.

{quote}
And that got me thinking why there are SolrJ classes dedicated to it... and I'm 
not sure that we should take up space for that.

IMO, common things in SolrJ should have easier, more type safe interfaces and 
uncommon, advanced features should be accessed via the generic APIs in order to 
keep the interfaces smaller and more understandable for the general user.
{quote}

Why wouldn't you want SolrJ support for it? IMO, it would be great to have 
SolrJ support for every request handler that ships out of the box with Solr. It 
makes the user's life simpler and Solr easier to use. And as far as 
space is concerned... how much does it really add to the overall size of the 
solrj jar? In any case, we're not talking about megabytes here... and for most 
people it doesn't really matter - I think it's more important to provide a 
simple and user friendly API to work with, and if the cost is a few extra 
classes, I think it's a pretty cheap price to pay. It is true (I also mentioned 
it before) that it's not major functionality that will be used often... but it 
is useful to have for tooling support - we're using it in one of the tools that 
we've created and the admin website can use it as well.

 FieldAnalysisRequestHandler
 ---

 Key: SOLR-1099
 URL: https://issues.apache.org/jira/browse/SOLR-1099
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Shalin Shekhar Mangar
 Fix For: 1.4

 Attachments: AnalisysRequestHandler_refactored.patch, 
 analysis_request_handlers_incl_solrj.patch, 
 AnalysisRequestHandler_refactored1.patch, 
 FieldAnalysisRequestHandler_incl_test.patch, SOLR-1099.patch, 
 SOLR-1099.patch, SOLR-1099.patch


 The FieldAnalysisRequestHandler provides the analysis functionality of the 
 web admin page as a service. This handler accepts a fieldtype/fieldname 
 parameter and a value and as a response returns a breakdown of the analysis 
 process. It is also possible to send a query value which will use the 
 configured query analyzer as well as a showmatch parameter which will then 
 mark every matched token as a match.
 If this handler is added to the code base, I also recommend renaming the 
 current AnalysisRequestHandler to DocumentAnalysisRequestHandler and have 
 them both inherit from one AnalysisRequestHandlerBase class which provides 
 the common functionality of the analysis breakdown and its translation to 
 named lists. This will also enhance the current AnalysisRequestHandler which 
 right now is fairly simplistic.
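
 As a rough illustration of the request described above, the sketch below 
 builds a query string for such a handler. The handler path (/analysis/field) 
 and the parameter names analysis.query and analysis.showmatch are assumptions 
 made for this example, and FieldAnalysisUrl is a hypothetical helper, not part 
 of any patch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a field-analysis request URL.
// The parameter names used here are assumptions for illustration.
class FieldAnalysisUrl {

    // URL-encodes a single parameter value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    // query may be null; showMatch is only sent together with a query.
    static String build(String baseUrl, String fieldType, String fieldValue,
                        String query, boolean showMatch) {
        StringBuilder url = new StringBuilder(baseUrl);
        url.append("?analysis.fieldtype=").append(encode(fieldType));
        url.append("&analysis.fieldvalue=").append(encode(fieldValue));
        if (query != null) {
            url.append("&analysis.query=").append(encode(query));
            url.append("&analysis.showmatch=").append(showMatch);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr/analysis/field",
                "text", "quick fox", "fox", true));
    }
}
```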

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1

2009-07-27 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12735856#action_12735856
 ] 

Uri Boness commented on SOLR-1071:
--

bq. I debated the two different structures for a while and ultimately decided 
that people would have to deal with it no matter what. My suspicion was that 
people either use extendedResults or not and that they don't mix them, but 
perhaps I was wrong. Even if they do mix them, they still need code for 
recognizing when there is a difference (unless they are just spitting back out 
the raw, which means it doesn't matter anyway), so I don't know if it matters 
either way. Since this is out in the wild already, I think we should just fix 
the bug. 

I guess you're right - the users will have to handle the differences between 
the results anyway

 spellcheck.extendedResults returns an invalid JSON response when count > 1
 --

 Key: SOLR-1071
 URL: https://issues.apache.org/jira/browse/SOLR-1071
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Grant Ingersoll
 Fix For: 1.4

 Attachments: SpellCheckComponent_fix.patch, 
 SpellCheckComponent_new_structure.patch, 
 SpellCheckComponent_new_structure_incl_test.patch


 When: wt=json & spellcheck.extendedResults=true & spellcheck.count > 1, the 
 suggestions are returned in the following format:
 "suggestions":[
   "amsterdm",{
     "numFound":5,
     "startOffset":0,
     "endOffset":8,
     "origFreq":0,
     "suggestion":{
       "frequency":8498,
       "word":"amsterdam"},
     "suggestion":{
       "frequency":1,
       "word":"amsterd"},
     "suggestion":{
       "frequency":8,
       "word":"amsterdams"},
     "suggestion":{
       "frequency":1,
       "word":"amstedam"},
     "suggestion":{
       "frequency":22,
       "word":"amsterdamse"}},
   "beak",{
     "numFound":5,
     "startOffset":9,
     "endOffset":13,
     "origFreq":0,
     "suggestion":{
       "frequency":379,
       "word":"beek"},
     "suggestion":{
       "frequency":26,
       "word":"beau"},
     "suggestion":{
       "frequency":26,
       "word":"baak"},
     "suggestion":{
       "frequency":15,
       "word":"teak"},
     "suggestion":{
       "frequency":11,
       "word":"beuk"}},
   "correctlySpelled",false,
   "collation","amsterdam beek"]}}
 This is invalid JSON, as each term is associated with a JSON object which 
 holds multiple suggestion attributes. When working with a JSON library only 
 the last suggestion attribute is picked up.
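
 The effect can be reproduced without any JSON library. The sketch below 
 assumes a parser that stores object members in a map, which is how many JSON 
 libraries behave; DuplicateKeyDemo is a hypothetical class written only for 
 this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo: folding repeated "suggestion" members into a map,
// as a map-backed JSON object parser would do.
class DuplicateKeyDemo {

    // Each later put() for the same key overwrites the earlier value.
    static Map<String, String> toObject(String[][] members) {
        Map<String, String> object = new LinkedHashMap<>();
        for (String[] member : members) {
            object.put(member[0], member[1]);
        }
        return object;
    }

    public static void main(String[] args) {
        String[][] members = {
            {"suggestion", "amsterdam"},
            {"suggestion", "amsterd"},
            {"suggestion", "amsterdamse"}
        };
        // Prints {suggestion=amsterdamse}: only the last member survives.
        System.out.println(toObject(members));
    }
}
```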

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1

2009-07-27 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12735859#action_12735859
 ] 

Uri Boness commented on SOLR-1071:
--

I'm not sure it's a bug in the JSONRW, it seems to me that it was intentionally 
implemented to behave in this manner. It is confusing though, and indeed when 
developing components one has to keep in mind the consequences of using a 
_SimpleOrderedMap_ vs. a simple _NamedList_. 

I think there are several ways to tackle this:

1. Do nothing. In which case people should always know the consequences of 
using a _SimpleOrderedMap_ vs. a simple _NamedList._ 
*Advantages:* you probably don't break existing functionality. No code changes 
need to take place.
*Disadvantages:* (as you mentioned) more error prone - easier to introduce such 
bugs when writing new components. People need to know the best practices which 
are not enforced.

2. In the _SimpleOrderedMap_, keep track of duplicate keys. If a 
_SimpleOrderedMap_ holds duplicate keys then it should not be rendered as a JSON 
object, but more like a normal _NamedList_.
*Advantages:* you probably break nothing... if components already use duplicate 
keys in a _SimpleOrderedMap_ then most probably they've introduced this same 
bug.
*Disadvantages:* Inconsistent in the sense that on different occasions a 
_SimpleOrderedMap_ will be rendered differently. If duplicate keys are added, 
then there's no added value in choosing _SimpleOrderedMap_ over a normal 
_NamedList_. Which brings me to the last option.

3. Make sure that _SimpleOrderedMap_ does not accept duplicates. Either by 
enforcing it (e.g. by throwing an exception) or just by overriding the values.
*Advantages:* Gives the _SimpleOrderedMap_ a true meaning and a reason to 
exist. With this in place, it will be clear when and how it can be used. No 
changes need to be applied to the JSONRW.
*Disadvantages:* Existing functionality might break, yet again... if duplicate 
keys are already used then this bug is introduced anyway. According to the 
Javadoc, the _SimpleOrderedMap_ implementation intentionally doesn't prevent 
duplicate keys... so there must be a reason for that.

Personally, I'm for option 3. The current implementation of _SimpleOrderedMap_ 
doesn't seem to add any functionality to the _NamedList_ class, so it seems to 
me this class was created just as a hint for the response writers to render it 
differently. The name SimpleOrderedMap also suggests a Map-like 
functionality which doesn't support duplicate keys. But again, I'm not sure 
about the original reasons for not preventing duplicate keys in the first 
place, so there might be something I'm missing here.
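
To make option 3 concrete, here is a minimal sketch of the "enforce it by 
throwing an exception" variant. StrictOrderedMap is a hypothetical stand-alone 
class, not Solr's _SimpleOrderedMap_ or _NamedList_; it only illustrates an 
ordered name/value list that rejects duplicate keys.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of option 3: insertion-ordered name/value pairs
// where adding an already-present name is rejected.
class StrictOrderedMap<T> {
    private final List<String> names = new ArrayList<>();
    private final List<T> values = new ArrayList<>();
    private final Set<String> seen = new HashSet<>();

    // Appends a pair; throws if the name was already added.
    public void add(String name, T value) {
        if (!seen.add(name)) {
            throw new IllegalArgumentException("Duplicate key: " + name);
        }
        names.add(name);
        values.add(value);
    }

    public int size() {
        return names.size();
    }

    public String getName(int index) {
        return names.get(index);
    }

    public T getVal(int index) {
        return values.get(index);
    }
}
```

With such a guarantee in place, a writer could always render the structure as 
a JSON object without risking repeated member names.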


 spellcheck.extendedResults returns an invalid JSON response when count > 1
 --

 Key: SOLR-1071
 URL: https://issues.apache.org/jira/browse/SOLR-1071
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Grant Ingersoll
 Fix For: 1.4

 Attachments: SpellCheckComponent_fix.patch, 
 SpellCheckComponent_new_structure.patch, 
 SpellCheckComponent_new_structure_incl_test.patch


 When: wt=json & spellcheck.extendedResults=true & spellcheck.count > 1, the 
 suggestions are returned in the following format:
 "suggestions":[
   "amsterdm",{
     "numFound":5,
     "startOffset":0,
     "endOffset":8,
     "origFreq":0,
     "suggestion":{
       "frequency":8498,
       "word":"amsterdam"},
     "suggestion":{
       "frequency":1,
       "word":"amsterd"},
     "suggestion":{
       "frequency":8,
       "word":"amsterdams"},
     "suggestion":{
       "frequency":1,
       "word":"amstedam"},
     "suggestion":{
       "frequency":22,
       "word":"amsterdamse"}},
   "beak",{
     "numFound":5,
     "startOffset":9,
     "endOffset":13,
     "origFreq":0,
     "suggestion":{
       "frequency":379,
       "word":"beek"},
     "suggestion":{
       "frequency":26,
       "word":"beau"},
     "suggestion":{
       "frequency":26,
       "word":"baak"},
     "suggestion":{
       "frequency":15,
       "word":"teak"},
     "suggestion":{
       "frequency":11,
       "word":"beuk"}},
   "correctlySpelled",false,
   "collation","amsterdam beek"]}}
 This is invalid JSON, as each term is associated with a JSON object which 
 holds multiple suggestion attributes. When working with a JSON library only 
 the last suggestion attribute is picked up.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (SOLR-1133) solr-common 1.4-SNAPSHOT is not in maven2 repository

2009-07-27 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness resolved SOLR-1133.
--

Resolution: Invalid

In 1.4, solr-common.jar is bundled with solr-solrj.jar.

 solr-common 1.4-SNAPSHOT is not in maven2 repository
 

 Key: SOLR-1133
 URL: https://issues.apache.org/jira/browse/SOLR-1133
 Project: Solr
  Issue Type: Bug
Reporter: Uri Boness
 Fix For: 1.4


 Looking at the apache maven2 repository 
 ([http://people.apache.org/repo/m2-snapshot-repository/]) 
 solr-common-1.4-SNAPSHOT is missing

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr

2009-07-06 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12727415#action_12727415
 ] 

Uri Boness commented on SOLR-773:
-

I guess it is possible to configure the executor service via the configuration 
of the query parser. That said, having a way to configure executor services in 
solr config will eliminate some code duplication. I don't think it's a good 
practice to have one executor service for all components to use - the last thing 
you want is to have components depend on each other by competing for the same 
threads. I think it is better to fine tune each component with 
a thread pool of its own.
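
A minimal sketch of that idea, using only java.util.concurrent: each component 
registers its own fixed-size pool under its name. ComponentExecutors and the 
component names are hypothetical; nothing here reflects an existing Solr 
configuration mechanism.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical registry: one independently sized thread pool per component,
// so a busy component cannot starve the others of threads.
class ComponentExecutors {
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();

    // Creates the pool on first registration; later calls return the same pool.
    public ExecutorService register(String component, int threads) {
        return pools.computeIfAbsent(component,
                name -> Executors.newFixedThreadPool(threads));
    }

    // Returns the pool for a component, or null if none was registered.
    public ExecutorService get(String component) {
        return pools.get(component);
    }

    // Initiates an orderly shutdown of every registered pool.
    public void shutdownAll() {
        for (ExecutorService pool : pools.values()) {
            pool.shutdown();
        }
    }
}
```

A shared configuration section could then map component names to pool sizes, 
which would remove the duplicated setup code while keeping the pools separate.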

 Incorporate Local Lucene/Solr
 -

 Key: SOLR-773
 URL: https://issues.apache.org/jira/browse/SOLR-773
 Project: Solr
  Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Attachments: lucene-spatial-2.9-dev.jar, lucene.tar.gz, 
 SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
 SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
 SOLR-773-local-lucene.patch, SOLR-773-spatial_solr.patch, SOLR-773.patch, 
 SOLR-773.patch, spatial-solr.tar.gz


 Local Lucene has been donated to the Lucene project.  It has some Solr 
 components, but we should evaluate how best to incorporate it into Solr.
 See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2009-06-22 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722763#action_12722763
 ] 

Uri Boness commented on SOLR-1163:
--

Thanks Yonik.

The console is in the application. If you look down at the lower right corner, 
you'll find a small console icon (a la FireBug), click on it and the console 
will open up.

Where do I see it fitting? Well, if http://localhost:8983/solr/admin is for the 
admin page, then I guess http://localhost:8983/solr/explorer can be for the 
explorer. I don't know, what do you think?

 Solr Explorer - A generic GWT client for Solr
 -

 Key: SOLR-1163
 URL: https://issues.apache.org/jira/browse/SOLR-1163
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness
 Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch


 The attached patch is a GWT generic client for solr. It is currently 
 standalone, meaning that once built, one can open the generated HTML file in 
 a browser and communicate with any deployed solr. It is configured with its 
 own configuration file, where one can configure the solr instance/core to 
 connect to. Since it's currently standalone and completely client side based, 
 it uses JSON with padding (cross-site scripting) to connect to remote solr 
 servers. Some of the supported features:
 - Simple query search
 - Sorting - one can dynamically define new sort criteria
 - Search results are rendered very much like Google search results are 
 rendered. It is also possible to view all stored field values for every hit. 
 - Custom hit rendering - It is possible to show thumbnails (images) per hit 
 and also customize a view for a hit based on html templates
 - Faceting - one can dynamically define field and query facets via the UI. It 
 is also possible to pre-configure these facets in the configuration file.
 - Highlighting - you can dynamically configure highlighting. It can also be 
 pre-configured in the configuration file.
 - Spellchecking - you can dynamically configure spell checking. Can also be 
 done in the configuration file. Supports collation. It is also possible to 
 send build and reload commands.
 - Data import handler - if used, it is possible to send a full-import and 
 status command (delta-import is not implemented yet, but it's easy to add)
 - Console - For development time, there's a small console which can help to 
 better understand what's going on behind the scenes. One can use it to:
 ** View the client logs
 ** Browse the solr schema
 ** View a breakdown of the current search context
 ** View a breakdown of the query URL that is sent to solr
 ** View the raw JSON response returned from Solr
 This client is actually a platform that can be greatly extended for more 
 things. The goal is to have a client where the explorer part is just one view 
 of it. Other future views include: Monitoring, Administration, Query Builder, 
 DataImportHandler configuration, and more...
 To get a better view of what's currently possible, we've set up a public 
 version of this client at: http://search.jteam.nl/explorer. This client is 
 configured with one solr instance where crawled YouTube movies were indexed. 
 You can also check out a screencast for this deployed client: 
 http://search.jteam.nl/help
 The patch creates a new folder in the contrib directory. Since the patch 
 doesn't contain binaries, an additional zip file is provided that needs to be 
 extracted to add all the required graphics. This module is maven2 based and is 
 configured in such a way that all GWT related tools/libraries are 
 automatically downloaded when the module is compiled. One of the artifacts 
 of the build is a war file which can be deployed in any servlet container.
 NOTE: this client works best on WebKit based browsers (for performance 
 reasons) but also works on firefox and ie 7+. That said, it should be taken 
 into account that it is still under development.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1123) Change the JSONResponseWriter content type

2009-05-20 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12711075#action_12711075
 ] 

Uri Boness commented on SOLR-1123:
--

Indeed this is just for convenience and should not be a high priority, but I 
definitely see it as a nice-to-have. Just to clarify, the suggestion is not 
to have another request parameter (that would probably be too much as you 
mentioned) but instead to add a configuration parameter in solrconfig. So you'll 
be able to define the json response writer as follows:

{code:xml}
<queryResponseWriter name="json" 
                     class="org.apache.solr.request.JSONResponseWriter">
  <bool name="useJsonContentType">true</bool>
</queryResponseWriter>
{code} 


 Change the JSONResponseWriter content type
 --

 Key: SOLR-1123
 URL: https://issues.apache.org/jira/browse/SOLR-1123
 Project: Solr
  Issue Type: Improvement
Reporter: Uri Boness
 Fix For: 1.5

 Attachments: JSON_contentType_incl_tests.patch


 Currently the JSON content type is not used. Instead the text/plain content 
 type is used. The reason for this, as I understand it, is to enable viewing the 
 json response as text in the browser. While this is a valid argument, I do 
 believe that there should at least be an option to configure this writer to 
 use the JSON content type. According to 
 [RFC4627|http://www.ietf.org/rfc/rfc4627.txt] the json content type needs to 
 be application/json (and not text/x-json). The reason this can be very 
 helpful is that today you have plugins for browsers (e.g. 
 [JSONView|http://brh.numbera.com/software/jsonview]) that can render any page 
 with application/json content type in a user friendly manner (just like xml 
 is supported).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1123) Change the JSONResponseWriter content type

2009-05-20 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12711133#action_12711133
 ] 

Uri Boness commented on SOLR-1123:
--

I think that would be the best option. The problem right now is in the current 
class hierarchy of the response writers. Basically, I think the 
QueryResponseWriter interface should change to:

{code}
public interface QueryResponseWriter extends NamedListInitializedPlugin {
 
  public void write(OutputStream out, SolrQueryRequest request, 
SolrQueryResponse response) throws IOException;

  public String getContentType(SolrQueryRequest request, SolrQueryResponse 
response);

}
{code}

Note: this interface will play nicer with the binary response writer

Then we can have an AbstractTextResponseWriter which will serve as a parent for 
all non-binary response writers:

{code}
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

// The Solr types used below (QueryResponseWriter, NamedList, SolrQueryRequest,
// SolrQueryResponse) are assumed to be imported from the Solr code base.
public abstract class AbstractTextResponseWriter implements QueryResponseWriter {

  public static final String CONTENT_TYPE_PARAM = "contentType";
  public static final String DEFAULT_CONTENT_TYPE = "text/plain; charset=UTF-8";

  // Not final: init() may replace the default with the configured value.
  private String contentType;

  protected AbstractTextResponseWriter() {
    this(DEFAULT_CONTENT_TYPE);
  }

  protected AbstractTextResponseWriter(String defaultContentType) {
    this.contentType = defaultContentType;
  }

  public void init(NamedList args) {
    String configuredContentType = (String) args.get(CONTENT_TYPE_PARAM);
    if (configuredContentType != null) {
      contentType = configuredContentType;
    }
  }

  public String getContentType(SolrQueryRequest request, 
      SolrQueryResponse response) {
    return contentType;
  }

  public final void write(OutputStream out, SolrQueryRequest request, 
      SolrQueryResponse response) throws IOException {
    OutputStreamWriter writer = new OutputStreamWriter(out, "UTF-8");
    write(writer, request, response);
    // Flush so that buffered characters reach the underlying stream.
    writer.flush();
  }

  protected abstract void write(Writer writer, SolrQueryRequest request, 
      SolrQueryResponse response) throws IOException;

}
{code}

This will make it easy for every response writer to define its default content 
type, while still allowing this default to be overridden using the contentType 
parameter in solrconfig. (I assume here that there's no need to customize the 
content type for the binary response writer as it's internal and specific for 
the current implementation).
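
The default-versus-configured behaviour can be tried out in isolation. The 
sketch below replaces Solr's NamedList with a plain Map so that it runs on its 
own; ContentTypeResolver is a hypothetical class for illustration, and only 
the contentType parameter name is taken from the proposal above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone version of the lookup: a writer-specific default
// that a configured "contentType" argument may override.
class ContentTypeResolver {
    static final String CONTENT_TYPE_PARAM = "contentType";

    private final String contentType;

    ContentTypeResolver(String writerDefault, Map<String, String> initArgs) {
        String configured = initArgs.get(CONTENT_TYPE_PARAM);
        // Fall back to the writer's own default when nothing is configured.
        this.contentType = (configured != null) ? configured : writerDefault;
    }

    String getContentType() {
        return contentType;
    }

    public static void main(String[] args) {
        Map<String, String> initArgs = new HashMap<>();
        String writerDefault = "text/plain; charset=UTF-8";
        // No configuration: the writer default is used.
        System.out.println(
                new ContentTypeResolver(writerDefault, initArgs).getContentType());
        // Configured value wins over the default.
        initArgs.put(CONTENT_TYPE_PARAM, "application/json; charset=UTF-8");
        System.out.println(
                new ContentTypeResolver(writerDefault, initArgs).getContentType());
    }
}
```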

 Change the JSONResponseWriter content type
 --

 Key: SOLR-1123
 URL: https://issues.apache.org/jira/browse/SOLR-1123
 Project: Solr
  Issue Type: Improvement
Reporter: Uri Boness
 Fix For: 1.5

 Attachments: JSON_contentType_incl_tests.patch


 Currently the JSON content type is not used. Instead the text/plain content 
 type is used. The reason for this, as I understand it, is to enable viewing the 
 json response as text in the browser. While this is a valid argument, I do 
 believe that there should at least be an option to configure this writer to 
 use the JSON content type. According to 
 [RFC4627|http://www.ietf.org/rfc/rfc4627.txt] the json content type needs to 
 be application/json (and not text/x-json). The reason this can be very 
 helpful is that today you have plugins for browsers (e.g. 
 [JSONView|http://brh.numbera.com/software/jsonview]) that can render any page 
 with application/json content type in a user friendly manner (just like xml 
 is supported).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2009-05-13 Thread Uri Boness (JIRA)
Solr Explorer - A generic GWT client for Solr
-

 Key: SOLR-1163
 URL: https://issues.apache.org/jira/browse/SOLR-1163
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness


The attached patch is a GWT generic client for solr. It is currently 
standalone, meaning that once built, one can open the generated HTML file in a 
browser and communicate with any deployed solr. It is configured with its own 
configuration file, where one can configure the solr instance/core to connect 
to. Since it's currently standalone and completely client side based, it uses 
JSON with padding (cross-site scripting) to connect to remote solr servers. 
Some of the supported features:

- Simple query search
- Sorting - one can dynamically define new sort criteria
- Search results are rendered very much like Google search results are 
rendered. It is also possible to view all stored field values for every hit. 
- Custom hit rendering - It is possible to show thumbnails (images) per hit and 
also customize a view for a hit based on html templates
- Faceting - one can dynamically define field and query facets via the UI. It 
is also possible to pre-configure these facets in the configuration file.
- Highlighting - you can dynamically configure highlighting. It can also be 
pre-configured in the configuration file.
- Spellchecking - you can dynamically configure spell checking. Can also be 
done in the configuration file. Supports collation. It is also possible to send 
build and reload commands.
- Data import handler - if used, it is possible to send a full-import and 
status command (delta-import is not implemented yet, but it's easy to add)
- Console - For development time, there's a small console which can help to 
better understand what's going on behind the scenes. One can use it to:
** View the client logs
** Browse the solr schema
** View a breakdown of the current search context
** View a breakdown of the query URL that is sent to solr
** View the raw JSON response returned from Solr

This client is actually a platform that can be greatly extended for more 
things. The goal is to have a client where the explorer part is just one view 
of it. Other future views include: Monitoring, Administration, Query Builder, 
DataImportHandler configuration, and more...

To get a better view of what's currently possible, we've set up a public 
version of this client at: http://search.jteam.nl/explorer. This client is 
configured with one solr instance where crawled YouTube movies were indexed. 
You can also check out a screencast for this deployed client: 
http://search.jteam.nl/help

The patch creates a new folder in the contrib directory. Since the patch 
doesn't contain binaries, an additional zip file is provided that needs to be 
extracted to add all the required graphics. This module is maven2 based and is 
configured in such a way that all GWT related tools/libraries are automatically 
downloaded when the module is compiled. One of the artifacts of the build is a 
war file which can be deployed in any servlet container.

NOTE: this client works best on WebKit based browsers (for performance reasons) 
but also works on firefox and ie 7+. That said, it should be taken into account 
that it is still under development.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2009-05-13 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1163:
-

Attachment: graphics.zip
solr-explorer.patch

 Solr Explorer - A generic GWT client for Solr
 -

 Key: SOLR-1163
 URL: https://issues.apache.org/jira/browse/SOLR-1163
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness
 Attachments: graphics.zip, solr-explorer.patch


 The attached patch is a GWT generic client for solr. It is currently 
 standalone, meaning that once built, one can open the generated HTML file in 
 a browser and communicate with any deployed solr. It is configured with its 
 own configuration file, where one can configure the solr instance/core to 
 connect to. Since it's currently standalone and completely client side based, 
 it uses JSON with padding (cross-site scripting) to connect to remote solr 
 servers. Some of the supported features:
 - Simple query search
 - Sorting - one can dynamically define new sort criteria
 - Search results are rendered very much like Google search results are 
 rendered. It is also possible to view all stored field values for every hit. 
 - Custom hit rendering - It is possible to show thumbnails (images) per hit 
 and also customize a view for a hit based on html templates
 - Faceting - one can dynamically define field and query facets via the UI. It 
 is also possible to pre-configure these facets in the configuration file.
 - Highlighting - you can dynamically configure highlighting. It can also be 
 pre-configured in the configuration file.
 - Spellchecking - you can dynamically configure spell checking. Can also be 
 done in the configuration file. Supports collation. It is also possible to 
 send build and reload commands.
 - Data import handler - if used, it is possible to send a full-import and 
 status command (delta-import is not implemented yet, but it's easy to add)
 - Console - For development time, there's a small console which can help to 
 better understand what's going on behind the scenes. One can use it to:
 ** View the client logs
 ** Browse the solr schema
 ** View a breakdown of the current search context
 ** View a breakdown of the query URL that is sent to solr
 ** View the raw JSON response returned from Solr
 This client is actually a platform that can be greatly extended for more 
 things. The goal is to have a client where the explorer part is just one view 
 of it. Other future views include: Monitoring, Administration, Query Builder, 
 DataImportHandler configuration, and more...
 To get a better view of what's currently possible, we've set up a public 
 version of this client at: http://search.jteam.nl/explorer. This client is 
 configured with one solr instance where crawled YouTube movies were indexed. 
 You can also check out a screencast for this deployed client: 
 http://search.jteam.nl/help
 The patch creates a new folder in the contrib directory. Since the patch 
 doesn't contain binaries, an additional zip file is provided that needs to be 
 extracted to add all the required graphics. This module is maven2 based and is 
 configured in such a way that all GWT related tools/libraries are 
 automatically downloaded when the module is compiled. One of the artifacts 
 of the build is a war file which can be deployed in any servlet container.
 NOTE: this client works best on WebKit based browsers (for performance 
 reasons) but also works on firefox and ie 7+. That said, it should be taken 
 into account that it is still under development.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (SOLR-1163) Solr Explorer - A generic GWT client for Solr

2009-05-13 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1163:
-

Attachment: solr-explorer.patch

fixed the groupId in the pom

 Solr Explorer - A generic GWT client for Solr
 -

 Key: SOLR-1163
 URL: https://issues.apache.org/jira/browse/SOLR-1163
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness
 Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch


 The attached patch is a GWT generic client for solr. It is currently 
 standalone, meaning that once built, one can open the generated HTML file in 
 a browser and communicate with any deployed solr. It is configured with its 
 own configuration file, where one can configure the solr instance/core to 
 connect to. Since it's currently standalone and completely client side based, 
 it uses JSON with padding (cross-site scripting) to connect to remote solr 
 servers. Some of the supported features:
 - Simple query search
 - Sorting - one can dynamically define new sort criteria
 - Search results are rendered very much like Google search results are 
 rendered. It is also possible to view all stored field values for every hit. 
 - Custom hit rendering - It is possible to show thumbnails (images) per hit 
 and also customize a view for a hit based on html templates
 - Faceting - one can dynamically define field and query facets via the UI. It 
 is also possible to pre-configure these facets in the configuration file.
 - Highlighting - you can dynamically configure highlighting. It can also be 
 pre-configured in the configuration file.
 - Spellchecking - you can dynamically configure spell checking. Can also be 
 done in the configuration file. Supports collation. It is also possible to 
 send build and reload commands.
 - Data import handler - if used, it is possible to send a full-import and 
 status command (delta-import is not implemented yet, but it's easy to add)
 - Console - For development time, there's a small console which can help to 
 better understand what's going on behind the scenes. One can use it to:
 ** View the client logs
 ** Browse the solr schema
 ** View a breakdown of the current search context
 ** View a breakdown of the query URL that is sent to solr
 ** View the raw JSON response returned from Solr
 This client is actually a platform that can be greatly extended for more 
 things. The goal is to have a client where the explorer part is just one view 
 of it. Other future views include: Monitoring, Administration, Query Builder, 
 DataImportHandler configuration, and more...
 To get a better view of what's currently possible, we've set up a public
 version of this client at: http://search.jteam.nl/explorer. This client is
 configured with one Solr instance where crawled YouTube movies were indexed.
 You can also check out a screencast for this deployed client:
 http://search.jteam.nl/help
 The patch creates a new folder in the contrib directory. Since the patch
 doesn't contain binaries, an additional zip file is provided that needs to be
 extracted to add all the required graphics. This module is Maven 2 based and is
 configured in such a way that all GWT-related tools/libraries are
 automatically downloaded when the module is compiled. One of the artifacts
 of the build is a war file which can be deployed in any servlet container.
 NOTE: this client works best on WebKit-based browsers (for performance
 reasons) but also works on Firefox and IE 7+. That said, it should be taken
 into account that it is still under development.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1099) FieldAnalysisRequestHandler

2009-05-03 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705406#action_12705406
 ] 

Uri Boness commented on SOLR-1099:
--

{quote}I guess contrary to the javadocs, you need to specify either 
analysis.fieldname or analysis.fieldtype along with analysis.fieldvalue to make 
it work. They are optional but one of them must be present.{quote}

That's true, one of them must be set.

{quote}On second thought, we could just use the default search field if both 
fieldname and fieldtype are not specified.{quote}

Sounds like a reasonable fallback.
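
The parameter rule discussed above can be sketched on the client side. This is only an illustration in Python, not part of the patch: the host, port, and field name are hypothetical, and the helper `field_analysis_url` is invented here. Only the `analysis.*` parameter names come from the discussion.

```python
from urllib.parse import urlencode


def field_analysis_url(base_url, value, fieldname=None, fieldtype=None):
    """Build a query URL for a field analysis request.

    Mirrors the rule above: analysis.fieldvalue must be accompanied by
    analysis.fieldname or analysis.fieldtype (at least one of them).
    """
    if not fieldname and not fieldtype:
        raise ValueError("set analysis.fieldname or analysis.fieldtype")
    params = {"analysis.fieldvalue": value}
    if fieldname:
        params["analysis.fieldname"] = fieldname
    if fieldtype:
        params["analysis.fieldtype"] = fieldtype
    return base_url + "?" + urlencode(params)


# Hypothetical host, handler path, and field name, for illustration only.
url = field_analysis_url(
    "http://localhost:8983/solr/analysis/field",
    "quick brown fox",
    fieldname="title",
)
```

If the default-search-field fallback were adopted, the `ValueError` branch would no longer be needed on the client.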

 FieldAnalysisRequestHandler
 ---

 Key: SOLR-1099
 URL: https://issues.apache.org/jira/browse/SOLR-1099
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Shalin Shekhar Mangar
 Fix For: 1.4

 Attachments: AnalisysRequestHandler_refactored.patch, 
 analysis_request_handlers_incl_solrj.patch, 
 AnalysisRequestHandler_refactored1.patch, 
 FieldAnalysisRequestHandler_incl_test.patch, SOLR-1099.patch, SOLR-1099.patch


 The FieldAnalysisRequestHandler provides the analysis functionality of the
 web admin page as a service. This handler accepts a fieldtype/fieldname
 parameter and a value, and as a response returns a breakdown of the analysis
 process. It is also possible to send a query value, which will use the
 configured query analyzer, as well as a showmatch parameter, which will then
 mark every matched token as a match.
 If this handler is added to the code base, I also recommend renaming the
 current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having
 them both inherit from one AnalysisRequestHandlerBase class which provides
 the common functionality of the analysis breakdown and its translation to
 named lists. This will also enhance the current AnalysisRequestHandler, which
 right now is fairly simplistic.




[jira] Commented: (SOLR-1133) solr-common 1.4-SNAPSHOT is not in maven2 repository

2009-04-28 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703593#action_12703593
 ] 

Uri Boness commented on SOLR-1133:
--

Actually, the same issue applies to the solr-lucen-contrib library (the POMs
are there, but not the jar).

 solr-common 1.4-SNAPSHOT is not in maven2 repository
 

 Key: SOLR-1133
 URL: https://issues.apache.org/jira/browse/SOLR-1133
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
Reporter: Uri Boness
 Fix For: 1.3.1


 Looking at the apache maven2 repository 
 ([http://people.apache.org/repo/m2-snapshot-repository/]) 
 solr-common-1.4-SNAPSHOT is missing




[jira] Created: (SOLR-1122) Move the lib directory of the velocity contrib out of the src directory.

2009-04-22 Thread Uri Boness (JIRA)
Move the lib directory of the velocity contrib out of the src directory.


 Key: SOLR-1122
 URL: https://issues.apache.org/jira/browse/SOLR-1122
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.3
Reporter: Uri Boness
Priority: Minor
 Fix For: 1.4


Currently the lib folder is located under the
{{trunk/contrib/velocity/src/main/solr}} folder. I guess it should be in
{{trunk/contrib/velocity}} instead (which would also be consistent with the
other contrib folders).




[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1

2009-04-22 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701492#action_12701492
 ] 

Uri Boness commented on SOLR-1071:
--

One more thing to consider: this component is now a bit inconsistent in its
response format. When extendedResults is used, the suggestions are put in an
array called alternatives, while when it's not used the suggestions are put
in an array called suggestion. I think it would be wise to consider changing
the latter to alternatives as well, but of course that would break backward
compatibility, and as this component is probably widely used, it's a risk.
Another option, at least temporarily for the 1.4 release, is to add support for
another parameter (something like spellcheck.version=1.3) that would signal the
component to render the response in the 1.3 format. It's a bit ugly, but it
would at least solve the compatibility issues.
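
Until the two formats are unified, a client can tolerate both key names. The following Python sketch is only an illustration of that idea. The helper `suggested_words` is hypothetical, and the element shapes (objects with a `word` field under `alternatives`, plain strings under `suggestion`) are assumptions, not taken from the patch.

```python
def suggested_words(entry):
    """Return the suggested words from one per-term entry, whichever key it uses."""
    items = entry.get("alternatives")
    if items is None:
        items = entry.get("suggestion", [])
    words = []
    for item in items:
        # Assumed shapes: extended results carry objects with a "word" field,
        # plain results carry bare strings.
        words.append(item["word"] if isinstance(item, dict) else item)
    return words


# Hypothetical entries in the two assumed shapes.
extended = {"alternatives": [{"word": "amsterdam", "frequency": 8498}]}
plain = {"suggestion": ["beek", "beau"]}
extended_words = suggested_words(extended)
plain_words = suggested_words(plain)
```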

 spellcheck.extendedResults returns an invalid JSON response when count > 1
 --

 Key: SOLR-1071
 URL: https://issues.apache.org/jira/browse/SOLR-1071
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Grant Ingersoll
 Fix For: 1.3.1

 Attachments: SpellCheckComponent_fix.patch, 
 SpellCheckComponent_new_structure.patch, 
 SpellCheckComponent_new_structure_incl_test.patch


 When: wt=json & spellcheck.extendedResults=true & spellcheck.count > 1, the
 suggestions are returned in the following format:
 "suggestions":[
   "amsterdm",{
     "numFound":5,
     "startOffset":0,
     "endOffset":8,
     "origFreq":0,
     "suggestion":{
       "frequency":8498,
       "word":"amsterdam"},
     "suggestion":{
       "frequency":1,
       "word":"amsterd"},
     "suggestion":{
       "frequency":8,
       "word":"amsterdams"},
     "suggestion":{
       "frequency":1,
       "word":"amstedam"},
     "suggestion":{
       "frequency":22,
       "word":"amsterdamse"}},
   "beak",{
     "numFound":5,
     "startOffset":9,
     "endOffset":13,
     "origFreq":0,
     "suggestion":{
       "frequency":379,
       "word":"beek"},
     "suggestion":{
       "frequency":26,
       "word":"beau"},
     "suggestion":{
       "frequency":26,
       "word":"baak"},
     "suggestion":{
       "frequency":15,
       "word":"teak"},
     "suggestion":{
       "frequency":11,
       "word":"beuk"}},
   "correctlySpelled",false,
   "collation","amsterdam beek"]}}
 This is invalid JSON, as each term is associated with a JSON object which
 holds multiple suggestion attributes. When working with a JSON library, only
 the last suggestion attribute is picked up.
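
 The effect described above can be reproduced with any parser that maps JSON objects to dictionaries. The sketch below uses Python's standard json module on a shortened, hypothetical payload (not real Solr output): the default parse keeps only the last suggestion, while object_pairs_hook exposes every pair.

```python
import json

# Shortened, hypothetical payload with the same shape as the response above:
# a single JSON object that repeats the "suggestion" key.
payload = """
{"numFound": 2,
 "suggestion": {"frequency": 8498, "word": "amsterdam"},
 "suggestion": {"frequency": 1, "word": "amsterd"}}
"""

# Default behaviour: objects become dicts, so a repeated key overwrites the
# earlier value and only the last suggestion survives.
parsed = json.loads(payload)
last_word = parsed["suggestion"]["word"]


def keep_pairs(pairs):
    """Keep every (key, value) pair in document order instead of building a dict."""
    return list(pairs)


# With object_pairs_hook, nested objects also arrive as lists of pairs, so the
# repeated entries can still be recovered.
pairs = json.loads(payload, object_pairs_hook=keep_pairs)
all_words = [dict(value)["word"] for key, value in pairs if key == "suggestion"]
```

 This is a client-side workaround only; the underlying problem is the repeated key in the response.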




[jira] Created: (SOLR-1123) Change the JSONResponseWriter content type

2009-04-22 Thread Uri Boness (JIRA)
Change the JSONResponseWriter content type
--

 Key: SOLR-1123
 URL: https://issues.apache.org/jira/browse/SOLR-1123
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.3
Reporter: Uri Boness
 Fix For: 1.3.1


Currently the JSON content type is not used. Instead, the text/plain content
type is used. The reason for this, as I understand it, is to enable viewing the
JSON response as text in the browser. While this is a valid argument, I do
believe that there should at least be an option to configure this writer to use
the JSON content type. According to [RFC4627|http://www.ietf.org/rfc/rfc4627.txt],
the JSON content type needs to be application/json (and not text/x-json). The
reason this can be very helpful is that today there are browser plugins
(e.g. [JSONView|http://brh.numbera.com/software/jsonview]) that can render any
page with the application/json content type in a user-friendly manner (just like
XML is supported).




[jira] Updated: (SOLR-1123) Change the JSONResponseWriter content type

2009-04-22 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1123:
-

Attachment: JSON_contentType_incl_tests.patch

This patch is a simple implementation of this functionality. The writer can be
configured with a {{userJsonContentType}} boolean parameter. When it is set to
{{true}}, the content type of the output will be application/json instead of
text/plain. For backward compatibility, when this parameter is absent, the
text/plain content type will be used.
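
The selection rule described above is small enough to sketch. This Python snippet is an illustration of the rule only, not the patch's code. The function `json_writer_content_type` and the dict-based configuration are hypothetical; the parameter name is spelled as in the comment above.

```python
def json_writer_content_type(config):
    """Pick the response content type from the writer's configuration."""
    # An absent flag keeps the old text/plain behaviour, so existing
    # deployments are unaffected.
    if config.get("userJsonContentType", False):
        return "application/json"
    return "text/plain"


default_type = json_writer_content_type({})
json_type = json_writer_content_type({"userJsonContentType": True})
```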

 Change the JSONResponseWriter content type
 --

 Key: SOLR-1123
 URL: https://issues.apache.org/jira/browse/SOLR-1123
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.3
Reporter: Uri Boness
 Fix For: 1.3.1

 Attachments: JSON_contentType_incl_tests.patch






[jira] Commented: (SOLR-1099) FieldAnalysisRequestHandler

2009-04-21 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701094#action_12701094
 ] 

Uri Boness commented on SOLR-1099:
--

Actually, there is no dependency between the handlers and SolrJ. SolrJ comes
with its own {{FieldAnalysisRequest}} and {{DocumentAnalysisRequest}} classes 
(which extend the {{SolrRequest}} class). The inner classes in the handlers are 
used to represent analysis requests on the server side. 

Another thing: I believe that the default names for the handlers as you defined
them in the default solrconfig.xml (i.e. analysis/field and analysis/document)
are better than the ones I came up with :-). The only thing left to do is to
update these defaults in the SolrJ request classes: {{FieldAnalysisRequest}}
and {{DocumentAnalysisRequest}}.

 FieldAnalysisRequestHandler
 ---

 Key: SOLR-1099
 URL: https://issues.apache.org/jira/browse/SOLR-1099
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Shalin Shekhar Mangar
 Fix For: 1.4

 Attachments: AnalisysRequestHandler_refactored.patch, 
 analysis_request_handlers_incl_solrj.patch, 
 AnalysisRequestHandler_refactored1.patch, 
 FieldAnalysisRequestHandler_incl_test.patch, SOLR-1099.patch






[jira] Updated: (SOLR-1099) FieldAnalysisRequestHandler

2009-04-20 Thread Uri Boness (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uri Boness updated SOLR-1099:
-

Attachment: analysis_request_handlers_incl_solrj.patch

Latest patch. This one includes SolrJ support.

 FieldAnalysisRequestHandler
 ---

 Key: SOLR-1099
 URL: https://issues.apache.org/jira/browse/SOLR-1099
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Shalin Shekhar Mangar
 Fix For: 1.4

 Attachments: AnalisysRequestHandler_refactored.patch, 
 analysis_request_handlers_incl_solrj.patch, 
 AnalysisRequestHandler_refactored1.patch, 
 FieldAnalysisRequestHandler_incl_test.patch






[jira] Commented: (SOLR-1099) FieldAnalysisRequestHandler

2009-04-20 Thread Uri Boness (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700787#action_12700787
 ] 

Uri Boness commented on SOLR-1099:
--

Not all input can be sent as request parameters (the documents will still be sent
as a request body via a POST), but of course it's still possible to fold
everything into one handler. It just feels like putting too much logic &
responsibility on a single handler, which increases code complexity and makes it
harder to maintain (at least in my opinion). The deprecation also gives
users who already use the current ARH a chance to move to the DocumentARH
(which has a different response format).

 FieldAnalysisRequestHandler
 ---

 Key: SOLR-1099
 URL: https://issues.apache.org/jira/browse/SOLR-1099
 Project: Solr
  Issue Type: New Feature
  Components: Analysis
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Shalin Shekhar Mangar
 Fix For: 1.4

 Attachments: AnalisysRequestHandler_refactored.patch, 
 analysis_request_handlers_incl_solrj.patch, 
 AnalysisRequestHandler_refactored1.patch, 
 FieldAnalysisRequestHandler_incl_test.patch





