[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854648#action_12854648 ]

Uri Boness commented on SOLR-1163:
----------------------------------

yeah... I'm leaning toward that option as well. First, it's less intrusive, but also, using a proxy servlet I won't need to use XSS for the communication (which opens up all the XML-only API for me).

Solr Explorer - A generic GWT client for Solr
---------------------------------------------

Key: SOLR-1163
URL: https://issues.apache.org/jira/browse/SOLR-1163
Project: Solr
Issue Type: New Feature
Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness
Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch

The attached patch is a generic GWT client for Solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed Solr. It is configured with its own configuration file, where one can configure the Solr instance/core to connect to. Since it's currently standalone and completely client-side based, it uses JSON with padding (cross-site scripting) to connect to remote Solr servers.

Some of the supported features:
- Simple query search
- Sorting - one can dynamically define new sort criteria
- Search results are rendered very much like Google search results are rendered. It is also possible to view all stored field values for every hit.
- Custom hit rendering - It is possible to show thumbnails (images) per hit and also customize a view for a hit based on HTML templates
- Faceting - one can dynamically define field and query facets via the UI. It is also possible to pre-configure these facets in the configuration file.
- Highlighting - you can dynamically configure highlighting. It can also be pre-configured in the configuration file
- Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send build and reload commands.
- Data import handler - if used, it is possible to send a full-import and status command (delta-import is not implemented yet, but it's easy to add)
- Console - For development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to:
** View the client logs
** Browse the Solr schema
** View a breakdown of the current search context
** View a breakdown of the query URL that is sent to Solr
** View the raw JSON response returning from Solr

This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more...

To get a better view of what's currently possible, we've set up a public version of this client at: http://search.jteam.nl/explorer. This client is configured with one Solr instance where crawled YouTube movies were indexed. You can also check out a screencast for this deployed client: http://search.jteam.nl/help

The patch creates a new folder in the contrib directory. Since the patch doesn't contain binaries, an additional zip file is provided that needs to be extracted to add all the required graphics. This module is maven2 based and is configured in such a way that all GWT related tools/libraries are automatically downloaded when the module is compiled. One of the artifacts of the build is a war file which can be deployed in any servlet container.

NOTE: this client works best on WebKit based browsers (for performance reasons) but also works on Firefox and IE 7+. That said, it should be taken into account that it is still under development.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
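The description above says the standalone client talks to remote Solr servers via JSON with padding. As a rough illustration only (this is not code from the patch), the request URL for such a call could be assembled as below. The `wt=json` and `json.wrf` parameters are Solr's JSON response writer options; the class name `JsonpQueryUrl`, the `/select` path, the host and the callback name are assumptions made for this sketch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the SOLR-1163 patch.
class JsonpQueryUrl {

    // URL-encode a single parameter value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    // Build a select URL whose JSON response is wrapped in a call to `callback`.
    static String build(String coreUrl, String query, String callback) {
        String base = coreUrl.endsWith("/")
                ? coreUrl.substring(0, coreUrl.length() - 1)
                : coreUrl;
        return base + "/select"
                + "?q=" + encode(query)
                + "&wt=json"                        // ask for the JSON response writer
                + "&json.wrf=" + encode(callback);  // wrap the JSON in a function call
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr", "title:solr", "handleResults"));
    }
}
```

A browser loads such a URL through a script tag, which is why no server-side component is needed. A proxy servlet on the same origin would make the wrapper function unnecessary and would also allow the XML-only handlers to be used.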
[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854664#action_12854664 ]

Uri Boness commented on SOLR-1163:
----------------------------------

The only downside to this is that it requires extra setup. A lot of people (incl. myself) like to use the bundled Jetty instance for development and only deploy Solr in a different servlet container in production. In that sense, it would be nice to get something ready out of the box with the Solr distribution (or at least that it would be easy to set it up with the examples directory).
[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12854155#action_12854155 ]

Uri Boness commented on SOLR-1163:
----------------------------------

Working on a new, improved patch for the explorer. But I'm in a bit of a dilemma here regarding exactly how it should integrate with Solr. There are basically 3 options:

1. Tight integration, where the explorer will be bound to each core and there will be a dedicated URL for it (say /corename/explorer). This is nice as the user gets this functionality out of the box, but on the other hand, I'm not sure users want it to be there out of the box (most of the time, if not always, the explorer will not be used as the final UI, but more as a temporary one, just to have something up and running... in production I can imagine users will not need it). This tight integration also means quite a lot of changes to the current configuration: first, the dispatch filter will need to change a bit, but also a default request handler will need to be defined for all cores.

2. The other option is to keep the explorer as an external tool. The idea is to have it as a separate war file which can be deployed in the same servlet container as Solr. I'm working on removing the current XML configuration and making it more dynamic. So when the user enters the application, she can configure a core by following a wizard-like process... this wizard will create a configuration which will be saved on the server for future logins.

3. Well, the third option is just to leave things as they are now. That is, there is one configuration file which defines all the Solr cores the explorer can communicate with. This configuration file is loaded when the web page is loaded. Like option 2, this is also a standalone mode.

Any comments?
[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr
[ https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852813#action_12852813 ]

Uri Boness commented on SOLR-773:
---------------------------------

Grant, I started looking at SOLR-1298 yesterday. The idea is to somehow merge all the related issues (there are currently two open issues for the same purpose with two different patches). But this should be done in a somewhat collaborative manner so everybody will be on the same page here, also regarding the discussion about the different approaches (inline the pseudo fields or have them nested in a separate meta element). Is there some way to merge the issues? Or perhaps mark one of them as a duplicate, so the discussion will be centralized.

Incorporate Local Lucene/Solr
-----------------------------

Key: SOLR-773
URL: https://issues.apache.org/jira/browse/SOLR-773
Project: Solr
Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
Fix For: 1.5
Attachments: exampleSpatial.zip, lucene-spatial-2.9-dev.jar, lucene.tar.gz, screenshot-1.jpg, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-spatial_solr.patch, SOLR-773.patch, SOLR-773.patch, solrGeoQuery.tar, spatial-solr.tar.gz

Local Lucene has been donated to the Lucene project. It has some Solr components, but we should evaluate how best to incorporate it into Solr.
See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene
[jira] Issue Comment Edited: (SOLR-773) Incorporate Local Lucene/Solr
[ https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12852813#action_12852813 ]

Uri Boness edited comment on SOLR-773 at 4/2/10 1:23 PM:
---------------------------------------------------------

Grant, I started looking at SOLR-1298 yesterday. The idea is to somehow merge all the related issues (there are currently two open issues for the same purpose with two different patches). But this should be done in a somewhat collaborative manner so everybody will be on the same page here, also regarding the discussion about the different approaches (inline the pseudo fields or have them nested in a separate meta element). Is there some way to merge the issues? Or perhaps mark one of them as a duplicate, so the discussion will be centralized.

btw, the other duplicate issue is SOLR-1566
[jira] Created: (SOLR-1829) Cleaned up analysis.jsp - removed all token API scriptlets
Cleaned up analysis.jsp - removed all token API scriptlets
----------------------------------------------------------

Key: SOLR-1829
URL: https://issues.apache.org/jira/browse/SOLR-1829
Project: Solr
Issue Type: Improvement
Components: web gui
Affects Versions: 1.4
Reporter: Uri Boness
Fix For: 1.5, 1.6, 3.1

The analysis.jsp was polluted with the old token stream API in scriptlets all over the place. Since the introduction of the FieldAnalysisRequestHandler, there's no need to keep this mess. Instead, the page can just call the analysis request handler with parameters generated by the form and display the XML response the same way as it is displayed at the moment. Moreover, it will save some work when updating the code base to the new token stream API.
[jira] Updated: (SOLR-1829) Cleaned up analysis.jsp - removed all token API scriptlets
[ https://issues.apache.org/jira/browse/SOLR-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uri Boness updated SOLR-1829:
-----------------------------

Attachment: SOLR-1829.patch

This patch uses jQuery to generate the proper requests to the field analysis request handler, then applies an XSL transformation on the response to render it appropriately. It also updates the jQuery version to 1.4.2.

The UI is *slightly* different for simplicity, but also a bit enhanced. Now when choosing to analyze by field type/name, a drop-down with all possible types/names will be populated.

In order to support all functionality of the analysis.jsp, the FieldAnalysisRequestHandler and co. had to be enhanced. They now accept an analysis.verbose parameter which dumps more information about the tokenizer/filter. The verbose format differs a bit from the non-verbose one, but that will not break BWC: when this parameter is not used, the old format is returned.
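The update above describes the page sending requests to the field analysis request handler, optionally with the new analysis.verbose parameter. The patch itself does this with jQuery; the Java sketch below only illustrates what such a request URL might look like. The `analysis.fieldtype` and `analysis.fieldvalue` parameters belong to the FieldAnalysisRequestHandler, and `analysis.verbose` is the parameter proposed in this patch. The `/analysis/field` path, the host, the field type name and the class name `FieldAnalysisUrl` are assumptions for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the SOLR-1829 patch.
class FieldAnalysisUrl {

    // URL-encode a single parameter value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Build an analysis request for one field type and one input value.
    static String build(String coreUrl, String fieldType, String value, boolean verbose) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("analysis.fieldtype", fieldType);
        params.put("analysis.fieldvalue", value);
        if (verbose) {
            // Only sent on request, so the default response keeps the old format.
            params.put("analysis.verbose", "true");
        }

        StringBuilder url = new StringBuilder(coreUrl).append("/analysis/field");
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(param.getKey())
               .append('=')
               .append(encode(param.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr", "text", "Hello World", true));
    }
}
```

Leaving the verbose parameter out by default mirrors the backward-compatibility point made in the update: existing callers that never send it keep getting the old format.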
[jira] Commented: (SOLR-1716) Add logging support for ScriptTransformer
[ https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12841776#action_12841776 ]

Uri Boness commented on SOLR-1716:
----------------------------------

bq. ScriptTransformer supports Rhino, Groovy, Scala, etc. Someday even Erjang. They probably all have such things, but a consistent debugging allows better debugging tools.

True. Using the logger will print to the same Solr log files. Indeed it's great for debugging, but when you have fancy complex logic in the scripts, general purpose logging (e.g. INFO, ERROR, TRACE) should also be considered.

Add logging support for ScriptTransformer
-----------------------------------------

Key: SOLR-1716
URL: https://issues.apache.org/jira/browse/SOLR-1716
Project: Solr
Issue Type: Improvement
Components: contrib - DataImportHandler
Reporter: Uri Boness
Fix For: 1.5
Attachments: SOLR-1716.patch, SOLR-1716.patch

Currently it's very hard to debug the logic embedded in the script run by the ScriptTransformer. There should be a possibility to add a logger to the function signature, which can be used for logging.
[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805601#action_12805601 ]

Uri Boness commented on SOLR-1163:
----------------------------------

Actually I've been working on a new version for the explorer which I plan to put soon as a patch here.
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805612#action_12805612 ]

Uri Boness commented on SOLR-1725:
----------------------------------

{quote}
Performance: It looks like scripts are read from the resource loader and parsed again (eval) for every update request. This can be pretty expensive, esp for those scripting languages that generate java class files instead of using an interpreter. One way to combat this would be to cache and reuse them.
{quote}

Yes, indeed the scripts are evaluated per request, but for a reason. One of the goals here is to keep the scripts as close as possible to the update processor interface, so the functions in the scripts have the same signature as the methods in the processor. But in order for the scripts to be flexible I decided to introduce some globally scoped variables which are accessible in the functions (currently the current Solr request, the response and a logger are there). The problem is that the API only defines 3 scopes where you can register variables, and the lowest one is the engine itself. Since the evaluation of a script is done on the engine level as well, when using this API together with the global variables I don't think you can escape the need for creating an engine per request (and thus also evaluating the scripts). But I agree with you that if there is a way around it, caching the evaluated/compiled scripts will definitely boost things up. I'll need to investigate this further and come up with alternatives (I already have some ideas using ThreadLocals).

bq. Should we have a way to specify a script in-line (in solrconfig.xml)?

Personally I prefer keeping the solrconfig.xml as clean as possible. I do however think that a standardization of Solr scripting support in general can be great (for example, have a scripts folder under _solr.solr.home_ where all the scripts are placed, or come up with a standard configuration structure for the scripts... perhaps something in the direction Hoss suggested above).

bq. This seems to raise the visibility of the UpdateCommand classes, directly exposing them to users w/o plugins. We should perhaps consider interface cleanups on these classes at the same time as this issue.

+1

bq. Examples! Using javascript (since it's both fast and included in JDK6), let's see what the scripts are for some common usecases. This both helps improve the design as well as lets other people give feedback w/o having to read through code.

Yep.. that would probably be very helpful. Basically I think anyone who's ever written an update processor can perhaps try to convert it to a script and see how it works. The usual use case for me is to just add a few fields which are derived from the other fields, but perhaps there are some other more interesting use cases out there. I guess these examples should be put in the Wiki, right?

Script based UpdateRequestProcessorFactory
------------------------------------------

Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch

A script based UpdateRequestProcessorFactory (uses JDK6 script engine support). The main goal of this plugin is to be able to configure/write update processors without the need to write and package Java code.

The update request processor factory enables writing update processors in scripts located in the {{solr.solr.home}} directory. The factory accepts one (mandatory) configuration parameter named {{scripts}} which accepts a comma-separated list of file names. It will look for these files under the {{conf}} directory in solr home. When multiple scripts are defined, their execution order is defined by the lexicographical order of the script file names (so {{scriptA.js}} will be executed before {{scriptB.js}}).

The script language is resolved based on the script file extension (that is, a *.js file will be treated as a JavaScript script), therefore an extension is mandatory.

Each script file is expected to have one or more methods with the same signature as the methods in the {{UpdateRequestProcessor}} interface. It is *not* required to define all methods, only those that are required by the processing logic.

The following variables are defined as global variables for each script:
* {{req}} - The SolrQueryRequest
* {{rsp}} - The SolrQueryResponse
* {{logger}} - A logger that can be used for logging purposes in the script
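The issue description above states two rules: scripts run in lexicographical order of their file names, and the language is picked from the mandatory file extension. The sketch below restates those two rules in Java. It is not taken from the patch; the class name `ScriptFiles` and its methods are hypothetical. In a real implementation the extension would presumably be handed to something like `javax.script.ScriptEngineManager.getEngineByExtension`, which is left out here so the example does not depend on a script engine being installed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the SOLR-1725 patch.
class ScriptFiles {

    // Split the comma-separated "scripts" parameter and sort the names
    // lexicographically, which is the execution order described in the issue.
    static List<String> executionOrder(String scriptsParam) {
        List<String> names = new ArrayList<String>();
        for (String part : scriptsParam.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        Collections.sort(names);
        return names;
    }

    // Return the file extension used to pick the script language.
    // An extension is mandatory, so a name without one is rejected.
    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("Script file has no extension: " + fileName);
        }
        return fileName.substring(dot + 1);
    }

    public static void main(String[] args) {
        for (String name : executionOrder("scriptB.js, scriptA.js")) {
            System.out.println(name + " -> " + extension(name));
        }
    }
}
```

Note that plain lexicographical ordering is case-sensitive and compares character by character, so a name such as script10.js sorts before script2.js.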
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805672#action_12805672 ]

Uri Boness commented on SOLR-1725:
----------------------------------

Been looking more into it and I think there's a nice way in which we can cache the evaluated scripts. But... (and there's always a but) to make it work cleanly we need to be able to extend the scripting support, which means we need to be able to compile the code in Java 6. And this brings us back to Mark's comment above on how do we want to do that.
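The comment above mentions caching the evaluated scripts without saying how. The sketch below shows only the general idea of compiling each script once and reusing the result across requests; it is an assumption made for illustration, not the approach Uri had in mind, and it uses Java 8 features for brevity. The class name `ScriptCache` and the compile function are hypothetical. With `javax.script`, the compile step could be `Compilable.compile`, with per-request variables supplied through a `Bindings` object at evaluation time.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache, not part of the SOLR-1725 patch.
class ScriptCache<T> {

    private final ConcurrentMap<String, T> compiled = new ConcurrentHashMap<>();
    private final Function<String, T> compiler;
    private final AtomicInteger compileCount = new AtomicInteger();

    // `compiler` turns a script file name into its compiled form.
    ScriptCache(Function<String, T> compiler) {
        this.compiler = compiler;
    }

    // Return the compiled script, compiling it only on first use.
    T get(String fileName) {
        return compiled.computeIfAbsent(fileName, name -> {
            compileCount.incrementAndGet();
            return compiler.apply(name);
        });
    }

    // Number of times the compiler has actually been invoked.
    int compileCount() {
        return compileCount.get();
    }

    public static void main(String[] args) {
        ScriptCache<String> cache = new ScriptCache<>(name -> "compiled:" + name);
        cache.get("scriptA.js");
        cache.get("scriptA.js");
        System.out.println(cache.compileCount());
    }
}
```

A cache like this never notices when a script file changes on disk, so a real implementation would need some invalidation strategy, for example clearing it on core reload.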
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805691#action_12805691 ]
Uri Boness commented on SOLR-1725:
Well then... I just hope others will not shed tears as well and we can make Solr 1.5 Java 6 compiled :-)
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch
A script based UpdateRequestProcessorFactory (uses JDK6 script engine support). The main goal of this plugin is to be able to configure/write update processors without the need to write and package Java code.
The update request processor factory enables writing update processors in scripts located in the {{solr.solr.home}} directory. The factory accepts one (mandatory) configuration parameter named {{scripts}}, which accepts a comma-separated list of file names. It will look for these files under the {{conf}} directory in solr home. When multiple scripts are defined, their execution order is defined by the lexicographical order of the script file names (so {{scriptA.js}} will be executed before {{scriptB.js}}).
The script language is resolved based on the script file extension (that is, a *.js file will be treated as a JavaScript script), therefore an extension is mandatory.
Each script file is expected to have one or more methods with the same signature as the methods in the {{UpdateRequestProcessor}} interface. It is *not* required to define all methods, only those that are required by the processing logic.
The following variables are defined as global variables for each script:
* {{req}} - The SolrQueryRequest
* {{rsp}} - The SolrQueryResponse
* {{logger}} - A logger that can be used for logging purposes in the script
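The parsing of the {{scripts}} parameter and the extension rule described above can be sketched roughly as follows. This is only an illustration and not the code from the patch: the class name ScriptNames and both method names are hypothetical, and trimming whitespace around each file name is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the patch: handles the "scripts" parameter. */
final class ScriptNames {

    /** Splits a comma-separated list of file names, trimming whitespace and skipping empty entries. */
    static List<String> parse(String scriptsParam) {
        List<String> names = new ArrayList<>();
        for (String part : scriptsParam.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Returns the extension used to pick the script language; fails if there is none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("Script file has no extension: " + fileName);
        }
        return fileName.substring(dot + 1);
    }

    public static void main(String[] args) {
        for (String name : parse("scriptA.js, scriptB.js")) {
            System.out.println(name + " -> " + extensionOf(name));
        }
    }
}
```

Failing early on a missing extension mirrors the statement that an extension is mandatory, since without it there is nothing to resolve the script language from.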
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805235#action_12805235 ]
Uri Boness commented on SOLR-1725:
Lance, I lost you a bit as well.
bq. Uri, I'd prefer if the manner of configuration was as similar as possible, i.e. if we could get rid of the <lst name="params"> part, and instead pass all top-level params directly to the script (except the scripts param itself).
Hmm... personally I prefer configurations that clearly indicate their purpose. Leaving out the _params_ list will make things a bit confusing - some parameters are available for the scripts, others are not... it's not really clear.
bq. manner of configuration was as similar as possible
The configurations are similar. All elements in solrconfig.xml have one standard way of configuration, which can be anything from a _lst_, _bool_, _str_, etc. Tomorrow a new processor will pop up which will also require a _lst_ configuration... and that's fine.
bq. Even better if the definition of a processor was in a separate xml section and then refer by name only in each chain, but that is a bigger change outside scope of this patch.
Well, indeed that's a bigger change. Like everything, this kind of configuration has its pros and cons. I guess it's best if people just state their preferences regarding how they would like to see this processor configured, and based on that I'll adjust the patch.
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805254#action_12805254 ]
Uri Boness commented on SOLR-1725:
bq. 1) what is the value add in making ScriptUpdateProcessorFactory support multiple scripts ? ... wouldn't it be simpler to require that users declare multiple instances of ScriptUpdateProcessorFactory (that the processor chain already executes in sequence) than to add sequential processing to the ScriptUpdateProcessor?
Well... to my taste it makes the configuration cleaner (no need to define several script processors). The thing is, you have the choice here - either specify several scripts (comma separated) or split them into several processors.
bq. 2) The NamedList init args can be as deep of a data structure as you want, so something like this would be totally feasible (if desired) ...
That's definitely another option. The only thing is that you'd probably want some way to define shared parameters (shared between the scripts, that is) and not be forced to specify them several times, once for each script. I guess you can do something like this:
{code}
<processor class="solr.ScriptUpdateProcessorFactory">
  <lst name="sharedParams">
    <bool name="paramName">true</bool>
  </lst>
  <lst name="scripts">
    <lst name="updateProcessor1.js">
      <bool name="someParamName">true</bool>
      <int name="someOtherParamName">3</int>
    </lst>
    <lst name="updateProcessor2.js">
      <bool name="fooParam">true</bool>
      <str name="barParam">3</str>
    </lst>
  </lst>
  <lst name="otherProcessorOPtionsIfNeeded">
    ...
  </lst>
</processor>
{code}
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch
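One possible reading of the sharedParams idea in the comment above, sketched in Java. Plain maps stand in for Solr's NamedList here, the class and method names are hypothetical, and the rule that a script-specific value overrides a shared value of the same name is an assumption that the comment does not state.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: plain maps stand in for Solr's NamedList init args. */
final class ScriptParams {

    /**
     * Returns the parameters one script would see: the shared values first,
     * then the script's own values, which replace shared values of the same name.
     */
    static Map<String, Object> effectiveParams(Map<String, Object> shared,
                                               Map<String, Map<String, Object>> perScript,
                                               String scriptName) {
        Map<String, Object> merged = new LinkedHashMap<>(shared);
        Map<String, Object> own = perScript.get(scriptName);
        if (own != null) {
            merged.putAll(own);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> shared = new LinkedHashMap<>();
        shared.put("paramName", true);

        Map<String, Object> first = new LinkedHashMap<>();
        first.put("someParamName", true);
        first.put("someOtherParamName", 3);

        Map<String, Map<String, Object>> perScript = new LinkedHashMap<>();
        perScript.put("updateProcessor1.js", first);

        System.out.println(effectiveParams(shared, perScript, "updateProcessor1.js"));
    }
}
```

A script with no section of its own simply receives the shared parameters, which is the case the shared list is meant to cover.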
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12804092#action_12804092 ]
Uri Boness commented on SOLR-1725:
If we move this to the contrib, we'll need to extract the script engine abstraction to a separate contrib utils library (so the DIH will be able to utilize it). I believe this can create a bit of a mess just for this small (though useful) functionality. Is there a real reason for not keeping it in the core? I think either way, if people want to use it they'll need to read somewhere how... I think it'd be nice to save them the extra effort of putting an extra jar file in the lib directory - the configuration (writing the script and configuring the update processors) they'll need to adjust anyway. The only thing that we must stress in the documentation (both in the schema and in the wiki) is that they can only use this feature in Java 6.
Two additional things to note:
1. JDK 5 has reached the end of service life (EOSL) already and is not actively supported by Sun (/Oracle).
2. The general recommendation is to run Solr on Java 6 anyway (due to some threading issues in Java 5).
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12804094#action_12804094 ]
Uri Boness commented on SOLR-1725:
bq. Would it make more sense to execute the scripts in the order they are named in the scripts param? If I have two pipelines/chains, that need to use the same scripts but in different orders, I'm in trouble.
Absolutely! The reason why it is currently lexicographically ordered is an initial (different) implementation that I had. I'll change it and add a patch for it.
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch
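The difference between the two orderings discussed in this comment can be illustrated with a small Java sketch. The class and method names are hypothetical and this is not the code from the patch; it only contrasts sorting the file names with keeping them as declared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch contrasting the two ordering strategies discussed above. */
final class ScriptOrder {

    /** Order described in the issue text: sort the file names lexicographically. */
    static List<String> lexicographic(List<String> declared) {
        List<String> sorted = new ArrayList<>(declared);
        Collections.sort(sorted);
        return sorted;
    }

    /** Order proposed in the comment: keep the order in which the names were declared. */
    static List<String> declaredOrder(List<String> declared) {
        return new ArrayList<>(declared);
    }

    public static void main(String[] args) {
        // A chain that wants scriptB.js to run before scriptA.js.
        List<String> chain = Arrays.asList("scriptB.js", "scriptA.js");
        System.out.println(lexicographic(chain));
        System.out.println(declaredOrder(chain));
    }
}
```

With lexicographic ordering, two chains that list the same scripts always run them in the same sequence, which is the problem raised in the quoted comment; keeping the declared order lets each chain choose its own sequence.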
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12804114#action_12804114 ]
Uri Boness commented on SOLR-1725:
bq. oh and one other thing - I really like this patch, Uri! I'm looking to integrate it into a data processing project here at JPL. Great idea!
Thanks :-)
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802496#action_12802496 ]
Uri Boness commented on SOLR-1725:
The DIH ScriptTransformer can really be cleaned up using this patch as well. I didn't add it to this patch as I didn't know whether it was a good idea to put too much into one patch.
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch
[jira] Created: (SOLR-1725) Script based UpdateRequestProcessorFactory
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
A script based UpdateRequestProcessorFactory (uses JDK6 script engine support). The main goal of this plugin is to be able to configure/write update processors without the need to write and package Java code.
The update request processor factory enables writing update processors in scripts located in the {{solr.solr.home}} directory. The factory accepts one (mandatory) configuration parameter named {{scripts}}, which accepts a comma-separated list of file names. It will look for these files under the {{conf}} directory in solr home. When multiple scripts are defined, their execution order is defined by the lexicographical order of the script file names (so {{scriptA.js}} will be executed before {{scriptB.js}}).
The script language is resolved based on the script file extension (that is, a *.js file will be treated as a JavaScript script), therefore an extension is mandatory.
Each script file is expected to have one or more methods with the same signature as the methods in the {{UpdateRequestProcessor}} interface. It is *not* required to define all methods, only those that are required by the processing logic.
The following variables are defined as global variables for each script:
* {{req}} - The SolrQueryRequest
* {{rsp}} - The SolrQueryResponse
* {{logger}} - A logger that can be used for logging purposes in the script
[jira] Updated: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uri Boness updated SOLR-1725:
Attachment: SOLR-1725.patch
Initial implementation. Includes a simple test (probably more tests are required). Builds a script engine per script file - each file has its own scope.
This patch also introduces a new interface - {{SolrResourceLoaderAware}} - which can be used by any plugin loaded by SolrCore. (Any plugin implementing this interface will be injected with the resource loader of the SolrCore.) The ScriptUpdateRequestProcessorFactory uses the resource loader to load the scripts from the solr home conf directory.
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch
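The "script engine per script file" approach, together with optional methods and script globals, might look roughly like the following when written against the standard javax.script API. This is a sketch under assumptions and not the code from the patch: the class ScriptRunner and the method invokeIfDefined are hypothetical, System.out stands in for the {{logger}} global, and processAdd is used only as an example function name.

```java
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

/** Hypothetical sketch of "one engine per script file"; not the code from the patch. */
final class ScriptRunner {

    /**
     * Evaluates the script source in its own engine, exposes one global, and
     * calls the named function if the script defines it. Returns false when no
     * engine is registered for the extension or the function is absent.
     */
    static boolean invokeIfDefined(String extension, String source,
                                   String function, Object arg) throws ScriptException {
        // A fresh engine per script keeps each file's variables in its own engine scope.
        ScriptEngine engine = new ScriptEngineManager().getEngineByExtension(extension);
        if (engine == null) {
            return false; // for example, a JDK that ships without a JavaScript engine
        }
        engine.put("logger", System.out); // stand-in for the logger global
        engine.eval(source);
        if (!(engine instanceof Invocable)) {
            return false;
        }
        try {
            ((Invocable) engine).invokeFunction(function, arg);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // scripts are not required to define every method
        }
    }

    public static void main(String[] args) throws ScriptException {
        String source = "function processAdd(cmd) { logger.println('processAdd: ' + cmd); }";
        System.out.println(invokeIfDefined("js", source, "processAdd", "doc-1"));
        System.out.println(invokeIfDefined("js", source, "processCommit", "commit"));
    }
}
```

Whether a JavaScript engine is available depends on the JDK in use, so the sketch treats a missing engine as "nothing to call" rather than as an error; a real factory would more likely fail at startup in that case.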
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801983#action_12801983 ]
Uri Boness commented on SOLR-1725:
bq. What about the existing ResourceLoaderAware?
Woops... missed that one out :-)... I'll check it out and update the patch
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801988#action_12801988 ]
Uri Boness commented on SOLR-1725:
Is there any reason for currently limiting the classes that can be ResourceLoaderAware? This limitation is explicit in SolrResourceLoader (line: 584):
{code}
awareCompatibility.put(
  ResourceLoaderAware.class, new Class[] {
    CharFilterFactory.class,
    TokenFilterFactory.class,
    TokenizerFactory.class,
    FieldType.class
  }
);
{code}
If the type is not one of these classes, an exception is thrown.
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch
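The kind of check implied by the awareCompatibility snippet above can be sketched in isolation as follows. Everything here is a hypothetical stand-in: the types ending in "Like" only imitate the roles of the Solr classes, and the lookup logic is an assumption about how such a map could be used, not a copy of SolrResourceLoader.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry imitating the compatibility check; not Solr's implementation. */
final class AwareRegistry {
    private final Map<Class<?>, Class<?>[]> awareCompatibility = new HashMap<>();

    /** Registers the base types that are allowed to implement the given *Aware interface. */
    void allow(Class<?> awareInterface, Class<?>... allowedBaseTypes) {
        awareCompatibility.put(awareInterface, allowedBaseTypes);
    }

    /** Throws if the plugin is not an instance of one of the allowed base types. */
    void assertCompatible(Class<?> awareInterface, Object plugin) {
        Class<?>[] allowed = awareCompatibility.get(awareInterface);
        if (allowed != null) {
            for (Class<?> base : allowed) {
                if (base.isInstance(plugin)) {
                    return;
                }
            }
        }
        throw new IllegalStateException(plugin.getClass().getName()
                + " may not be " + awareInterface.getSimpleName());
    }

    public static void main(String[] args) {
        AwareRegistry registry = new AwareRegistry();
        registry.allow(ResourceLoaderAwareLike.class, TokenizerFactoryLike.class);
        registry.assertCompatible(ResourceLoaderAwareLike.class, new TokenizerFactoryLike());
        // Adding one more class to the list is what lets a new plugin type pass the check.
        registry.allow(ResourceLoaderAwareLike.class,
                TokenizerFactoryLike.class, UpdateProcessorFactoryLike.class);
        registry.assertCompatible(ResourceLoaderAwareLike.class, new UpdateProcessorFactoryLike());
        System.out.println("both plugins accepted");
    }
}

/** Hypothetical stand-ins for the real Solr types. */
interface ResourceLoaderAwareLike {}
class TokenizerFactoryLike {}
class UpdateProcessorFactoryLike {}
```

In this shape, supporting a new kind of aware plugin means extending the allowed list, which matches the observation that a type outside the list causes an exception.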
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801994#action_12801994 ]
Uri Boness commented on SOLR-1725:
Right... ok... I'll add another class to this list (I just don't understand why you would want to limit the types that can be *Aware)
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch
[jira] Issue Comment Edited: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12801994#action_12801994 ]
Uri Boness edited comment on SOLR-1725 at 1/18/10 11:52 PM:
Right... ok... I'll add another class to this list (I just don't understand why you would want to limit the types that can be *Aware - in a way it defeats the whole idea of the *Aware abstraction).
was (Author: uboness):
Right... ok... I'll add another class to this list (I just don't understand why you would want to limit the types that can be *Aware)
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch
[jira] Updated: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uri Boness updated SOLR-1725:
Attachment: SOLR-1725.patch
A new patch, this time leveraging the already existing ResourceLoaderAware interface
Script based UpdateRequestProcessorFactory
Key: SOLR-1725
URL: https://issues.apache.org/jira/browse/SOLR-1725
Project: Solr
Issue Type: New Feature
Components: update
Affects Versions: 1.4
Reporter: Uri Boness
Attachments: SOLR-1725.patch, SOLR-1725.patch
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802001#action_12802001 ] Uri Boness commented on SOLR-1725: -- Thanks for the reference
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802022#action_12802022 ] Uri Boness commented on SOLR-1725: -- Yes, it depends on Java 6. I guess the concern is mainly for the unit tests? (At runtime it shouldn't really matter.)
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802024#action_12802024 ] Uri Boness commented on SOLR-1725: -- Sorry... of course it matters for the build :-)
[jira] Updated: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1725: - Attachment: SOLR-1725.patch Third try :-), this time Java 5 compatible
[jira] Updated: (SOLR-1716) Add logging support for ScriptTransformer
[ https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1716: - Attachment: SOLR-1716.patch This patch puts the logger in the global scope, so you don't have to specify the logger as part of the function signature. It also cleans up the code of the related classes a bit. Add logging support for ScriptTransformer - Key: SOLR-1716 URL: https://issues.apache.org/jira/browse/SOLR-1716 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Reporter: Uri Boness Fix For: 1.5 Attachments: SOLR-1716.patch, SOLR-1716.patch Currently it's very hard to debug the logic embedded in the script run by the ScriptTransformer. There should be a possibility to add a logger to the function signature, which can be used for logging. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1716) Add logging support for ScriptTransformer
[ https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799186#action_12799186 ] Uri Boness commented on SOLR-1716: -- There is still one thing to improve here. Right now a ScriptTransformer is created for each function in the script, which means a script engine is created for each function. This can be optimized by creating one script engine per EntityProcessor, which each ScriptTransformer then uses to execute its dedicated function.
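The optimization suggested above (one engine per entity, shared by all of its transformers) could look roughly like this. It is a sketch only: {{createEngine}} is a stand-in for an expensive script engine, and none of these names come from DataImportHandler.

```javascript
// Counts how many (pretend) engines were built, to make the sharing visible.
let enginesCreated = 0;

// Stand-in for a script engine: it holds the script's top-level functions and
// invokes them by name. In the real code this would be a javax.script engine.
function createEngine(functions) {
  enginesCreated += 1;
  return {
    invoke(name, ...args) {
      const fn = functions[name];
      if (typeof fn !== "function") {
        throw new Error("No such function: " + name);
      }
      return fn(...args);
    },
  };
}

// One engine per entity; every transformer is just a thin wrapper that asks
// the shared engine to run its dedicated function.
function createTransformers(functions, functionNames) {
  const engine = createEngine(functions);
  return functionNames.map((name) => (row) => engine.invoke(name, row));
}

// Hypothetical script with two transformer functions.
const script = {
  trimTitle(row) { row.title = row.title.trim(); return row; },
  addSource(row) { row.source = "import"; return row; },
};

const transformers = createTransformers(script, ["trimTitle", "addSource"]);
const row = transformers.reduce((r, t) => t(r), { title: "  Solr  " });
console.log(row.title, row.source, enginesCreated);
```

Both transformers run against the same engine, so the counter stays at one however many functions the script defines.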
[jira] Created: (SOLR-1716) Add logging support for ScriptTransformer
Add logging support for ScriptTransformer - Key: SOLR-1716 URL: https://issues.apache.org/jira/browse/SOLR-1716 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler Reporter: Uri Boness Fix For: 1.5 Currently it's very hard to debug the logic embedded in the script run by the ScriptTransformer. There should be a possibility to add a logger to the function signature, which can be used for logging. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1698) load balanced distributed search
[ https://issues.apache.org/jira/browse/SOLR-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798851#action_12798851 ] Uri Boness commented on SOLR-1698: -- I think the patch doesn't work. I just checked out trunk, and applying the patch fails with a conflict in LBHttpSolrServer.java. load balanced distributed search Key: SOLR-1698 URL: https://issues.apache.org/jira/browse/SOLR-1698 Project: Solr Issue Type: Improvement Reporter: Yonik Seeley Attachments: SOLR-1698.patch, SOLR-1698.patch, SOLR-1698.patch, SOLR-1698.patch Provide syntax and implementation of load-balancing across shard replicas. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1698) load balanced distributed search
[ https://issues.apache.org/jira/browse/SOLR-1698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798870#action_12798870 ] Uri Boness commented on SOLR-1698: -- yep.. that works
[jira] Commented: (SOLR-1716) Add logging support for ScriptTransformer
[ https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798884#action_12798884 ] Uri Boness commented on SOLR-1716: -- Yeah, I thought about the global context as well, but this was just something that I had implemented anyway (as I needed it myself) and it works. You don't have to supply the logger, but if you do, you need to specify the full function signature, that is:
{code}
function(row, context, logger) {
}
{code}
but the following will work as well:
{code}
function(row, context) {
}
{code}
and
{code}
function(row) {
}
{code}
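To make the three signatures concrete, here is a small sketch. How the patch itself dispatches is not shown in this thread; the sketch simply relies on the fact that JavaScript ignores surplus arguments, so a caller can always pass row, context and logger. The stub logger, the {{entityName}} property and the function names are all invented for the example.

```javascript
// Stub logger standing in for the one handed to the script (hypothetical shape).
const messages = [];
const logger = { info: (msg) => messages.push(msg) };

// Full signature: row, context and logger.
function withLogger(row, context, logger) {
  logger.info("transforming row " + row.id);
  return row;
}

// Without the logger.
function withContext(row, context) {
  row.entity = context.entityName; // entityName is a made-up context property
  return row;
}

// Row only.
function rowOnly(row) {
  row.seen = true;
  return row;
}

// The caller always passes all three arguments; functions that declare fewer
// parameters simply never look at the extra ones.
function runAll(transformers, row, context, logger) {
  return transformers.reduce((r, fn) => fn(r, context, logger), row);
}

const result = runAll(
  [withLogger, withContext, rowOnly],
  { id: 7 },
  { entityName: "item" },
  logger
);
console.log(result.entity, result.seen, messages.length);
```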
[jira] Commented: (SOLR-1716) Add logging support for ScriptTransformer
[ https://issues.apache.org/jira/browse/SOLR-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798896#action_12798896 ] Uri Boness commented on SOLR-1716: -- working on a new patch to put the logging in a global context (and cleaning up the code a bit)
[jira] Commented: (SOLR-1705) Move QueryConvertor into SpellCheckComponent configuration
[ https://issues.apache.org/jira/browse/SOLR-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12797049#action_12797049 ] Uri Boness commented on SOLR-1705: -- Wouldn't QueryTokenizer be a more appropriate name for this class? Move QueryConvertor into SpellCheckComponent configuration -- Key: SOLR-1705 URL: https://issues.apache.org/jira/browse/SOLR-1705 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Shalin Shekhar Mangar Priority: Minor Fix For: 1.5 QueryConvertor is a top level XML tag in solrconfig.xml but it is used by SpellCheckComponent only. Deprecate the current queryConvertor configuration and move it inside SpellCheckComponent configuration. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1602) Refactor SOLR package structure to include o.a.solr.response and move QueryResponseWriters in there
[ https://issues.apache.org/jira/browse/SOLR-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12795920#action_12795920 ] Uri Boness commented on SOLR-1602: --

I think it is very important to understand all sides here. I fully support Chris's attempts to clean up the code base, which rightfully involves moving classes from one package to another. In some cases such cleanups need to come at the cost of user comfort, since eventually the users also gain from them as the system as a whole becomes more robust, extensible and maintainable. The good thing is that, besides the deprecation issue, I believe there is a consensus about the required changes. So thumbs up Chris!!!

To deprecate or not to deprecate, that is the question. In a widely used library/framework/system with a large (or fast growing) install/user base such as Solr's, the common practice is *not* to just break BWC without giving the users some grace period in which they can adjust their deployments to the new changes. Sometimes it's absolutely necessary (such as in the case of bug fixes), but when it's not, it can have the opposite effect on the community from the one you want: instead of appreciating your improvements and seeing Solr as a product that improves with time, they come to see Solr as an inconsistent and sometimes even unreliable product. So from my experience with delivering goods for the users, especially in the open source world (and I do have quite a bit of experience in that respect with the Spring Framework), you always need to strive for 100% BWC in theory and ~95% BWC in practice (never less than 90% though). If you stick to that, I believe changes will be widely accepted as improvements rather than harassments.

But there's a catch here. In order to stick to these numbers, you have to adhere to two important conditions:

1. You need to have a rather solid architecture and code base to start with. If you don't, then naturally in the beginning you can expect many more extreme/major changes, which lead to quite a few BWC breaks (these will gradually be reduced as the architecture/codebase improves). Whether Solr meets this condition is open for debate... there are a lot of solid parts in Solr and quite a few parts where a complete rewrite is appropriate.

2. You need to have steady and short release cycles. This is one thing that Solr lacks... big time. In a 1 year release cycle, deprecating code means that for the next year (in some cases two years), the code base will be messy with deprecated classes all over the place. In that respect, I can definitely understand Chris's objection to deprecation, as the cleanup tasks that he's implementing may end up creating more mess (at least for a long while) than you had before the cleanup altogether. I believe that moving to shorter release cycles (including bug-fix releases) will greatly help in promoting deprecation in general.

(NOTE: just a small note about the first condition. One thing to take into account is that *every* piece of software reaches a point in time where it needs to be completely rewritten or at least go through a *major* refactoring/re-architecting phase. This can be caused by many factors, be it new technologies that are introduced or simply limitations of the architecture that were not foreseen. It's very important to understand and admit this fact - even from the user's point of view it's acceptable. What is not acceptable is when it happens too many times and too frequently.)

Bottom line, it's always a conflict between the user's point of view and the developer's point of view, and there needs to be a balance and understanding of both sides. Each side needs to understand the other and give in to some extent to create this balance. But to make it happen, the culture, environment and well defined policies need to be in place. Arguing endlessly about who's right here will never lead to a good outcome, simply because both sides are right and wrong at the same time. If you treat it as a black or white issue you'll end up losing something - either the users' trust or better software.

How about creating a proper release plan for the upcoming year, say a release every two months? Chris, if you have such a release schedule, will you feel more comfortable with deprecation?

Refactor SOLR package structure to include o.a.solr.response and move QueryResponseWriters in there --- Key: SOLR-1602 URL: https://issues.apache.org/jira/browse/SOLR-1602 Project: Solr Issue Type: Improvement Components: Response Writers Affects Versions: 1.2, 1.3, 1.4 Environment: independent of environment (code structure) Reporter: Chris A.
[jira] Commented: (SOLR-1298) FunctionQuery results as pseudo-fields
[ https://issues.apache.org/jira/browse/SOLR-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12794250#action_12794250 ] Uri Boness commented on SOLR-1298: --

{quote}I think they should be inline, as they are just values associated with a document. I think putting it in some other list is sticking too literally to what Lucene calls a field, which I don't think Solr has to do that. One could easily imagine a Solr component that brought in a database or other storage repository for supplementary fields and it should all be seamless to the client.{quote}

I definitely agree that one shouldn't see a field in Solr as a field in Lucene. That said, I think we do have a tendency to see a field in Solr as somehow bound to the Solr schema. One thing to notice is that we eventually end up with the same discussion regarding this feature in the context of different issues, be it highlighting or field collapsing. In some cases it feels just right to return the data as a field in a document; in others it feels right to return it as something else. It is true that when you interact with Solr directly (especially if you do it manually) you certainly know what queries you send, what functions you request and what you should expect in the result. But from experience, a lot of times you try to automate things a bit, and creating well structured and descriptive protocols is the safe way to enable that.

{quote}I don't want to have to go look it up in some other list while I am iterating over my results when all the other values I'm displaying/using are right there associated with the document.{quote}

Having a sub-section under each document still associates it with the document. The way I see it, it's like OOP... you can have a Person class that holds all the information of the person in it as primitive fields, or you can group related data, like address info, into a separate Address class.

{quote}That being said, it could be useful to add an attribute that indicates it is a generated name{quote}

That's one way to group fields together, but if you're already doing that, then why not go all the way? If you need to distinguish between generated and non-generated names, why not make it simpler and just separate the two into different lists? (To continue the line of analogies I started above :-)) it's like XML: you can have a single level hierarchy where each element defines attributes to relate it to other elements, but a more suitable solution would be to group all related elements under one parent element.

{quote}I'd even argue that highlighter results should be inline, too, but that is a different issue and a bigger can of worms since it has a well used API already.{quote}

In some cases it might be (well, it just is) more appropriate to have the highlighting inlined. In other cases it might not be possible, especially with some of the latest requests to have highlighting functionality available for arbitrary text loaded from anywhere (which I believe will lead to a highlighting component/requestHandler that will be independent of the query component).

{quote}Not saying this is right or wrong, but I think it would be useful to document here the rationale about why not to do it. Is it just b/c that method is expected to do, more or less, what the Lucene IndexSearcher does?{quote}

I guess so... I guess SolrIndexSearcher is in fact a Lucene IndexSearcher, which is the source of this association. In some ways I think it also relates a bit to the response structure (not directly, but conceptually)... if the IndexSearcher represents Lucene and the document contains fields coming from other sources as well, perhaps this functionality of gathering all these fields (/metadata ;-)) should be done at a higher level, where SolrIndexSearcher just serves as one field source. The main reason why Chris's patch puts this functionality in the doc() method of the SolrIndexSearcher is simply that it's the easiest and simplest solution right now... and I don't think there's anything wrong with that... simple is good! Even with this solution as it is, the field sources are still abstracted away in the form of a FieldValues or DocumentMutator, so architecture-wise I don't see how leaving it as is will compromise anything.

FunctionQuery results as pseudo-fields -- Key: SOLR-1298 URL: https://issues.apache.org/jira/browse/SOLR-1298 Project: Solr Issue Type: New Feature Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 Attachments: SOLR-1298-FieldValues.patch, SOLR-1298.patch It would be helpful if the results of FunctionQueries could be added as fields to a document. Couple of options here: 1. Run FunctionQuery
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12794252#action_12794252 ] Uri Boness commented on SOLR-236: -

{quote}If we are returning a number of documents (as opposed to a number of groups) to the user, how do they avoid splitting on a page in the middle of the group?{quote}

As far as I know (Martijn, correct me if I'm wrong), Martijn's patch returns the number of groups *and* documents, where each group is actually represented as a document. So in that sense, the total count applies to the result set as is (groups count as documents) and therefore pagination just works.

{quote}The only thing this algorithm can't do (related to pagination) is give the total number of documents after collapsing (and hence can't calculate the exact number of pages). This can be fine in many circumstances as long as the gui handles it (people don't seem to mind google doing it... I just tried it. Google didn't show the result count right unless displaying the last page).{quote}

First of all, I must admit that I never noticed that in Google, so I guess you're right :-). But when you think about it, with Google, how many times do you get a low hit count that only fits in 2-3 pages? Well, I hardly ever get it, and when I do I don't even bother to check the results; I just try to improve my search. With Solr, a lot of times it's different, especially when all these discovery features and faceting are so often used to narrow the search extensively... I'm not saying that not having a perfect pagination mechanism is a problem... not at all, I'm just saying that it *might* be an issue for specific use cases or specific domains, but that's just an assumption (or a gut feeling) :-)

Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch

This patch includes a new feature called field collapsing, used to collapse a group of results with a similar value for a given field into a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection. http://www.fastsearch.com/glossary.aspx?m=48amid=299

The implementation adds 3 new query parameters (SolrParams):
- collapse.field to choose the field used to group results
- collapse.type normal (default value) or adjacent
- collapse.max to select how many continuous results are allowed before collapsing

TODO (in progress):
- More documentation (on source code)
- Test cases

Two patches:
- field_collapsing.patch for current development version
- field_collapsing_1.1.0.patch for Solr-1.1.0

P.S.: Feedback and misspelling correction are welcome ;-) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
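For illustration, a request using the three parameters from the issue description could be put together as below. This is a sketch, not client code from any of the patches: {{buildCollapseQuery}}, the base URL, the query and the {{site}} field are all placeholders.

```javascript
// Build a query string carrying the collapse parameters named in the issue
// description. The helper name and the example values are assumptions.
function buildCollapseQuery(baseUrl, query, collapse) {
  const params = new URLSearchParams();
  params.set("q", query);
  params.set("collapse.field", collapse.field);
  // "normal" is the default type according to the description.
  params.set("collapse.type", collapse.type || "normal");
  if (collapse.max !== undefined) {
    params.set("collapse.max", String(collapse.max));
  }
  return baseUrl + "?" + params.toString();
}

const url = buildCollapseQuery(
  "http://localhost:8983/solr/select", // placeholder Solr URL
  "ipod",
  { field: "site", max: 2 }
);
console.log(url);
```

Leaving out {{type}} falls back to {{normal}}, and {{collapse.max}} is only sent when a value is given.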
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793554#action_12793554 ] Uri Boness commented on SOLR-236: -
bq. Why is it wrong. it is about adding meta-info to the docs. This is what we plan to do with SOLR-1566
This is exactly the point: it's not really metadata on the document, but on the group the document belongs to. And you also need a more obvious way to mark this document as a group representative (to distinguish it from other, normal documents).
bq. Even when we collapse what we are expecting is simple search results. So a drastic deviation from the standard format is not a good idea.
I definitely agree that BWC should be kept, especially here where we're dealing with a query component. But extending the current doc element doesn't mean we break BWC. Adding a collapse-info (or collapse-meta-data) sub-element to it will certainly not break anything, especially since we still don't have a formal XSD for the responses (I know we're working on it, but it's still not out there, so it's safe).
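A rough sketch of why an extra sub-element need not break existing clients is given below. The document shape is invented for the example: {{collapseInfo}} and its keys are stand-ins for the proposed collapse-info sub-element, not the patch's actual response format.

```javascript
// Two hits: the first represents a collapsed group, the second is a normal
// document. The collapseInfo property and its keys are hypothetical.
const docs = [
  {
    id: "1",
    title: "First hit",
    collapseInfo: { field: "site", value: "example.org", collapsedCount: 4 },
  },
  { id: "2", title: "Ordinary hit" },
];

// An older client only reads the fields it already knows about, so the extra
// sub-element is simply ignored.
function renderLegacy(doc) {
  return doc.id + ": " + doc.title;
}

// A newer client uses the sub-element to tell group representatives apart
// from normal documents.
function renderWithGroups(doc) {
  const info = doc.collapseInfo;
  if (!info) {
    return renderLegacy(doc);
  }
  return renderLegacy(doc) + " (+" + info.collapsedCount + " more from " + info.value + ")";
}

docs.forEach((doc) => console.log(renderWithGroups(doc)));
```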
Field collapsing
Key: SOLR-236
URL: https://issues.apache.org/jira/browse/SOLR-236
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
Fix For: 1.5
Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch

This patch includes a new feature called field collapsing, used to collapse a group of results with a similar value for a given field into a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also duplicate detection.
http://www.fastsearch.com/glossary.aspx?m=48amid=299

The implementation adds 3 new query parameters (SolrParams):
- collapse.field: to choose the field used to group results
- collapse.type: normal (default value) or adjacent
- collapse.max: to select how many continuous results are allowed before collapsing

TODO (in progress):
- More documentation (on source code)
- Test cases

Two patches:
- field_collapsing.patch for current development version
- field_collapsing_1.1.0.patch for Solr-1.1.0

P.S.: Feedback and misspelling corrections are welcome ;-)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
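For illustration, the three collapse.* parameters listed in the issue description would be passed like any other request parameter. The sketch below only builds the query string of such a request. The class name CollapseQueryString and the example values (the query title:solr and a field called site) are assumptions made for this example; they are not taken from the patch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the SOLR-236 patch.
class CollapseQueryString {

    /** Builds a URL-encoded query string carrying q plus the three collapse.* parameters. */
    static String build(String q, String collapseField, String collapseType, int collapseMax) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("collapse.field", collapseField);
        params.put("collapse.type", collapseType);   // "normal" or "adjacent"
        params.put("collapse.max", Integer.toString(collapseMax));

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
    }
}
```

The resulting string would be appended to the URL of whatever request handler has the collapse component configured; which handler that is depends on the local solrconfig.xml.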
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793565#action_12793565 ]
Uri Boness commented on SOLR-236:

@Yonik As far as I understand your collapse algorithm proposal, in order to save memory you'd like to restrict group creation to only those groups that belong in the requested results page. Beyond losing the faceting support over the collapsed DocSet, I think there might be a problem with pagination as well. For every page you'll end up with a different total count and therefore a different number of pages. This can be very confusing from the user's perspective: imagine going to the first page and calculating (and displaying) that you have 3 pages of results; then, when the user asks for the second page, s/he gets a response with 2 pages and a different total count.
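The pagination concern can be made concrete with a small simulation. This is only one possible reading of a page-restricted collapse, not Yonik's actual algorithm: the scan stops as soon as enough groups have been found to fill the requested page, and the reported total is the number of documents minus the duplicates seen so far. The class PageLimitedCollapse and both method names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simulation; an assumption about how a page-restricted collapse might count.
class PageLimitedCollapse {

    /**
     * Total reported while serving the given 0-based page: all documents minus
     * the duplicates discovered before the page was filled.
     */
    static int estimatedTotal(List<String> groupKeys, int page, int rows) {
        int groupsNeeded = (page + 1) * rows;
        Set<String> seen = new HashSet<>();
        int collapsed = 0;
        for (String key : groupKeys) {
            if (seen.size() == groupsNeeded) {
                break; // the requested page is full, stop creating groups
            }
            if (!seen.add(key)) {
                collapsed++; // document belongs to a group we already have
            }
        }
        return groupKeys.size() - collapsed;
    }

    /** Number of pages needed to show the given total at rows per page. */
    static int pageCount(int total, int rows) {
        return (total + rows - 1) / rows;
    }
}
```

With the ranked group keys a, a, b, c, c, c, d, d, e and 2 rows per page, this model reports a total of 8 (4 pages) while serving the first page and a total of 6 (3 pages) while serving the second, which is the kind of shifting count described in the comment.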
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793411#action_12793411 ]
Uri Boness commented on SOLR-236:

@Shalin I think mixing the collapse information with document fields is wrong. The collapse fields don't really belong to the document, but to the group the document represents, while the other fields do belong to it. The response format should somehow indicate this difference.
[jira] Commented: (SOLR-1668) Declarative configuration meta-data for Solr plugins
[ https://issues.apache.org/jira/browse/SOLR-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792454#action_12792454 ]
Uri Boness commented on SOLR-1668:

@Erik
{quote}
Also note that Ant's configuration mechanism isn't just with setters. A java task for example can take any number of sysproperty sub-elements, and they get injected via addSysproperty(Environment.Variable sysp).
{quote}
System properties can be supported in 2 ways:
1. On the configuration level, using an expression language (a la Spring... yes.. Spring supports it :-)). This means that in the schema you'll be able to configure properties like: stopWordFile=${conf.dir}/stopwords.txt. The conf.dir parameter can be replaced either from system properties, a properties file, or another source. Eventually these properties
2. Using another annotation (say, @SystemProperty) which indicates that the value should first be taken from the system properties and then converted to the required data type.

bq. It's the specifying of the converter class in the annotation that I don't like. It can be more implicit than that, like magic

The @Converter annotation is mainly aimed at user extensions. Indeed, none of the out-of-the-box plugins need it, as default converters can be pre-registered to handle all the data types we need at the moment. Users who want to provide their own plugins need a simple mechanism to register converters, and I found the @Converter annotation to be the simplest one.

bq. We'd have setStopWordList(SolrFile f), and we'd only that setter after the system properties were in the mix.

As you said, I believe once we have system properties supported this will be a no-brainer, and indeed I believe this belongs to an earlier properties substitution phase (as mentioned above).

@Noble
bq. is there anyone building it?

Oh yes :-), but beyond that, this will open up opportunities to develop Solr plugins for IDEs/text editors... even just for better support in writing the schema files with auto-completion, validation, etc...

bq. Why do we need this magic in String->Object conversion at all?

Well, my obvious response is: because of the nature of Solr configuration, which is text based, while at runtime you're dealing with other data types. Of course you can just create String setters and do the conversion yourself, but why do that if you can have it done automatically and keep your classes clean? Just to be clear, the magic is not really magic: we can be very clear about which converters are supported out of the box, and (as I mentioned above) with the @Converter annotation users can be more explicit about how they want the conversion to take place. Bottom line, at the end of the day you want to be able to write the plugins as POJOs using properties of the correct data types and focus on the plugin's logic rather than also on configuration logic.

Declarative configuration meta-data for Solr plugins
Key: SOLR-1668
URL: https://issues.apache.org/jira/browse/SOLR-1668
Project: Solr
Issue Type: Improvement
Components: Schema and Analysis
Affects Versions: 1.4
Reporter: Uri Boness
Priority: Minor
Fix For: 1.5
Attachments: commons-beanutils-1.8.2.jar, SOLR-1668.patch

The idea here is for plugins in Solr to carry more meta-data about their configuration. This can be very useful for building tools around Solr, where this meta-data can be used to assist users in configuring Solr. One common mechanism to provide this meta-data is to use standard Java Beans for the different configuration constructs, where the properties define the configurable attributes and annotations are used to provide extra information about them.
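The ${conf.dir} style substitution mentioned in the comment could look roughly like the sketch below. It is a minimal illustration of an "earlier properties substitution phase", assuming placeholders of the form ${name}, a caller-supplied map that is consulted first, and system properties as the fallback. The class PlaceholderResolver is hypothetical and is not taken from the SOLR-1668 patch.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the lookup order (map first, then system properties) is an assumption.
class PlaceholderResolver {

    // Matches ${name} and captures the name.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces every ${name} in value with its configured value. */
    static String resolve(String value, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String replacement = props.containsKey(name)
                    ? props.get(name)
                    : System.getProperty(name);
            if (replacement == null) {
                throw new IllegalArgumentException("Unresolved property: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Running the substitution before any type conversion means a setter only ever sees the final string, so converters do not need to know about placeholders at all.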
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792458#action_12792458 ]
Uri Boness commented on SOLR-236:

bq. I'm curious as to whether anyone has just thought of using the Clustering component for this? If your collapse field was a single token, I wonder if you would get the results you're looking for.

The main difference between the two components is that the clustering component works more as a function, where the input is the doclist/docset and the output is a separate data structure representing the groups, while the collapse component operates directly on the docset/doclist, modifies them, and incorporates the groups within the final search result. In all cases where we found the need for the collapse component, we needed to incorporate the grouping within the search result and adjust the sorting and the pagination accordingly. As far as I know, you cannot do that with the clustering component. This tight integration with the result is also the reason why the collapse component is currently a replacement for the query component.
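To illustrate what "operates directly on the doclist" means for pagination, here is a deliberately simplified sketch. It assumes a ranked list of (id, field value) pairs, keeps only the first document of each group, and then pages over the collapsed list. The class CollapseSketch and its methods are hypothetical; the real component works on Solr's DocList/DocSet structures and supports more options than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of collapsing; not the patch's implementation.
class CollapseSketch {

    /**
     * Keeps the first (highest-ranked) document of every group and records in
     * collapsedCounts how many further documents each group contained.
     * Each element of rankedDocs is {id, collapseFieldValue}.
     */
    static List<String> collapse(List<String[]> rankedDocs, Map<String, Integer> collapsedCounts) {
        List<String> kept = new ArrayList<>();
        for (String[] doc : rankedDocs) {
            String id = doc[0];
            String group = doc[1];
            Integer count = collapsedCounts.get(group);
            if (count == null) {
                kept.add(id);                  // first hit of this group stays in the result
                collapsedCounts.put(group, 0);
            } else {
                collapsedCounts.put(group, count + 1);
            }
        }
        return kept;
    }

    /** Pagination is applied to the collapsed list, not to the raw hits. */
    static <T> List<T> page(List<T> collapsed, int start, int rows) {
        int from = Math.min(start, collapsed.size());
        int to = Math.min(from + rows, collapsed.size());
        return collapsed.subList(from, to);
    }
}
```

Because paging happens after collapsing, the total count and the page boundaries refer to groups rather than raw hits, which is the behaviour a separate clustering output cannot provide.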
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792686#action_12792686 ]
Uri Boness commented on SOLR-236:

Essentially it boils down to two options:
# Keep it out of the trunk, in which case users who need this functionality will only get it by working with a patched Solr version of their own, or by using a branch (in both cases, they will most likely miss the continuous work done on the trunk unless they keep merging the changes)
# Keep it in the trunk with some caveats, in which case users have a chance to use this functionality out of the box

In both cases, users have a choice to make:
- be satisfied with the performance of this feature
- look for an alternative solution (other products)
- give up this functionality altogether (if their business requirements allow that)

So the main difference here, I would say, is in how easily you'd like to provide this functionality to the users. On the Solr development side, once this is committed to the trunk there is indeed much more responsibility on the committers to make it work (enhance performance and fix bugs)... but this is a *good* thing, as there is a high demand for this feature and, as a community-driven project, this demand should be satisfied. And I *do* think that the number of users already using this patch is a good indicator that it is good enough for quite a lot of use cases.

I do agree, though, that before committing anything the public API should be re-evaluated to minimize the chances of BWC issues later on. BTW, regarding the response, Solr already has a few places where the response format is still marked as experimental and subject to change in the future (but that doesn't stop people from using this functionality, as they take on the responsibility of adapting to any such future changes when they come).

Now... writing this, it suddenly occurred to me that there might be another solution to this whole discussion, which is in a way a combination of many of the suggestions in this thread. What if this patch were split in two: the changes to the core, and the component itself? If the changes to the core are not that drastic and make sense (or at least everyone can live with them), then perhaps they can be committed to the trunk. The rest of the patch (which consists of the search components and their supporting classes) can be put in SVN as a separate branch for contrib. The good thing about this solution is that the work done on this functionality will be in SVN, so you benefit from it as David mentioned above. The other benefit is that with this layout you can actually build the branched code base separately and distribute this functionality as a separate jar which can be deployed in a Solr 1.5x distribution. Again, a bit of work is left to the users (too much for my taste), but at least they're not forced to use a patched version of Solr. Would that be a possible solution?
[jira] Updated: (SOLR-1668) Declarative configuration meta-data for Solr plugins
[ https://issues.apache.org/jira/browse/SOLR-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uri Boness updated SOLR-1668:

Attachment: SOLR-1668.patch

In this patch I removed the need for the @InitProperty annotation. Instead, any setter in the class is considered an initialization property. You can use the @Required annotation to mark properties as mandatory, and the @ArgumentName annotation to customize the name of the argument used to initialize a property.
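A setter-driven initialization of this kind can be sketched with plain reflection. The annotation names @Required and @ArgumentName come from the update above, but their definitions here, the BeanInitializer class, the built-in conversions, and ExampleTokenizerFactory are all assumptions made for illustration. The actual patch relies on commons-beanutils and may work quite differently.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;

// Assumed definitions; only the annotation names appear in the patch description.
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface Required {}

@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
@interface ArgumentName { String value(); }

// Hypothetical initializer: every public single-argument setter is an init property.
class BeanInitializer {

    static void init(Object plugin, Map<String, String> args) {
        for (Method m : plugin.getClass().getMethods()) {
            String method = m.getName();
            if (!method.startsWith("set") || method.length() <= 3 || m.getParameterCount() != 1) {
                continue;
            }
            // Argument name: explicit @ArgumentName, else the bean property name.
            ArgumentName custom = m.getAnnotation(ArgumentName.class);
            String name = custom != null
                    ? custom.value()
                    : Character.toLowerCase(method.charAt(3)) + method.substring(4);
            String raw = args.get(name);
            if (raw == null) {
                if (m.isAnnotationPresent(Required.class)) {
                    throw new IllegalArgumentException("Missing required argument: " + name);
                }
                continue;
            }
            try {
                m.setAccessible(true);
                m.invoke(plugin, convert(raw, m.getParameterTypes()[0]));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot set " + name, e);
            }
        }
    }

    // Stand-in for pre-registered default converters.
    static Object convert(String raw, Class<?> type) {
        if (type == String.class) return raw;
        if (type == int.class || type == Integer.class) return Integer.valueOf(raw);
        if (type == boolean.class || type == Boolean.class) return Boolean.valueOf(raw);
        throw new IllegalArgumentException("No converter for " + type.getName());
    }
}

// Hypothetical plugin written as a plain bean.
class ExampleTokenizerFactory {
    String pattern;
    int group = -1;

    @Required
    public void setPattern(String pattern) { this.pattern = pattern; }

    @ArgumentName("groupIndex")
    public void setGroup(int group) { this.group = group; }
}
```

The plugin class itself contains no parsing code: the text-to-type conversion and the required-argument check live entirely in the initializer, which is the separation the issue is arguing for.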
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792189#action_12792189 ]
Uri Boness commented on SOLR-236:

{quote}
Grant, this patch may not be perfect but I think we all agree that it is a great start. This is stable, used by many and has been well supported by the community. This is also a large patch and as I have known from my DataImportHandler experience, maintaining a large patch is quite a pain (and DataImportHandler didn't even touch the core). How about we commit this (after some review, of course), mark this as experimental (no guarantees of any sort) and then start improving it one issue at a time? Alternately, if you are not comfortable adding it to trunk, we can commit this on a branch and merge into trunk later.
{quote}
I think managing a separate branch will be just as hard as managing a patch. I do, however, agree that it's about time this patch was committed to the trunk. Even though the current solution is not scalable in terms of distributed search (and I agree that the current solution for that is not really viable), many are already using it, and it is the most wanted feature in JIRA after all. One thing you can do is apply the changes to the core (there are not really many) and commit the rest of the patch as a contrib (along with all the disclaimers Shalin mentioned above).
[jira] Updated: (SOLR-1668) Declarative configuration meta-data for Solr plugins
[ https://issues.apache.org/jira/browse/SOLR-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uri Boness updated SOLR-1668:

Attachment: commons-beanutils-1.8.2.jar, SOLR-1668.patch

This patch provides Java Bean configuration for all MapInitializedPlugins. To showcase this functionality, I changed the TokenizerFactory to implement the MapInitializedPlugin interface and changed the PatternTokenizerFactory to use the new Java Bean configuration. This implementation depends on the commons-beanutils library, which should be added to the lib directory.
[jira] Commented: (SOLR-1668) Declarative configuration meta-data for Solr plugins
[ https://issues.apache.org/jira/browse/SOLR-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12792312#action_12792312 ]
Uri Boness commented on SOLR-1668:

Thanks! Well... no, it's not Ant or Spring yet, but it's a start that can already help with Tokenizers and Filters. The current patch is based on setters, but adding annotations on top of them can add even more meta-data: for example, marking a property as required, or associating a different configuration name with it, perhaps to differentiate user-friendly naming from code-friendly naming. (How does Ant deal with this stuff?)
[jira] Commented: (SOLR-17) XSD for solr requests/responses
[ https://issues.apache.org/jira/browse/SOLR-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12791023#action_12791023 ]
Uri Boness commented on SOLR-17:

Having well-defined XSDs for public services can be *extremely* helpful in many respects... together with proper version management, they define the contract between the users and the service. Some of the use cases that Chris listed above are definitely valid and realistic. Moreover, an XSD provides natural and proper documentation for the supported formats, which any decent XML editor can use to give you hints while writing the solrconfig.xml and the schema.xml (for example).

That said... most of the XML formats in Solr are too generic to benefit from XSDs. The only format where it makes sense is the schema.xml, as it has an expressive, domain-driven structure. Unfortunately, you cannot say the same for the response formats and the solrconfig.xml, where the expressiveness lies in the *values* of the elements/attributes rather than in the element/attribute *names* themselves. XSD doesn't handle element/attribute values very well.

XSD for solr requests/responses
Key: SOLR-17
URL: https://issues.apache.org/jira/browse/SOLR-17
Project: Solr
Issue Type: Improvement
Reporter: Mike Baranczak
Priority: Minor
Attachments: solr-complex.xml, solr-rev2.xsd, solr.xsd, UselessRequestHandler.java

Attaching an XML schema definition for the responses and the update requests. I needed to do this for myself anyway, so I might as well contribute it to the project. At the moment, I have no plans to write an XSD for the config documents, but it wouldn't be a bad idea.

TODO: change the schema URL. I'm guessing that Apache already has some sort of naming convention for these?
[jira] Commented: (SOLR-17) XSD for solr requests/responses
[ https://issues.apache.org/jira/browse/SOLR-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12791126#action_12791126 ]
Uri Boness commented on SOLR-17:

{quote}
However, as a start, I think contributing and committing the SOLR XML response writer output XSD (and a DTD, which I'll attach) is something that adds value, doesn't take anything away, or touch other parts of the code, etc., and is worthwhile to do.
{quote}
Fair enough. I guess it can always serve as a reference for better understanding what to expect from a Solr response (instead of trying to figure things out from the code). The good thing about this generic format is that it's unlikely to change frequently, so the XSDs will probably not change often either.
[jira] Commented: (SOLR-1649) Refactor all ResponseWriters to be more extension-friendly
[ https://issues.apache.org/jira/browse/SOLR-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789884#action_12789884 ] Uri Boness commented on SOLR-1649: -- I think the class hierarchy needs to change. see: https://issues.apache.org/jira/browse/SOLR-1123?focusedCommentId=12711133page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12711133 Refactor all ResponseWriters to be more extension-friendly -- Key: SOLR-1649 URL: https://issues.apache.org/jira/browse/SOLR-1649 Project: Solr Issue Type: Improvement Components: Response Writers Affects Versions: 1.4 Environment: My local MacBook pro over the Christmas break. Reporter: Chris A. Mattmann Fix For: 1.5 I'd like to refactor all the ResponseWriters to be a bit less brittle. ResponseWriters should follow a standard interface with more existing methods than is currently present in the interface, and with lots of refactored utility code and more concrete control/data flow. I'll take a hard look at the existing response writers and try to generalize. See this thread for background: http://www.lucidimagination.com/search/document/e8bb6cac84c1f520/namespaces_in_response_solr_1586#cc50ba9e9d8fe2dc -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1298) FunctionQuery results as pseudo-fields
[ https://issues.apache.org/jira/browse/SOLR-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789889#action_12789889 ] Uri Boness commented on SOLR-1298: -- {quote} I certainly can. I hadn't thought about having a function as an fl parameter value, but that makes a lot of sense and I can support that through my work as well. I'll work on extracting the code today and will get a patch here ASAP. {quote} As far as I recall, specifying the functions in the fl parameter should still work with the FieldValueSource as it is at the moment. The registry enables you to register any value for any string key; in this case the string key is the function. FunctionQuery results as pseudo-fields -- Key: SOLR-1298 URL: https://issues.apache.org/jira/browse/SOLR-1298 Project: Solr Issue Type: New Feature Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 It would be helpful if the results of FunctionQueries could be added as fields to a document. Couple of options here: 1. Run FunctionQuery as part of relevance score and add that piece to the document 2. Run the function (not really a query) during Document/Field retrieval -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1298) FunctionQuery results as pseudo-fields
[ https://issues.apache.org/jira/browse/SOLR-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789890#action_12789890 ] Uri Boness commented on SOLR-1298: -- Chris, another thing. You might want to update the FieldValueSource solution to work with SOLR-1644 (instead of the request context) FunctionQuery results as pseudo-fields -- Key: SOLR-1298 URL: https://issues.apache.org/jira/browse/SOLR-1298 Project: Solr Issue Type: New Feature Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 It would be helpful if the results of FunctionQueries could be added as fields to a document. Couple of options here: 1. Run FunctionQuery as part of relevance score and add that piece to the document 2. Run the function (not really a query) during Document/Field retrieval -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1298) FunctionQuery results as pseudo-fields
[ https://issues.apache.org/jira/browse/SOLR-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789908#action_12789908 ] Uri Boness commented on SOLR-1298: -- I like the idea of giving the providers a broader context (document, request, response). This will also allow them to operate on multiple documents in the response (whether it's the docset or the doclist). One thing to take into consideration here is that once you introduce dependencies between the fields, there must be a way to determine the ordering of the providers (as one provider might depend on fields generated by another provider). As for the field AS alias syntax, I think this should be consistent with the work in SOLR-1351, which is currently based on localparams. Perhaps there should be a common approach to handling aliases in requests. I think that the proper approach is to separate the stored fields from other fields, perhaps even put it in a separate meta-data section under the document. But once you do that, again for the sake of consistency, it would also be wise *not* to include these fields/functions in the fl parameter. So the fl parameter will refer to fields, and another parameter, meta, will refer to meta-data values. bq. fl={!func}foo +1, or even func:foo. Then you can have things like url:<url>, file:<file path>, or even db:<db alias + field>. FunctionQuery results as pseudo-fields -- Key: SOLR-1298 URL: https://issues.apache.org/jira/browse/SOLR-1298 Project: Solr Issue Type: New Feature Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 Attachments: SOLR-1298-FieldValues.patch, SOLR-1298.patch It would be helpful if the results of FunctionQueries could be added as fields to a document. Couple of options here: 1. Run FunctionQuery as part of relevance score and add that piece to the document 2. Run the function (not really a query) during Document/Field retrieval -- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder
[ https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789920#action_12789920 ] Uri Boness commented on SOLR-1644: --
bq. if (rb.store.get(HighlightingComponent.DO_HIGHLIGHTING) == Boolean.TRUE)
This is verbose... too verbose for my taste. I believe a Store interface can help here, one which provides access to data by a key and also provides helper methods to keep the code clean. (A MapStore can be a simple implementation which wraps a Map<String, Object> instance):
{code}
public interface Store {
  Boolean getBoolean(String key);
  boolean getBoolean(String key, boolean defaultValue);
  Integer getInt(String key);
  int getInt(String key, int defaultValue);
  // other methods for all primitive types and dates.
}
{code}
so now you have:
{code}
if (rb.store.getBoolean(HighlightingComponent.DO_HIGHLIGHTING, false))
{code}
which is cleaner and is NPE-safe.
bq. I believe the public API's should have no dependency on components.
I agree. Basically avoid having circular dependencies. You don't want to change the platform API every time you introduce a new component.
Provide a clean way to keep flags and helper objects in ResponseBuilder --- Key: SOLR-1644 URL: https://issues.apache.org/jira/browse/SOLR-1644 Project: Solr Issue Type: Improvement Components: search Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: SOLR-1644.patch Many components such as StatsComponent, FacetComponent etc keep flags and helper objects in ResponseBuilder. Having to modify the ResponseBuilder for such things is a very kludgy solution. Let us provide a clean way for components to keep arbitrary objects for the duration of a (distributed) search request. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
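A minimal sketch of the MapStore idea from the comment above, for illustration only. It assumes a store with the typed getters listed in the comment; the get/put methods, the class names and the key strings are hypothetical and are not taken from the SOLR-1644 patch.

```java
import java.util.HashMap;
import java.util.Map;

public class StoreSketch {

    /** Typed, null-safe access to values kept under string keys (hypothetical interface). */
    interface Store {
        Object get(String key);
        void put(String key, Object value);
        Boolean getBoolean(String key);
        boolean getBoolean(String key, boolean defaultValue);
        Integer getInt(String key);
        int getInt(String key, int defaultValue);
    }

    /** Simple implementation that wraps a Map<String, Object>. */
    static class MapStore implements Store {
        private final Map<String, Object> values = new HashMap<>();

        public Object get(String key) { return values.get(key); }

        public void put(String key, Object value) { values.put(key, value); }

        // Returns null when the key is missing or holds a non-Boolean value.
        public Boolean getBoolean(String key) {
            Object v = values.get(key);
            return (v instanceof Boolean) ? (Boolean) v : null;
        }

        // Never throws a NullPointerException: falls back to the default.
        public boolean getBoolean(String key, boolean defaultValue) {
            Boolean v = getBoolean(key);
            return v != null ? v : defaultValue;
        }

        public Integer getInt(String key) {
            Object v = values.get(key);
            return (v instanceof Integer) ? (Integer) v : null;
        }

        public int getInt(String key, int defaultValue) {
            Integer v = getInt(key);
            return v != null ? v : defaultValue;
        }
    }

    public static void main(String[] args) {
        Store store = new MapStore();
        store.put("highlight.enabled", Boolean.TRUE); // hypothetical key
        System.out.println(store.getBoolean("highlight.enabled", false));
        System.out.println(store.getBoolean("missing.flag", false));
        System.out.println(store.getInt("rows", 10));
    }
}
```

The getters with a default value are what make the calling code shorter: the caller never has to compare against Boolean.TRUE or check for null.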
[jira] Issue Comment Edited: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder
[ https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789920#action_12789920 ] Uri Boness edited comment on SOLR-1644 at 12/13/09 6:13 PM:
bq. if (rb.store.get(HighlightingComponent.DO_HIGHLIGHTING) == Boolean.TRUE)
This is verbose... too verbose for my taste. I believe a Store interface can help here, one which provides access to data by a key and also provides helper methods to keep the code clean. (A MapStore can be a simple implementation which wraps a Map<String, Object> instance):
{code}
public interface Store {
  Boolean getBoolean(String key);
  boolean getBoolean(String key, boolean defaultValue);
  Integer getInt(String key);
  int getInt(String key, int defaultValue);
  // other methods for all primitive types and dates.
}
{code}
so now you have:
{code}
if (rb.store.getBoolean(HighlightingComponent.DO_HIGHLIGHTING, false))
{code}
which is cleaner and is NPE-safe.
bq. I believe the public API's should have no dependency on components.
I agree. Basically avoid having circular dependencies. You don't want to change the platform API every time you introduce a new component.
was (Author: uboness):
bq. if (rb.store.get(HighlightingComponent.DO_HIGHLIGHTING) == Boolean.TRUE)
This is verbose... too verbose to my taste. I believe a Store interface can help here which provide access to data by a key and will also provide helper methods to keep the code clean. (a MapStore can be a simple implementation which wraps a Map<String, Object> instance):
{code}
public interface Context {
  Boolean getBoolean(String key);
  boolean getBoolean(String key, boolean defaultValue);
  Integer getInt(String key);
  int getInt(String key, int defaultValue);
  // other methods for all primitive types and dates.
}
{code}
so now you have:
{code}
if (rb.store.getBoolean(HighlightingComponent.DO_HIGHLIGHTING, false))
{code}
which is cleaner and is NPE-safe.
bq. I believe the public API's should have no dependency on components.
I agree. Basically avoid having circular dependencies. You don't want to change the platform API every time you introduce a new component.
Provide a clean way to keep flags and helper objects in ResponseBuilder --- Key: SOLR-1644 URL: https://issues.apache.org/jira/browse/SOLR-1644 Project: Solr Issue Type: Improvement Components: search Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: SOLR-1644.patch Many components such as StatsComponent, FacetComponent etc keep flags and helper objects in ResponseBuilder. Having to modify the ResponseBuilder for such things is a very kludgy solution. Let us provide a clean way for components to keep arbitrary objects for the duration of a (distributed) search request. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1123) Change the JSONResponseWriter content type
[ https://issues.apache.org/jira/browse/SOLR-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789922#action_12789922 ] Uri Boness commented on SOLR-1123: -- I think the main issue with the inheritance right now is that the QueryResponseWriter interface is dealing with a Writer rather than with an OutputStream. This accounts for the hacky GenericBinaryResponseWriter. Looking at SOLR-1516 I'm a bit confused. I always had the impression that the main idea behind the response writers is that all they need to know is how to marshal a NamedList (so they don't need explicit knowledge of documents, highlighting, etc...). But now the GenericTextResponseWriter knows about documents (via the SingleResponseWriter). But perhaps I just got it wrong. Change the JSONResponseWriter content type -- Key: SOLR-1123 URL: https://issues.apache.org/jira/browse/SOLR-1123 Project: Solr Issue Type: Improvement Reporter: Uri Boness Fix For: 1.5 Attachments: JSON_contentType_incl_tests.patch Currently the JSON content type is not used. Instead the text/plain content type is used. The reason for this, as I understand it, is to enable viewing the JSON response as text in the browser. While this is a valid argument, I do believe that there should at least be an option to configure this writer to use the JSON content type. According to [RFC4627|http://www.ietf.org/rfc/rfc4627.txt] the JSON content type needs to be application/json (and not text/x-json). The reason this can be very helpful is that today you have plugins for browsers (e.g. [JSONView|http://brh.numbera.com/software/jsonview]) that can render any page with the application/json content type in a user friendly manner (just like XML is supported). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder
[ https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789236#action_12789236 ] Uri Boness commented on SOLR-1644: -- Don't you have it already with SolrQueryRequest.getContext()? Provide a clean way to keep flags and helper objects in ResponseBuilder --- Key: SOLR-1644 URL: https://issues.apache.org/jira/browse/SOLR-1644 Project: Solr Issue Type: Improvement Components: search Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 1.5 Many components such as StatsComponent, FacetComponent etc keep flags and helper objects in ResponseBuilder. Having to modify the ResponseBuilder for such things is a very kludgy solution. Let us provide a clean way for components to keep arbitrary objects for the duration of a (distributed) search request. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder
[ https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789259#action_12789259 ] Uri Boness commented on SOLR-1644: -- It should also be possible to share objects between components to eliminate duplicate computations (if one component can re-use a computation that was already done in another component). I guess this can be supported by publishing the key as a public static field. Provide a clean way to keep flags and helper objects in ResponseBuilder --- Key: SOLR-1644 URL: https://issues.apache.org/jira/browse/SOLR-1644 Project: Solr Issue Type: Improvement Components: search Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: SOLR-1644.patch Many components such as StatsComponent, FacetComponent etc keep flags and helper objects in ResponseBuilder. Having to modify the ResponseBuilder for such things is a very kludgy solution. Let us provide a clean way for components to keep arbitrary objects for the duration of a (distributed) search request. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1644) Provide a clean way to keep flags and helper objects in ResponseBuilder
[ https://issues.apache.org/jira/browse/SOLR-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789290#action_12789290 ] Uri Boness commented on SOLR-1644: --
bq. We should not keep it static. public final should be good enough
Is there a special reason for this? Is the plan to have a KEY per component instance? If so, how would it be possible to refer to the key from other components? This is what I had in mind: assuming Component1 computed something and registered it in the store using its KEY, Component2 can reuse this computation by accessing it as follows:
{code}
// inside Component2
Object someValue = rb.store.get(Component1.KEY);
// do something with someValue
{code}
Provide a clean way to keep flags and helper objects in ResponseBuilder --- Key: SOLR-1644 URL: https://issues.apache.org/jira/browse/SOLR-1644 Project: Solr Issue Type: Improvement Components: search Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: SOLR-1644.patch Many components such as StatsComponent, FacetComponent etc keep flags and helper objects in ResponseBuilder. Having to modify the ResponseBuilder for such things is a very kludgy solution. Let us provide a clean way for components to keep arbitrary objects for the duration of a (distributed) search request. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
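To illustrate the point about sharing a computation through a published key, here is a small self-contained sketch. It uses a plain Map as a stand-in for the per-request store; Component1, Component2, the key string and the computed value are all hypothetical and only mirror the names used in the comment above.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedKeySketch {

    static class Component1 {
        // Published as a public static constant so other components can refer to it.
        public static final String KEY = "component1.result";

        static int computations = 0;

        void process(Map<String, Object> store) {
            computations++;
            store.put(KEY, Integer.valueOf(42)); // stand-in for an expensive computation
        }
    }

    static class Component2 {
        // Reuses the value registered by Component1 instead of recomputing it.
        Object process(Map<String, Object> store) {
            return store.get(Component1.KEY);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> store = new HashMap<>(); // one store per request
        new Component1().process(store);
        Object reused = new Component2().process(store);
        System.out.println(reused + " (computed " + Component1.computations + " time)");
    }
}
```

With an instance-level (public final, non-static) key, Component2 would first need a reference to the Component1 instance before it could look the value up, which is the concern raised in the comment.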
[jira] Commented: (SOLR-1625) Add regexp support for TermsComponent
[ https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788047#action_12788047 ] Uri Boness commented on SOLR-1625: -- regexp vs. regex - I really don't know. I always use/d regexp, but I guess we need to come up with something that is consistent with Solr. The first thing that comes to mind with a regular expression configuration in Solr is the highlighting component, and indeed it uses regex, so it's best to stick to that. bq. have explicit strings like regex.flag=case_sensitive&regex.flag=multiline Yeah... I had this feeling as well, but I thought it might be too many extra parameters just for the regular expression support. If you think that's best I can add it. I'll make the changes tonight and submit a new patch. Add regexp support for TermsComponent - Key: SOLR-1625 URL: https://issues.apache.org/jira/browse/SOLR-1625 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Uri Boness Assignee: Noble Paul Priority: Minor Fix For: 1.5 Attachments: SOLR-1625.patch, SOLR-1625.patch At the moment the only way to filter the returned terms is by a prefix. It would be nice if the filter could also be done by regular expression -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1625) Add regexp support for TermsComponent
[ https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1625: - Attachment: SOLR-1625.patch Updated the patch with the following changes (as discussed above): - using the terms.regex param (instead of terms.regexp) - using more explicit names for the regex flags Add regexp support for TermsComponent - Key: SOLR-1625 URL: https://issues.apache.org/jira/browse/SOLR-1625 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Uri Boness Assignee: Noble Paul Priority: Minor Fix For: 1.5 Attachments: SOLR-1625.patch, SOLR-1625.patch, SOLR-1625.patch At the moment the only way to filter the returned terms is by a prefix. It would be nice if the filter could also be done by regular expression -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1625) Add regexp support for TermsComponent
[ https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1625: - Attachment: SOLR-1625.patch Added support for regexp hints based on the different constants in the Pattern class. The terms.regexp.hints parameter accepts an int value corresponding to the flags value passed to the Pattern.compile(String regex, int flags) factory method. Using hints it is now possible to support case insensitive patterns. Add regexp support for TermsComponent - Key: SOLR-1625 URL: https://issues.apache.org/jira/browse/SOLR-1625 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Uri Boness Priority: Minor Fix For: 1.5 Attachments: SOLR-1625.patch, SOLR-1625.patch At the moment the only way to filter the returned terms is by a prefix. It would be nice if the filter could also be done by regular expression -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
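As an illustration of how an int hints value maps onto java.util.regex.Pattern, here is a standalone sketch. It does not use any Solr classes; the filterTerms helper and the choice of a full match (matches() rather than find()) are assumptions made for the example, not a description of what the patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TermsRegexSketch {

    /** Keeps only the terms that fully match the regex compiled with the given flags. */
    static List<String> filterTerms(List<String> terms, String regex, int flags) {
        Pattern pattern = Pattern.compile(regex, flags);
        List<String> matched = new ArrayList<>();
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                matched.add(term);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("Apple", "apricot", "banana");

        // flags = 0: default, case sensitive matching
        System.out.println(filterTerms(terms, "ap.*", 0));

        // Pattern.CASE_INSENSITIVE has the int value 2, so a hints value of 2
        // would request case insensitive matching.
        System.out.println(filterTerms(terms, "ap.*", Pattern.CASE_INSENSITIVE));
    }
}
```

Flags can be combined with a bitwise OR, for example Pattern.CASE_INSENSITIVE | Pattern.MULTILINE, which is why a single int parameter is enough to carry them.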
[jira] Updated: (SOLR-343) Constraining date facets by facet.mincount
[ https://issues.apache.org/jira/browse/SOLR-343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-343: Attachment: SOLR-343.patch Updated this patch to work with the current trunk and added tests Constraining date facets by facet.mincount -- Key: SOLR-343 URL: https://issues.apache.org/jira/browse/SOLR-343 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.2 Environment: Solr 1.2+ Reporter: Raiko Eckstein Priority: Minor Attachments: DateFacetsMincountPatch.patch, SOLR-343.patch It would be helpful to allow the facet.mincount parameter to work with date facets, i.e. constraining the results so that it would be possible to filter out date ranges in the results where no documents occur from the server-side. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1625) Add regexp support for TermsComponent
Add regexp support for TermsComponent - Key: SOLR-1625 URL: https://issues.apache.org/jira/browse/SOLR-1625 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Uri Boness Priority: Minor Fix For: 1.5 At the moment the only way to filter the returned terms is by a prefix. It would be nice if the filter could also be done by regular expression -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1625) Add regexp support for TermsComponent
[ https://issues.apache.org/jira/browse/SOLR-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1625: - Attachment: SOLR-1625.patch Add regexp support for TermsComponent - Key: SOLR-1625 URL: https://issues.apache.org/jira/browse/SOLR-1625 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Uri Boness Priority: Minor Fix For: 1.5 Attachments: SOLR-1625.patch At the moment the only way to filter the returned terms is by a prefix. It would be nice if the filter could also be done by regular expression -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1351) facet on same field different ways
[ https://issues.apache.org/jira/browse/SOLR-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1351: - Attachment: SOLR-1351.patch Took the approach as described above. The only difference is that instead of the id parameter I reused the key parameter already supported by this component. The idea is that now, when the key local param is specified, all the specific facet params need to use the key instead of the field name.
{code}
q=*:*&facet=true&facet.field={!key=cat1}cat&f.cat1.facet.sort=true&f.cat1.facet.limit=20&f.cat1.facet.mincount=1&facet.field={!key=cat2}cat&f.cat2.facet.sort=false&f.cat2.facet.count=0
{code}
This not only applies to simple field facets but also to date facets:
{code}
q=*:*&facet=true&facet.date={!key=foo}bday&f.foo.facet.date.start=1976-07-01T00:00:00.000Z&f.foo.facet.date.end=1976-07-01T00:00:00.000Z+1MONTH&f.foo.facet.date.gap=+1DAY&f.foo.facet.date.other=all&facet.date={!key=bar}bday&f.bar.facet.date.end=1976-07-01T00:00:00.000Z+7DAY&f.bar.facet.date.gap=+1DAY
{code}
facet on same field different ways -- Key: SOLR-1351 URL: https://issues.apache.org/jira/browse/SOLR-1351 Project: Solr Issue Type: Improvement Reporter: Yonik Seeley Fix For: 1.5 Attachments: SOLR-1351.patch There is a general need to facet on the same field in different ways (different prefixes, different filters). We need a way to express this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
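For client code, the key-based parameter scheme described above could be assembled along the following lines. This is only a sketch in plain Java: the KeyedFacetParams class and its methods are hypothetical helpers (not SolrJ API), and the values are not URL-encoded, which a real client would have to do before sending the request.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyedFacetParams {

    private final List<String> params = new ArrayList<>();

    /** Adds a raw name=value pair. */
    KeyedFacetParams add(String name, String value) {
        params.add(name + "=" + value);
        return this;
    }

    /** Declares a field facet under the given key: facet.field={!key=KEY}FIELD */
    KeyedFacetParams facetField(String key, String field) {
        return add("facet.field", "{!key=" + key + "}" + field);
    }

    /** Sets a per-facet option addressed by the key: f.KEY.facet.OPTION=VALUE */
    KeyedFacetParams option(String key, String option, String value) {
        return add("f." + key + ".facet." + option, value);
    }

    /** Joins all pairs into a query string (without URL encoding). */
    String build() {
        return String.join("&", params);
    }

    public static void main(String[] args) {
        String query = new KeyedFacetParams()
                .add("q", "*:*")
                .add("facet", "true")
                .facetField("cat1", "cat")
                .option("cat1", "sort", "true")
                .option("cat1", "limit", "20")
                .option("cat1", "mincount", "1")
                .facetField("cat2", "cat")
                .option("cat2", "sort", "false")
                .build();
        System.out.println(query);
    }
}
```

Because every option is addressed through the key rather than the field name, the same field (cat) can appear twice with independent settings.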
[jira] Commented: (SOLR-1351) facet on same field different ways
[ https://issues.apache.org/jira/browse/SOLR-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754506#action_12754506 ] Uri Boness commented on SOLR-1351: -- This is something that I've done in the far past (Solr 1.2), and the way I see it, facets should be identified by a unique id rather than by the field name, and the facet results will then be grouped by these ids. I think this can be done by just adding one extra parameter in the form:
{code}
f.fieldName.facet.id
{code}
This parameter will practically mean that all other specific parameters for the field facet will need to use this id instead of the field name. That is, assuming we have a field called cat to represent a category, right now (without an id) we can do:
{code}
q=*:*&facet=true&facet.field=cat&f.cat.facet.sort=true&f.cat.facet.limit=20&f.cat.facet.mincount=1
{code}
With the id introduced:
{code}
q=*:*&facet=true&facet.field=cat&f.cat.facet.id=category&f.category.facet.sort=true&f.category.facet.limit=20&f.category.facet.mincount=1
{code}
Now to support multiple configurations:
{code}
q=*:*&facet=true&facet.field=cat&f.cat.facet.id=cat1&f.cat1.facet.sort=true&f.cat1.facet.limit=20&f.cat1.facet.mincount=1&f.cat.facet.id=cat2&f.cat2.facet.sort=false&f.cat2.facet.count=0
{code}
Note that even after introducing the id param, backward compatibility can easily be maintained - we just determine that when the id param is not specified, the field name is the default id. From experience, I can tell you that adding this feature will not only enable multiple facets on the same field, but IMO will also make it much easier to develop search clients and tools on top of Solr. If this solution sounds reasonable, I can start working on a patch for it.
facet on same field different ways -- Key: SOLR-1351 URL: https://issues.apache.org/jira/browse/SOLR-1351 Project: Solr Issue Type: Improvement Reporter: Yonik Seeley Fix For: 1.5 There is a general need to facet on the same field in different ways (different prefixes, different filters). We need a way to express this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1311) pseudo-field-collapsing
[ https://issues.apache.org/jira/browse/SOLR-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754509#action_12754509 ] Uri Boness commented on SOLR-1311: -- Wouldn't it be an idea to try and merge this code with the original field collapsing patch? Quite a bit of work was done recently on that patch to make it more extensible. So for example, you now have a _Collapser_ interface that encapsulates the actual collapsing algorithm, and my guess is that your algorithm can probably fit there. Indeed when the corpus is large, adjacent field collapsing can turn into a performance issue, and having this pseudo algorithm seems to make a lot of sense. So for example, using the original field collapsing patch, it would be nice if we could just define another parameter called collapse.type which will hold one of three values: adjacent, pseudo-adjacent, and non-adjacent. BTW, I haven't looked at your patch yet and I don't know how well it works with faceting. But integrating it with the original patch will give you that support (i.e. before/after collapse facet counts) automatically. pseudo-field-collapsing --- Key: SOLR-1311 URL: https://issues.apache.org/jira/browse/SOLR-1311 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.4 Reporter: Marc Sturlese Fix For: 1.5 Attachments: SOLR-1311-pseudo-field-collapsing.patch I am trying to develop a new way of doing field collapsing based on the adjacent field collapsing algorithm. I have started developing it because I am experiencing performance problems with the field collapsing patch with a big index (8G). The algorithm does adjacent-pseudo-field collapsing. It does collapsing on the first X documents. Instead of making the collapsed docs disappear, the algorithm will send them to a given position of the relevance results list.
The reason I only do collapsing in the first X documents is that if I have, for example, 60 results and I am showing 10 results per page, I really don't need to do collapsing on page 3, and even less so on page 3000. Doing this I am noticing dramatically better performance. The problem is I couldn't find a way to plug the algorithm in as a component and keep good performance. I had to hack a few classes in SolrIndexSearcher.java. This patch is just experimental and for testing purposes. In case someone finds it interesting, it would be good to find a way to integrate it in a better way than it is at the moment. Advice is more than welcome.
Functionality: In solrconfig.xml we specify the pseudo-collapsing parameters:
<str name="plus.considerMoreDocs">true</str>
<str name="plus.considerHowMany">3000</str>
<str name="plus.considerField">name</str>
(at the moment there's no threshold and other parameters that exist in the current collapse-field patch)
plus.considerMoreDocs enables pseudo-collapsing
plus.considerHowMany sets the number of resulting documents to which we want to apply the algorithm
plus.considerField is the field to do pseudo-collapsing on
If the number of results is lower than plus.considerHowMany the algorithm will be applied to all the results. Let's say there is a query with 60 results and we've set considerHowMany to 3000 (and we already have the docs sorted by relevance). What adjacent-pseudo-collapse does is, if the 2nd doc has to be collapsed it will be sent to position 2999 of the relevance results array. If the 3rd has to be collapsed too, it will go to position 2998, and so on. The algorithm is not applied when a sortspec is set or plus.considerMoreDocs is set to false. Neither is it applied when using MoreLikeThisRequestHandler.
Example with a query of 9 results:

Results sorted by relevance without pseudo-collapse-algorithm:
doc1 - collapse_field_value 3
doc2 - collapse_field_value 3
doc3 - collapse_field_value 4
doc4 - collapse_field_value 7
doc5 - collapse_field_value 6
doc6 - collapse_field_value 6
doc7 - collapse_field_value 5
doc8 - collapse_field_value 1
doc9 - collapse_field_value 2

Results pseudo-collapsed with plus.considerHowMany = 5:
doc1 - collapse_field_value 3
doc3 - collapse_field_value 4
doc4 - collapse_field_value 7
doc5 - collapse_field_value 6
doc2 - collapse_field_value 3*
doc6 - collapse_field_value 6
doc7 - collapse_field_value 5
doc8 - collapse_field_value 1
doc9 - collapse_field_value 2

Results pseudo-collapsed with plus.considerHowMany = 9:
doc1 - collapse_field_value 3
doc3 - collapse_field_value 4
doc4 - collapse_field_value 7
doc5 - collapse_field_value 6
doc7 - collapse_field_value 5
doc8 - collapse_field_value 1
doc9 - collapse_field_value 2
doc6 - collapse_field_value 6*
doc2 - collapse_field_value 3*

*pseudo-collapsed documents --
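The reordering in the example above can be reproduced with a few lines of standalone Java. This is a sketch of the behaviour as described in the issue text, not the code from the attached patch: it assumes a document counts as collapsed when its field value equals that of the document directly before it in the original relevance order, and that collapsed documents fill the considered window from its last position backwards. The method and variable names are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PseudoCollapseSketch {

    /**
     * Reorders the first considerHowMany docs: docs whose value equals the value of the
     * preceding doc are moved to the end of that window; the rest of the list is untouched.
     */
    static List<String> pseudoCollapse(List<String> docs, List<Integer> values, int considerHowMany) {
        int window = Math.min(considerHowMany, docs.size());
        List<String> kept = new ArrayList<>();
        List<String> collapsed = new ArrayList<>();
        for (int i = 0; i < window; i++) {
            boolean sameAsPrevious = i > 0 && values.get(i).equals(values.get(i - 1));
            if (sameAsPrevious) {
                collapsed.add(docs.get(i));
            } else {
                kept.add(docs.get(i));
            }
        }
        // The first collapsed doc goes to the last window position, the next one
        // to the position before it, and so on.
        Collections.reverse(collapsed);

        List<String> result = new ArrayList<>(kept);
        result.addAll(collapsed);
        result.addAll(docs.subList(window, docs.size()));
        return result;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "doc1", "doc2", "doc3", "doc4", "doc5", "doc6", "doc7", "doc8", "doc9");
        List<Integer> values = Arrays.asList(3, 3, 4, 7, 6, 6, 5, 1, 2);

        System.out.println(pseudoCollapse(docs, values, 5));
        System.out.println(pseudoCollapse(docs, values, 9));
    }
}
```

Only the first considerHowMany documents are ever inspected, which is where the performance gain described in the issue would come from.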
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754523#action_12754523 ] Uri Boness commented on SOLR-236: - Martijn, I think a more appropriate way to fix the threading issue is to bind the collapseRequest to the request context and drop the class field altogether. So:
{code}
public void prepare(ResponseBuilder rb) throws IOException {
  super.prepare(rb);
  // keep the per-request state in the request context instead of a class field
  rb.req.getContext().put("collapseRequest", resolveCollapseRequest(rb));
}
{code}
and
{code}
public void process(ResponseBuilder rb) throws IOException {
  CollapseRequest collapseRequest = (CollapseRequest) rb.req.getContext().remove("collapseRequest");
  if (collapseRequest == null) {
    super.process(rb);
    return;
  }
  doProcess(rb, collapseRequest);
}
{code}
Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Fix For: 1.5 Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch This patch includes a new feature called Field collapsing. It is used to collapse a group of results with a similar value for a given field into a single entry in the result set.
Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection. http://www.fastsearch.com/glossary.aspx?m=48&amid=299 The implementation adds 3 new query parameters (SolrParams):
collapse.field to choose the field used to group results
collapse.type normal (default value) or adjacent
collapse.max to select how many continuous results are allowed before collapsing
TODO (in progress): - More documentation (on source code) - Test cases Two patches: - field_collapsing.patch for current development version - field_collapsing_1.1.0.patch for Solr-1.1.0 P.S.: Feedback and misspelling correction are welcome ;-) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1351) facet on same field different ways
[ https://issues.apache.org/jira/browse/SOLR-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754542#action_12754542 ]

Uri Boness commented on SOLR-1351:

Another option is to define the id as a local param:

{code}
q=*:*&facet=true&facet.field={!id=category}cat&f.category.facet.sort=true&f.category.facet.limit=20&f.category.facet.mincount=1
{code}

and for multiple configurations:

{code}
q=*:*&facet=true&facet.field={!id=cat1}cat&f.cat1.facet.sort=true&f.cat1.facet.limit=20&f.cat1.facet.mincount=1&facet.field={!id=cat2}cat&f.cat2.facet.sort=false&f.cat2.facet.count=0
{code}

I guess it plays nicer with the new functionality in 1.4.

facet on same field different ways
Key: SOLR-1351
URL: https://issues.apache.org/jira/browse/SOLR-1351
Project: Solr
Issue Type: Improvement
Reporter: Yonik Seeley
Fix For: 1.5

There is a general need to facet on the same field in different ways (different prefixes, different filters). We need a way to express this.
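To make the shape of the proposal in the SOLR-1351 comment above concrete, here is a small sketch that assembles such a query string. The `{!id=...}` local param and the `f.<id>.facet.*` overrides are the syntax proposed in the comment, not something this sketch confirms Solr supports. The class and method names are made up for illustration, and the values are left unencoded for readability (a real request would URL-encode them).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FacetParamSketch {
    /** Parameters for one facet configuration; the id names the configuration, the field is what gets faceted. */
    static List<String> facetParams(String id, String field, boolean sort, int limit) {
        List<String> params = new ArrayList<>();
        params.add("facet.field={!id=" + id + "}" + field);
        params.add("f." + id + ".facet.sort=" + sort);
        params.add("f." + id + ".facet.limit=" + limit);
        return params;
    }

    /** Joins a match-all query with any number of facet configurations. */
    static String query(List<List<String>> facets) {
        List<String> all = new ArrayList<>();
        all.add("q=*:*");
        all.add("facet=true");
        for (List<String> facet : facets) {
            all.addAll(facet);
        }
        return String.join("&", all);
    }

    public static void main(String[] args) {
        // The same field "cat" is faceted twice under two different ids.
        System.out.println(query(Arrays.asList(
                facetParams("cat1", "cat", true, 20),
                facetParams("cat2", "cat", false, 10))));
    }
}
```

The point of the id is visible in the output: both configurations target the field `cat`, yet each gets its own `f.cat1.*` or `f.cat2.*` settings.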
[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1
[ https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752417#action_12752417 ]

Uri Boness commented on SOLR-1071:

Looks good! As for the naming, I really like your suggestion (in one of the comments above) to replace "suggestion" with "alternatives". So the client code can look something like:

{code}
response.getSuggestions().get("hell").getAlternatives().get(0);
{code}

One more thing - I think it will be more intuitive to use a SimpleOrderedMap instead of a NamedList for the suggestions node. For the xml response it won't make much difference I guess, but for the json one it will be more intuitive and easier to work with. So to take your example above, you'd get something like:

{code}
"spellcheck": {
  "suggestions": {
    "hell": {
      "numFound": 2,
      "startOffset": 0,
      "endOffset": 4,
      "origFreq": 0,
      "alternatives": [
        { "word": "dell", "freq": 4 },
        { "word": "all", "freq": 4 }
      ]
    },
    "correctlySpelled": false}}}
{code}

spellcheck.extendedResults returns an invalid JSON response when count > 1
Key: SOLR-1071
URL: https://issues.apache.org/jira/browse/SOLR-1071
Project: Solr
Issue Type: Bug
Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Yonik Seeley
Fix For: 1.4
Attachments: SOLR-1071.patch, SpellCheckComponent_fix.patch, SpellCheckComponent_new_structure.patch, SpellCheckComponent_new_structure_incl_test.patch

When: wt=json & spellcheck.extendedResults=true & spellcheck.count > 1, the suggestions are returned in the following format:

"suggestions":[
  "amsterdm",{ "numFound":5, "startOffset":0, "endOffset":8, "origFreq":0,
    "suggestion":{ "frequency":8498, "word":"amsterdam"},
    "suggestion":{ "frequency":1, "word":"amsterd"},
    "suggestion":{ "frequency":8, "word":"amsterdams"},
    "suggestion":{ "frequency":1, "word":"amstedam"},
    "suggestion":{ "frequency":22, "word":"amsterdamse"}},
  "beak",{ "numFound":5, "startOffset":9, "endOffset":13, "origFreq":0,
    "suggestion":{ "frequency":379, "word":"beek"},
    "suggestion":{ "frequency":26, "word":"beau"},
    "suggestion":{ "frequency":26, "word":"baak"},
    "suggestion":{ "frequency":15, "word":"teak"},
    "suggestion":{ "frequency":11, "word":"beuk"}},
  "correctlySpelled",false,
  "collation","amsterdam beek"]}}

This is invalid JSON, as each term is associated with a JSON object which holds multiple "suggestion" attributes. When working with a JSON library only the last "suggestion" attribute is picked up.
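The failure described in the SOLR-1071 report above is easy to reproduce without Solr or a JSON library. The sketch below only assumes that a client loads a JSON object into a map keyed by attribute name, as many parsers do, so a repeated key overwrites the earlier value. The class and method names are made up for illustration, and the `alternatives` key follows the naming suggested in the comment.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateKeySketch {
    /** Mimics loading a JSON object into a map: a repeated key replaces the earlier value. */
    static Map<String, String> asObject(String[][] pairs) {
        Map<String, String> object = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            object.put(pair[0], pair[1]);
        }
        return object;
    }

    /** The restructured form: all values are collected under one array-valued key. */
    static Map<String, List<String>> asAlternatives(String[][] pairs) {
        Map<String, List<String>> object = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            object.computeIfAbsent("alternatives", key -> new ArrayList<>()).add(pair[1]);
        }
        return object;
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"suggestion", "amsterdam"},
            {"suggestion", "amsterd"},
            {"suggestion", "amsterdams"},
        };
        System.out.println(asObject(pairs));       // only one "suggestion" survives
        System.out.println(asAlternatives(pairs)); // all three words are kept
    }
}
```

With an array under a single key, the order of the suggestions is preserved and nothing depends on how a particular parser treats duplicate names.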
[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1
[ https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752448#action_12752448 ]

Uri Boness commented on SOLR-1071:

bq. It already does... it's just the client code that checks for NamedList (the parent of SimpleOrderedMap)

No... sorry, I mean the topmost suggestions node, line 182 in the patched class.
[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1
[ https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752472#action_12752472 ]

Uri Boness commented on SOLR-1071:

bq. I guess it depends on how important order is in the top-level suggestions list?

I guess the order is not that important, it's just that using a SimpleOrderedMap will produce more intuitive JSON output to work with IMO.

bq. It would break back compat for the non-extended results too (for JSON and friends).

True... I didn't think about that one. hmm... well... I guess you can keep it as is then. I mean, it's not like you cannot work with the current format after all :-)
[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1
[ https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752841#action_12752841 ]

Uri Boness commented on SOLR-1071:

cool! thanks for the effort Yonik. I've updated the wiki so you can focus on the release ;-)
[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1
[ https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750659#action_12750659 ]

Uri Boness commented on SOLR-1071:

Because there are issues with the current format in json (and perhaps also in other formats)... see comments above

spellcheck.extendedResults returns an invalid JSON response when count > 1
Key: SOLR-1071
URL: https://issues.apache.org/jira/browse/SOLR-1071
Project: Solr
Issue Type: Bug
Components: spellchecker
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Grant Ingersoll
Fix For: 1.4
Attachments: SpellCheckComponent_fix.patch, SpellCheckComponent_new_structure.patch, SpellCheckComponent_new_structure_incl_test.patch
[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749693#action_12749693 ]

Uri Boness commented on SOLR-1163:

Hi Lance,

Great feedback, thanks!

bq. You did not mention the console button at the lower right corner. This is very very useful!

(well, you always have to leave some room for surprises ;-))

Obviously the two issues are bugs. I'll try to find some time this week to fix them and upload a new patch.

Cheers,
Uri

Solr Explorer - A generic GWT client for Solr
Key: SOLR-1163
URL: https://issues.apache.org/jira/browse/SOLR-1163
Project: Solr
Issue Type: New Feature
Components: web gui
Affects Versions: 1.3
Reporter: Uri Boness
Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch

The attached patch is a generic GWT client for Solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed Solr. It is configured with its own configuration file, where one can configure the Solr instance/core to connect to. Since it's currently standalone and completely client-side based, it uses JSON with padding (cross-site scripting) to connect to remote Solr servers.

Some of the supported features:
- Simple query search
- Sorting - one can dynamically define new sort criteria
- Search results are rendered very much like Google search results are rendered. It is also possible to view all stored field values for every hit.
- Custom hit rendering - It is possible to show thumbnails (images) per hit and also customize a view for a hit based on html templates
- Faceting - one can dynamically define field and query facets via the UI. It is also possible to pre-configure these facets in the configuration file.
- Highlighting - you can dynamically configure highlighting. It can also be pre-configured in the configuration file.
- Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send build and reload commands.
- Data import handler - if used, it is possible to send a full-import and status command (delta-import is not implemented yet, but it's easy to add)
- Console - For development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to:
** View the client logs
** Browse the Solr schema
** View a breakdown of the current search context
** View a breakdown of the query URL that is sent to Solr
** View the raw JSON response returning from Solr

This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more...

To get a better view of what's currently possible, we've set up a public version of this client at: http://search.jteam.nl/explorer. This client is configured with one Solr instance where crawled YouTube movies were indexed. You can also check out a screencast for this deployed client: http://search.jteam.nl/help

The patch creates a new folder in the contrib directory. Since the patch doesn't contain binaries, an additional zip file is provided that needs to be extracted to add all the required graphics. This module is maven2 based and is configured in such a way that all GWT related tools/libraries are automatically downloaded when the module is compiled. One of the artifacts of the build is a war file which can be deployed in any servlet container.

NOTE: this client works best on WebKit based browsers (for performance reasons) but also works on Firefox and IE 7+. That said, it should be taken into account that it is still under development.
[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746395#action_12746395 ]

Uri Boness commented on SOLR-1163:

bq. Does a GWT client application have a clean license?

If having a pure Apache 2 license is considered to be clean, then yes.

bq. Are there any other GWT apps in the Apache project?

Not as far as I know. But you do have [LucidGaze|http://www.lucidimagination.com/Downloads/Certified-Distributions#lucidgaze] which is a Solr monitoring tool and I think it's also a GWT application.

bq. +1. This is great.

Thanks, you can also vote for it ;-)

bq. The Simile project has some nice data explorer UIs. The Simile-Widget gallery displays them.

Thanks for the suggestion. I know this project, but from my experience some of their widgets don't perform really well. Personally, when it comes to data visualization I think flash is the best technology we have at the moment and it's quite easy to interact with it via Javascript and GWT (that's how Google does it for most of their applications/services: analytics, finance, etc.)
[jira] Commented: (SOLR-1099) FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741302#action_12741302 ]

Uri Boness commented on SOLR-1099:

bq. there are a number of oddities (things like using the complete text of a field as the key or name in a map value, listing the value twice, requiring a uniqueKey)

Yes, I know... I didn't feel great about it either, but that's how the original analysis handler worked so I just followed that.

{quote}
And that got me thinking why there are SolrJ classes dedicated to it... and I'm not sure that we should take up space for that. IMO, common things in SolrJ should have easier, more type safe interfaces and uncommon, advanced features should be accessed via the generic APIs in order to keep the interfaces smaller and more understandable for the general user.
{quote}

Why wouldn't you want SolrJ support for it? IMO, it would be great to have SolrJ support for every request handler that ships out of the box with Solr. It makes the user's life simpler and Solr easier to use. And as far as space is concerned... how much does it really add to the overall size of the solrj jar? In any case, we're not talking about megabytes here... and for most people it doesn't really matter - I think it's more important to provide a simple and user friendly API to work with, and if the cost is to add a few extra classes I think it's a pretty cheap price to pay.

It is true (I also mentioned it before) that it's not a major piece of functionality that will be used often... but it is useful to have for tooling support - we're using it in one of the tools that we've created and the admin website can use it as well.

FieldAnalysisRequestHandler
Key: SOLR-1099
URL: https://issues.apache.org/jira/browse/SOLR-1099
Project: Solr
Issue Type: New Feature
Components: Analysis
Affects Versions: 1.3
Reporter: Uri Boness
Assignee: Shalin Shekhar Mangar
Fix For: 1.4
Attachments: AnalisysRequestHandler_refactored.patch, analysis_request_handlers_incl_solrj.patch, AnalysisRequestHandler_refactored1.patch, FieldAnalysisRequestHandler_incl_test.patch, SOLR-1099.patch, SOLR-1099.patch, SOLR-1099.patch

The FieldAnalysisRequestHandler provides the analysis functionality of the web admin page as a service. This handler accepts a fieldtype/fieldname parameter and a value, and as a response returns a breakdown of the analysis process. It is also possible to send a query value, which will use the configured query analyzer, as well as a showmatch parameter, which will then mark every matched token as a match.

If this handler is added to the code base, I also recommend renaming the current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having them both inherit from one AnalysisRequestHandlerBase class which provides the common functionality of the analysis breakdown and its translation to named lists. This will also enhance the current AnalysisRequestHandler, which right now is fairly simplistic.
[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1
[ https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735856#action_12735856 ]

Uri Boness commented on SOLR-1071:

bq. I debated the two different structures for a while and ultimately decided that people would have to deal with it no matter what. My suspicion was that people either use extendedResults or not and that they don't mix them, but perhaps I was wrong. Even if they do mix them, they still need code for recognizing when there is a difference (unless they are just spitting back out the raw, which means it doesn't matter anyway), so I don't know if it matters either way. Since this is out in the wild already, I think we should just fix the bug.

I guess you're right - the users will have to handle the differences between the results anyway.
[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1
[ https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735859#action_12735859 ]

Uri Boness commented on SOLR-1071:

I'm not sure it's a bug in the JSONRW; it seems to me that it was intentionally implemented to behave in this manner. It is confusing though, and indeed when developing components one has to keep in mind the consequences of using a _SimpleOrderedMap_ vs. a simple _NamedList_. I think there are several ways to tackle this:

1. Do nothing. In which case people should always know the consequences of using a _SimpleOrderedMap_ vs. a simple _NamedList_.
*Advantages:* you probably don't break existing functionality. No code changes need to take place.
*Disadvantages:* (as you mentioned) more error prone - easier to introduce such bugs when writing new components. People need to know the best practices, which are not enforced.

2. In the _SimpleOrderedMap_, keep track of duplicate keys. If a _SimpleOrderedMap_ holds duplicate keys then it should not be rendered as a JSON object, but more like a normal _NamedList_.
*Advantages:* you probably break nothing... if components already use duplicate keys in a _SimpleOrderedMap_ then most probably they've introduced this same bug.
*Disadvantages:* Inconsistent in the sense that on different occasions a _SimpleOrderedMap_ will be rendered differently. If duplicate keys are added, then there's no added value in choosing _SimpleOrderedMap_ over a normal _NamedList_. Which brings me to the last option.

3. Make sure that _SimpleOrderedMap_ does not accept duplicates. Either by enforcing it (e.g. by throwing an exception) or just by overriding the values.
*Advantages:* Gives the _SimpleOrderedMap_ a true meaning and a reason to exist. With this in place, it will be clear when and how it can be used. No changes need to be applied to the JSONRW.
*Disadvantages:* Existing functionality might break, yet again... if duplicate keys are already used then this bug is introduced anyway. According to the Javadoc, the _SimpleOrderedMap_ implementation intentionally doesn't prevent duplicate keys... so there must be a reason for that.

Personally, I'm for option 3. The current implementation of _SimpleOrderedMap_ doesn't seem to add any functionality to the _NamedList_ class, so it seems to me this class was created just as a hint for the response writers to render it differently. The name SimpleOrderedMap also suggests Map-like functionality, which doesn't support duplicate keys. But again, I'm not sure about the original reasons for not preventing duplicate keys in the first place, so there might be something I'm missing here.
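Option 3 from the comment above (a map-like variant that refuses duplicate keys) could look roughly like the following. This is a sketch under stated assumptions and not the Solr classes: `PairListSketch` is a hypothetical, simplified stand-in for a NamedList-like ordered list of name/value pairs, and `UniqueOrderedMapSketch` is the stricter variant that throws on a repeated name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for an ordered list of name/value pairs that allows repeats. */
class PairListSketch {
    protected final List<String> names = new ArrayList<>();
    protected final List<Object> values = new ArrayList<>();

    void add(String name, Object value) {
        names.add(name);
        values.add(value);
    }

    int size() {
        return names.size();
    }
}

/** Option 3: same ordered storage, but a repeated name is rejected. */
class UniqueOrderedMapSketch extends PairListSketch {
    @Override
    void add(String name, Object value) {
        // Linear scan; fine for a sketch, a real implementation might keep a set of seen names.
        if (names.contains(name)) {
            throw new IllegalArgumentException("duplicate key: " + name);
        }
        super.add(name, value);
    }
}

class UniqueKeyDemo {
    public static void main(String[] args) {
        PairListSketch list = new PairListSketch();
        list.add("suggestion", "amsterdam");
        list.add("suggestion", "amsterd"); // allowed: a plain pair list keeps both
        System.out.println("pair list size: " + list.size());

        UniqueOrderedMapSketch map = new UniqueOrderedMapSketch();
        map.add("suggestion", "amsterdam");
        try {
            map.add("suggestion", "amsterd");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at `add` time surfaces the mistake in the component that builds the response, rather than in whichever client later parses the JSON.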
[jira] Resolved: (SOLR-1133) solr-common 1.4-SNAPSHOT is not in maven2 repository
[ https://issues.apache.org/jira/browse/SOLR-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uri Boness resolved SOLR-1133.
Resolution: Invalid

In 1.4 solr-common.jar is bundled with solr-solrj.jar

solr-common 1.4-SNAPSHOT is not in maven2 repository
Key: SOLR-1133
URL: https://issues.apache.org/jira/browse/SOLR-1133
Project: Solr
Issue Type: Bug
Reporter: Uri Boness
Fix For: 1.4

Looking at the apache maven2 repository ([http://people.apache.org/repo/m2-snapshot-repository/]) solr-common-1.4-SNAPSHOT is missing
[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr
[ https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727415#action_12727415 ]

Uri Boness commented on SOLR-773:

I guess it is possible to configure the executor service via the configuration of the query parser. That said, having a way to configure executor services in solr config will eliminate some code duplication. I don't think it's good practice to have one executor service for all components to use - the last thing you want is to have components depend on each other by competing for the same threads. I think it is better to fine-tune each component with a thread pool of its own.

Incorporate Local Lucene/Solr
Key: SOLR-773
URL: https://issues.apache.org/jira/browse/SOLR-773
Project: Solr
Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
Attachments: lucene-spatial-2.9-dev.jar, lucene.tar.gz, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-spatial_solr.patch, SOLR-773.patch, SOLR-773.patch, spatial-solr.tar.gz

Local Lucene has been donated to the Lucene project. It has some Solr components, but we should evaluate how best to incorporate it into Solr. See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene
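The preference stated in the SOLR-773 comment above, one thread pool per component, can be illustrated with the standard `java.util.concurrent` classes. The component names and pool sizes below are invented for the example, and how such pools would be declared in solr config is not covered here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** A component that owns its thread pool, so its load cannot starve other components. */
class ComponentWithPool {
    private final String name;
    private final ExecutorService pool;

    ComponentWithPool(String name, int threads) {
        this.name = name;
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Runs a task on this component's own threads. */
    Future<String> submit(String task) {
        return pool.submit(() -> name + " finished " + task);
    }

    /** Stops accepting work and waits briefly for running tasks to complete. */
    void shutdown() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}

class PerComponentPoolDemo {
    public static void main(String[] args) throws Exception {
        // Each component is sized independently of the other.
        ComponentWithPool spatial = new ComponentWithPool("spatial", 4);
        ComponentWithPool spellcheck = new ComponentWithPool("spellcheck", 1);

        System.out.println(spatial.submit("distance filter").get());
        System.out.println(spellcheck.submit("suggestion lookup").get());

        spatial.shutdown();
        spellcheck.shutdown();
    }
}
```

With separate pools, a burst of slow tasks in one component can exhaust only its own threads, which is the isolation the comment argues for. The cost is more threads overall and one more setting to tune per component.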
[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12722763#action_12722763 ] Uri Boness commented on SOLR-1163: -- Thanks Yonik. The console is in the application. If you look down at the lower right corner, you'll find a small console icon (a la FireBug), click on it and the console will open up. Where do I see it fitting? Well, if http://localhost:8983/solr/admin is for the admin page, then I guess http://localhost:8983/solr/explorer can be for the explorer. I don't know, what do you think? Solr Explorer - A generic GWT client for Solr - Key: SOLR-1163 URL: https://issues.apache.org/jira/browse/SOLR-1163 Project: Solr Issue Type: New Feature Components: web gui Affects Versions: 1.3 Reporter: Uri Boness Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch The attached patch is a GWT generic client for solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed solr. It is configured with it's own configuration file, where one can configure the solr instance/core to connect to. Since it's currently standalone and completely client side based, it uses JSON with padding (cross-side scripting) to connect to remote solr servers. Some of the supported features: - Simple query search - Sorting - one can dynamically define new sort criterias - Search results are rendered very much like Google search results are rendered. It is also possible to view all stored field values for every hit. - Custom hit rendering - It is possible to show thumbnails (images) per hit and also customize a view for a hit based on html templates - Faceting - one can dynamically define field and query facets via the UI. it is also possible to pre-configure these facets in the configuration file. - Highlighting - you can dynamically configure highlighting. 
it can also be pre-configured in the configuration file - Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send build and reload commands. - Data import handler - if used, it is possible to send a full-import and status command (delta-import is not implemented yet, but it's easy to add) - Console - For development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to: ** view the client logs ** browse the solr scheme ** View a break down of the current search context ** View a break down of the query URL that is sent to solr ** View the raw JSON response returning from Solr This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more... To get a better view of what's currently possible. We've set up a public version of this client at: http://search.jteam.nl/explorer. This client is configured with one solr instance where crawled YouTube movies where indexed. You can also check out a screencast for this deployed client: http://search.jteam.nl/help The patch created a new folder in the contrib. directory. Since the patch doesn't contain binaries, an additional zip file is provides that needs to be extract to add all the required graphics. This module is maven2 based and is configured in such a way that all GWT related tools/libraries are automatically downloaded when the modules is compiled. One of the artifacts of the build is a war file which can be deployed in any servlet container. NOTE: this client works best on WebKit based browsers (for performance reason) but also works on firefox and ie 7+. That said, it should be taken into account that it is still under development. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1123) Change the JSONResponseWriter content type
[ https://issues.apache.org/jira/browse/SOLR-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12711075#action_12711075 ] Uri Boness commented on SOLR-1123: -- Indeed this is just for convenience and should not be a high priority, but I definitely see it as a nice-to-have. Just to clarify, the suggestion is not to have another request parameter (that would probably be too much, as you mentioned) but instead to add a configuration parameter in solrconfig. So you'll be able to define the json response writer as follows:
{code:xml}
<queryResponseWriter name="json" class="org.apache.solr.request.JSONResponseWriter">
  <bool name="useJsonContentType">true</bool>
</queryResponseWriter>
{code}
Change the JSONResponseWriter content type -- Key: SOLR-1123 URL: https://issues.apache.org/jira/browse/SOLR-1123 Project: Solr Issue Type: Improvement Reporter: Uri Boness Fix For: 1.5 Attachments: JSON_contentType_incl_tests.patch Currently the JSON content type is not used. Instead the text/plain content type is used. The reason for this, as I understand it, is to enable viewing the JSON response as text in the browser. While this is a valid argument, I do believe that there should at least be an option to configure this writer to use the JSON content type. According to [RFC4627|http://www.ietf.org/rfc/rfc4627.txt] the JSON content type needs to be application/json (and not text/x-json). The reason this can be very helpful is that today you have plugins for browsers (e.g. [JSONView|http://brh.numbera.com/software/jsonview]) that can render any page with application/json content type in a user-friendly manner (just like xml is supported). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1123) Change the JSONResponseWriter content type
[ https://issues.apache.org/jira/browse/SOLR-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12711133#action_12711133 ] Uri Boness commented on SOLR-1123: -- I think that would be the best option. The problem right now is in the current class hierarchy of the response writers. Basically, I think the QueryResponseWriter interface should change to:
{code}
public interface QueryResponseWriter extends NamedListInitializedPlugin {

    public void write(OutputStream out, SolrQueryRequest request, SolrQueryResponse response) throws IOException;

    public String getContentType(SolrQueryRequest request, SolrQueryResponse response);
}
{code}
Note: this interface will play nicer with the binary response writer.

Then we can have an AbstractTextResponseWriter which will serve as a parent for all non-binary response writers:
{code}
public abstract class AbstractTextResponseWriter implements QueryResponseWriter {

    public static final String CONTENT_TYPE_PARAM = "contentType";
    public static final String DEFAULT_CONTENT_TYPE = "text/plain; charset=UTF-8";

    // Not final: init() may replace the default with the configured value.
    private String contentType;

    protected AbstractTextResponseWriter() {
        this(DEFAULT_CONTENT_TYPE);
    }

    protected AbstractTextResponseWriter(String defaultContentType) {
        this.contentType = defaultContentType;
    }

    public void init(NamedList args) {
        String configuredContentType = (String) args.get(CONTENT_TYPE_PARAM);
        if (configuredContentType != null) {
            contentType = configuredContentType;
        }
    }

    public String getContentType(SolrQueryRequest request, SolrQueryResponse response) {
        return contentType;
    }

    public final void write(OutputStream out, SolrQueryRequest request, SolrQueryResponse response) throws IOException {
        OutputStreamWriter writer = new OutputStreamWriter(out, "UTF-8");
        write(writer, request, response);
        // Push any buffered characters to the underlying stream.
        writer.flush();
    }

    protected abstract void write(Writer writer, SolrQueryRequest request, SolrQueryResponse response) throws IOException;
}
{code}
This will make it easy for every response writer to define its default
content type, yet it will still allow overriding this default using the contentType parameter in solrconfig. (I assume here that there's no need to customize the content type for the binary response writer as it's internal and specific to the current implementation). Change the JSONResponseWriter content type -- Key: SOLR-1123 URL: https://issues.apache.org/jira/browse/SOLR-1123 Project: Solr Issue Type: Improvement Reporter: Uri Boness Fix For: 1.5 Attachments: JSON_contentType_incl_tests.patch Currently the JSON content type is not used. Instead the text/plain content type is used. The reason for this, as I understand it, is to enable viewing the JSON response as text in the browser. While this is a valid argument, I do believe that there should at least be an option to configure this writer to use the JSON content type. According to [RFC4627|http://www.ietf.org/rfc/rfc4627.txt] the JSON content type needs to be application/json (and not text/x-json). The reason this can be very helpful is that today you have plugins for browsers (e.g. [JSONView|http://brh.numbera.com/software/jsonview]) that can render any page with application/json content type in a user-friendly manner (just like xml is supported). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
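To make the override behaviour described in the comment above concrete, here is a small self-contained sketch. It deliberately avoids the real Solr types: TextWriterSketch and JsonWriterSketch are hypothetical stand-ins, a plain Map takes the place of NamedList, and the payload is a plain String instead of a request/response pair. Only the contentType parameter name is taken from the proposal.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Simplified stand-in for the proposed base class (not the actual Solr API). */
abstract class TextWriterSketch {
    static final String CONTENT_TYPE_PARAM = "contentType";

    private String contentType;

    protected TextWriterSketch(String defaultContentType) {
        this.contentType = defaultContentType;
    }

    /** Replaces the default when the configuration supplies a contentType entry. */
    void init(Map<String, String> args) {
        String configured = args.get(CONTENT_TYPE_PARAM);
        if (configured != null) {
            contentType = configured;
        }
    }

    String getContentType() {
        return contentType;
    }

    /** Wraps the stream in a UTF-8 writer and delegates to the text-based write. */
    final void write(OutputStream out, String payload) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        write(writer, payload);
        writer.flush();
    }

    protected abstract void write(Writer writer, String payload) throws IOException;
}

/** Hypothetical JSON writer that keeps text/plain as its default for backward compatibility. */
final class JsonWriterSketch extends TextWriterSketch {
    JsonWriterSketch() {
        super("text/plain; charset=UTF-8");
    }

    @Override
    protected void write(Writer writer, String payload) throws IOException {
        writer.write(payload);
    }
}
```

In this sketch, a writer that is initialised without a contentType entry keeps its default, while one configured with application/json reports that value instead.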
[jira] Created: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
Solr Explorer - A generic GWT client for Solr - Key: SOLR-1163 URL: https://issues.apache.org/jira/browse/SOLR-1163 Project: Solr Issue Type: New Feature Components: web gui Affects Versions: 1.3 Reporter: Uri Boness

The attached patch is a generic GWT client for Solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed Solr. It is configured with its own configuration file, where one can configure the Solr instance/core to connect to. Since it's currently standalone and completely client-side based, it uses JSON with padding (cross-site scripting) to connect to remote Solr servers.

Some of the supported features:
- Simple query search
- Sorting - one can dynamically define new sort criteria
- Search results are rendered very much like Google search results are rendered. It is also possible to view all stored field values for every hit.
- Custom hit rendering - It is possible to show thumbnails (images) per hit and also customize a view for a hit based on HTML templates
- Faceting - one can dynamically define field and query facets via the UI. It is also possible to pre-configure these facets in the configuration file.
- Highlighting - you can dynamically configure highlighting. It can also be pre-configured in the configuration file
- Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send build and reload commands.
- Data import handler - if used, it is possible to send a full-import and status command (delta-import is not implemented yet, but it's easy to add)
- Console - For development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to:
** View the client logs
** Browse the Solr schema
** View a breakdown of the current search context
** View a breakdown of the query URL that is sent to Solr
** View the raw JSON response returning from Solr

This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more...

To get a better view of what's currently possible, we've set up a public version of this client at: http://search.jteam.nl/explorer. This client is configured with one Solr instance where crawled YouTube movies were indexed. You can also check out a screencast for this deployed client: http://search.jteam.nl/help

The patch creates a new folder in the contrib directory. Since the patch doesn't contain binaries, an additional zip file is provided that needs to be extracted to add all the required graphics. This module is maven2 based and is configured in such a way that all GWT related tools/libraries are automatically downloaded when the module is compiled. One of the artifacts of the build is a war file which can be deployed in any servlet container.

NOTE: this client works best on WebKit based browsers (for performance reasons) but also works on Firefox and IE 7+. That said, it should be taken into account that it is still under development. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1163: - Attachment: graphics.zip solr-explorer.patch Solr Explorer - A generic GWT client for Solr - Key: SOLR-1163 URL: https://issues.apache.org/jira/browse/SOLR-1163 Project: Solr Issue Type: New Feature Components: web gui Affects Versions: 1.3 Reporter: Uri Boness Attachments: graphics.zip, solr-explorer.patch The attached patch is a GWT generic client for solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed solr. It is configured with it's own configuration file, where one can configure the solr instance/core to connect to. Since it's currently standalone and completely client side based, it uses JSON with padding (cross-side scripting) to connect to remote solr servers. Some of the supported features: - Simple query search - Sorting - one can dynamically define new sort criterias - Search results are rendered very much like Google search results are rendered. It is also possible to view all stored field values for every hit. - Custom hit rendering - It is possible to show thumbnails (images) per hit and also customize a view for a hit based on html templates - Faceting - one can dynamically define field and query facets via the UI. it is also possible to pre-configure these facets in the configuration file. - Highlighting - you can dynamically configure highlighting. it can also be pre-configured in the configuration file - Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send build and reload commands. 
- Data import handler - if used, it is possible to send a full-import and status command (delta-import is not implemented yet, but it's easy to add) - Console - For development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to: ** view the client logs ** browse the solr scheme ** View a break down of the current search context ** View a break down of the query URL that is sent to solr ** View the raw JSON response returning from Solr This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more... To get a better view of what's currently possible. We've set up a public version of this client at: http://search.jteam.nl/explorer. This client is configured with one solr instance where crawled YouTube movies where indexed. You can also check out a screencast for this deployed client: http://search.jteam.nl/help The patch created a new folder in the contrib. directory. Since the patch doesn't contain binaries, an additional zip file is provides that needs to be extract to add all the required graphics. This module is maven2 based and is configured in such a way that all GWT related tools/libraries are automatically downloaded when the modules is compiled. One of the artifacts of the build is a war file which can be deployed in any servlet container. NOTE: this client works best on WebKit based browsers (for performance reason) but also works on firefox and ie 7+. That said, it should be taken into account that it is still under development. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1163: - Attachment: solr-explorer.patch fixed the groupId in the pom Solr Explorer - A generic GWT client for Solr - Key: SOLR-1163 URL: https://issues.apache.org/jira/browse/SOLR-1163 Project: Solr Issue Type: New Feature Components: web gui Affects Versions: 1.3 Reporter: Uri Boness Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch The attached patch is a GWT generic client for solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed solr. It is configured with it's own configuration file, where one can configure the solr instance/core to connect to. Since it's currently standalone and completely client side based, it uses JSON with padding (cross-side scripting) to connect to remote solr servers. Some of the supported features: - Simple query search - Sorting - one can dynamically define new sort criterias - Search results are rendered very much like Google search results are rendered. It is also possible to view all stored field values for every hit. - Custom hit rendering - It is possible to show thumbnails (images) per hit and also customize a view for a hit based on html templates - Faceting - one can dynamically define field and query facets via the UI. it is also possible to pre-configure these facets in the configuration file. - Highlighting - you can dynamically configure highlighting. it can also be pre-configured in the configuration file - Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send build and reload commands. 
- Data import handler - if used, it is possible to send a full-import and status command (delta-import is not implemented yet, but it's easy to add) - Console - For development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to: ** view the client logs ** browse the solr scheme ** View a break down of the current search context ** View a break down of the query URL that is sent to solr ** View the raw JSON response returning from Solr This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more... To get a better view of what's currently possible. We've set up a public version of this client at: http://search.jteam.nl/explorer. This client is configured with one solr instance where crawled YouTube movies where indexed. You can also check out a screencast for this deployed client: http://search.jteam.nl/help The patch created a new folder in the contrib. directory. Since the patch doesn't contain binaries, an additional zip file is provides that needs to be extract to add all the required graphics. This module is maven2 based and is configured in such a way that all GWT related tools/libraries are automatically downloaded when the modules is compiled. One of the artifacts of the build is a war file which can be deployed in any servlet container. NOTE: this client works best on WebKit based browsers (for performance reason) but also works on firefox and ie 7+. That said, it should be taken into account that it is still under development. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1099) FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12705406#action_12705406 ] Uri Boness commented on SOLR-1099: -- {quote}I guess contrary to the javadocs, you need to specify either analysis.fieldname or analysis.fieldtype along with analysis.fieldvalue to make it work. They are optional but one of them must be present.{quote} That's true, one of them must be set. {quote}On second thought, we could just use the default search field if both fieldname and fieldtype are not specified.{quote} Sounds like a reasonable fallback. FieldAnalysisRequestHandler --- Key: SOLR-1099 URL: https://issues.apache.org/jira/browse/SOLR-1099 Project: Solr Issue Type: New Feature Components: Analysis Affects Versions: 1.3 Reporter: Uri Boness Assignee: Shalin Shekhar Mangar Fix For: 1.4 Attachments: AnalisysRequestHandler_refactored.patch, analysis_request_handlers_incl_solrj.patch, AnalysisRequestHandler_refactored1.patch, FieldAnalysisRequestHandler_incl_test.patch, SOLR-1099.patch, SOLR-1099.patch The FieldAnalysisRequestHandler provides the analysis functionality of the web admin page as a service. This handler accepts a fieldtype/fieldname parameter and a value, and as a response returns a breakdown of the analysis process. It is also possible to send a query value which will use the configured query analyzer, as well as a showmatch parameter which will then mark every matched token as a match. If this handler is added to the code base, I also recommend renaming the current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having them both inherit from one AnalysisRequestHandlerBase class which provides the common functionality of the analysis breakdown and its translation to named lists. This will also enhance the current AnalysisRequestHandler which right now is fairly simplistic. -- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
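The fallback agreed on in the comment above (use the default search field when neither analysis.fieldname nor analysis.fieldtype is given) might look roughly like the sketch below. The helper AnalysisTargetResolver and its return format are hypothetical, and the precedence of fieldtype over fieldname is an assumption made only for this example, not something the thread states; only the parameter names come from the discussion.

```java
import java.util.Map;

/** Hypothetical helper illustrating the parameter fallback discussed above. */
final class AnalysisTargetResolver {
    private AnalysisTargetResolver() {
    }

    /**
     * Returns "type:NAME" or "field:NAME" to describe which analyzer would be used.
     * Assumed order: explicit field type, then explicit field name, then default search field.
     */
    static String resolve(Map<String, String> params, String defaultSearchField) {
        String fieldType = params.get("analysis.fieldtype");
        if (fieldType != null) {
            return "type:" + fieldType;
        }
        String fieldName = params.get("analysis.fieldname");
        if (fieldName != null) {
            return "field:" + fieldName;
        }
        if (defaultSearchField != null) {
            return "field:" + defaultSearchField;
        }
        throw new IllegalArgumentException(
                "Specify analysis.fieldname or analysis.fieldtype, or configure a default search field");
    }
}
```

A request carrying only analysis.fieldvalue would then be analyzed against the default search field rather than being rejected.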
[jira] Commented: (SOLR-1133) solr-common 1.4-SNAPSHOT is not in maven2 repository
[ https://issues.apache.org/jira/browse/SOLR-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12703593#action_12703593 ] Uri Boness commented on SOLR-1133: -- Actually, the same issue is with solr-lucen-contrib library (the pom's are there, but not the jar) solr-common 1.4-SNAPSHOT is not in maven2 repository Key: SOLR-1133 URL: https://issues.apache.org/jira/browse/SOLR-1133 Project: Solr Issue Type: Bug Affects Versions: 1.3 Reporter: Uri Boness Fix For: 1.3.1 Looking at the apache maven2 repository ([http://people.apache.org/repo/m2-snapshot-repository/]) solr-common-1.4-SNAPSHOT is missing -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1122) Move the lib directory of the velocity contrib out of the src directory.
Move the lib directory of the velocity contrib out of the src directory. Key: SOLR-1122 URL: https://issues.apache.org/jira/browse/SOLR-1122 Project: Solr Issue Type: Improvement Affects Versions: 1.3 Reporter: Uri Boness Priority: Minor Fix For: 1.4 Currently the lib folder is located under the {{trunk/contrib/velocity/src/main/solr}} folder. I guess it should be in {{trunk/contrib/velocity}} instead (this will also be consistent with the other contrib folders). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1071) spellcheck.extendedResults returns an invalid JSON response when count > 1
[ https://issues.apache.org/jira/browse/SOLR-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12701492#action_12701492 ] Uri Boness commented on SOLR-1071: -- One more thing to consider: this component is now a bit inconsistent in its response format. When extendedResults is used, the suggestions are put in an array called alternatives, while when it's not used the suggestions are put in an array called suggestion. I think it would be wise to consider changing the latter to alternatives as well, but of course that would break backward compatibility, and as this component is probably widely used, it's a risk. Another option is, at least temporarily for the 1.4 release, to add support for another parameter (something like spellcheck.version=1.3) that would signal the component to render the response in the 1.3 format - a bit ugly, but it would at least solve the compatibility issues. spellcheck.extendedResults returns an invalid JSON response when count > 1 -- Key: SOLR-1071 URL: https://issues.apache.org/jira/browse/SOLR-1071 Project: Solr Issue Type: Bug Components: spellchecker Affects Versions: 1.3 Reporter: Uri Boness Assignee: Grant Ingersoll Fix For: 1.3.1 Attachments: SpellCheckComponent_fix.patch, SpellCheckComponent_new_structure.patch, SpellCheckComponent_new_structure_incl_test.patch When wt=json, spellcheck.extendedResults=true and spellcheck.count > 1, the suggestions are returned in the following format:

"suggestions":[
  "amsterdm",{
    "numFound":5, "startOffset":0, "endOffset":8, "origFreq":0,
    "suggestion":{ "frequency":8498, "word":"amsterdam"},
    "suggestion":{ "frequency":1, "word":"amsterd"},
    "suggestion":{ "frequency":8, "word":"amsterdams"},
    "suggestion":{ "frequency":1, "word":"amstedam"},
    "suggestion":{ "frequency":22, "word":"amsterdamse"}},
  "beak",{
    "numFound":5, "startOffset":9, "endOffset":13, "origFreq":0,
    "suggestion":{ "frequency":379, "word":"beek"},
    "suggestion":{ "frequency":26, "word":"beau"},
    "suggestion":{ "frequency":26, "word":"baak"},
    "suggestion":{ "frequency":15, "word":"teak"},
    "suggestion":{ "frequency":11, "word":"beuk"}},
  "correctlySpelled",false,
  "collation","amsterdam beek"]}}

This is invalid JSON, as each term is associated with a JSON object which holds multiple suggestion attributes. When working with a JSON library, only the last suggestion attribute is picked up. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
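Why only the last suggestion survives can be shown without any JSON library. The sketch below assumes a parser that stores object members in a map, which is how many JSON libraries behave; DuplicateKeyDemo and its methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical demo of repeated keys in a JSON object versus values in a JSON array. */
final class DuplicateKeyDemo {
    private DuplicateKeyDemo() {
    }

    /** Mimics a map-backed JSON object: a later member with the same key replaces the earlier one. */
    static Map<String, String> asObject(String[][] members) {
        Map<String, String> object = new LinkedHashMap<>();
        for (String[] member : members) {
            object.put(member[0], member[1]);
        }
        return object;
    }

    /** Collects the same values into a list, the way a JSON array would keep all of them. */
    static List<String> asArray(String[][] members) {
        List<String> array = new ArrayList<>();
        for (String[] member : members) {
            array.add(member[1]);
        }
        return array;
    }
}
```

Feeding the three members suggestion=amsterdam, suggestion=amsterd and suggestion=amsterdams into asObject leaves a single entry, while asArray keeps all three, which is why an array-valued structure avoids the data loss.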
[jira] Created: (SOLR-1123) Change the JSONResponseWriter content type
Change the JSONResponseWriter content type -- Key: SOLR-1123 URL: https://issues.apache.org/jira/browse/SOLR-1123 Project: Solr Issue Type: Improvement Affects Versions: 1.3 Reporter: Uri Boness Fix For: 1.3.1 Currently the JSON content type is not used. Instead the text/plain content type is used. The reason for this, as I understand it, is to enable viewing the JSON response as text in the browser. While this is a valid argument, I do believe that there should at least be an option to configure this writer to use the JSON content type. According to [RFC4627|http://www.ietf.org/rfc/rfc4627.txt] the JSON content type needs to be application/json (and not text/x-json). The reason this can be very helpful is that today you have plugins for browsers (e.g. [JSONView|http://brh.numbera.com/software/jsonview]) that can render any page with application/json content type in a user-friendly manner (just like xml is supported). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1123) Change the JSONResponseWriter content type
[ https://issues.apache.org/jira/browse/SOLR-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1123: - Attachment: JSON_contentType_incl_tests.patch This patch is a simple implementation of this functionality. The writer can be configured with a {{userJsonContentType}} boolean parameter; when it is set to {{true}}, the content type of the output will be application/json instead of text/plain. For backward compatibility reasons, when this parameter is absent, the text/plain content type will be used. Change the JSONResponseWriter content type -- Key: SOLR-1123 URL: https://issues.apache.org/jira/browse/SOLR-1123 Project: Solr Issue Type: Improvement Affects Versions: 1.3 Reporter: Uri Boness Fix For: 1.3.1 Attachments: JSON_contentType_incl_tests.patch Currently the JSON content type is not used. Instead the text/plain content type is used. The reason for this, as I understand it, is to enable viewing the JSON response as text in the browser. While this is a valid argument, I do believe that there should at least be an option to configure this writer to use the JSON content type. According to [RFC4627|http://www.ietf.org/rfc/rfc4627.txt] the JSON content type needs to be application/json (and not text/x-json). The reason this can be very helpful is that today you have plugins for browsers (e.g. [JSONView|http://brh.numbera.com/software/jsonview]) that can render any page with application/json content type in a user-friendly manner (just like xml is supported). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1099) FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12701094#action_12701094 ] Uri Boness commented on SOLR-1099: -- Actually, there is no dependency between the handlers and SolrJ. SolrJ comes with its own {{FieldAnalysisRequest}} and {{DocumentAnalysisRequest}} classes (which extend the {{SolrRequest}} class). The inner classes in the handlers are used to represent analysis requests on the server side. Another thing: I believe that the default names for the handlers as you defined them in the default solrconfig.xml (i.e. analysis/field and analysis/document) are better than the ones I came up with :-). The only thing left to do is to update these defaults in the SolrJ request classes: {{FieldAnalysisRequest}} and {{DocumentAnalysisRequest}}. FieldAnalysisRequestHandler --- Key: SOLR-1099 URL: https://issues.apache.org/jira/browse/SOLR-1099 Project: Solr Issue Type: New Feature Components: Analysis Affects Versions: 1.3 Reporter: Uri Boness Assignee: Shalin Shekhar Mangar Fix For: 1.4 Attachments: AnalisysRequestHandler_refactored.patch, analysis_request_handlers_incl_solrj.patch, AnalysisRequestHandler_refactored1.patch, FieldAnalysisRequestHandler_incl_test.patch, SOLR-1099.patch The FieldAnalysisRequestHandler provides the analysis functionality of the web admin page as a service. This handler accepts a fieldtype/fieldname parameter and a value, and as a response returns a breakdown of the analysis process. It is also possible to send a query value which will use the configured query analyzer, as well as a showmatch parameter which will then mark every matched token as a match. If this handler is added to the code base, I also recommend renaming the current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having them both inherit from one AnalysisRequestHandlerBase class which provides the common functionality of the analysis breakdown and its translation to named lists.
This will also enhance the current AnalysisRequestHandler which right now is fairly simplistic. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1099) FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uri Boness updated SOLR-1099: - Attachment: analysis_request_handlers_incl_solrj.patch Latest patch. This one includes SolrJ support. FieldAnalysisRequestHandler --- Key: SOLR-1099 URL: https://issues.apache.org/jira/browse/SOLR-1099 Project: Solr Issue Type: New Feature Components: Analysis Affects Versions: 1.3 Reporter: Uri Boness Assignee: Shalin Shekhar Mangar Fix For: 1.4 Attachments: AnalisysRequestHandler_refactored.patch, analysis_request_handlers_incl_solrj.patch, AnalysisRequestHandler_refactored1.patch, FieldAnalysisRequestHandler_incl_test.patch The FieldAnalysisRequestHandler provides the analysis functionality of the web admin page as a service. This handler accepts a fieldtype/fieldname parameter and a value, and as a response returns a breakdown of the analysis process. It is also possible to send a query value which will use the configured query analyzer, as well as a showmatch parameter which will then mark every matched token as a match. If this handler is added to the code base, I also recommend renaming the current AnalysisRequestHandler to DocumentAnalysisRequestHandler and having them both inherit from one AnalysisRequestHandlerBase class which provides the common functionality of the analysis breakdown and its translation to named lists. This will also enhance the current AnalysisRequestHandler which right now is fairly simplistic. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1099) FieldAnalysisRequestHandler
[ https://issues.apache.org/jira/browse/SOLR-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700787#action_12700787 ]

Uri Boness commented on SOLR-1099:
----------------------------------

Not all input can be sent as request parameters (the documents will still be sent as a request body via a POST), but of course it's still possible to fold everything into one handler. It just feels like putting too much logic and responsibility on a single handler, which increases code complexity and makes it harder to maintain (at least in my opinion). The deprecation also gives users who already use the current ARH a chance to move to the DocumentARH (which has a different response format).
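The split argued for in this thread (one shared base class, one handler fed by request parameters, one fed by a POST body) can be sketched roughly as follows. This is only an illustration of the design idea, not the code in the attached patches: all class names ending in Sketch, the "fieldname" and "value" parameter names, the whitespace-and-lowercase stand-in for a real analyzer, and the "field=value per line" body format are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AnalysisHandlersDemo {
    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("fieldname", "title");
        params.put("value", "Hello World");
        System.out.println(new FieldAnalysisHandlerSketch().handle(params, null));
        System.out.println(new DocumentAnalysisHandlerSketch()
                .handle(new HashMap<>(), "title=Hello World\nbody=Solr Rocks"));
    }
}

// Stand-in for the proposed base class: holds the shared analysis breakdown.
abstract class AnalysisHandlerBaseSketch {
    /** Hypothetical "analyzer": splits on whitespace and lowercases each token. */
    protected List<String> analyzeValue(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase());
            }
        }
        return tokens;
    }

    /** Each handler decides where its input comes from: parameters or body. */
    abstract Map<String, List<String>> handle(Map<String, String> params, String body);
}

// Field analysis: everything arrives as request parameters.
class FieldAnalysisHandlerSketch extends AnalysisHandlerBaseSketch {
    @Override
    Map<String, List<String>> handle(Map<String, String> params, String body) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        out.put(params.get("fieldname"), analyzeValue(params.get("value")));
        return out;
    }
}

// Document analysis: documents arrive in the POST body.
class DocumentAnalysisHandlerSketch extends AnalysisHandlerBaseSketch {
    @Override
    Map<String, List<String>> handle(Map<String, String> params, String body) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (String line : body.split("\n")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                out.put(line.substring(0, eq), analyzeValue(line.substring(eq + 1)));
            }
        }
        return out;
    }
}
```

Keeping input parsing in the subclasses and the breakdown in the base class is what lets each handler stay small, which is the maintainability point made in the comment.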