[jira] Commented: (SOLR-215) Multiple Solr Cores
[ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525054 ]

Henri Biestro commented on SOLR-215:

Rakesh, the patch needs to be applied to the Solr 1.3 dev trunk source. Getting the source is described at http://lucene.apache.org/solr/version_control.html (and I suggest you also read the FAQ here: http://wiki.apache.org/solr/FAQ ). Instructions to apply the patch are described in the Jira issue (as well as a description of its applicability and usefulness; are you sure you need this patch?)
Regards, Henri
Quoted from: http://www.nabble.com/-jira--Created%3A-%28SOLR-215%29-Multiple-Solr-Cores-tf3651963.html#a12487432

Multiple Solr Cores
---
Key: SOLR-215
URL: https://issues.apache.org/jira/browse/SOLR-215
Project: Solr
Issue Type: Improvement
Reporter: Henri Biestro
Priority: Minor
Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch

WHAT: As of 1.2, Solr only instantiates one SolrCore, which handles one Lucene index. This patch is intended to allow multiple cores in Solr, which also brings multiple-index capability. The patch file to grab is solr-215.patch.zip (see MISC section below).

WHY: The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications. There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.

Multiple cores:
- Deployment issues within some organizations where IT will resist deploying multiple web applications.
- Seamless schema updates, where you can create a new core and switch to it without starting/stopping servers.
- Embedding Solr in your own application (instead of 'raw' Lucene) where you functionally need to segregate schemas/collections.

Multiple indexes:
- Multiple language collections where each document exists in different languages, analysis being language dependent.
- Document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.

HOW: The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured and behaves exactly as the one core in 1.2; the various caches are per-core, as is the info-bean-registry. What the patch does is replace the SolrCore singleton with a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema, the core). Each core is 'named', and a static map (keyed by name) allows you to easily manage them. You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different URL.
USAGE (example web deployment, patch installed):

Step 0: java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
Will index the 2 documents in solr.xml and monitor.xml.
Step 1: http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on the core0 index; 2 documents.
Step 2: http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on the core1 index; no documents.
Step 3: java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml to the index of core0 and the mon*.xml to the index of core1; running queries from the admin interface, you can verify the indexes have different content.

USAGE (Java code):
//create a configuration
SolrConfig config = new SolrConfig("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the two others
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0");

PATCH MODIFICATION DETAILS (per package):
org.apache.solr.core: The heaviest modifications are in SolrCore and SolrConfig. SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the
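The singleton-to-map change described above can be sketched in plain Java. The class and method names below (CoreRegistry, register) are illustrative only, not the patch's actual API; they just show the shape of the change, including keeping a default entry for pre-patch singleton behaviour.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the patch's idea: replace a SolrCore singleton with a
// static map of cores keyed by name. Hypothetical names throughout.
public class CoreRegistry {
    // The 'null' (default) core preserves the old singleton behaviour.
    private static final String DEFAULT = "";
    private static final Map<String, Object> cores = new ConcurrentHashMap<>();

    public static void register(String name, Object core) {
        cores.put(name == null ? DEFAULT : name, core);
    }

    public static Object getCore(String name) {
        return cores.get(name == null ? DEFAULT : name);
    }
}
```

Calling `getCore(null)` then behaves like the old singleton accessor, while named lookups reach the additional cores.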
[jira] Updated: (SOLR-209) Multifields and multivalued facets
[ https://issues.apache.org/jira/browse/SOLR-209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Klaas updated SOLR-209:
Fix Version/s: (was: 1.1.0)

Multifields and multivalued facets
--
Key: SOLR-209
URL: https://issues.apache.org/jira/browse/SOLR-209
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.1.0
Environment: Java
Reporter: Rida Benjelloun
Attachments: MultiFieldsHitsFacets.java, MultiFieldsHitsFacets.patch

MultiFieldsHitsFacets increases the performance of faceting on multiValued fields by creating facets from Lucene Hits. It also allows the creation of facets using multiple fields. The fields must be separated by a single space. Example: facet.field=subject subjectG subjectA . Rida Benjelloun [EMAIL PROTECTED]

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-200) Scripts don't work when run as root in ~root and su'ing to a user
[ https://issues.apache.org/jira/browse/SOLR-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Klaas updated SOLR-200:
Fix Version/s: (was: 1.1.0)

Scripts don't work when run as root in ~root and su'ing to a user
-
Key: SOLR-200
URL: https://issues.apache.org/jira/browse/SOLR-200
Project: Solr
Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Jürgen Hermann
Priority: Minor

This patch avoids an error due to permission problems when orig_dir is /root:

-orig_dir=$(pwd)
-cd ${0%/*}/..
-solr_root=$(pwd)
-cd ${orig_dir}
+solr_root=$(cd ${0%/*}/..; pwd)
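The patched idiom computes the script's parent directory in a subshell, so the caller's working directory is never changed and no "cd back" is needed. A small standalone sketch (the helper function name and demo paths are illustrative; the real scripts inline the expression):

```shell
# Sketch of the SOLR-200 idiom: resolve ../ relative to the script path
# inside a subshell, leaving the caller's cwd untouched.
script_parent_dir() {
  # $1 plays the role of $0 in the real scripts
  (cd "${1%/*}/.." && pwd)
}

# Demo with a throwaway directory layout (hypothetical paths):
demo=$(mktemp -d)/solr/bin
mkdir -p "$demo"
root=$(script_parent_dir "$demo/snapshooter")
echo "$root"
```

Because the `cd` happens in a subshell, this works even when the current directory (e.g. /root) is unreadable after an `su` to a less privileged user.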
[jira] Commented: (SOLR-345) Add support for debugging
[ https://issues.apache.org/jira/browse/SOLR-345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525162 ]

Mike Klaas commented on SOLR-345:

Can this issue be resolved?

Add support for debugging
-
Key: SOLR-345
URL: https://issues.apache.org/jira/browse/SOLR-345
Project: Solr
Issue Type: New Feature
Components: clients - C#
Affects Versions: 1.2
Reporter: Jeff Rodenburg
Fix For: 1.2

Add support for the debugQuery and explainOther query parameters
[jira] Resolved: (SOLR-345) Add support for debugging
[ https://issues.apache.org/jira/browse/SOLR-345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Rodenburg resolved SOLR-345.
Resolution: Fixed
Resolving issue.

Add support for debugging
-
Key: SOLR-345
URL: https://issues.apache.org/jira/browse/SOLR-345
Project: Solr
Issue Type: New Feature
Components: clients - C#
Affects Versions: 1.2
Reporter: Jeff Rodenburg
Fix For: 1.2

Add support for the debugQuery and explainOther query parameters
[jira] Updated: (SOLR-288) Allow custom IndexSearcher class
[ https://issues.apache.org/jira/browse/SOLR-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Klaas updated SOLR-288:
Fix Version/s: (was: 1.2)

Allow custom IndexSearcher class
Key: SOLR-288
URL: https://issues.apache.org/jira/browse/SOLR-288
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 1.2
Environment: all
Reporter: Jerome Eteve
Priority: Minor

Allows specification of the IndexSearcher class used, in the schema, like this:

<!-- Custom Searcher class that must inherit from the Lucene IndexSearcher and have a constructor with only one (IndexReader r) argument -->
<searcher class="com.acme.lucene.search.MyGreatSearcher"/>

I got a patch for this for the src of version 1.2.0, but I don't know if I can post it here.
[jira] Resolved: (SOLR-299) Audit use of backticks in solr.py
[ https://issues.apache.org/jira/browse/SOLR-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Klaas resolved SOLR-299.
Resolution: Fixed
Committed r573019.

Audit use of backticks in solr.py
-
Key: SOLR-299
URL: https://issues.apache.org/jira/browse/SOLR-299
Project: Solr
Issue Type: Bug
Components: clients - python
Affects Versions: 1.2
Reporter: Mike Klaas
Assignee: Mike Klaas
Fix For: 1.3

Backticks are often the wrong thing to do (they return a debugging representation of a variable). This may be superseded by the new Python client, but should be fixed in the meantime.
[jira] Created: (SOLR-347) AIOOBE when field name declared twice in schema
AIOOBE when field name declared twice in schema
---
Key: SOLR-347
URL: https://issues.apache.org/jira/browse/SOLR-347
Project: Solr
Issue Type: Bug
Components: search
Reporter: Hoss Man

Anecdotal reports indicate that defining the same field name twice in your schema triggers a cryptic ArrayIndexOutOfBoundsException ... http://www.nabble.com/Can%27t-get-1.2-running-under-Tomcat-5.5-tf4384790.html ... we should test for this and either throw a more helpful error, or ignore one and issue a warning.
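A defensive check of the sort the issue asks for could look like the sketch below. This is illustrative only (the class name and method are hypothetical); the real fix would live in Solr's IndexSchema field parsing.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: detect a field name declared twice and fail with a clear
// message, instead of letting a later ArrayIndexOutOfBoundsException
// surface far from the actual mistake. Not the actual IndexSchema code.
public class SchemaFieldCheck {
    public static void checkUnique(List<String> fieldNames) {
        Set<String> seen = new HashSet<>();
        for (String name : fieldNames) {
            if (!seen.add(name)) {
                throw new IllegalArgumentException(
                    "Schema error: field '" + name + "' is declared more than once");
            }
        }
    }
}
```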
[jira] Commented: (SOLR-334) pluggable query parsers
[ https://issues.apache.org/jira/browse/SOLR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525199 ]

Yonik Seeley commented on SOLR-334:

Another easy-yet-useful feature would be parameter substitution. Great for separating the user query from what you do with it. If the userq parameter contained the raw user query, one could specify a dismax query via q={!dismax value=$userq}

pluggable query parsers
---
Key: SOLR-334
URL: https://issues.apache.org/jira/browse/SOLR-334
Project: Solr
Issue Type: New Feature
Reporter: Yonik Seeley

One should be able to optionally specify an alternate query syntax on a per-query basis: http://www.nabble.com/Using-HTTP-Post-for-Queries-tf3039973.html#a8483387 Many benefits, including avoiding the need to do query parser escaping for simple term or prefix queries. Possible examples:

fq={!term field=myfield}The Term
fq={!prefix field=myfield}The Prefix
q={!qp op=AND}a b
q={!xml}<?xml... // lucene XML query syntax?
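The parameter-substitution idea can be illustrated with a toy sketch: replace $name references in a query template with values taken from the request parameters. This is not Solr's eventual implementation, just a minimal model of the behaviour described; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy sketch of parameter substitution: expand $name references in a
// query string from a parameter map, so the raw user query stays in a
// separate parameter (e.g. userq) from the query template.
public class ParamSubstitution {
    private static final Pattern REF = Pattern.compile("\\$(\\w+)");

    public static String substitute(String template, Map<String, String> params) {
        Matcher m = REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String val = params.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(val == null ? "" : val));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With userq=ipod, a template like "$userq" expands to the raw user query without the handler ever hard-coding it.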
boosting a query by a function of other fields
I need to be able to boost a relevancy query by the value of another field (boost as in multiply, not add in a function query to a boolean query as dismax currently does). And instead of a single field, it will actually be a function of many different fields (so most likely a function query). Instead of just hacking up a custom request handler, it seems like this functionality would be of general use in Solr, so I've started thinking about it in more general terms.

Issues:
- Solr still uses its own FunctionQuery instead of what was added back to Lucene. May want to migrate first, or may want to hold off if changes are needed that would be easier to make here.
- I thought about being able to include queries as a ValueSource (so one could do functions on relevancy scores). One issue with this is that some queries need rewriting to work... so rewrite functionality would need to be added to ValueSource.
- Some field values may be sparse... so if you multiplied the relevancy query by field myscore, which only existed on a few documents, you would not want to multiply the scores for other documents by 0. So either A) treat 0 as special in the multiply function... it means not indexed, and so the multiplier would be 1; or B) remember which values were filled in for a field and provide a way to get that info (a DocSet?).
- Weighting... if a FunctionQuery is at the top level, we need a way to get back un-weighted scores (actually, looking at the code again, it looks like as long as the boost is 1, we will get back exact scores from the FunctionQuery already).
- Interface: seems like SOLR-334 (pluggable query parsers) could help out here to enable existing handlers to specify a boosted query without introducing yet more different (hard-coded) HTTP query params.

Thoughts? -Yonik
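Option A for the sparse-field problem (treat a missing value as a multiplier of 1) can be sketched without any Lucene machinery. The names below are illustrative; in Solr this logic would sit inside a ValueSource/FunctionQuery rather than a plain map lookup.

```java
import java.util.Map;

// Sketch of option A: multiply a relevancy score by a per-document
// field value, treating documents that lack the field as having a
// multiplier of 1.0 so sparse fields don't zero out their scores.
public class SparseBoost {
    public static float boost(float score, Map<Integer, Float> fieldValues, int docId) {
        Float v = fieldValues.get(docId);
        return score * (v == null ? 1.0f : v);
    }
}
```

Option B would instead track which documents have the field (e.g. in a DocSet) and consult that set before multiplying.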
Re: boosting a query by a function of other fields
On 5-Sep-07, at 1:10 PM, Yonik Seeley wrote:

> Issues:
> - Solr still uses its own FunctionQuery instead of what was added back to Lucene. May want to migrate first, or may want to hold off if changes are needed that would be easier to make here.

I believe that the main thing missing is the various functions (which should be subclasses of CustomScoreQuery).

> - I thought about being able to include queries as a ValueSource (so one could do functions on relevancy scores). One issue with this is that some queries need rewriting to work... so rewrite functionality would need to be added to ValueSource.

ValueSources don't guarantee that getValue(docid) gets called in order, though, and seem to require a value for every doc. We can cheat on this, of course, but when I implemented this I preferred skipping it entirely (see below).

> - weighting... if a FunctionQuery is at the top level, we need a way to get back un-weighted scores (actually, looking at the code again, it looks like as long as the boost is 1, we will get back exact scores from the FunctionQuery already).

Why do we need the unweighted score?

> - Interface: seems like SOLR-334 (pluggable query parsers) could help out here to enable existing handlers to specify a boosted query without introducing yet more different (hard-coded) HTTP query params. Thoughts?

I've implemented exactly this; see http://issues.apache.org/jira/browse/LUCENE-850

Changed CustomScoreQuery (ValueSource-only) to CustomBoostQuery (match/score query + arbitrary multiplicative boost query), and use it with dismax (as an arbitrary number of bq.prod parameters) as follows:

// generic product boost queries
List<Query> prodBoostQueries = U.parseQueryStrings(req, params.getParams(DMP.BQ_PROD));
if (null != prodBoostQueries && prodBoostQueries.size() > 0) {
  Query main = query;
  Query out = null;
  for (Query q : prodBoostQueries) {
    out = new CustomBoostQuery(main, q);
    main = out;
  }
  query = new BooleanQuery(true);
  query.add(out, Occur.MUST);
}

-Mike
Re: boosting a query by a function of other fields
: - Solr still uses it's own FunctionQuery instead of what was added
: back to Lucene. May want to migrate first, or may want to hold off if
: changes are needed that would be easier to make here.

i haven't had a chance to look at it yet, but i noticed Jira kick out an email recently about a patch being added to SOLR-192 ... probably worth reviewing as you consider this.

: - I thought about being able to include queries as a ValueSource (so
: one could do functions on relevancy scores). One issue with this is
: that some queries need rewriting to work... so rewrite functionality
: would need to be added to ValueSource.

not necessarily ... the getValues(IndexReader) method already sort of serves the purpose of rewrite ... a QueryValueSource class could rewrite the query in its getValues method and use the resulting query/weight/scorer in the DocValues object it returns.

: - some field values may be sparse... so if you multiplied the
: relevancy query by field myscore which only existed on a few
: documents, you would not want to multiply the scores for other
: documents by 0. So either

this seems like the kind of thing that could best be dealt with in something like FieldValueSource -- where the underlying ValueSource getting the field value decides what value to use if the doc doesn't have one. -Hoss
[jira] Commented: (SOLR-344) New Java API
[ https://issues.apache.org/jira/browse/SOLR-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525231 ]

Hoss Man commented on SOLR-344:

I've only had a chance to skim the attached PDF ... I've printed it out in the hopes that I'll find some time to read in depth your specific ideas about what the ideal Solr API should be; but there are a few things that jumped out at me that I wanted to address while they were on my mind...

-- Motivation --

- Direct Java is better -

A key assumption in this proposal seems to be that if you are writing a Java app, and you want to use Solr, you should not use the HTTP interface. I would argue strongly against this assumption. there are *lots* of reasons why it makes sense to treat Solr as a webservice and interact with it over HTTP instead of having a tight coupling with your Java application: redundancy, load balancing, ... Even if someone had a situation where they only had one machine in their entire operation, and all of their applications ran on that machine, i would still suggest installing a servlet container and using Solr that way, because it's likely they will have more than one application that will want to deal with their index. Solr can make a lot of good optimizations and assumptions that go right out the window if you try to embed Solr in 2 different apps reading and writing to the same physical index directory. Even if compelling stats can be presented that the HTTP+XML/JSON overhead is in fact a bottleneck, i would still think that pursuing something like an RMI based client/server API in addition to the HTTP API would make more sense than encouraging people to use Solr directly in the JVM of their other applications.
Even the Plugin model (for embedding your custom Java code into Solr) is something i only recommend in situations where it makes a lot of sense for that logic to be tied closely to the Solr or Lucene internals (ie: as part of the TokenStream, or dealing with the DocSets before they are cached, etc...). The #1 Value Add that Solr has over Lucene is the Client/Server abstraction ... there are certainly other value adds -- some small (like added TokenFilters) and some big (like the IndexSchema concept) -- and many of these could probably be refactored into the Lucene core (or a Lucene contrib) so they could be reused by other Lucene applications in addition to Solr ... but Solr *is* an application. Arguing that you shouldn't bother using a client/server relationship to deal with Solr if your application is written in Java is like arguing that you shouldn't bother using a client/server relationship to deal with MySQL if your application is written in C.

- Demand for direct access -

The statement "a significant proportion of questions on the mailing lists are clearly from people who are attempting such integrations right now" does not serve as a clear call to action ... even if a significant number of recent questions have related to embedded Solr (and I'm not convinced the number is that significant), that one data point alone does not clearly indicate that it is important/urgent to make this easier to do. It just indicates that the people who are attempting to do this have questions about how to do it ... which isn't that surprising considering it's a relatively new concept that hasn't really been documented. Some of these people may just be assuming that they *need* to embed Solr in their existing Java applications because they don't realize it's intended to be used as a server.
The [EMAIL PROTECTED] list gets lots of questions from people who misunderstand the demo code in the Lucene distribution and think Lucene is an application that they can run on the command line to index files and search them -- that doesn't mean that the Lucene-Java project should revamp itself to focus on producing an application instead of a library; it means the Lucene-Java community has to help educate users about: A) how they can use the Lucene library to build their own apps; and B) what apps are built on top of the Lucene library that might be useful to them. I think it would probably be more beneficial for the community as a whole if people spent more time/energy documenting the benefits/mechanisms of using Solr as a server, or improving the client APIs to make communicating with a Solr server faster/easier, than it would be to dedicate a lot of resources solely towards making Solr more of a library and less of an application.

-- Strategy for making changes --

All that said -- i agree with you that a lot of improvements can and should be made to the internal APIs. Not because i think we need to make it easier to embed Solr, but to make it easier for new developers to work on the Solr internals (or to write plugins). If embedding Solr gets easier as a result -- great, but I don't see that as a compelling reason for change.
Some wiki cleanup thoughts
i've been thinking that there are some little things that could be done to clean the wiki up a bit. i'm happy to do them, but i wanted to send out a quick ping before i spend a chunk of time on it in case someone thinks it's a bad idea...

1) find all usages of stuff like [Solr 1.2+], [Solr1.2], [Solr 1.3] that denote when a feature/param was added and change them to... !["Solr1.2"] (or 1.3 as needed) ...the ! will insert an attention icon to help people notice them, and the quotes will force MoinMoin to make a link out of the text (Solr1.2 isn't a very MoinMoin friendly wiki word)

2) gut the contents of the Solr1.2 page (it still has a bunch of pre-1.2 release planning) and replace it with a page explaining when 1.2 was released, where to find it, and a blurb explaining that any links to this page are used to indicate a feature that was added in this release.

3) add a Solr1.3 page with a similar blurb about how it denotes a feature that *should* be added in 1.3. (making these pages and using links instead of just text will make it easier down the road to find them and remove them ... when we're up to Solr 2.0 we don't really need to draw attention to features added in 1.2)

4) create MoinMoin Categories for things like RequestHandlers and ResponseWriters (the MoinMoin editing screen gives you a pulldown to assign pages to categories when editing them) .. put all of the RequestHandler and ResponseWriter pages in the appropriate category, and change various places (ie: FrontPage) to use the macro for listing all pages in a category so we don't have to maintain the list manually.

objections? -Hoss
Re: Some wiki cleanup thoughts
objections? It all sounds good to me! ryan
Re: Some wiki cleanup thoughts
I was considering adding quite a few pages for the solrsharp project. Any specific requests for where new content should go? -- j

On 9/5/07, Chris Hostetter [EMAIL PROTECTED] wrote:
[...]
Re: Some wiki cleanup thoughts
: I was considering adding quite a few pages for the solrsharp project. Any
: specific requests for where new content should go?

Flare set a good example by using subpages... Flare gives you an overview, Flare/Goals has Goals, Flare/ToDo has todo ... etc. It lets you have pages with general names that don't over-pollute the main namespace. there are also macros to make it easy to navigate around subpages with a common parent... http://wiki.apache.org/solr/HelpOnMacros#head-3bd981f061f262d3605935dfe29519d9be970487 ...one thing Flare doesn't use but could is the Search macro to generate a numbered list of all subpages... [[FullSearch(title:Flare/)]] ...since i'm not really involved in Flare, i don't feel comfortable changing that ... but you could do it for SolrSharp pages. (i'm about to do something similar with other top level pages) -Hoss
Re: Some wiki cleanup thoughts
I think these wiki changes are positive; I've been meaning to create a page for the new PHP response writers introduced in SOLR-196 but haven't had time yet. I think the Response Writer section is still a little rough, given that some of the wiki pages listed there are also listed in the IntegratingSolr category. Is it worthwhile creating new pages, e.g. JSONResponseWriter, and putting the response writer related content from SolJSON into them (and similar for SolRuby and SolPython)? I think it would make the section clearer. What does everyone think? Piete

On 06/09/07, Chris Hostetter [EMAIL PROTECTED] wrote:
I've done 1, 2, and 3 below. i also started on #4, doing just QueryResponseWriters. It may be my imagination (or it may be unrelated) but pages seem slower at the moment ... so before i add categories for other things ... i'm going to let it soak for a few days and see if it's my imagination.
[jira] Commented: (SOLR-344) New Java API
[ https://issues.apache.org/jira/browse/SOLR-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525287 ]

Jonathan Woods commented on SOLR-344:

Hoss - I take on board a lot of what you say, and I appreciate the fact you even skimmed the PDF without immediately accusing me of hubris! I'll come back to you in a couple of days' time, when I've finished hacking my way through old Lucene-based code I hoped to have (and maybe should have) thrown away.

New Java API
Key: SOLR-344
URL: https://issues.apache.org/jira/browse/SOLR-344
Project: Solr
Issue Type: Improvement
Components: clients - java, search, update
Affects Versions: 1.3
Reporter: Jonathan Woods
Attachments: New Java API for Solr.pdf

The core Solr codebase urgently needs to expose a new Java API designed for use by Java running in Solr's JVM, and ultimately by core Solr code itself. This API must be (i) object-oriented ('typesafe'), (ii) self-documenting, (iii) at the right level of granularity, and (iv) designed specifically to expose the value which Solr adds over and above Lucene. This is an urgent issue for two reasons:

- Java-Solr integrations represent a use-case which is nearly as important as the core Solr use-case in which non-Java clients interact with Solr over HTTP
- a significant proportion of questions on the mailing lists are clearly from people who are attempting such integrations right now.

This point in Solr development - some way out from the 1.3 release - might be the right time to do the development and refactoring necessary to produce this API. We can do this without breaking any backward compatibility from the point of view of XML/HTTP and JSON-like clients, and without altering the core Solr algorithms which make it so efficient. If we do this work now, we can significantly speed up the spread of Solr. Eventually, this API should be part of core Solr code, not hived off into some separate project nor in a non-first-class package space.
It should be capable of forming the foundation of any new Solr development which doesn't need to delve into low level constructs like DocSet and so on - and any new development which does need to do just that should be a candidate for incorporation into the API at the same level. Whether or not it will ever be worth re-writing existing code is a matter of opinion; but the Java API should be such that if it had existed before core plug-ins were written, it would have been natural to use it when writing them. I've attached a PDF which makes the case for this API. Apologies for delivering it as an attachment, but I wanted to embed pics and a bit of formatting. I'll update this issue in the next few days to give a prototype of this API to suggest what it might look like at present. This will build on the work already done in Solrj and SearchComponents (https://issues.apache.org/jira/browse/SOLR-281), and will be a patch on an up-to-date revision of Solr trunk.

[PS:
1. Having written most of this, I then properly looked at SearchComponents/SOLR-281 and read http://www.nabble.com/forum/ViewPost.jtp?post=11050274&framed=y, which says much the same thing albeit more quickly! And weeks ago, too. But this proposal is angled slightly differently:
- it focusses on the value of creating an API not only for internal Solr consumption, but for local Java clients
- it focusses on designing a Java API without constantly being hobbled by HTTP-Java
- it's suggesting that the SearchComponents work should result in a Java API which can be used as much by third party Java as by ResponseBuilder.
2. I've made some attempt to address Hoss's point (http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#6551097579454875774) - that an API like this would need to maintain enough state, e.g. to allow an initial search to later be faceted, highlighted etc without going back to the start each time - but clearly the proof of the pudding will be in the prototype.
3.
Again, I've just discovered SOLR-212 (DirectSolrConnection). I think all my comments about Solrj apply to this, useful though it clearly is.]
Re: Some wiki cleanup thoughts
: I think the Response Writer section is still a little rough, given that some
: of the wiki pages listed there are also listed in the IntegratingSolr

you know ... i didn't even notice that when i did it.

: category. Is it worthwhile creating new pages e.g. JSONResponseWriter and
: putting the response writer related content from SolJSON into it (and
: similar for SolRuby SolPython)? I think it would make the section
: clearer. What does everyone think?

go for it. It definitely makes sense for there to be pages specific to the response writers, separate from the pages specific to the languages, where notes about communicating with Solr from clients written in those languages can refer to any useful response writers. (although this means we probably want to rename SolJSON to SolrAJAX ... but that doesn't seem like a big problem) -Hoss