How to store a HashSet in the index?
Hi, can anyone help me with how to efficiently store in the index, and later retrieve, a HashSet object which contains multiple string arrays? I just want to store the HashSet in the index, not search on it. The HashSet should be returned with the document when I perform a search on any other field. Regards, Rishabh
[jira] Commented: (SOLR-303) Distributed Search over HTTP
[ https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549970 ] Sabyasachi Dalal commented on SOLR-303: --- I fixed the issue with the patch and it works with revision 594268. Now, I am trying to make it work with the latest trunk. I am facing a problem: the FedSearchComponent needs a handle to the handler in order to execute on the local shard. I am trying to figure out how to pass the handler during component initialization. Distributed Search over HTTP Key: SOLR-303 URL: https://issues.apache.org/jira/browse/SOLR-303 Project: Solr Issue Type: New Feature Components: search Reporter: Sharad Agarwal Assignee: Yonik Seeley Attachments: fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.stu.patch, fedsearch.stu.patch Searching over multiple shards and aggregating results. Motivated by http://wiki.apache.org/solr/DistributedSearch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-409) Allow configurable class loader sharing between cores
[ https://issues.apache.org/jira/browse/SOLR-409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550002 ] Walter Ferrara commented on SOLR-409: - Thanks for committing this patch. I noticed that when you have just one core (with no multicore.xml), logging says null, i.e. INFO: [null] Registered new searcher This could be fixed in several ways:
* by giving a (meaningful) name to the core when multicore is not used (like schema.getName())
* by not adding the name of the core when multicore is off (which maybe means reusing the logStr function and checking if MultiCore is enabled)
This is also present in stats.jsp, where a null is printed before the uppercase bold CORE string. IMHO, we should set the name of the single core when multicore is not set - this may make things easier; setting it to the name of its schema could be a solution. Allow configurable class loader sharing between cores - Key: SOLR-409 URL: https://issues.apache.org/jira/browse/SOLR-409 Project: Solr Issue Type: Sub-task Affects Versions: 1.3 Reporter: Henri Biestro Priority: Minor Fix For: 1.3 Attachments: solr-350_409.patch, solr-350_409.patch, solr-350_409.patch, solr-350_409.patch, solr-350_409.patch, solr-350_409.patch, solr-350_409_414.patch, solr-409.patch, solr-409.patch WHAT: This patch allows configuring, in multicore.xml, the parent class loader of all core class loaders used to dynamically create instances. WHY: Current behavior allocates one class loader per config, thus per core. However, there are cases where one would like different cores to share some objects that are dynamically instantiated (i.e., where the class name is used to find the class through the class loader and instantiate it). In the current form, since each core possesses its own class loader, static members are indeed different objects. For instance, there is no way of implementing a singleton shared between 2 request handlers.
Originally from http://www.nabble.com/Post-SOLR215-SOLR350-singleton-issue-tf4776980.html HOW: The sharedLib attribute is extracted from the XML (multicore.xml) configuration file and parsed in the MultiCore load method. The directory path is used to create a URL class loader that will become the parent class loader of all core class loaders; since class resolution is performed on a parent-first basis, this allows sharing instances between different cores. STATUS: operational in conjunction with solr-350
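The parent-first sharing idea described above can be sketched in plain Java. The class and directory names below are illustrative only, not Solr's actual MultiCore code; the point is that a class resolved by the shared parent loader is loaded once, so both cores see the same Class object and its static members.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

public class SharedLoaderSketch {
    public static void main(String[] args) throws Exception {
        // Shared parent loader rooted at a common "sharedLib" directory
        // (the path is hypothetical).
        URL sharedLib = Paths.get("sharedLib/").toUri().toURL();
        ClassLoader parent = new URLClassLoader(new URL[]{sharedLib},
                SharedLoaderSketch.class.getClassLoader());

        // Each core gets its own child loader, with the shared loader as parent.
        ClassLoader core0 = new URLClassLoader(
                new URL[]{Paths.get("core0/lib/").toUri().toURL()}, parent);
        ClassLoader core1 = new URLClassLoader(
                new URL[]{Paths.get("core1/lib/").toUri().toURL()}, parent);

        // Parent-first resolution: both lookups delegate upward and resolve to
        // the same Class object, so statics (e.g. a singleton) are shared.
        Class<?> a = core0.loadClass("java.util.HashMap");
        Class<?> b = core1.loadClass("java.util.HashMap");
        System.out.println(a == b); // prints "true"
    }
}
```

Conversely, a class placed only in each core's own lib directory would be loaded twice, once per child loader, which is exactly the singleton problem the patch addresses.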
[jira] Commented: (SOLR-281) Search Components (plugins)
[ https://issues.apache.org/jira/browse/SOLR-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550031 ] Sabyasachi Dalal commented on SOLR-281: --- I am updating the distributed search patch (SOLR-303) with this patch. I added the dist search components as:

<searchComponent name="gstat" class="org.apache.solr.handler.federated.component.GlobalCollectionStatComponent"/>
<searchComponent name="mqp" class="org.apache.solr.handler.federated.component.MainQPhaseComponent"/>
<searchComponent name="aqp" class="org.apache.solr.handler.federated.component.AuxiliaryQPhaseComponent"/>
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
  <arr name="last-components">
    <str>gstat</str>
    <str>mqp</str>
    <str>aqp</str>
  </arr>
</requestHandler>

But it was not working. On debugging I found that these added components were not getting registered. I made the following change in SolrCore.loadSearchComponents:

// NamedListPluginLoader<SearchComponent> loader = new NamedListPluginLoader<SearchComponent>( xpath, searchComponents );
NamedListPluginLoader<SearchComponent> loader = new NamedListPluginLoader<SearchComponent>( xpath, components );

Search Components (plugins) --- Key: SOLR-281 URL: https://issues.apache.org/jira/browse/SOLR-281 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Ryan McKinley Assignee: Ryan McKinley Fix For: 1.3 Attachments: SOLR-281-ComponentInit.patch, SOLR-281-ComponentInit.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, solr-281.patch, solr-281.patch, solr-281.patch A request handler with pluggable search components for things like: - standard - dismax - more-like-this - highlighting - field collapsing For more discussion, see:
http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#a11050274
Re: How to store a HashSet in the index?
On 10-Dec-07, at 12:09 AM, Rishabh Joshi wrote:
> Can anyone help me on, as to how I can go about efficiently indexing (actually, storing in the index) and retrieving, a HashSet object, which contains multiple string arrays? I just want to store the HashSet in the index, and not search on it. The HashSet should be returned with the document when I perform a search on any other fields.

I don't know what "efficient" means in your context, but why not serialize to bytes and base64 encode, then store as you would a text field in Solr? -Mike
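Mike's serialize-and-base64 suggestion can be sketched as a small helper. This is a hypothetical codec, not part of Solr: it round-trips a HashSet of String[] through Java serialization plus java.util.Base64 (Java 8+; commons-codec's Base64 would serve the same purpose), producing a string suitable for a stored-only field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Base64;
import java.util.HashSet;

// Hypothetical helper: encode a HashSet<String[]> as a base64 string for a
// stored (not indexed) Solr field, and decode it back at query time.
public class HashSetFieldCodec {
    public static String encode(HashSet<String[]> set) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(set); // HashSet and String[] are both Serializable
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    @SuppressWarnings("unchecked")
    public static HashSet<String[]> decode(String fieldValue)
            throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(fieldValue);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (HashSet<String[]>) in.readObject();
        }
    }
}
```

The encoded string would go into a field declared with indexed="false" stored="true" in schema.xml, so it is returned with the document but never tokenized or searched.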
[jira] Updated: (SOLR-415) LoggingFilter for debug
[ https://issues.apache.org/jira/browse/SOLR-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated SOLR-415: Attachment: SOLR-415.patch Attached a revised patch, as Hoss kindly suggested. LoggingFilter for debug --- Key: SOLR-415 URL: https://issues.apache.org/jira/browse/SOLR-415 Project: Solr Issue Type: Improvement Reporter: Koji Sekiguchi Priority: Trivial Attachments: SOLR-415.patch, SOLR-415.patch, SOLR-415.patch, SOLR-415.patch logging version of analysis.jsp
solrj for distributed search
I've been hacking on SOLR-303 (distributed search), and I started to write my own XML parsing code utilizing StAX (streaming), when I realized that the code had already been written (in SolrJ). Should we use SolrJ for making and parsing the distributed search requests? One downside is that SolrJ would need to be moved into the core, but I was planning on migrating to HttpClient at some point anyway. Thoughts? -Yonik
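For context, the kind of hand-rolled StAX parsing Yonik is talking about avoiding looks roughly like this. The element names follow Solr's XML response format, but this is a sketch of the general pattern, not the SOLR-303 code:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class ResponseSketch {
    // Walks a response chunk with a streaming cursor and collects the "name"
    // attribute of every <str> element it encounters.
    public static List<String> strNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT
                    && "str".equals(r.getLocalName())) {
                names.add(r.getAttributeValue(null, "name"));
            }
        }
        return names;
    }
}
```

Every per-element case like this has to be duplicated for lst, arr, int, and so on, which is exactly the code SolrJ's response parser already contains; reusing it avoids maintaining two parsers.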
Re: Confluence wiki vs MoinMoin
: What is really missing is that we don't (at least I don't) have a clear sense : where what type of docs should go. Some in javadocs, some on the wiki, almost : none on the forrest site. Javadocs work great since they are attached to : sources and get included in releases. But solr's users are not all javadoc : readers (nor should they be). Solr docs really should be in a non java : specific context. Once upon a time the plan (or at least my plan) was that how/what/why documentation for provided plugins (dismax, fieldtypes, analysis factories, etc...) would live (close to the code) in class level javadocs -- our users may not be javadoc readers, but we could link straight to the good bits from user centric forrest based overview pages. The wiki would be a way for users to write tips and tricks type docs. But things didn't really work out that way ... as simple as forrest is to use to generate pages, it's not the most friendly tool to add and organically grow a set of documentation ... plus we made the decision early on to start a lot of docs on the wiki to flesh them out and make them easier to tweak with the intention of eventually migrating them to official forrest docs except that we didn't know then what we know now about the legal issues -- but even before we knew about the legal issues, no one ever really had the inclination to migrate any of the docs. even if we switch to cwiki, I still think javadocs are the best way to go for official plugin docs because of the code/doc proximity advantages ... but if they aren't user friendly enough for typical users then maybe we could look into hacking together a custom doclet to just output the class level docs and not the method details? : Having read all the rules, this is my proposal: +1 to the bulk of your proposal, but a few comments...
I would like to suggest a step #0: There doesn't seem to be a cwiki sandbox we can use to test stuff out, so after getting a solr cwiki created, let's do some experiments with the exporting and make sure we can viably export docs that: 1) use all relative links (like forrest) 2) don't contain user comments from non cla users ...so we can be confident the exports can be included as documentation with releases before we spend a lot of time building up the new docset. : 2. We keep http://wiki.apache.org/solr/ as an unofficial sandbox and pre 1.3 : docs. Anyone can edit it, but it is not official. i'm assuming we might eventually want to migrate this to a separate cwiki space just for our own sanity (single syntax, single look/feel, etc...) but i agree this doesn't need to happen any time soon. : For now, i think we should stick with forrest for the website and tutorial. : When the tutorial gets revisited, http://cwiki.apache.org/SOLRxSITE/ may be a i think the current site (including the tutorial) would probably make the best initial docs to put into the new cwiki to test it out since we *know* the legal issues with them are okay and we know they should be included in all releases. eliminating forrest from the equation early on would also help simplify the documentation dilution issues of having forrest docs, wiki docs, and cwiki docs all at once -- especially if in Solr 1.3 (or 1.4 ... whenever it happens) the release itself includes overview docs and a tutorial generated by forrest with other docs generated from a cwiki dump ... the odds of getting those to all hyperlink with each other cleanly seem very low. -Hoss
Re: solrj for distributed search
I think (re)using solrj is a good idea. As a client, I'd rather have one API to use for both distributed and non-distributed calls to Solr. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Yonik Seeley [EMAIL PROTECTED] To: solr-dev@lucene.apache.org Sent: Tuesday, December 11, 2007 12:13:43 AM Subject: solrj for distributed search I've been hacking on SOLR-303 (distributed search), and I started to write my own XML parsing code utilizing stax (streaming), when I realized that the code had already been written (in SolrJ). Should we use SolrJ for making and parsing the distributed search requests? One downside is that SolrJ would need to be moved into the core, but I was planning on migrating to HttpClient at some point anyway. Thoughts? -Yonik
purpose of MultiCore default ?
Forgive me if i'm off base with some stuff here ... i'm still trying to wrap my head around some of the new multicore stuff. Ryan's comments in SOLR-428 have made me realize that the default core means more than i thought. I had misunderstood it to be a way of specifying what the legacy singleton core should be ... but based on SOLR-428 I'm now getting the sense that the default core identifies what core to use if no core is specified in the URL. So if this is your multicore.xml...

<multicore adminPath="/admin/multicore" persistent="true">
  <core name="core0" instanceDir="core0" default="true"/>
  <core name="core1" instanceDir="core1"/>
</multicore>

...then these two URLs are equivalent, correct?

http://localhost:8983/solr/@core0/select?q=*:*
http://localhost:8983/solr/select?q=*:*

If i may ask: what is the motivation for this? isn't it fair to assume that if people want to use multiple cores they can include the core name in every URL? The one use case i can think of is that, based on the SETASDEFAULT option of the MultiCoreHandler, i suspect people want to do stuff like this...

1. start up server with a single core0 as default
2. use default URLs all day long...
   GET http://localhost:8983/solr/select?q=bar
   POST http://localhost:8983/solr/update
   ...
   GET http://localhost:8983/solr/select?q=foo
3. decide you want to change the schema or something, load a new core1
4. rebuild your index using core1 urls...
   POST http://localhost:8983/solr/@core1/update
   ...
5. once you're happy with core1, set it as the default, and unload core0...
   GET http://localhost:8983/solr/admin/multicore?action=SETASDEFAULT&core=core1
   GET http://localhost:8983/solr/admin/multicore?action=UNLOAD&core=core0
6. keep using core1 just like you used to use core0, with default urls...
   GET http://localhost:8983/solr/select?q=bar
   POST http://localhost:8983/solr/update
   ...
   GET http://localhost:8983/solr/select?q=foo

...this seems like a really cool use case of multicores, but it also seems incompatible with the primary goal of multicores: having lots of different indexes; after all, there's only one default, so you can only use this trick with one of your indexes. It seems like if this is the only perk of having a default core, it would make more sense to require a core name in every url (when multicore support is turned on) and replace the SETASDEFAULT operation with a RENAME operation that changes the name of a core (unloading any previous core that was using that name) ... or maybe even support multiple names per core, with some ADDNAME, REMOVENAME, and MOVENAME options...

1 /admin/multicore?action=ADDNAME&coreDir=cores/dir0&name=yak
2 /@yak/select?q=*:*
3 /admin/multicore?action=ADDNAME&coreDir=cores/dir1&name=foo
4 /@foo/select?q=*:*
5 /admin/multicore?action=ADDNAME&coreDir=cores/dir1&name=bar
6 /@bar/select?q=*:*
  (#4 and #6 are now equivalent)
7 /admin/multicore?action=REMOVENAME&coreDir=cores/dir1&name=foo
  (now #4 no longer works)
8 /admin/multicore?action=MOVENAME&coreDir=cores/dir0&name=bar
  (now #2 and #6 are equivalent)

thoughts? -Hoss
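The multi-name proposal boils down to a mutable name-to-core mapping, where several names may point at the same core and MOVENAME retargets an existing name. A toy Java model of those semantics (purely illustrative, not actual MultiCore code; the class and method names are made up):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the ADDNAME/REMOVENAME/MOVENAME idea: names are just keys
// into a map, so many names can resolve to one core directory.
public class CoreNames {
    private final Map<String, String> nameToDir = new HashMap<>();

    public void addName(String coreDir, String name)  { nameToDir.put(name, coreDir); }
    public void removeName(String name)               { nameToDir.remove(name); }
    public void moveName(String coreDir, String name) { nameToDir.put(name, coreDir); }

    // What the dispatcher would consult to turn /@name/... into a core.
    public String resolve(String name) { return nameToDir.get(name); }
}
```

Under this model the "swap in a rebuilt index" trick works for any core, not just the default one: rebuild under a scratch name, then MOVENAME the public name onto the new directory.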
multicore and admin pages?
I notice this in the MultiCore wiki... To access the admin pages for each core visit: http://localhost:8983/solr/admin/?core=core0 http://localhost:8983/solr/admin/?core=core1 ...trying this out using the example multicore setup didn't seem to work (the admin screen said core0 even for the second URL) -- but in general i'm curious if there's a specific desire for the admin pages to work with URLs like this (the core name as a URL param) instead of having the core in the path like the rest of the URLs? Sure the admin pages are (mostly) JSPs, but before the Dispatcher forwards the request/response up the chain, it could pull the core name out of the path and include it as a request attribute, right? -Hoss
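The path-parsing step Hoss describes is small; a sketch of it, assuming the /@coreName/ path convention used elsewhere in this thread (the class name and the idea of stashing the result as a request attribute are assumptions, not Solr's actual dispatch code):

```java
// Sketch: extract the core name from a path like "/@core0/admin/" so a
// dispatching filter could do request.setAttribute("core", name) before
// forwarding to the admin JSPs. Hypothetical helper, not Solr code.
public class CorePathSketch {
    public static String extractCore(String path) {
        if (path == null || !path.startsWith("/@")) {
            return null; // no core in the path; fall back to the default core
        }
        int slash = path.indexOf('/', 1);
        return slash < 0 ? path.substring(2) : path.substring(2, slash);
    }
}
```

The JSPs would then read the attribute instead of a ?core= parameter, keeping admin URLs consistent with the rest of the core-in-path URLs.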