Re: EmbeddedSolrServer API usage
You could also use the CoreContainer to create a Core from the descriptor:

  CoreContainer container = new CoreContainer();
  CoreDescriptor descriptor = new CoreDescriptor(container, "core1", "/Users/erik/apache-solr-1.3.0/example/solr");
  SolrCore core = container.create( descriptor );

If you are using a custom solrconfig name, you would need to call setConfigName( path ) on the descriptor.

As for closing... have you tried core.close()?

ryan

On Oct 2, 2008, at 8:49 AM, Erik Hatcher wrote:

I'm doing some Java experiments to get ready for a solr-ruby overhaul such that JRuby comes into play nicely so that EmbeddedSolrServer can be used transparently too. I've not tried this since the whole CoreContainer/CoreDescriptor stuff was added, and I don't quite understand it all. Here's what I've got:

  public static void main(String[] args) throws IOException, ParserConfigurationException, SAXException, SolrServerException {
    CoreContainer container = new CoreContainer();
    SolrConfig config = new SolrConfig("/Users/erik/apache-solr-1.3.0/example/solr", "solrconfig.xml", null);
    CoreDescriptor descriptor = new CoreDescriptor(container, "core1", "/Users/erik/apache-solr-1.3.0/example/solr");
    SolrCore core = new SolrCore("core1", "/Users/erik/apache-solr-1.3.0/example/solr/data", config, null, descriptor);
    container.register("core1", core, false);
    SolrServer solr = new EmbeddedSolrServer(container, "core1");
    SolrQuery query = new SolrQuery("*:*");
    QueryResponse response = solr.query(query);
    System.out.println("response = " + response);
  }

This works, but has a fair bit of seemingly unnecessary duplication, and it also leaves the JVM running for some reason. Is this the proper way to use EmbeddedSolrServer, or are there some tips for improving the code and reducing the duplication? Also, why does the JVM keep running? Are we spinning off a thread that needs to be shut down? Is there some sort of close() call that is needed?

Thanks,
Erik
[jira] Resolved: (SOLR-796) remove unused SolrIndexSearcher from DUH2
[ https://issues.apache.org/jira/browse/SOLR-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley resolved SOLR-796.
    Resolution: Fixed

remove unused SolrIndexSearcher from DUH2
    Key: SOLR-796
    URL: https://issues.apache.org/jira/browse/SOLR-796
    Project: Solr
    Issue Type: Improvement
    Components: update
    Reporter: Ryan McKinley
    Priority: Minor
    Fix For: 1.4
    Attachments: SOLR-796-remove-searcher.patch

Since the DUH2 does not use the searcher for deletes anymore, it does not need to be able to...

Check: http://www.nabble.com/Fwd%3A-read-only-SolrCore--td19769173.html

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: What is a Standard SearchComponent? (SOLR-680)
On Oct 1, 2008, at 5:33 PM, Ryan McKinley wrote:

I disagree with Erik that we should have people explicitly configure the components.

Folks don't have to explicitly configure them, if they are just running with the example configuration - which is more likely than not. Oh, another thing about search components, I don't like the first/last thing - I like it to be explicit, less magic.

As for component registration precedence, it is the configured Component that has precedence. The Component initialization code only adds the default Component if that name is not already used. Registering your own spellcheck Component will use your component.

Right, but what if someone has a stats component now in 1.3 wired into a custom request handler (but not /select), then upgrades to 1.4 with the new implicit stats built-in - then all of a sudden a request to /select will use _their_ stats component, not the built in one. Right?

Ok, I see your point... since we are past 1.3, this may be a moot point, but how about something like:

* SearchHandler has no components registered and must be configured manually.
* StandardRequestHandler (currently nothing more than extends SearchHandler) would register all components with no dependencies - it would not support things like first/last components.

Users extending SearchHandler would have absolute control -- users extending StandardRequestHandler would have standard configuration - features may be added between major releases, but not removed.

ryan
[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication
[ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Lee updated SOLR-433:
    Attachment: SOLR-433.patch

Includes Stephane's fixes for snappuller and snapinstaller, and some minor edits.

MultiCore and SpellChecker replication
    Key: SOLR-433
    URL: https://issues.apache.org/jira/browse/SOLR-433
    Project: Solr
    Issue Type: Improvement
    Components: replication, spellchecker
    Affects Versions: 1.3
    Reporter: Otis Gospodnetic
    Fix For: 1.4
    Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch

With MultiCore functionality coming along, it looks like we'll need to be able to:
  A) snapshot each core's index directory, and
  B) replicate any and all cores' complete data directories, not just their index directories.

Pulled from the spellchecker and multi-core index replication thread - http://markmail.org/message/pj2rjzegifd6zm7m

Otis: I think that makes sense - distribute everything for a given core, not just its index. And the spellchecker could then also have its data dir (and only index/ underneath really) and be replicated in the same fashion. Right?

Ryan: Yes, that was my thought. If an arbitrary directory could be distributed, then you could have

  /path/to/dist/index/...
  /path/to/dist/spelling-index/...
  /path/to/dist/foo

and that would all get put into a snapshot. This would also let you put multiple cores within a single distribution:

  /path/to/dist/core0/index/...
  /path/to/dist/core0/spelling-index/...
  /path/to/dist/core0/foo
  /path/to/dist/core1/index/...
  /path/to/dist/core1/spelling-index/...
  /path/to/dist/core1/foo
[jira] Created: (SOLR-797) Construct EmbeddedSolrServer response without serializing/parsing
Construct EmbeddedSolrServer response without serializing/parsing
    Key: SOLR-797
    URL: https://issues.apache.org/jira/browse/SOLR-797
    Project: Solr
    Issue Type: Improvement
    Components: clients - java
    Affects Versions: 1.3
    Reporter: Jonathan Lee

Currently, the EmbeddedSolrServer serializes the response and reparses in order to create the final NamedList response. From the comment in EmbeddedSolrServer.java, the goal is to:
* convert the response directly into a named list
[jira] Updated: (SOLR-797) Construct EmbeddedSolrServer response without serializing/parsing
[ https://issues.apache.org/jira/browse/SOLR-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Lee updated SOLR-797:
    Priority: Minor (was: Major)

Construct EmbeddedSolrServer response without serializing/parsing
    Key: SOLR-797
    URL: https://issues.apache.org/jira/browse/SOLR-797
    Project: Solr
    Issue Type: Improvement
    Components: clients - java
    Affects Versions: 1.3
    Reporter: Jonathan Lee
    Priority: Minor

Currently, the EmbeddedSolrServer serializes the response and reparses in order to create the final NamedList response. From the comment in EmbeddedSolrServer.java, the goal is to:
* convert the response directly into a named list
[jira] Updated: (SOLR-797) Construct EmbeddedSolrServer response without serializing/parsing
[ https://issues.apache.org/jira/browse/SOLR-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Lee updated SOLR-797:
    Attachment: SOLR-797.patch

This patch contains a first stab at transforming the NamedList without serializing it and then parsing it from the serialized form. From what I can tell, all the fields (headers, facets, spelling, etc) returned from the handler in the response are valid for output, except that references to actual documents need to be resolved. This patch borrows code from NamedListCodec.java and BinaryResponseWriter.java to resolve the documents.

Construct EmbeddedSolrServer response without serializing/parsing
    Key: SOLR-797
    URL: https://issues.apache.org/jira/browse/SOLR-797
    Project: Solr
    Issue Type: Improvement
    Components: clients - java
    Affects Versions: 1.3
    Reporter: Jonathan Lee
    Priority: Minor
    Attachments: SOLR-797.patch

Currently, the EmbeddedSolrServer serializes the response and reparses in order to create the final NamedList response. From the comment in EmbeddedSolrServer.java, the goal is to:
* convert the response directly into a named list
Re: EmbeddedSolrServer API usage
Thanks Ryan - good tips, and core.close() was the missing piece, duh. Here's how it looks in JRuby:

  container = CoreContainer.new
  descriptor = CoreDescriptor.new(container, "core1", "/Users/erik/apache-solr-1.3.0/example/solr")
  core = container.create(descriptor)
  container.register("core1", core, false)
  solr = EmbeddedSolrServer.new(container, "core1")
  query = SolrQuery.new("*:*")
  response = solr.query(query)
  puts response
  core.close

Perhaps there should be an overloaded CoreContainer#register(core) that uses the name from the core descriptor so "core1" doesn't have to be duplicated?

Erik
[jira] Created: (SOLR-798) FileListEntityProcessor can't handle directories containing lots of files
FileListEntityProcessor can't handle directories containing lots of files
    Key: SOLR-798
    URL: https://issues.apache.org/jira/browse/SOLR-798
    Project: Solr
    Issue Type: Bug
    Components: contrib - DataImportHandler
    Reporter: Grant Ingersoll
    Priority: Minor

The FileListEntityProcessor currently tries to process all documents in a single directory at once, and stores the results into a hashmap. On directories containing a large number of documents, this quickly causes OutOfMemory errors. Unfortunately, the typical fix to this is to hack FileFilter to do the work for you and always return false from the accept method. It may be possible to hook up some type of Producer/Consumer multithreaded FileFilter approach whereby the FileFilter blocks until the nextRow() mechanism requests another row, thereby avoiding the need to cache everything in the map.
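
To make the Producer/Consumer idea in SOLR-798 concrete, here is a minimal sketch using only the JDK. It is not the DataImportHandler code and not a proposed patch: the class name StreamingFileLister, the END sentinel and the capacity parameter are all made up for illustration. A background thread walks the directory with a FileFilter that always returns false, handing each file to a bounded queue; the filter blocks whenever the queue is full, so only a handful of rows are buffered at a time. Note that the JDK's File.listFiles may still read the directory's names up front; the sketch only avoids building the large per-row map.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper, not part of Solr: streams files to a consumer one row at a time.
final class StreamingFileLister {
    // Sentinel object that marks the end of the listing (compared by identity).
    private static final File END = new File("");
    private final BlockingQueue<File> queue;

    StreamingFileLister(File dir, int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        // The filter does the real work and always rejects, so listFiles
        // never accumulates a result array of accepted files.
        FileFilter handOff = f -> {
            if (f.isFile()) {
                enqueue(f); // blocks while the queue is full
            }
            return false;
        };
        Thread producer = new Thread(() -> {
            try {
                dir.listFiles(handOff);
            } finally {
                enqueue(END);
            }
        });
        producer.setDaemon(true); // do not keep the JVM alive for the listing thread
        producer.start();
    }

    private void enqueue(File f) {
        try {
            queue.put(f);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns the next file, or null once the directory has been exhausted.
    File nextRow() throws InterruptedException {
        File f = queue.take();
        if (f == END) {
            queue.put(END); // keep the sentinel so later calls also return null
            return null;
        }
        return f;
    }
}
```

A caller would loop on nextRow() until it returns null. The bounded queue is the design choice that matters here: its capacity, rather than the size of the directory, caps how many rows are held in memory.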
Re: EmbeddedSolrServer API usage
On Oct 2, 2008, at 1:58 PM, Erik Hatcher wrote:

Thanks Ryan - good tips, and core.close() was the missing piece, duh. Here's how it looks in JRuby:

  container = CoreContainer.new
  descriptor = CoreDescriptor.new(container, "core1", "/Users/erik/apache-solr-1.3.0/example/solr")
  core = container.create(descriptor)
  container.register("core1", core, false)
  solr = EmbeddedSolrServer.new(container, "core1")
  query = SolrQuery.new("*:*")
  response = solr.query(query)
  puts response
  core.close

Perhaps there should be an overloaded CoreContainer#register(core) that uses the name from the core descriptor so "core1" doesn't have to be duplicated?

+1

  public SolrCore register(SolrCore core, boolean returnPrev) {
    return register( core.getName(), core, returnPrev );
  }
Re: LogoContest Process Timeline ... was: Re: [Solr Wiki] Update of LogoContest by HossMan
Hi,

I have to attend to a personal matter, so I won't be able to deliver this week. Is it OK if I deliver next week? I think this should still be acceptable according to the original 4 week schedule. (I am sorry about that...)

Lukas

On Tue, Sep 30, 2008 at 9:31 PM, Lukáš Vlček [EMAIL PROTECTED] wrote:

On Tue, Sep 30, 2008 at 9:12 PM, Chris Hostetter [EMAIL PROTECTED] wrote:

: May I have a question? What is PRC?

Sorry: it's the Public Relations Committee. They don't have much of a web presence, so i can't include a handy URL explaining all about them, but they are the committee established by the ASF Board to oversee all things related to Apache PR (including branding and the policies for projects Logos [rant]which projects are expected to follow, but aren't posted anywhere for people to find[/rant].)

http://www.apache.org/foundation/how-it-works.html#other

OK, what does the PRC have to do with the logo design? Is there any particular constraint/request that the logo design must follow? What is it? You mentioned that the logo design has to contain the word Apache, are there any other requirements like this?

: (And I am sorry for not delivering other Logo proposals ... it is due to

no problem, we're all just volunteering on this afterall -- the question is do you (as a graphic artist) think 4 weeks is enough time to see some really good, creative designs come in?

-Hoss

4 weeks sounds good. I will deliver more stuff by the end of this week. (Wow! did I say this publicly?)

Lukas
[jira] Updated: (SOLR-55) TEST of Jira email integration
[ https://issues.apache.org/jira/browse/SOLR-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man Trash Test Account updated SOLR-55:
    Attachment: solr.png

testing image attachment -- please ignore

TEST of Jira email integration
    Key: SOLR-55
    URL: https://issues.apache.org/jira/browse/SOLR-55
    Project: Solr
    Issue Type: Task
    Reporter: Hoss Man
    Assignee: Hoss Man
    Priority: Trivial
    Attachments: solr.png

Test issue to experiment with jira email integration.
[jira] Updated: (SOLR-55) TEST of Jira email integration
[ https://issues.apache.org/jira/browse/SOLR-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man Trash Test Account updated SOLR-55:
    Attachment: (was: solr.png)

TEST of Jira email integration
    Key: SOLR-55
    URL: https://issues.apache.org/jira/browse/SOLR-55
    Project: Solr
    Issue Type: Task
    Reporter: Hoss Man
    Assignee: Hoss Man
    Priority: Trivial

Test issue to experiment with jira email integration.
Re: Fwd: LogoContest Process Timeline ... was: Re: [Solr Wiki] Update of LogoContest by HossMan
: Perhaps pushing the date to the 20th, and finishing on Thanksgiving?

Yeah, that sounds like a better idea. i'll update the wiki.

: Using JIRA is good because it takes care of all the IP issues and is
: already there to use, but not so good because the thumbnails look
: crappy and people who submit items can not delete their own. It also
: falls into the problem of voting on 5 versions of the same logo.

I just double checked ... in spite of what the Manage Attachments page in Jira says, anyone can delete a file they themselves attached (i was pretty sure because we've had problems in the past with people removing patches because they have new approaches and the old ones get lost forever)

The thumbnail issue is admittedly annoying.

: Rather than voting directly off the JIRA page -- When the submissions
: are closed, I suggest we build a special page for the contest and
: only put 'final drafts' on that. This page will be passed by the

that makes sense -- it will help eliminate the possibility of people adding new attachments in the middle of voting as well. but as i said: if people want to withdraw a logo they can take care of that via Jira before the deadline.

-Hoss
Re: LogoContest Process Timeline ... was: Re: [Solr Wiki] Update of LogoContest by HossMan
: OK, what PRC has to do with the logo design? Is there any particular
: constraint/request that the logo design must follow? What is it? You
: mentioned that the logo design has to contain a word Apache, are there any
: other requirements like this?

all of the guidelines and requirements they've outlined are on our wiki page...

http://wiki.apache.org/solr/LogoContest

-Hoss
Re: Fwd: LogoContest Process Timeline ... was: Re: [Solr Wiki] Update of LogoContest by HossMan
: Yeah, that sounds like a better idea. i'll update the wiki.

I've made final updates to the process on the wiki. i'll do a big announce to solr-user, [EMAIL PROTECTED] and on the Solr home page tomorrow unless anyone objects soon.

-Hoss
Re: What is a Standard SearchComponent? (SOLR-680)
: The reason I thought StatsComponent is default while SpellCheck is not is
: that SpellChecking necessarily requires some configuration. Stats can be
: there without doing anything -- it is just the cost of checking if
: stats=true in the request.
:
: I suggest that we add *all* Components that are generally useful off the shelf
: to the StandardRequestHandler. We should add documentation to say: if you

I think that general philosophy is what makes the most sense ... as long as a component can be used without special configuration add it by default, but Components should be NOOPs unless activated by params (ie: facet=true).

people worried about saving every last cycle can create an explicit list of components and trust that no new components will get added to the pipeline when they upgrade. ... but for people who upgrade without modifying their configs, and may not even know anything about search components, give them the new hotness by default, so when they ask how do i...? and someone says try adding &foo=true&foo.this=that to your request the hotness starts to work for them without them needing to change anything. (just like when features got added to standard/dismax before we had components)

-Hoss
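
As a rough illustration of "NOOP unless activated by params", the sketch below models a component that sits in the default pipeline but does nothing unless the request carries stats=true. It deliberately avoids the real Solr classes: the Component interface, StatsLikeComponent, the handle method and the map-based request and response are stand-ins invented for this example, and only the stats=true flag comes from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of an opt-in component; these are not Solr's SearchComponent APIs.
final class OptInComponentDemo {
    interface Component {
        void process(Map<String, String> params, Map<String, Object> response);
    }

    static final class StatsLikeComponent implements Component {
        @Override
        public void process(Map<String, String> params, Map<String, Object> response) {
            // The only cost for requests that do not ask for stats is this check.
            if (!Boolean.parseBoolean(params.getOrDefault("stats", "false"))) {
                return;
            }
            response.put("stats", "computed for field " + params.get("stats.field"));
        }
    }

    // Stand-in for a handler whose default pipeline always includes the component.
    static Map<String, Object> handle(Map<String, String> params) {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("response", "docs for " + params.get("q"));
        new StatsLikeComponent().process(params, response);
        return response;
    }
}
```

A request without the flag gets the same response it got before the component existed, which is what lets it be added by default without surprising users who upgrade.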
Re: What is a Standard SearchComponent? (SOLR-680)
: As for component registration precedence, it is the configured Component
: that has precedence. The Component initialization code only adds the
: default Component if that name is not already used. Registering your own
: spellcheck Component will use your component.
:
: Right, but what if someone has a stats component now in 1.3 wired into a
: custom request handler (but not /select), then upgrades to 1.4 with the new
: implicit stats built-in - then all of a sudden a request to /select will use
: _their_ stats component, not the built in one. Right?

this never really sat well with me before either ... but i couldn't really place why it didn't sit well until you gave that example.

we can't undo the current behavior because it has its legitimate use cases: i've subclassed QueryComponent to modify it with some custom behavior, and i've registered an instance of MyQueryComponent with the name query and now i'm relying on it to get used by default.

I think the solution to this problem is education and documentation ... people customizing Solr may occasionally have to make some changes when upgrading, it's a fact of life that can't be completely avoided.

Consider another equally plausible situation: i have custom response writer in my Solr 1.2 and it uses a request param named defType -- which when i upgrade to Solr 1.3 suddenly causes all sorts of things to break, because the param name i picked collides with a new param that all the request handlers i use pay attention to.

In the request param collision situation users are *really* screwed, because they have to change the params their clients send ... by comparison, the component name collision situation is trivial to deal with, just search and replace stats with mystats everywhere in your solrconfig.xml.

-Hoss
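
The precedence rule quoted above (a default is only added if that name is not already used) can be sketched with a plain map. This is an assumption-laden model, not Solr's initialization code: ComponentRegistry, its method names and the class-name strings are hypothetical, and component instances are represented as strings to keep the example self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical registry illustrating "configured components win over defaults".
final class ComponentRegistry {
    private final Map<String, String> components = new LinkedHashMap<>();

    // Components declared in the user's configuration are registered first.
    void registerConfigured(String name, String implementation) {
        components.put(name, implementation);
    }

    // Defaults are added afterwards, but only for names that are still free.
    void registerDefaults(Map<String, String> defaults) {
        defaults.forEach(components::putIfAbsent);
    }

    String lookup(String name) {
        return components.get(name);
    }
}
```

With a user component registered as stats and a built-in default also named stats, lookup("stats") returns the user's implementation, which is the upgrade surprise described in the quoted example. Registering the custom one as mystats instead leaves the name stats free for the built-in default.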
solr 2.0 branch/sandbox?
Hey-

Rather than continually point to solr 2.0 as a future future thing, i'd like to give a go at removing all configs and deprecated stuff. -- I doubt that would end up being the real direction, but as an exercise would be quite valuable to figure out what the major issues will be and see how it feels.

What do you think the best way to do this is? How do you feel if I make a branch to experiment with stripping all configs out of solr, perhaps:

http://svn.apache.org/repos/asf/lucene/solr/branches/sandbox/
or
http://svn.apache.org/repos/asf/lucene/solr/branches/sandbox/ryan/

thoughts?

ryan
Re: LogoContest Process Timeline ... was: Re: [Solr Wiki] Update of LogoContest by HossMan
Hi,

As for the Apache word in the logo: is there any limitation regarding the font? I mean the current Apache Software Foundation logo is pretty well known and seems to use something like Arial Bold. I would expect that this should not be changed in the new Solr logo. BTW: Is there any reference Apache logo downloadable from the net in vector format? Any comments on this?

Regards,
Lukas

On Fri, Oct 3, 2008 at 12:13 AM, Chris Hostetter [EMAIL PROTECTED] wrote:

: OK, what PRC has to do with the logo design? Is there any particular
: constraint/request that the logo design must follow? What is it? You
: mentioned that the logo design has to contain a word Apache, are there any
: other requirements like this?

all of the guidelines and requirements they've outlined are on our wiki page...

http://wiki.apache.org/solr/LogoContest

-Hoss