[jira] Updated: (SOLR-108) Some basic clean up and validation for Solr::Request::Standard
[ https://issues.apache.org/jira/browse/SOLR-108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] William Groppe updated SOLR-108: Attachment: request_standard.patch > Some basic clean up and validation for Solr::Request::Standard > -- > > Key: SOLR-108 > URL: https://issues.apache.org/jira/browse/SOLR-108 > Project: Solr > Issue Type: Improvement > Components: clients - ruby - flare > Environment: Darwin rocket 8.8.1 Darwin Kernel Version 8.8.1: Mon Sep > 25 19:42:00 PDT 2006; root:xnu-792.13.8.obj~1/RELEASE_I386 i386 i386 >Reporter: William Groppe >Priority: Minor > Attachments: request_standard.patch > > > This is a slightly different approach from the one you took. If we get a > hash in, and we want a hash out, why convert it to instance variables? This > could even be taken a step further, why not create the hash that solr will > need on initialization? Anyway, it does some basic validation, and passes > all tests. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (SOLR-108) Some basic clean up and validation for Solr::Request::Standard
Some basic clean up and validation for Solr::Request::Standard -- Key: SOLR-108 URL: https://issues.apache.org/jira/browse/SOLR-108 Project: Solr Issue Type: Improvement Components: clients - ruby - flare Environment: Darwin rocket 8.8.1 Darwin Kernel Version 8.8.1: Mon Sep 25 19:42:00 PDT 2006; root:xnu-792.13.8.obj~1/RELEASE_I386 i386 i386 Reporter: William Groppe Priority: Minor This is a slightly different approach from the one you took. If we get a hash in, and we want a hash out, why convert it to instance variables? This could even be taken a step further, why not create the hash that solr will need on initialization? Anyway, it does some basic validation, and passes all tests. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (SOLR-104) Update Plugins
[ https://issues.apache.org/jira/browse/SOLR-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464667 ] Ryan McKinley commented on SOLR-104: I agree with J.J.'s points 1-6. Thanks. If one were to look at only one thing, the thing to look at would be how handlers get their parameters and content streams. From my previous post: I define three basic types of request handlers in: http://svn.lapnap.net/solr/handler-draft/solr/src/java/org/apache/solr/handler/ 1) standard. This gets everything from parameters (GET or POST) 2) posted. This gets a reader from the posted body: http://svn.lapnap.net/solr/handler-draft/solr/src/java/org/apache/solr/handler/PostedRequestHandler.java 3) multipart. This gets an iterator over each file item using the Commons FileUpload streaming API http://jakarta.apache.org/commons/fileupload/streaming.html http://svn.lapnap.net/solr/handler-draft/solr/src/java/org/apache/solr/handler/MultipartRequestHandler.java and: http://svn.lapnap.net/solr/handler-draft/solr/src/webapp/src/org/apache/solr/servlet/SolrRequestFilter.java fills them up. It currently uses instanceof... any other ideas? > Update Plugins > -- > > Key: SOLR-104 > URL: https://issues.apache.org/jira/browse/SOLR-104 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.2 >Reporter: Ryan McKinley > Fix For: 1.2 > > Attachments: HandlerRefactoring-DRAFT-SRC.zip, > HandlerRefactoring-DRAFT-SRC.zip, HandlerRefactoring.DRAFT.patch, > HandlerRefactoring.DRAFT.patch, HandlerRefactoring.DRAFT.zip > > > The plugin framework should work for 'update' actions in addition to 'search' > actions. > For more discussion on this, see: > http://www.nabble.com/Re%3A-Handling-disparate-data-sources-in-Solr-tf2918621.html#a8305828
Re: [jira] Commented: (SOLR-104) Update Plugins
Yonik Seeley (JIRA) wrote: [ https://issues.apache.org/jira/browse/SOLR-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464655 ] Yonik Seeley commented on SOLR-104: --- I haven't had a chance to look at all this stuff yet, but we should take care to not try and implement too much. In some cases the right "plugin" mechanism might be the servlet spec and web.xml (made me think of it when I saw the "cookies" comment :-) Thanks for the feedback. I know I got a bit carried away, but at least I now have a clue how stuff works! I posted the code more to stimulate discussion than to suggest it is THE direction. Re: cookies. Yes, cookies may be going a bit far :)
Re: switch to native locks by default?
: Ah, I hadn't realized that they might not be supported everywhere... I I'm just trusting the javadoc for NativeFSLockFactory ... i have no idea if it's accurate or not. : The current locking can also guard against mistakes though (multiple : instances of Solr trying to write to the same dir, someone opening a : Luke index on it, etc). right ... but it's only useful if all of the potential clients are using the same locking mechanism ... right now it's only safe to do any of those things if all the apps use SimpleFSLockFactory. all the more reason to make the factory and the lockDir configurable in Solr i guess. -Hoss
[jira] Commented: (SOLR-106) new facet params: facet.sort, facet.mincount, facet.offset
[ https://issues.apache.org/jira/browse/SOLR-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464661 ] Yonik Seeley commented on SOLR-106: --- I did have a reservation about always guaranteeing a sort order... I wasn't sure if it would always be easy to maintain term-sort order in future implementations. If that were the case, *maybe* it should be sort=count|term|none... might be more future-flexible, but harder to remember than a boolean. Just curious, what are your usecases for facet paging, and what percent of facet queries would have an offset other than 0? > new facet params: facet.sort, facet.mincount, facet.offset > -- > > Key: SOLR-106 > URL: https://issues.apache.org/jira/browse/SOLR-106 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Yonik Seeley > Attachments: facet_params.patch > > > a couple of new facet params: > facet lists become pageable with facet.offset, facet.limit (idea from Erik) > facet.sort explicitly specifies sort order (true for count descending, false > for natural index order) > facet.mincount: minimum count for facets included in response (idea from JJ, > deprecate zeros) -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (SOLR-106) new facet params: facet.sort, facet.mincount, facet.offset
[ https://issues.apache.org/jira/browse/SOLR-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464658 ] J.J. Larrea commented on SOLR-106: -- +1 on direction, and based on a quick scan of the patch. It could be sort=asc|desc|none rather than true|false, but I'm not sure whether anyone would ever have a use for asc so it's probably not worth implementing. Of course extending to caching the facet tallies would dramatically speed paging. Perhaps both getFieldCacheCounts and getFacetTermEnumCounts should return a Collection, which could be a BoundedTreeSet when sorting or a List implementation when not, holding all the counts >= facet.mincount; then getTermCounts could centralize the paging and response creation and provide an object to (someday) cache. It would lose the mincount=0 optimization for getFieldCacheCounts, but how many users are really going to want mincounts=0 unless the list is small and non-sparse in which case the optimization isn't a big win anyway. > new facet params: facet.sort, facet.mincount, facet.offset > -- > > Key: SOLR-106 > URL: https://issues.apache.org/jira/browse/SOLR-106 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Yonik Seeley > Attachments: facet_params.patch > > > a couple of new facet params: > facet lists become pageable with facet.offset, facet.limit (idea from Erik) > facet.sort explicitly specifies sort order (true for count descending, false > for natural index order) > facet.mincount: minimum count for facets included in response (idea from JJ, > deprecate zeros) -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
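A rough sketch of what centralizing the paging could look like, independent of how the counts were gathered. This is not the code in facet_params.patch: the class and method names are made up for illustration, a plain SortedMap stands in for the counts in index (term) order, and treating a negative limit as "no limit" is an assumption.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical helper, not part of Solr: applies mincount, sort, offset and limit
// to a set of facet counts that has already been computed.
class FacetPagingSketch {

    static List<Map.Entry<String, Integer>> page(SortedMap<String, Integer> countsInTermOrder,
                                                 boolean sortByCount,
                                                 int minCount,
                                                 int offset,
                                                 int limit) {
        // 1) facet.mincount: drop terms whose count is below the threshold.
        List<Map.Entry<String, Integer>> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : countsInTermOrder.entrySet()) {
            if (e.getValue() >= minCount) {
                kept.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
            }
        }
        // 2) facet.sort=true: count descending. List.sort is stable, so ties stay in term order.
        if (sortByCount) {
            kept.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        }
        // 3) facet.offset / facet.limit: take one page (negative limit = no limit, an assumption).
        int from = Math.min(Math.max(offset, 0), kept.size());
        int to = limit < 0 ? kept.size() : (int) Math.min((long) kept.size(), (long) from + limit);
        return new ArrayList<>(kept.subList(from, to));
    }
}
```

The full filtered list built in step 1 is the kind of object that could someday be cached, with only step 3 repeated per page.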
solrb
I've committed some changes to solrb today. Of note: * Solr::Request::Standard - I know the parameter handling is ugly. It currently implements all of Solr's StandardRequestHandler except the highlighting parameters. I'd love for this to be much cleaner. If you have suggestions/patches I'm all for it! * examples/marc - I've implemented a very simple MARC importer. It's got mapping capabilities like this: mapping = { :id => Proc.new {|r| r['001'].value}, :subject_genre_facet => ['650v', '655a'], :subject_era_facet => '650y', :title_text => '245a' } * I've simplified the Solr schema used by solrb quite dramatically in order to get a concrete start on a faceting front-end. Right now the only fields the client should send are id, *_facet, and *_text. *_text gets copied into text, so it can be used as a general-purpose full-text search field. *_facet fields are not tokenized. I expect my changes to be a dramatic over-simplification and that this will evolve back up to handle other data types. I am keen on seeing how clean we can keep the field naming conventions as these will be used for name mappings back and forth to a Ruby domain model eventually. Erik p.s. Ed and Will, you guys on solr-dev? CC'd for good measure for now.
[jira] Commented: (SOLR-104) Update Plugins
[ https://issues.apache.org/jira/browse/SOLR-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464655 ] Yonik Seeley commented on SOLR-104: --- I haven't had a chance to look at all this stuff yet, but we should take care to not try and implement too much. In some cases the right "plugin" mechanism might be the servlet spec and web.xml (made me think of it when I saw the "cookies" comment :-) > Update Plugins > -- > > Key: SOLR-104 > URL: https://issues.apache.org/jira/browse/SOLR-104 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.2 >Reporter: Ryan McKinley > Fix For: 1.2 > > Attachments: HandlerRefactoring-DRAFT-SRC.zip, > HandlerRefactoring-DRAFT-SRC.zip, HandlerRefactoring.DRAFT.patch, > HandlerRefactoring.DRAFT.patch, HandlerRefactoring.DRAFT.zip > > > The plugin framework should work for 'update' actions in addition to 'search' > actions. > For more discussion on this, see: > http://www.nabble.com/Re%3A-Handling-disparate-data-sources-in-Solr-tf2918621.html#a8305828 -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (SOLR-104) Update Plugins
[ https://issues.apache.org/jira/browse/SOLR-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464654 ] J.J. Larrea commented on SOLR-104: -- 6. I may be going a little crazy with this soft-configuration concept, but thinking about how to support the legacy /select?qt=faceted... format leads me to think there could be a trivial (3-line handleRequestBody) NamedRequestHandler which uses one parameter to provide the name of another parameter which names another requestHandler definition which it would then invoke. With that, qt standard would allow /select?qt=dismax... to be soft-implemented; a developer who had no use for the non-URL-path selectors could strip it out, another developer who wanted to use a different parameter to set the handler could define it that way. > Update Plugins > -- > > Key: SOLR-104 > URL: https://issues.apache.org/jira/browse/SOLR-104 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.2 >Reporter: Ryan McKinley > Fix For: 1.2 > > Attachments: HandlerRefactoring-DRAFT-SRC.zip, > HandlerRefactoring-DRAFT-SRC.zip, HandlerRefactoring.DRAFT.patch, > HandlerRefactoring.DRAFT.patch, HandlerRefactoring.DRAFT.zip > > > The plugin framework should work for 'update' actions in addition to 'search' > actions. > For more discussion on this, see: > http://www.nabble.com/Re%3A-Handling-disparate-data-sources-in-Solr-tf2918621.html#a8305828 -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
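A minimal sketch of the indirection described in point 6, to make the idea concrete. Nothing here is Solr API: the SketchHandler interface, the registry map and the constructor arguments are all invented for illustration, and plain string maps stand in for request params.

```java
import java.util.Map;

// Hypothetical stand-in for a request handler: takes request params, returns a response.
interface SketchHandler {
    String handle(Map<String, String> params);
}

// Looks up the name of the real handler in a request parameter whose own name
// comes from configuration (e.g. "qt"), then delegates to that handler.
class NamedRequestHandlerSketch implements SketchHandler {
    private final String selectorParam;               // configured, e.g. "qt"
    private final String defaultName;                 // configured, e.g. "standard"
    private final Map<String, SketchHandler> registry; // handler definitions by name

    NamedRequestHandlerSketch(String selectorParam, String defaultName,
                              Map<String, SketchHandler> registry) {
        this.selectorParam = selectorParam;
        this.defaultName = defaultName;
        this.registry = registry;
    }

    @Override
    public String handle(Map<String, String> params) {
        String name = params.get(selectorParam);
        if (name == null || name.isEmpty()) {
            name = defaultName;
        }
        SketchHandler target = registry.get(name);
        if (target == null) {
            throw new IllegalArgumentException("No handler found: " + name);
        }
        return target.handle(params);
    }
}
```

Registered under "select" with selectorParam "qt", this would keep /select?qt=dismax working without any special-case code in the dispatcher; a developer who prefers a different parameter name only changes the configuration.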
[jira] Commented: (SOLR-104) Update Plugins
[ https://issues.apache.org/jira/browse/SOLR-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464653 ] J.J. Larrea commented on SOLR-104: -- I think this is a fantastic effort, so please take my comments and suggestions for improvement below in the context of my appreciation you took the time to wade deeply into the SOLR request-handling code looking for places to improve it. If the request structure is cleaned up along these lines it will make it much simpler for people to develop and contribute alternate request handlers (both update and query varieties) and further SOLRs standing as an open-source community-driven project. 1. I really like your idea of using the URL suffix to specify the handler. But it looks like you have required this to be a fixed 2-level hierarchy, with URLs of the form http://:/// which are looked up in a handler table keyed by and then . For example search/standard looks for a , loads the indicated handler class, and associates it with the config. But this hierarchy seems a little overdetermined and the implementation overcomplex. It could be argued that one would want /, for the alone to resolve to a handler class, and the handler class to be responsible for deciding how to act on the part... but any hierarchical arrangement that makes perfect sense to one person can seem wrong to another. And in your implementation I see no actual action taken by the argument other than selection of a per- default , and that seems more complexity than it's worth; there are easier ways. I would suggest the simpler approach of simply taking the entire path after to be a handler configuration name, without conforming it to any fixed hierarchy, e.g. a developer could set up ... ... ... ... 
In that way establishing a command/subcommand hierarchy is entirely up to the user's solrconfig.xml setup, and there is no imposed logic as to whether the different behavior between the 3 search examples is achieved through different config of the same handler class, different handler classes, or both. As for default actions, there is no need for special code, they can entirely be defined in solrconfig. For example, if a developer sets up a /search/xxx space as above, the response to a client request /search without further qualification is entirely up to what is defined in solrconfig.xml: - If there is no request handler defined under name="search" SOLR would return a standard "No handler found" message - If it has a query request handler under that name (e.g. with name="search" class="solr.StandardRequestHandler") it would get to handle less-qualified requests with developer-defined defaults. - It could be defined to explicitly invoke your UnavailableRequestHandler -- a great idea which should be extended so the error code and error message could be custom-configured with handler config params. Thus I think this free-form hierarchy would achieve greater simplicity and greater flexibility at the same time. 2. What would make this even more powerful would be the ability to "subclass" (meaning refine and/or extend) request handler configs: If the requestHandler element allowed an attribute extends="" and chained the SolrParams, then one could do something like: 0.01 text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 ... much more, per the "dismax" example in the sample solrconfig.xml ... ... and replacing the "partitioned" example ... inStock:true One could even allow the extending requestHandler to set a different handler class, in the case where the difference in behavior requires a different handler implementation but can share all or part of the params. 3. 
Structuring the code and action under /add is conceptually limiting because update-style request plugins such as SQL-based or CSV-based (and certainly XML-based) should still be able to add, replace, and delete, either based on internal logic or external command. Your code suggests further refactoring improvements along those lines. For example, in your SQLUpdateHandler example you call: AddUpdateCommand cmd = UpdateUtils.getAddUpdateCommandFromParams( params ); and then for each assembled Document cmd.doc = docmap.toDocument( schema ); UpdateUtils.addDoc( cmd ); [which does SolrCore.getSolrCore().getUpdateHandler().addDoc( cmd );] addedDocumentCount++; Lets say: A. We standardized on action=... as a way to define an action in a param B. A new method UpdateUtils.getUpdateCommandFromParams( params ) would use the action= param to decide which xxxUpdateCommand class to instantiate -- though this might be better placed as a static class method getCommandFromParams defined in UpdateCommand itself. (Perhaps once it decodes the action param it could call another UpdateCo
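The "extends" idea from point 2 above boils down to a parameter lookup that falls back to the parent config. A small sketch under stated assumptions: the class name is invented, plain single-valued string maps stand in for SolrParams, and the parameter names in the usage note (tie, qf, fq) are only illustrative.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of one requestHandler config that may extend another.
class ChainedParamsSketch {
    private final Map<String, String> own;
    private final ChainedParamsSketch parent; // null when this config extends nothing

    ChainedParamsSketch(Map<String, String> own, ChainedParamsSketch parent) {
        this.own = Collections.unmodifiableMap(new HashMap<>(own));
        this.parent = parent;
    }

    // A value set on this config wins; otherwise walk up the chain of parents.
    String get(String name) {
        String value = own.get(name);
        if (value != null) {
            return value;
        }
        return parent == null ? null : parent.get(name);
    }
}
```

A "partitioned" config could then hold only fq=inStock:true and name "dismax" as its parent, picking up tie, qf and the rest from the parent without repeating them.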
[jira] Commented: (SOLR-106) new facet params: facet.sort, facet.mincount, facet.offset
[ https://issues.apache.org/jira/browse/SOLR-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464652 ] Hoss Man commented on SOLR-106: --- Yonik: i think all of these params make sense. i only skimmed the code changes briefly but they look sound to me. Andreas: there was some discussion on the list about being able to specify a "prefix" that facet field values must match to be listed (which would be trivial and efficient for TermEnum based facets, and doable for FieldCache facets) ... that makes a lot of sense to me in a "type ahead" word completion type of application (like google suggest) but the situation you describe doesn't really make sense to me -- if your client is only interested in documents in certain categories, then don't you want to filter the *documents* to just those categories (at which point the facets will also be filtered) ... can you start a thread on solr-user describing your situation/scaling issue? > new facet params: facet.sort, facet.mincount, facet.offset > -- > > Key: SOLR-106 > URL: https://issues.apache.org/jira/browse/SOLR-106 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Yonik Seeley > Attachments: facet_params.patch > > > a couple of new facet params: > facet lists become pageable with facet.offset, facet.limit (idea from Erik) > facet.sort explicitly specifies sort order (true for count descending, false > for natural index order) > facet.mincount: minimum count for facets included in response (idea from JJ, > deprecate zeros)
Re: switch to native locks by default?
On 1/14/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : Any reason we shouldn't switch to native locks by default? : In the future, read-only readers won't lock at all, but there will : still be a write lock for writers. just to clarify for people who may not be familiar (i had to go reread the LockFactory docs to remember) Native Locks may not be supported on all OSes, but have the perk of being cleaned up automatically in the event of a JVM crash. Ah, I hadn't realized that they might not be supported everywhere... I thought I saw some comments about switching Lucene to use that by default. Are you thinking we would add a new solrconfig.xml option to control this and default it to native locks? Not a bad idea... this is the first time i've noticed the SingleInstanceLockFactory ... in most cases couldn't we use that for Solr? I imagine we'll always need to support lock files if we wanted to make sure we can do some of the "can i point Solr at an index created by _" use cases people have been asking about lately, ... but for the "normal" usage of Solr, isn't that good enough? The current locking can also guard against mistakes though (multiple instances of Solr trying to write to the same dir, someone opening a Luke index on it, etc). -Yonik
Re: memcached SolrCache implementation?
: I'd like to see about making a SolrCache implementation that uses the : memcached library @ http://www.whalin.com/memcached/. I believe that : this would be useful for replicated sites, allowing all the search nodes : to use a shared global cache. it's not something i've ever considered, mainly because i've always thought of the power of a SolrCache being in its locality. the one thing you'd really have to watch out for is the machines being in sync ... i don't know if memcached has the notion of managing multiple dynamically named caches, but you need to make sure that when a machine asks for a key, it's using the correct cache based on the IndexReader it currently has open -- if another slave gets snapshots from the master a little bit faster and starts populating the cache with DocSets using the new docIds you don't want to be getting those on your slightly behind slave. : seems like this is doable. I don't believe it's possible to implement : autowarming, as there doesn't seem to be a way to get a list of cached : keys from the memcached library. you could let memcached cache the key=>vals but you could maintain an in-memory list of the recent (or frequently) accessed keys on this tier -- the keys are really all you need for autowarming. : If so, I'm not certain how to pass up additional arguments to a custom : SolrCache implementation - I'd like to replace queryResultCache with my : implementation, but will need to provide additional configuration - : memcache server string, etc. Is it permissible to add custom attributes : to the queryResultCache configuration node? off the top of my head, i'm pretty sure that you can add any attribute you want and it will get passed to the init method in the "Map args" .. i'm not 100% certain though. -Hoss
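A sketch of the two suggestions above: keep a bounded, local list of recently used keys for autowarming, and prefix the shared key with something that identifies the index the reader has open. It deliberately does not touch the memcached client or the SolrCache interface; the class name, the long "index version" and the key format are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: remembers the most recently used cache keys on this node,
// so they can be replayed against a new searcher even though the values live elsewhere.
class RecentKeysSketch<K> {
    private final LinkedHashMap<K, Boolean> recent;

    RecentKeysSketch(final int maxKeys) {
        // accessOrder=true keeps the map ordered from least to most recently used.
        this.recent = new LinkedHashMap<K, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Boolean> eldest) {
                return size() > maxKeys; // evict the least recently used key
            }
        };
    }

    // Call on every cache get/put so the key moves to the "most recent" end.
    synchronized void touch(K key) {
        recent.put(key, Boolean.TRUE);
    }

    // Snapshot of the remembered keys, least recently used first.
    synchronized List<K> keysForWarming() {
        return new ArrayList<>(recent.keySet());
    }

    // Prefix the shared-cache key with an index identifier so a node never reads
    // entries (e.g. docId-based DocSets) written against a different snapshot.
    static String namespaced(long indexVersion, String key) {
        return indexVersion + ":" + key;
    }
}
```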
Re: switch to native locks by default?
: Any reason we shouldn't switch to native locks by default? : In the future, read-only readers won't lock at all, but there will : still be a write lock for writers. just to clarify for people who may not be familiar (i had to go reread the LockFactory docs to remember) Native Locks may not be supported on all OSes, but have the perk of being cleaned up automatically in the event of a JVM crash. Are you thinking we would add a new solrconfig.xml option to control this and default it to native locks? this is the first time i've noticed the SingleInstanceLockFactory ... in most cases couldn't we use that for Solr? I imagine we'll always need to support lock files if we wanted to make sure we can do some of the "can i point Solr at an index created by _" use cases people have been asking about lately, ... but for the "normal" usage of Solr, isn't that good enough? -Hoss
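For readers unfamiliar with what a "native" lock means at the JDK level, here is a small sketch using java.nio file locks directly. It is not Lucene's NativeFSLockFactory; the class name and the in-JVM bookkeeping set are assumptions. The point it illustrates is that the lock belongs to the open channel rather than to the existence of a file, so the operating system is expected to drop it when the process goes away.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical write lock backed by an OS-level file lock.
final class NativeWriteLockSketch implements AutoCloseable {
    // On some platforms closing ANY channel on a file releases this JVM's locks on it,
    // so refuse a second attempt in the same JVM before opening another channel.
    private static final Set<Path> HELD_IN_THIS_JVM = ConcurrentHashMap.newKeySet();

    private final Path path;
    private final FileChannel channel;
    private final FileLock lock;

    private NativeWriteLockSketch(Path path, FileChannel channel, FileLock lock) {
        this.path = path;
        this.channel = channel;
        this.lock = lock;
    }

    // Returns the held lock, or null if this JVM or another process already holds it.
    static NativeWriteLockSketch tryAcquire(Path lockFile) throws IOException {
        Path path = lockFile.toAbsolutePath().normalize();
        if (!HELD_IN_THIS_JVM.add(path)) {
            return null;
        }
        FileChannel channel = null;
        FileLock lock = null;
        try {
            channel = FileChannel.open(path, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            lock = channel.tryLock(); // null when another process holds the lock
        } catch (OverlappingFileLockException e) {
            // Held through some other channel in this JVM; treat as "not acquired".
        } finally {
            if (lock == null) {
                if (channel != null) {
                    channel.close();
                }
                HELD_IN_THIS_JVM.remove(path);
            }
        }
        return lock == null ? null : new NativeWriteLockSketch(path, channel, lock);
    }

    boolean isValid() {
        return lock.isValid();
    }

    @Override
    public void close() throws IOException {
        try {
            channel.close(); // closing the channel also releases the file lock
        } finally {
            HELD_IN_THIS_JVM.remove(path);
        }
    }
}
```

Note that the lock file itself is left on disk after release, which is harmless here because the file's presence means nothing; that is the main contrast with a lock scheme where a leftover file after a crash blocks the next writer.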
[jira] Updated: (SOLR-86) [PATCH] standalone updater cli based on httpClient
[ https://issues.apache.org/jira/browse/SOLR-86?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-86: - Attachment: simple-post-using-urlconnection-approach.patch > [PATCH] standalone updater cli based on httpClient > --- > > Key: SOLR-86 > URL: https://issues.apache.org/jira/browse/SOLR-86 > Project: Solr > Issue Type: New Feature > Components: update >Reporter: Thorsten Scherler > Attachments: simple-post-using-urlconnection-approach.patch, > solr-86.diff, solr-86.diff > > > We need a cross platform replacement for the post.sh. > The attached code is a direct replacement of the post.sh since it is actually > doing the same exact thing. > In the future one can extend the CLI with other feature like auto commit, > etc.. > Right now the code assumes that SOLR-85 is applied since we using the servlet > of this issue to actually do the update. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (SOLR-86) [PATCH] standalone updater cli based on httpClient
[ https://issues.apache.org/jira/browse/SOLR-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464650 ] Hoss Man commented on SOLR-86: -- Independent of the work being done in SOLR-20, i think it would be *very* useful to have a pure java replacement for post.sh -- both so people on machines without bash/curl can try out the simple tutorial, and as a bare bones example of how to do a simple POST to Solr in java. Thorsten: depending on SOLR-85 seems unrelated to the goal of "a cross platform replacement for the post.sh" ... I'm also not convinced this is really a use case where depending on HttpClient (and all *it* requires) really makes sense ... if the goal is a simple demonstrative tool then it should have as few dependencies as possible right? I've been playing around with your attachment a bit and i've got an alternate version i'd like your feedback on ... for simplicity i left the code in the util package of the main code tree, and modified the main build.xml so that "ant example" would create example/post.jar used like so... java -jar example/post.jar http://localhost:8983/solr/update example/exampledocs/*.xml attachment to follow. > [PATCH] standalone updater cli based on httpClient > --- > > Key: SOLR-86 > URL: https://issues.apache.org/jira/browse/SOLR-86 > Project: Solr > Issue Type: New Feature > Components: update >Reporter: Thorsten Scherler > Attachments: simple-post-using-urlconnection-approach.patch, > solr-86.diff, solr-86.diff > > > We need a cross platform replacement for the post.sh. > The attached code is a direct replacement of the post.sh since it is actually > doing the same exact thing. > In the future one can extend the CLI with other feature like auto commit, > etc.. > Right now the code assumes that SOLR-85 is applied since we using the servlet > of this issue to actually do the update.
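For anyone who has not seen the dependency-free approach, a bare-bones POST with only JDK classes looks roughly like the sketch below. This is not the attached patch: the class and method names are invented, the content type is an assumption, and error handling is reduced to printing the HTTP status.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical stand-in for a post.sh replacement; uses java.net only.
class SimplePostSketch {

    // POSTs the given bytes as XML and returns the HTTP status code.
    static int postXml(URL url, byte[] body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body);
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Usage: java SimplePostSketch <update-url> <file.xml>...
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: SimplePostSketch <update-url> <file.xml>...");
            return;
        }
        URL url = new URL(args[0]);
        for (int i = 1; i < args.length; i++) {
            int status = postXml(url, Files.readAllBytes(Paths.get(args[i])));
            System.out.println(args[i] + " -> HTTP " + status);
        }
    }
}
```

A real tool would presumably also send a commit once all files are posted, as post.sh does; that step is left out here.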
[jira] Commented: (SOLR-20) A simple Java client for updating and searching
[ https://issues.apache.org/jira/browse/SOLR-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464648 ] J.J. Larrea commented on SOLR-20: - Regarding Hoss' point #3, perhaps it's time to reorganize into something like /solr/server/... /solr/client/... /solr/webapp/,,, (or /solr/server/webapp) /solr/shared/... "To build client XXX check out /solr/client or just /solr/client/java/XXX and /solr/shared" Shared would include external constants and exceptions. > A simple Java client for updating and searching > --- > > Key: SOLR-20 > URL: https://issues.apache.org/jira/browse/SOLR-20 > Project: Solr > Issue Type: New Feature > Components: clients - java > Environment: all >Reporter: Darren Erik Vengroff >Priority: Minor > Attachments: DocumentManagerClient.java, DocumentManagerClient.java, > solr-client-java-2.zip.zip, solr-client-java.zip, solr-client-sources.jar, > solr-client.zip, solr-client.zip, solr-client.zip, SolrClientException.java, > SolrServerException.java > > > I wrote a simple little client class that can connect to a Solr server and > issue add, delete, commit and optimize commands using Java methods. I'm > posting here for review and comments as suggested by Yonik. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (SOLR-107) Iterable NamedList with java5 generics
[ https://issues.apache.org/jira/browse/SOLR-107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley updated SOLR-107: --- Attachment: IterableNamedList.patch > Iterable NamedList with java5 generics > -- > > Key: SOLR-107 > URL: https://issues.apache.org/jira/browse/SOLR-107 > Project: Solr > Issue Type: Improvement >Reporter: Ryan McKinley >Priority: Trivial > Attachments: IterableNamedList.patch > > > Iterators and generics are nice! > this patch adds both to NamedList.java -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (SOLR-107) Iterable NamedList with java5 generics
Iterable NamedList with java5 generics -- Key: SOLR-107 URL: https://issues.apache.org/jira/browse/SOLR-107 Project: Solr Issue Type: Improvement Reporter: Ryan McKinley Priority: Trivial Attachments: IterableNamedList.patch Iterators and generics are nice! this patch adds both to NamedList.java -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
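To show what "iterable with generics" buys the caller, here is a tiny stand-alone sketch. It is not the code in IterableNamedList.patch and not Solr's NamedList; the class name and method set are invented, and it only keeps the two properties that matter for the example: insertion order is preserved and duplicate names are allowed.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical ordered name/value list that allows repeated names.
class SimpleNamedListSketch<T> implements Iterable<Map.Entry<String, T>> {
    private final List<String> names = new ArrayList<>();
    private final List<T> values = new ArrayList<>();

    void add(String name, T value) {
        names.add(name);
        values.add(value);
    }

    int size() {
        return names.size();
    }

    // Returns the first value stored under the given name, or null if there is none.
    T get(String name) {
        int i = names.indexOf(name);
        return i < 0 ? null : values.get(i);
    }

    @Override
    public Iterator<Map.Entry<String, T>> iterator() {
        return new Iterator<Map.Entry<String, T>>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < names.size();
            }

            @Override
            public Map.Entry<String, T> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                Map.Entry<String, T> entry =
                        new AbstractMap.SimpleImmutableEntry<>(names.get(next), values.get(next));
                next++;
                return entry;
            }
        };
    }
}
```

With this shape, callers can write for (Map.Entry<String, Integer> e : list) without index arithmetic or casts.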
Re: [jira] Updated: (SOLR-20) A simple Java client for updating and searching
Thanks for the feedback! Should this reply go to the solr-dev or be posted on JIRA? 1) i like the name solrj, ... Do you think the package name should change? org.apache.solr.client.SolrClient vs org.apache.solr.client.solrj.SolrClient; 3) I'm really not fond of "ParamNames.java" being a copy ... I agree - but don't know any good solution. maybe the main ant task could copy a few files from: /src/java/org/apache ... /client/java/org/apache ... Perhaps the client should parse the results into a 'NamedList'... 4) one thing we should really try to support in a client ... public abstract class AbstractSolrQuery implements SolrQuery { protected abstract SolrParams getSolrParams(); public String getQueryString() { ... your current code, looping over getSolrParams() ... } } agreed. that's a good idea 5) what is the purpose of SolrClientStub ? I use it for testing and to temporarily disable sending stuff to solr. I have some hooks set up to send stuff to solr using a SolrClient - changing to a stub is an easy way to bypass the behavior. It may not belong in the official release... perhaps under /test. 6) what is the purpose of SolrDocumentable being an empty interface? ... it seems like you could replace SolrDocumentable, ... public interface SolrDocument { public Map getSolrDocumentFields(); } public abstract class SolrDocumented implements SolrDocument { protected abstract SolrDocument getSolrDocument(); public Map getSolrDocumentFields() { return getSolrDocument().getSolrDocumentFields(); } } aaah, if only java allowed multiple inheritance! The two versions arose out of converting existing projects to use solr. In some cases it made sense to have the objects make their own document, in others it does not. Originally, i had the structure you suggest, but the 'SolrDocumented' objects are already in a fixed hierarchy. Then you wouldn't need that instanceof code in SolrClientImpl If it were more than one instanceof I would be more worried about it... but i'm open to suggestion. 
Note that we should probably support field and document boosts as well, ... public int getDocumentBoost(); public Map getFieldBoosts() ...to SolrDocument. good idea 7) The ResultsParser and QueryResults classes seem to suffer the same limitation ... Maybe the best is to have the ResultsParser parse a 'NamedList', then have various QueryResults that know what to do with the results. 8) ... i think it's completely practical to focus on client code which currently supports only the XmlResponseWriter output -- especially with the solrj ResultsParser class currently having a single public method... agreed. i'll remove the XML exceptions... - - - - - Thanks for your feedback. I apologize for filling up your inbox these past few days! thanks ryan On 1/14/07, Hoss Man (JIRA) <[EMAIL PROTECTED]> wrote: [ https://issues.apache.org/jira/browse/SOLR-20?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-20: - Summary: A simple Java client for updating and searching (was: A simple Java client with Java APIs for add(), delete(), commit() and optimize().) (NOTE: revised summary since this issue has moved beyond just updating) I *finally* had a chance to look this over, here's a few comments in no particular order... 1) i like the name solrj, i think this code should definitely live in client/java/solrj so that there is the potential for other java client code that is independent (if nothing else, i suspect something like SOLR-86 might be handy) ... we should probably put solrj in the package name as well. 2) i wouldn't worry about having a special package for the exceptions ... they've got exception in their name, no ones going to be confused. 3) I'm really not fond of "ParamNames.java" being a copy of the constants in "SolrParams.java", or XML.java being copied, or the xpp jar being duplicated ... it seems like we should just pull in those (compiled) classes at build time ... 
but that would require that the whole Solr tree be checked out, and there seems to be interest in making it possible to "svn checkout client/lang/impl" and build that in isolation ... perhaps we could use svn:externals to pull in specific utility classes and jars from other places in the tree? (although based on what I've read today, branching for releases would be hard since all of the svn:external props would have to be updated). what do people think in general about how the client code can/should/shouldn't depend on the core server code? 4) one thing we should really try to support in a client is executing query requests against non-standard request handlers ... handlers that might take in request params that we can't even imagine. The SolrQuery class has explicit setters for many of the params that the built in request handlers support, but there is no easy way for people t
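For reference, a minimal sketch of the SolrQuery split discussed in point 4. This is not the solrj code under review: a plain Map stands in for SolrParams, and the SimpleSolrQuery setters (setQuery, setRows, setParam) are hypothetical names chosen for illustration. The point is only that the client depends on getQueryString(), so callers can target non-standard request handlers without new setters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// The only thing the client needs from a query: its encoded query string.
interface SolrQuery {
    String getQueryString();
}

// Shared base class: subclasses supply the params, this class encodes them.
// Assumption: a Map<String, String> stands in for SolrParams.
abstract class AbstractSolrQuery implements SolrQuery {
    protected abstract Map<String, String> getParams();

    public String getQueryString() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : getParams().entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }
}

// Convenience setters for common params, plus a generic escape hatch
// (setParam) for handlers whose params the client cannot anticipate.
class SimpleSolrQuery extends AbstractSolrQuery {
    private final Map<String, String> params = new LinkedHashMap<String, String>();

    SimpleSolrQuery setQuery(String q) { params.put("q", q); return this; }
    SimpleSolrQuery setRows(int rows) { params.put("rows", Integer.toString(rows)); return this; }
    SimpleSolrQuery setParam(String name, String value) { params.put(name, value); return this; }

    protected Map<String, String> getParams() { return params; }
}

class QueryDemo {
    public static void main(String[] args) {
        SolrQuery q = new SimpleSolrQuery()
                .setQuery("solr rocks")
                .setRows(10)
                .setParam("qt", "custom"); // "custom" is a made-up handler name
        System.out.println(q.getQueryString());
    }
}
```

Anything that can produce a query string can implement SolrQuery directly, so a project with its own parameter object does not have to extend either class.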
[jira] Commented: (SOLR-103) SQL Upload Plugin
[ https://issues.apache.org/jira/browse/SOLR-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464643 ] Ryan McKinley commented on SOLR-103: check: http://svn.lapnap.net/solr/handler-draft/solr/src/java/org/apache/solr/handler/add/SQLUpdateHandler.java http://svn.lapnap.net/solr/handler-draft/solr/src/test/org/apache/solr/handler/SQLUpdateTest.java If you run the example from SOLR-104, mess with the parameters on: http://localhost:8983/solr/up.html > SQL Upload Plugin > - > > Key: SOLR-103 > URL: https://issues.apache.org/jira/browse/SOLR-103 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.2 >Reporter: Ryan McKinley > Fix For: 1.2 > > > Solr needs an easy way to upload lots of files directly from SQL. > See also: SOLR-66 (CSV uploader)
[jira] Closed: (SOLR-30) Java client code for performing searches against a Solr instance
[ https://issues.apache.org/jira/browse/SOLR-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man closed SOLR-30. Resolution: Duplicate > Java client code for performing searches against a Solr instance > > > Key: SOLR-30 > URL: https://issues.apache.org/jira/browse/SOLR-30 > Project: Solr > Issue Type: New Feature > Components: clients - java >Reporter: Philip Jacob >Priority: Minor > Attachments: solrsearcher-client.zip > > > Here are a few classes that connect to a Solr instance to perform searches. > Results are returned in a Response object. The Response encapsulates a > List> that gives you access to the key data in the results. > This is the main part that I'm looking for comments on. > There are 2 dependencies for this code: JDOM and Commons HttpClient. I'll > remove the JDOM dependency in favor of regular DOM at some point, but I think > that the HttpClient dependency is worthwhile here. There's a lot that can be > exploited with HttpClient that isn't demonstrated in this class. The purpose > here is mainly to get feedback on the API of SolrSearcher before I start > optimizing anything.
[jira] Commented: (SOLR-104) Update Plugins
[ https://issues.apache.org/jira/browse/SOLR-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464640 ] Ryan McKinley commented on SOLR-104:

I just uploaded a new version of HandlerRefactoring-DRAFT-SRC.zip

In addition to the 8 files above, also delete:
src/java/org/apache/solr/request/SolrQueryResponse.java
src/webapp/src/org/apache/solr/servlet/SolrServlet.java
src/webapp/src/org/apache/solr/servlet/SolrUpdateServlet.java

There is also a clean copy on: http://svn.lapnap.net/solr/handler-draft/solr/ This should be easier to install - or look at (without having to install).

This version converts everything to use the new framework rather than keeping /select and /update on the old one. It also includes a draft proposal on how to deal with GET vs POST body vs multipart content.

It passes all the tests and seems to work exactly as before (with a few exceptions):
* /update content is returned with a ResponseWriter
* [my-BUG] I am unable to get some posted content to read its stream properly. I had to modify: http://svn.lapnap.net/solr/handler-draft/solr/example/exampledocs/post.sh to call: curl $URL --data-binary '' -H 'Content-type:text/xml;' rather than just: curl $URL --data-binary '' (any ideas?)

- - - - - - - -

I define three basic types of request handlers in: http://svn.lapnap.net/solr/handler-draft/solr/src/java/org/apache/solr/handler/
1) standard. This gets everything from parameters (get or post)
2) posted. This gets a reader from the posted body: http://svn.lapnap.net/solr/handler-draft/solr/src/java/org/apache/solr/handler/PostedRequestHandler.java
3) multipart. This gets an iterator over each file item using the commons-fileupload streaming API http://jakarta.apache.org/commons/fileupload/streaming.html http://svn.lapnap.net/solr/handler-draft/solr/src/java/org/apache/solr/handler/MultipartRequestHandler.java

I *think* this takes care of every case... is anything missing?

The [http://svn.lapnap.net/solr/handler-draft/solr/src/webapp/src/org/apache/solr/servlet/SolrRequestFilter.java RequestFilter] manages setting the reader or iterator for the proper handlers. When you run the example, I added the page http://localhost:8983/solr/up.html that should help you see a little of it in action. I added an example for each type: http://svn.lapnap.net/solr/handler-draft/solr/src/java/org/apache/solr/handler/add/

- - -

This added:
commons-io-1.2.jar
mysql-connector-java-5.0.4.jar
commons-fileupload-20070107.jar
to the library. If we want to get rid of commons-io, I am only using IOUtils.java

> Update Plugins > -- > > Key: SOLR-104 > URL: https://issues.apache.org/jira/browse/SOLR-104 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.2 >Reporter: Ryan McKinley > Fix For: 1.2 > > Attachments: HandlerRefactoring-DRAFT-SRC.zip, > HandlerRefactoring-DRAFT-SRC.zip, HandlerRefactoring.DRAFT.patch, > HandlerRefactoring.DRAFT.patch, HandlerRefactoring.DRAFT.zip > > > The plugin framework should work for 'update' actions in addition to 'search' > actions. > For more discussion on this, see: > http://www.nabble.com/Re%3A-Handling-disparate-data-sources-in-Solr-tf2918621.html#a8305828
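The three handler types above could be told apart from the HTTP method and Content-Type alone. The sketch below is only an illustration of that idea, not the SolrRequestFilter from the draft: BodyKind and RequestClassifier are hypothetical names, and the rule set is an assumption. It may also bear on the post.sh question: curl sends --data-binary bodies as application/x-www-form-urlencoded unless a Content-Type header overrides it, so without the -H option the body would be classified (and possibly consumed by the servlet container) as form parameters rather than as a raw posted stream.

```java
import java.util.Locale;

// Hypothetical labels for the three kinds of request described above.
enum BodyKind { PARAMS, POSTED, MULTIPART }

final class RequestClassifier {
    private RequestClassifier() {}

    // Decide how a handler should read the request, based only on the
    // HTTP method and the Content-Type header (which may be null).
    static BodyKind classify(String method, String contentType) {
        if (!"POST".equalsIgnoreCase(method) || contentType == null) {
            return BodyKind.PARAMS; // GET, or POST with no declared body type
        }
        String ct = contentType.toLowerCase(Locale.ROOT).trim();
        if (ct.startsWith("multipart/form-data")) {
            return BodyKind.MULTIPART; // iterate over the uploaded file items
        }
        if (ct.startsWith("application/x-www-form-urlencoded")) {
            return BodyKind.PARAMS; // body is just encoded parameters
        }
        return BodyKind.POSTED; // e.g. text/xml: hand the handler a Reader
    }
}

class ClassifierDemo {
    public static void main(String[] args) {
        System.out.println(RequestClassifier.classify("POST", "text/xml; charset=utf-8"));
        System.out.println(RequestClassifier.classify("POST", "application/x-www-form-urlencoded"));
    }
}
```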
[jira] Updated: (SOLR-20) A simple Java client for updating and searching
[ https://issues.apache.org/jira/browse/SOLR-20?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-20: - Summary: A simple Java client for updating and searching (was: A simple Java client with Java APIs for add(), delete(), commit() and optimize().) (NOTE: revised summary since this issue has moved beyond just updating) I *finally* had a chance to look this over, here's a few comments in no particular order... 1) i like the name solrj, i think this code should definitely live in client/java/solrj so that there is the potential for other java client code that is independent (if nothing else, i suspect something like SOLR-86 might be handy) ... we should probably put solrj in the package name as well. 2) i wouldn't worry about having a special package for the exceptions ... they've got exception in their name, no ones going to be confused. 3) I'm really not fond of "ParamNames.java" being a copy of the constants in "SolrParams.java", or XML.java being copied, or the xpp jar being duplicated ... it seems like we should just pull in those (compiled) classes at build time ... but that would require that the whole Solr tree be checked out, and there seems to be interest in making it possible to "svn checkout client/lang/impl" and build that in isolation ... perhaps we could use svn:externals to pull in specific utility classes and jars from other places in the tree? (although based on what I've read today, branching for releases would be hard since all of the svn:external props would have to be updated). what do people think in general about how the client code can/should/shouldn't depend on the core server code? 4) one thing we should really try to support in a client is executing query requests against non-standard request handlers ... handlers that might take in request params that we can't even imagine. 
The SolrQuery class has explicit setters for many of the params that the built in request handlers support, but there is no easy way for people to build other queries. I think it might make sense if SolrQuery was an interface that just defined the methods needed by the SolrClient -- probably just getQueryString(). Then there can be a SimpleSolrQuery that has all of the setters in the current SolrQuery class, possibly using a general baseclass with an impl of getQueryString that uses some SolrParams...

  public abstract class AbstractSolrQuery implements SolrQuery {
    protected abstract SolrParams getSolrParams();
    public String getQueryString() {
      ... your current code, looping over getSolrParams() ...
    }
  }

5) what is the purpose of SolrClientStub ?

6) what is the purpose of SolrDocumentable being an empty interface? ... it seems like you could replace SolrDocumentable, SolrDocument, and SolrDocumented with something like this...

  public interface SolrDocument {
    public Map getSolrDocumentFields();
  }
  public abstract class SolrDocumented implements SolrDocument {
    protected abstract SolrDocument getSolrDocument();
    public Map getSolrDocumentFields() {
      return getSolrDocument().getSolrDocumentFields();
    }
  }

Then you wouldn't need that instanceof code in SolrClientImpl

Note that we should probably support field and document boosts as well, but field boosts don't really need to be specified in the Map since they apply to the whole field and not the individual values, so we could just add...

  public int getDocumentBoost();
  public Map getFieldBoosts();

...to SolrDocument.

7) The ResultsParser and QueryResults classes seem to suffer the same limitation that i was mentioning about the SolrQuery class -- they assume a very specific response structure (only one doc list, an optional facet block, an optional highlighting block, an optional debug block) ... 
I think since the ResultsParser already understands all of the various tags that are used, it should be easy to do this as long as the QueryResult object becomes a more general container that any named data can be shoved into (just like SolrQueryResponse is on the server side) ... then a "SimpleQueryResults" class could be written that had the convenience methods that make sense when using StandardRequestHandler or DisMaxRequestHandler.

8) There was a comment in SOLR-30 regarding the issue of that code only parsing the XML response ... i think it's completely practical to focus on client code which currently supports only the XmlResponseWriter output -- especially with the solrj ResultsParser class currently having a single public method...

  public QueryResults process( Reader reader ) throws SolrClientException, SolrServerException, XmlPullParserException, IOException

...i think if we removed XmlPullParserException from that list of exceptions (it could always be wrapped in a SolrClientException, or a new SolrClientParseException) we have a really simple API where other ResultParser classes could be writt
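Points 7 and 8 can be sketched together: a QueryResults that is just a named container, and a process(Reader) signature with no parser-specific exception in it. Everything below is an illustration under stated assumptions, not the solrj classes: the toy "name=value" line format stands in for the XML response, LineResultsParser is a hypothetical name, SolrServerException is left out for brevity, and NumberFormatException plays the role of XmlPullParserException as the format-specific failure that gets wrapped.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Client-side failure that can carry any format-specific cause.
class SolrClientException extends Exception {
    SolrClientException(String message, Throwable cause) {
        super(message, cause);
    }
}

// General container: any named piece of the response can be stored.
class QueryResults {
    private final Map<String, Object> data = new LinkedHashMap<String, Object>();

    void put(String name, Object value) { data.put(name, value); }
    Object get(String name) { return data.get(name); }
}

// The public API names no parser-specific exception types.
interface ResultsParser {
    QueryResults process(Reader reader) throws SolrClientException, IOException;
}

// Toy parser for "name=integer" lines (assumed format, for illustration).
class LineResultsParser implements ResultsParser {
    public QueryResults process(Reader reader) throws SolrClientException, IOException {
        BufferedReader in = new BufferedReader(reader);
        QueryResults results = new QueryResults();
        String line;
        while ((line = in.readLine()) != null) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                throw new SolrClientException("expected name=value but got: " + line, null);
            }
            String name = line.substring(0, eq).trim();
            try {
                results.put(name, Integer.valueOf(line.substring(eq + 1).trim()));
            } catch (NumberFormatException e) {
                // Wrap the format-specific failure so callers see one type.
                throw new SolrClientException("bad value for " + name, e);
            }
        }
        return results;
    }
}

class ParserDemo {
    public static void main(String[] args) throws Exception {
        QueryResults r = new LineResultsParser().process(new StringReader("numFound=42\nstart=0"));
        System.out.println(r.get("numFound"));
    }
}
```

A convenience class in the spirit of "SimpleQueryResults" would then be a thin wrapper that reads well-known names out of the container.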
[jira] Updated: (SOLR-104) Update Plugins
[ https://issues.apache.org/jira/browse/SOLR-104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McKinley updated SOLR-104: --- Attachment: HandlerRefactoring-DRAFT-SRC.zip > Update Plugins > -- > > Key: SOLR-104 > URL: https://issues.apache.org/jira/browse/SOLR-104 > Project: Solr > Issue Type: Improvement > Components: update >Affects Versions: 1.2 >Reporter: Ryan McKinley > Fix For: 1.2 > > Attachments: HandlerRefactoring-DRAFT-SRC.zip, > HandlerRefactoring-DRAFT-SRC.zip, HandlerRefactoring.DRAFT.patch, > HandlerRefactoring.DRAFT.patch, HandlerRefactoring.DRAFT.zip > > > The plugin framework should work for 'update' actions in addition to 'search' > actions. > For more discussion on this, see: > http://www.nabble.com/Re%3A-Handling-disparate-data-sources-in-Solr-tf2918621.html#a8305828
client/ruby/flare/vendor and svn:externals
I don't want to discourage all the cool Rubyness that's going on right now ... but because of the svn:externals property set on client/ruby/flare/vendor a simple "svn checkout" of the Solr trunk is pretty damn big. Perhaps it would be better if the Rails trunk was fetched as a build time dependency of the Flare build file ... can you do svn operations in a Rakefile?

[EMAIL PROTECTED]:~/lucene/solr$ du -sh .
77M     .
[EMAIL PROTECTED]:~/lucene/solr$ du -sh *
20K     build.xml
16K     CHANGES.txt
53M     client
9.6M    example
28K     KEYS.txt
1.7M    lib
16K     LICENSE.txt
4.0K    NOTICE.txt
8.0K    README.txt
2.4M    site
11M     src
[EMAIL PROTECTED]:~/lucene/solr$ du -sh client/ruby/flare/vendor/rails/
39M     client/ruby/flare/vendor/rails/

-Hoss
[jira] Commented: (SOLR-106) new facet params: facet.sort, facet.mincount, facet.offset
[ https://issues.apache.org/jira/browse/SOLR-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12464596 ] Andreas Hochsteger commented on SOLR-106: - What about a facet filter query to only return facets which match a certain query? In our case we use categories which are organized hierarchically via a path syntax. Documents have different categories attached, which are indexed in a solr field. Now certain apps (which deal with these documents) are interested in a specific category subpath. If you use faceted searching you get all categories back - not only those in which the application is interested. Currently we use a workaround which uses an additional solr search field that just contains the categories for the application, but this doesn't scale very well. Would it be possible to add an additional filter query for facets to limit the facets which are actually returned in the faceted search? Thoughts? > new facet params: facet.sort, facet.mincount, facet.offset > -- > > Key: SOLR-106 > URL: https://issues.apache.org/jira/browse/SOLR-106 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Yonik Seeley > Attachments: facet_params.patch > > > a couple of new facet params: > facet lists become pageable with facet.offset, facet.limit (idea from Erik) > facet.sort explicitly specifies sort order (true for count descending, false > for natural index order) > facet.mincount: minimum count for facets included in response (idea from JJ, > deprecate zeros)
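To make the proposal concrete, here is a rough client-side sketch of how the params from the issue description might be put on a request, plus the kind of subpath filtering asked about in the comment, done on the client after the response comes back. Assumptions: FacetRequest, pageQuery and withPrefix are hypothetical names, the field name "category" and the paths are made up, and the prefix filter is only a stand-in for the requested server-side facet filter, which does not exist in the patch as described.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FacetRequest {
    private FacetRequest() {}

    // Build the facet part of a query string for one page of facet values.
    // sortByCount=true means count descending, false means index order.
    static String pageQuery(String field, int offset, int limit,
                            int minCount, boolean sortByCount) {
        Map<String, String> p = new LinkedHashMap<String, String>();
        p.put("facet", "true");
        p.put("facet.field", field);
        p.put("facet.offset", Integer.toString(offset));
        p.put("facet.limit", Integer.toString(limit));
        p.put("facet.mincount", Integer.toString(minCount));
        p.put("facet.sort", Boolean.toString(sortByCount));

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : p.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    // Client-side stand-in for a facet filter: keep only the facet values
    // under one category subpath, preserving their original order.
    static Map<String, Integer> withPrefix(Map<String, Integer> counts, String prefix) {
        Map<String, Integer> kept = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }
}

class FacetDemo {
    public static void main(String[] args) {
        // Third page of ten values, skipping zero counts, sorted by count.
        System.out.println(FacetRequest.pageQuery("category", 20, 10, 1, true));
    }
}
```

Filtering on the client still transfers every category, which is the scaling concern raised in the comment; the sketch only shows the intended result.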