Re: All facet.fields for a given facet.query?
On Tue, 2007-06-19 at 11:09 -0700, Chris Hostetter wrote: I solve this problem by having metadata stored in my index which tells my custom request handler what fields to facet on for each category ...

How do you define this metadata? Cheers, Martin

but i've also got several thousand categories. If you've got fewer than 100 categories, you could easily enumerate them all with default facet.field params in your solrconfig using separate requesthandler instances.

: What do the experts think about this?

you may want to read up on the past discussion of this in SOLR-247 ... in particular note the link to the mail archive where there was additional discussion about it as well. Where we left things is that it might make sense to support true globbing in both fl and facet.field, so you can use naming conventions and say things like facet.field=facet_*, but that in general trying to do something like facet.field=* would be a very bad idea even if it was supported. http://issues.apache.org/jira/browse/SOLR-247 -Hoss

--
Martin Grotzke
http://www.javakaffee.de/blog/
Re: All facet.fields for a given facet.query?
On Tue, 2007-06-19 at 19:16 +0200, Thomas Traeger wrote: Hi, I'm also just at that point where I think I need a wildcard facet.field parameter (or someone points out another solution for my problem...). Here is my situation: I have many products of different types with totally different attributes. There are currently more than 300 attributes. I use dynamic fields to import the attributes into solr without having to define a specific field for each attribute. Now when I make a query I would like to get back all facet.fields that are relevant for that query. I think it would be really nice if I don't have to know which facet fields are there at query time; instead just import attributes into dynamic fields, get the relevant facets back and decide in the frontend which to display and how...

Do you really need all facets in the frontend? Would it be a solution to have a facet ranking in the field definitions, and then decide at query time which fields to facet on? This would need an additional query parameter like facet.query.count. E.g. if you have a query with q=foo+AND+prop1:bar+AND+prop2:baz and you have fields prop1 with facet-ranking 100, prop2 with facet-ranking 90, prop3 with facet-ranking 80, prop4 with facet-ranking 70, prop5 with facet-ranking 60, then you might decide not to facet on prop1 and prop2 as you already have a constraint on them, but to facet on prop3 and prop4 if facet.query.count is 2. Just thinking about that... :) Cheers, Martin

What do the experts think about this? Tom

--
Martin Grotzke
http://www.javakaffee.de/blog/
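Martin's facet.query.count idea could be sketched roughly like this (the rankings and field names come from his example; the selection logic is a guess at the intent, not an existing Solr feature):

```python
# Hypothetical per-field facet rankings, as proposed above.
FACET_RANKING = {"prop1": 100, "prop2": 90, "prop3": 80,
                 "prop4": 70, "prop5": 60}

def pick_facet_fields(constrained_fields, count=2):
    # Skip fields the query already constrains, then take the
    # `count` highest-ranked remaining fields.
    candidates = [f for f in FACET_RANKING if f not in constrained_fields]
    candidates.sort(key=lambda f: FACET_RANKING[f], reverse=True)
    return candidates[:count]

# For q=foo AND prop1:bar AND prop2:baz with facet.query.count=2:
pick_facet_fields({"prop1", "prop2"})  # -> ['prop3', 'prop4']
```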
Re: All facet.fields for a given facet.query?
Martin Grotzke wrote: On Tue, 2007-06-19 at 19:16 +0200, Thomas Traeger wrote: Hi, I'm also just at that point where I think I need a wildcard facet.field parameter (or someone points out another solution for my problem...). Here is my situation: I have many products of different types with totally different attributes. There are currently more than 300 attributes. I use dynamic fields to import the attributes into solr without having to define a specific field for each attribute. Now when I make a query I would like to get back all facet.fields that are relevant for that query. I think it would be really nice if I don't have to know which facet fields are there at query time; instead just import attributes into dynamic fields, get the relevant facets back and decide in the frontend which to display and how...

Do you really need all facets in the frontend?

no, only the subset with matches for the current query.

Would it be a solution to have a facet ranking in the field definitions, and then decide at query time which fields to facet on? This would need an additional query parameter like facet.query.count. E.g. if you have a query with q=foo+AND+prop1:bar+AND+prop2:baz and you have fields prop1 with facet-ranking 100, prop2 with facet-ranking 90, prop3 with facet-ranking 80, prop4 with facet-ranking 70, prop5 with facet-ranking 60, then you might decide not to facet on prop1 and prop2 as you already have a constraint on them, but to facet on prop3 and prop4 if facet.query.count is 2. Just thinking about that... :) Cheers, Martin

One step after the other ;o), the ranking of the facets will be another problem I have to solve; counts of facets and matching documents will be a starting point. Another idea is to use the score of the documents returned by the query to compute a score for the facet.field... Tom
Re: All facet.fields for a given facet.query?
On Wed, 2007-06-20 at 12:59 +0200, Thomas Traeger wrote: Martin Grotzke wrote: On Tue, 2007-06-19 at 19:16 +0200, Thomas Traeger wrote: [...] I think it would be really nice if I don't have to know which facet fields are there at query time; instead just import attributes into dynamic fields, get the relevant facets back and decide in the frontend which to display and how...

Do you really need all facets in the frontend?

no, only the subset with matches for the current query.

ok, that's somehow similar to our requirement, but we want to get only e.g. the first 5 relevant facets back from solr and not handle this in the frontend.

Would it be a solution to have a facet ranking in the field definitions, and then decide at query time which fields to facet on? This would need an additional query parameter like facet.query.count. [...]

One step after the other ;o), the ranking of the facets will be another problem I have to solve; counts of facets and matching documents will be a starting point. Another idea is to use the score of the documents returned by the query to compute a score for the facet.field...

Yep, this is also different for different applications. I'm also interested in this problem and would like to help solve it (though I'm really new to lucene and solr)... Cheers, Martin

Tom

--
Martin Grotzke
http://www.javakaffee.de/blog/
Re: problems getting data into solr index
Mike is talking about solr.py, the python script; I'm talking about Solr itself. I think your problem is in the former. You should play around with unicode in python for a while. Remember that your terminal itself probably doesn't support utf-8; the biggest problem I run into is doing print utf8string. Python forces you to be good about this stuff, but it's a steep climb. Google for python unicode and read the various tutorials to get a handle on it. -b

On Jun 20, 2007, at 9:38 AM, vanderkerkoff wrote: Hello Mike, Brian. My brain is approaching saturation point and I'm reading these two opinions as opposing each other. I'm sure I'm reading it incorrectly, but they seem to contradict each other. Are they?

Brian Whitman wrote: Solr has no problems with proper utf8 and you don't need to do anything special to get it to work. Check out the newer solr.py in JIRA.

Mike Klaas wrote: Perhaps this is why: solr.py expects unicode. You can pass it ascii, and it will transparently convert to unicode fine because that is the default codec. If you end up with utf-8, it will try to convert to unicode using the ascii codec and fail.

--
http://variogr.am/
[EMAIL PROTECTED]
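The failure mode Mike describes can be shown in a few lines (Python 3 syntax here; in the 2007-era Python 2 the same mismatch happened implicitly during str/unicode conversion):

```python
text = "café"
utf8_bytes = text.encode("utf-8")   # b'caf\xc3\xa9'

# Pure-ASCII input decodes fine with the ascii codec...
assert b"cafe".decode("ascii") == "cafe"

# ...but UTF-8 bytes beyond ASCII do not, which is exactly the error
# you hit when utf-8 encoded data is treated as if it were ascii:
try:
    utf8_bytes.decode("ascii")
except UnicodeDecodeError:
    print("ascii codec cannot decode utf-8 bytes")
```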
snapinstaller safety
Hi, Looking at src/scripts/snapinstaller more closely, I saw this block of code:

# install using hard links into temporary directory
# remove original index and then atomically copy new one into place
logMessage installing snapshot ${name}
cp -lr ${name}/ ${data_dir}/index.tmp$$
/bin/rm -rf ${data_dir}/index
mv -f ${data_dir}/index.tmp$$ ${data_dir}/index

Is there a technical reason why this wasn't written as:

logMessage installing snapshot ${name}
cp -lr ${name}/ ${data_dir}/index.tmp$$ && \
/bin/rm -rf ${data_dir}/index && \
mv -f ${data_dir}/index.tmp$$ ${data_dir}/index

This feels a little safer to me - I'd hate to have the main index rm -rf-ed if the cp -lr command failed for some reason (e.g. disk full), but maybe Bill Au & Co. have a good reason for not using &&'s. There may be other places in various scripts that this might be applicable to, but this is the first place I saw the extra safety possibility. Thanks, Otis
Slave/Master swap
Hi, I saw https://issues.apache.org/jira/browse/SOLR-265 (Make IndexSchema updateable in live system), which made me think of something I wished for a while back. Having a single Solr Master and a couple of Solr Slaves is a common setup. If any of the Slaves fails, a decent LB knows not to talk to it until it's back up. What happens when the single Solr Master fails? One (cheap) way to deal with that might be to promote one of the Solr Slaves to the new Master role. If the snapshooter script is called manually on the Master, the appropriate monitoring tools would need to start the same calls on the new Master (former Slave) box. But if the snapshooter is configured via solrconfig.xml to run after commit and/or optimize, we'd have to swap solrconfig.xml and restart Solr on the ex-Slave to make it the new Master (and also make some changes in the LB VIPs, most likely). I'm wondering if there are slicker ways to do this, ways that would minimize the downtime, for instance. Perhaps, just like Will Johnson is trying to make IndexSchema updateable in a live system, the snapshooter could be turned on/off programmatically, say via a special request handler. Thanks, Otis
Re: SolrSharp example
Hi Michael - Moving this conversation to the general solr mailing list...

1. SolrSharp example solution works with schema.xml from apache-solr-1.1.0-incubating. If I'm using schema.xml from apache-solr-1.2.0 the example program doesn't update the index...

I didn't realize the solr 1.2 release code sample schema.xml was different from the solr 1.1 version. In my implementation, I had solr 1.1 already installed and upgraded to 1.2 by replacing the war file (per the instructions in solr). So, the example code is geared to go against the 1.1 schema. For the example code, adding the timestamp field in the ExampleIndexDocument public constructor, such as:

this.Add(new IndexFieldValue("timestamp", DateTime.Now.ToString("s") + "Z"));

will take care of the solr 1.2 schema invalidation issue. The addition of the @default attribute on this field in the schema is not presently accommodated in the validation routine. If I'm not mistaken, the default attribute value will be applied for all documents without that field present in the xml payload. This would imply that any field with a default attribute is not required for any implemented UpdateIndexDocument. I'll look into this further.

2. When I run the example with schema.xml from apache-solr-1.1.0-incubating the program throws an Exception

Hmmm, can't really help you with this one. It sounds as if solr is incurring an error when the xml is posted to the server. Try the standard step-through troubleshooting routines to see what messages are being passed back from the server. -- j

On 6/19/07, Michael Plax [EMAIL PROTECTED] wrote: Hello Jeff, thank you again for updating the files. I just ran into some problems. I don't know what is the best way to report them: solr maillist / solrsharp jira.

1. SolrSharp example solution works with schema.xml from apache-solr-1.1.0-incubating. If I'm using schema.xml from apache-solr-1.2.0 the example program doesn't update the index because: line 33: if (solrSearcher.SolrSchema.IsValidUpdateIndexDocument(iDoc)) returns false.
The update fails because of the configuration file schema.xml:

line 265: <field name="word" type="string" indexed="true" stored="true"/>
...
line 279: <field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>

Those fields (word, timestamp) don't pass validation in SolrSchema.cs line 217.

2. When I run the example with schema.xml from apache-solr-1.1.0-incubating the program throws an Exception:

System.Exception was unhandled Message="Http error in request/response to http://localhost:8983/solr/update/" Source=SolrSharp StackTrace: at org.apache.solr.SolrSharp.Configuration.SolrSearcher.WebPost(String url, Byte[] bytesToPost, String statusDescription) in E:\SOLR-CSharp\src\Configuration\SolrSearcher.cs:line 229 at org.apache.solr.SolrSharp.Update.SolrUpdater.PostToIndex(IndexDocument oDoc, Boolean bCommit) in E:\SOLR-CSharp\src\Update\SolrUpdater.cs:line 70 at SolrSharpExample.Program.Main(String[] args) in E:\SOLR-CSharp\example\Program.cs:line 35 at System.AppDomain.nExecuteAssembly(Assembly assembly, String[] args) at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args) at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly() at System.Threading.ThreadHelper.ThreadStart_Context(Object state) at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.ThreadHelper.ThreadStart()

xmlstring value from oDoc.SerializeToString():

<?xml version="1.0" encoding="utf-8"?><add xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><doc><field name="id">101</field><field name="name">One oh one</field><field name="manu">Sony</field><field name="cat">Electronics</field><field name="cat">Computer</field><field name="features">Good</field><field name="features">Fast</field><field name="features">Cheap</field><field name="includes">USB cable</field><field name="weight">1.234</field><field name="price">99.99</field><field name="popularity">1</field><field name="inStock">True</field></doc></add>

I checked all features from the Solr tutorial; they are working. I'm running solr on Windows XP Pro without a firewall. Do you know how to solve those problems? Do you recommend handling all communication by maillist/jira? Regards, Michael
page rank
Hello folks, I am using solr to index web contents. I want to know whether it is possible to tell solr about rank information of contents. For example, I give each content an integer number, and I hope solr takes this number into consideration when it generates search results (larger number, more priority). Best Regards, David
Re: page rank
Hi David. Yes you can. Just define a field as a slong type field:

<field name="numberField" type="slong" />

It can be used to sort (sort=numberField desc) or to boost your score (it will depend on the RequestHandler you are going to use). In terms of score, which RequestHandler are you planning to use? If using dismax you can define a boost function: recip(rord(numberField),1,1000,1000). I hope it helps. Regards, Daniel Alheiros

On 20/6/07 16:47, David Xiao [EMAIL PROTECTED] wrote: Hello folks, I am using solr to index web contents. I want to know whether it is possible to tell solr about rank information of contents. For example, I give each content an integer number, and I hope solr takes this number into consideration when it generates search results (larger number, more priority). Best Regards, David

http://www.bbc.co.uk/ This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated. If you have received it in error, please delete it from your system. Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately. Please note that the BBC monitors e-mails sent or received. Further communication will signify your consent to this.
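For reference, a minimal sketch of what that boost function computes, assuming Solr's documented recip(x,m,a,b) = a/(m*x+b), applied here to rord(numberField) (the reverse ordinal, which is smallest for the highest numberField value):

```python
def recip(x, m, a, b):
    # Solr FunctionQuery recip: a / (m*x + b); larger x -> smaller value,
    # so documents with a high numberField (small reverse ordinal) get
    # the biggest boost.
    return a / (m * x + b)

top_boost = recip(1, 1, 1000, 1000)      # doc with the highest numberField
low_boost = recip(10000, 1, 1000, 1000)  # doc ranked 10000th by numberField
```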
Re: Slave/Master swap
: I'm wondering if there are slicker ways to do this, ways that would : minimize the downtime, for instance. Perhaps, just like Will Johnson is : trying to make IndexSchema updateable in a live system, the snapshooter : could be turned on/off programmatically, say via a special request : handler.

an easy way to do that would be to modify the configuration of RunExecutableListener in the solrconfig.xml to execute a wrapper script around snapshooter that only runs it if a flag file exists on disk. the problem is there are other things you typically want different between a master and a slave ... uses of the QuerySenderListener (it could also be modified to check for a flag file i suppose), cache sizes, and cache autowarming. -Hoss
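The flag-file wrapper Hoss describes could look roughly like this (a minimal sketch; the flag and snapshooter paths are made up, and the real wrapper would presumably be a shell script invoked by RunExecutableListener):

```python
import os
import subprocess

def maybe_run_snapshooter(flag_file="/var/solr/is_master",
                          snapshooter="/opt/solr/bin/snapshooter"):
    # Run snapshooter only when the flag file marks this box as the
    # master; on a slave the flag is absent and the postCommit hook
    # silently becomes a no-op.
    if not os.path.exists(flag_file):
        return False
    subprocess.run([snapshooter], check=True)
    return True
```

Promoting a slave to master would then just be a matter of creating the flag file, without swapping solrconfig.xml for this particular hook.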
Re: Faceted Search!
: Thanks Chris for replying my question. So I'm thinking about using a : CMS and when somebody publishes a page in CMS, I would generate this : well structured XML file and feed that xml to Solr to generate the index : on those data. Then, I can simply do faceted search using the correct : Lucene query format, right? Do you have any other ideas or comments on my : CMS approach?

that sounds fine ... as long as you have well structured data and you aren't trying to extract it from unstructured HTML. -Hoss
Re: All facet.fields for a given facet.query?
: to make it clear, i agree that it doesn't make sense faceting on all : available fields, I only want faceting on those 300 attributes that are : stored together with the fields for full text searches. A : product/document typically has only 5-10 attributes. : : I like to decide at index time which attributes of a product might be of : interest for faceting and store those in dynamic fields with the : attribute-name and some kind of prefix or suffix to identify them at : query time as facet.fields. Exactly the naming convention you mentioned.

but if the facet fields are different for every document, and they use a simple dynamicField prefix (like facet_* for example), how do you know at query time which fields to facet on? ... even if wildcards work in facet.field, using facet.field=facet_* would require solr to compute the counts for *every* field matching that pattern to find out which ones have positive counts for the current result set -- there may only be 5 that actually matter, but it's got to try all 300 of them to find out which 5 that is.

this is where custom request handlers that understand the faceting metadata for your documents become key ... so you can say when querying across the entire collection, only try to facet on category and manufacturer. if the search is constrained by category, then look up other facet options to offer based on that category name from our metadata store, etc... -Hoss
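The metadata-driven selection Hoss suggests could be sketched like this (the category-to-facets mapping and the field names are invented for illustration; in practice the mapping would live in the index or a separate metadata store consulted by a custom request handler):

```python
# Hypothetical metadata store: which fields to offer as facets,
# keyed by the category the search is constrained to.
FACETS_BY_CATEGORY = {
    None: ["category", "manufacturer"],  # unconstrained search
    "cameras": ["facet_megapixels", "facet_zoom", "manufacturer"],
    "laptops": ["facet_cpu", "facet_ram", "manufacturer"],
}

def facet_fields_for(category=None):
    # Fall back to the collection-wide defaults for unknown categories,
    # so we never ask Solr to count all 300 dynamic fields.
    return FACETS_BY_CATEGORY.get(category, FACETS_BY_CATEGORY[None])
```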
Re: Multi-language indexing and searching
: So far it sounds good for my needs, now I'm going to try if my other : features still work (I'm worried about highlighting as I'm going to return a : different field)...

i'm not really a highlighting guy so i'm not sure ... but if you're okay with *simple* highlighting you can probably just highlight your title field (using a whitespace analyzer or something) and get decent results without needing to worry about the fact that you are using different languages. -Hoss
Re: Faceted Search!
Hi Chris, thank you for the reply. I was reading other postings regarding faceted search and it seems like they are using the filtering capability of Lucene for that. If that's the case, can we have control over the labels of categories? For example: a search for camera on shopper.com gives us clusters by price, pixel count, manufacturer and so on. And if we are feeding the xml file to the Solr server for faceted search, how can we define the sub-categories? Let's say, from the above example, the category price has different sub-categories like "less than 100", "100-200"? I'm guessing we explicitly define this in the XML feed file, but I could be very wrong. In any case, can you please give me a short example to achieve that implementation? Well, thanks once again. Cheers, Niraj

Chris Hostetter [EMAIL PROTECTED] wrote: : Thanks Chris for replying my question. So I'm thinking about using a : CMS and when somebody publishes a page in CMS, I would generate this : well structured XML file and feed that xml to Solr to generate the index : on those data. Then, I can simply do faceted search using the correct : Lucene query format, right? Do you have any other ideas or comments on my : CMS approach? that sounds fine ... as long as you have well structured data and you aren't trying to extract it from unstructured HTML. -Hoss
Re: SolrSharp example
On 6/20/07, Yonik Seeley [EMAIL PROTECTED] wrote: On 6/20/07, Michael Plax [EMAIL PROTECTED] wrote: This is a log that I got after running the SolrSharp example. I think the example program posts improperly formatted xml. I'm running Solr on Windows XP, Java 1.5. Could those settings be the problem?

Solr 1.2 is pickier about the Content-type in the HTTP headers. I bet it's being set incorrectly.

Ahh, good point. Within SolrSearcher.cs, the WebPost method contains this setting:

oRequest.ContentType = "application/x-www-form-urlencoded";

Looking through the CHANGES.txt file in the 1.2 tagged release on svn:

9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using the new request dispatcher (SOLR-104). This requires posted content to have a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8'. The response format matches that of /select and returns standard error codes. To enable solr1.1 style /update, do not map /update to any handler in solrconfig.xml (ryan)

For SolrSearcher.cs, it sounds as though changing the ContentType setting to text/xml may fix this issue. I don't have a 1.2 instance available to test this against right now, but can check this later. Michael, try updating your SolrSearcher.cs file for this content-type setting to see if that resolves your issue. thanks, jeff r.
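The same fix in a minimal Python sketch, for anyone posting updates from a client other than SolrSharp (the endpoint URL is the standard example one; the point is just the Content-type header Solr 1.2's dispatcher expects):

```python
import urllib.request

def build_update_request(xml_payload: bytes,
                         url="http://localhost:8983/solr/update"):
    # Solr 1.2's request dispatcher rejects posts without a proper
    # Content-type; text/xml with an explicit charset satisfies it.
    return urllib.request.Request(
        url,
        data=xml_payload,
        headers={"Content-type": "text/xml; charset=utf-8"},
    )
```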
Re: problems getting data into solr index
On 20-Jun-07, at 6:38 AM, vanderkerkoff wrote: Hello Mike, Brian. My brain is approaching saturation point and I'm reading these two opinions as opposing each other. I'm sure I'm reading it incorrectly, but they seem to contradict each other. Are they?

solr.py takes unicode and encodes it as utf-8 to send to Solr. -Mike
Re: Faceted Search!
: define the sub-categories? Let's say, from the above example, the : category price has different sub-categories like "less than 100", : "100-200"? I'm guessing we explicitly define this in the XML feed file, but : I could be very wrong. In any case, can you please give me a short : example to achieve that implementation? Well, thanks once again.

there's nothing out of the box from Solr that will do this; it's something you would need to implement either in the client or in a custom request handler ... Solr's Simple Faceting support is designed to be just that: simple. but the underlying methods/mechanisms of computing DocSet intersections can be used by any custom request handler to generate application specific results. I've got 3 or 4 indexes that use the out of the box SimpleFacet support Solr provides, but the major faceting we do (product based facets) all uses custom request handlers so we can have very exact control on all of this kind of stuff driven by our data management tools. -Hoss
Re: Slave/Master swap
Hi, Yes, I thought of flag file + wrapper script tricks, but that didn't sound super elegant either, and the other differences in behaviour between master and slave are also true. Hmm, I've always wanted to try DRBD (http://www.drbd.org/). Master-(Master+Slaves) replication via DRBD? I imagine it would be expensive... So if I want to turn a Slave into a Master, the best thing to do is to swap solrconfigs and restart the ex-Slave to turn it into a Master. The more expensive solution might be to have Solr instances run on top of a SAN, and then one could really have multiple Master instances, one in stand-by mode and ready to be started as the new Master if the current Master decides to go on vacation. Any flaws there? Out of curiosity, how does CNet handle Master redundancy? Otis

- Original Message From: Chris Hostetter [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Wednesday, June 20, 2007 9:40:51 PM Subject: Re: Slave/Master swap

: I'm wondering if there are slicker ways to do this, ways that would : minimize the downtime, for instance. Perhaps, just like Will Johnson is : trying to make IndexSchema updateable in a live system, the snapshooter : could be turned on/off programmatically, say via a special request : handler.

an easy way to do that would be to modify the configuration of RunExecutableListener in the solrconfig.xml to execute a wrapper script around snapshooter that only runs it if a flag file exists on disk. the problem is there are other things you typically want different between a master and a slave ... uses of the QuerySenderListener (it could also be modified to check for a flag file i suppose), cache sizes, and cache autowarming. -Hoss
Re: Slave/Master swap
: The more expensive solution might be to have Solr instances run on top : of a SAN and then one could really have multiple Master instances, one : in stand-by mode and ready to be started as the new Master if the

i *believe* that if you have two solr instances pointed at the same physical data directory (SAN or otherwise) but you only send update/commit commands to one, they won't interfere with each other. so conceivably you can have both masters up and running, and your failover approach if the primary goes down is just to start sending updates to the secondary. you'll lose any unflushed changes that the primary had in memory, but those are lost anyway. don't trust me on that though, test it out yourself.

: curiosity, how does CNet handle Master redundancy?

I don't know how much i'm allowed to talk about our processes and systems for redundancy, disaster recovery, failover, etc... but i don't think i'll upset anyone if i tell you: as far as i know, we've never needed to take advantage of them with a solr master. ie: we've never had a solr master crash so hard we had to bring up another one in its place ... knock on wood. (that probably has more to do with having good hardware than anything else though). (and no, i honestly don't know what hardware we use ... i don't bother paying attention, i let the hardware guys worry about that) -Hoss
Re: Slave/Master swap
Right, that SAN with 2 Masters sounds good. Lucky you with your lonely Master! Where I work hw failures are pretty common. Otis

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share

- Original Message From: Chris Hostetter [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Wednesday, June 20, 2007 11:43:02 PM Subject: Re: Slave/Master swap

: The more expensive solution might be to have Solr instances run on top : of a SAN and then one could really have multiple Master instances, one : in stand-by mode and ready to be started as the new Master if the

i *believe* that if you have two solr instances pointed at the same physical data directory (SAN or otherwise) but you only send update/commit commands to one, they won't interfere with each other. so conceivably you can have both masters up and running, and your failover approach if the primary goes down is just to start sending updates to the secondary. you'll lose any unflushed changes that the primary had in memory, but those are lost anyway. don't trust me on that though, test it out yourself.

: curiosity, how does CNet handle Master redundancy?

I don't know how much i'm allowed to talk about our processes and systems for redundancy, disaster recovery, failover, etc... but i don't think i'll upset anyone if i tell you: as far as i know, we've never needed to take advantage of them with a solr master. ie: we've never had a solr master crash so hard we had to bring up another one in its place ... knock on wood. (that probably has more to do with having good hardware than anything else though). (and no, i honestly don't know what hardware we use ... i don't bother paying attention, i let the hardware guys worry about that) -Hoss
Re: All facet.fields for a given facet.query?
Chris Hostetter wrote: : to make it clear, i agree that it doesn't make sense faceting on all : available fields, I only want faceting on those 300 attributes that are : stored together with the fields for full text searches. A : product/document typically has only 5-10 attributes. : : I like to decide at index time which attributes of a product might be of : interest for faceting and store those in dynamic fields with the : attribute-name and some kind of prefix or suffix to identify them at : query time as facet.fields. Exactly the naming convention you mentioned.

but if the facet fields are different for every document, and they use a simple dynamicField prefix (like facet_* for example), how do you know at query time which fields to facet on? ... even if wildcards work in facet.field, using facet.field=facet_* would require solr to compute the counts for *every* field matching that pattern to find out which ones have positive counts for the current result set -- there may only be 5 that actually matter, but it's got to try all 300 of them to find out which 5 that is.

I just made a quick test by building a facet query with those 300 attributes. I realized that the facets are built out of the whole index, not the subset returned by the initial query. Therefore I have a large number of empty facets, which I simply ignore. In my case the QueryTime is somewhat higher (of course) but it is still at some milliseconds (wow!!!) :o) So at this state of my investigation, and in my use case, I don't have to worry about performance even if I use the system in a way that uses more resources than necessary.

this is where custom request handlers that understand the faceting metadata for your documents become key ... so you can say when querying across the entire collection, only try to facet on category and manufacturer. if the search is constrained by category, then look up other facet options to offer based on that category name from our metadata store, etc...

Faceting on manufacturers and categories first and then presenting the corresponding facets might be useful under some circumstances, but in my case the category structure is quite deep, detailed and complex. So when the user enters a query I'd like to say to him: "Look, here are the manufacturers and categories with matches to your query, choose one if you want, but maybe there is another one with products that better fit your needs or products that you didn't even know about. So maybe you'd like to filter based on the following attributes." Something like this ;o) The point is that I currently don't want to know too much about the data; I just want to feed it into solr, follow some conventions and get the most out of it as quickly as possible. Optimizations can and will take place at a later time. I hope to find some time to dig into solr SimpleFacets this weekend. Regards, Tom
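The "ignore the empty facets" step Tom describes can be sketched client-side like this (the nested dict mirrors the shape of Solr's facet_fields response; the field and value names are invented):

```python
def nonempty_facets(facet_fields):
    # Drop facet values with zero counts, then drop fields left empty;
    # this mirrors filtering Solr's facet_fields response in the client
    # when faceting on many dynamic fields at once.
    cleaned = {}
    for field, counts in facet_fields.items():
        kept = {value: n for value, n in counts.items() if n > 0}
        if kept:
            cleaned[field] = kept
    return cleaned
```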
Re: page rank
Also if you are using the standard request handler you can use the _val_ hack: foo:bar _val_:recip(rord(numberField),1,1000,1000) You can find more info about this here: http://wiki.apache.org/solr/FunctionQuery -Nick

On 6/21/07, Daniel Alheiros [EMAIL PROTECTED] wrote: Hi David. Yes you can. Just define a field as a slong type field: <field name="numberField" type="slong" /> It can be used to sort (sort=numberField desc) or to boost your score (it will depend on the RequestHandler you are going to use). In terms of score, which RequestHandler are you planning to use? If using dismax you can define a boost function: recip(rord(numberField),1,1000,1000) I hope it helps. Regards, Daniel Alheiros

On 20/6/07 16:47, David Xiao [EMAIL PROTECTED] wrote: Hello folks, I am using solr to index web contents. I want to know whether it is possible to tell solr about rank information of contents. For example, I give each content an integer number, and I hope solr takes this number into consideration when it generates search results (larger number, more priority). Best Regards, David
Re: Multiple doc types in schema
This sounds like a potentially good use-case for SOLR-215! See https://issues.apache.org/jira/browse/SOLR-215 Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Chris Hostetter [EMAIL PROTECTED] To: solr-user@lucene.apache.org; Jack L [EMAIL PROTECTED] Sent: Wednesday, June 6, 2007 6:58:10 AM Subject: Re: Multiple doc types in schema : This is based on my understanding that solr/lucene does not : have the concept of document type. It only sees fields. : : Is my understanding correct? it is. : It seems a bit unclean to mix fields of all document types : in the same schema though. Or, is there a way to allow multiple : document types in the schema, and specify what type to use : when indexing and searching? it's really just an issue of semantics ... the schema.xml is where you list all of the fields you need in your index; any notion of doctype is entirely artificial ... you could group all of the fields relating to doctypeA in one section of the schema.xml, then have a big <!-- ##...## --> line and then list the fields for doctypeB, etc... but what if there are fields you use in both doctypes? ... how much you mix them is entirely up to you. -Hoss
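Hoss's "artificial" doctype can also be made explicit in the data itself: tag every document with a doctype field and filter on it at query time. A minimal sketch, assuming a field named doctype that you would define yourself in schema.xml (the name is an assumption, not a Solr built-in):

```python
from urllib.parse import urlencode

def add_doctype(doc: dict, doctype: str) -> dict:
    """Tag an index document with its artificial type before posting it."""
    tagged = dict(doc)
    tagged["doctype"] = doctype
    return tagged

def search_doctype(user_query: str, doctype: str) -> str:
    """Restrict a search to one document type with a filter query."""
    return "/select?" + urlencode({"q": user_query, "fq": f"doctype:{doctype}"})
```

Because the restriction goes in fq rather than q, Solr can cache the per-type filter and reuse it across queries.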
Rejecting fields with null values
I'm not sure if this is possible or not, but is there a way to do a search and reject fields that are empty or have null values, like the pseudo-code below? ?q=test+AND+(NOT+field_b:NULL) If this is not currently supported, does anyone think it would not be a good idea to implement? Thanks, -- Thiago Jackiw acts_as_solr = http://acts-as-solr.railsfreaks.com
Re: RAMDirectory instead of FSDirectory for SOLR
Hi Jeryl, Three weeks later - any luck with Solr + Terracotta? Thanks, Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Jeryl Cook [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Friday, June 1, 2007 3:59:21 AM Subject: RE: RAMDirectory instead of FSDirectory for SOLR I have Terracotta working with Lucene, and it works fine with the RAMDirectory... I am trying to get it to work with Solr (hook the RAMDirectory)... when I do, I'll post the findings, problems, etc. Thanks for the feedback from everyone. Jeryl Cook /^\ Pharaoh /^\ http://pharaohofkush.blogspot.com/ ..Act your age, and not your shoe size.. -Prince(1986) Date: Thu, 31 May 2007 18:24:26 -0700 From: [EMAIL PROTECTED] To: solr-user@lucene.apache.org Subject: RE: RAMDirectory instead of FSDirectory for SOLR Jeryl, If you need any help getting Terracotta to work under Lucene, or if you have any questions about performance tuning and/or load testing, you can also use the Terracotta community resources (mailing lists, forums, IRC, whatnot): http://www.terracotta.org/confluence/display/orgsite/Community. We'd be more than happy to help you get this stuff working. Cheers, Orion Jeryl Cook wrote: That's the thing: Terracotta persists everything it has in memory to disk when it overflows (you can set how much you want to keep in memory), or when the server goes offline. When the server comes back, the master Terracotta simply loads it back into the memory of the once-offline worker... identical to the approach Solr already uses to handle scalability. This allows unlimited storage of the items in memory, ...
You just need to cluster the RAMDirectory according to the sample given by Terracotta. However, I read some of the posts here... some say "I wonder how performance will be", etc. I was trying to get it working... and load test the hell out of it, and see how it acts with large amounts of data and how it compares with Solr using the typical FSDirectory approach. I plan to post findings. Jeryl Cook /^\ Pharaoh /^\ http://pharaohofkush.blogspot.com/ ..Act your age, and not your shoe size.. -Prince(1986) Date: Thu, 31 May 2007 13:51:53 -0700 From: [EMAIL PROTECTED] To: solr-user@lucene.apache.org Subject: RE: RAMDirectory instead of FSDirectory for SOLR : board, looks like i can achieve this with the embedded version of SOLR : uses the lucene RAMDirectory to store the index..Jeryl Cook yeah ... adding a solrconfig.xml option for using a RAMDirectory would be possible ... but almost meaningless for most people (the directory would go away when the server shuts down) ... even for use cases like what you describe (hooking in Terracotta) it wouldn't be enough in itself, because there would be no hook to give Terracotta access to it. -Hoss -- View this message in context: http://www.nabble.com/RAMDirecotory-instead-of-FSDirectory-for-SOLR-tf3843377.html#a10905062 Sent from the Solr - User mailing list archive at Nabble.com.
Re: Rejecting fields with null values
: I'm not sure if this is possible or not, but, is there a way to do a : search and reject fields that are empty or have null values like the : pseudo code below? As an inverted index, the Lucene index Solr uses doesn't know when documents have an empty value ... it stores the inverted mapping of value => documents, so there is no way to query for field_b:NULL, let alone NOT field_b:NULL. You can, however, query for things like: field_b:[* TO *] which requires field_b to have some value (that seems to be the use case you are after). As a general rule, if you really want to be able to support searches for things like "find all docs where there is no value in field X", the easiest way to achieve something like that in Solr is to configure the field with a default value in the schema ... something that would never normally appear in your data (a placeholder for 'null', so to speak) and query on that. -Hoss
Re: All facet.fields for a given facet.query?
On Wed, 2007-06-20 at 12:49 -0700, Chris Hostetter wrote: : I solve this problem by having metadata stored in my index which tells : my custom request handler what fields to facet on for each category ... : How do you define this metadata? this might be a good place to start; note that this message is almost two years old, and predates the open-sourcing of Solr ... the servlet referred to in this thread is Solr. http://www.nabble.com/Announcement%3A-Lucene-powering-CNET.com-Product-Category-Listings-p748420.html ...i think i also talked a bit about the metadata documents in my ApacheCon slides from last year ... but i don't really remember, and i haven't looked at them in a while... http://people.apache.org/~hossman/apachecon2006us/ thx, I'll have a look at these resources. cheers, martin -Hoss
Re: All facet.fields for a given facet.query?
: I realized that the facets are built from the whole index, not the : subset returned by the initial query. Therefore I have a large number of empty : facets which I simply ignore. In my case the QueryTime is somewhat facet.mincount is a way to tell Solr not to bother giving you those 0 counts ... you will still get the name of the field, though, so you know it was tried. : Faceting on manufacturers and categories first and then presenting the : corresponding facets might work under some circumstances, but in my case : the category structure is quite deep, detailed and complex. So when : the user enters a query I would like to say to him "Look, here are the : manufacturers and categories with matches for your query; choose one if you : want, but maybe there is another one with products that better fit your : needs or products that you didn't even know about. So maybe you'd like to : filter based on the following attributes." Something like this ;o) categories was just an example i used because it tends to be a common use case ... my point is the decision about which facet qualifies for the "maybe there is another one with products that better fit your needs" part of the response either requires computing counts for *every* facet constraint and then looking at them to see which ones provide good distribution, or knowing something more about your metadata (ie: having stats that show the majority of people who search on the word canon want to facet on megapixels) .. this is where custom biz logic comes in, because in a lot of situations computing counts for every possible facet may not be practical (even if the syntax to request it was easier) -Hoss
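The facet.mincount behavior Hoss describes can be seen in a request like the following sketch; the field names used here are placeholders for whatever you facet on:

```python
from urllib.parse import urlencode

def facet_request(q: str, facet_fields, mincount: int = 1) -> str:
    """Build a faceting request where facet.mincount suppresses the
    zero-count constraints; the faceted field names themselves still
    come back in the response, so you know they were tried."""
    params = [("q", q), ("facet", "true"), ("facet.mincount", str(mincount))]
    params += [("facet.field", f) for f in facet_fields]
    return "/select?" + urlencode(params)
```

With mincount=1 the response keeps only constraints that actually match at least one document in the result set, which removes the "large number of empty facets" Tom mentions.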
Re: Rejecting fields with null values
Keep in mind filters too... they can be much more efficient if used often: ?q=test&fq=field_b:[* TO *] -Yonik On 6/20/07, Thiago Jackiw [EMAIL PROTECTED] wrote: Hoss, As an inverted index, the Lucene index Solr uses doesn't know when documents have an empty value ... it stores the inverted mapping of value => documents, so there is no way to query for field_b:NULL, let alone NOT field_b:NULL I see what you mean. I guess searching for fields that are required to have a value the way you explained is a good way to go. Thanks! -- Thiago Jackiw acts_as_solr = http://acts-as-solr.railsfreaks.com On 6/20/07, Chris Hostetter [EMAIL PROTECTED] wrote: : I'm not sure if this is possible or not, but, is there a way to do a : search and reject fields that are empty or have null values like the : pseudo code below? As an inverted index, the Lucene index Solr uses doesn't know when documents have an empty value ... it stores the inverted mapping of value => documents, so there is no way to query for field_b:NULL, let alone NOT field_b:NULL. You can, however, query for things like: field_b:[* TO *] which requires field_b to have some value (that seems to be the use case you are after). As a general rule, if you really want to be able to support searches for things like "find all docs where there is no value in field X", the easiest way to achieve something like that in Solr is to configure the field with a default value in the schema ... something that would never normally appear in your data (a placeholder for 'null', so to speak) and query on that. -Hoss
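Yonik's filter-query form can be sketched as a small builder; field_b is the field name from this thread:

```python
from urllib.parse import urlencode

def query_requiring_field(q: str, required_field: str) -> str:
    """Pair the main query with a filter query that keeps only documents
    where required_field has some value. Putting the [* TO *] clause in
    fq (not q) lets Solr cache the filter and reuse it across queries."""
    return "/select?" + urlencode({"q": q, "fq": f"{required_field}:[* TO *]"})
```

The resulting URL is the encoded form of ?q=test&fq=field_b:[* TO *].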
Re: All facet.fields for a given facet.query?
On 6/20/07, Chris Hostetter [EMAIL PROTECTED] wrote: facet.mincount is a way to tell solr not to bother giving you those 0 counts ... An aside: shouldn't that be the default? All of the people using facets that I have seen always have to set facet.mincount=1 (or facet.zeros=false) -Yonik
RE: Faceted Search!
Niraj: What environment are you using? SQL Server/.NET/Windows? Or something else? -Mike -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 20, 2007 4:24 PM To: solr-user@lucene.apache.org Subject: Re: Faceted Search! : define the sub-categories. let's say from the above example, the : category price has different sub-categories like less than 100 : ,100-200? I'm guessing we explicitly define this in the XML feed file, but : I could be very wrong. In any case, can you please give me a short : example achieving that implementation. Well, thanks once again. there's nothing out of the box from Solr that will do this, it's something you would need to implement either in the client or in a custom request handler ... Solr's Simple Faceting support is designed to be just that: simple. but the underlying methods/mechanisms of computing DocSet intersections can be used by any custom request handler to generate application specific results. I've got 3 or 4 indexes that use the out of the box SimpleFacet support Solr provides, but the major faceting we do (product based facets) all uses custom request handlers so we can have very exact control over all of this kind of stuff, driven by our data management tools. -Hoss
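For the simple end of the price sub-categories discussed here, stock Solr can get fairly far with one facet.query per price range, no custom handler needed. A sketch, assuming a numeric price field (the field name and bucket boundaries are placeholders):

```python
from urllib.parse import urlencode

def price_facet_request(q: str, boundaries=(100, 200, 300)) -> str:
    """Emulate sub-categories under "price" with one facet.query per
    range: [* TO 100], [100 TO 200], [200 TO 300], [300 TO *].
    Solr returns a count for each range alongside the search results."""
    edges = ["*"] + [str(b) for b in boundaries] + ["*"]
    params = [("q", q), ("facet", "true")]
    params += [("facet.query", f"price:[{lo} TO {hi}]")
               for lo, hi in zip(edges, edges[1:])]
    return "/select?" + urlencode(params)
```

This stays within Simple Faceting; anything fancier (nested categories, data-driven ranges) is where the custom request handler Hoss mentions comes in.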
Re: Slave/Master swap
If just one master or one slave server fails, I think you can maybe use the master index server. A shell controlled by a program is easy for me; I use PHP and shell_exec. 2007/6/21, Otis Gospodnetic [EMAIL PROTECTED]: Right, that SAN with 2 Masters sounds good. Lucky you with your lonely Master! Where I work hw failures are pretty common. Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Chris Hostetter [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Wednesday, June 20, 2007 11:43:02 PM Subject: Re: Slave/Master swap : The more expensive solution might be to have Solr instances run on top : of a SAN and then one could really have multiple Master instances, one : in stand-by mode and ready to be started as the new Master if the i *believe* that if you have two solr instances pointed at the same physical data directory (SAN or otherwise) but you only send update/commit commands to one, they won't interfere with each other. so conceivably you can have both masters up and running, and your failover approach if the primary goes down is just to start sending updates to the secondary. you'll lose any unflushed changes that the primary had in memory, but those are lost anyway. don't trust me on that though, test it out yourself. : curiosity, how does CNet handle Master redundancy? I don't know how much i'm allowed to talk about our processes and systems for redundancy, disaster recovery, failover, etc... but i don't think i'll upset anyone if i tell you: as far as i know, we've never needed to take advantage of them with a solr master. ie: we've never had a solr master crash so hard we had to bring up another one in its place ... knock on wood. (that probably has more to do with having good hardware than anything else though). (and no, i honestly don't know what hardware we use ... i don't bother paying attention, i let the hardware guys worry about that) -Hoss -- regards jl
Re: SolrSharp example
Hello, Yonik and Jeff, thank you for your help. You are right, this was a content-type issue. In order to run the example the following things need to be done: 1. Code (SolrSharp) should be changed from: src\Configuration\SolrSearcher.cs(217): oRequest.ContentType = "application/x-www-form-urlencoded"; to: src\Configuration\SolrSearcher.cs(217): oRequest.ContentType = "text/xml"; 2. In order to take care of the Solr 1.2 schema invalidation issue, in schema.xml comment out line 265: <!-- <field name="word" type="string" indexed="true" stored="true"/> --> and comment out line 279: <!-- <field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/> --> or, as Jeff suggested, for the example code, add the timestamp field in the ExampleIndexDocument public constructor, such as: this.Add(new IndexFieldValue("timestamp", DateTime.Now.ToString("s") + "Z")); Regards Michael - Original Message - From: Jeff Rodenburg [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Wednesday, June 20, 2007 1:56 PM Subject: Re: SolrSharp example On 6/20/07, Yonik Seeley [EMAIL PROTECTED] wrote: On 6/20/07, Michael Plax [EMAIL PROTECTED] wrote: This is a log that I got after running the SolrSharp example. I think the example program posts improperly formatted xml. I'm running Solr on Windows XP, Java 1.5. Could those settings be the problem? Solr 1.2 is pickier about the Content-type in the HTTP headers. I bet it's being set incorrectly. Ahh, good point. Within SolrSearcher.cs, the WebPost method contains this setting: oRequest.ContentType = "application/x-www-form-urlencoded"; Looking through the CHANGES.txt file in the 1.2 tagged release on svn: 9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using the new request dispatcher (SOLR-104). This requires posted content to have a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8'. The response format matches that of /select and returns standard error codes.
To enable solr1.1 style /update, do not map /update to any handler in solrconfig.xml (ryan) For SolrSearcher.cs, it sounds as though changing the ContentType setting to "text/xml" may fix this issue. I don't have a 1.2 instance available to test against right now, but I can check this later. Michael, try updating your SolrSearcher.cs file with this content-type setting to see if that resolves your issue. thanks, jeff r.
Re: Multiple doc types in schema
I see SOLR-215 from this mail. Does it now really support multiple indexes, and will search return merged data? For example: I want to search for "aaa", and I have index1, index2, index3, index4. It should return the results from index1, index2, index3, index4 and merge the results by score, datetime, or something else. Does it support NFS, and how is its performance? 2007/6/21, Otis Gospodnetic [EMAIL PROTECTED]: This sounds like a potentially good use-case for SOLR-215! See https://issues.apache.org/jira/browse/SOLR-215 Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Chris Hostetter [EMAIL PROTECTED] To: solr-user@lucene.apache.org; Jack L [EMAIL PROTECTED] Sent: Wednesday, June 6, 2007 6:58:10 AM Subject: Re: Multiple doc types in schema : This is based on my understanding that solr/lucene does not : have the concept of document type. It only sees fields. : : Is my understanding correct? it is. : It seems a bit unclean to mix fields of all document types : in the same schema though. Or, is there a way to allow multiple : document types in the schema, and specify what type to use : when indexing and searching? it's really just an issue of semantics ... the schema.xml is where you list all of the fields you need in your index; any notion of doctype is entirely artificial ... you could group all of the fields relating to doctypeA in one section of the schema.xml, then have a big <!-- ##...## --> line and then list the fields for doctypeB, etc... but what if there are fields you use in both doctypes? ... how much you mix them is entirely up to you. -Hoss -- regards jl
Re: SolrSharp example
Thanks for checking, Michael -- great find. I'm in the process of readying this same fix for inclusion in the source code (I'm verifying against a full 1.2 install.) The SolrField class is now also being extended to incorporate an IsDefaulted property, which will permit SolrSchema.IsValidUpdateIndexDocument to yield true when default value fields aren't present in the update request. thanks, jeff r. On 6/20/07, Michael Plax [EMAIL PROTECTED] wrote: Hello, Yonik and Jeff, thank you for your help. You are right, this was a content-type issue. In order to run the example the following things need to be done: 1. Code (SolrSharp) should be changed from: src\Configuration\SolrSearcher.cs(217): oRequest.ContentType = "application/x-www-form-urlencoded"; to: src\Configuration\SolrSearcher.cs(217): oRequest.ContentType = "text/xml"; 2. In order to take care of the Solr 1.2 schema invalidation issue, in schema.xml comment out line 265: <!-- <field name="word" type="string" indexed="true" stored="true"/> --> and comment out line 279: <!-- <field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/> --> or, as Jeff suggested, for the example code, add the timestamp field in the ExampleIndexDocument public constructor, such as: this.Add(new IndexFieldValue("timestamp", DateTime.Now.ToString("s") + "Z")); Regards Michael - Original Message - From: Jeff Rodenburg [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Wednesday, June 20, 2007 1:56 PM Subject: Re: SolrSharp example On 6/20/07, Yonik Seeley [EMAIL PROTECTED] wrote: On 6/20/07, Michael Plax [EMAIL PROTECTED] wrote: This is a log that I got after running the SolrSharp example. I think the example program posts improperly formatted xml. I'm running Solr on Windows XP, Java 1.5. Could those settings be the problem? Solr 1.2 is pickier about the Content-type in the HTTP headers. I bet it's being set incorrectly. Ahh, good point.
Within SolrSearcher.cs, the WebPost method contains this setting: oRequest.ContentType = "application/x-www-form-urlencoded"; Looking through the CHANGES.txt file in the 1.2 tagged release on svn: 9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using the new request dispatcher (SOLR-104). This requires posted content to have a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8'. The response format matches that of /select and returns standard error codes. To enable solr1.1 style /update, do not map /update to any handler in solrconfig.xml (ryan) For SolrSearcher.cs, it sounds as though changing the ContentType setting to "text/xml" may fix this issue. I don't have a 1.2 instance available to test against right now, but I can check this later. Michael, try updating your SolrSearcher.cs file with this content-type setting to see if that resolves your issue. thanks, jeff r.
Re: Multiple doc types in schema
Ignore the poor segmentation scheme (document types combined with categorizing), but this is working quite well as we get close to going live with a product. This static IndexDocKey class contains an enumeration that generates Catalog keys for each type of document (POJO / Model object) that gets indexed. The indexing process assigns a Catalog key to each document type; the process extracts the catid (doc-type) as well as other information that is put into the index doc. Just an idea for you:

/**
 * This drives much of the categorization of indexes and the subsequent query
 * filters. Lots of logic is built into these enumerations; really they are
 * rules that may better be injected or looked up in a true rules engine. This
 * is the start of system-generated markup and metadata.
 *
 * @author jdow
 * @version %I%, %G%
 * @since 0.90
 * <p>
 * <pre>
 * TODO: Review to see if we want to keep this in the index doc, but make the
 * enumeration of Categories, SubCat, etc. more meaningful, and order them in
 * the right way to facilitate getting back filtered docs in a controlled
 * sort order.
 * </pre>
 */
public class IndexDocKey implements Serializable {

    // STATICS
    public static final long serialVersionUID = 1L;

    public static long getSerialVersionUID() {
        return serialVersionUID;
    }

    @SuppressWarnings("unused")
    protected Category category;

    public IndexDocKey() {
    }

    public IndexDocKey(Category category) {
        this.category = category;
    }

    public void setCategory(Category category) {
        this.category = category;
    }

    /*
     * public void setCatDoc(CatDoc catdoc) { this.catdoc = catdoc; }
     */

    public enum Category implements Serializable {
        SYSTEM("S", "System", null, null),
        SYSPING("S0010", "System", null, null),
        APPCNTEXAMPLES("AC01L", "Example", ZExample.class, Person.class),
        APPCNTPEOPLE("AC02P", "Example", ZExample.class, Person.class),
        APPCNTDISCUSS("AC03D", "Example", ZExample.class, Person.class),
        APPCNTIMAGE("AC04I", "Example", ZExample.class, Person.class),
        APPCNTFILES("AC05F", "Example", ZExample.class, Person.class),
        APPCNTEVENT("AC02E", "Example", ZExample.class, Person.class),
        EXAMPLE("L", "Example", ZExample.class, Person.class),
        EXAMPLECNTPEOPLE("L00CP", "Example", ZExample.class, Person.class),
        EXAMPLECNTDISCUSS("L00CD", "Example", ZExample.class, Person.class),
        EXAMPLECNTIMAGE("L00CI", "Example", ZExample.class, Person.class),
        EXAMPLECNTFILES("L00CF", "Example", ZExample.class, Person.class),
        EXAMPLECNTEVENT("L00CE", "Example", ZExample.class, Person.class),
        EXAMPLEIDENTITY("L00LI", "Identity", Identity.class, ZExample.class),
        EVENT("LCE10", "Content", Content.class, Content.class),
        EVENTLABEL("LCE11", "ContentLabel", ContentLabel.class, Content.class),
        EVENTCOMM("LCE12", "ContentComment", ContentComment.class, Content.class),
        EVENTPROP("LCE13", "ContentProperty", ContentProperty.class, Content.class),
        DISCUSS("LCD10", "Content", Content.class, Content.class),
        DISCUSSLABEL("LCD11", "ContentLabel", ContentLabel.class, Content.class),
        DISCUSSCOMM("LCD12", "ContentComment", ContentComment.class, Content.class),
        DISCUSSPROP("LCD13", "ContentProperty", ContentProperty.class, Content.class),
        IMAGE("LCI10", "Content", Content.class, Content.class),
        IMAGELABEL("LCI11", "ContentLabel", ContentLabel.class, Content.class),
        IMAGECOMM("LCI12", "ContentComment", ContentComment.class, Content.class),
        IMAGEPROP("LCI13", "ContentProperty", ContentProperty.class, Content.class),
        FILE("LCF10", "Content", Content.class, Content.class),
        FILELABEL("LCF11", "ContentLabel", ContentLabel.class, Content.class),
        FILECOMM("LCF12", "ContentComment", ContentComment.class, Content.class),
        FILEPROP("LCF13", "ContentProperty", ContentProperty.class, Content.class);

        private String catid;
        private String catname;
        private String catdoc;
        private Class<?> catclass;
        private Class<?> catparentclass;

        private Category(String catid, String catdoc, Class<?> catclass,
                Class<?> catparentclass) {
            this.catid = catid;
            this.catdoc = catdoc;
            this.catname = this.name();
            // Note: these two assignments were missing in the original post;
            // without them getCatClass()/getCatParentClass() always return null.
            this.catclass = catclass;
            this.catparentclass = catparentclass;
        }

        public String getCatId() {
            return this.catid;
        }

        public String getCatName() {
            return this.catname;
        }

        public String getCatDoc() {
            return this.catdoc;
        }

        public Class<?> getCatClass() {
            return this.catclass;
        }

        public Class<?> getCatParentClass() {
            return this.catparentclass;
        }
    }
}

On 6/20/07, Otis Gospodnetic [EMAIL PROTECTED] wrote: This sounds like a potentially good use-case for SOLR-215! See https://issues.apache.org/jira/browse/SOLR-215 Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original
delete changed?
Solr 1.2: curl http://192.168.7.6:8080/solr0/update --data-binary '<delete><query>nodeid:20</query></delete>' I remember this was OK when I used Solr 1.1; has it changed? It shows me: HTTP Status 400 - missing content stream -- *type* Status report *message* *missing content stream* *description* *The request sent by the client was syntactically incorrect (missing content stream).* -- regards jl
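For reference, the same delete-by-query post can be built with an explicit Content-type header. Per the CHANGES.txt excerpt quoted in the SolrSharp thread, Solr 1.2's request dispatcher requires posted content to carry a valid contentType, which is a likely cause of this "missing content stream" 400. A sketch (the URL here is a placeholder):

```python
from urllib.request import Request

def delete_by_query_request(solr_update_url: str, query: str) -> Request:
    """Build a Solr delete-by-query POST with the text/xml Content-type
    that Solr 1.2's /update handler expects."""
    body = f"<delete><query>{query}</query></delete>".encode("utf-8")
    return Request(solr_update_url, data=body,
                   headers={"Content-type": "text/xml; charset=utf-8"})
```

The curl equivalent would add -H 'Content-type:text/xml; charset=utf-8' to the command shown above.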
RE: Faceted Search!
Hi Mike, Currently I'm just running the demo example provided on the Solr web site on my local Windows machine. I was purely looking into generating the XML feed file and feeding it to the Solr server. However, I was also looking into implementing sub-categories within the categories, if that makes sense. For example, on shopper.com we have categories by price, manufacturer and so on, and within them there are sub-categories (price is sub-categorized into under $100, 100-200, 200-300, etc.). I don't have constraints in terms of technology; if I have to implement a db server I won't mind implementing it. Anyway, please shine a light on how you would handle this issue. Any suggestion will be appreciated. Thanks, Niraj Mike Austin [EMAIL PROTECTED] wrote: Niraj: What environment are you using? SQL Server/.NET/Windows? Or something else? -Mike -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Wednesday, June 20, 2007 4:24 PM To: solr-user@lucene.apache.org Subject: Re: Faceted Search! : define the sub-categories. let's say from the above example, the : category price has different sub-categories like less than 100 : ,100-200? I'm guessing we explicitly define this in the XML feed file, but : I could be very wrong. In any case, can you please give me a short : example achieving that implementation. Well, thanks once again. there's nothing out of the box from Solr that will do this, it's something you would need to implement either in the client or in a custom request handler ... Solr's Simple Faceting support is designed to be just that: simple. but the underlying methods/mechanisms of computing DocSet intersections can be used by any custom request handler to generate application specific results. I've got 3 or 4 indexes that use the out of the box SimpleFacet support Solr provides, but the major faceting we do (product based facets) all uses custom request handlers so we can have very exact control over all of this kind of stuff, driven by our data management tools.
-Hoss
Recent updates to Solrsharp
Thanks to Yonik, Michael, Ryan, (and others) for some recent help on various issues discovered with Solrsharp. We were able to discover a few issues with the library relative to the Solr 1.2 release. Those issues have been remedied and have been pushed into source control. The Solrsharp source code can be obtained at: http://solrstuff.org/svn/solrsharp. Recent fixes include: - Fix for broken DeleteIndexDocument xml serialization - Update to correct document posting content-type to solr 1.2 instance - Identifying schema fields with new IsDefaulted property - Updates to the example application to incorporate these fixes and the solr 1.2 sample schema - Updated documentation consistent with these changes As an aside, it would be nice to record these issues more granularly in JIRA. Could we get a component created for our client library, similar to java/php/ruby? cheers, j
Re: Recent updates to Solrsharp
On 6/21/07, Jeff Rodenburg [EMAIL PROTECTED] wrote: As an aside, it would be nice to record these issues more granularly in JIRA. Could we get a component created for our client library, similar to java/php/ruby? Done. -Yonik